Ask Al: What are filters checking?

Jerry writes, "Al, a recent email from 'Get to the Point' quoted you as below. My question is this: What, exactly, are spam [content] filters picking up from a generic template that could reduce delivery? Thanks in advance for your reply."

It looks like I was quoted by MarketingProfs: Here's how it happens. "If that partner works with a whole bunch of people sending email," explains Al Iverson in a post at the Spam Resource blog, "[and] if that template is out all over town, then there's a pretty good chance that somebody has sent emails using that template to poorly permissioned lists, causing spamtrap hits, spam complaints, and so forth."

"Spam filters that use content fingerprinting, meanwhile, see the same message coming from your company and lump you in with the abusive senders."

Jerry, thanks for your question. I think a point is being missed here, though. There is NOT a list I can give you saying "avoid this tag" or "avoid this image," or whatever. No such list exists, and it's impossible to compile one.

The thing that these filters catch is commonality. If your content shares various characteristics with other messages tagged as bad (for whatever reason), then your messages get tagged as bad, too. What does commonality mean? It can mean a whole bunch of things, and nobody publishes a list of the exact variables that are checked. It is probably all of the following, and more:
  • Your from domain.
  • What domains you link to.
  • The domain where images are hosted.
  • What images you use.
  • What HTML template you use.
  • What unsubscribe footer you use.
  • The HTML/text/source/etc. overall -- some systems perform message hashing, converting a message to a short numeric or alphanumeric string of characters based on the various characteristics of the message. Similar messages will have hashes that are similar or the same, making them easy to identify.
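To make the hashing idea concrete, here is a rough sketch in Python. This is not how any particular filter works (nobody publishes that); it's an illustration under my own assumptions. The function names, the normalization rules (lowercasing, dropping digits), and the three-word "shingle" size are all hypothetical choices made for the example. It shows two things: an exact fingerprint that ignores trivial personalization, and a similarity score that stays high when only a few words of a shared template change.

```python
from __future__ import annotations

import hashlib
import re


def normalize(body: str) -> str:
    """Lowercase, drop digits, and collapse whitespace (assumed rules) so
    trivial personalization does not change the fingerprint."""
    text = body.lower()
    text = re.sub(r"\d+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def exact_fingerprint(body: str) -> str:
    """Short hex string derived from the normalized message body."""
    digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
    return digest[:16]


def shingles(body: str, size: int = 3) -> set[str]:
    """Set of overlapping word runs of length `size` from the normalized body."""
    words = normalize(body).split()
    if len(words) < size:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (nothing shared) to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    known_spam = ("Dear customer 1234, click here to claim your exclusive offer "
                  "today. Unsubscribe at any time using the link below.")
    your_mail = ("Dear customer 9876, click here to claim your holiday discount "
                 "today. Unsubscribe at any time using the link below.")
    print(exact_fingerprint(known_spam), exact_fingerprint(your_mail))
    print(round(similarity(known_spam, your_mail), 2))
```

The point of the sketch: your message and the spammer's message don't need to be identical. If both were built from the same template, most of the word runs overlap, and a filter that compares fingerprints this way would see them as close relatives.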

1 comment:

  1. Something that you start to get at in the last paragraph, but don't actually state: the ultimate result of using a template that has also been used for spam is that your message now looks like spam.

    It doesn't really matter whether spam detection is done with fuzzy body-text hashes, or taught on a Bayesian corpus, or hard-wired by some admin saying to himself, "this string of text shows up a lot in spam." The result is that you've sent a message that, in whatever way that was, is indistinguishable from spam. ...and that's generally a bad idea.

