Gmail: Improving spam filtering with TensorFlow

Google just announced today how they've improved spam filtering using TensorFlow.

What's TensorFlow, you might ask? "An open-source machine learning (ML) framework developed at Google. These new protections complement existing ML and rules-based protections, and they’ve successfully improved our detection capabilities. With TensorFlow, we are now blocking around 100 million additional spam messages every day."

That's a lot of newly blocked email messages. Does it affect you, dear sender? Hopefully not, because Google says that they're "now blocking spam categories that used to be very hard to detect," including "image-based messages, emails with hidden embedded content, and messages from newly created domains that try to hide a low volume of spammy messages within legitimate traffic."

This doesn't mean it is suddenly unsafe to send image-heavy emails to your Gmail subscriber base. Google's not about to intentionally start blocking legitimate mail that people actually signed up for. But it does highlight that the closer you get to the edge of best practices, the more risk you take on: if your practices fall short in other areas too, you could end up overlapping with one or more of these categories. If so, your messages might actually merit blocking. I'm guessing the chances that it affects a "legitimate" sender are pretty slim, though. But, just a reminder -- "Don't be like Goofus," as the old Goofus and Gallant stories in Highlights for Children used to tell us.

Spammers often do things like rotate through newly purchased domains, embed content in unique ways to try to evade filters, and use images to hide messaging from machine filter review. Don't do these things, and I think you'll probably be just fine.

2018: Did I get it right?

Just over a year ago I predicted that 2018 would be a year full of mailbox provider consolidation, many folks implementing DMARC, and ISP filtering getting tougher than ever. Was I right? It sure sounds a lot like what I worked on much of the time last year.

Is it too glib to say 2019: More of the same? Because that's my first thought. Provider filters continue to get tighter, DMARC is bigger than ever, and AOL and Yahoo are not quite done merging. I suspect BIMI will grow in 2019, but I feel like we're two or three years out before somebody can declare that 20xx is the "year of BIMI."

I know I'll be focusing more on international (non-US) deliverability this year, but it's hard to say if that's just me; it might not be an "industry" thing.

What do you foresee for challenges and likely focus areas for email and deliverability in 2019?

Fun while it lasted...

Remember back in September when I blogged about how to create a Google+ account to make your brand icon display next to your emails when sending to Gmail users?

Well, looks like that won't work after a certain point, as Google is shutting down Google+ and will be deleting Google+ accounts and content.

I got a notice this morning that says my various Google+ accounts (used for logo display for various email sender tests I've set up) will be shut down on April 2, 2019.

It was fun while it lasted.

I wonder if this means Google will get on board with the BIMI logo display standard? Or will there be some other way to do this? We shall see.

Stop using NJABL! Now!

I just replied to an email from a guy who thinks I'm blocking his mail. I'm not, because I don't run a blacklist or a spam filter, and haven't done so for years. I would have loved to have helped guide him in the right direction, but my reply to him bounced because his mail server is misconfigured to use the NJABL blacklist.

The NJABL blacklist has been dead for almost five years.

If you still have it in your email server configuration, you're now going to block a lot of wanted mail, because the domain's name servers just changed and they now have a wildcard entry that has the effect of "blacklisting the world."
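
For anyone wondering why a wildcard entry is so destructive, here is a rough Python sketch of how a DNS blacklist lookup generally works. The zone name `dnsbl.example` and the helper names are made up for illustration, and the "any answer means listed" logic is an assumption about how a naively configured mail server behaves, not a description of any particular MTA. The sanity check at the end leans on the RFC 5782 convention that 127.0.0.2 should always be listed and 127.0.0.1 never should be.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def resolve_a(name: str):
    """Return an IPv4 address for the name, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def is_listed(ip: str, zone: str, resolve=resolve_a) -> bool:
    """Naive check (assumed behavior): any answer at all counts as 'listed'."""
    return resolve(dnsbl_query_name(ip, zone)) is not None


def zone_looks_sane(zone: str, resolve=resolve_a) -> bool:
    """A live list should list the 127.0.0.2 test entry and not list 127.0.0.1.

    A wildcarded zone answers for both, and a dead zone answers for neither,
    so either failure mode is caught here.
    """
    return is_listed("127.0.0.2", zone, resolve) and not is_listed(
        "127.0.0.1", zone, resolve
    )
```

With a wildcard in place, every query name resolves, so `is_listed` comes back true for every sending IP on earth. Running something like `zone_looks_sane` against each list in your configuration now and then is one way to notice when a list has died or been wildcarded.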

You were warned...almost five years ago.

Characters in the local part of an email address

Need a "common sense" breakdown showing you what characters should be allowed in the username part (local part) of an email address? This handy guide from Jochen Topf covers exactly that.

It doesn't EXACTLY align with RFCs, but when you look at it from a common sense perspective, I agree with his categorization of each character. This would be a good thing to reference if you were building your own email capture form. (I'd probably also reject the "maybes" for an email capture form, but not reject them in an MTA configuration. Some of the "maybes" show up in bounce addresses somewhat regularly, but are almost never found in legitimate end user email addresses.)
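
If you wanted to turn that idea into a check for an email capture form, a minimal Python sketch might look like the one below. To be clear, the allowed character set here (letters, digits, dot, underscore, plus, hyphen) is my own conservative assumption for illustration, not Jochen Topf's actual categorization, and the function name is hypothetical. The 64-character cap on the local part comes from RFC 5321. Like the guide, this is deliberately stricter than the RFCs.

```python
import string

# Assumed "safe" set for a signup form; swap in whatever categories you trust.
ALLOWED_LOCAL_CHARS = set(string.ascii_letters + string.digits + "._+-")


def is_acceptable_local_part(local: str) -> bool:
    """Conservative check for the part of an address before the @ sign."""
    # RFC 5321 limits the local part to 64 characters; empty is never valid.
    if not 1 <= len(local) <= 64:
        return False
    # Dots may not lead, trail, or appear twice in a row.
    if local.startswith(".") or local.endswith(".") or ".." in local:
        return False
    # Everything else must come from the assumed allowed set.
    return all(ch in ALLOWED_LOCAL_CHARS for ch in local)
```

For example, `is_acceptable_local_part("jane.doe+news")` is accepted, while `"jane doe"` and `"jane..doe"` are rejected. A check this strict belongs on the signup form only; an MTA should stay more permissive.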

How to win friends and influence people?

Not like this.

I'm not quite sure who Wonderland Collective are, but when somebody asked them why they are sending unsolicited email, they decided to complain back, instead of apologizing.


But wait, there's more! Be sure to read the whole thread. I sort of assume at some point they'll be changing their tune and apologizing. Unless they prefer to be blacklisted. I wonder if they did something that could get them into enough trouble that they'd even get fined? I'm not sure, as I don't know enough about what's happening here. But sending unsolicited email, then barking at people who ask you to stop, sure doesn't seem to me like a good way to run a business.