Artificial Intelligence: Failures and Future Impact on Deliverability
I've been thinking about the intersection of A.I. and email, especially in the context of spam filtering, sender reputation, and authentication. Why? Because people ask about it a lot. People really want to know how A.I. will impact their business model, their marketing efforts.
How is it used today?
How is A.I. used in our realms of deliverability and reputation today? You already know the answer to this one; I've talked about it before. Content analysis, engagement metrics, anomaly detection, reputation modeling: these are all already powered by machine learning and artificial intelligence at some mailbox providers. Regardless of how the mechanisms are powered today, use of these signals and metrics will continue.
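For a rough sense of how those signals combine, here's a toy sketch. Every feature name and weight below is invented; no mailbox provider's real model looks anywhere near this simple.

```python
# Toy sketch: blending content, engagement, anomaly, and reputation
# signals into one spam score. The features and weights are invented;
# real mailbox-provider models are far more complex and proprietary.

def spam_score(signals: dict) -> float:
    """Return a 0..1 spam likelihood from a handful of signals (each 0..1)."""
    weights = {
        "content_spamminess": 0.35,   # output of a text classifier
        "low_engagement": 0.25,       # 1 - (opens + replies) / delivered
        "volume_anomaly": 0.20,       # sudden spike vs. historical baseline
        "poor_ip_reputation": 0.20,   # blocklist hits, complaint history
    }
    score = sum(weights[k] * signals.get(k, 0.0) for k in weights)
    return min(max(score, 0.0), 1.0)

print(spam_score({
    "content_spamminess": 0.8,
    "low_engagement": 0.9,
    "volume_anomaly": 0.1,
    "poor_ip_reputation": 0.3,
}))  # 0.585 -- junk-folder territory
```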
But what of… the future? What will it hold for email deliverability? Specifically, where and how will A.I. impact it? We really just don't know. So why not ask Skynet? I mean, ChatGPT. So I asked ChatGPT: How might A.I. impact email deliverability and sender reputation in the future?
The future holds: more of the same
And what did it tell me? More of the same. Stuff we already know. Models become more individualized, learning not just what spam looks like in general, but what each user considers unwanted. Real-time reputation models get faster at catching bad guys doing silly things, and exploitable gaps shrink. On the sending side, maybe platforms get better at blocking outbound badness. Mailchimp has Omnivore; maybe somebody will build an A.I.-driven version of that type of mechanism, analyzing sender data and activity to determine that a send will be bad and preventing it from going out.
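Omnivore's actual internals aren't public, so this is just a guess at the shape of such a pre-send gate; every threshold and field name here is invented for illustration.

```python
# Hypothetical pre-send gate in the spirit of an Omnivore-style system.
# Thresholds and field names are invented; real systems use many more
# signals and (presumably) learned models rather than fixed cutoffs.

def should_block_send(campaign: dict) -> tuple[bool, str]:
    """Decide whether an outbound campaign looks bad enough to stop."""
    if campaign["hard_bounce_rate"] > 0.02:        # lots of dead addresses
        return True, "bounce rate suggests a stale or purchased list"
    if campaign["spam_complaint_rate"] > 0.001:    # prior complaints
        return True, "complaint rate above acceptable threshold"
    if campaign["list_growth_overnight"] > 10.0:   # 10x list size jump
        return True, "sudden list growth looks like an imported list"
    return False, "ok to send"

blocked, reason = should_block_send({
    "hard_bounce_rate": 0.05,
    "spam_complaint_rate": 0.0002,
    "list_growth_overnight": 1.1,
})
print(blocked, "-", reason)  # True - bounce rate suggests a stale or purchased list
```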
Let's turn it sideways
Time to adjust the prompt. Where can A.I. get it wrong? Let's extrapolate, I said, starting with a well-known example: a beta version of Apple Intelligence incorrectly highlighted phishing emails as important, missing obvious signs that the mail was not legitimate.
Explain to me how you think A.I. can get it wrong when trying to determine if an email message is good, wanted, and deserving of inbox delivery, or if it's garbage: unwanted, phishing, or malware.
A.I. says what?
ChatGPT has some thoughts:
One place where things are clearly going off the rails is when A.I. is used to interpret or prioritize email in user interfaces, without respecting basic technical signals. As you note, during the Apple Intelligence beta, there were reports of phishing emails being labeled "important," despite failing email authentication checks. This wasn't just a user experience fail. It showed that the underlying model may not have been designed to pay attention to things like SPF, DKIM, or domain alignment. It probably just saw a well-written message and decided it looked "helpful."
That's dangerous. A.I. models that score or categorize email based only on tone, appearance, or perceived intent can be tricked easily. Many phishing messages now read like they were written by a customer success team. A spam filter that treats "natural language" as inherently trustworthy is going to have a bad time.
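To make that ordering of concerns concrete, here's a minimal sketch in which a content model's score is never allowed to override failed authentication. It assumes the receiver has already recorded its checks in an Authentication-Results header, and the parsing is deliberately naive; a real implementation would follow RFC 8601 rather than substring-matching.

```python
# Sketch: never let a content model override failed authentication.
# Parsing Authentication-Results properly is harder than this (RFC 8601);
# the naive string check is just to show the ordering of concerns.

def classify(auth_results: str, content_model_score: float) -> str:
    """content_model_score: 0..1, where higher means 'looks legitimate'."""
    auth = auth_results.lower()
    if "spf=fail" in auth or "dkim=fail" in auth or "dmarc=fail" in auth:
        # A polished, 'helpful'-sounding message that fails authentication
        # is exactly the phishing case -- the content score cannot rescue it.
        return "suspect"
    if content_model_score > 0.8:
        return "inbox"
    return "needs more signals"

print(classify("mx.example.com; spf=fail smtp.mailfrom=bank.example; dkim=fail", 0.95))
# -> suspect, no matter how well-written the message is
```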
There are plenty of other ways A.I. can misfire in this space. Some models might give too much weight to message formatting or sender names, and not enough to reputation or infrastructure signals. Others may generalize reputation across shared IPs or domains and cause collateral damage to legitimate senders who just happen to be in the same neighborhood. A model might assume a lack of engagement means you're a spammer, even if you're a low-volume, fully opted-in sender doing nothing wrong.
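The shared-IP problem has a common-sense mitigation: weight a sender's own domain-level history against the IP neighborhood's average based on how much direct evidence you actually have. Here's a rough sketch; the prior-strength constant is invented.

```python
# Sketch: reduce collateral damage from shared IPs by weighting a
# sender's own (domain-level) history against the neighborhood
# (IP-level) average, based on how much evidence we have for the
# domain itself. Purely illustrative.

def blended_reputation(domain_score: float, domain_msg_count: int,
                       ip_score: float, prior_strength: int = 1000) -> float:
    """Scores are 0..1 (higher is better). The more mail we've seen
    from the domain itself, the less the shared IP's score matters."""
    w = domain_msg_count / (domain_msg_count + prior_strength)
    return w * domain_score + (1 - w) * ip_score

# A low-volume, fully opted-in sender on a bad shared IP:
print(blended_reputation(domain_score=0.95, domain_msg_count=200, ip_score=0.30))
# ~0.41 -- still dragged down by the neighborhood; more direct evidence helps:
print(blended_reputation(domain_score=0.95, domain_msg_count=50000, ip_score=0.30))
# ~0.94
```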
If A.I. filtering systems are built without proper feedback loops, they may also fail to adapt to evolving abuse tactics. Spammers test against filters. They rotate language, shift sending domains, and keep trying new tricks over time. A static model might not see the pattern until the damage is already done.
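Here's one hedged sketch of what such a feedback loop might look like; the class and method names are made up, and a real system would refit an actual classifier on the accumulated verdicts rather than print a message.

```python
# Sketch of the feedback loop a static filter lacks: fold fresh user
# verdicts (spam reports, rescues from the junk folder) back into the
# model on a schedule, so rotated wording and fresh domains show up
# in training data. Names and structure are invented for illustration.

from collections import deque

class AdaptiveFilter:
    def __init__(self, retrain_every: int = 1000):
        self.feedback = deque(maxlen=50_000)  # (features, user_marked_spam)
        self.retrain_every = retrain_every
        self.seen_since_retrain = 0

    def record_feedback(self, features: dict, user_marked_spam: bool):
        self.feedback.append((features, user_marked_spam))
        self.seen_since_retrain += 1
        if self.seen_since_retrain >= self.retrain_every:
            self.retrain()

    def retrain(self):
        # In a real system: refit the classifier on recent feedback here.
        self.seen_since_retrain = 0
        print(f"retraining on {len(self.feedback)} recent verdicts")
```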
Even sender-side A.I. tools can backfire. Using machine learning to write thousands of "personalized" emails may sound smart, but if you're sending cold emails en masse or mailing to people who never signed up, you're just doing bad marketing at scale. That won't help your sender reputation. It will hurt it.
Authentication protocols like SPF, DKIM, and DMARC still matter. Engagement signals still matter. A.I. can be part of the solution, but only when it's used in combination with the existing infrastructure and well-defined standards email has relied on for decades. When those foundations get ignored, A.I. doesn't just make mistakes. It makes dangerous ones.
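For instance, DMARC-style alignment is a simple, checkable rule that no amount of reading the message body would ever surface. A rough sketch follows; note that real implementations use the Public Suffix List to find organizational domains, and this shortcut does not.

```python
# Sketch of DMARC-style (relaxed) alignment: the From: domain must share
# an organizational domain with the domain that actually passed SPF or
# DKIM. The org_domain() shortcut below is wrong for suffixes like
# co.uk -- real code uses the Public Suffix List. Illustration only.

def org_domain(domain: str) -> str:
    # Naive: take the last two labels.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])

def dmarc_aligned(from_domain: str, authenticated_domain: str) -> bool:
    return org_domain(from_domain) == org_domain(authenticated_domain)

# Phish: From: shows the brand, but DKIM passed for an unrelated domain.
print(dmarc_aligned("paypal.com", "mail.bulk-sender.example"))  # False
print(dmarc_aligned("news.paypal.com", "paypal.com"))           # True
```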
If A.I. is going to shape the future of email deliverability, the people building it need to remember what already works. A machine learning model can't protect the inbox if it doesn't understand how email authentication works, what sender reputation is based on, or why sudden changes in sending behavior are suspicious. Inbox placement isn't just a language problem. It's an infrastructure problem. It always has been.
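And "sudden changes in sending behavior" can be caught with something as plain as a z-score on daily volume; the threshold below is arbitrary, and real systems track many behavioral dimensions at once.

```python
# Sketch: why a sudden change in sending behavior is suspicious --
# a simple z-score on daily message volume flags the spike.

import statistics

def volume_anomaly(history: list[int], today: int, z_limit: float = 3.0) -> bool:
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid divide-by-zero
    return (today - mean) / stdev > z_limit

# 30 quiet days, then a compromised account blasts 80,000 messages:
quiet_month = [500, 520, 480, 510] * 7 + [505, 495]
print(volume_anomaly(quiet_month, 80_000))  # True -> investigate
```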
There you go
Based on my prompt, and with some light (okay, heavy) editing, I think I've gotten something useful out of ChatGPT here today. I actually found this interesting myself (and I hope you did, too). I was just playing with ChatGPT, not intending to make a blog post of this, but it turned out to be a great way to think about, re-frame, and get feedback on some thoughts I've had about where we're headed, with a particular focus on where it can all go wrong.
Especially when it comes to not forgetting the past: the "legacy" signals that help discern spam from wanted mail, and legitimate from illegitimate. A.I. will likely bring iterative improvements, but we're not yet at the point where we should throw out all of the existing validity and reputation checks.