Catching Spam by Looking at Traffic, Not Content
AngryDad writes "HexView has proposed a method to deal with spam without scanning actual message bodies. The method is based solely on traffic analysis. They call it STP (Source Trust Prediction). A server, much like a real-time spam blacklist, collects SMTP session source and destination addresses from participating Mail Transfer Agents (MTAs) and applies statistics to identify spam-like traffic patterns. A credibility score is returned to the MTA, so it can throttle down or drop possibly unwanted traffic. While I find it questionable, the method might be useful when combined with traditional keyword analysis." What do you think? Is this snake oil, or is there something to this?
I was going to say... What would happen if we all started replying with the same auto-generated mails? How would the spammers tell them apart from legit replies to their spam?
The new breed of zombies has wised up to port 25 blocking / throttling and likes to funnel everything through the MTA for the domain to which they are connected.
A combination of policyd, Postfix, SpamAssassin and IDS/bandwidth accounting software has turned it into something manageable, at least where I work. Customers are allowed, say, 100 e-mails in a 30-minute time span. If they complain and have a real reason, we can adjust. This also makes finding users with pwned machines a lot easier.
Some of the spam zombies now seem to be moderating their outgoing connections so that it's not so obvious, but their volume is still substantial. It just never ends...
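For anyone curious what a "100 e-mails in 30 minutes" cap looks like in practice, here is a minimal sketch of a sliding-window counter. It is not how policyd implements it; the class and method names are made up for illustration, and timestamps are passed in explicitly to keep the example easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-customer cap: at most max_messages per window_seconds."""

    def __init__(self, max_messages=100, window_seconds=30 * 60):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # customer id -> timestamps of messages accepted inside the window
        self._sent = defaultdict(deque)

    def allow(self, customer, now):
        """Return True and record the message if the customer is under the cap."""
        sent = self._sent[customer]
        # Forget messages that have aged out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_messages:
            return False
        sent.append(now)
        return True
```

A customer who hits the cap over and over is a good candidate for a closer look, which fits the point above about finding pwned machines. A zombie that paces itself just under the cap would slip past a check like this one.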
... and it's not dissimilar to greylisting from what I can tell, but I don't think it's going to be
effective in the long term. Getting around this type of filter (or delay) seems relatively simple
compared to the task of defeating the Bayesian filters over the past couple of years.
The linchpin of greylisting is that legitimate mail will "try again" after being temporarily
rejected by the server, while spam will not. The conclusion (which we hope is true) is that any
mail that is not re-sent was in fact spam. Never mind the danger that the assumption could be
false and legitimate mail gets lost -- how long will it be before spammers simply retry their
spam -- or worse -- just send everything twice?
As with any attempt to modify behavior electronically -- behavior usually wins.
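To make the greylisting logic concrete, here is a rough sketch under the usual assumption that the server keys on the (client IP, sender, recipient) triplet and defers the first attempt. The class name, the delay values and the "defer"/"accept" return strings are all made up for illustration; real implementations also whitelist senders that pass, which is left out here.

```python
class Greylist:
    """Hypothetical greylist keyed on (client IP, envelope sender, recipient)."""

    def __init__(self, min_delay=300, max_age=4 * 3600):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.max_age = max_age      # seconds after which a stale record restarts
        self._first_seen = {}       # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient, now):
        """Return "defer" (temporary rejection) or "accept"."""
        key = (client_ip, sender, recipient)
        first = self._first_seen.get(key)
        if first is None or now - first > self.max_age:
            # Unknown or stale triplet: remember it and ask the sender to retry.
            self._first_seen[key] = now
            return "defer"
        if now - first < self.min_delay:
            # Retried too quickly; keep deferring.
            return "defer"
        return "accept"
```

Note that anything that retries after min_delay gets "accept", whether it is a real MTA or a spambot that has learned to retry, which is exactly the weakness described above.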
Yes, traffic shaping is effective in determining the nature of connections.
I work for a small email company; we process millions of emails an hour inbound, but only a few million a day outbound.
Our most effective filters are:
Connect/HELO restrictions: you can only get email into the environment if your IP address resolves to an FQDN.
HELO restrictions: if you connect using X different HELO strings, you are blacklisted. Spambots often randomize their HELOs, and this blocks those.
SpamAssassin at the client side, filtering email into various folders based on the score.
An antivirus server that filters the few viruses that make it in; phishing is filtered too.
The problem? All this doesn't catch enough of the spam. We still have loads of CPU dedicated to filtering spam, but something like this technique at the border will help, and I'll predict (based on experience watching the traffic and spam filtering graphs) that we could cut spam another 30% just by watching the curves and tightening the restrictions during those peaks.
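The HELO restriction described above can be sketched roughly as follows. This is an illustration, not our production code: the class name is made up, max_distinct stands in for the unspecified "X", and I am assuming an address is blacklisted once it has used more than max_distinct different HELO strings.

```python
from collections import defaultdict


class HeloTracker:
    """Hypothetical tracker that blacklists IPs using too many HELO strings."""

    def __init__(self, max_distinct=3):
        self.max_distinct = max_distinct  # stands in for "X"; value is assumed
        self._helos = defaultdict(set)    # client IP -> distinct HELO strings seen
        self.blacklist = set()

    def observe(self, client_ip, helo):
        """Record a HELO; return False if the client is (now) blacklisted."""
        if client_ip in self.blacklist:
            return False
        names = self._helos[client_ip]
        # Hostnames are case-insensitive, so normalize before counting.
        names.add(helo.strip().lower())
        if len(names) > self.max_distinct:
            self.blacklist.add(client_ip)
            return False
        return True
```

A real deployment would presumably expire old entries as well, so that a legitimate server that gets renamed a few times is not blocked forever.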
My (previous) ISP did this several years ago. I found out when I was building a computer for a friend. At the time I didn't yet know just how quickly an unprotected Windows box gets owned by viruses. I thought I'd be okay for the time it takes to download a firewall. 20 seconds later I got a popup that I recognized as an infection, so I shut down the machine and tried to get the firewall / AV software with my other machine instead - only to be greeted by a screen where my ISP informed me that "By the look of your outgoing traffic, it would seem that your machine has been turned into a spam-bot by a virus, and your account will be automatically unblocked 1 hour after the suspicious traffic stops." This was followed by some generic instructions for virus removal.