Catching Spam by Looking at Traffic, Not Content
AngryDad writes "HexView has proposed a method to deal with spam without scanning actual message bodies. The method is based solely on traffic analysis. They call it STP (Source Trust Prediction). A server, like a Real-time Spam Black list, collects SMTP session source and destination addresses from participating Mail Transfer Agents (MTAs) and applies statistics to identify spam-like traffic patterns. A credibility score is returned to the MTA, so it can throttle down or drop possibly unwanted traffic. While I find it questionable, the method might be useful when combined with traditional keyword analysis." What do you think? Is this snake oil, or is there something to this?
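For anyone trying to picture the mechanism in the summary, here is a minimal Python sketch of a traffic-only scorer. It is not HexView's algorithm, which the summary does not spell out. The names `TrustServer`, `report`, `credibility`, `mta_action`, the fan-out heuristic (more distinct recipients per source means less trust), and all thresholds are assumptions made up for illustration.

```python
class TrustServer:
    """Toy central scorer: sees only (source, destination) pairs, never message bodies."""

    def __init__(self, fanout_scale=50.0):
        # fanout_scale is a hypothetical tuning knob, not part of the STP proposal.
        self.fanout_scale = fanout_scale
        self.destinations = {}  # source IP -> set of distinct recipient addresses

    def report(self, source_ip, recipient):
        """Called by a participating MTA for each SMTP session it handles."""
        self.destinations.setdefault(source_ip, set()).add(recipient.lower())

    def credibility(self, source_ip):
        """Return a score in (0, 1]; an unknown source scores 1.0 (assumed default)."""
        fanout = len(self.destinations.get(source_ip, ()))
        return 1.0 / (1.0 + fanout / self.fanout_scale)


def mta_action(score, accept_at=0.7, throttle_at=0.3):
    """Map a credibility score to what the MTA does; the cutoffs are illustrative."""
    if score >= accept_at:
        return "accept"
    if score >= throttle_at:
        return "throttle"
    return "drop"


if __name__ == "__main__":
    server = TrustServer()
    # 192.0.2.1 is a documentation-only address; the recipients are invented.
    for i in range(500):
        server.report("192.0.2.1", f"user{i}@example.com")
    print(mta_action(server.credibility("192.0.2.1")))
```

A heuristic this crude would also punish a legitimate mailing list, which is exactly the objection raised about the proposal; a real system would need more signals than raw fan-out.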
For everyone screaming that this isn't feasible, will kill mailing lists, and will otherwise render effective communication via SMTP impossible: you might want to consider that about a quarter of global email volume is already flowing through a system very much like what the OP describes.
Ironport (recently purchased by Cisco for $830 million US) has been providing this kind of service to large providers for several years.
Their statistics site is publicly viewable, but using their stats requires a subscription fee.
http://www.senderbase.org/
It's interesting to look at how well or poorly the MTAs you use are scored. All of the stats are gathered by the systems they sell to ISPs and enterprise customers. These boxes perform the spam filtering for that organization's customers and provide statistical data back to senderbase.org, which allows all Ironport customers to "know" about problems seen by all other Ironport customers.
The link to their PDF on their metrics is here:
http://ironport.com/pdf/ironport_wp_reputation_based_control.pdf
We evaluated their system last year as a possible replacement for a third-party spam/virus scanning provider and may end up purchasing their equipment once everything with the Cisco purchase shakes out. Their solution, while not perfect, behaves far better than some of the things that large service providers *coughAOLcough* have tried, and is (or was when we tested) comparable to most content-based scanning systems in terms of spam filtering, with a lower rate of false positives.