Researchers Build TCP-Based Spam Detection
itwbennett writes "In a presentation at the Usenix LISA conference in Boston, researchers from the Naval Academy showed that signal analysis of factors such as timing, packet reordering, congestion and flow control can reveal the work of a spam-spewing botnet. The work 'advanced both the science of spam fighting and ... worked through all the engineering challenges of getting these techniques built into the most popular open-source spam filter,' said MIT computer science research affiliate Steve Bauer, who was not involved with the work. 'So this is both a clever bit of research and genuinely practical contribution to the persistent problem of fighting spam.'"
People are looking at the wrong end of the problem with much of their efforts - and this is just another example of that. You cannot solve spam with filtering, detection, or legislative actions. We've seen time and time again that those are just time and money-sucking stopgap measures that ignore the reality of the situation.
We won't see a real solution to the spam epidemic until people acknowledge the simple truth that spam is an economic problem. There is still a lot of money to be made by sending out spam, with very little expense for the spammer. The profit margin is high enough that it is well worth their while to find various ways around filters and any other silly mechanisms we throw at them.
If you want to make an actual difference in the fight against spam, you need to approach the economic motivations behind it. If you stop the flow of money to the spammers, you will stop the spam as well. Because no matter how much some people may want to believe otherwise, spam isn't sent just to piss you off and ruin your day. Spam is sent out because spammers are paid to do so. If they don't get paid, they won't send spam; it is as simple as that. Any other kind of countermeasure only prolongs the fight and throws more money in the wrong direction.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
Even if the spam click-through rate is 0.0%, there are still enough suckers born every minute to buy the services of spammers.
The best way to fight spam is to do what IM systems have been doing: whitelisting. So, the first email triggers a whitelist query, and the rest will be invisible... Maybe do this on a per-IP or per-domain basis...
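A rough sketch of how that first-contact whitelist might work, in Python. This is only one reading of the idea: the class name, the "query"/"hold"/"deliver" results, and the choice to hold later messages until the sender is approved are all assumptions made for illustration, not how any particular IM or mail system does it.

```python
class FirstContactWhitelist:
    """Hypothetical first-contact whitelist keyed by address or by domain."""

    def __init__(self, per_domain=False):
        self.per_domain = per_domain
        self.approved = set()   # keys the recipient has whitelisted
        self.pending = {}       # key -> messages held while awaiting approval

    def key_for(self, sender):
        sender = sender.strip().lower()
        # Per-domain mode uses everything after the last "@" as the key.
        return sender.rpartition("@")[2] if self.per_domain else sender

    def receive(self, sender, message):
        key = self.key_for(sender)
        if key in self.approved:
            return "deliver"
        first_contact = key not in self.pending
        self.pending.setdefault(key, []).append(message)
        # Only the first message produces a visible whitelist query;
        # everything after it is held silently.
        return "query" if first_contact else "hold"

    def approve(self, sender):
        key = self.key_for(sender)
        self.approved.add(key)
        # Release whatever was held for this key.
        return self.pending.pop(key, [])
```

Calling approve() releases the held mail, so nothing from a legitimate sender is lost while the query is outstanding. Per-IP keying would look the same with the connecting address as the key.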
I'm sure 'itwbennett' would rather everyone go to his employer's website to read that article, but it is clearly not written (or edited) by anyone who has any basic clues about spam-fighting. Just reading the subtitle makes me cringe for the unfortunate "journalists" lassoed into writing it, as it was clearly done by spam neophytes in a desperate scramble for click-scrounging content. The article is vaguely about a paper presented almost a year ago at LISA '11. There are links to an abstract and the original paper at the LISA '11 site: http://www.usenix.org/events/lisa11/tech/
The general space of sniffing out spam by looking at TCP characteristics has been usefully mined for years: Symantec and MailChannels both offer proprietary tools that use such techniques, and some open DNSBLs use TCP sniffing to identify sources. But it would be incorrect to believe that any one methodology will ever be a magical silver bullet against spam.
This REALLY sounds like a copy of Sendmail Inc.'s Rate Control component, which has been deployed to many sites for the last several years. Rate Control allows the admin to throttle or otherwise block email that breaks various TCP-related thresholds (messages/second, bad recipients/second, connections/second, etc.). Further, recent real world indications show that spammers are sending fewer spams per second from individual IP addresses--they make up the volume by increasing the size of the botnet, and coordinating activity so that not too many bots hit the same relay at the same time. This is why Rate Control added an IP Reputation subcomponent a couple of years ago.
It appears these Navy guys have simply come up with a tool that has already existed for years.
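For anyone who hasn't seen this kind of throttling, the per-IP thresholds mentioned above (connections/second and so on) boil down to something like the sliding-window counter below. This is a generic Python sketch, not Sendmail's Rate Control code; the class name, parameters, and limits are made up for illustration.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Hypothetical per-IP sliding-window limiter (not Sendmail's implementation)."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.events = defaultdict(deque)  # ip -> timestamps of accepted events

    def allow(self, ip, now):
        timestamps = self.events[ip]
        # Forget events that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_events:
            return False  # over the threshold: throttle or block
        # Only accepted events are recorded; rejected ones do not extend the penalty.
        timestamps.append(now)
        return True
```

A botnet that spreads its sending over many addresses stays under a per-IP limit like this one, which is the weakness described above and the reason an IP reputation check is a useful complement.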
As far as being a solution to spam, I agree that spam is 99% a financial problem. The problem with attacking it as such is that one tends to also hurt legitimate endeavors. If all the advertising were removed from the Internet, there would not be much of a non-commercial Internet--the advertising tends to keep many things free or very cheap. Also, education is great, but as soon as you teach one person to not fall prey to spam, there's another person born who will fall prey. Thus you need to do many things in concert to fight spam--educate, identify, legislate, prosecute. The closer to the front end we can identify spam, the more cheaply we can block or redirect it. Why redirect it? For prosecution and legislation reasons. If you can identify where the money goes, this evidence becomes important to justify cutting off the ability of that spammer to get funds--credit bureaus, etc.
I've always wondered how seemingly smart people can act so stupidly, totally oblivious to the repercussions of their actions.
What happens when a busy computer, whose load makes it naturally behave in a manner similar to a botnet zombie, sends an email and that message is then flagged as spam?
Spammers are no fools or dinosaurs. They will simply lower the spamming rate of their zombie clients below the threshold needed to induce the effects that trigger the detection scheme.
End result as always is the same:
It won't stop anyone from spamming
It WILL make SMTP based Email even more unreliable than it currently is.
While 95% accuracy at detecting spam may sound like "wow", it's actually a very low rate. Simply using correctly configured greylisting gives an accuracy in the 99% range. So I doubt this technique really improves anything, but it will allow them to say 'we did it another way'. Given that more and more spam comes from official mail relays, accuracy will only increase by analysing the body of the mail.
I gave up with the idea of an useful sig...
First, we've known for many years that IP-level techniques can deal with a lot of spam. For example, using the Spamhaus "DROP" list in perimeter devices is so incredibly effective that anyone who isn't doing it may summarily be declared incompetent. As another example, perhaps more germane to this paper, see http://use.perl.org/~merlyn/journal/17094 -- which demonstrates how to use passive OS fingerprinting in the BSD pf firewall to throttle traffic from Windows systems. (I presume everyone is well aware that bots are nearly always hosted on Windows systems; my own research indicates that despite inroads by attackers into non-Windows hosts, the probability that any given bot will be found to be on a Windows system is still comfortably above 99.999%.) The technique shown by poster "merlyn" in that example from 2004 can readily be extended and combined with others.
Second, 95% true positive rate is impressive for a single measure, BUT we must also consider the false positive rate, and we have to consider the resource cost necessary to achieve this number. Frankly, doing this inside SpamAssassin is very inefficient -- this is a function that can be handled either in the firewall or in the MTA, or perhaps in a combination of the two. There's really no need to invoke something as heavyweight, slow and complex as SA. (Nor is this desirable: the more complex the anti-spam architecture, the more difficult it is to tune properly and the more susceptible it is to gaming.)
Here's the TL;DR version: if a host passive-OS-fingerprints as Windows then it's suspect. If it does that AND (lacks rDNS OR has generic rDNS) it's a bot.
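To make the TL;DR rule concrete, here it is as a few lines of Python. Treat it as a sketch only: it assumes you already have an OS guess from something like p0f and the result of a reverse DNS lookup, and the regex for "generic" rDNS is a crude illustrative pattern, not a vetted list.

```python
import re

# Illustrative guess at what "generic" rDNS looks like: an embedded IPv4
# address, or a label such as dsl/dhcp/dynamic/pool (optionally numbered).
GENERIC_RDNS = re.compile(
    r"(?:\d{1,3}[-.]){3}\d{1,3}"
    r"|(?:^|[-.])(?:adsl|dsl|dhcp|dyn|dynamic|dialup|pool|ppp|cable)\d*(?:[-.]|$)",
    re.IGNORECASE,
)


def classify(os_guess, rdns):
    """Return "ok", "suspect" or "bot" for a connecting host.

    os_guess: passive OS fingerprint label (e.g. from p0f), or None if unknown.
    rdns:     reverse DNS name of the host, or None/"" if it has none.
    """
    is_windows = bool(os_guess) and os_guess.lower().startswith("windows")
    if not is_windows:
        return "ok"
    if not rdns or GENERIC_RDNS.search(rdns):
        return "bot"      # Windows AND (no rDNS OR generic rDNS)
    return "suspect"      # Windows with a real-looking hostname
```

What you do with each verdict (reject, throttle, greylist) is a policy decision for the firewall or MTA.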
of all spam comes from dynamic addresses. Their method (95%) is worse than simply rejecting all email from dynamic IPs. I find greylisting dynamics for 36 hours and statics for an hour filters over 99% of spam. If one gets through, I just blacklist the IP.
I've been doing this for years.
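For the curious, the two-tier greylisting described above comes down to roughly this. It is a minimal Python sketch: the class and its in-memory table are hypothetical, and how an address gets classified as dynamic (rDNS patterns, a DNSBL, etc.) is left to the caller.

```python
DYNAMIC_DELAY = 36 * 3600  # seconds a dynamic address must wait before retrying
STATIC_DELAY = 1 * 3600    # seconds a static address must wait before retrying


class Greylist:
    """Hypothetical greylist keyed on the usual (ip, sender, recipient) triplet."""

    def __init__(self):
        self.first_seen = {}  # triplet -> time of the first delivery attempt

    def check(self, ip, sender, recipient, is_dynamic, now):
        key = (ip, sender, recipient)
        delay = DYNAMIC_DELAY if is_dynamic else STATIC_DELAY
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        # Temp-fail (SMTP 4xx) until the triplet has aged past its delay.
        return "accept" if now - first >= delay else "tempfail"
```

A real deployment would also expire old triplets and keep the table somewhere persistent.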
I use p0f to detect connections coming from Windows and greylist them. Very little genuine mail comes from Windows-based mail servers.
I find there is little point greylisting mail from Unix machines, as very little spam comes from them.