Researchers Build TCP-Based Spam Detection
itwbennett writes "In a presentation at the Usenix LISA conference in Boston, researchers from the Naval Academy showed that signal analysis of factors such as timing, packet reordering, congestion and flow control can reveal the work of a spam-spewing botnet. The work 'advanced both the science of spam fighting and ... worked through all the engineering challenges of getting these techniques built into the most popular open-source spam filter,' said MIT computer science research affiliate Steve Bauer, who was not involved with the work. 'So this is both a clever bit of research and genuinely practical contribution to the persistent problem of fighting spam.'"
People are looking at the wrong end of the problem with much of their efforts - and this is just another example of that. You cannot solve spam with filtering, detection, or legislative actions. We've seen time and time again that those are just time and money-sucking stopgap measures that ignore the reality of the situation.
We won't see a real solution to the spam epidemic until people acknowledge the simple truth that spam is an economic problem. There is still a lot of money to be made by sending out spam, with very little expense for the spammer. The profit margin is high enough that it is well worth their while to find various ways around filters and any other silly mechanisms we throw at them.
If you want to make an actual difference in the fight against spam, you need to approach the economic motivations behind it. If you stop the flow of money to the spammers, you will stop the spam as well. Because no matter how much some people may want to believe otherwise, spam isn't sent just to piss you off and ruin your day. Spam is sent out because spammers are paid to do so. If they don't get paid, they won't send spam; it is as simple as that. Any other kind of countermeasure only prolongs the fight and throws more money in the wrong direction.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
Even if the spam click-through rate is 0.0%, there are still enough suckers born every minute to buy the service of spammers.
It works great, until the next bot net or spam cannon iteration.
This seems to be a losing battle. The amount of processing power used to detect or prevent spam is already very high, and these increasingly complex detection schemes are just increasing the required processing cycles at an exponential rate. For the spammer, when one system becomes ineffective, they abandon it and move to the next, staying clean and lean. But the detection system must continue to hold on to the old detection scheme for a very long time or forever, because there are always a few spammers who continue to try the old ways.
The best way to fight spam is to do what IM systems have been doing: whitelisting. So, the 1st email triggers a whitelist query, and the rest will be invisible... Maybe do this on a per-IP or per-domain basis...
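For what it's worth, here is a rough sketch of that first-contact whitelist idea, keyed per domain. Everything in it (the class name `FirstContactWhitelist`, the return strings, the in-memory storage) is made up for illustration; it is not how any particular IM or mail system does it.

```python
class FirstContactWhitelist:
    """Hypothetical per-domain whitelist: the first message from an
    unknown domain prompts the user, later ones are held silently."""

    def __init__(self):
        self.approved = set()   # domains the user has accepted
        self.pending = {}       # domain -> messages waiting for a decision

    def receive(self, sender_domain, message):
        if sender_domain in self.approved:
            return "deliver"
        queue = self.pending.setdefault(sender_domain, [])
        queue.append(message)
        # Only the first message from a domain triggers a query.
        return "ask_user" if len(queue) == 1 else "hold"

    def approve(self, sender_domain):
        """User said yes: whitelist the domain and release held mail."""
        self.approved.add(sender_domain)
        return self.pending.pop(sender_domain, [])
```

Swapping the key from domain to IP address is a one-line change, which is the "per-IP or per-domain" choice mentioned above.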
I'm sure 'itwbennett' would rather everyone go to his employer's website to read that article, but it is clearly not written (or edited) by anyone who has any basic clues about spam-fighting. Just reading the subtitle makes me cringe for the unfortunate "journalists" lassoed into writing it, as it was clearly done by spam neophytes in a desperate scramble for click-scrounging content. The article is vaguely about a paper presented almost a year ago at LISA '11. There are links to an abstract and the original paper at the LISA '11 site: http://www.usenix.org/events/lisa11/tech/
The general space of sniffing out spam by looking at TCP characteristics has been usefully mined for years, with Symantec and MailChannels both offering proprietary tools that use such techniques and some open DNSBLs using TCP sniffing to identify sources. But it would be incorrect to believe that any one methodology will ever be a magical silver bullet against spam.
This REALLY sounds like a copy of Sendmail Inc.'s Rate Control component, which has been deployed to many sites for the last several years. Rate Control allows the admin to throttle or otherwise block email that breaks various TCP-related thresholds (messages/second, bad recipients/second, connections/second, etc.). Further, recent real world indications show that spammers are sending fewer spams per second from individual IP addresses--they make up the volume by increasing the size of the botnet, and coordinating activity so that not too many bots hit the same relay at the same time. This is why Rate Control added an IP Reputation subcomponent a couple of years ago.
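To make the threshold idea concrete, here is a minimal sliding-window limiter in Python. It is only a sketch of per-IP rate throttling in general, not Sendmail's Rate Control; the class name `RateLimiter` and its parameters are assumptions for the example.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Hypothetical per-IP limiter: at most max_events per window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.events = defaultdict(deque)  # ip -> timestamps of accepted events

    def allow(self, ip, now):
        q = self.events[ip]
        # Forget events that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_events:
            return False  # over threshold: throttle or block
        q.append(now)
        return True
```

One instance could be kept per metric (messages, bad recipients, connections). Note that a botnet spreading its volume across many IPs stays under any per-IP threshold, which is the weakness described above.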
It appears these Navy guys have simply come up with a tool that has already existed for years.
As far as being a solution to spam, I agree that spam is 99% a financial problem. The problem with attacking it as such is that one tends to also hurt legitimate endeavors. If all the advertising were removed from the Internet, there would not be much of a non-commercial Internet--the advertising tends to keep many things free or very cheap. Also, education is great, but as soon as you teach one person to not fall prey to spam, there's another person born who will fall prey. Thus you need to do many things in concert to fight spam--educate, identify, legislate, prosecute. The closer to the front end we can identify spam, the more cheaply we can block or redirect it. Why redirect it? For prosecution and legislation reasons. If you can identify where the money goes, this evidence becomes important to justify cutting off the ability of that spammer to get funds--credit bureaus, etc.
I've always wondered how seemingly smart people can act so stupidly, totally oblivious to the repercussions of their actions.
What happens when a busy computer, whose load naturally makes it behave in a manner similar to a botnet zombie, sends an email and that message is then flagged as spam?
Spammers are no fools or dinosaurs. They will simply adjust the spamming rate of their zombie clients to stay below the threshold needed to induce the effects that trigger the detection scheme.
End result as always is the same:
It won't stop anyone from spamming
It WILL make SMTP based Email even more unreliable than it currently is.
Your post advocates a
(x) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
(x) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
(x) It will stop spam for two weeks and then we'll be stuck with it
( ) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
(x) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
(x) Eternal arms race involved in all filtering approaches
(x) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
(x) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(x) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
While 95% accuracy at detecting spam may sound like "wow", it's a very low rate. Simply using correctly configured greylisting gives an accuracy in the 99% range. So I doubt this technique really improves anything, but it will allow them to say 'we did it another way'. Given that more and more spam comes from official mail relays, accuracy will only increase when analysing the body of the mail.
I gave up with the idea of a useful sig...
First, we've known for many years that IP-level techniques can deal with a lot of spam. For example, using the Spamhaus "DROP" list in perimeter devices is so incredibly effective that anyone who isn't doing it may summarily be declared incompetent. As another example, perhaps more germane to this paper, see http://use.perl.org/~merlyn/journal/17094 -- which demonstrates how to use passive OS fingerprinting in the BSD pf firewall to throttle traffic from Windows systems. (I presume everyone is well aware that bots are nearly always hosted on Windows systems; my own research indicates that despite inroads by attackers into non-Windows hosts, the probability that any given bot will be found to be on a Windows system is still comfortably above 99.999%.) The technique shown by poster "merlyn" in that example from 2004 can readily be extended and combined with others.
Second, 95% true positive rate is impressive for a single measure, BUT we must also consider the false positive rate, and we have to consider the resource cost necessary to achieve this number. Frankly, doing this inside SpamAssassin is very inefficient -- this is a function that can be handled either in the firewall or in the MTA, or perhaps in a combination of the two. There's really no need to invoke something as heavyweight, slow and complex as SA. (Nor is this desirable: the more complex the anti-spam architecture, the more difficult it is to tune properly and the more susceptible it is to gaming.)
Here's the TL;DR version: if a host passive-OS-fingerprints as Windows then it's suspect. If it does that AND (lacks rDNS OR has generic rDNS) it's a bot.
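The TL;DR rule can be written down in a few lines. This is a sketch only: it assumes something like a passive fingerprinting tool has already produced an OS guess string, and the "generic rDNS" pattern below is a crude guess of mine, not a vetted list.

```python
import re

# Crude, assumed pattern for "generic" reverse DNS: an embedded IP address
# or a typical access-network keyword. Real lists are longer and more careful.
GENERIC_RDNS = re.compile(
    r"(\d{1,3}[-.]){3}\d{1,3}|dsl|dhcp|dynamic|dial|pool|cable",
    re.IGNORECASE,
)


def classify(os_guess, rdns):
    """Return 'ok', 'suspect', or 'bot' per the rule above.

    os_guess: OS string from passive fingerprinting (assumed input).
    rdns: reverse DNS name of the connecting host, or None if missing.
    """
    if "windows" not in os_guess.lower():
        return "ok"
    if not rdns or GENERIC_RDNS.search(rdns):
        return "bot"
    return "suspect"
```

A keyword match like this will misfire on some legitimate hostnames, so in practice the result is better used to throttle or greylist than to hard-reject.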
of all spam comes from dynamic addresses. Their method (95%) is worse than simply rejecting all email from dynamic IPs. I find greylisting dynamics for 36 hours and statics for an hour filters over 99% of spam. If one gets through, I just blacklist the IP.
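A sketch of that two-tier greylisting, using the 36-hour and 1-hour delays mentioned above. The class name `Greylist`, the in-memory dict, and the assumption that something else has already decided whether an IP is dynamic are all mine.

```python
HOUR = 3600


class Greylist:
    """Hypothetical greylist: temp-fail a (ip, sender, recipient) triplet
    until it has been known for long enough, then accept retries."""

    def __init__(self, dynamic_delay=36 * HOUR, static_delay=1 * HOUR):
        self.delays = {True: dynamic_delay, False: static_delay}
        self.first_seen = {}  # triplet -> time of first attempt

    def check(self, ip, sender, recipient, is_dynamic, now):
        key = (ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delays[is_dynamic]:
            return "accept"
        return "tempfail"  # the MTA would answer with a 4xx code here
```

The scheme relies on real mail servers retrying after a temporary failure while most bots do not bother. A production version would also expire old triplets.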
I've been doing this for years.
I use p0f to detect connections coming from Windows and greylist them. Very little genuine mail comes from Windows-based mail servers.
I find there is little point greylisting mail from Unix machines, as very little spam comes from them.
So basically to identify a spambot, you have to look at the network traffic?
Hmm, that's interesting. What else would there be to look at?
What I do is detect these SMTP- and TCP-level characteristics of senders in the MTA, and then write an X- header in the received message with the characteristics found during the transfer.
This header is then evaluated by SpamAssassin to assign spam points to the received message.
The advantage is that faults in the sender's SMTP implementation do not immediately lead to hard blocks, but can be valued together with other characteristics of the message.
It also means the message ultimately ends up in the spam folder instead of being rejected. Over time, it has become clear to me that no matter how spammy a message may appear to be, it can always be that valuable message that the boss really wanted to receive.
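The tag-then-score approach described above could look roughly like this. It is a Python stand-in, not an actual MTA hook or SpamAssassin rule set: the header name `X-Transfer-Traits`, the trait names, and the point values are all invented for the example.

```python
from email.message import EmailMessage

HEADER = "X-Transfer-Traits"  # hypothetical header name
# Made-up point values per trait; real scores would need tuning.
SCORES = {"no-rdns": 1.5, "early-talker": 2.0, "bad-helo": 1.0}


def tag_message(msg, traits):
    """MTA side: record what was observed during the SMTP transfer."""
    msg[HEADER] = ", ".join(sorted(traits))


def score_message(msg):
    """Filter side: turn the recorded traits into spam points."""
    raw = msg.get(HEADER, "")
    traits = [t.strip() for t in raw.split(",") if t.strip()]
    return sum(SCORES.get(t, 0.0) for t in traits)


def folder_for(msg, threshold=3.0):
    """Never reject: a high score only routes the message to Spam."""
    return "Spam" if score_message(msg) >= threshold else "Inbox"
```

Because the traits only add points, a single sloppy SMTP implementation does not get a sender blocked outright, and the message is always delivered somewhere the recipient can find it.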