Catching Spam by Looking at Traffic, Not Content

Posted by Zonk on Thursday January 25, 2007 @03:43AM from the content-sniffing-doesn't-work-anyway dept.

AngryDad writes "HexView has proposed a method to deal with spam without scanning actual message bodies. The method is based solely on traffic analysis. They call it STP (Source Trust Prediction). A server, like a Real-time Spam Black list, collects SMTP session source and destination addresses from participating Mail Transfer Agents (MTAs) and applies statistics to identify spam-like traffic patterns. A credibility score is returned to the MTA, so it can throttle down or drop possibly unwanted traffic. While I find it questionable, the method might be useful when combined with traditional keyword analysis." What do you think? Is this snake oil, or is there something to this?
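The summary does not say which statistics the STP server applies, so the following is only a rough Python sketch of the general shape: MTAs report (source, destination) pairs and the server hands back a credibility score. The `TrustServer` class, its method names, and the "distinct destinations per time window" rule are all assumptions made for illustration, not HexView's actual method.

```python
from collections import defaultdict, deque


class TrustServer:
    """Toy stand-in for a central scoring server (the scoring rule is invented)."""

    def __init__(self, window_seconds=600, fanout_limit=50):
        self.window_seconds = window_seconds
        self.fanout_limit = fanout_limit
        # source IP -> deque of (timestamp, destination domain)
        self.sessions = defaultdict(deque)

    def report(self, source_ip, dest_domain, now):
        """Called by a participating MTA for each SMTP session it sees."""
        self.sessions[source_ip].append((now, dest_domain))

    def score(self, source_ip, now):
        """Return a credibility score in [0.0, 1.0]; 1.0 means no suspicion."""
        events = self.sessions[source_ip]
        # Forget sessions that fell out of the observation window.
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()
        # Hypothetical heuristic: many distinct destinations in a short
        # window looks like bulk sending.
        fanout = len({dest for _, dest in events})
        return max(0.0, 1.0 - fanout / self.fanout_limit)
```

An MTA would then throttle or drop sessions from sources whose score falls below some threshold of its own choosing. Timestamps are passed in explicitly so the behavior is easy to test.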

20 of 265 comments

sounds good to me by seanadams.com · 2007-01-25 03:44 · Score: 5, Insightful

I realize most of us here would ordinarily prefer for our ISPs to just move bits around, but it seems like they are in a pretty good position to curb spam if they were to start looking at traffic patterns like this. If some DSL customer suddenly starts opening hundreds of outgoing SMTP connections, that would be a pretty reliable sign that his machine is pwned. Just block or throttle port 25, and send the customer an email telling him to fix his computer, and keep it blocked until he does - or he contacts abuse@ with a legitimate explanation. Not filtering based on the contents of the data should let them maintain plausible deniability and common carrier status.

We can't do this on our personal or company internet connections because we only see individual messages coming from many different IPs, but on the other end of the connection, or even at the backbone level, this strikes me as a pretty solid solution. They could even just tag the packets with the evil bit and let us decide if we want to filter them or not.
1. Re:sounds good to me by GreggBz · 2007-01-25 04:07 · Score: 4, Interesting
  
  The new bread of zombies have wised up to port 25 blocking / throttling and like to funnel everything through the MTA for the domain to which they are connected.
  
  A combination of policyd, Postfix, SpamAssassin, and IDS/bandwidth accounting software has turned it into something manageable, at least where I work. Customers are allowed, say, 100 e-mails in a 30 minute time span. If they complain and have a real reason, we can adjust. This also makes finding users with pwned machines a lot easier.
  
  Some of them now (the spam zombies) seem to be moderating their outgoing connections so that it's not so obvious but their volume is still substantial. It just never ends...
2. Re:sounds good to me by kripkenstein · 2007-01-25 04:51 · Score: 4, Insightful
  
  Sounds good? Don't major email providers already do something like this? What else are Google doing when lots of people click on "This is Spam" for a particular email - surely they notice such things? The same should be true of email traffic patterns. Yet, perhaps some minor detail in TFA is the new bit. Obviously any improvement in this area is welcome.
  
  While this will not stop spam, it will be reduced dramatically. The STP value of a spam source will grow proportionally to the number of junk messages sent. The first several thousands emails will get to unlucky recipients when spamming starts, but the rest hundreds of thousands will not.
  Actually, webmail can do one better: if a message is marked as spam at some point in time, the system can retroactively remove it from the Inboxes of the 'first few thousand unlucky recipients' (or mark it 'this may be spam', gray it out, etc., at the least). I don't know of anyone doing this, but I wish they would.
3. Re:sounds good to me by GreggBz · 2007-01-25 07:28 · Score: 4, Funny
  
  It's blue! It's moldy! It's The Night of the Living Bread.
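The per-customer cap GreggBz mentions in this thread (say, 100 e-mails in a 30 minute span) can be pictured as a sliding-window counter. This is a minimal Python sketch of that idea only; it is not how policyd or Postfix is actually configured, and the `SendLimiter` name and its interface are made up for illustration.

```python
from collections import defaultdict, deque


class SendLimiter:
    """Sliding-window cap on messages per customer (illustrative only)."""

    def __init__(self, max_messages=100, window_seconds=30 * 60):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # customer id -> timestamps of messages accepted inside the window
        self.sent = defaultdict(deque)

    def allow(self, customer, now):
        """Return True and record the message if the customer is under the cap."""
        stamps = self.sent[customer]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_messages:
            return False
        stamps.append(now)
        return True
```

Customers who keep hitting the cap are the ones worth checking for a pwned machine, and a customer with a real reason can simply be given a larger `max_messages`.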
Re:This is painfully obvious and hopelessly naive by jimicus · 2007-01-25 03:50 · Score: 5, Funny

As soon as you've found a way to get that message through effectively to 100% of the population, do let us know.
I'll never stop by diskofish · 2007-01-25 03:51 · Score: 5, Funny

Where else would I get my Viagra from?
1. Re:I'll never stop by El_Muerte_TDS · 2007-01-25 05:13 · Score: 4, Funny
  
  You shouldn't. Impotence is nature's signal that you are not fit for reproduction. Your reproduction will only result in more people responding to spam, which is of course a bad thing.
  
  So do the world a favor... please...
unlikely indicators by Speare · 2007-01-25 03:51 · Score: 4, Insightful

I think the question raises an interesting point: spams *behave* differently on the network than most legitimate emails. It may not be a perfect discriminator, but it sure might be a corroborative scoring aid. This reminded me of the controversy when Slashdot started using text compressibility as a metric for "lameness." I was a disbeliever, and still have my reservations about it, but as a part of the overall toolbox for filtering lameness, the technique seems to have value.
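Text compressibility as a scoring signal is easy to picture: highly repetitive text shrinks far more under a general-purpose compressor than ordinary prose does. Below is a minimal Python sketch using the standard `zlib` module; the function name is made up, and nothing here is Slashdot's actual filter code.

```python
import zlib


def compression_ratio(text):
    """Compressed size divided by raw size; lower means more repetitive."""
    raw = text.encode("utf-8")
    if not raw:
        return 1.0  # nothing to measure, treat as incompressible
    return len(zlib.compress(raw)) / len(raw)
```

On its own the ratio proves nothing (short posts barely compress at all), which fits the point that it works as one corroborating score among several rather than as a discriminator.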

--
[ .sig file not found ]
The problem with this by wiredog · 2007-01-25 03:54 · Score: 5, Insightful

Mailing lists. How does it not tag a server that sends out mail to a list as a spammer?

--

Best Slashdot Co
Re:This is painfully obvious and hopelessly naive by the dark hero · 2007-01-25 03:57 · Score: 4, Insightful

That's the problem. This world is full of stupid people. They might not make money off of most people the spam gets to, but if you cast a big enough net you're bound to catch something (including some dolphins). Millions of pennies still add up to thousands of dollars.

--
You constantly struggle for self improvement - and it shows.
Hooray for bad Engrish on fortune cookies
Re:This is painfully obvious and hopelessly naive by Grey Ninja · 2007-01-25 03:59 · Score: 5, Funny

We could try mass mailing them. I've had some success with that in the past. =)
Obligatory by teslar · 2007-01-25 04:00 · Score: 4, Funny

Your post advocates a

(x) technical ( ) legislative ( ) market-based ( ) vigilante

approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

( ) Spammers can easily use it to harvest email addresses
(x) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
(x) It is defenseless against brute force attacks
(x) It will stop spam for two weeks and then we'll be stuck with it
( ) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business

Specifically, your plan fails to account for

( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
(x) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook

and the following philosophical objections may also apply:

( ) Ideas similar to yours are easy to come up with, yet none have ever
been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
(x) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
(x) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough

Furthermore, this is what I think about you:

(x) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your
house down!
OPPOTUNITY. == DISCRETION REQUIRED == by Anonymous Coward · 2007-01-25 04:03 · Score: 5, Funny

SIR,

OUR TECHNOLOGY DEPARTMENT HAS COME UP WITH A GREAT OPPUTUNITY TO STOP ALL YOUR SPAM. THIS TECHNOLOGY IS CALLED source Trust Prediction (STP). IT WORKS BASED ON identifying patterns and trends in real time AND IN THIS WAY PREVENT SPAM. HOWEVER TO MAKE PROFIT FROM THIS NEW TECHNOLOGYY WE NEED TO DO A PATENT APPLICATION. YOUR NAME CAME FORWARD AS AN EXCELLENT INVESTOR FOR THIS. WITH THE CURRENT RISE OF SPAM THIS TECH WILL BE REQUIRED QUICKLY BY A LOT OF PEOPLE.

I am only contacting you as a foreigner, I will use my influence to
effect legal approvals and onward transfer into your account At the
conclusion of this business, you will be given 50% of the total
PROFITS, 50% will be for me and my family AFTER DEDUCTION OF THE PATENT COSTS
. I await to hear from you.

Yours truly,

Mr.Barry Leoard.

FNB OF SOUTH AFRICA
THIS
IS MY PRIVATE EMAIL ADDRESS, YOU CAN SEND YOUR REPLY HERE:-
barryleonard@walla.com
Re:This is painfully obvious and hopelessly naive by Pontus_Pih · 2007-01-25 04:05 · Score: 4, Interesting

I was going to say... What would happen if we all started replying with the same auto generated mails? How would the spammers tell the difference from legit spam replies?
It's not snake oil, but... by popo · 2007-01-25 04:10 · Score: 4, Interesting

... and it's not dissimilar from greylisting from what I can tell, but I don't think it's going to be
effective in the long term. Getting around this type of filter (or delay) seems relatively simple
compared to the task of defeating the Bayesian filters over the past couple years.

The lynchpin of greylisting is that legitimate mail will "try again" after being returned by the
server, while spam will not. The conclusion (which we hope is true) is that any mail that is
not re-sent was in fact spam. Never mind the danger that the assumption could be false and
legitimate mail gets lost -- how long will it be before spammers simply "retry" their spam --
or worse -- just send everything twice?

As with any attempt to modify behavior electronically -- behavior usually wins.
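For anyone who hasn't met greylisting, the mechanism fits in a few lines. A minimal Python sketch, assuming the usual (client IP, sender, recipient) triplet and a made-up `Greylist` class; real implementations also expire old entries and whitelist known senders.

```python
class Greylist:
    """Defer the first delivery attempt of an unseen triplet (illustrative)."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay
        # (client IP, sender, recipient) -> time of the first attempt
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now):
        """Return "accept" or "defer" for one delivery attempt."""
        triplet = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "accept"
        # A real server would answer with a temporary (4xx) SMTP failure here.
        return "defer"
```

The weakness described above is visible in the sketch: any sender that retries after `min_delay`, spammer or not, gets "accept".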

--
------ The best brain training is now totally free : )
Re:This is painfully obvious and hopelessly naive by KKlaus · 2007-01-25 04:13 · Score: 4, Insightful

Complaining that people are frequently bad decision makers is usually not worthwhile. Much better to recognize the truth that they are, and then work to try and take the decisions out of their hands.

It's similar to a pretty interesting conceptual innovation in medicine, when people realized that even excellent doctors will at some point make grossly negligent mistakes simply due to the sheer amount of work they do (e.g. operating on people with paralytics but not analgesics). So the innovation is to make them make fewer decisions - machines that check settings before running, labels that a four-year-old could understand, arrows and other reminders liberally applied.

So similarly here, yes it's annoying that people continue to "fund" spammers, but education is not the answer. Because, unfortunately, the spammer's target market of "everyone in the world" will always contain enough people to make their trade profitable if all we rely on is good decision making on the parts of spam recipients. So the solution has to be technical or legal. And in that regard, another small step for man here.

--
Relax I just want some peanuts.
this and other effective weapons by fifedrum · 2007-01-25 04:17 · Score: 5, Interesting

Yes, traffic shaping is effective in determining the nature of connections.

I work for a small email company; we process millions of emails an hour inbound, but only a few million a day outbound.

Our most effective filters are:

connect/HELO restrictions: you can only get email into the environment if your IP address resolves to a FQDN.

HELO restrictions: if you connect using X different HELO strings, you are blacklisted. Spambots often randomize the helos, this blocks those.

Spamassassin at the client side, filtering email into various folders based on the score.

antivirus server that filters the few viruses that make it in, and phishing is filtered too.

The problem? All this doesn't catch enough of the spam. We still have loads of CPU dedicated to filtering spam, but something like this technique at the border will help, and I'll predict (based on experience watching the traffic and spam filtering graphs) that we could cut spam another 30% just by watching the curves and tightening the restrictions during those peaks.
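The "X different HELO strings" rule above can be sketched as a per-IP set of HELO names seen so far. This is only an illustrative Python version: the post does not give the value of X, so `max_distinct=3` is a placeholder, and the `HeloTracker` name and interface are invented.

```python
from collections import defaultdict


class HeloTracker:
    """Blacklist client IPs that present too many different HELO strings."""

    def __init__(self, max_distinct=3):
        self.max_distinct = max_distinct  # the post's "X"; value is assumed
        self.helos = defaultdict(set)     # client IP -> HELO strings seen
        self.blacklist = set()

    def observe(self, client_ip, helo):
        """Record a HELO string; return True if the connection may proceed."""
        if client_ip in self.blacklist:
            return False
        self.helos[client_ip].add(helo.lower())
        if len(self.helos[client_ip]) >= self.max_distinct:
            self.blacklist.add(client_ip)
            return False
        return True
```

A legitimate MTA repeats the same HELO on every connection, so it never grows the set; a bot that randomizes its HELO trips the limit within a few connections.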
Even if no one ever responds, it won't stop by MarkusQ · 2007-01-25 04:25 · Score: 4, Insightful

Even if no one ever responds, it won't stop as long as the people paying to have it sent think it works. It's like burning candles to St. Balderdash for scam marketing morons. As long as there is a steady supply of rubes who think that sending spam is their road to riches, and are willing to pay some brighter but no more honest spam lord to send their dreck to a bazillion hapless victims for them, spam will continue to flow.
This is true even if no one ever responds to, falls for, or even opens a spam message ever again.
--MarkusQ
Has been done for a long time. by MadTinfoilHatter · 2007-01-25 04:27 · Score: 5, Interesting

My (previous) ISP did this several years ago. I found out when I was making a computer for a friend. At the time (this was a few years ago) I didn't yet know just how quickly an unprotected Windows box is owned by viruses. I thought I'd be okay for the time it takes to download a firewall. 20 seconds later I got a popup that I recognized as an infection, so I shut down the machine, and tried to get the firewall / AV-software with my other machine instead - only to be greeted by a screen where my ISP informs me that "By the look of your outgoing traffic, it would seem that your machine has been turned into a spam-bot by a virus, and your account will be automatically unblocked 1 hour after the suspicious traffic stops." This was followed by some generic instructions for virus removal.
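The "unblocked 1 hour after the suspicious traffic stops" behavior amounts to a block whose timer restarts on every new suspicious event. A minimal Python sketch of that bookkeeping, with invented names; how the ISP actually detected the traffic is not described in the post.

```python
class OutboundBlock:
    """Block an account until it has been quiet for a while (illustrative)."""

    def __init__(self, quiet_seconds=3600):
        self.quiet_seconds = quiet_seconds
        # account -> time of the most recent suspicious traffic
        self.last_suspicious = {}

    def flag(self, account, now):
        """Record suspicious outgoing traffic; this restarts the quiet timer."""
        self.last_suspicious[account] = now

    def is_blocked(self, account, now):
        """True while less than quiet_seconds have passed since the last flag."""
        last = self.last_suspicious.get(account)
        return last is not None and now - last < self.quiet_seconds
```

Because each flag resets the timer, a machine that keeps spewing stays blocked indefinitely, while one that gets cleaned up comes back on its own with no call to the helpdesk.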
Great idea, just several years late ;) by Thorizdin · 2007-01-25 05:21 · Score: 4, Informative

For everyone screaming that this isn't feasible, will kill mailing lists, and otherwise render effective communication via SMTP impossible, you might want to consider that about a quarter of global email volume is already flowing through a system very much like what the OP describes.

Ironport (recently purchased by Cisco for $830 million US) has been doing this kind of service for large providers for several years.
Their statistics site is publicly viewable, but using their stats requires a subscription fee.
http://www.senderbase.org/
It's interesting to look at how well or poorly the MTAs you use are scored. All of the stats are gathered by the systems they sell to ISPs and enterprise customers. These boxes perform the spam filtering for that organization's customers and provide statistical data back to senderbase.org, which allows all Ironport customers to "know" about problems for all other Ironport customers.

The link to their PDF on their metrics is here:
http://ironport.com/pdf/ironport_wp_reputation_based_control.pdf

We evaluated their system last year as a possible replacement for a third party spam/virus scanning provider and may end up purchasing their equipment once everything with the Cisco purchase shakes out. Their solution, while not perfect, behaves far better than some of the things that large service providers *coughAOLcough* have tried and is (or was when we tested) comparable to most of the content-based scanning systems in terms of spam filtering, with a lower rate of false positives.