Stopping Spam Before It Hits the Mail Server
Al writes "A team of researchers at the Georgia Institute of Technology say they have developed a way to catch spam before it even arrives on the mail server. Instead of bothering to analyze the contents of a spam message, their software, called SNARE (Spatio-temporal Network-level Automatic Reputation Engine), examines key aspects of individual packets of data to determine whether it might be spam. The team, led by assistant professor Nick Feamster, analyzed 2.5 million emails collected by McAfee in order to determine the key packet characteristics of spam. These include the geodesic proximity of end mail servers and the number of ports open on the sending machine. The approach catches spam 70 percent of the time, with a 0.3 false positive rate. Of course, revealing these characteristics could also allow spammers to fake their packets to avoid filtering."
I'll go first.
All spammers have to do is change the characteristics of the message. It's always going to be a cat and mouse game, just like antivirus and antispyware, so saying that they've found THE solution to blocking spam from hitting the server is slightly irresponsible.
Problem already solved back in 2003, I don't get any spam now.
Why do we need a crazily complex scheme like this when a simple entry in your router's 'Deny' list (for the source IP of the spam) has the same end effect?
Given the spew pouring out of the IP space of China, LACNIC, and Russia, blocking in such a manner appears to be near-lossless compression.
Bruce Lane, KC7GR,
Blue Feather Technologies
I figured out how to stop 100% of spam. I've disconnected my mail server from the internet. Sure, it catches a few false positives that way, but that's really the best part... the more spam I get, the lower the false positive rate!
Just like other criminals, spammers must quickly respond to what actually works. In essence this is the flaw in any "security by obscurity" scheme: the bad guys simply respond to whatever works. If you get to try several billion times a day then you can try a whole lot of combinations.
That means that in my office of 50 people, with an average of 50 emails per day (a very very low estimate), we'd get 7-8 false positives daily. I'd hear bloody murder if that was the case.
We get a lot more mail than that per day, and our spamassassin without autolearning (simply flag anything higher than 5.0) does a hell of a lot better job than that... down in the range of 1-2 false positives a month. Assuming a low daily average of emails (like my example), that's .002% false positives.
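The arithmetic behind those two figures can be checked with a short sketch. All inputs are the numbers quoted above; the 30-day month is an assumption added here for illustration.

```python
# Worked arithmetic for the figures quoted above.
people = 50
mails_per_person_per_day = 50
daily_mail = people * mails_per_person_per_day  # 2500 messages per day

snare_fp_rate = 0.003  # 0.3 percent, the figure from the article
snare_fp_per_day = daily_mail * snare_fp_rate  # roughly 7.5 per day

# SpamAssassin experience quoted above: 1-2 false positives a month.
days_per_month = 30  # assumption: a 30-day month
monthly_mail = daily_mail * days_per_month
sa_fp_rate_low = 1 / monthly_mail
sa_fp_rate_high = 2 / monthly_mail

print(f"SNARE: about {snare_fp_per_day:.1f} false positives per day")
print(f"SpamAssassin: {sa_fp_rate_low:.4%} to {sa_fp_rate_high:.4%} false positives")
```

Under these assumptions the "7-8 daily" and ".002%" estimates both hold up to rounding.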
I'm out of my mind right now, but feel free to leave a message.....
The original is "The end result was a system capable of detecting spam 70 percent of the time, with a 0.3 percent false positive rate."
Dave Barnes 9 breweries within walking distance of my house
0.3 would be terrible - three out of ten false positives. 0.3 percent - what the article actually says - is not too bad. But current techniques allow me to check the spam bin for such messages. This technique would pretty much preclude that capability, since the mail would never arrive at the server. I'm not sure that a rate of 0.003 would be acceptable under those circumstances.
Floating face-down in a river of regret...and thoughts of you...
Did the e-mail message originate from Taiwan, Indonesia, or some other third-world country? If so, block it.
And for those of us who do business with Chinese entities that have a ".cn" at the end of their domains?
Am I going to have to request a whitelist entry every time I get a new contact?
And what happens when someone tries to contact me out of the blue before I have a chance to white list them?
IP addresses, he notes, are easy to fake.
Sure, you can fake your IP address so you get past this filtering, because it just looks at the first packet. It won't help you though, because you can't complete a TCP 3-way handshake from a fake address, and without doing that you can't actually send spam.
Finally! A year of moderation! Ready for 2019?
Isn't this just pushing the processing back a level, but still arriving at its destination? I guess you could implement bandwidth-provider-level (i.e. before the customer even gets their packets) spam filtering this way, but I'm sure most organizations would prefer to retain control by doing their own filtering.
So this software functions in both space AND time? Fascinating.
It's good that they specified that in the name, to avoid questions such as "Will this software work in the universe which we inhabit?"
a baseball glove.
But I'd first have to question why somebody is throwing spam at my mail server in the first place?
I've got a device in front of the mail server, many people do. These and others work fine. Sorry for folks that don't have one. As long as it is free, it will be abused. Someone already said it was cat and mouse.
ceci n'est pas un sig
I hear this suggestion a lot. However, many of us work for global companies that deal with legitimate email from these countries. We can't just reject IP blocks for countries when we have dealings in them. China and Russia are huge for international companies.
It sounds like this approach would be fairly CPU intensive; analyzing the characteristics of packets, comparing them to other packets, looking for information on their originating systems, etc... It seems like they are throwing a non-trivial amount of computational time at the problem in order to spare the storage space that would be otherwise taken up by spam.
And of course as others have already pointed out, this just starts another round of whac-a-mole by pursuing this avenue.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
Regardless of how complex you make it, someone will always eventually figure out a way around it.
The fundamental property of spam is that it involves many similar messages going to a large number of destinations. That's what to look for. Google can do that, because they manage a very large number of mailboxes with a single system. SpamCop used to do that, but they had to be in the mail-forwarding business to do it and that was too expensive.
Trying to detect spam by looking only at the mail for a single account is inherently a form of guessing. The existing technologies are reasonably good, but not good enough that the spammers give up.
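A minimal sketch of the "many similar messages to many destinations" idea, assuming a provider that can see mail for many mailboxes at once. The normalization, the function names, and the recipient threshold are all hypothetical choices for illustration, not how Google or SpamCop actually do it.

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(body):
    """Hash a crudely normalized body so near-identical copies collide."""
    # Keep only letters, so per-recipient order numbers and punctuation vanish.
    normalized = re.sub(r"[^a-z]+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_bulk(messages, min_recipients=3):
    """Return fingerprints seen by at least min_recipients distinct mailboxes."""
    recipients = defaultdict(set)
    for recipient, body in messages:
        recipients[fingerprint(body)].add(recipient)
    return {fp for fp, seen in recipients.items() if len(seen) >= min_recipients}

messages = [
    ("alice@example.com", "Cheap pills!!! Order #1234 now"),
    ("bob@example.com", "cheap pills! order #9876 NOW"),
    ("carol@example.com", "Cheap pills -- order #5555 now"),
    ("alice@example.com", "Lunch at noon tomorrow?"),
]
bulk = find_bulk(messages)
```

A filter looking at only one of these mailboxes sees a single copy and has to guess, which is the point made above.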
Oh yeah. I was thinking a rate of 0.3 was huge. 0.3 percent is much better but still not acceptable.
This isn't news... The team at LinuxMagic Inc (http://www.linuxmagic.com) has already been doing this for years with their MagicMail Server product (http://magicmail.linuxmagic.com), and more recently with the new MagicSpam software (http://www.magicspam.com) which can be installed on any email server.
This matters because at high-volume sites they would, in theory, consume fewer resources and also quarantine the offending spam server.
For us mere mortals though qpsmtpd is pretty awesome.
If you set up such a packet-based filter and get a bug in your config (or the environment changes rendering your diligently-crafted config inappropriate), then you may end up with the wEiRdEsT error situations. Missing your new client's orders? Not receiving that hello email from the cutie you gave your address at yesterday evening's party? Bad luck, dear!
Not to mention other applications going gaga. Whoops, who would think a rotten packet filter might affect non-email packets?
Your post advocates a
(x) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
( ) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
(x) It will stop spam for two weeks and then we'll be stuck with it
(x) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
(x) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
(x) Armies of worm riddled broadband-connected Windows boxes
(x) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
( ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
(x) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
(x) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(x) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
It should be illegal to say that freedom of speech should be limited.
When will people understand the one simple, essential truth about spam?
Attacking the supply of spam will never work, except temporarily.
Attacking the demand for spam is the only possible way to fix it.
Your post advocates a
( X ) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
( ) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
( X ) It will stop spam for two weeks and then we'll be stuck with it
( ) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( X ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
( X ) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( X ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
( X ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( X ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
( X ) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
- My uid ends in 69...
First: I do not want others to decide what's spam for me.
Second: I got graylisting, amavisd with spamd & co, and more. Why exactly would I put such a system on every other node of the net too? To throw away resources?
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Two thoughts. 1) Why doesn't anyone come up with an open source version of Blue Frog's legal DOS attack on the merchants that fund the spammers? 2) Is it possible that, at least in the US, a computer connected to the public internet and infected with a virus violates Part 68 of the FCC code, and therefore the owner could in essence be fined for being an idiot and not running any of the free anti-virus software?
Maybe we could fund anti-spam efforts from fines for spam-bot supporters.
Part 68 ...Under Part 68, wireline telecommunications carriers must allow all Terminal Equipment (TE) to be connected directly to their networks, provided the TE meet certain technical criteria for preventing four proscribed harms. These harms are...degradation of service to customers other than the user of the TE...
What exactly does this mean? A rate is usually a comparison of two values. What two values were compared to get 0.3?
My business has been using Sendio ESP for about a year now with absolutely no false positives and nearly complete elimination of unwanted spam. If we utilized all of the features, I'm sure the spam would be completely eliminated.
Big whoop. All it does is block email with IP addresses from France, Belgium, Russia, Italy, and Argentina.
I want to try to keep this as non-spam as possible, but Symantec acquired a company about 5 years ago called TurnTide that did almost *exactly* that. Take the reputation of the sending address, and shape the TCP/IP packets to slow down the rate of mail into the system. Symantec touts a 70% reduction in mail volume and an 80% reduction in the amount of spam that hits a mail server. I've had it in production in one environment where the customer went from approximately 5 million messages/day to 500,000 messages/day.
YOU REMEMBER WHEN SEX WAS THE LAST TIME? REFRESH THE MEMORY OF VIA GRA!
No more hair Rogaining medicine.
GIRLS DO ANYTHING FOR A BIG HOSE
It boosts your rod!
Make two days nailing marathon
for your delicate advantage
And all that is just from the most recent page in my spam folder.
http://twitter.com/OLDTELEGRAM
We already have a method to securely transfer data over SSL and verify the identity of the originating party, and virtually everyone trusts this method with their banking information among other information.
Can't this exact same process be leveraged to help fight spam? SMTP servers already support SSL, so _when used_ why not start verifying SMTP server SSL certificates and the identity of the originating server? If it matches, simply reduce the likelihood that the email is spam (-5 score in SpamAssassin or something), or add +5 if the mail comes from a server without a valid certificate.
Combine this with domain keys and eventually it should eliminate spam from botnets, as there is no way they would purchase SSL certificates and set up domain keys for each compromised host, and if they did it would just provide a nice little list of IP addresses to block.
To deal with spam coming from verified servers with certificates, simply set up a service that assigns spam complaints to each certificate and a formula that raises the spam score for servers with higher complaint rates. Almost like a FICO/credit score for email servers.
Since SSL certificates would cost money ($100 or so) and are verified against a corporation/personal identity, it would be relatively difficult/expensive for someone to obtain enough certificates to circumvent the system.
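A rough sketch of how such a per-certificate adjustment might be computed. Only the -5 and +5 values come from the proposal above; the linear complaint formula, the cap, and the function name are hypothetical, and this is not an existing SpamAssassin feature.

```python
def cert_score_adjustment(has_valid_cert, complaints, messages_sent, max_penalty=10.0):
    """Score delta to add to a SpamAssassin-style spam score (higher = spammier)."""
    if not has_valid_cert:
        return 5.0  # +5 for no valid certificate, as proposed above
    base = -5.0  # -5 for a verified certificate, as proposed above
    if messages_sent == 0:
        return base  # no history yet, so no complaint penalty
    complaint_rate = complaints / messages_sent
    # Hypothetical formula: one point per percent of mail complained about, capped.
    return base + min(max_penalty, complaint_rate * 100.0)

print(cert_score_adjustment(True, 0, 1000))    # clean certified sender
print(cert_score_adjustment(True, 500, 1000))  # certified but heavily complained about
print(cert_score_adjustment(False, 0, 0))      # no valid certificate
```

With a cap above 5, a certified server with enough complaints ends up scored worse than neutral, which is the credit-score behavior the proposal describes.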
False positives are not that big a deal here. They show why it's actually better to reject spam instead of filter it. When you reject spam, false positives result in the sender getting a bounceback. They know their email didn't reach you. Rejecting spam, not filtering it, ought to be the predominant model.
Tired of FB/Google censorship? Visit UNCENSORED!
But do those 2500 messages include spam or are they just the mails that get through the existing spam filters?
Otherwise my understanding of the 0.3% false positive is where 100% = the total number of emails.
Which is rather unacceptable given the handling of false positives, and the total number of emails could be very high when you include spam.
Technology that has been around for a long time now has a buzzword: SNARE. For example, OpenBSD's spamd does the same thing based on blacklists, greylists, and even operating system fingerprints. The wheel is reinvented again... ;-)
Since 1996...
A few years ago the company I worked for came under an email DOS attack that bogged down our Exchange server to the point that it took about 10 hours for a legitimate email to get through. The Windows admins tried all 10 spam settings with no effect. I put a Linux box running SpamAssassin in front of the Exchange server and within a couple of hours the delivery time dropped to about 10 seconds. Products like SpamAssassin are essentially dynamic filters that can and do get fresh filter information as often as you like. This case was a dictionary attack and we got rid of the vast majority of the spam by the simple expedient of deleting anything that wasn't addressed to a legitimate account. As another poster noted, most spam filtering methods are just educated guessing. Rely on one that is educable.
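The "delete anything that wasn't addressed to a legitimate account" step can be sketched as a recipient check. The mailbox set and function names below are hypothetical; a real front-end would look accounts up in its user directory.

```python
# Hypothetical account list; a real server would query its user directory.
VALID_MAILBOXES = {"alice@example.com", "bob@example.com"}

def is_known_recipient(rcpt_to, valid=VALID_MAILBOXES):
    """Return True only if the envelope recipient is an existing mailbox."""
    return rcpt_to.strip().lower() in valid

def split_recipients(recipients):
    """Partition envelope recipients into (deliver, drop) lists."""
    deliver = [r for r in recipients if is_known_recipient(r)]
    drop = [r for r in recipients if not is_known_recipient(r)]
    return deliver, drop

# A dictionary attack guesses many local parts; only real accounts survive.
attack = ["aaron@example.com", "abby@example.com", "Alice@Example.com", "zed@example.com"]
deliver, drop = split_recipients(attack)
```

Because the check needs no content analysis, it is cheap enough to run before the message body is even accepted.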
-- Consensus - 50% probability that the majority are wrong.
0.3% FP on the total mail input, but 90% is spam anyway, so that means 3% of legit mail is dropped.
3% is way too high.
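That estimate can be reproduced in a few lines, assuming (as this comment does) that the 0.3% is measured against all incoming mail rather than against legitimate mail only. The function name is made up for illustration.

```python
def legit_loss_rate(fp_share_of_total, spam_fraction):
    """Fraction of legitimate mail lost if false positives are a share of ALL mail."""
    legit_fraction = 1.0 - spam_fraction
    return fp_share_of_total / legit_fraction

rate = legit_loss_rate(0.003, 0.90)  # 0.3% of total mail, 90% of mail is spam
print(f"{rate:.1%} of legitimate mail dropped")
```

If the 0.3% is instead the conventional false positive rate (a share of legitimate mail), the loss stays at 0.3% regardless of how much spam arrives.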
Why does it seem everyone ignores the real source of the majority of spam: Microsoft Windows computers infected by viruses, running botnets that send spam? Yes, spam is generated by other systems, but not nearly the amount that is being generated by MS-based botnets.
How about everyone just send their frigen spam bill to MS? How about a class action for everyone to collect for the damage that MS does to networks around the world? Better yet, let's just forward all the spam we get to MS. Let them sort it out.
Living in Chile
Is it possible to require typing in verification characters before a person sends the email, so the system knows the sender is human?
Like when you create an account or try to log into an account?
So 70% of the time it works every time. Sold.
Wouldn't it be cheaper in the long run to simply design a new mail protocol from the ground up, with security and spam prevention as the main focuses? It seems to me that when you need to implement solutions which are as complex as this one to keep the system running as intended it is more or less a failure.
Not even remotely. At best this system could only be used as input to a secondary system that then uses this information along with other sources. See, e.g., SpamAssassin's scoring approach.
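A toy illustration of that scoring idea: each signal that fires contributes a weight, and the sum is compared to a threshold. The signal names and weights are invented for this sketch and are not real SpamAssassin rules; the 5.0 cutoff is the one mentioned earlier in the thread.

```python
SPAM_THRESHOLD = 5.0  # the "flag anything higher than 5.0" cutoff from earlier in the thread

# Hypothetical signal names and weights, for illustration only.
WEIGHTS = {
    "network_reputation_bad": 2.5,   # e.g. a network-level verdict like SNARE's
    "body_looks_like_spam": 3.0,
    "sender_on_blocklist": 4.0,
    "sender_previously_replied_to": -3.0,
}

def spam_score(fired):
    """Sum the weights of every signal that fired for this message."""
    return sum(WEIGHTS[name] for name in fired)

def is_spam(fired):
    """Flag the message only when the combined score exceeds the threshold."""
    return spam_score(fired) > SPAM_THRESHOLD

print(is_spam({"network_reputation_bad"}))
print(is_spam({"network_reputation_bad", "body_looks_like_spam"}))
```

In this arrangement a network-level verdict alone never rejects mail; it only nudges the total, so one weak signal cannot cause a false positive by itself.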
... like network-based virus blockers bring several good things:
* an entirely different set of algorithms can be used, leveraging data and traffic patterns not specific to the message contents
* a team of engineers not tied to a single enterprise
And, indeed, major network operators like to do stuff like this - takes traffic off the network, and relieves enterprises of evil traffic forms (including DDOS)
BUT then, net neutrality purists, like 4chan, despise this and fight back, as recently when AT&T worked to thwart a large-scale DDOS attack.
I still don't understand why they don't regulate SMTP servers on the net just like other business areas. These have a real financial impact on others' operating costs. If they required all SMTP servers on the net to be closed and regulated, I think it would be a good start.
I'm talking fines and the ability to cut off any rogue SMTP servers. They also need a better method to validate connecting servers and it needs to be an industry wide adopted standard, whether that is done via certificate authority or some other 'secure IP' method.
I hate this stuff.
What is the obsession with sticking this functionality into the network WHERE IT DOESN'T BELONG! How many times do we have to go through this crap? How many times will some idiot try to stick something like this into the network? Is it SO difficult to kill the spam at the server? The systemic down-side of encouraging this sort of approach is so unbelievably bad, we should tar and feather its developers as a lesson, so other "researchers" won't follow suit.
Gah!!!!!
I have a 100% guaranteed way to stop spam from reaching the mail server.
Unplug the damn thing!
An SQL query goes to a bar, walks up to a table and asks, "Mind if I join you?"
Although it is not 100% effective, having a spam filter in front of the email server is the best solution IMHO. Solutions like this let traffic hit the mail server before stopping it as spam. Other than it being annoying to users, the big issue with spam is lots of small connections slowing down the system. Letting an EHLO through for each of the spam hits, despite filtering it away before completion, is not helpful. But then, it might depend on whether you're an end user who hates getting spam or an admin who hates what spam actually does to your mail server.
Having to work for a living is the root of all evil.
Are they kidding? 70% and 0.3 % false positives? I employ a simple GreyList which catches 90% of spam and 0 false positives short of a misconfigured sending email server that does not adhere to RFC. Couple this with user-configured Spam Assassin, and my clients see maybe 1 (generally 0) spam email in their inbox a day, with around 10-20 ending up in the spam due to SA. This is down from hundreds in the spam folder and 20-25 in the Inbox before implementing this solution. At least if we're going to pretend something is newsworthy, make it better than what already exists.
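For anyone unfamiliar with greylisting, here is a minimal sketch of the idea: temporarily refuse the first delivery attempt from an unknown (IP, sender, recipient) triplet and accept a retry after a delay, since RFC-compliant servers retry and much spam software does not. The class name, the 300-second delay, and the in-memory dict are assumptions for illustration, not the setup described above.

```python
class Greylist:
    """Toy greylist keyed on the (client IP, envelope sender, recipient) triplet."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a new triplet must wait before acceptance
        self.first_seen = {}        # triplet -> time of first delivery attempt

    def check(self, ip, sender, recipient, now):
        key = (ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # corresponds to an SMTP 4xx temporary failure

g = Greylist(min_delay=300)
first_try = g.check("192.0.2.10", "a@example.org", "me@example.com", now=0)
early_retry = g.check("192.0.2.10", "a@example.org", "me@example.com", now=60)
late_retry = g.check("192.0.2.10", "a@example.org", "me@example.com", now=400)
```

The cost is a delay on the first message from every new correspondent, and a sender that never retries is lost, which matches the "misconfigured sending server" caveat above.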
Call me old school, but I think the best way to keep spam from getting to the servers would be for there to be a spirited geek vigilante initiative for a couple of years where guys with pocket protectors and baseball bats would show up on the doorsteps of spammers and break their kneecaps. I think there was a Russian spammer who got harsher treatment than this a year or two ago, but I think broken kneecaps would suffice. Just saying...