The Next Step in Fighting Spam: Greylisting

your first mistake by frieked · 2003-06-20 06:38 · Score: 4, Insightful

I'm going to try to say this as nicely as possible and without trolling:
You have just rendered Greylisting pretty useless by making it open source. Spammers are much smarter than you think and what you have basically done is shown them what they need to do in order to get around Greylisting. That's just my take on the issue, maybe I'm wrong but I doubt it.

--

I have often regretted my speech, never my silence.
-Xenocrates

Re:your first mistake by Soko · 2003-06-20 06:45 · Score: 4, Insightful

I'm going to try to say this as nicely as possible and without trolling:

Not trolling at all - you have a legitimate (though perhaps misguided) problem with this method.

You have just rendered Greylisting pretty useless by making it open source. Spammers are much smarter than you think and what you have basically done is shown them what they need to do in order to get around Greylisting. That's just my take on the issue, maybe I'm wrong but I doubt it.

So, the spammers themselves will be of significant help in debugging and helping to fix the code so they can't circumvent it, won't they? OSS means anyone who finds how the greylist script is beaten can figure out a fix and post it. Sounds like the best thing to do IMHO.

Soko

--
"Depression is merely anger without enthusiasm." - Anonymous
Re:your first mistake by Schnapple · 2003-06-20 06:46 · Score: 5, Funny

You have just rendered Greylisting pretty useless by making it open source.
You're assuming the spammers can read source code.

--
Schnapple
Re:your first mistake by L.+VeGas · 2003-06-20 06:49 · Score: 4, Funny

That's just my take on the issue, maybe I'm wrong but I doubt it.

That's what I like to see. Someone with strong opinions. Or maybe not.

--
Best Windows Freeware
Re:your first mistake by tomstdenis · 2003-06-20 06:56 · Score: 4, Informative

You're missing a big part of it though. If you have to try, say, 3 times to send a message [over a 5 day period or so], your ability to mass send 100 million emails is really squashed.

Legitimate people sending for the first time won't really mind the few-day wait, and most MTAs will try for up to a month.

Tom

--
Someday, I'll have a real sig.
Re:your first mistake by TheCarp · 2003-06-20 06:59 · Score: 5, Informative

not at all

Read the paper. Spammers would figure it out eventually. What it buys is what they have to do to get around it.

It means they have to do retries...that means spam runs take longer, especially since they have to run...then wait for a locally defined timeout, and run all those addresses again

AND they have to do it from the same IP.

This raises their bandwidth profile. It wastes their time... all in all... it raises their cost of doing business and cuts into their profit margins.

It means they will have to upgrade their tools again. It means they get headaches. And of course, the next step is to implement spam traps that watch activity, see that a spammer is spamming, and promote them to a blacklist before they can even retry. (oh gee 1000 new greylist triplets from 1 IP in under 5 mins? Set the timeouts for that IP to 12 hours)
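A rough sketch of that last idea, for illustration only: count new triplets per IP in a sliding window and lengthen the greylist delay for any IP that bursts. The class name `BurstDetector` and its methods are made up here; the thresholds are the ones from the comment (1000 new triplets, 5 minutes, 12 hours), and the one-hour normal delay is the default the paper suggests.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Hypothetical helper: lengthen the greylist delay for IPs that
    create many new triplets in a short time."""

    def __init__(self, max_new=1000, window=300,
                 normal_delay=3600, penalty_delay=12 * 3600):
        self.max_new = max_new              # new triplets that count as a burst
        self.window = window                # sliding window in seconds
        self.normal_delay = normal_delay    # usual greylist delay in seconds
        self.penalty_delay = penalty_delay  # delay applied to bursting IPs
        self.recent = defaultdict(deque)    # ip -> timestamps of new triplets
        self.penalized = set()

    def record_new_triplet(self, ip, now):
        """Call once each time an IP shows up with a never-seen triplet."""
        stamps = self.recent[ip]
        stamps.append(now)
        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_new:
            self.penalized.add(ip)

    def delay_for(self, ip):
        """Greylist delay, in seconds, to apply to this IP."""
        return self.penalty_delay if ip in self.penalized else self.normal_delay
```

A real deployment would also need some way to expire penalties, which this sketch leaves out.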

-Steve

--
"I opened my eyes, and everything went dark again"
Re:your first mistake by Henry+Stern · 2003-06-20 07:29 · Score: 4, Interesting

It means they have to do retries...that means spam runs take longer, especially since they have to run...then wait for a locally defined timeout, and run all those addresses again

AND they have to do it from the same IP.

Not to mention that if this is used in conjunction with other collaborative tools (e.g. RBLs, checksums), by the time the spamming MTA can return, its IP address will have been submitted to MAPS etc. and the contents of the message will have been submitted to Razor/Pyzor/DCC.

I think that this greylisting idea will be pretty hard to beat by Joe spammer. Since the game of spam detection is pretty much an arms race, slowing him down will probably be enough to turn the battle in your favour.
Re:your first mistake by Ross+C.+Brackett · 2003-06-20 07:31 · Score: 4, Funny

You're assuming the spammers can read.
Re:your first mistake by autopr0n · 2003-06-20 07:41 · Score: 4, Funny

You're assuming the spammers can read source code.

Who do you think writes spamming software?

--
autopr0n is like, down and stuff.

can't believe their numbers by sqrt529 · 2003-06-20 06:39 · Score: 5, Informative

most spam today is sent through open relays. Those relays will simply retry the delivery no matter which software the spammer uses, so the method won't work.

Re:can't believe their numbers by McDutchie · 2003-06-20 06:49 · Score: 5, Informative

Eh, open relays are soooo 20th century. :) Actually most open relays today are either blocked or closed, and newly installed MTAs are secure against third-party relaying by default, so this spam method is dying out. Most spam today is sent either directly to the receiving MTA, through open proxies, or through formmail.pl and similar exploits.

In case of /.'ing by Anonymous Coward · 2003-06-20 06:39 · Score: 4, Informative

The Next Step in the Spam Control War: Greylisting
By Evan Harris
Copyright 2003, all rights reserved.

Introduction

This paper proposes a new and currently very effective method of enhancing the abilities of mail systems to limit the amount of spam that they receive and deliver to their users. For the purposes of this paper, we will call this new method "Greylisting". The reason for choosing this name should become obvious as we progress.

Greylisting has been designed from the start to satisfy certain criteria:

1. Have minimal impact on users
2. Limit spammers' ability to circumvent the blocking
3. Require minimal maintenance at both the user and administrator level

User-level spam blocking, while somewhat effective, has a few key drawbacks that make its use in the continuing spam war undesirable. A few of these are:

1. It provides no notice to the senders of legitimate email that is falsely identified as spam.
2. It places most of the costs of processing the spam on the receiver's side rather than the spammer's side.
3. It provides no real disincentive to spammers to stop wasting our time and resources.

As a result, Greylisting is designed to be implemented at the MTA level, where we can cause the spammers the most amount of grief.

For the purposes of evaluating and testing Greylisting, an example implementation has been written of a filter that runs at the MTA (Message Transfer Agent) level. The source for this example implementation is available as a link below, and as other implementations or additional utility code become available, they will also be linked.

Greylisting has been tested on a few small scale mail hosts (less than 100 users, though with a fairly diverse set of senders from all over the world, and volumes over 10,000 email attempts a day), however it is designed to be scalable, as well as low impact to both administrators and users, and should be acceptable for use on a wide range of systems, including those of very large scale. Of course, performance issues are very dependent on implementation details.

The Greylisting method proposed in this paper is a complementary method to other existing and yet-to-be-designed spam control systems, and is not intended as a replacement for those other methods. In fact, it is expected that spammers will eventually try to minimise the effectiveness of this method of blocking, and Greylisting is designed to limit options available to the spammer when attempting to do so. The great thing about Greylisting is that the only methods of circumventing it will only make other spam control techniques just that much more effective (primarily DNS and other methods of blacklisting based on IP address) even after this adaptation by the spammers has occurred.

The Greylisting Method

High Level Overview

Greylisting got its name because it is kind of a cross between black- and white-listing, with mostly automatic maintenance. A key element of the Greylisting method is this automatic maintenance.

The Greylisting method is very simple. It only looks at three pieces of information (which we will refer to as a "triplet" from now on) about any particular mail delivery attempt:

1. The IP address of the host attempting the delivery
2. The envelope sender address
3. The envelope recipient address

From this, we now have a unique triplet for identifying a mail "relationship". With this data, we simply follow a basic rule, which is:

If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure.

Since SMTP is considered an unreliable transport, the possibility of temporary failures is built into the core spec (see RFC 821). As such, any well behaved message transfer agent (MTA) should attempt retries if given an appropriate temporary failure code for a delivery attempt (see below for discussion of issues concerning non-conforming MTAs).
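For anyone who wants to see the basic rule as code, here is a minimal in-memory sketch. It is not the paper's example implementation: the class name `Greylist`, the injectable clock and the reply text are all made up for illustration, and it leaves out the expiry and whitelist lifetimes the paper goes on to discuss. The only things taken from the excerpt are the triplet and the "temporary failure until a delay has passed" rule; 451 is one of the temporary failure reply codes defined in RFC 821.

```python
import time


class Greylist:
    """Hypothetical in-memory greylist keyed on the triplet
    (client IP, envelope sender, envelope recipient)."""

    def __init__(self, delay_seconds=3600, clock=time.time):
        self.delay_seconds = delay_seconds  # how long a new triplet is refused
        self.clock = clock                  # injectable so the logic can be tested
        self.first_seen = {}                # triplet -> time of first attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return an SMTP-style reply line for one delivery attempt."""
        triplet = (client_ip, mail_from, rcpt_to)
        now = self.clock()
        # Remember when this triplet was first seen; later calls get that time back.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.delay_seconds:
            # Temporary failure: a well behaved MTA will queue the mail and retry.
            return "451 Greylisted, please try again later"
        return "250 OK"
```

A sender that retries after the delay gets through, and so does every later message with the same triplet. Software that never retries is never accepted.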

Tempfailing is not new and unique by HiKarma · 2003-06-20 06:39 · Score: 5, Informative

This idea isn't so new or unique. It's been discussed a fair bit on the ASRG mailing list under the name "tempfailing".

First I heard of it was from Landon Noll and Mel Pleasant. It is noted in brief as one of the techniques in this plan to end spam (though their plan, which did include the triplets, is not laid out in full there.)

It is a worthwhile technique for a little while, and if spammers were rational, would be worthwhile for some time to come. But spammers are not rational, and already this technique is not as useful as would be hoped.

Do a Google Search for Tempfailing especially in ASRG to see statistics etc.

Time critical by Synithium · 2003-06-20 06:42 · Score: 5, Insightful

Time critical mailing will go out the window. I can see how this might make any corporate user irate. The same thing goes for challenge-response: the time delay is unacceptable in the business world.

This would be great for personal mail, but that's about it. ISPs would have the same problems with it because their business-class users most likely use the same servers as their consumer-class users.

Re:Time critical by eGabriel · 2003-06-20 06:50 · Score: 4, Informative

This isn't true, actually. Once one mail gets through, the system lets in subsequent mails from that sender. So there is only the initial delay, after that CEO Joe can use his email as a fat instant messenger per usual.
Re:Time critical by IncohereD · 2003-06-20 08:40 · Score: 4, Insightful

How often do you get time critical e-mail from someone you've never received e-mail from before?

some guy telling you to BUY THIS NOW != time critical.

your wife telling you to BUY THIS NOW == time critical, and in theory, your wife == whitelisted (or blacklisted, depending on personal preference).

security through obscurity, again? by dh003i · 2003-06-20 06:42 · Score: 4, Insightful

If they can get around it by looking at the source, then something was wrong with it, waiting to be exploited. Might as well fix it.

--
social sciences can never use experience to verify their statemen

Re:security through obscurity, again? by blakestah · 2003-06-20 06:48 · Score: 4, Interesting

The thing that is wrong is the SMTP protocol, and most people's conception of a spammer. Once you see a few "confessions of ex-spammers", everything changes.

There are people out there who pay $10000 in startup costs, and then make $2000/week for spamming. The $10000 gets them software written by knowledgeable internet security experts. This software finds any and every way to anonymize the email spam, and finds lists of people to spam.

As long as knowledgeable internet security experts are getting paid good cash to enable spammers, and SMTP doesn't change, spam will only continue to get worse. There needs to be a fundamental change in SMTP protocols. It oughta take the spammers about 2 days to fix their MTA bug to get around greylisting.
Re:security through obscurity, again? by SillySlashdotName · 2003-06-20 07:14 · Score: 4, Insightful

I see that, in fine /. tradition, you didn't RTFA.

From the article: If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure. (emphasis added)

Later in the article it goes into much more detail about the delay, how long to delay if the triplet has not been seen before, life time of the whitelist, etc.

It also talks about configuring the times - they mention the default delay is 1 hour, but that their records suggest that 1 minute would have caught 99% of the same spam messages - "The data collected during testing showed that more than 99% of the mail that was blocked with the tested setting of 1 hour would still have been blocked with a delay setting of only 1 minute. At that point, having a larger initial delay will definitely help, as it gives time for other blocking methods to act. For this reason, it is suggested that at least a one hour delay value be kept as a default, since spammers will start adapting as soon as this method becomes known and starts being used." (again, emphasis added)

RTFA!

--
Acts of massive stupidity are almost never covered by warranty. --me.
Re:security through obscurity, again? by blakestah · 2003-06-20 07:33 · Score: 4, Insightful

RTFA!

There is no magical waiting period or re-try period that cannot be trivially coded around. And, with good money on the line, will be trivially coded around.

You don't get it. Really smart people are getting paid a whole lot of money to make programs to exploit every possible crack in the way we send email. There is no general rule to spammers, except that it is a lot of money and they are very clever. Little bandaids are not going to stop this one - there needs to be a much more fundamental change. And I am not talking about laws against spam - I am talking about changes in the protocols we use to send email.
Re:security through obscurity, again? by letxa2000 · 2003-06-20 08:48 · Score: 5, Interesting

is reject the mails on the greylist after holding the connection for, say, 10 minutes. That will help deter spamming software,
I doubt it. I would assume the spam software would have a timeout, and I doubt it's ten minutes. If they want to hit-and-run and aren't even willing to make a second delivery attempt when an error code is returned, I doubt they're going to wait 10 minutes. I'm sure that within 30 seconds or less they'll consider it a dead connection and hang up.
Problem is, I used to have my sendmail HANG UP in real-time on an incoming connection as soon as it realized a message was spam. I.e., the incoming message was filtered in the DATA phase and if it was spam I hung up immediately. It worked great and it felt good, but there were many spam programs that took the disconnection as some kind of TCP/IP failure and immediately tried again. So I had one day where delivery of a single message was attempted about 30,000 times as the spammer connected, I hung up, spammer software said "Oops, let me try again!" About one delivery attempt every second or so.
I'd be willing to bet if you put a 10 minute timeout in sendmail you'll see lots of spammer software disconnecting sooner and just trying again. It takes more of their resources, but takes more of yours, too.

spam.....hrmmm by chef_raekwon · 2003-06-20 06:46 · Score: 5, Insightful

with all of these solutions to spam..and all of the spam now flooding mail servers...

isn't it time to change the specification (RFC) and possibly the manner in which our current system works? i haven't come up with anything yet, but surely there must be some sort of handshaking/secure type connection that could be used - - some sort of postage (free) that is encrypted into the mail, that states that it is genuine....kind of like the hologram on those windows cds...

i dunno. file this story under redundant.

--
We're like rats, in some experiment! -- George Costanza

I think not by Monoman · 2003-06-20 06:54 · Score: 5, Interesting

Does this mean all public crypto algorithms are useless?

--
Keep the Classic Slashdot.

RFC 3514 by pizen · 2003-06-20 06:57 · Score: 4, Funny

How about in the spirit of RFC 3514 (the evil bit) we create a spam header in email. Spam will set this header so we can easily filter it out.

Published a paper? by Call+Me+Black+Cloud · 2003-06-20 06:58 · Score: 4, Informative

Where? To me, publishing a paper means your writing appeared in some peer-reviewed journal (where the "peers" are acknowledged as domain experts). What you did was put up a web page. With a donation link at the bottom.

For others looking for a solution, try POPFile. Open source, cross platform, gives me 96% accuracy.

One more thing: "practically eliminates" is not the same as "eliminates".

Re:Published a paper? by vidarh · 2003-06-20 07:33 · Score: 4, Insightful

To me publishing a paper in a peer reviewed journal instead of on the web would mean that I'd expect audience to be reduced to a ridiculously small fraction of people that might be interested. If I wanted to publish something I'd do it on the web first, and if it stacks up people I respect would start talking about it and link to it.
Yes, I realize that "serious" science is still expected to be published in peer reviewed journals, but in most cases I can't help but think that getting the article out there would be more useful. Sure, peer review is important, and somewhere to look for some kind of verification of the value of a paper is useful. But I much prefer the Research Index way, where I can get a good indication of the value of a paper by looking at how many people have cited a paper and WHO has cited it.
Anyway, pretending that putting up a document on a website is somehow less of a publication than having it printed in a journal is just plain elitist. You should probably be a bit more critical of published papers that you don't know have been through a proper review, especially if you're not a domain expert yourself, but being aware of the source is something that you always need to be.

Poor use of statistics by GGardner · 2003-06-20 06:58 · Score: 4, Insightful

The data in this article claims that 1% of all corporate mail servers in the UK allow open relaying, down from 91% in 1997. For all we know, the total number of corporate e-mail servers has grown by a factor of 100 (or more) in the last six years, meaning that perhaps there are more open relays now.
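To put numbers on that: the 10,000-server starting figure below is invented purely for illustration, and the factor of 100 is the "for all we know" guess above, not a measurement. Only the 91% and 1% come from the article.

```python
def open_relays(total_servers, open_percent):
    """Absolute number of open relays for a given server count and percentage."""
    return total_servers * open_percent // 100


servers_1997 = 10_000               # hypothetical starting count
servers_now = servers_1997 * 100    # hypothetical hundredfold growth

relays_1997 = open_relays(servers_1997, 91)  # 91% open in 1997
relays_now = open_relays(servers_now, 1)     # 1% open today

print(relays_1997, relays_now)
```

Under those assumptions the absolute number of open relays goes up slightly (from 9,100 to 10,000) even though the percentage collapsed.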

The article also doesn't measure the amount of spam coming through those relays. Even if there are only 10 open relays in the UK at any one time, it still might be possible for all of the spam to be coming through them.

Certainly, closing down open relays is a good thing, but lowering the percentage of open relays doesn't prove anything about the source of spam.

Easy for end-users, sure. by Medievalist · 2003-06-20 06:58 · Score: 5, Insightful

Just encode your e-mail address on web pages & don't sign up to any dubious mailing lists.

Many of us must maintain contact addresses in the global whois database - so that people can contact us when something is broken.

Look at it this way: you can stop crank calls by unlisting your phone numbers. But you can't unlist the hospital, the ambulance service, the fire department, etc.

We're not all end-users. Some of us are the plumbers.

Re:1 false positive is not acceptable. by pclminion · 2003-06-20 07:00 · Score: 5, Interesting

Wrong. 1 false positive can be acceptable, and in fact is probably better than how things are now.

At USENIX '03 there was a paper presented on artificial intelligence techniques for spam detection. I can't provide a link since only USENIX members can download the paper (at this point, at least). I was a coauthor of that paper.

One of the things we've discovered in our research is that some classes of filters (most notably, the one I have been developing along with a few other individuals) are actually more effective at correctly classifying email than humans are. That is to say, you can train the learning algorithm on mostly-correctly-classified data, then re-run it over the training data, and almost miraculously, it discovers all kinds of email in the training set that was incorrectly classified.

I.e., this filter has discovered mail that I myself incorrectly thought was spam. It's scary, because there's a lot of it.

To assume that a human will always be 100% accurate at classifying their own email isn't just arrogant, it's plain wrong. Newer filters that will be introduced in the near future might possibly be more accurate than you, a frail human, could ever be.

Waiting for Article Title by notque · 2003-06-20 07:02 · Score: 4, Funny

The Next Step in Fighting Spam: Death Penalty

--
http://use.perl.org

Delaying email by one hour! by pjrc · 2003-06-20 07:04 · Score: 5, Insightful

From the linked paper:

An hour is short enough that in most cases, users will not notice the delay.

I'm wondering how I'm going to explain that to a new customer over the phone who says "I'll just email that file right now so we can go over it together".

--
PJRC: Electronic Projects, 8051 Microcontroller Tools

Re:Delaying email by one hour! by vidarh · 2003-06-20 07:24 · Score: 4, Insightful

Agreed. I was involved in operating a larger (hundreds of thousands of active users) mail system a couple of years ago, and users would complain if their mail took more than seconds. We had to upgrade our system at one point because rapid growth had made mail delivery take a couple of minutes on average, and it caused bad publicity - a lot of users had a clear expectation that e-mail should be delivered in a few seconds and that if it didn't something was wrong.
I think changing that perception of e-mail as near instant will be incredibly hard. And if you succeed it will just move even more traffic over to the IM networks and cause spamming of IM networks to escalate instead.

One good point about this proposal by Anonymous Coward · 2003-06-20 07:10 · Score: 5, Insightful

It deals with spam at the server level. All the wonderful user-level solutions don't do jack to stop spam from being sent. Look at the numbers the spammers show for return rate, and look at how fast spam programs can go, and you'll see that the only solutions that will work are those that make it expensive to send spam. Anything else will just make the spammers send more spam to try and get the hit rate they need.

clever hack for WHOIS contact addresses by phr1 · 2003-06-20 07:11 · Score: 5, Interesting

The registrar I use (jumpdomain.com) has a clever hack for despamming WHOIS contact email. Basically they change your published contact address once a week. The published address is automatically generated, looks like gibberish, and forwards to your real address. If someone wants to contact you by looking up your address by WHOIS and writing to you, it works fine. But if they add the address to a mailing list, it stops working in a week. That has eliminated almost all my WHOIS spam. Good scheme.
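How the registrar actually does it isn't stated, so the following is only one plausible way to get the same behaviour: derive the alias from the real address, the ISO week and a server-side secret, and only forward mail for aliases that match the current week. The function names, the `alias.example` domain and the secret are all hypothetical.

```python
import hashlib
import hmac
from datetime import date


def weekly_alias(real_address, secret, day, domain="alias.example"):
    """Derive a gibberish-looking alias that changes every ISO week."""
    iso_year, iso_week, _ = day.isocalendar()
    message = f"{real_address}|{iso_year}-W{iso_week:02d}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{digest[:16]}@{domain}"


def resolve(alias, customers, secret, day):
    """Return the real address an alias forwards to this week, or None."""
    for real in customers:
        if hmac.compare_digest(weekly_alias(real, secret, day), alias):
            return real
    return None


secret = b"server-side secret"            # hypothetical key
customers = ["admin@example.org"]         # hypothetical customer list
published = weekly_alias("admin@example.org", secret, date(2003, 6, 20))

print(resolve(published, customers, secret, date(2003, 6, 20)))
print(resolve(published, customers, secret, date(2003, 6, 27)))
```

An address harvested from WHOIS stops resolving once the week rolls over, which matches the behaviour described above.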

The real solution by mrseigen · 2003-06-20 07:11 · Score: 4, Funny

We should grab some of the guys who get 1000+ spams per day, point them to the physical location of the spammers, and then step back. I can guarantee you that vigilante justice is entirely appropriate here, considering we want the gov to step back from the 'net instead of entering new "secret arrests of spammers"(?) laws.

Bogofilter does pretty well for a client filter by lxdbxr · 2003-06-20 07:15 · Score: 4, Interesting

The summary does not seem completely accurate; since the greylisting MTA sends an SMTP temp failure there should never be any false positives as long as the sending MTA is vaguely RFC-compliant (sadly not true I suspect). Or at least that was my reading of the paper...

I'm currently using Bogofilter (and looking into CRM114) and getting better than 99% accuracy (about 1 in 200 false negatives at the moment) and very very few false positives (maybe 2 in 5000 messages).

Of course these are MUA level filters (and yes, I know, I've already "paid" with bandwidth to download the spam) - however since the proposed "greylister" would have to be installed as the MTA at major ISPs (as the authors note) I'm not convinced that is more likely to get widespread adoption than the various sorts of adaptive client-based filtering now available, particularly as it requires a database to back the method up.

As far as I am concerned the major factor in a spam filter should be zero false positives - personally I don't mind reviewing one or two spams a week but I would be really annoyed if I were to lose a real message (note the two false positives I have had to date with bogofilter contained forwarded sales pitches along with a message).

--
-- Nothing unusual happened today

missed the point by eLoco · 2003-06-20 07:33 · Score: 4, Informative

I've seen some comments that say (paraphrasing) "For real SPAM filtering use <POPFile|Spamassassin|...>", but these missed the point (or perhaps didn't read the paper?). This method is a "first-pass" filter, getting rid of e-mails for which no redelivery attempt will be made. The second-pass filter should still be implemented for everything that gets through the first pass. From the paper:

"The Greylisting method proposed in this paper is a complementary method to other existing and yet-to-be-designed spam control systems, and is not intended as a replacement for those other methods. In fact, it is expected that spammers will eventually try to minimise the effectiveness of this method of blocking, and Greylisting is designed to limit options available to the spammer when attempting to do so."

--
sig != null

Re:Bayesian Filtering by anti$pam · 2003-06-20 08:42 · Score: 4, Insightful

The key is to make spammers not make money!

If people start adopting anti-spam technologies we would reduce the return spammers get from sending spam. Reduce this enough and the spamming business will no longer be profitable.

POPFile is great. I've also used SAProxy (http://saproxy.bloomba.com/) under Windows and it works great too.

Again, the idea is not to eliminate all spam, but to reduce the return rate, and therefore the money made by spammers.

Anti-Spam Techniques: Honeypot spam detection! by mabu · 2003-06-20 11:06 · Score: 4, Informative

Aside from the obvious of getting the authorities to crack down on the existing illegal activities (relay hijacking, violation of TOS of ISPs, header forging, etc.) which is the only true solution, I think there are much better approaches than this "greylisting" method.

The problem with the greylist method is it still slows down mail service, and potentially more than the relay blacklist features. The objective here is that end-user/networks should not be penalized in the fight against spam. We already waste too many resources, and according to my latest mail server stats, more than 65% of our inbound mail is UCE. I'm fed up with more than half my e-mail bandwidth being crap my users didn't request so more resource allocation on a local level in the fight against spam is counterproductive!

Here's a very clever, much more practical method I found recently.

A company in Canada has set up what it calls SORBS: Spam and Open Relay Blocking System.

What's different from their blacklist is that they maintain "honeypots" strategically located around the Internet. These are servers they specifically set up as inbound mail relays, but never for legitimate purposes. If the servers get [select] mail activity, it's assumed to not be legitimate and it flags the source as a potential spammer... it makes a lot of sense. You create a domain name, but don't promote it in any legitimate manner, and/or you seed spam lists with these e-mail addresses and then let the spammers send to your key systems around the internet and *bam*, they're identified in real time, and then added to a blacklist.
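The core of the honeypot idea fits in a few lines. This is only a sketch of the principle as described above, not how SORBS is actually built; the class name and the trap address are invented for illustration.

```python
class HoneypotBlacklist:
    """Hypothetical sketch: any host that mails a trap address gets blacklisted."""

    def __init__(self, trap_addresses):
        # Trap addresses are never used for legitimate mail.
        self.traps = {address.lower() for address in trap_addresses}
        self.blacklist = set()

    def observe(self, client_ip, rcpt_to):
        """Record one delivery attempt seen by a honeypot server."""
        if rcpt_to.lower() in self.traps:
            self.blacklist.add(client_ip)

    def is_listed(self, client_ip):
        """True if this IP has ever mailed a trap address."""
        return client_ip in self.blacklist
```

Since nobody has a legitimate reason to mail a trap address, a single hit is treated as enough evidence in this sketch. A real list would presumably want thresholds and expiry.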

I really like this idea. Like any other system, it has the potential for abuse but the beauty is the identity of the honeypot systems is kept secret, so it's very difficult for anyone other than spammers to exploit the network.
