Slashdot Mirror


Interview With The SpamAssassin

comforteagle writes "Howard Wen has conducted an interview with Daniel Quinlan of SpamAssassin. In it he explores what keeps Daniel motivated in the face of the unrelenting torrent of spam and new spamming techniques, as well as what is working, what is not, and what he predicts spammers have up their sleeves next for defeating spam detection." From the interview: "If you don't mind deleting spam manually, that's your prerogative, but don't complain about it. If your ISP doesn't do a good job fighting spam, then switch ISPs or install your own anti-spam software. There are a lot of choices out there."

25 of 202 comments

  1. gmail has good spam protection by erick99 · · Score: 5, Informative

    Right around the time I got to over 300 spam a day, I tried Gmail (Google Mail). So far this is the best spam protection I have come across. My spam folder is getting about 400 a day now, but I can't remember the last time a "good" message went in there. I still get about five spam a day that I need to deal with manually.

    --
    http://www.busyweather.com/
    1. Re:gmail has good spam protection by exhilaration · · Score: 2, Informative
      No, Gmail filters do NOT whitelist messages.

      I've seen several of my filtered messages end up labeled as spam. Since they *were* spam, I was quite happy to see this.

  2. Cloudmark SpamNet by Zendar · · Score: 5, Informative
    Been using Cloudmark's SpamNet for over a year and haven't looked back since. Nothing gets by.

    Disclaimer: No interest in the company. Just a satisfied customer.

    1. Re:Cloudmark SpamNet by brj · · Score: 2, Informative

      I tried Cloudmark once, but found their false positive rate to be atrocious. They were tagging legitimate marketing emails from companies like REI that I had actively signed up for as spam. Their network of lusers are too lazy to unsubscribe from legit emails and they just report them as spam. Argh! (This was several years ago, so I don't know if things have improved since then.)

  3. My view by elid · · Score: 2, Informative
    OSDir.com: What's the craziest/toughest spamming scheme that the SpamAssassin team has encountered and dealt with?

    Quinlan: That would probably be advance fee fraud, also known as "Nigerian" or "419" scams. These messages are often literally sent individually to each recipient, mutating each time, by scammers typically located somewhere in West Africa. Because they often are sent in low volume, and almost every one is somewhat different, they are a bit tricky to catch.

    An easy solution for home users who don't happen to know anyone from West Africa is to just block all e-mail from there. But even without that, I have had decent success in the past with a combination of SpamAssassin tagging e-mails and Thunderbird filtering. Stay away from OE. Far, far away.

    1. Re:My view by daremonai · · Score: 2, Informative
      I have found that Bayesian filtering is essentially 100% effective on 419 scam mail. As is obvious when reading any of them, they have a very distinctive vocabulary...

      The "trick," such as it is, is to maintain three separate Bayes databases - a "good" one, a "spam" one, and a "419" one. Filter with good vs. spam first, and then with good vs. 419. This seems to work better than just lumping 419 mail in with other spam, since as Quinlan notes, the 419 scam mail tends to have little content in common with other spam. But with a separate filter, it can be identified with essentially 100% accuracy.
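      The two-pass idea above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual setup: the names WordBag, looks_bad and classify are hypothetical, the classifier is a plain naive Bayes with add-one smoothing and equal priors, and tokens that appear in neither database being compared are simply ignored.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class WordBag:
    """One Bayes 'database': word counts for a single class of mail (hypothetical helper)."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def train(self, text):
        tokens = tokenize(text)
        self.counts.update(tokens)
        self.total += len(tokens)

    def log_likelihood(self, tokens, vocab_size):
        # Add-one smoothing so a word unseen in this class does not zero the score.
        return sum(
            math.log((self.counts[t] + 1) / (self.total + vocab_size))
            for t in tokens
        )


def looks_bad(text, good, bad):
    """True if the message is more likely under `bad` than under `good` (equal priors assumed)."""
    # Tokens known to neither database carry no evidence, so skip them.
    tokens = [t for t in tokenize(text) if t in good.counts or t in bad.counts]
    vocab_size = len(set(good.counts) | set(bad.counts)) or 1
    return bad.log_likelihood(tokens, vocab_size) > good.log_likelihood(tokens, vocab_size)


def classify(text, good, spam, scam419):
    """Pass 1 compares good vs. spam; pass 2 compares good vs. 419."""
    if looks_bad(text, good, spam):
        return "spam"
    if looks_bad(text, good, scam419):
        return "419"
    return "good"
```

      Keeping the 419 vocabulary in its own database means it is compared against "good" mail on its own terms, rather than being diluted by the much larger and very different vocabulary of ordinary spam.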

  4. We use a Brightmail tool on Ironport appliances by csoto · · Score: 2, Informative

    IT IS THE BOMB. Spam loads to my work account dropped by orders of magnitude. Now, Mail.app identifies maybe 2 per day, instead of 200+.

    Charles

    --
    There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
  5. Once again.. by daeg · · Score: 4, Informative

    I've said it before, but I have to promote PopFile (http://popfile.sourceforge.net/) again. Since doing a bit of training, it now correctly sorts about 99% of my e-mail. I get about 600 messages a day not including mailing lists, and my accuracy is 99.65%. It is generally not susceptible to new spam techniques unless they can match the subject matter that my e-mail typically covers.

    When they start spamming "Linux IPF Apache LOOK! Vi@GR@ makes your peNi$ PHP Bug CSS" I will be concerned.

  6. Am I alone? by The+Eagle+Maint · · Score: 4, Informative

    Maybe I'm the lucky minority here, or my mail host has some crazy filters I don't know about, but I very, very rarely receive any type of spam. Now, I don't go handing out my email address either. If I'm signing up for something shady, I use another address at a web-based email account, which does get a lot of spam... but otherwise I use the mail host that comes with my website http://www.surpasshosting.com/ and Thunderbird as a client, and never see any type of spam.

    1. Re:Am I alone? by Saeed+al-Sahaf · · Score: 2, Informative

      Which is why, when you run your own personal mail server (qmail with vpopmail, anyone?), you should not have a default catch-all. If it does not go to a real account, /dev/null it.

      --
      "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
  7. SpamAssassin has SURBL support by Anonymous Coward · · Score: 1, Informative

    SpamAssassin got 'native' SURBL support in 3.0

  8. If you can't run your own mailserver... by vasqzr · · Score: 4, Informative


    A POP3 proxy works great. I recommend SpamBayes

    http://spambayes.sourceforge.net/

  9. Re:Complain as much as you can! by bbuR_bbuB · · Score: 2, Informative

    Most spam these days isn't coming from China and the far east. Instead, it is coming from zombie PCs haxored by spammers, most likely right in your own backyard. Well, maybe not your backyard, but a lot of it is definitely coming from the US again. So much for blocking .cn emails....

  10. personalized training by the+quick+brown+fox · · Score: 2, Informative
    Quinlan: Any technique that tries to identify "good" mail without authentication backing it up, or some form of personalized training. It worked well for a while, but it's definitely not an effective technique today.

    What's wrong with personalized training? I get more spam than almost anyone I know, and SpamBayes does a fantastic job for me.

  11. Re:you'ved been spammed! by QuasiEvil · · Score: 2, Informative

    For me it comes and goes, but yes, in the last couple weeks I've noticed a dramatic increase in false negatives. I feed them back into the bayesian filter for training, but it doesn't seem to help much. The worst part is that there's no real pattern to the stuff that gets through, other than the fact it tends to be very minimalist - a few words, often about a stock to invest in, etc.

    That said, SA has been a saviour of unimaginable proportions. I get 400-600 pieces of spam a day, and normally it's very good about getting all but 1-2 of them each day with hardly any false positives. Lately it's been letting 10-20 slip through, though.

  12. Re:Complain as much as you can! by frankie · · Score: 4, Informative

    Most spammers are not in U.S.

    This is false. The SpamHaus list shows the USA hosts more spammers than the other countries put together.

    the FBI who has bigger fish to fry

    This is somewhat true. We won't put a dent in spam from a legal perspective until a federal agency devotes some serious infrastructure to the job.

    That's mainly due to lack of willpower and expertise rather than funding, however. A competent "Spam Czar" armed with the authority to seize spammers' personal assets could easily achieve self-funded operation within a year.

  13. Re:you'ved been spammed! by Christopher_G_Lewis · · Score: 3, Informative

    It's just an arms race. SpamAssassin gets better, then the spammers adjust.

    Part of the problem with open source spam filters is that the Bad Guys can reverse engineer what's currently being tested.

    I kinda wish that the SpamAssassin group would separate their tests from their product development, so we could get more frequent updates of the "official" SpamAssassin filters. However, I remember reading somewhere that testing and evaluating any new rules against their current corpus takes quite a long time.

    Also, make sure you check out http://www.rulesemporium.com/ for more frequently updated rules.

  14. Two words: Spam Bayes by laxiepoo · · Score: 2, Informative

    Spam Bayes with Outlook correctly handles over 95% of my spam.

  15. Spamassassin much better with personal training by gvc · · Score: 3, Informative
    The article and the SpamAssassin documentation seem to imply that SpamAssassin is best used as a server-side filter.

    In fact I've found it works great as a personal filter, if you configure it somewhat differently from the way the documentation suggests. That is, increase the weight of the Bayes filter, and have it train itself on every message it classifies. Then correct it on any mistakes it makes - which rapidly become few and far between.

    Here's a paper showing that SpamAssassin can achieve as good results as others touted for personal use.

    Unfortunately SpamAssassin is a bit hard to install and set up. But if you have RedHat or Debian Linux, it is available by rpm/apt and you can install a few scripts to make it work.

    I wish I had a better shrink-wrapped version, but I don't. So I'm supplying the raw files for one user in the hopes that (a) somewhat technical people can reproduce the setup and be happy, (b) somebody will make a shrink-wrapped version, perhaps with plugins or extensions or macros for more mail clients.

    Here is the Linux Personal Spamassassin setup.

  16. Easy manual sorting.. by deacon · · Score: 3, Informative
    For those of us who prefer to sort manually, using Pine over SSH and leaving all email on the ISP's server works pretty well.

    With a full screen terminal window, I can mark spam based on the name and the subject header. I can recognize spam at a rate of about 10 per second this way. With the names spammers pick, and the misspelled subject headers, it is pretty easy to pick them out.

    Using Pine, I never give a spammer info by opening web bugs. I can look at the raw email by typing "h" to show the headers, so all those phishing emails are immediately obvious.

    Keeping the email on the ISP's server means that when I rebuild a machine, I don't have to worry about backing up my email.

  17. How I beat spam by Just+Some+Guy · · Score: 5, Informative
    I just wrote an article for this month's issue of Free Software Magazine on building spam filters. The long and short of it is that Spam Assassin is a very, very good last line of defense. However, there's a lot you can do to limit the amount of junk that even makes it that far into your system:
    1. Filter the HELO messages. If the sender says "HELO yourownname.example.com", then it's lying and you can safely reject the connection.
    2. Don't be overly picky about reverse DNS lookups, but do check that the domain of the From: address is resolvable. After all, what's the point of getting mail from "spew@nonexistentdomain.com" if you can't reply to them?
    3. Selective DNS blacklists. Do your homework and find a couple that are picky about what they add. Remember: false negatives are much better than false positives!
    4. SPF. It's not a cure all, but it works and it's available today.
    5. Greylisting. Oh, how I love thee!
    6. Finally, Spam Assassin, ClamAV, and other "expensive" defenses.

    Since I implemented the above as a Postfix ruleset, I don't get spam anymore, and it's not exactly like I've actually kept my primary address secret. No, I'm not kidding or exaggerating - basically, my mailbox is my own once again. Viva Postfix! Viva greylisting!
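    Two of the cheap checks in the list, the HELO test (item 1) and greylisting (item 5), can be sketched in a few lines. This is only an illustration of the logic, not the Postfix ruleset described above: the names helo_is_forged and Greylist, the 300-second delay, and the exact reply strings are all assumptions made for the example.

```python
import time


def helo_is_forged(helo_name, my_hostnames):
    """A remote client that greets us with one of our own names is lying."""
    return helo_name.strip().lower().rstrip(".") in my_hostnames


class Greylist:
    """Temporarily reject mail from a (client IP, sender, recipient) triplet until it retries later."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable so the logic can be exercised without waiting
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style verdict for this delivery attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet first showed up; later attempts reuse that time.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "250 OK"
        return "451 Try again later"
```

    Real mail servers queue and retry after a temporary failure, while much bulk-mailing software never comes back, which is why the cheap checks can run long before an "expensive" content filter ever sees the message.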

    --
    Dewey, what part of this looks like authorities should be involved?
  18. How I do it ... by Tripster · · Score: 2, Informative

    I manage a couple of ISP incoming MTAs. They come looking for an anti-spam and anti-virus solution, which is easy to provide them in OSS land.

    First Qmail setup to use RBLs ...

    cbl.abuseat.org sbl-xbl.spamhaus.org relays.ordb.org dynablock.njabl.org list.dsbl.org dul.dnsbl.sorbs.net

    That bunch will block a whole lotta spam before it ever gets to discuss sending mail with the SMTP server.
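    For anyone wondering what an RBL check actually does: the server reverses the octets of the connecting IPv4 address, appends the list's zone, and does an ordinary DNS lookup. An answer means "listed". A rough sketch follows, with hypothetical function names (dnsbl_query_name, is_listed); the zone is one of those named above, and whether any given list is still operating is not something this sketch checks.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name to look up: reversed IPv4 octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True if the blocklist zone returns an address record for this IP."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list.
        return False
```

    For example, a connection from 192.0.2.1 checked against cbl.abuseat.org turns into a lookup of 1.2.0.192.cbl.abuseat.org.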

    Next, SimScan from Inter7.com. This little C app runs at the front end of the SMTP process and scans incoming mail at the SMTP level with ClamAV and SpamAssassin; anything scoring over 10 in SA is dropped at the SMTP level with a 5xx error.

    SimScan allows you to fine tune settings on a per domain and per user level if you so desire, so it is easy to turn SA off entirely for a user who wants all the spam they can get, ditto for those who'd rather not be protected from viruses.

    Using these features you stop a LOT of spam, likely in the 80% or higher range. Most domains we've applied this to have gone from hundreds per day to less than 10 per day.

    It is imperative that you also use the SURBL features in SA to stop even more spam. You should also use Razor2, DCC and Pyzor. I suggest upping the Razor2 scores a bit as well, since the defaults are quite low.

  19. & for Windoze users... by robogun · · Score: 2, Informative

    Use SpamPal. It comes with blacklists, but you can turn them off because the reg expressions that came with it are very effective. There are also modules to decode base64, filter on spammed URLs, clean up web bug crap, block by country etc. & it's free.

  20. MOD DOWN PARENT by exhilaration · · Score: 2, Informative
    Dude, did you even RTFA???

    From the article:

    Quinlan: I'm sure some spammers look at our code, but the end effect is about the same as with closed source. To beat closed-source spam filters, all you need to do is install the filter somewhere or get an account at the ISP, then you just keep an eye on whether your spam is getting through.

    Also, much of our filtering relies on stuff not in the source code: user training via Bayes, network rules like SURBL for URI blocking, various DNS blocklists, and message checksum systems like DCC.

    To put it another way, closed source hasn't exactly protected closed-source programs from other types of security problems.

  21. Yahoo! Anyone? by G1aucon · · Score: 2, Informative

    I can't believe no one has mentioned Yahoo! yet. Automatic, accurate spam-filtering? Yes. White-listing? Yes. Black-listing? Yes. And if you want to stick with the free account, use Yahoo!POPs to download messages into Thunderbird.

    Personally, I have the upgraded (2GB) account so I can take advantage of what I consider the best anti-spam feature available anywhere: disposable email addresses.

    Not sure if you want to divulge your address for a free iPod contest? Give them a disposable address where email is directed straight past your inbox and into a separate folder. When you lose that iPod contest and the spam starts pouring in, just delete the disposable address.

    Sure, you can set up a free "junk mail" address with Hotmail or Yahoo!, but I've found that "checking in" on my spam is a waste of time.

    Of course, the best solution is to not give out your email address.