Slashdot Mirror


Critical Eye on SpamAssassin

ErrorBase writes "In this Infoworld article, Logan G. Harbaugh makes a great deal about an ancient (2.44) version of SpamAssassin, comparing it with newer commercial variants. Quote: 'You get what you pay for. [...] However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products.' Why did he not ask Kevin Railsback, who had the whole thing working some while ago?"

23 of 324 comments

  1. Re:What is a good client-side spam filter for Outl by reaper20 · · Score: 4, Informative

    SpamBayes, by far.

  2. Re:Is there a gui tool for configuring SpamAssassi by PhilippeT · · Score: 4, Informative

    Webmin is great for setting up just about anything you can think of.

    --
    A psychopath can't tell the difference between right and wrong. A sociopath knows the difference - he just doesn't care.
  3. Logan You Better Run by Anonymous Coward · · Score: 5, Informative

    Great - compare open source a generation or more old to fresh shrinkwrap. Who's zooming (or shilling for) who?

    My ISP (southern NH) runs SpamAssassin 2.6 - and I can tell you that at the default settings it catches 90-95% with .01% (yes Bucko, less than 1/1000) false positives. When they implemented it several versions ago it was just as good.

    I've got one client where they run NO filter - some folks (the names GOTTA be on the web site) get up to 100 spams a day. IT are basically monkeys with hands. I have no idea what the CEO thinks. They wouldn't even think OS as they're a total MS shop.

    1. Re:Logan You Better Run by shis-ka-bob · · Score: 4, Informative
      From the home page of Spam Assassin:
      Razor: Vipul's Razor is a collaborative spam-tracking database, which works by taking a signature of spam messages. Since spam typically operates by sending an identical message to hundreds of people, Razor short-circuits this by allowing the first person to receive a spam to add it to the database -- at which point everyone else will automatically block it.

      From the review:
      All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves. Brightmail allows users to forward misidentified e-mails to the administrator, who can choose to add the sender to the whitelist. SpamAssassin allows only the administrator to add to the whitelist, with no direct access for users.

      Who is missing something here? Me or the reviewer? It looks like Razor does exactly what he wants to do and claims that SpamAssassin doesn't do. It seems to me you are right ... selectively comparing old OS with newer commercial software so that he can make claims that are factually correct about SpamAssassin 2.44 but completely misleading about the current version.
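
The Razor description quoted above boils down to a shared lookup of message signatures. Here is a minimal Python sketch of that idea, assuming a plain SHA-1 over a normalized body; Razor's real signature scheme and network protocol are not shown in this thread, so `signature` and `SharedSpamDatabase` are purely hypothetical stand-ins.

```python
import hashlib
import re


def signature(body):
    """Hash a whitespace- and case-normalized body (an assumed, simplified signature)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class SharedSpamDatabase:
    """Hypothetical in-memory stand-in for a collaborative signature server."""

    def __init__(self):
        self._known = set()

    def report(self, body):
        # The first recipient of a spam adds its signature.
        self._known.add(signature(body))

    def is_known_spam(self, body):
        # Every later recipient only needs a lookup.
        return signature(body) in self._known


db = SharedSpamDatabase()
db.report("Buy   cheap pills NOW")
print(db.is_known_spam("buy cheap pills now"))  # same text, different spacing and case
print(db.is_known_spam("Lunch at noon?"))
```

Note that this is a shared blocklist of message bodies, which is a different thing from the per-sender whitelist the review talks about.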

      --
      Think global, act loco
    2. Re:Logan You Better Run by mrex · · Score: 3, Informative

      This "journalist" is a grade-A moron as has been demonstrated sufficiently already in this thread. The one new thing I have to add to this conversation is that, contrary to the following statement:

      SpamAssassin allows only the administrator to add to the whitelist, with no direct access for users.

      SpamAssassin (anything remotely resembling a current version) supports per-user whitelists and other preferences. It takes a little more skill to set up, but frankly the end result is way better than anything you're likely to achieve with a commercial product. The users of my ISP can simply log into a secure space on our website, where they can then view their assassinated spam, change their default score, and create individual white and black lists. This is accomplished with nothing but SpamAssassin, Apache, MySQL, and a few glue scripts. I would put our OSS-based solution in a head to head with any of those commercial offerings.
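
A rough Python sketch of how per-user preferences like the ones described above could be stored and resolved. The `userpref` table layout, the `@GLOBAL` fallback row, and the `load_prefs` helper are assumptions for illustration; the poster's actual schema and glue scripts are not shown, and SQLite stands in for MySQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO userpref VALUES (?, ?, ?)",
    [
        ("@GLOBAL", "required_hits", "5"),                  # site-wide default
        ("alice", "required_hits", "8"),                    # alice's own threshold
        ("alice", "whitelist_from", "friend@example.org"),
        ("bob", "blacklist_from", "offers@example.net"),
    ],
)


def load_prefs(conn, username):
    """Return preference -> list of values; a user's rows replace the global ones."""
    prefs = {}
    for scope in ("@GLOBAL", username):
        rows = conn.execute(
            "SELECT preference, value FROM userpref WHERE username = ? ORDER BY rowid",
            (scope,),
        ).fetchall()
        scoped = {}
        for preference, value in rows:
            scoped.setdefault(preference, []).append(value)
        prefs.update(scoped)  # later scope (the user) wins
    return prefs


print(load_prefs(conn, "alice"))
print(load_prefs(conn, "bob"))
```

A web front end would only need to insert and delete rows in such a table, which is why a few glue scripts can be enough.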

  4. I get what I pay for too from reading the article. by reaper20 · · Score: 4, Informative

    I don't understand why he's so critical of a free product. I upgraded to 2.60 and it's running nearly flawlessly. Since the program is so simple, you just upgrade it; there's no need to change configuration options if you don't need to. You just call it from procmail.

    Yeah all those GUI options look nice, but 90% of the time, why do I need to change my spamblocking settings? The Bayesian filter autoadjusts itself with little or no user intervention -- it's near transparent.

  5. Works for me by perlionex · · Score: 5, Informative

    I run a mail server at home on a Linux box, with Postfix and Spamassassin 2.60. I have it configured to label mail as spam once it hits 8 points, and to automatically chuck it into /dev/null once it hits 12 (using Postfix's header_checks).

    It works pretty well for me -- the mail server's only for my personal use so I don't really have to worry about irate subscribers suing me for dropping their legit mail =p and the 8-12 point range in the spam marking gives me a chance to vet through those suspicious mails briefly before deleting them.

    I've never tried any other spam filters on the server-side, so I can't really compare. I guess I'm also a bit of a Linux hacker so I don't mind tweaking all those config files along the lines of the FAQ and other hints on forums to get it to work the way I want it to.

    1. Re:Works for me by perlionex · · Score: 5, Informative
      Inside /etc/postfix/main.cf:
      # The header_checks parameter specifies an optional table with patterns
      header_checks = regexp:/etc/postfix/header_checks
      Inside /etc/postfix/header_checks (each "*" is escaped with a backslash so the regexp matches literal asterisks):
      /^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*/ REJECT
      Inside /etc/mail/spamassassin/local.cf:
      rewrite_subject 1
      report_header 1
      ok_languages en
      ok_locales en
      required_hits 8
      subject_tag [SUSPECTED SPAM]
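
To make the two thresholds in the config above concrete, here is a small Python sketch that mimics the resulting policy by counting the asterisks in the `X-Spam-Level` header. The `disposition` function and its return labels are made up for illustration; in the real setup SpamAssassin does the tagging and Postfix does the rejecting.

```python
from email import message_from_string

TAG_AT = 8      # required_hits in local.cf
REJECT_AT = 12  # number of escaped asterisks in the header_checks pattern


def disposition(raw_message):
    """Return 'reject', 'tag', or 'deliver' from the X-Spam-Level star count."""
    msg = message_from_string(raw_message)
    level = (msg.get("X-Spam-Level") or "").strip()
    stars = len(level) - len(level.lstrip("*"))  # leading asterisks only
    if stars >= REJECT_AT:
        return "reject"
    if stars >= TAG_AT:
        return "tag"
    return "deliver"


sample = "X-Spam-Level: " + "*" * 9 + "\nSubject: hello\n\nbody text\n"
print(disposition(sample))
```

Because the header_checks pattern is anchored only at the start, 12 or more asterisks all match, which is what the `>=` comparison mirrors.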
  6. Re:What is a good client-side spam filter for Outl by Anonymous Coward · · Score: 4, Informative

    I know people have been recommending SpamBayes but be warned - it is very slow to parse and move the emails. Only bother with this if you receive only a small volume of spam or have a pretty fast computer.

  7. He already sent an open letter to SAtalk by damian · · Score: 5, Informative

    He sent a long open letter to SAtalk. You can find it in the mailing list archive.

    1. Re:He already sent an open letter to SAtalk by Joseph Vigneau · · Score: 4, Informative

      Wow. Considering he probably got a lot of nasty emails from the zealot crowd, this is a well reasoned response. He laid out his review criteria, and how SA can be improved to fare better against its commercial competitors. Well done, and a good challenge for the committers of SA.

  8. Critical Eye on Tech Journalists by abulafia · · Score: 4, Informative
    In true form for throwaway articles like this, products are compared poorly:

    Each product was tested with a different stream of mail, so the number of messages received varied, but all received enough messages to assess their capabilities.

    Can you imagine someone writing "Oracle, Sybase and Postgres were compared. While the data and workloads were different, all products performed enough work to assess their capabilities."?

    All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves.

    I don't know anything about Brightmail. SpamAssassin end-user whitelist entries can be set up in a number of ways.

    And all the products but SpamAssassin use dynamic updates to keep up with the evolving technologies spammers use to circumvent less sophisticated filters.

    As alluded to in the summary, this is false with modern versions of SpamAssassin, which use Bayesian filtering. (The author later says he couldn't get it working.)

    However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products. [...] But just because the software is installed does not mean it will work -- filtering criteria must be added manually, and until that's done nothing is filtered out. Getting the various configuration files edited properly so that the whole package worked was not simple. Documentation was difficult to find, and not always easy to follow.

    While it is true that one must be comfortable with a text editor to configure Spamassassin, thus perhaps putting it out of reach of point-and-click admins and technical journalists, I also wouldn't be prone to put my mail servers in the hands of either of those groups of people.

    It looks for keywords in the subject or body of e-mails, but is frustrated by words not in the dictionary, such as "V!agra," or words that contain invisible HTML characters.

    While I am not sure what tests appeared in which version, I'm pretty sure 2.44 handled off-by-one words such as V!agra. I have no idea what he's talking about when he says "invisible HTML characters", but it does seem to point to a certain technical incompetence, similar to the ostrich belief - "If I can't see you, then you can't see me."
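
For what it's worth, catching "off-by-one" spellings with a regular expression is not hard. The Python sketch below builds a tolerant pattern from a word; the `LOOKALIKES` table and `obfuscation_pattern` helper are assumptions made up for this example, not actual SpamAssassin rules from any version.

```python
import re

# Hypothetical substitution table: characters spammers swap in for each letter.
LOOKALIKES = {"a": "a@4", "i": "i!1|", "o": "o0", "e": "e3", "s": "s$5"}


def obfuscation_pattern(word):
    """Compile a regex that tolerates look-alike characters and filler between letters."""
    parts = []
    for ch in word.lower():
        chars = LOOKALIKES.get(ch, ch)
        parts.append("[" + re.escape(chars) + "]")
    # Allow punctuation, spaces, or underscores between letters (e.g. "V.i.a.g.r.a").
    return re.compile(r"[\W_]*".join(parts), re.IGNORECASE)


pattern = obfuscation_pattern("viagra")
print(bool(pattern.search("Cheap V!agra here")))
print(bool(pattern.search("Meeting agenda attached")))
```

A word that is "not in the dictionary" is therefore no obstacle to a pattern-based rule, only to a naive exact-match keyword list.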

    This is not to say Spamassassin is the easiest thing in the world to deal with. I happen to love it, because of the extreme flexibility.

    I just get sick of tech journos who decide that because a tool doesn't have a gui and they don't want to take the time to configure it, it sucks.

    --
    I forget what 8 was for.
  9. You think 2.44 is ancient? by ryanvm · · Score: 4, Informative

    You think 2.44 is ancient? Feh - Debian 'stable' is still stuck with 2.20.

    1. Re:You think 2.44 is ancient? by Tom · · Score: 4, Informative

      Try http://www.backports.org for woody packages of SpamAssassin 2.60 (and other software).

      Aside from that, installing 2.60 into your home directory is absolutely painless. Just did that, before I learned about the backports.org website.

      --
      Assorted stuff I do sometimes: Lemuria.org
  10. SA+MailScanner works for me by cyways · · Score: 5, Informative

    I've found the easiest way to implement SpamAssassin is to invoke it through MailScanner. MailScanner uses third-party virus scanners and can optionally invoke SpamAssassin as well. With the free ClamAV antivirus product, you can build a powerful open source mail scanner. Even without a virus scanner, MailScanner detects and quarantines executable attachments and other dangerous content which represent the most common types of mail-borne viruses and worms.

    RedHat installs the daemonized version of SA as well as the SA Perl scripts. Using the daemon, the easiest implementation is to invoke SA in /etc/procmailrc on the mail delivery host; for mail gateways running sendmail, you need to use the milter interface. I've found the MailScanner+SpamAssassin approach much easier to configure than either of these methods, and you get virus scanning to boot!

    I suspect if the reviewer had compared SA 2.60+ to the commercial products, rather than the older 2.44 version used in the review, SA would have shown better results.

    I'd agree with the reviewer that one of the things SA lacks is an easy method for users to interact directly with the program. (Part of the issue has to do with security; SA runs as root. As I read the review, I wondered how the other products allow users to interact directly with the scanners without sacrificing security.) It's not easy to maintain per-user Bayesian filtering, for instance, but I generally recommend having the mail client, e.g., Mozilla, handle these tasks.

  11. Old, and on the list by satyap · · Score: 3, Informative

    Not only is this somewhat old news, it's been discussed on the spamassassin mailing list. Apparently, the article was edited so that it's more anti-spamassassin than the reviewer intended, but Mr. Harbaugh also defends his review of an older version of spamassassin as "it came with my Redhat 9" (NOT a direct quote). He also claims it took nearly an hour to install and set up. (I counter that it took seconds to install and minutes to set up.)

    The current version of spamassassin is 2.60.

  12. Try the Custom Rule Emporium! by sillypixie · · Score: 4, Informative
    I have SA 2.6 running as a plugin to the SunONE Messaging Server (v5.2), in BAREBONES mode (i.e. no RBL, no Bayesian, nothing but perl regex) and it filtered 591 spam from my boss's mailbox alone on the first weekend. 12 or 13 managed to sneak through.

    Since then, I've downloaded a bunch of rules from The SA Custom Rule Emporium and almost nothing gets through.

    If this guy had trouble, it is the fault of the documentation, not the product. Either that, or he was dumb enough not to upgrade to perl 5.8 or above, and spent forever installing modules.

    He says:
    SpamAssassin is the perfect example of first-generation techniques becoming outmoded by advances in spamming technology

    Funny how when you install an old version of the product, it seems outmoded, hmmm?

    Sheesh.

    Pixie
    --
    don't mess with those geekgrrls
  13. Re:a problem with reviewers by aug24 · · Score: 4, Informative
    Bollocks, the reviewer said in the damn article why he used it. It's cos that's what comes with RH9. I've just checked on the RH web site, and 9 is their current release.

    So if you want to whinge at anyone, whinge at RH. At least this shows that reviewers now think they should include FOSS in their reviews.

    Justin.

    --
    You're only jealous cos the little penguins are talking to me.
  14. POPFile by Anonymous Coward · · Score: 4, Informative

    I don't know anything about SpamBayes so I cannot comment on it at all.

    POPFile is easy to use. It also performs Bayesian filtering. It is what I use.

    http://popfile.sourceforge.net/

    My current POPFile statistics:
    Messages classified: 1,440
    Classification errors: 19
    Accuracy: 98.68%
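
The accuracy figure follows directly from the two counts above; a quick check in Python:

```python
classified = 1440
errors = 19

# Accuracy is the share of messages that were classified correctly.
accuracy = (classified - errors) / classified
print(f"{accuracy:.2%}")  # 1421 / 1440, rounded to two decimals
```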

  15. Re:Is there a gui tool for configuring SpamAssassi by Salo2112 · · Score: 4, Informative

    saconf works for the Windows versions of spam assassin.

    http://www.openhandhome.com/saconf.html

  16. Re:What is a good client-side spam filter for Outl by jpmrst · · Score: 3, Informative

    Spamagogo doesn't have quite the same setup, but it is good, and free for now.

    --

    Time for a snack.

  17. Re:What is a good client-side spam filter for Outl by junklight · · Score: 3, Informative

    indeed - I've been using this for a while now. No false positives, I see bits and pieces in my unsure folder - including the "Hi, here's that link you asked for http://spam.spam.spamcorp, cheers .." that Paul Graham reckons is the future of spam.

    Given I get over 100 spams a day and I see none of them I am very happy with this indeed.

  18. Re:What am I doing wrong? by Pasc · · Score: 3, Informative

    Are you running the newest version? 2.60 is much improved over previous versions.

    If you are running 2.60, have you trained and enabled the bayesian filters? By default you need to feed SpamAssassin about 300 spam and 300 ham (non-spam) messages for it to learn the difference. It will auto-train itself over time but it only auto-learns on messages that are very obviously (to it) spam or ham.
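
To show why the filter needs both spam and ham before it can tell them apart, here is a toy naive Bayes classifier in Python. It is only a sketch of the general technique with simple add-one smoothing; SpamAssassin's own tokenizer and probability combining are different, and the `TinyBayes` class is invented for this example.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy token-based naive Bayes; not SpamAssassin's actual algorithm."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return set(re.findall(r"[a-z0-9$!']+", text.lower()))

    def learn(self, text, label):
        self.messages[label] += 1
        self.tokens[label].update(self.tokenize(text))

    def spam_probability(self, text):
        # Start from the prior odds, then add each token's log-likelihood ratio.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for tok in self.tokenize(text):
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


nb = TinyBayes()
nb.learn("cheap pills buy now", "spam")
nb.learn("buy cheap watches", "spam")
nb.learn("lunch meeting at noon", "ham")
nb.learn("project meeting notes", "ham")
print(round(nb.spam_probability("buy cheap pills"), 3))
print(round(nb.spam_probability("meeting notes"), 3))
```

With only a handful of training messages the estimates are crude, which is the practical reason for requiring a few hundred of each kind before trusting the scores.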

    If you normally only get email from a select list of people then you may want to lower your threshold. For people you routinely receive email from, SpamAssassin will remember that they usually don't send you spam so if you occasionally get something with a high score from them it will automatically lower it a bit. So, you can lower your threshold and still not get any false positives.
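
The "remembers the sender" behaviour can be sketched as pulling each new score part of the way toward that sender's historical average. In the Python below, the halfway `factor` of 0.5 and the `SenderHistory` class are assumptions for illustration, not a transcription of SpamAssassin's auto-whitelist code.

```python
from collections import defaultdict


class SenderHistory:
    """Hypothetical per-sender score memory."""

    def __init__(self, factor=0.5):
        self.factor = factor              # how far to pull toward the sender's mean (assumed)
        self.totals = defaultdict(float)  # sum of raw scores seen per sender
        self.counts = defaultdict(int)    # number of messages seen per sender

    def adjust(self, sender, score):
        """Return the adjusted score, then record the raw score in the history."""
        if self.counts[sender]:
            mean = self.totals[sender] / self.counts[sender]
            adjusted = score + (mean - score) * self.factor
        else:
            adjusted = score  # no history yet, nothing to pull toward
        self.totals[sender] += score
        self.counts[sender] += 1
        return adjusted


history = SenderHistory()
for past_score in (0.5, 1.0, 0.0):
    history.adjust("friend@example.org", past_score)

# A one-off high score from a usually clean sender is pulled toward their mean of 0.5.
print(history.adjust("friend@example.org", 6.0))
print(history.adjust("stranger@example.net", 6.0))
```

The same mechanism works in the other direction: a habitual spammer's occasional low-scoring message gets pulled up.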

    I have my required_hits set to 3 and the only false positives I've seen (since switching to 2.60) have been mailing lists (one was from LinuxWorld, the other from another news site) and not person-to-person email. I receive 50-60 spam messages a day and only one or two a week get into my inbox.

    spam catching - >99%
    false positives (normal email) - 0%
    false positives (mailing lists) - .5%

    I do some stuff to keep SpamAssassin's bayesian filters well trained. Every couple weeks I will go in to my spam folder and quickly page through it. If I see a spam that the bayesian filters gave a low score (less than 90% sure it is spam) I will pipe it (I use pine) to sa-learn to train the bayesian filters (unless it was autotrained).