Slashdot Mirror


Critical Eye on SpamAssassin

ErrorBase writes "In this InfoWorld article, Logan G. Harbaugh makes a great deal about an ancient (2.44) version of SpamAssassin, comparing it with newer commercial variants. Quote: 'You get what you pay for. [...] However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products.' (Why did he not ask Kevin Railsback who had the whole thing working some while ago?)"

20 of 324 comments

  1. What is a good client-side spam filter for Outlook by Dancin_Santa · · Score: 1, Interesting

    What is a good, free client-side spam filter for Outlook?

  2. SpamAssassin by hookedup · · Score: 5, Interesting

    All my incoming mail comes through SpamAssassin (can't remember which version off the top of my head), and once in a blue moon a single piece of spam will manage to find its way through. When it does, I guess I should just applaud the spammer for being so devious.

    TrollAssasin would be nice, imagine seeing posts subjects as *****TROLL***** heh

    1. Re:SpamAssassin by Anonymous Coward · · Score: 1, Interesting

      Yeah, there's a good reason for some of those messages getting through. Hang on, let me check that "post anonymously" box... there we go.

      I work for a company that has decided that sending out spam is part of the revenue generating enterprise. Save the BS about how I should tell my boss to shove it, I have a lot more reasons to stay here than to go. At any rate, one of the things that most spammers (strike that, the good spammers) do is get as many of these tools as they can and run test mailings of outgoing mail through them. SpamAssassin is nice because it details exactly which parts of the message triggered the filter, and you can whittle the message down until it will pass by just about any installation.

      Most times when I send out mail, it ends up with a SpamAssassin score of about 3, and if anybody has it set that low they're probably losing valid mail too.

  3. Because by FreeLinux · · Score: 4, Interesting

    "Why did he not ask Kevin Railsback who had the whole thing working some while ago?"

    He expected to get the results that he normally gets with most commercial software. Click Setup.exe, answer a question or two and it's done, up and running. Further configuration is not required though it may be desired.

    The commercial vendors of SpamAssassin have not improved the core product in any way. What they have improved is the packaging, the installation, the default configuration and the interface to modify that configuration. The stock SpamAssassin does not offer that, although SpamAssassin setup is far simpler than that of some other packages out there.

  4. Taken from the two articles by lpontiac · · Score: 4, Interesting
    Kevin Railsback is Test Center operations manager at InfoWorld.

    versus

    IT consultant Logan Harbaugh is the author of two books on networking.

    The first found Spamassassin easy, the second found it hard. Hmmm.

    What really aggravates me is the typical "There are blacklists available that you can subscribe to, and some are updated regularly, but these are noncommercial lists with no guarantees." I'd like to see what guarantees the commercial lists come with.

  5. sixty-two percent? by dboyles · · Score: 4, Interesting

    [SpamAssassin] filtered only 62 percent of spam, whereas the other products produced great results, blocking 90 percent to 96 percent of all the spam they encountered with few, if any, legitimate messages blocked.

    To me, this statement is pretty telling. Harbaugh must get some completely different kinds of spam than me, because, even though I receive about 60 spam mails a day (directed to my "spam" folder, so I never see them until I scan the "From:" field and then delete them), maybe one per week makes it through the filter. And seeing as how I can't even remember the last time I got a false positive, that's a pretty damn good number.

    I can believe that if you receive a variety of mail and if you took no time to configure SpamAssassin other than cranking it up, maybe then it'll only catch 80% of the spam. But 62%? I'm not sure if Harbaugh is skewing the benchmarks or if he just doesn't know what he's doing.

    There are some legitimate issues with SpamAssassin that might not make it ready for the enterprise, but for a handful of users, I have been more than satisfied. And the price is right.

    --
    -- "Complacency is a far more dangerous attitude than outrage." -Naomi Littlebear
    1. Re:sixty-two percent? by alatesystems · · Score: 2, Interesting

      Sure they do, it's called NewsForge.

      Yes, I am kidding. Also, here is the email I sent our buddy:
      -------------
      In regards to your "bottom line" at the end of the article entitled:
      Commercial solutions win, spam loses, Nov 14th, you stated it was much
      harder to install, configure, and keep running.

      Although it isn't point and click like with windows, you said it was
      installed WITH red hat 9. All you had to do was add:

        :0fw
        | spamc

      to your /etc/procmailrc files to make it an enterprise wide spam filter.

      You also said it has scanty documentation, but it has full documentation for
      every configurable option available on their site, spamassassin.org.

      You said 63% spam identification??? What did you have your threshold set to?
      9? or some other high integer? I have mine set to 5 and I have a spam
      catching percent greater than 99%.

      "But just because the software is installed does not mean it will work --
      filtering criteria must be added manually, and until that's done nothing is
      filtered out." -- What is that??? You can edit the scores but all the scores
      have default values that are very good and require NO editing. I can
      understand if you are a linux/*nix newbie, but you should have a disclaimer
      in your article instead of bashing an open source project that works quite
      well with no configuration other than procmail.

      As far as the whitelisting you said that could not be done by normal users:
      first: there are many web(php and perl) applications that let you do this
      over the web and also will let you view quarantined mail over the web.
      Second: from the spamassassin man page: -W, --add-to-whitelist
      Add addresses in mail to whitelist (AWL)

      From your article again: "There are blacklists available that you can
      subscribe to, and some are updated regularly, but these are noncommercial
      lists with no guarantees." Those "non-commercial" lists are used by ALL the
      commercial products. In fact, one of the major commercial antispam product
      companies just bought spamcop to ensure its success in the future. Those
      blacklists are not ones you have to "subscribe to" as you purport, but are
      already used. Vipul's razor which IS a signature product used by
      CloudMark's commercial software is automatically used if found. You can
      install that from an rpm. In your chart, you said that SA cannot use
      signature based scans.

      To keep SA up to date, guess what you type. up2date spamassassin. OH MY
      GOODNESS!!! That was very difficult. Or if you want, since you're using red
      hat 9, you can type yum upgrade spamassassin.

      "Filtering rules are relatively basic, and although there is a Bayesian
      filter available, it is not part of the distribution -- and I wasn't able to
      get it working for this review." Filtering rules are not basic in any form
      and Bayesian filter is included. Another lie(a disturbing trend for a
      "journalist"). Simply add use_bayes 1 to the local.cf configuration file.

      Where to begin? "It looks for keywords in the subject or body of e-mails,
      but is frustrated by words not in the dictionary, such as "V!agra," or words
      that contain invisible HTML characters." I get TONS of spam both in the
      enterprise and at home and spamassassin gets more than 99% of it with 0
      false positives. Believe me, it gets the "vee ag ra" and the "v!agra" and
      variants. 100% of the time, in fact.

      Your chart said no end user access to quarantined mail, but you can easily
      put it into any folder you want because spamassassin writes a header:
      X-Spam-Status: Yes. That means you can also put into /etc/procmailrc the
      following:

        :0:
        * ^X-Spam-Status: Yes
        $HOME/mail/Spam

      And like Emeril, BAM! Enterprise wide filtering and quarantining of mail
      into a Spam folder.

      I really wish before you create another article f
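The header-based filing described in the email above can be sketched outside procmail as well. This is only an illustration, not how procmail or SpamAssassin work internally: the function name `pick_folder`, the folder paths, and the sample header values are assumptions made up for the example. The only thing taken from the comment is that tagged mail carries an `X-Spam-Status` header whose value starts with `Yes`.

```python
from email import message_from_string


def pick_folder(raw_message, spam_folder="mail/Spam", inbox="mail/Inbox"):
    """Return the folder a message should be filed in.

    Mirrors the quoted procmail condition '* ^X-Spam-Status: Yes':
    a message counts as spam when that header's value starts with 'Yes'.
    Folder names here are hypothetical placeholders.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    if status.strip().lower().startswith("yes"):
        return spam_folder
    return inbox


# Illustrative messages; the header details after "Yes"/"No" are made up.
tagged = "X-Spam-Status: Yes, hits=7.2 required=5.0\nSubject: hi\n\nbody\n"
clean = "X-Spam-Status: No, hits=0.3 required=5.0\nSubject: hi\n\nbody\n"
untagged = "Subject: hi\n\nbody\n"
```

Because the decision depends only on a header, any mail client or delivery agent that can match headers can do the same filing.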

  6. Thanks for the reminder!! by Perl-Pusher · · Score: 3, Interesting

    I was using version 2.44, and I was able to compile and upgrade SpamAssassin before the number of posted replies hit 60! Can't be too hard!

  7. What am I doing wrong? by TamMan2000 · · Score: 3, Interesting

    All my mail comes through spamassassin as well, but I am not having nearly the success you are...

    I get about 60-70% of my spam correctly tagged, and about .2-.5% false positive. Don't get me wrong, I am WAY happier now than before SpamAssassin, but if I could be getting better performance, that would be great...

    --
    "I'll have a Guinness, no wait, make that a Coors Light" -Grad student I work with, who shall remain anonymous...
  8. He was trying to make a point by Zebra_X · · Score: 3, Interesting

    While his review was perhaps not scientifically conducted, I think there was a point to be made with the SpamAssassin blurb.

    Notice that he deliberately took a standard install from RedHat 9, something some IT person (Not a tr00 g33k) might buy at CompUSA. He then tried to install the provided product. Clearly, a tr00 g33k would go and download the latest release, but keep in mind that not everyone is so comfortable with being on the bleeding edge - I believe that this was a point he tried to make. There is also the perception that the release provided with a "product" such as RedHat 9 will be up to the same standards as the OS.

    While it's true the latest version has default rules and whatnot, it's quite likely that his older, out-of-date version does not. In fact, going briefly to the SpamAssassin home page, the links for the 2.5 and 2.4 release documentation are broken.

    The point to be made was: OSS needs to be more buttoned up. Notice that he said that he had no trouble installing RedHat 9. That's because the installer is rather good.

  9. modifying subjects and other content by dan_bethe · · Score: 2, Interesting
    TrollAssasin would be nice, imagine seeing posts subjects as *****TROLL***** heh

    I know you're just joking, but to be serious for a minute, the reason not to do that is because you'd be transparently altering someone else's copyrighted property. Overzealous and/or overworked sysadmins misconfigure SA to globally analyze all incoming content and then to alter email subjects based on its opinion. This is an invasion of content, certainly prone to false positives because antispam scanning is an individually trained process, and breaks the trail of reply threads at least on a visual basis. There are always going to be tons of misconfigured or RFC ignorant smtp servers out there, and being compatible with them is what makes the Internet work. That would include corporate servers, legitimate opt-in bulk mail, and opt-in mailing lists run by Some Dude. There will be people on a mailing list whose personal content is always publicly marked by certain recipients as spam! It's confusing, insulting, and unnecessary. SMTP has invisible meta-tags in its headers to allow for that, and agents are supposed to respect them.

    This is fine for using SA's global config as your personal config for your own little systems, but not for an ISP or business.

    According to spamassassin.org:

    We strongly urge ISPs installing the product to notify their users when it's installed, and to not enable it by default -- but many seem to ignore this advice. We agree, that's totally unprofessional. :(
  10. Re:I get what I pay for too from reading the artic by Anonymous Coward · · Score: 1, Interesting

    I believe the article is a bit unfair on SpamAssassin. SpamAssassin does fairly well at what it is good at -- filtering spam. The other commercial products seem to be a total solution package, which would not only filter spam but also let you configure it so that, for example, you could have special spam folders with an auto expiry date.

    I would be more interested in seeing comparisons on how well it compares with other commercial products on the success rate of identifying spam email (false positives would also be quite interesting).

    Having said that, I agree that it would be nice if there were some programs or scripts that would automate the setting up of these nice ``extra'' features for you.

    A final note, it seems that the article is not very accurate. I am quite sure that spamassassin would allow you to define whitelists, however, that requires running it as root and that has security implications.

  11. Arsehole by FinestLittleSpace · · Score: 2, Interesting

    Does he by any chance love outlook rules as well?

    SpamAssassin is on my server and is absolutely brilliant.. it catches 99.9% of all my spam, and has only on 5-10 occasions in the past month (I get about 50-60 emails a day) counted 'innocent' mail as spam... and even those were newsletters....

    Anyone who slates SpamAssassin is one very deluded person... it's Open Source, constantly improved... open to editing by its users, rules can be added.... marvellous.

    Commercial variants I've seen have been painfully badly implemented and not worked properly. Get SpamAssassin and fight the closed source lovers :)

  12. tech vs. consultant, humorous by motorsabbath · · Score: 2, Interesting

    Humorous how the guy who liked SpamAssassin (Kevin Railsback) was a tech who actually set it up for use at InfoWorld and the guy who didn't like it is an "IT consultant the author of two books on networking." Always trust a tech.

    --
    The heat from below can burn your eyes out
  13. Re:I get what I pay for too from reading the artic by gid · · Score: 2, Interesting

    Exactly. I had SA integrated into exim with custom rules and whatnot, but it would break on upgrading the Debian package. That happened twice, and each time I needed to tweak exim.

    Then I found out about the beauty of procmail once I looked into filtering all spam to its own folder without email client filters. So now, I have different emails filtered to specific folders before they ever hit my inbox. Oh and I had to disable the Bayesian filter, it was catching way too many non-spam emails. Stuff that didn't have any keywords in it at all. One was just a couple quick sentences from a friend, who knows why it thought it was spam. :( I really should re-enable the Bayes stuff, and figure out how to teach it what isn't spam.

    Here's a watered down version of my procmail file for those interested: http://gid0ze.net/dl/dot.procmailrc

  14. Re:The algorithm by perlionex · · Score: 3, Interesting

    Bayesian filtering is a bit like fuzzy logic. Right now, it's best known for filtering spam. SpamAssassin uses a whole long list of tests and assigns a positive or negative score to each test that matches (a bit like Slashdot's moderation).
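To make the score-per-test idea concrete, here is a minimal sketch of additive scoring. It is not SpamAssassin's code: the rule names and score values in `RULE_SCORES` are invented for illustration, and the threshold of 5 is simply the setting another commenter in this thread says they use.

```python
# Hypothetical rule names and scores, made up for this example.
RULE_SCORES = {
    "SUBJECT_OBFUSCATED_DRUG_NAME": 3.5,   # e.g. "V!agra"-style spellings
    "HTML_ONLY_BODY": 2.0,
    "SENDER_IN_WHITELIST": -4.0,           # negative scores pull mail back toward ham
}


def classify(matched_rules, scores=RULE_SCORES, threshold=5.0):
    """Sum the scores of every rule that matched and compare to a threshold.

    Returns (total_score, is_spam).
    """
    total = sum(scores[name] for name in matched_rules)
    return total, total >= threshold


spammy = classify(["SUBJECT_OBFUSCATED_DRUG_NAME", "HTML_ONLY_BODY"])
friendly = classify(
    ["SUBJECT_OBFUSCATED_DRUG_NAME", "HTML_ONLY_BODY", "SENDER_IN_WHITELIST"]
)
```

Raising the threshold trades missed spam for fewer false positives, which is why the threshold setting matters so much when comparing catch rates.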

    I know someone who did a project on classifying video using Bayesian filtering. It looked at stuff like brightness, contrast, volume, basically everything they could extract from the movie file and give a value to. The concept itself is quite powerful; the difficulty is getting a list of tests that can accurately predict / classify what you have (spam/non-spam, or for video, thriller/drama/etc).

    If you're interested in finding out more about actually coding Bayesian filters, you can check out the Bayes++ project page.
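For anyone curious what a Bayesian text classifier looks like at its simplest, here is a toy naive Bayes sketch with add-one smoothing. It is an assumption-laden illustration, not SpamAssassin's Bayes implementation and not the Bayes++ library: the class name `NaiveBayesFilter`, whitespace tokenisation, and the tiny training set are all made up for the example.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy two-class (spam/ham) naive Bayes classifier over word tokens."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the tokens of one message under 'spam' or 'ham'."""
        self.token_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return P(spam | text) under the naive independence assumption."""
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            n_tokens = sum(self.token_counts[label].values())
            for token in text.lower().split():
                # Add-one smoothing so unseen tokens never give probability zero.
                count = self.token_counts[label][token]
                log_p += math.log((count + 1) / (n_tokens + len(vocab) + 1))
            log_score[label] = log_p
        # Normalise in log space to avoid underflow on long messages.
        highest = max(log_score.values())
        weights = {k: math.exp(v - highest) for k, v in log_score.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


bayes = NaiveBayesFilter()
bayes.train("cheap pills buy now", "spam")
bayes.train("buy cheap pills", "spam")
bayes.train("lunch tomorrow at noon", "ham")
bayes.train("meeting notes attached", "ham")
```

The `train(..., "ham")` calls are the "teach it what isn't spam" step: a filter that has only seen spam has nothing to weigh short, innocent messages against.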

  15. Re:What is a good client-side spam filter for Outl by professorhojo · · Score: 2, Interesting

    spampal does the trick for me.

    quick and effective identification. can check the online black hole lists for IP ranges to block and you can manually set the thing up to ignore email from any country. :)

    goooooodbye china!

  16. Re:a problem with reviewers by zerocool^ · · Score: 2, Interesting

    A problem we had here at Netmar was that SpamAssassin, in conjunction with MIMEDefang, really slams the system. We have several clients who run listserv-type email lists (for various reasons, all verified non-spam, most for non-profit orgs), and when you send a 500k listserv digest email to 2,000 people, the default SpamAssassin config would spawn a perl process for each attempted email. So, for about 3 minutes, our mail server would be swamped (load creeping up over 10ish), even though it's a 1.2 GHz Duron.

    So, we solved it by figuring out how to run SpamAssassin / MIMEDefang as daemons. Works great now, and when someone tries to send 2,000 messages, it just queues them and delivers them as it can. Takes less time to get through them one at a time than it did to spawn max_file_descriptors perl processes.

    ~Wx

    --
    sig?
  17. But fix your .procmailrc by jlv · · Score: 2, Interesting

    And you better change that simple, straightforward procmail recipe to use ":0fw:" on the first line. That trailing ":" is important if you are not running spamd, as it makes procmail use a lock file and only run 1 instance of SpamAssassin at a time. Otherwise, if you get 30 messages, you'll get 30 instances of SpamAssassin, which is 30 instances of Perl, etc. Large load spike.
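The effect of that lock file, one filter run at a time no matter how many messages arrive together, can be sketched as follows. This is only an analogy and not how procmail implements its lockfiles: it uses `fcntl.flock`, which is available on Unix-like systems only, and the names `exclusive_lock`, `filter_message`, and `run_filter` are hypothetical.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no one else holds it
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def filter_message(message, lock_path, run_filter):
    """Run an expensive filter on one message, serialised through the lock.

    run_filter stands in for the heavyweight filter process; with the lock,
    30 simultaneous deliveries queue up instead of starting 30 filters at once.
    """
    with exclusive_lock(lock_path):
        return run_filter(message)
```

The trade-off is latency: messages wait their turn, but the machine never has to carry more than one filter process at a time.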

  18. Re:Is Running Home Server Worth It? by timeOday · · Score: 2, Interesting
    I like having my own email server at home because I can make up a different email address each time I give one out - any email address I want, since it's at my own domain. This is the key to my spam filtering.

    As for maintenance, there isn't any. I set up exim two or three years ago and have hardly touched it since.