Slashdot Mirror


Critical Eye on SpamAssassin

ErrorBase writes "In this InfoWorld article, Logan G. Harbaugh makes a great deal of an ancient (2.44) version of SpamAssassin, comparing it with newer commercial variants. Quote: 'You get what you pay for. [...] However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products.' (Why did he not ask Kevin Railsback who had the whole thing working some while ago?)"

99 of 324 comments

  1. Re:What is a good client-side spam filter for Outl by reaper20 · · Score: 4, Informative

    SpamBayes, by far.

  2. SpamAssassin by hookedup · · Score: 5, Interesting

    All my incoming mail comes through SpamAssassin (can't remember which version off the top of my head), and once in a blue moon a single piece of spam will manage to find its way through. When it does, I guess I should just applaud the spammer for being so devious.

    TrollAssassin would be nice, imagine seeing post subjects as *****TROLL***** heh

    1. Re:SpamAssassin by Cygnus78 · · Score: 5, Funny

      Why not create SlashAssassin? All incoming mail gets moderated. Let's see... aah, a +5 Interesting. What about this one? Oh, a mail from dad... moderated Offtopic! How typical. Ah, one from my brother... Flamebait!

    2. Re:SpamAssassin by endx7 · · Score: 2, Insightful

      TrollAssassin would be nice, imagine seeing post subjects as *****TROLL***** heh

      Seriously, wasn't that one of the ideas behind moderation?

    3. Re:SpamAssassin by Brummund · · Score: 2, Informative

      That is called scoring. Gnus and other good email/news clients have this. Very useful for reading high-volume lists and avoiding USENET kooks.

  3. Re:What is a good client-side spam filter for Outl by PhilippeT · · Score: 2, Insightful

    Not using Outlook? Seriously, most good anti-spam filters are server side.

    --
    A psychopath can't tell the difference between right and wrong. A sociopath knows the difference - he just doesn't care.
  4. Is there a gui tool for configuring SpamAssassin? by ACK!! · · Score: 4, Insightful

    Seems like this guy did not verbalize it, but that was his problem. If you know what you are doing, hacking a conf file from vi is easier than a GUI for sure. However, his low performance and configuration woes would probably have been handled with an easy-to-use graphical interface.

    Aren't there tools that do this?

    --
    ACK /ak/ interj. 2. [from the comic strip "Bloom County"] An exclamation of surprised disgust, esp. i
  5. a problem with reviewers by Taranis-BSD · · Score: 5, Insightful

    This was just a setup to make commercial software look better, or just an incompetent reviewer. Next.

    1. Re:a problem with reviewers by Mysticalfruit · · Score: 2, Insightful

      Ding, we have a winner...

      This is akin to when Ballmer was quoted comparing Redhat 6 vs. Longhorn or XP or whatever.

      This guy's just following the first rule of "marketbenching":

      "When in doubt, skew results in favor of the company that's paying you the most..."

      --
      Yes Francis, the world has gone crazy.
    2. Re:a problem with reviewers by I+am+Kobayashi · · Score: 4, Insightful
      Agree. I was going to post as an answer to the question:
      "Why did he not ask Kevin Railsback who had the whole thing working some while ago?"
      Because Freeware doesn't pay for advertisements in his publication....
      It is always nice to see a lack of journalistic integrity in reviewers...
      --
      --Kobayashi--
    3. Re:a problem with reviewers by AKnightCowboy · · Score: 2, Informative
      This was just a setup to make commercial software look better, or just an incompetent reviewer. Next.

      Spamassassin didn't seem that hard to install. I just typed "apt-get install spamassassin" and just piped my mail through it with a procmail recipe:

      :0fw
      | spamassassin -P

      :0:
      * ^X-Spam-Status: Yes
      spam

      Seemed simple and straightforward. Granted, if you're doing it on an entire machine basis you'd just use spamd/spamc and set up a filter on the mail server itself. For one user though I'm not sure how it could be any simpler. If I want to whitelist people I just add them to my ~/.spamassassin/whitelist file. *shrug*
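
For readers who do not use procmail, the filing rule in the recipe above can be sketched in Python. This is only an illustration of the header test, not how procmail or SpamAssassin is implemented; the function name `pick_folder` and the `inbox` folder name are assumptions made for this example, while `spam` is the folder named in the recipe.

```python
from email import message_from_string


def pick_folder(raw_message: str) -> str:
    """Return 'spam' if the message carries 'X-Spam-Status: Yes', else 'inbox'.

    'inbox' is a hypothetical default folder name chosen for this sketch.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # Mirrors the procmail condition "* ^X-Spam-Status: Yes"
    # (procmail conditions are case-insensitive by default).
    return "spam" if status.strip().lower().startswith("yes") else "inbox"


tagged = "X-Spam-Status: Yes, hits=9.1 required=5.0\nSubject: hi\n\nbody\n"
clean = "Subject: hello\n\nbody\n"
print(pick_folder(tagged), pick_folder(clean))
```

The header values in `tagged` are made up for the demonstration; the only thing the rule inspects is whether the header starts with "Yes".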

    4. Re:a problem with reviewers by aug24 · · Score: 4, Informative
      Bollocks, the reviewer said in the damn article why he used it. It's cos that's what comes with RH9. I've just checked on the RH web site, and 9 is their current release.

      So if you want to whinge at anyone, whinge at RH. At least this shows that reviewers now think they should include FOSS in their reviews.

      Justin.

      --
      You're only jealous cos the little penguins are talking to me.
    5. Re:a problem with reviewers by Taranis-BSD · · Score: 2, Insightful

      Clearly you did not check well enough. RedHat 9 is now very old by distro standards and has been replaced by their commercial line of products or Fedora.

    6. Re:a problem with reviewers by black+mariah · · Score: 3, Insightful

      It doesn't pay for Slashdot either. Notice those nice shiny MS ads up there?

      --
      'Standards' in computing only impress those who are impressed by things like 'standards'.
    7. Re:a problem with reviewers by IANAAC · · Score: 2, Insightful

      Yes, RedHat 9 is considered old by the OSS community, but not by the general public. There are still many people running RH9 out there. Hell, there are still a lot of people running RH7.x (particularly on servers).

    8. Re:a problem with reviewers by aug24 · · Score: 3, Insightful
      very old by distro standards

      That is the oldest canard (read: excuse) in the FOSS zealot's book. And I say that as a regular proselytiser myself.

      How old is Red Hat 9? It was the current release till earlier this year, when they launched Fedora. So, he used a version that is a few months old. Whoop-de-fuck. 'Very old' my arse.

      J.

      --
      You're only jealous cos the little penguins are talking to me.
    9. Re:a problem with reviewers by JonnyCalcutta · · Score: 4, Insightful

      But he didn't upgrade it. Would it be acceptable if he tested an anti-virus product he got with the PC he bought last year and he didn't update the virus defs? Or perhaps he should have used the release version of Brightmail from the time of the Windows XP launch?
      Anybody using an old version of anti-virus or anti-spam software gets what they deserve (or gets the review their advertisers want). I use spamassassin and clamav with mimedefang on my corporate gateway and you have to upgrade spamassassin regularly or more and more spam starts slipping through - this is the nature of anti-spam and I'm sure is just as true of brightmail and the others.

    10. Re:a problem with reviewers by zerocool^ · · Score: 2, Interesting

      A problem we had here at Netmar was that spam assassin, in conjunction with mime-defang, really slams the system. We have several clients who run listserv-type email lists (for various reasons, all verified non-spam, most for like non-profit orgs), and when you send a 500k listserv digest email to 2,000 people, in the default spam assassin config, it would spawn a perl process for each attempted email. So, for about 3 minutes, our mail server would be swamped (load creeping up over 10ish), even though it's a 1.2 GHz Duron.

      So, we solved it by figuring out how to run spam assassin / defang as daemons. Works great now, and when someone tries to send 2,000 messages, it just queues them and delivers them as it can. Takes less time to get through them one at a time than it did to spawn max_file_descripters perl processes.

      ~Wx

      --
      sig?
    11. Re:a problem with reviewers by Yo+Grark · · Score: 2, Funny

      Damn. Time to upgrade from RH5.1....

      Yo Grark
      Canadian Bred with American Buttering

      --
      Canadian Bred with American Buttering
  6. Re:Is there a gui tool for configuring SpamAssassi by PhilippeT · · Score: 4, Informative

    Webmin is great for setting up just about anything you can think of.

    --
    A psychopath can't tell the difference between right and wrong. A sociopath knows the difference - he just doesn't care.
  7. Coming soon at Infoworld... by JohnGrahamCumming · · Score: 4, Insightful

    "We compare a collection of recent operating systems: Windows XP Professional, Mac OS X Panther, Debian GNU/Linux 0.91".

    Seriously, InfoWorld, SpamAssassin 2.44 was released in February, all the other vendors you compared were constantly updating their products to cope with the ever changing nature of spam.

    John.

  8. Logan You Better Run by Anonymous Coward · · Score: 5, Informative

    Great - compare a generation or more older open source to fresh shrinkwrap. Who's zooming (or shilling) for who?

    My ISP (southern NH) runs SpamAssassin 2.6 - and I can tell you that at the default settings it catches 90-95% with .01% (yes Bucko, less than 1/1000) false positives. When they implemented it several versions ago it was just as good.

    I've got one client where they run NO filter - some folks (the names GOTTA be on the web site) get up to 100 spams a day. IT are basically monkeys with hands. I have no idea what the CEO thinks. They wouldn't even think OS as they're a total MS shop.

    1. Re:Logan You Better Run by shis-ka-bob · · Score: 4, Informative
      From the home page of Spam Assassin:
      Razor: Vipul's Razor is a collaborative spam-tracking database, which works by taking a signature of spam messages. Since spam typically operates by sending an identical message to hundreds of people, Razor short-circuits this by allowing the first person to receive a spam to add it to the database -- at which point everyone else will automatically block it.

      From the review:
      All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves. Brightmail allows users to forward misidentified e-mails to the administrator, who can choose to add the sender to the whitelist. SpamAssassin allows only the administrator to add to the whitelist, with no direct access for users.

      Who is missing something here? Me or the reviewer? It looks like Razor does exactly what he wants to do and claims that SpamAssassin doesn't do. It seems to me you are right ... selectively comparing old OS with newer commercial software so that he can make claims that are factually correct about SpamAssassin 2.44 but completely misleading about the current version.
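
The collaborative idea in the Razor description quoted above (the first recipient reports a message, everyone else then blocks identical copies) can be sketched as a shared set of message signatures. This is a toy illustration only: the class `SignatureDB`, its methods, and the use of a plain SHA-1 over whitespace-normalized text are assumptions for this example, not Razor's actual signature scheme or API.

```python
import hashlib


class SignatureDB:
    """Hypothetical shared database of spam signatures (illustration only)."""

    def __init__(self) -> None:
        self._known: set[str] = set()

    @staticmethod
    def signature(body: str) -> str:
        # Assumption: lowercase and collapse whitespace, then hash the result.
        normalized = " ".join(body.lower().split())
        return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

    def report(self, body: str) -> None:
        """Called by the first person who receives the spam."""
        self._known.add(self.signature(body))

    def is_spam(self, body: str) -> bool:
        """Called by everyone else before delivery."""
        return self.signature(body) in self._known


db = SignatureDB()
db.report("Buy   NOW, limited offer")
print(db.is_spam("buy now, limited offer"), db.is_spam("Lunch on Friday?"))
```

An exact hash like this is trivially defeated by spam that varies per recipient, which is one reason real systems use fuzzier signatures.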

      --
      Think global, act loco
    2. Re:Logan You Better Run by Refried+Beans · · Score: 2, Funny
      IT are basically monkeys with hands.

      Monkeys don't have hands!?!?

    3. Re:Logan You Better Run by mrex · · Score: 3, Informative

      This "journalist" is a grade-A moron as has been demonstrated sufficiently already in this thread. The one new thing I have to add to this conversation is that, contrary to the following statement:

      SpamAssassin allows only the administrator to add to the whitelist, with no direct access for users.

      SpamAssassin (anything remotely resembling a current version) supports per-user whitelists and other preferences. It takes a little more skill to set up, but frankly the end result is way better than anything you're likely to achieve with a commercial product. The users of my ISP can simply log into a secure space on our website, where they can then view their assassinated spam, change their default score, and create individual white and black lists. This is accomplished with nothing but SpamAssassin, Apache, MySQL, and a few glue scripts. I would put our OSS-based solution in a head to head with any of those commercial offerings.

  9. I get what I pay for too from reading the article. by reaper20 · · Score: 4, Informative

    I don't understand why he's so critical of a free product. I upgraded to 2.60 and it's running near flawless, and since the program is so simple, you just upgrade it, no need to change configuration options if you don't need to, you just call it from procmail.

    Yeah all those GUI options look nice, but 90% of the time, why do I need to change my spamblocking settings? The Bayesian filter autoadjusts itself with little or no user intervention -- it's near transparent.

  10. Works for me by perlionex · · Score: 5, Informative

    I run a mail server at home on a Linux box, with Postfix and Spamassassin 2.60. I have it configured to label mail as spam once it hits 8 points, and to automatically chuck it into /dev/null once it hits 12 (using Postfix's header_checks).

    It works pretty well for me -- the mail server's only for my personal use so I don't really have to worry about irate subscribers suing me for dropping their legit mail =p and the 8-12 point range in the spam marking gives me a chance to vet through those suspicious mails briefly before deleting them.

    I've never tried any other spam filters on the server-side, so I can't really compare. I guess I'm also a bit of a Linux hacker so I don't mind tweaking all those config files along the lines of the FAQ and other hints on forums to get it to work the way I want it to.

    1. Re:Works for me by perlionex · · Score: 5, Informative
      Inside /etc/postfix/main.cf:
      # The header_checks parameter specifies an optional table with patterns
      header_checks = regexp:/etc/postfix/header_checks
      Inside /etc/postfix/header_checks (note: replace "*" with "[backslash]*"):
      /^X-Spam-Level: ************/ REJECT
      Inside /etc/mail/spamassassin/local.cf:
      rewrite_subject 1
      report_header 1
      ok_languages en
      ok_locales en
      required_hits 8
      subject_tag [SUSPECTED SPAM]
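
The two-threshold policy in the configuration above (tag at 8 points, reject at 12) can be sketched in Python by counting the asterisks in the `X-Spam-Level` header. This is a rough illustration rather than Postfix or SpamAssassin code: the function names and the `deliver`/`tag`/`reject` labels are made up for this example, and it assumes the `header_checks` pattern is meant to hold twelve asterisks, as the 12-point description suggests.

```python
import re

TAG_AT = 8      # required_hits from local.cf in the comment above
REJECT_AT = 12  # assumed number of asterisks in the header_checks pattern


def spam_level(headers: str) -> int:
    """Count the asterisks in the X-Spam-Level header (0 if it is absent)."""
    match = re.search(r"^X-Spam-Level: (\*+)", headers, re.MULTILINE)
    return len(match.group(1)) if match else 0


def action(headers: str) -> str:
    """Return a hypothetical action label for a message's headers."""
    level = spam_level(headers)
    if level >= REJECT_AT:
        return "reject"
    if level >= TAG_AT:
        return "tag"
    return "deliver"


print(action("X-Spam-Level: " + "*" * 13 + "\n"))
print(action("X-Spam-Level: " + "*" * 9 + "\n"))
print(action("Subject: hello\n"))
```

Mail scoring between the two thresholds is the "8-12 point range" the poster reviews by hand before deleting.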
  11. Spam Filters . . . and Eudora by Newt-dog · · Score: 4, Funny
    I use Eudora and I *tried* to set up a complex system of "filter words". I even set it up so that all of the spam would go into a "spam filter" folder. Lotta good that did me . . . Now all of my spam goes directly into my In box, and the good email goes into the spam folder.

    Come to think of it, it seems to work out just fine.

    Newt-dog

  12. Sales sales sales by Anonymous Coward · · Score: 3, Insightful

    This is likely funded by un-named virus vendors who have integrated SpamAssassin into their appliances. Away on a vacation, I came back to find our people unaware SpamAssassin was open source. The vendor quietly forgot to mention that.

    In the end, any company is going to have to put people and tools together to get a spam solution, or outsource it. But DIY needs people time.

    Don't pay vendors for SpamAssassin, it runs quite nicely on left over PCs reloaded with Linux.

  13. Re:What is a good client-side spam filter for Outl by Anonymous Coward · · Score: 4, Informative

    I know people have been recommending SpamBayes but be warned - it is very slow to parse and move the emails. Only bother with this if you receive only a small volume of spam or have a pretty fast computer.

  14. Re:What is a good client-side spam filter for Outl by uradu · · Score: 5, Funny

    ==> Start|Settings|Control Panel|Microsoft Office XP Professional with FrontPage|Remove

    Best one yet!

  15. He already sent an open letter to SAtalk by damian · · Score: 5, Informative

    He sent a long open letter to SAtalk. You can find it in the mailing list archive.

    1. Re:He already sent an open letter to SAtalk by Joseph+Vigneau · · Score: 4, Informative

      Wow. Considering he probably got a lot of nasty emails from the zealot crowd, this is a well reasoned response. He laid out his review criteria, and how SA can be improved to fare better against its commercial competitors. Well done, and a good challenge for the committers of SA.

    2. Re:He already sent an open letter to SAtalk by CaptainZapp · · Score: 3, Insightful
      The same is true of support - while you may get faster or better support through this group than you get with commercial software, there's no guarantee that you'll get any support at all - and most organizations will find that hard to live with.

      This is very true, of course. But has the guy considered that this is 1:1 the case with commercial software too?

      Even support providers for enterprise level software (e.g. database vendors, which may charge hundreds of thousands of $, depending on the installation and support level) will never guarantee that they provide you with a solution.

      Of course their sales reps have the flashier presentations though, which is a part of what you pay for.

      --
      ich bin der musikant

      mit taschenrechner in der hand

      kraftwerk

    3. Re:He already sent an open letter to SAtalk by vondo · · Score: 2, Informative

      Did you not read his post to the mailing list where he says he had words to that effect in the article he submitted but *the editor* took them out?

    4. Re:He already sent an open letter to SAtalk by anthony_dipierro · · Score: 2, Insightful
      Regarding some of the other comments that have been made, some of you have said that SA is not hard to install, taking no more than an hour or two to download, install, configure and begin using. That is consistent with the 10 times longer number I used, because the other installation and configuration times were all around 5-10 minutes.
      You have also said that I should have taken into account the fact that it doesn't cost anything before making statements about it being harder to install, configure and manage than the commercial products. SA does cost - but in an administrator's time rather than money, which I did say in the article.

      Hmm. Brightmail Anti-Spam - Enterprise Edition is $14,000 a year for up to 1000 users ($1500 for up to 50 users). Hiring a professional consultant to install Spamassassin (about an hour or two of work) would surely cost much less. And you wouldn't have to worry about the company going out of business or raising prices. So even if your administrator's time is worth more than $7,000 (or $750) an hour, there's an alternative solution, pay someone to install the damn thing.
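
The break-even figures in the comment above follow from dividing the yearly license price by the install time. A small sketch of that arithmetic, assuming the upper end of the "hour or two" estimate (2 hours); the function name is made up for this example.

```python
def break_even_hourly_rate(license_cost: float, install_hours: float) -> float:
    """Hourly rate at which paying for the install costs as much as the license."""
    return license_cost / install_hours


# Prices quoted in the comment: $14,000/year (up to 1000 users), $1,500 (up to 50).
print(break_even_hourly_rate(14_000, 2))  # 7000.0
print(break_even_hourly_rate(1_500, 2))   # 750.0
```

This compares only the first year's license against a one-off install and ignores ongoing maintenance time on either side.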

  16. no wonder... by theonlyholle · · Score: 5, Insightful

    Well, on the first page the author already makes it pretty obvious why SpamAssassin had to come out at the bottom of the list. He is comparing version 2.44, which was included in RH9 and is thus at least 8 months old, to the latest antispam software that is regularly updated. How on earth is that an unbiased comparison? In a world where spam patterns change every week, if not every day, 8 months is a generation... he even says so in his article. I'd be interested to see the results of a similar test, but with SpamAssassin 2.60 and of course with Bayesian filtering and some of the other optional features enabled...

  17. Because by FreeLinux · · Score: 4, Interesting

    Why did he not ask Kevin Railsback who had the whole thing working some while ago?

    He expected to get the results that he normally gets with most commercial software. Click Setup.exe, answer a question or two and it's done, up and running. Further configuration is not required though it may be desired.

    The commercial vendors of SpamAssassin have not improved the core product in any way. What they have improved is the packaging, the installation, the default configuration and the interface to modify that configuration. The stock SpamAssassin does not offer that, although SpamAssassin setup is far simpler than some other packages out there.

  18. Taken from the two articles by lpontiac · · Score: 4, Interesting
    Kevin Railsback is Test Center operations manager at InfoWorld.

    versus

    IT consultant Logan Harbaugh is the author of two books on networking.

    The first found Spamassassin easy, the second found it hard. Hmmm.

    What really aggravates me is the typical "There are blacklists available that you can subscribe to, and some are updated regularly, but these are noncommercial lists with no guarantees." I'd like to see what guarantees the commercial lists come with.

  19. Re:I get what I pay for too from reading the artic by gl4ss · · Score: 2, Informative

    You need to change them because the easy install solutions suck (and have default installs that somebody can try to get around and test until it goes through).

    --
    world was created 5 seconds before this post as it is.
  20. Critical Eye on Tech Journalists by abulafia · · Score: 4, Informative
    In true form for throwaway articles like this, products are compared poorly:

    Each product was tested with a different stream of mail, so the number of messages received varied, but all received enough messages to assess their capabilities.

    Can you imagine someone writing "Oracle, Sybase and Postgres were compared. While the data and workloads were different, all products performed enough work to assess their capabilities."

    All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves.

    I don't know anything about Brightmail. Spamassassin end-user whitelist entries can be set up in a number of ways.

    And all the products but SpamAssassin use dynamic updates to keep up with the evolving technologies spammers use to circumvent less sophisticated filters.

    As alluded to in the summary, this is false with modern versions of Spamassassin, which uses Bayesian filtering. (The author later says he couldn't get it working.)

    However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products. [...] But just because the software is installed does not mean it will work -- filtering criteria must be added manually, and until that's done nothing is filtered out. Getting the various configuration files edited properly so that the whole package worked was not simple. Documentation was difficult to find, and not always easy to follow.

    While it is true that one must be comfortable with a text editor to configure Spamassassin, thus perhaps putting it out of reach of point-and-click admins and technical journalists, I also wouldn't be prone to put my mail servers in the hands of either of those groups of people.

    It looks for keywords in the subject or body of e-mails, but is frustrated by words not in the dictionary, such as "V!agra," or words that contain invisible HTML characters.

    While I am not sure what tests appeared in which version, I'm pretty sure 2.44 handled off-by-one words such as V!agra. I have no idea what he's talking about when he says "invisible HTML characters", but it does seem to point to a certain technical incompetence, similar to the ostrich belief - "If I can't see you, then you can't see me."

    This is not to say Spamassassin is the easiest thing in the world to deal with. I happen to love it, because of the extreme flexibility.

    I just get sick of tech journos who decide that because a tool doesn't have a gui and they don't want to take the time to configure it, it sucks.

    --
    I forget what 8 was for.
    1. Re:Critical Eye on Tech Journalists by dboyles · · Score: 4, Insightful

      Can you imagine someone writing "Oracle, Sybase and Postgres were compared. While the data and workloads were different, all products performed enough work to assess their capabilities."

      A very large sample of mail would negate almost all of the differences caused by using a different set of mail, but I get the feeling that each of these servers ran for about a day and the results were gleaned from that.

      I don't know anything about Brightmail. Spamassassin end-user whitelist entries can be set up in a number of ways.

      ...and it ain't that hard.

      As alluded to in the summary, this is false with modern versions of Spamassassin, which uses Bayesian filtering. (The author later says he couldn't get it working.)

      Maybe I'm missing something or taking things that I consider basic for granted, but Bayesian filtering with SA is about as straightforward as it gets, except that instead of clicking a few buttons, you run one short command.

      While it is true that one must be comfortable with a text editor to configure Spamassassin, thus perhaps putting it out of reach of point-and-click admins and technical journalists, I also wouldn't be prone to put my mail servers in the hands of either of those groups of people.

      I think we've all known these types, and unfortunately they're more widespread than we'd like to think. Many simple solutions such as SA are ruled out because the admin doesn't have the skill to implement them. Note to any managers reading this: hire people with a solid background in the field, not those who list single-platform applications on their resume as "skills." Software changes, but a good administrator has the ability to adapt.

      --
      -- "Complacency is a far more dangerous attitude than outrage." -Naomi Littlebear
    2. Re:Critical Eye on Tech Journalists by ceejayoz · · Score: 2, Informative

      I have no idea what he's talking about when he says "invisible HTML characters", but it does seem to point to a certain technical incompetence, similar to the ostrich belief - "If I can't see you, then you can't see me."

      If you look at the source of most HTML spam, you'll see things like:

      v<!-- the -->i<!-- brown -->a<!-- cow -->g<!-- is -->r<!-- dead -->a

      The <!-- --> parts are HTML comments and thus won't be displayed to the user, but they can mess up some spam filters that don't account for them.

      SpamAssassin should deal with them just fine, though - it did when I was using it over a year ago.
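
A filter can defeat the comment trick described above by removing HTML comments before looking for keywords. A minimal Python sketch of that preprocessing step follows; it is a generic illustration using a regular expression, not SpamAssassin's own HTML handling, and the function name is made up for this example.

```python
import re

# Non-greedy match so each "<!-- ... -->" is removed on its own;
# DOTALL lets a comment span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(html: str) -> str:
    """Return the markup with every HTML comment removed."""
    return HTML_COMMENT.sub("", html)


sample = "v<!-- the -->i<!-- brown -->a<!-- cow -->g<!-- is -->r<!-- dead -->a"
print(strip_html_comments(sample))  # viagra
```

Once the comments are gone, an ordinary keyword or Bayesian test sees the same word the reader sees.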

    3. Re:Critical Eye on Tech Journalists by abulafia · · Score: 2, Insightful
      Umm, why would a "simple" solution require a bunch of skill to implement? Perhaps you meant to say "complex" solutions, which do typically require skill. Simple ones should not require specialized skill- or else they're not simple.

      I think the poster was creating an implicit comparison between various types of admins. Installation, configuration and maintenance of Spamassassin is simple for a skilled admin, while it may not be for an inexperienced one. It is a simple solution because well, it is, if you know what you're doing. If you don't, perhaps you shouldn't be trying to solve the problem.

      There are easy comparisons to other fields. For instance, changing the brakes in a modern car is simple. It happens thousands of times every day, and there are entire franchise operations set up to do it. And yet, if I were to sit down with a random 2003 model car, it would be hard for me, perhaps beyond me (I dunno, I used to change my brakes on my 1984 Civic with no problem, but I suspect the braking systems are as overengineered as the rest of the car these days.).

      See the distinction?

      --
      I forget what 8 was for.
  21. sixty-two percent? by dboyles · · Score: 4, Interesting

    [SpamAssassin] filtered only 62 percent of spam, whereas the other products produced great results, blocking 90 percent to 96 percent of all the spam they encountered with few, if any, legitimate messages blocked.

    To me, this statement is pretty telling. Harbaugh must get some completely different kinds of spam than me, because, even though I receive about 60 spam mails a day (directed to my "spam" folder, so I never see them until I scan the "From:" field and then delete them), maybe one per week makes it through the filter. And seeing as how I can't even remember the last time I got a false positive, that's a pretty damn good number.

    I can believe that if you receive a variety of mail and if you took no time to configure SpamAssassin other than cranking it up, maybe then it'll only catch 80% of the spam. But 62%? I'm not sure if Harbaugh is skewing the benchmarks or if he just doesn't know what he's doing.

    There are some legitimate issues with SpamAssassin that might not make it ready for the enterprise, but for a handful of users, I have been more than satisfied. And the price is right.

    --
    -- "Complacency is a far more dangerous attitude than outrage." -Naomi Littlebear
    1. Re:sixty-two percent? by wizkid · · Score: 2, Insightful


      Look at where the article is from!!

      InfoWorld.com. Do you think they're going to put their advertisers' products down? I could tell after the first three paragraphs that the article was a sales brochure.

      --
      I take no responsibility for what I say. Even though I'm never wrong :)
    2. Re:sixty-two percent? by Tom · · Score: 2, Informative

      The version he's using might make the difference.

      I was using 2.20 until recently. After updating to 2.60, the level of spam still coming through the filter dropped right off. It's about 1 msg. per day now, used to be at least 5 times that.

      --
      Assorted stuff I do sometimes: Lemuria.org
    3. Re:sixty-two percent? by alatesystems · · Score: 2, Interesting

      Sure they do, it's called NewsForge.

      Yes, I am kidding. Also, here is the email I sent our buddy:
      -------------
      In regards to your "bottom line" at the end of the article entitled:
      Commercial solutions win, spam loses, Nov 14th, you stated it was much
      harder to install, configure, and keep running.

      Although it isn't point and click like with windows, you said it was
      installed WITH red hat 9. All you had to do was add:
      :0fw
      | spamc
      to your /etc/procmailrc files to make it an enterprise wide spam filter.

      You also said it has scanty documentation, but it has full documentation for
      every configurable option available on their site, spamassassin.org.

      You said 62% spam identification??? What did you have your threshold set to?
      9? or some other high integer? I have mine set to 5 and I have a spam
      catching percent greater than 99%.

      "But just because the software is installed does not mean it will work --
      filtering criteria must be added manually, and until that's done nothing is
      filtered out." -- What is that??? You can edit the scores but all the scores
      have default values that are very good and require NO editing. I can
      understand if you are a linux/*nix newbie, but you should have a disclaimer
      in your article instead of bashing an open source project that works quite
      well with no configuration other than procmail.

      As far as the whitelisting you said that could not be done by normal users:
      first: there are many web(php and perl) applications that let you do this
      over the web and also will let you view quarantined mail over the web.
      Second: from the spamassassin man page: -W, --add-to-whitelist
      Add addresses in mail to whitelist (AWL)

      From your article again: "There are blacklists available that you can
      subscribe to, and some are updated regularly, but these are noncommercial
      lists with no guarantees." Those "non-commercial" lists are used by ALL the
      commercial products. In fact, one of the major commercial antispam product
      companies just bought spamcop to ensure its success in the future. Those
      blacklists are not ones you have to "subscribe to" as you purport, but are
      already used. Vipul's razor, which IS a signature product used by
      CloudMark's commercial software, is automatically used if found. You can
      install that from an rpm. In your chart, you said that SA cannot use
      signature based scans.

      To keep SA up to date, guess what you type. up2date spamassassin. OH MY
      GOODNESS!!! That was very difficult. Or if you want, since you're using red
      hat 9, you can type yum upgrade spamassassin.

      "Filtering rules are relatively basic, and although there is a Bayesian
      filter available, it is not part of the distribution -- and I wasn't able to
      get it working for this review." Filtering rules are not basic in any form
      and the Bayesian filter is included. Another lie (a disturbing trend for a
      "journalist"). Simply add use_bayes 1 to the local.cf configuration file.

      Where to begin? "It looks for keywords in the subject or body of e-mails,
      but is frustrated by words not in the dictionary, such as "V!agra," or words
      that contain invisible HTML characters." I get TONS of spam both in the
      enterprise and at home and spamassassin gets more than 99% of it with 0
      false positives. Believe me, it gets the "vee ag ra" and the "v!agra" and
      variants. 100% of the time, in fact.

      Your chart said no end user access to quarantined mail, but you can easily
      put it into any folder you want because spamassassin writes a header:
      X-Spam-Status: Yes. That means you can also put into /etc/procmailrc the
      following:

      :0:
      * ^X-Spam-Status: Yes
      $HOME/mail/Spam

      And like Emeril, BAM! Enterprise wide filtering and quarantining of mail
      into a Spam folder.

      I really wish before you create another article f

  22. You think 2.44 is ancient? by ryanvm · · Score: 4, Informative

    You think 2.44 is ancient? Feh - Debian 'stable' is still stuck with 2.20.

    1. Re:You think 2.44 is ancient? by Tom · · Score: 4, Informative

      Try http://www.backports.org for woody packages of SpamAssassin 2.60 (and other software)

      Aside from that, installing 2.60 into your home directory is absolutely painless. Just did that, before I learned about the backports.org website.

      --
      Assorted stuff I do sometimes: Lemuria.org
  23. Article length advertisement by ericspinder · · Score: 2, Insightful
    In my testing, the performance of the newer products was more than acceptable in every case. Per-user, per-year pricing should not be an obstacle, even for the most expensive product.

    Sounds to me like Infoworld has an advertising contract with (at least) one of these companies. At the very least he should have checked the site for an update before he started his "tests". For a while there, I got every one of those "IT industry" hype mags (always free). While there was some good information here and there, you had to wade through a lot of advertising pretending to be articles.

    I love SpamAssassin and would not consider email hosting without it. It has made my email account usable again! For the record, it seems to catch about 80-90% of my spam, and I have never seen a 'false positive' (I do check my 'spam' folder, but less and less).

    --
    The grass is only greener, if you don't take care of your own lawn.
  24. it's a matter of proper configuration! by dummkopf · · Score: 2, Informative

    i have been using spamassassin for a year and it works great! granted, in the beginning about 18% of the spam (in my case 18% of about 30 emails per day) would get through. BUT if you read the manpage and tweak the different scores a bit, you can get that down to 1 - 2% with about the same amount of false positives. as an admin, you should be able to tweak any spam filter to match your needs best.

    what i can highly recommend is to increase the score of MICROSOFT_EXECUTABLE as it generally is a piece of spam. in addition the bayesian statistics are a great idea: a spam filter that learns!

    as for the reviewer: if it takes this person 10 times longer to read a manpage and punch in some trivial scores into a trivially set up configuration file, then you should take his review with a HUGE grain of salt... especially since he reviewed an ancient version of the software.

    finally a general comment about spamassassin: EXCELLENT software, especially for the bargain price of $0.

  25. -1, Troll by Tom · · Score: 4, Funny

    Can we moderate the article at -1 Troll, please?

    It's just a bit too obvious that he was hoping for a severe slashdotting, driving his own numbers ("look, editor, how many people read my articles!") and the ad numbers of his paper up.

    Probably submitted the story himself, too. :)

    --
    Assorted stuff I do sometimes: Lemuria.org
  26. The review isn't as bad as slashdotters make it by greppling · · Score: 5, Insightful
    I am sure he was as disappointed as me that the installation didn't follow the ./configure && make && make install standard procedure, and that it defaulted to /usr instead of /usr/local as installation directory.

    Seriously:

    • The Spamassassin installation documentation could be better written IMHO.
    • Why doesn't RedHat's update service offer constant updates to the current version of SpamAssassin?
    • Why doesn't it (as mentioned in another post) have the most important configuration setups included in their overall configuration GUI?
    I really wish distributions would support SA better.
    1. Re:The review isn't as bad as slashdotters make it by dan14807 · · Score: 2, Informative

      I am sure he was as disappointed as me that the installation didn't follow the ./configure && make && make install standard procedure, and that it defaulted to /usr instead of /usr/local as installation directory.

      • su -
      • perl -MCPAN -e shell
      • cpan> install Mail::SpamAssassin
      Nice easy way to install and keep up to date with the latest version of SA. This might be why the ./configure method was neglected, although I agree it's disappointing.

      And if I remember correctly, the CPAN method does install the programs to /usr/local/.

  27. Re:NonDocumentedSoftwareAssassin by Anonymous Coward · · Score: 3, Funny

    What? open source software having crappy and hard to find documentation?

    Memo to self: if I ever spend 3 months creating free software to share, take 2 hours to write a web page showing somebody how it freaking works!

  28. Rule #1: user intelligence >= tool by Pointy_Hair · · Score: 2, Insightful

    First thing, the user has to be at least as smart as the tool they are wielding. No, actually just smart enough to follow directions and go beyond clicking on "help" to get help. Just another case of wannabe administrator arrogance: "If the tool doesn't configure itself or have cool looking icons, it must suck."

  29. It's all about the UI by The+Subliminal+Kid · · Score: 4, Insightful

    The bias apparent in this article and the crappy comparison chart aside, this review doesn't even begin to touch base as a thoroughly researched opinion piece and ends up looking like an advert for Brightmail.

    However we do in the OS community face a UI problem. The missing rung on the ladder to mass acceptance is the absence of high quality UIs that give users, and indeed administrators of the point and drool variety, an interface with the service they are seeking to use.

    Before the highly polished phpMyAdmin I met serious resistance from admins for MySQL over MSSQL based mostly on interface. The same goes for CUPS, which has a web interface that I think has come of age if not achieved adulthood. The Webmins are OK as long as you don't tinker too much or do anything slightly non-standard. I dislike SWAT and am now so used to editing smb.conf I haven't even checked it's working. I think that a lot of these services, Apache, SpamAssassin and X11 for example, could bear providing embedded configuration UIs if they aim to capture wider markets. Mandrake's X11 confugulator is very good.

    I was going to mention the difficulty presented for admins with widely deployed Outlook when looking at these kinds of solutions, but then I thought no, only have sympathy where it is due. And I know that SpamAssassin could work seamlessly with Outlook, but if users want a front end for white-listing then SpamAssassin isn't going to be your toy just yet.

    Though we love the text based config file, you may have to put a lot of work into configuration UIs if you want to enter the area as far as that reviewer and many sysadmins are concerned.

  30. Not Really by tookish · · Score: 4, Insightful
    So his complaints are:
    1. SpamAssassin is hard to install
    2. it isn't very effective
    3. nothing is filtered until you manually set up your own filters
    4. it's hard to configure and poorly documented
    5. non-commercial blacklists come with no guarantees
    6. end users can't add to the whitelist
    7. Bayesian filtering isn't included by default, and he couldn't make it work anyway
    8. it doesn't catch words like Viagra and invisible HTML characters

    I knew nothing about filtering spam until I installed SpamAssassin 2.6 in a multi-user environment last week. Here are my responses:

    1. it took less than half an hour to install (from CPAN) and start
    2. effectiveness out of the box was about 95%, with no false positives -- after a few minor tweaks, I'm at about 98% with no false positives
    3. simply not true -- it runs right out of the box
    4. maybe it's hard to configure if you're used to a GUI -- if you're not afraid of editing a text file, it's very easy to set up; and there's no shortage of documentation at spamassassin.org and elsewhere
    5. do commercial blacklists come with guarantees? I don't know
    6. with a very little bit of scripting, you could allow users to add to the whitelist
    7. I haven't tried the Bayesian filtering because it's apparently not well suited to a multi-user environment
    8. simply not true -- it flags this stuff out of the box

    I wouldn't recommend that my grandmother install SpamAssassin, but if you have any admin skills whatsoever, it's quite easy to use it to set up effective and useful filters. Furthermore, there are enough factual errors in the article that I'm tempted to dismiss it outright.

    Of course, it's possible that it got a lot better between 2.44 and 2.6, but that begs the question, why did he install 2.44?

    --
    "The obvious mathematical breakthrough would be . . . an easy way to factor large prime numbers"
    Bill Gates, 1995
  31. install took 10 times as long...? by lone_marauder · · Score: 4, Insightful

    I can install Spamassassin and six other applications via CPAN in the time it takes to get the syntax right for one license key.

    I also like the characterization of Spamassassin as "first generation" without any supporting evidence to the fact. First generation was adding spam senders to your e-mail client's blocklist. Bayesian filtering is well beyond first generation, but spammers have learned to defeat Bayesian filtering with poison data in non-eyeball space and text obfuscation. The next generation in spam detection is to detect the Bayesian evasion features - and guess what does that!? Spamassassin (2.60).

    --
    who are those slashdot people? they swept over like Mongol-Tartars.
  32. SA+MailScanner works for me by cyways · · Score: 5, Informative

    I've found the easiest way to implement SpamAssassin is to invoke it through MailScanner. MailScanner uses third-party virus scanners and can optionally invoke SpamAssassin as well. With the free ClamAV antivirus product, you can build a powerful open source mail scanner. Even without a virus scanner, MailScanner detects and quarantines executable attachments and other dangerous content which represent the most common types of mail-borne viruses and worms.

    RedHat installs the daemonized version of SA as well as the SA Perl scripts. Using the daemon, the easiest implementation is to invoke SA in /etc/procmailrc on the mail delivery host; for mail gateways running sendmail, you need to use the milter interface. I've found the MailScanner+SpamAssassin approach much easier to configure than either of these methods, and you get virus scanning to boot!

    I suspect if the reviewer had compared SA 2.60+ to the commercial products, rather than the older 2.44 version used in the review, SA would have shown better results.

    I'd agree with the reviewer that one of the things SA lacks is an easy method for users to interact directly with the program. (Part of the issue has to do with security; SA runs as root. As I read the review, I wondered how the other products allow users to interact directly with the scanners without sacrificing security.) It's not easy to maintain per-user Bayesian filtering, for instance, but I generally recommend having the mail client, e.g., Mozilla, handle these tasks.

  33. Thanks for the reminder!! by Perl-Pusher · · Score: 3, Interesting

    I was using version 2.44, I was able to compile and upgrade spamassassin before the number of posted replies hit 60! Can't be too hard!

  34. Old, and on the list by satyap · · Score: 3, Informative

    Not only is this somewhat old news, it's been discussed on the spamassassin mailing list. Apparently, the article was edited so that it's more anti-spamassassin than the reviewer intended, but Mr. Harbaugh also defends his review of an older version of spamassassin as "it came with my Redhat 9" (NOT a direct quote). He also claims it took nearly an hour to install and set up. (I counter that it took seconds to install and minutes to set up.)

    The current version of spamassassin is 2.60.

  35. What am I doing wrong? by TamMan2000 · · Score: 3, Interesting

    All my mail comes through spamassassin as well, but I am not having nearly the success you are...

    I get about 60-70% of my spam correctly tagged, and about .2-.5% false positive. Don't get me wrong, I am WAY happier now than before spamassassin, but if I could be getting better performance, that would be great...

    --
    "I'll have a Guinness, no wait, make that a Coors Light" -Grad student I work with, who shall remain anonymous...
    1. Re:What am I doing wrong? by Dr.+Evil · · Score: 3, Funny

      The problem is that you're making the same mistake I am.

      (No, I can't expand upon that)

    2. Re:What am I doing wrong? by Pasc · · Score: 3, Informative

      Are you running the newest version? 2.60 is much improved over previous versions.

      If you are running 2.60, have you trained and enabled the bayesian filters? By default you need to feed SpamAssassin about 300 spam and 300 ham (non-spam) messages for it to learn the difference. It will auto-train itself over time but it only auto-learns on messages that are very obviously (to it) spam or ham.

      If you normally only get email from a select list of people then you may want to lower your threshold. For people you routinely receive email from, SpamAssassin will remember that they usually don't send you spam so if you occasionally get something with a high score from them it will automatically lower it a bit. So, you can lower your threshold and still not get any false positives.

      I have my required_hits set to 3 and the only false positives I've seen (since switching to 2.60) have been mailing lists (one was from LinuxWorld, the other from another news site) and not person-to-person email. I receive 50-60 spam messages a day and only one or two a week get into my inbox.

      spam catching - >99%
      false positives (normal email) - 0%
      false positives (mailing lists) - .5%

      I do some stuff to keep SpamAssassin's bayesian filters well trained. Every couple weeks I will go in to my spam folder and quickly page through it. If I see a spam that the bayesian filters gave a low score (less than 90% sure it is spam) I will pipe it (I use pine) to sa-learn to train the bayesian filters (unless it was autotrained).

  36. Try the Custom Rule Emporium! by sillypixie · · Score: 4, Informative
    I have SA 2.6 running as a plugin to the SunONE Messaging Server (v5.2), in BAREBONES mode (ie no RBL, no Bayesian, nothing but perl regex) and it filtered 591 spam from my boss's mailbox alone on the first weekend. 12 or 13 managed to sneak through.

    Since then, I've downloaded a bunch of rules from The SA Custom Rule Emporium and almost nothing gets through.

    If this guy had trouble, it is the fault of the documentation, not the product. Either that, or he was dumb enough not to upgrade to perl 5.8 or above, and spent forever installing modules.

    He says:
    SpamAssassin is the perfect example of first-generation techniques becoming outmoded by advances in spamming technology

    Funny how when you install an old version of the product, it seems outmoded, hmmm?

    Sheesh.

    Pixie
    --
    don't mess with those geekgrrls
  37. He was trying to make a point by Zebra_X · · Score: 3, Interesting

    While his review was perhaps not scientifically conducted, I think there was a point to be made with the SpamAssassin blurb.

    Notice that he deliberately took a standard install from RedHat 9, something some IT person (Not a tr00 g33k) might buy at CompUSA. He then tried to install the provided product. Clearly, a tr00 g33k would go and download the latest release, but keep in mind that not everyone is so comfortable with being on the bleeding edge - I believe that this was a point he tried to make. There is also the perception that the release provided with a "product" such as RedHat 9 will be up to the same standards as the OS.

    While it's true the latest version has default rules and whatnot - it's quite likely that his older, more out of date version does not. In fact, going briefly to the spamassassin home page, the links for the 2.5 and 2.4 release documentation are broken.

    The point to be made was: OSS needs to be more buttoned up. Notice that he said that he had no trouble installing redhat 9. That's because the installer is rather good.

  38. Commercial Guarantees, eh? by TheSpoom · · Score: 4, Insightful

    Here's a nice example of a commercial guarantee. See if you can determine where it's from:

    11. LIMITED WARRANTY FOR PRODUCT ACQUIRED IN THE US AND CANADA.

    Microsoft warrants that the Product will perform substantially in accordance with the accompanying materials for a period of ninety days from the date of receipt.

    ...

    YOUR EXCLUSIVE REMEDY. Microsoft's and its suppliers' entire liability and your exclusive remedy shall be, at Microsoft's option from time to time exercised subject to applicable law, (a) return of the price paid (if any) for the Product, or (b) repair or replacement of the Product, that does not meet this Limited Warranty and that is returned to Microsoft with a copy of your receipt.


    Note that a) no updates or fixes are guaranteed, b) your only remedy is media replacement or a refund, and c) this choice of remedy is up to Microsoft.

    I love it when people claim that you're taking a huge risk with open source software without guarantees. Microsoft says their software will work, but isn't saying that if their software doesn't work, they have to fix it.

    --
    It's better to vote for what you want and not get it than to vote for what you don't want and get it.
    - E. Debs
  39. Re:Is there a gui tool for configuring SpamAssassi by ministerofsickeningr · · Score: 3, Insightful
    apparently you just do this:

    "I installed the software on Red Hat Linux 9, with help from one of Proofpoint's systems engineers. She talked me through getting the Linux system configured properly, getting sendmail set up, and installing and configuring the Protection Server, which includes the MySQL database server for storing quarantined e-mail."

    who needs a gui?

    no wonder he gave spamassassin a low score. he couldn't have someone handhold him

  40. POPFile by Anonymous Coward · · Score: 4, Informative

    I don't know anything about SpamBayes so I cannot comment on it at all.

    POPFile is easy to use. It also performs Bayesian filtering. It is what I use.

    http://popfile.sourceforge.net/

    My current POPFile statistics:
    Messages classified: 1,440
    Classification errors: 19
    Accuracy: 98.68%

    1. Re:POPFile by drooling-dog · · Score: 2, Insightful

      > Messages classified: 1,440
      > Classification errors: 19
      > Accuracy: 98.68%

      That's nice, but it's really important to break it down between false positives and negatives. I get over 200 spams a day (before filtering), and while it's quite tolerable for 2 or 3 of those to get through, missing that many legitimate messages a day is not.
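The point about splitting errors by direction can be made concrete with a small sketch. The 1,440 messages and 19 errors come from the statistics quoted above; the split into 1,000 spam and 440 legitimate messages, and the two ways of distributing the 19 errors, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FilterCounts:
    spam_caught: int     # spam correctly flagged
    spam_missed: int     # false negatives: spam delivered to the inbox
    ham_delivered: int   # legitimate mail correctly delivered
    ham_flagged: int     # false positives: legitimate mail marked as spam

    def accuracy(self) -> float:
        total = (self.spam_caught + self.spam_missed
                 + self.ham_delivered + self.ham_flagged)
        return (self.spam_caught + self.ham_delivered) / total

    def false_negative_rate(self) -> float:
        return self.spam_missed / (self.spam_caught + self.spam_missed)

    def false_positive_rate(self) -> float:
        return self.ham_flagged / (self.ham_delivered + self.ham_flagged)


# Hypothetical split: 1,000 spam and 440 legitimate messages, 19 errors total.
all_missed_spam = FilterCounts(981, 19, 440, 0)   # every error is a spam that got through
all_lost_mail = FilterCounts(1000, 0, 421, 19)    # every error is a lost legitimate message
```

Both hypothetical filters report the same overall accuracy, but the second one loses 19 of 440 legitimate messages, which is the case the comment above calls intolerable.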

  41. Re:Is there a gui tool for configuring SpamAssassi by Salo2112 · · Score: 4, Informative

    saconf works for the Windows versions of spam assassin.

    http://www.openhandhome.com/saconf.html

  42. Re:What is a good client-side spam filter for Outl by jpmrst · · Score: 3, Informative

    Spamagogo doesn't have quite the same setup, but it is good, and free for now.

    --

    Time for a snack.

  43. modifying subjects and other content by dan_bethe · · Score: 2, Interesting
    TrollAssasin would be nice, imagine seeing posts subjects as *****TROLL***** heh

    I know you're just joking, but to be serious for a minute, the reason not to do that is because you'd be transparently altering someone else's copyrighted property. Overzealous and/or overworked sysadmins misconfigure SA to globally analyze all incoming content and then to alter email subjects based on its opinion. This is an invasion of content, certainly prone to false positives because antispam scanning is an individually trained process, and breaks the trail of reply threads at least on a visual basis. There are always going to be tons of misconfigured or RFC ignorant smtp servers out there, and being compatible with them is what makes the Internet work. That would include corporate servers, legitimate opt-in bulk mail, and opt-in mailing lists run by Some Dude. There will be people on a mailing list whose personal content is always publicly marked by certain recipients as spam! It's confusing, insulting, and unnecessary. SMTP has invisible meta-tags in its headers to allow for that, and agents are supposed to respect them.

    This is fine for using SA's global config as your personal config for your own little systems, but not for an ISP or business.

    According to spamassassin.org:

    We strongly urge ISPs installing the product to notify their users when it's installed, and to not enable it by default -- but many seem to ignore this advice. We agree, that's totally unprofessional. :(
  44. Re:What is a good client-side spam filter for Outl by junklight · · Score: 3, Informative

    indeed - I've been using this for a while now. No false positives, I see bits and pieces in my unsure folder - including the "Hi, heres that link you asked for http://spam.spam.spamcorp, cheers .." that Paul Graham reckons is the future of spam.

    Given I get over 100 spams a day and I see none of them I am very happy with this indeed.

  45. Is it a sin to be critical of a free product? by Chemisor · · Score: 3, Insightful

    > I don't understand why he's so critical of a free product.

    Why is there this attitude that if your project is free, then it does not matter if it is garbage? Furthermore, you are not allowed to say it is garbage, because, after all, you don't look a gift horse in the mouth. Perhaps that is why Linux is still not on the desktop. There are plenty of people who spend days configuring theirs and then post "it works for me" comments, while the rest of us silently wonder why anyone would want to spend so much time on such garbage.

  46. Arsehole by FinestLittleSpace · · Score: 2, Interesting

    Does he by any chance love outlook rules as well?

    SpamAssassin is on my server and is absolutely brilliant.. it catches 99.9% of all my spam, and has only on 5-10 occasions in the past month (i get about 50-60 emails a day) counted 'innocent' mail as spam... and even those were newsletters....

    Anyone who slates SpamAssassin is one very deluded person... it's Open Source, constantly improved... open to editing by its users, rules can be added.... marvellous.

    Commercial variants I've seen have been painfully badly implemented and not worked properly. Get SpamAssassin and fight the closed source lovers :)

  47. SpamAssassin+PostFix vs Exchange+Comm'l Product by texspeed · · Score: 2, Informative

    We replaced an SMTP relay/spam filter/virus scanner based on Exchange and a commercial product (not one of the reviewed products) about a month ago with one using PostFix and SpamAssassin (and amavisd) on RH. Incoming spam levels have been reduced by about a factor of ten with no false positives to date. This solution was not much of a challenge to implement - for a primarily Windows-oriented admin for whom it was a learning exercise. I haven't tried the products reviewed, but am more than impressed with what we now have.

  48. tech vs. consultant, humorous by motorsabbath · · Score: 2, Interesting

    Humorous how the guy who liked SpamAssassin (Kevin Railsback) was a tech who actually set it up for use at infoworld and the guy who didn't like it is an "IT consultant the author of two books on networking." Always trust a tech.

    --
    The heat from below can burn your eyes out
  49. Re:What is a good client-side spam filter for Outl by keath_milligan · · Score: 2, Informative

    I'll third that - SpamBayes ROCKS. I use it at work where our IT department just wasted huge amounts of money on a back-end solution that stops less than half my spam while at the same time giving me trouble with blocking legitimate messages. SpamBayes cleans up what the back-end commercial solution misses every time.

  50. spamassassin-2.44-11.8.x.i386.rpm by poszi · · Score: 4, Insightful
    2.54, not 2.44

    To moderators. When you mod something "informative", please check the facts first. SpamAssassin in RH 9 is 2.44.

    --

    Save the bandwidth. Don't use sigs!

    1. Re:spamassassin-2.44-11.8.x.i386.rpm by caluml · · Score: 2, Funny

      Yep, if I was to mod this, I'd get a spare machine, and spend an hour or so installing Redhat on it to check the version of SA. What do you want, +1 Absolutely-And-Positively-Accurate?

    2. Re:spamassassin-2.44-11.8.x.i386.rpm by poszi · · Score: 2, Insightful
      Yep, if I was to mod this, I'd get a spare machine, and spend an hour or so installing Redhat on it to check the version of SA.

      Ever heard of RPMs? You can check the nearest RH mirror and find the version: here or here. No need to install.

      Anyway, if you are not sure what's the version, don't mod it. False information is hardly "informative".

      --

      Save the bandwidth. Don't use sigs!

  51. Re:I get what I pay for too from reading the artic by gid · · Score: 2, Interesting

    Exactly, I had SA integrated into exim with custom rules and what not, but it would break on upgrading the debian package, happened twice, needed to tweak exim.

    Then I found out about the beauty of procmail once I looked into filtering all spam to its own folder without email client filters. So now, I have different emails filtered to specific folders before they ever hit my inbox. Oh and I had to disable the bayesian filter, it was catching way too many non-spam emails. Stuff that didn't have any keywords in it at all. One was just a couple quick sentences from a friend, who knows why it thought it was spam. :( I really should re-enable the bayes stuff, and figure out how to teach it what isn't spam.

    Here's a watered down version of my procmail file for those interested: http://gid0ze.net/dl/dot.procmailrc

  52. Personalized Bayesian training by gvc · · Score: 2, Informative

    The Bayes filter in SA 2.6 works very well but unfortunately is not well-suited to site-wide learning.

    -- casual readers may skip the following details

    In an attempt to mitigate this, SA makes an unfortunate mistake in its unsupervised learning algorithm - it uses a different set of rules for training than it uses for marking mail as spam or not. So you can easily have email marked as spam but have the system trained as non-spam (or vice versa). This introduces systematic bias into the learning so that spam detection can get worse in the long run. As a further attempt to mitigate this problem, the learner uses a higher spam threshold, so many spams that are correctly marked do not contribute to the learning process. There is no way to set the SA configuration parameters to eliminate these biases (setting the learn threshold does *not* do it).

    --- end of gory details

    It is not too difficult to set up SA for personalized learning. Just pipe your mail to the following command:

    spamassassin -e

    If the return code is 0 (non-spam) also pipe the mail to

    sa-learn --ham --single

    If the return code is 1 (spam) pipe to

    sa-learn --spam --single

    If you do this you are guaranteed that the statistics recorded in your personal bayes db correspond exactly to the judgements made by SA.

    In addition to this you must correct SA when it makes a mistake, by piping the message to sa-learn again with the right flag. You may be able to set up a macro in your mail reader to do this.
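The classify-then-learn loop described above could be scripted roughly as follows. This is only a sketch: the command names and flags are the ones given in the comment, the function names are hypothetical, and the exit-code handling is an assumption (the comment gives 0 for non-spam and 1 for spam; the sketch treats any non-zero code as spam and does not separate out errors).

```python
import subprocess
from typing import Callable, List


def learn_command(exit_code: int) -> List[str]:
    """Map the exit code of `spamassassin -e` to the matching sa-learn call."""
    # Assumption: 0 means non-spam, anything else means spam.
    kind = "--ham" if exit_code == 0 else "--spam"
    return ["sa-learn", kind, "--single"]


def classify_and_learn(message: bytes,
                       run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Classify one message, then train the Bayes db on that same verdict.

    Returns True when the message was judged to be spam.
    """
    # Feed the raw message to spamassassin on stdin and keep only the exit code.
    verdict = run(["spamassassin", "-e"], input=message, stdout=subprocess.PIPE)
    # Train on exactly the judgement that was just made.
    run(learn_command(verdict.returncode), input=message, stdout=subprocess.PIPE)
    return verdict.returncode != 0
```

The `run` parameter exists only so the logic can be exercised without SpamAssassin installed. Correcting a mistake afterwards would still mean piping the message to sa-learn again with the opposite flag, as described above.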

    This isn't as easy to set up as it should be, but it is *very* effective.

    In the last year I've received 20,000 non-spam and over 100,000 spam messages & viruses (30,000 if you eliminated the "Cumulative Update" messages, which SA caught just fine.) About 100 spams have gotten through (a couple a week) and about 10 false positives have occurred. All of the false positives have been 'weird' - advertising, automatic responses, or web pages that were forwarded to me. As far as I know (and I do check periodically) I've had no false positives in the last 50,000 spams.

    My preliminary analysis indicates that personalized learning reduces both false negatives and false positives by a factor of ten. I'll report more systematic analysis in due course.

  53. Re:The algorithm by perlionex · · Score: 3, Interesting

    Bayesian filtering is a bit like fuzzy-logic. Right now, it's best known for filtering spam. SpamAssassin uses a whole long list of tests and assigns +ve or -ve scores to each test that comes out positive (a bit like Slashdot's moderation).

    I know someone who did a project on classifying video using Bayesian filtering. It looked at stuff like brightness, contrast, volume, basically everything they could extract from the movie file and give a value to. The concept itself is quite powerful; the difficulty is getting a list of tests that can accurately predict / classify what you have (spam/non-spam, or for video, thriller/drama/etc).

    If you're interested in finding out more about actually coding Bayesian filters, you can check out the Bayes ++ project page.
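The additive test scoring mentioned above (as distinct from the Bayesian part) can be sketched in a few lines. The rule names and score values here are hypothetical, chosen only for illustration, and are not SpamAssassin's actual rules or defaults; the comparison against the threshold is likewise an assumption about how the cutoff is applied.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical rule names and scores, for illustration only.
EXAMPLE_SCORES: Dict[str, float] = {
    "SUBJECT_ALL_CAPS": 1.5,
    "OBFUSCATED_DRUG_NAME": 3.0,
    "HTML_ONLY": 1.0,
    "SENDER_IN_WHITELIST": -4.0,  # negative scores push a message toward "not spam"
}


def classify(hits: Iterable[str], threshold: float = 5.0,
             scores: Dict[str, float] = EXAMPLE_SCORES) -> Tuple[float, bool]:
    """Sum the scores of the tests that fired and compare with the threshold."""
    total = sum(scores[name] for name in hits)
    return total, total >= threshold
```

With these made-up numbers, a message hitting the three positive rules is flagged at a threshold of 5 but not at 9, and a whitelist hit pulls the same message back under the cutoff, which is why the choice of threshold discussed elsewhere in the thread matters so much.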

  54. what is it with those guys? by jqh1 · · Score: 2, Insightful

    Larry Seltzer did a similar job with a review of disposable email address services in PC Magazine.

    Spamgourmet (open source and free to use) was lined up against several commercial offerings, and was rated the lowest. It was clear from the review that he didn't spend much time learning about how spamgourmet works -- he wound up faulting it for perceived problems that were addressed by features that he ignored in the review.

    Not to be cynical, but if I were a tech reviewer, I might be afraid of lawsuits resulting from my reviews -- open source projects have no revenue, and therefore can't prove up any damages in court. This might make me more likely to choose the open source alternative to get the shaft. Hopefully that's not what's going on here, but you've got to wonder...

    --
    who's moderating the meta-moderators?
  55. Re:What is a good client-side spam filter for Outl by professorhojo · · Score: 2, Interesting

    spampal does the trick for me.

    quick and effective identification. can check the online black hole lists for IP ranges to block and you can manually set the thing up to ignore email from any country. :)

    goooooodbye china!

  56. Re:What is a good client-side spam filter for Outl by letxa2000 · · Score: 2, Insightful
    Client side? I'll take server-side any day. Why would I want to download 250+ spams per day when the server could just as easily filter them for me?

    If you have your mail on a POP server (ISP, hosting provider, etc.) try PrismEmail. It filters between your server and you so there is effectively no time or load on your computer, plus it works with virtually any mail client with nothing to install on the server or on the client.

    I'm at 99.9% accuracy so far this month.

  57. But fix your .procmailrc by jlv · · Score: 2, Interesting

    And you better change that simple, straightforward procmail recipe to use ":0fw:" on the first line. That trailing ":" is important if you are not running spamd, as it makes procmail use a lock file and only run 1 instance of SpamAssassin at a time. Otherwise, if you get 30 messages, you'll get 30 instances of SpamAssassin, which is 30 instances of Perl, etc. Large load spike.

  58. Better way to integrate postfix and SA by cblack · · Score: 2, Informative

    Two things. First, it is probably more proper to match the X-Spam-Flag: YES header than the number of asterisks in the X-Spam-Level header. Then you can tweak your cutoff level for X-Spam-Flag: YES in the SA config.
    Also, rather than running SA from procmail or other means, it is much more efficient and clean to run it from a separate daemon like amavisd-new and then configure postfix to use amavisd-new as a content_filter. There are several advantages to this approach, the greatest being that you do not pay process startup penalties for each incoming mail to be scanned, since amavisd-new is written in Perl, references the SA engine through the Perl module rather than the command line, and has a scalable child process architecture similar to Apache and many other network server daemons. Another nice thing about amavisd-new is that you can integrate many different virus scanners with it as well as SA, and it will handle all the subject rewriting, mail deleting, etc. for you.

  59. Re:Is Running Home Server Worth It? by timeOday · · Score: 2, Interesting
    I like having my own email server at home because I can make up a different email address each time I give one out - any email address I want, since it's at my own domain. This is the key to my spam filtering.
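
    A rough Python sketch of the bookkeeping behind this approach, assuming a catch-all domain that accepts any local part. The class name, the domain, and the "burn" step are hypothetical; a real setup would put the rejection in the mail server's config rather than in a script.

```python
class AliasBook:
    """Track which address was handed to whom, and which ones now attract spam."""

    def __init__(self, domain):
        self.domain = domain
        self.issued = {}     # address -> who it was given to
        self.burned = set()  # addresses that should now be rejected

    def issue(self, recipient):
        # Derive a unique address from the name of whoever is receiving it.
        local_part = recipient.lower().replace(" ", "-")
        address = f"{local_part}@{self.domain}"
        self.issued[address] = recipient
        return address

    def burn(self, address):
        # Stop accepting mail for an address once spam starts arriving on it.
        self.burned.add(address.lower())

    def accepts(self, address):
        # Catch-all behaviour: everything is accepted unless it was burned.
        return address.lower() not in self.burned

    def leaked_by(self, address):
        # Who was given this address? None if it was never issued.
        return self.issued.get(address.lower())
```

    When spam shows up on one address, `leaked_by` says who passed it along and `burn` retires just that address without affecting the others.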

    As for maintenance, there isn't any. I set up exim two or three years ago and have hardly touched it since.

  60. Spamassassin and other tools. by hoyhoy · · Score: 2, Informative

    I wrote an article about the open source tools that I use to keep Spam out of my inbox here:
    http://www.involution.com/spamstats.php

  61. Technical expertise of the media is a factor. by merc · · Score: 2, Funny

    ... "I installed the software on Red Hat Linux 9, with help from one of Proofpoint's systems engineers. She talked me through getting the Linux system configured properly, getting sendmail set up, and installing and configuring the Protection Server, which includes the MySQL database server for storing quarantined e-mail."

    [ ... ]

    IT consultant Logan Harbaugh is the author of two books on networking. Contact him at [snipped]

    Ok, which one of you helped him with the book?
    --
    It's true no man is an island, but if you take a bunch of dead guys and tie 'em together, they make a good raft.
  62. My letter to the author by macdaddy · · Score: 5, Insightful

    This guy's article was a joke. Not only did he use an ancient version (in the spam world) of SpamAssassin but he either flat out lied in his article or was too lazy to seek out the truth. Hard to configure? Can't find docs? Doesn't support A B C D or E? If this guy had spent 5 minutes of his precious time doing research on SA he wouldn't have made these flagrant lies. I don't get these people. I really don't. I CC'd the Editor-in-Chief at InfoWorld, Mr. Steve Fox, as well.

    Mr. Harbaugh,

    This letter is in response to your InfoWorld article titled "Commercial solutions win, spam loses." In that article you portray all commercial spam solutions as winners and you portray the only open-source spam solution you reviewed as a dismal failure. I must say that as a professional in the anti-spam field I am truly disappointed by your incomplete and inaccurate assessment.

    You start the article off quite well. Your introduction regarding two of the possible types of spam filtering is in terms that the average reader can understand. The introduction is also technically accurate, although it doesn't mention the other ways to filter spam.

    You quickly take an opportunity to kick dirt on SpamAssassin by claiming it filters a fraction of the amount of spam all the commercial solutions filter. You hint at something during that statement when you said that SpamAssassin's "age showed in my tests," yet you fail to actually make it apparent to the user what the real truth is. I must ask, why did you choose to compare such an ancient version of SpamAssassin to the current versions of the four commercial products? Version 2.44 is over 9 months old. Spam filtering techniques are constantly evolving to filter a continually changing target. Comparing a 9.5 month old copy of SpamAssassin to the current version of BrightMail is like comparing a 1990 Chevy Silverado to a brand-new 2004 model. As an author and professional in the IT industry writing a column for InfoWorld, one of your goals is accuracy and fairness in reporting, is it not?

    You make numerous false statements regarding SpamAssassin in your article:

    1) "All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves... SpamAssassin allows only the administrator to add to the whitelist, with no direct access for users."

    This is simply not true. SpamAssassin allows its users to add whitelist or blacklist entries to their personal preferences. It also allows its users to control the scoring for each individual ruleset within SpamAssassin's arsenal. Even the ancient version of SpamAssassin you chose to use had that simple feature. SpamAssassin also has the ability to automatically whitelist senders.

    2) "Delegation of specific administrative functions is possible with all the products except SpamAssassin..."

    This too is not true. As I said in response to number 1, SpamAssassin allows its users to control the scoring for each individual ruleset. This gives them the ability to disable certain rules, lessen the scores of others, and increase the scores of rules they wish had more weight. For example a user could disable the MAPS RBL DNS blacklist checks, whitelist joe@mydomain.tld, blacklist annoying-spammer@spamdomain.biz, and increase the score of the rule ALL_CAP_PORN to 2. The users can also create their own rulesets. SpamAssassin gives its users a high level of control over their spam filtering.

    3) "Finally, in addition to stopping spam, all four commercial products provide content-filtering features, allowing the administrator to block incoming or outgoing e-mail that contains proprietary data, audio or video files, executables, sexually explicit words, or racial slurs. They also provide protection against DoS attacks and directory harvesting attacks."

    This one baffled me at first. I'm honestly not sure why you want to compare features that have nothing to do with filtering spam. Filtering racial slurs from an email is

  63. Catching false positives. by jelwell · · Score: 2, Informative

    Here's how I catch false positives. But basically you should just learn to live with either false positives or spam. Take your pick.

    I turned subject rewriting on:
    rewrite_subject 1

    Then I set the subject tag to include the hit number:
    # Text to prepend to subject if rewrite_subject is used
    subject_tag *****SPAM****:*_HITS_*

    Then in your email client you can sort your JUNK messages by subject. This puts the tagged spam messages with the fewest hits at the top, so the ones most likely to be false positives are the easiest to check.

    I added another level of filtering to avoid looking at totally bogus spam messages. I set up two folders in my email client, "SPAM" and "EVILSPAM". I have a procmail filter that delivers spam messages with 10 or more hits to EVILSPAM, so I don't even look at them. All other spam goes to SPAM:
    :0 H
    * ^X-Spam-Status: Yes, hits=[0-9][0-9]
    mail/EVILSPAM

    :0 H
    * ^X-Spam-Status: Yes
    mail/SPAM
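
    For anyone without procmail, the same two-folder routing can be sketched in Python. This is only an illustration of the logic: it assumes the header looks like "X-Spam-Status: Yes, hits=12.3 required=5.0 ...", and the function name, the plain dict of headers, and the "INBOX" fallback are my own choices.

```python
import re

# Matches the start of a positive status value, e.g. "Yes, hits=12.3 required=5.0".
STATUS_RE = re.compile(r"^Yes, hits=(-?\d+(?:\.\d+)?)")


def route(headers, evil_threshold=10.0):
    """Pick a folder name from a message's headers (a plain dict here)."""
    status = headers.get("X-Spam-Status", "")
    match = STATUS_RE.match(status)
    if match is None:
        # Not tagged as spam (or no status header at all).
        return "INBOX"
    hits = float(match.group(1))
    # Two-digit scores and up go to the folder that never gets looked at.
    return "EVILSPAM" if hits >= evil_threshold else "SPAM"
```

    Comparing the number directly also makes it easy to move the cutoff, which the two-digit regex trick cannot do.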

    Your email client can probably do this for you, instead of a procmail filter. But this way I can use webmail and all my rules are on my server, not on my client.
    joe.

  64. The author replies..... by macdaddy · · Score: 2, Informative
    I received a reply from the author, Logan Harbaugh, a little while ago. It would seem that I'm not the only person that stood up in support of SA. Apparently there was a reason he used an ancient version of SA. It would seem that the reason was supposed to be in the article but that the editing staff stripped it out prior to being published. Here is Mr. Harbaugh's reply:

    Date: Tue, 25 Nov 2003 11:40:33 -0800

    From: Logan Harbaugh
    Subject: RE: In regards to your article titled "Commercial solutions win, spam loses"

    To all concerned, I apologize for the apparent maligning of SpamAssassin in my recent article in InfoWorld. In my original article, I stated that I used the 2.44 release of SpamAssassin for two reasons - because it was the version shipping with the latest release of Red Hat 9 and because it would illustrate how much the state of the art has changed in the last year or two. This explanation was condensed in the finished article by copy editors, which is beyond my control. This will be covered in the letters to the editor section of InfoWorld so the rest of the world will know that I did not deliberately use an old version of SA to show it in a bad light against commercial products. I plan to review the current version in an upcoming article, and I am sure that it will perform better.

    Regarding some of the other comments that have been made in the many emails I've received defending SpamAssassin, some of you have said that SA is not hard to install, taking no more than an hour or two to download, install, configure and begin using. That is consistent with the 10 times longer number I used, because the other installation and configuration times were all around 5-10 minutes. You have said that an experienced Linux administrator doesn't find SA difficult to install or configure, and that additional functionality such as user-accessible white lists can be added, either through additional open source software or by writing scripts or programming to extend the functionality of SA. That's true, but not really relevant, unless there is a distribution that contains all of those features.

    You have also said that I should have taken into account the fact that it doesn't cost anything before making statements about it being harder to install, configure and manage than the commercial products. SA does cost - but in an administrator's time rather than money, which I did say in the article.

    The same is true of support - while you may get faster or better support through this group than you get with commercial software, there's no guarantee that you'll get any support at all - and most organizations will find that hard to live with.

    So, when I review the latest version of SA, you can expect performance to be better, but I will still look closely at installation, administration, updates, maintenance, reporting, granularity of management, and end-user features for SA, just as I will for any other anti-spam packages I review.

    Again, my apologies for creating a story that distressed so many of you. I do try to create balanced reviews that reflect the pros and cons of all the products reviewed.

    Thanks,

    Logan G. Harbaugh

    Thank you to Mr. Harbaugh for replying. His second paragraph still indicates that he doesn't realize that the current release of SA has all the features he said were missing. I look forward to this being corrected in a future article. I didn't go into much of a free vs commercial debate in my reply; however it seems that some folks did. I also didn't touch on the support issue. Frankly I find that support really isn't needed as long as the admin is competent. I was involved in a discussion yesterday with a company I consult with. The topic of the discussion was which Linux distro we should use in the future now that RH is going towards an enterprise distribution and support contracts. Many seemed to believe that we should have technical support for whatever distro we chos