Critical Eye on SpamAssassin
ErrorBase writes "In this Infoworld article, Logan G. Harbaugh makes a great deal about an ancient (2.44) version of SpamAssassin comparing it with newer comercial variants.
Quote : You get what you pay for. [...] However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products. "
Why did he not ask Kevin Railsback who had the whole thing working some while ago?)"
SpamBayes, by far.
Webmin is great for setting up just about anything you can think of.
A psychopath can't tell the difference between right and wrong. A sociopath knows the difference - he just doesn't care.
Great - compare generation or more older open source to fresh shrinkwrap. Who's zooming (or shilling) for who.
.01% (yes Bucko, less than 1/1000) false positives. When they implemented it several versions ago it was just as good.
My ISP (souther NH) runs SpamAssassin 2.6 - and I can tell you that at the default settings it catches 90-95% with
I've got one client where the run NO filter - some folks (the names GOTTA be on the web site) get up to 100 spams a day. IT are basically monkeys with hands. I have no idea what the CEO thinks. They wouldn't even think OS as they're a total MS shop.
I don't understand why he's so critical of a free product. I upgraded to 2.60 and it's running near flawless, and since the program is so simple, you just upgrade it, no need to change configuration options if you don't need to, you just call it from procmail.
Yeah all those GUI options look nice, but 90% of the time, why do I need to change my spamblocking settings? The Bayesian filter autoadjusts itself with little or no user intervention -- it's near transparent.
I run a mail server at home on a Linux box, with Postfix and Spamassassin 2.60. I have it configured to label mail as spam once it hits 8 points, and to automatically chuck it into /dev/null once it hits 12 (using Postfix's header_checks).
It works pretty well for me -- the mail server's only for my personal use so I don't really have to worry about irate subscribers sueing me for dropping them legit mail =p and the 8-12 point range in the spam marking gives me a chance to vet through those suspicious mails briefly before deleting them.
I've never tried any other spam filters on the server-side, so I can't really compare. I guess I'm also a bit of a Linux hacker so I don't mind tweaking all those config files along the lines of the FAQ and other hints on forums to get it to work the way I want it to.
Gan Family Homepage
I know people have been recommending SpamBayes but be warned - it is very slow to parse and move the emails. Only bother with this if you receive only a small volume of spam or have a pretty fast computer.
He sent a long open letter to SAtalk. You can find it in the mailing list archive
Each product was tested with a different stream of mail, so the number of messages received varied, but all received enough messages to assess their capabilities.
Can you imagine someone writing "Oracle, Sybase and Postgres were compared. While the data and workloads were different, all products performed enough work to assess thier capabilities."
All the products except Brightmail and SpamAssassin allow end-users to add senders to the domain whitelist themselves.
I don't know anything about Brightmail. Spamassassin end user whitelists entries can be set up in a number of ways.
And all the products but SpamAssassin use dynamic updates to keep up with the evolving technologies spammers use to circumvent less sophisticated filters.
As aluded to in the summary, this is false with modern versions of Spamassassin, which uses Baysian filtering. (The author later says he couldn't get it working.
However, it took more than 10 times as long to install and configure SpamAssassin as it did any of the other products. [...] But just because the software is installed does not mean it will work -- filtering criteria must be added manually, and until that's done nothing is filtered out. Getting the various configuration files edited properly so that the whole package worked was not simple. Documentation was difficult to find, and not always easy to follow.
While it is true that one must be comfortable with a text editor to configure Spamassassin, thus perhaps putting it out of reach of point-and-click admins and technical journalists, I also wouldn't be prone to put my mail servers in the hands of either of those groups of people.
It looks for keywords in the subject or body of e-mails, but is frustrated by words not in the dictionary, such as "V!agra," or words that contain invisible HTML characters.
While I am not sure what tests appeared in which version, I'm pretty sure 2.44 handled off-by-one works such as V!agra. I have no idea what he's talking about when he says "invisible HTML characters", but it does seem to point to a certain technical incompetence, similar to the ostritch belief - "If I can't see you, then you can't see me."
This is not to say Spamassassin is the easiest thing in the world to deal with. I happen to love it, because of the extreme flexibility.
I just get sick of tech journos who decide that because a tool doesn't have a gui and they don't want to take the time to configure it, it sucks.
I forget what 8 was for.
You think 2.44 is ancient? Feh - Debian 'stable' is still stuck with 2.20.
I've found the easiest way to implement SpamAssassin is to invoke it through MailScanner. MailScanner uses third-party virus scanners and can optionally invoke SpamAssassin as well. With the free ClamAV antivirus product, you can build a powerful open source mail scanner. Even without a virus scanner, MailScanner detects and quarantines executable attachments and other dangerous content which represent the most common types of mail-borne viruses and worms.
RedHat installs the daemonized version of SA as well as the SA Perl scripts. Using the daemon, the easiest implementation is to invoke SA in /etc/procmailrc on the mail delivery host; for mail gateways running sendmail, you need to use the milter interface. I've found the MailScanner+SpamAssassin approach much easier to configure than either of these methods, and you get virus scanning to boot!
I suspect if the reviewer had compared SA 2.60+ to the commercial products, rather than the older 2.44 version used in the review, SA would have shown better results.
I'd agree with the reviewer that one of the things SA lacks is an easy method for users to interact directly with the program. (Part of the issue has to do with security; SA runs as root. As I read the review, I wondered how the other products allow users to interact directly with the scanners without sacrificing security.) It's not easy to maintain per-user Bayesian filtering, for instance, but I generally recommend having the mail client, e.g., Mozilla, handle these tasks.
Since then, I've downloaded a bunch of rules from The SA Custom Rule Emporium and almost nothing gets through.
If this guy had trouble, it is the fault of the documentation, not the product. Either that, or he was dumb enough not to upgrade to perl 5.8 or above, and spent forever installing modules.
He says:
Funny how when you install an old version of the product, it seems outmoded, hmmm?
Sheesh.
Pixie
don't mess with those geekgrrls
So if you want to whinge at anyone, whinge at RH. At least this shows that reviewers now think they should include FOSS in their reviews.
Justin.
You're only jealous cos the little penguins are talking to me.
I don't know anything about SpamBayes so I cannot comment on it at all.
POPFile is easy to use. It also performs Bayesian filtering. It is what I use.
http://popfile.sourceforge.net/
My current POPFile statistics:
Messages classified: 1,440
Classification errors: 19
Accuracy: 98.68%
saconf works for the Windows versions of spam assassin.
http://www.openhandhome.com/saconf.html