Spamassassin Beats CRM-114 In Anti-Spam Shootout

Posted by timothy on Tuesday June 22, 2004 @03:24PM from the hawaii-alaska-and-utah dept.

Simon Lyall writes "A new study of antispam software shows that Spamassassin performed well in various configurations. Spamprobe, Bogofilter and Spambayes also came out well, while CRM-114 failed to live up to its previous claims. The study shows: 'The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day.'"

The Mozilla ThunderBird SPAM filter by k.ellsworth · 2004-06-22 15:30 · Score: 5, Interesting

The Mozilla spam filter does a very good job too; once it has learned enough, it becomes over 95% accurate. I dropped Evolution for it and never looked back.

--
Putting a windows cd backwards, plays evil messages, but it gets worse, putting it right, installs windows.
Mozilla Messenger / Thunderbird Performance? by Mark_MF-WN · 2004-06-22 15:34 · Score: 5, Interesting

I wonder how Mozilla Messenger/Thunderbird's spam filtering stacks up against these filters? I've heard some negative comments about the Mozilla filtering system, but it's worked wonders for me.
Why don't people use catch-all accounts? by mattkinabrewmindspri · 2004-06-22 15:44 · Score: 5, Interesting
When you register with a hosting company, they very frequently set up what's called a catch-all account: any email to your domain that's not addressed to a real address goes there. This is how I use it:
- I only use my main email address with friends and family, and never post it online.
- Whenever I post an email address or register for anything online, I put thatsite@mydomain.com as my email address.
- All email is received by one account, but each message can have a different "to:" header. I set my filters to filter mail to different boxes. Email sent to amazon@mydomain.com goes to the amazon folder. Same with ebay, slashdot, whatever.
- Any time I start receiving spam, I just set my mail server to disregard email sent to whatever email address is getting the spam, and I can stop doing business with the company that sold my email address.
I receive on average 0 spams per day.
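The routing described above could be sketched roughly as follows. This is only an illustration in Python using the standard `email` module, not the poster's actual setup: the `route` function, the `BLOCKED` set, and the "inbox" fallback are hypothetical names, and `mydomain.com` is the placeholder domain from the post.

```python
from email import message_from_string
from email.utils import getaddresses
from typing import Optional

DOMAIN = "mydomain.com"        # placeholder domain from the post
BLOCKED = {"soldmyaddress"}    # hypothetical local parts that now receive spam


def route(raw_message: str) -> Optional[str]:
    """Return a folder name for the message, or None to discard it."""
    msg = message_from_string(raw_message)
    # Look at every address in the To: header(s).
    for _name, addr in getaddresses(msg.get_all("To", [])):
        local, _, domain = addr.lower().partition("@")
        if domain != DOMAIN:
            continue
        if local in BLOCKED:
            return None        # address was sold or leaked: drop the mail
        return local           # amazon@... goes to the "amazon" folder
    return "inbox"             # no address at our domain in To:
```

Under these assumptions, mail to amazon@mydomain.com lands in an "amazon" folder, and adding a local part to `BLOCKED` silences an address once it starts attracting spam.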
--
Albuquerque PC
Issues with testing corpus by w_mute · 2004-06-22 16:00 · Score: 5, Interesting

I haven't read everything in detail yet, but one thing that stands out is that their 'gold standard' representing the best result consists of 9,038 ham messages (18.4%) and 40,048 spams (81.6%). While large, the dataset is unbalanced. Many of the filters recommend training on equal proportions of ham and spam in order to prevent biasing (overfitting).

Their train-on-errors approach may simulate what goes on with some filters, but it doesn't reflect the scenario where there is an initial dataset to be trained on _before_ new messages are processed. Instead, each message is in essence 'new'. So in their tests the machine learning filters start out knowing nothing, but SpamAssassin starts out with its inbuilt ruleset. Not exactly fair.
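For anyone unfamiliar with the term, a train-on-errors loop looks roughly like the Python sketch below. `ToyFilter` is a hypothetical word-count classifier made up for illustration; it is not the study's harness or any of the tested filters. The point is only that the filter starts empty and is trained solely on the messages it gets wrong.

```python
from collections import Counter


class ToyFilter:
    """Hypothetical word-count classifier, for illustration only."""

    def __init__(self):
        # Starts out knowing nothing: no words seen for either class.
        self.counts = {"ham": Counter(), "spam": Counter()}

    def classify(self, text):
        words = text.lower().split()
        spam_score = sum(self.counts["spam"][w] for w in words)
        ham_score = sum(self.counts["ham"][w] for w in words)
        # Ties (including the untrained case) default to ham.
        return "spam" if spam_score > ham_score else "ham"

    def train(self, text, label):
        self.counts[label].update(text.lower().split())


def train_on_error(filt, stream):
    """Classify each message in order; train only on misclassified ones."""
    errors = 0
    for text, label in stream:
        if filt.classify(text) != label:
            errors += 1
            filt.train(text, label)
    return errors
```

Pre-training on an initial corpus would instead call `train` on every message in that corpus before the first `classify`, which is the scenario the test does not cover.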

-Greg