Spamassassin Beats CRM-114 In Anti-Spam Shootout

Posted by timothy on Tuesday June 22, 2004 @03:24PM from the hawaii-alaska-and-utah dept.

Simon Lyall writes "A new study of antispam software shows that Spamassassin performed well in various configurations. Spamprobe, Bogofilter and Spambayes also came out well, while CRM-114 failed to live up to its previous claims. The study shows: 'The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day.'"

Correct link to CRM-114 by athakur999 · 2004-06-22 15:27 · Score: 5, Informative

CRM-114

The link in the article points to SpamBayes again.

--
"People that quote themselves in their signatures bother me" - athakur999
The Mozilla ThunderBird SPAM filter by k.ellsworth · 2004-06-22 15:30 · Score: 5, Interesting

the mozilla spam filter does a very good job too, when it learns enough it becomes over 95% accurate. i dropped evolution for it, and never looked back

--
Putting a windows cd backwards, plays evil messages, but it gets worse, putting it right, installs windows.
1. Re:The Mozilla ThunderBird SPAM filter by norton_I · 2004-06-22 20:01 · Score: 5, Insightful
  
  Better to do spam filtering with your MTA/MDA anyway, if possible. That way, the same filter is used no matter which email client you use from which computer. Plus, it means you don't have to download spams to your MUA when on a slow connection.
  
  Now if only I could get the rest of my mail configuration to be shared between evolution, mutt, and squirrelmail.
Quit acting like goddamn babies... by Anonymous Coward · 2004-06-22 15:32 · Score: 5, Funny

Baysian, gaysian. Real men hit delete.
No HTML, Just ps or pdf, conclusions inside by randyest · 2004-06-22 15:34 · Score: 5, Informative

And a long document it is (funny placeholder images though.) Here are the conclusions, for those who are impatient but want a little more than the summary:

Supervised spam filters are effective tools for attenuating spam. The best-performing filters reduced the volume of incoming spam from about 150 messages per day to about 2 messages per day. The corresponding risk of mail loss, while minimal, is difficult to quantify. The best-performing filters misclassified a handful of spam messages early in the test suite; none within the second half (25,000 messages). A larger study will be necessary to distinguish the asymptotic probability of ham misclassification from zero.

Most misclassified ham messages are advertising, news digests, mailing list messages, or the results of electronic transactions. From this observation, and the fact that such messages represent a small fraction of incoming mail, we may conclude that the filters find them more difficult to classify. On the other hand, the small number of misclassifications suggests that the filter rapidly learns the characteristics of each advertiser, news service, mailing list, or on-line service from which the recipient wishes to receive messages. We might also conjecture that these misclassifications are more likely to occur soon after subscribing to the particular service (or soon after starting to use the filter), a time at which the user would be more likely to notice, should the message go astray, and retrieve it from the spam file. In contrast, the best filters misclassified no personal messages, and no delivery error messages, which comprise the largest and most critical fraction of ham.

A supervised filter contributes significantly to the effectiveness of Spamassassin's static component, as measured by both ham and spam misclassification probabilities. Two unsupervised configurations also improved the static component, but by a smaller margin. The supervised filter alone performed better than the static rules alone, but not as well as the combination of the two.

The choice of threshold parameters dominates the observed differences in performance among the four filters implementing methods derived from Graham's and Robinson's proposals. Each shows a different tradeoff between ham accuracy and spam accuracy. ROC analysis shows that the differences not accountable to threshold setting, if any, are small and observable only when the ham misclassification probability is low (i.e. hm
CRM-114 and DSPAM exhibit substantially inferior performance to the other filters, regardless of threshold setting. Both exhibit substantial learning throughout the email stream, leading us to conjecture that their performance might asymptotically approach that of the other filters. From a practical standpoint, this learning rate would be too slow for personal email filtering as it would take several years at the observed rate to achieve the same misclassification rates as the other systems. Both these systems were designed to be used in a train on error configuration, and do not self-train. This configuration could account for a slow learning rate as each system avails itself of the information in only about 1,000 of the 50,000 test messages. In an effort to ensure that we had not misinterpreted the installation instructions, we ran CRM-114 in a train-on-everything configuration and, as predicted by the author, the result was substantially worse.

Spam filter designers should incorporate interfaces making them amenable for testing and deployment in the supervised configuration (figure 4). We propose the three interface functions used in algorithm 1 - filterinit, filtereval, and filtertrain - as a standardized interface. Systems that self-train should provide an option to self-train on everything (subject to correction via filtertrain) as in algorithm 2.
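To make the proposed interface concrete, here is a rough Python sketch of a supervised run built on three functions named as in the quoted text: filterinit, filtereval and filtertrain. The signatures, the ToyFilter class and the word-count scoring are all assumptions made for illustration. They are not the study's algorithm 1 and not the internals of any filter that was tested.

```python
import math
import re
from collections import Counter


class ToyFilter:
    """Toy word-count filter; a stand-in, not any filter from the study."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.totals = {"ham": 0, "spam": 0}

    def tokens(self, text):
        # Treat each distinct lowercase word as one feature.
        return set(re.findall(r"[a-z0-9$!]+", text.lower()))


def filterinit():
    """Create an untrained filter (assumed signature)."""
    return ToyFilter()


def filtereval(f, message):
    """Return 'spam' or 'ham' from add-one smoothed log-odds (assumed signature)."""
    score = 0.0
    for tok in f.tokens(message):
        p_spam = (f.counts["spam"][tok] + 1) / (f.totals["spam"] + 2)
        p_ham = (f.counts["ham"][tok] + 1) / (f.totals["ham"] + 2)
        score += math.log(p_spam / p_ham)
    return "spam" if score > 0 else "ham"


def filtertrain(f, message, label):
    """Tell the filter the true class of a message (assumed signature)."""
    f.counts[label].update(f.tokens(message))
    f.totals[label] += 1


def run_supervised(stream):
    """Classify each message in order, then train on its true label."""
    f = filterinit()
    errors = {"ham": 0, "spam": 0}
    for message, truth in stream:
        if filtereval(f, message) != truth:
            errors[truth] += 1
        # Train on every message; a train-on-error variant would move
        # this call inside the "if" above.
        filtertrain(f, message, truth)
    return errors
```

Keeping the ham and spam error counts separate follows the recommendation in the quoted conclusions. Any filter that exposes the same three calls could be dropped into run_supervised in place of the toy one.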

Ham and spam misclassification proportions should be reported separately. Accuracy, weighted accuracy, and precision should be avoided as primary evaluation measures as th

--
everything in moderation
Mozilla Messenger / Thunderbird Performance? by Mark_MF-WN · 2004-06-22 15:34 · Score: 5, Interesting

I wonder how Mozilla Messenger/Thunderbird's spam filtering stacks up against these filters? I've heard some negative comments about the Mozilla filtering system, but it's worked wonders for me.
A little advice by Anonymous Coward · 2004-06-22 15:37 · Score: 5, Funny

You don't want to face an assassin in a shootout. Maybe a pie eating contest, or a spelling bee... but not a shootout.
Why don't people use catch-all accounts? by mattkinabrewmindspri · 2004-06-22 15:44 · Score: 5, Interesting
When you register with a hosting company, very frequently, they set up what's called a catch-all account, and any email to your domain that's not addressed to a real address goes there. This is how I use it:
- I only use my main email address with friends and family, and never post it online.
- Whenever I post an email address or register for anything online, I put thatsite@mydomain.com as my email address.
- All email is received by one account, but each message can have a different "to:" header. I set my filters to filter mail to different boxes. Email sent to amazon@mydomain.com goes to the amazon folder. Same with ebay, slashdot, whatever.
- Any time I start receiving spam, I just set my mail server to disregard email sent to whatever email address is getting the spam, and I can stop doing business with the company that sold my email address.
I receive on average 0 spams per day.
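For anyone who wants to try this, the sorting and blocking steps above could look roughly like the Python below. It is only a sketch: the route function, the BLOCKED set and the "leakyshop" alias are made up for the example, and it assumes the alias actually shows up in the "To:" header, which is not guaranteed for every message a catch-all account receives.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical alias that started receiving spam and is now disregarded.
BLOCKED = {"leakyshop"}


def route(raw_message, domain="mydomain.com"):
    """Return a folder name for the message, or None to discard it."""
    msg = message_from_string(raw_message)
    for _name, addr in getaddresses(msg.get_all("To", [])):
        local, _, host = addr.partition("@")
        if host.lower() != domain:
            continue  # not one of our aliases
        local = local.lower()
        if local in BLOCKED:
            return None  # alias was leaked or sold, drop the mail
        return local  # e.g. amazon@mydomain.com goes to "amazon"
    return "inbox"  # no alias matched, leave it in the main box
```

Calling route on a message addressed to amazon@mydomain.com gives "amazon", while anything sent to the blocked alias gives None.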
--
Albuquerque PC
Issues with testing corpus by w_mute · 2004-06-22 16:00 · Score: 5, Interesting

I haven't read everything in detail yet, but one of the things that stands out is that their 'gold standard' representing the best result consists of 9,038 ham messages (18.4%) and 40,048 spams (81.6%). While large, the dataset is unbalanced. One of the things that is recommended by many of the filters is training on equal proportions of ham/spam in order to prevent biasing (overfitting).
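A quick back-of-the-envelope check in Python shows what that imbalance does to a single accuracy number. The counts are the ones quoted above. The rates helper, the hm/sm shorthand for ham and spam misclassification, and the "flag everything as spam" filter are hypothetical and not taken from the study.

```python
HAM, SPAM = 9038, 40048  # counts quoted in the comment above

total = HAM + SPAM
ham_share = HAM / total
spam_share = SPAM / total


def rates(ham_misclassified, spam_misclassified):
    """Return (ham error rate, spam error rate, overall accuracy)."""
    hm = ham_misclassified / HAM
    sm = spam_misclassified / SPAM
    accuracy = 1 - (ham_misclassified + spam_misclassified) / total
    return hm, sm, accuracy


# Hypothetical useless filter: every message is flagged as spam.
hm, sm, acc = rates(ham_misclassified=HAM, spam_misclassified=0)

print(f"{ham_share:.1%} ham, {spam_share:.1%} spam")
# prints "18.4% ham, 81.6% spam"
print(f"hm={hm:.1%} sm={sm:.1%} accuracy={acc:.1%}")
# prints "hm=100.0% sm=0.0% accuracy=81.6%"
```

A filter that loses every legitimate message still scores 81.6% accuracy on this corpus, which is one reason to look at the ham and spam error rates separately.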

Their train-on-errors approach may simulate what goes on with some filters, but it doesn't reflect the scenario where there is an initial dataset to be trained on _before_ new messages are processed. Instead, each message is in essence 'new'. So in their tests the machine learning filters start out knowing nothing, but SpamAssassin starts out with its inbuilt ruleset. Not exactly fair.

-Greg