Slashdot Mirror


Sorting the Spam from the Ham

MrClever writes "The Sydney Morning Herald (Aust) is running an article about the merits of Bayesian filtering and a good plain-english description of how it works. Might be handy if you need to explain it to non-technophiles. The main thing that may be useful is a Bayesian spam filter written to drop straight into Outlook 2k/XP available here and written in Python by Mark Hammond." Math buffs might enjoy reading these pages or browsing this writeup and its many links.

1 of 249 comments (clear)

  1. Why stop at classifying spam? Why not all e-mail? by Anonymous Coward · · Score: 5, Insightful

    As I wrote only late last night, using Bayesian classification with only two categories (spam and "non-spam") is somewhat short-sighted, since if properly trained, a Bayes classifier can do a much better job than ordinary mail filtering (procmail, Mozilla or Mail.app filters, you name it).

    In fact, if I had to bet on the next "killer apps", mail sorting and RSS filtering based on Bayesian classification would be right at the top of my list, based solely on the actual time-saving benefits for users. And I can't see any reason for Bayesian filtering not being included in Mozilla Mail and Apple's own (revamped) Mail.app.

    I have to use Outlook at work, and after setting up Outclass (which requires POPfile) with several "buckets" to classify my corporate e-mail by project and field, I'm definetly not going back. Outlook, even with extensive use of Rules Wizard and categories, simply cannot cope with the diverse kinds of project-related e-mail I swap with colleagues, and Outclass is the only thing I could find that could deal with Exchange, PST folders and multiple Bayesian "buckets" categories.

    Come on, do the right thing and tell Apple and The Mozilla Project that you want configurable Bayesian filtering on their mail clients.