Slashdot Mirror


Two Spam Filters 10 Times As Accurate As Humans

Nuclear Elephant writes "The authors of two spam filters, CRM114 and DSPAM, announced recently that their filters have achieved accuracy rates ten times better than a human is capable of. Based on a study by Bill Yerazunis of CRM114, the average human is only 99.84% accurate. Both filters reportedly reach accuracy levels between 99.983% and 99.984% (1 misclassification in 6250 messages) using completely different approaches (CRM114 touts a Markovian approach, while DSPAM implements a Dolby-type noise reduction algorithm called Dobly). If you're looking for a way to rid your inbox of spam, roll on over to one of these authors' websites."
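
For anyone who wants to check the "ten times" claim, here is a minimal Python sketch of the arithmetic. It uses only the percentages quoted in the summary; the function names are made up for illustration and are not taken from CRM114 or DSPAM. Exact fractions are used so that rounding in floating point does not blur the comparison.

```python
from fractions import Fraction

def error_rate(accuracy_percent: str) -> Fraction:
    """Fraction of messages misclassified, given accuracy as a percent string."""
    return (100 - Fraction(accuracy_percent)) / 100

def messages_per_error(accuracy_percent: str) -> Fraction:
    """Average number of messages handled per misclassification."""
    return 1 / error_rate(accuracy_percent)

# Figures quoted in the summary (hypothetical variable names).
HUMAN_ACCURACY = "99.84"
FILTER_ACCURACY = "99.984"

human_interval = messages_per_error(HUMAN_ACCURACY)
filter_interval = messages_per_error(FILTER_ACCURACY)
improvement = error_rate(HUMAN_ACCURACY) / error_rate(FILTER_ACCURACY)

print(f"human:  1 error in {human_interval} messages")
print(f"filter: 1 error in {filter_interval} messages")
print(f"filter makes {improvement}x fewer errors")
```

So "ten times as accurate" here means ten times fewer mistakes: 99.84% versus 99.984% looks like a small gap, but it is the difference between one error in 625 messages and one in 6250.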

4 of 487 comments

  1. Re:Huh? Aren't humans 100%? by MarkJensen · · Score: 5, Informative

    I haven't been 100% accurate.

    I received an email from my sister-in-law, sent from her work, and the address looked suspicious (one of those weird-looking "letter and number" jumbles).

    I deleted it. It happens.

  2. Re:can it be used with SA? -yes by wideangle · · Score: 5, Informative

    A CRM114 plugin for SA is available, thanks to Devin Nate:

    http://bugzilla.spamassassin.org/show_bug.cgi?id=2301

  3. Spot the reference... by Maj.+Kong · · Score: 5, Informative

    CRM114 was a piece of encryption gear in Major Kong's...err, my B-52 in the movie Dr. Strangelove. It allowed only properly coded messages to be received by the crew. When the Soviet SAM detonated near the airframe, the CRM114 was damaged and the crew could not get the recall order.

    Kong: (announcing through headset intercom)

    This is your attack profile: to insure that the enemy cannot monitor voice transmission or plant false transmission, the CRM114 is to be switched into all the receiver circuits. Emergency phase code prefix is to be set on the dials of the CRM. This'll block any transmission other than those preceded by code prefix. Stand by to set code prefix.

    ObKubrick: In 2001: A Space Odyssey, one of the pods was marked with the designation CRM-114. And in A Clockwork Orange, Alex is injected with serum 114. I suppose CRM-114 is to Kubrick as THX 1138 is to Lucas.

    Dobly, on the other hand, is from This is Spinal Tap, a mispronunciation of "Dolby" by David St. Hubbins's girlfriend:

    Jeanine Pettibone: You don't do heavy metal in Dobly, you know.

    Not to mention that it probably avoids trademark infringement (though I wouldn't put it past Dolby Labs or Thomas Dolby to raise a stink).

    Maj. Kong
    --

    Shoot, a fella' could have a pretty good weekend in Vegas with all that stuff.

  4. Re:Huh? Aren't humans 100%? by Harinezumi · · Score: 5, Informative

    Computers are neither lazy nor pressed for time, and therefore can afford to read and evaluate every single line of every single message. Humans generally can't be bothered to be so diligent, and while they have the ability to get a 100% rate, in most cases they devote so little attention to the task of filtering email that the success rate drops.

    When these factors are considered, I think it's quite possible to write software that in the long run has a higher success rate than a human who has better things to do than filter his mail all day.