Slashdot Mirror


Proving Which Spam Filters Work Best

pirateninja writes "Dr. Gord Cormack decided to find and prove what the best spam filter is. In his study he looked at the major spam filters (DSPAM, SpamAssassin, etc.) along with those submitted by various academics. The results are quite surprising, with a previously unheard-of spam filter, which uses ideas from various compression algorithms, performing the best overall. He recently presented the results and methodology used in a presentation titled 'Spam Filters, Do they Work? and Can you prove it?'" Note that this is a video of his presentation.
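
The summary does not say how the winning filter applies compression, so the following is only a rough sketch of the general idea rather than the filter from the study. It assumes a message belongs to whichever class of mail it compresses best alongside. Here `zlib` stands in for whatever compression model the real filter uses, and the sample corpora and function names are made up for illustration.

```python
import zlib

# Tiny made-up training corpora (assumption); a real filter would learn from far more mail.
SPAM_SAMPLES = "buy cheap pills now. click here to win free money now. limited offer, act today. "
HAM_SAMPLES = "meeting notes attached. please review the quarterly report before friday. lunch at noon? "


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs for the text at maximum compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(corpus: str, message: str) -> int:
    """Extra compressed bytes needed when the message is appended to the corpus."""
    return compressed_size(corpus + message) - compressed_size(corpus)


def classify(message: str, spam: str = SPAM_SAMPLES, ham: str = HAM_SAMPLES) -> str:
    """Label the message by whichever corpus lets it compress more cheaply.

    Ties go to "ham", so an uncertain message is delivered rather than dropped.
    """
    return "spam" if extra_bytes(spam, message) < extra_bytes(ham, message) else "ham"


print(classify("click here to win free money now"))
print(classify("please review the quarterly report"))
```

A message that repeats phrases already seen in one corpus costs only a few extra bytes there, which is what drives the decision in this sketch.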

9 of 263 comments

  1. Re:In my experience... by coffeeisclassy · · Score: 3, Insightful

    What's surprising is that, while Bayesian spam filters work well in his tests, the one that performs best was never really heard of before. I wonder how long it will be before we see something using these methods. Who wants to bet open source will beat closed source to implementing this?

  2. RTFA? by glowworm · · Score: 4, Insightful

    So, how are we supposed to RTFA when the FA is a video file of over 470MB? Why not just a nice simple text summary, Mr. Submitter? But nooooo, that would just be too easy!

    --
    Orationem pulchram non habens, scribo ista linea in lingua Latina
  3. Not surprising... by RealGrouchy · · Score: 4, Insightful

    Although I haven't WTFV (watched the video), it doesn't seem surprising that spam filters which use techniques that aren't used widely would be most successful.

    If they aren't used widely, it would either be because they don't work, or they do work but they haven't caught on [yet].

    It's like any other fad. As an example, when the original Survivor series came out, it was really popular because it achieved its goal (attracting viewers) in a way that was original. Heck, even I watched the original one. Now that all the networks are doing the reality TV thing, it has become hackneyed, and each successive version of Survivor does a worse job of achieving its goal. And I've given up watching TV.

    With antispam, new techniques are effective, but as they become more popular and more widely used, spammers will find equally innovative ways of getting around them.

    I've noticed that at any given time, there will be a particular style of (non-blank) spam that manages to get through Gmail's filters fairly consistently, but every now and then Gmail adapts its spam filters to block the successful spam type of the season, and eventually a new type will make its way through.

    - RG>

    --
    Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
  4. No bittorrent... No credibility by bgog · · Score: 4, Insightful

    Why exactly should we give any weight to anything from an organization so ignorant as to disallow BitTorrent? It takes someone pretty darn ignorant to disallow a protocol because some use it to transport illegal content. Why haven't they banned TCP? It is an evil technology used every day to violate copyright.

    This guy should spend his time educating the fools at his institution.

  5. Re:In my experience... by I!heartU · · Score: 3, Insightful

    Domain keys... now just get everyone to use it.

  6. Re:Harder! by cruachan · · Score: 4, Insightful

    Don't knock it, cuneiform on baked clay is the single most successful format for long-term storage ever invented: 3000 years and counting. Heck, most of our modern storage formats can't even manage 30. Tried to read an 8" floppy recently?

  7. Re:In my experience... by KlaymenDK · · Score: 4, Insightful

    "False positives may be a problem, however."

    False positives are a HUGE problem compared to the occasional false negative.

    I'd rather have a small trickle of spam emails (I can't believe I'm saying this, but hear me out) than I would risk missing out on that one truly important email.

  8. Re:In my experience... by jank1887 · · Score: 3, Insightful
    Hello. Welcome to the internet.
    First, spam does not need to make sense to make money. Here's some of my latest received headlines:
    • placing LEDhas
    • pJapans mission
    • capture Todays architect shared
    • 6MZ
    and the body text (with an attached image):

    -----
    malware

    USDA databases crop

    entente cordial: admission relation contract GB giveaway andd

    studios another page:

    ... (etc.,etc.)
    -------
    AND IT STILL MAKES MONEY!!!
    Spam is funded by idiots. We will never run out of idiots on the net. Thus, spam will always be profitable under the current email system, no matter what filters are used. Filters don't fix the spam problem any more than virus scanners stop viruses from spreading. It's all reactive, which translates to 'fighting a never-ending battle on the losing side'.

  9. Re:Harder! by Squalish · · Score: 4, Insightful

    Am I the only one that read the means of presentation as a hilarious attack on a university policy of blocking bittorrent? Given that adding 470MB doesn't really add any usable information to a discussion about spam filters over a piece of text, and all.

    Your college doesn't like bandwidth-efficient delivery? Flood them with a Slashdot effect on a 500MB file, an extra $500 in bandwidth charges, and maybe they'll change their tune.

    --
    People in Soviet Russia, however, appear to be afflicted with amusing juxtapositions of the aforementioned situation