Slashdot Mirror


Proving Which Spam Filters Work Best

pirateninja writes "Dr. Gord Cormack decided to find and prove what the best spam filter is. In his study he looked at the major spam filters (DSPAM, SpamAssassin, etc.) along with those submitted by various academics. The results are quite surprising, with a previously unheard-of spam filter, which uses ideas from various compression algorithms, performing the best overall. He recently presented the results and methodology used in a presentation titled 'Spam Filters, Do they Work? and Can you prove it?'" Note that this is a video of his presentation.
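
For readers wondering what "ideas from compression algorithms" can look like in a filter, here is a minimal sketch. It is not the filter from the study: it swaps in zlib from the Python standard library and uses two tiny made-up corpora, labeling a message by whichever corpus lets it compress into fewer extra bytes. The names `compressed_size` and `classify` are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs for the given text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(message: str, spam_corpus: str, ham_corpus: str) -> str:
    """Label a message by the corpus that 'explains' it more cheaply.

    The cost of a message under a corpus is the number of extra bytes
    needed to compress corpus + message compared with the corpus alone.
    """
    spam_cost = compressed_size(spam_corpus + message) - compressed_size(spam_corpus)
    ham_cost = compressed_size(ham_corpus + message) - compressed_size(ham_corpus)
    return "spam" if spam_cost < ham_cost else "ham"


# Toy corpora, invented for illustration only.
SPAM = "buy cheap pills online now\nwin a free prize click here\n"
HAM = "meeting moved to friday at noon\nplease review the attached patch\n"

print(classify("win a free prize click here", SPAM, HAM))
print(classify("please review the attached patch", SPAM, HAM))
```

A real filter of this kind would presumably train on far more mail and use a stronger compression model; the sketch only shows the comparison step.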

10 of 263 comments

  1. Re:Why not just douse the server in gas... by Tsiangkun · · Score: 5, Funny

    I'm getting 8 KB/s downloads from the site; it's just like the good old days!

    I'll post more next week after I watch the video.

  2. Fantastic Spam Filters Which Work Best Proving! by _vSyncBomb · · Score: 5, Funny

    Hey Slashdot, what's up, man! Dude, I read your thing and like totally agree about Best Work Proving Spam Site Work! Dude, that's awesome!

    Bro, in the same vein, I was totally checking out this dope ass site which you might wanna check out too man. Guys like us that dig Spam Which Proving and Best work Filters will be all over this before long...

    OK, man take care until I see you this Friday at the dinner thing, Slashdot!

    Cheers,
    John

  3. Re:In my experience... by ozmanjusri · · Score: 5, Funny
    I've always had a very high success rate with these.

    I haven't tested this one, the Barrett Filter, myself, but I understand it is 100% effective at reducing spam from known sources. False positives may be a problem, however.

    --
    "I've got more toys than Teruhisa Kitahara."
  4. Flaw in the test by lheal · · Score: 5, Informative

    The spammers actively try to subvert the more popular filters. That gives a lesser-known one a decided advantage, one which will go away as it becomes more popular.

    As with most choices like this, factors such as ease of use, speed, and resource efficiency can overshadow selectivity. No system is perfect, so it's perfectly reasonable to go with a system that's pretty good if you already are using it, rather than switching to the latest cool thing.

    I have found that using two dissimilar systems in a chain is quite effective.
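
    One way to read "two dissimilar systems in a chain" is that mail surviving the first filter is handed to the second, so a message is dropped if either one flags it. A rough sketch of that idea follows; the two toy filters and the `chain` helper are hypothetical stand-ins, not real filter packages.

```python
from typing import Callable

# A filter takes a message and returns True when it looks like spam.
Filter = Callable[[str], bool]


def keyword_filter(message: str) -> bool:
    """Toy content filter: flags one well-known spam keyword."""
    return "viagra" in message.lower()


def shouting_filter(message: str) -> bool:
    """Toy heuristic filter: flags messages written mostly in capitals."""
    letters = [c for c in message if c.isalpha()]
    if not letters:
        return False
    return sum(c.isupper() for c in letters) / len(letters) > 0.7


def chain(*filters: Filter) -> Filter:
    """Combine filters so a message is spam if any stage flags it."""
    def is_spam(message: str) -> bool:
        # any() stops at the first filter that flags the message,
        # so later (possibly more expensive) stages are skipped.
        return any(f(message) for f in filters)
    return is_spam


combined = chain(keyword_filter, shouting_filter)
print(combined("Cheap VIAGRA here"))
print(combined("BUY NOW!!! LIMITED OFFER"))
print(combined("Lunch on Friday?"))
```

    The trade-off with this arrangement is that false positives add up too: legitimate mail is lost if either stage gets it wrong.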

    --
    Raise your children as if you were teaching them to raise your grandchildren, because you are.
  5. Harder! by Profane+MuthaFucka · · Score: 5, Funny

    I uuencoded the video file, translated it into Sumerian cuneiform, and pressed it into a billion little clay tablets. They are cooking in my oven right now. Now, the Internet is NOT some kind of truck you can just dump stuff onto, so if you want to get the data you're going to have to come to my house.

    --
    Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
  6. Re: Very Interesting And Generally Really Amusing by Anonymous Coward · · Score: 5, Funny

    Hey _vSyncBomb,

      Having trouble pleasing your woman? I've got something Very Interesting And Generally Really Amusing that you could try!!!

    Your buddy,
    _vAnoymousCoward

  7. Re:RTFA? by emag · · Score: 5, Funny

    "We are sorry that these talks are not available as plain HTML, PDF, or text, however under present IST policy we are not allowed to provide plain HTML, PDF, or text."

    --
    "The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
  8. text versions of the material by martin-boundary · · Score: 5, Informative
    For those who don't relish downloading 400MB worth of video (why can't somebody cut out the audio as a standalone MP3?), the material of the talk is also available in text mode.

    The official tests of spam filters were done at last year's TREC conference; you can read the writeup here (or pdf overview).

    You can duplicate those tests yourself if you download the evaluation toolkit (GPL). It's a modular system where you can add a mail corpus (either one of the public TREC ones, or you can make your own trivially), and add a spamfilter package (there are 10 or so to download from the web, or create your own as per documentation).
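
    To give a feel for what such a test measures, here is a small stand-alone sketch. It is not the TREC toolkit and does not use its interfaces; the function name, the toy filter, and the four-message corpus are all made up. It runs a filter over labeled mail and reports the percentage of ham and of spam that was misclassified.

```python
from typing import Callable, Iterable, Tuple


def misclassification_rates(
    filter_fn: Callable[[str], bool],
    corpus: Iterable[Tuple[str, bool]],
) -> Tuple[float, float]:
    """Return (ham misclassified %, spam misclassified %).

    corpus yields (message, is_spam) pairs; filter_fn returns True
    when it judges a message to be spam.
    """
    ham_total = spam_total = ham_wrong = spam_wrong = 0
    for message, is_spam in corpus:
        predicted_spam = filter_fn(message)
        if is_spam:
            spam_total += 1
            spam_wrong += not predicted_spam   # spam that got through
        else:
            ham_total += 1
            ham_wrong += predicted_spam        # good mail that was blocked
    ham_pct = 100.0 * ham_wrong / ham_total if ham_total else 0.0
    spam_pct = 100.0 * spam_wrong / spam_total if spam_total else 0.0
    return ham_pct, spam_pct


# Made-up corpus and a deliberately naive filter.
toy_corpus = [
    ("free prize inside", True),
    ("cheap pills", True),
    ("free lunch at the office", False),
    ("patch attached", False),
]


def naive_filter(message: str) -> bool:
    return "free" in message


print(misclassification_rates(naive_filter, toy_corpus))
```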

    There's also a video talk given at Microsoft Research which should cover pretty much the same ground, if text mode is slashdotted :).

    A new round of tests is scheduled for late this year at TREC 2006.

  9. Ask Slashdot ... by Anonymous Coward · · Score: 5, Funny

    Dear Slashdot,
    At the university where I work, they have recently adopted a pesky policy banning the use of bitTorrent.
    What can I do to fix this ?
    Yours faithfully,
    Dr. Gord Cormack

  10. GMail Spam Filter by foxylad · · Score: 5, Interesting

    I use greylisting (gld to be specific) which works wonderfully. A couple of customers wanted even better filtering...

    First I tried DSPAM, but the customers refused to train it, so the results weren't good. Then I tried SpamAssassin, which also let through a surprising amount of spam - a lot more than my personal account on Gmail.

    So I set up accounts on Gmail for them, and forwarded their mail to those accounts (after greylisting - don't want to burden Gmail too much!). Gmail lets you set up forwarding, so I simply forwarded all the filtered mail back to a second account on my mailserver for the customer to pick up. Finally I wrote a Python script that logs in to Gmail once a week to prevent the account being closed due to non-use.

    A tad involved, but it works like a dream. Yet again Google comes out on top, this time in a market it doesn't even know it's in!
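
    For anyone unfamiliar with the greylisting step at the front of this setup, here is a minimal sketch of the general idea. It is not how gld is implemented; the `Greylist` class, the five-minute delay, and the addresses are assumptions for illustration. The first delivery attempt from an unseen (client IP, sender, recipient) triplet is deferred, and a retry is accepted once the delay has passed.

```python
import time
from typing import Callable, Dict, Tuple


class Greylist:
    """Defer mail from unseen triplets until they retry after a delay."""

    def __init__(self, delay_seconds: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self.delay = delay_seconds
        self.clock = clock  # injectable so the logic can be tested
        self.first_seen: Dict[Tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str) -> str:
        """Return "accept" or "defer" for one delivery attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet first showed up.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.delay:
            return "accept"   # a real mail server retried later
        return "defer"        # would map to a temporary SMTP failure


# Usage with a fake clock so no real waiting is needed.
fake_now = [0.0]
greylist = Greylist(delay_seconds=300.0, clock=lambda: fake_now[0])
attempt = ("192.0.2.1", "someone@example.org", "customer@example.com")
print(greylist.check(*attempt))   # first attempt
fake_now[0] = 301.0
print(greylist.check(*attempt))   # retry after the delay
```

    The idea relies on much spam software never retrying, while legitimate mail servers do.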

    --
    Do as you would be done to.