The Next Step In Spam Filtering

Posted by CmdrTaco on Thursday October 9, 2003 @09:08AM from the brace-yourself-for-impact dept.

simeonbeta2 writes "Paul Graham (of "A Plan for Spam" fame) has a couple of new articles up. The first one details the success of Bayesian spam filters despite various circumvention techniques by spammers. While the success of Bayesian spam filtering is encouraging, it certainly hasn't seemed to stem the flow of spam in the last year or so. His second article, however, suggests finally taking the anti-spam battle to the spammers! Paul proposes that spam filtering packages automatically spider links contained in probable spam. Not only will this increase the accuracy of filters (by running the retrieved content through the spam filter as well) but this would effectively be a massive distributed DOS attack on spammers. This isn't a new idea nor is it without its problems but I think it's definitely an idea whose time has come."

Repeat from August by merger · 2003-10-09 09:12 · Score: 2, Informative

Feel free to read the comments from when this article was posted to slashdot in August.
Re:DoS Filter Circumvention by sketerpot · 2003-10-09 09:13 · Score: 2, Informative

It's possible to include, say, the Mozilla javascript engine in one of these spam filters, which would let it deal with funky javascript. BFilter, for one, uses this approach to deal with ad banners that are inserted in the page by javascript. The redirects can be dealt with; I'm sure there's some standard code for dealing with them that would be easy to use.
Really, you can take quite a bit of browser code out of the browser and use it in a filter.
Re:Are these subject lines example's of anti BF? by Sheetrock · 2003-10-09 09:19 · Score: 3, Informative

The recognizable words (neonatal, pedant, betsy) might be a weak attempt at that in addition to creating non-identical subjects, although they'd need a lot more non-spammy words buried in the article to get through... which they usually do, surrounded with HTML to make them invisible.

--

Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
Re:Boston Globe Article by hackhound · 2003-10-09 09:27 · Score: 3, Informative

Correct, clickable link here: Boston Globe
Re:What about false positives? by (54)T-Dub · 2003-10-09 09:29 · Score: 3, Informative

From the FAQ :

This could be used to DoS innocent victims.

That's the point of the blacklist. A site doesn't get pounded simply by being mentioned in a spam. It has to be mentioned in a spam and be on the blacklist.
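
A minimal sketch of that two-part test, assuming the filter already has a spam probability for the message and a set of blacklisted hosts. The names here (BLACKLIST, links_to_retrieve, the 0.9 threshold, the example hosts) are made up for illustration and do not come from the FAQ:

```python
from urllib.parse import urlparse

# Hypothetical blacklist of hosts known to serve spamvertised pages.
BLACKLIST = {"spammer.example"}


def links_to_retrieve(urls, spam_probability, blacklist=BLACKLIST, threshold=0.9):
    """Return only the URLs that pass both tests: the message looks like
    spam AND the linked host is on the blacklist."""
    # Test 1: the message itself must be probable spam (threshold is assumed).
    if spam_probability < threshold:
        return []
    selected = []
    for url in urls:
        # urlparse lowercases the hostname; it is None for malformed URLs.
        host = urlparse(url).hostname or ""
        # Test 2: being mentioned in a spam is not enough, the host must be listed.
        if host in blacklist:
            selected.append(url)
    return selected
```

An innocent site that a spammer pastes into a message fails the second test, so it is never fetched.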

--

"I can not bring myself to believe that if knowledge presents danger, the solution is ignorance" - Isaac Asimov
Re:DoS Filter Circumvention by vslashg · 2003-10-09 09:29 · Score: 2, Informative

Eventually, a good filter will have to mimic what the browser does very closely. Maybe it'd be better to actually use a browser that the user can't see.

Or set up a filter, and just stop accepting HTML mail altogether. Life is so much better when all of your incoming email is plain text. Most legitimate incoming mail is sent as multipart, so mail from your friends still gets through, even when they use mail clients that want to send out formatted mail.

The spammers sometimes send multipart messages with a text part that says something like "There is no plain text version of this message", but that's still better to see than a picture I didn't ask for.
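
For anyone who wants to try this, here is a rough sketch of the check using Python's standard email package. The function name has_plain_text_part and the two sample messages are only for illustration, and what to do with mail that fails the check (reject, quarantine, tag) is left open:

```python
from email.message import EmailMessage, Message


def has_plain_text_part(msg: Message) -> bool:
    """True if the message carries a text/plain body part somewhere."""
    for part in msg.walk():
        # Skip attachments: a .txt attachment is not a readable body.
        if part.get_content_disposition() == "attachment":
            continue
        if part.get_content_type() == "text/plain":
            return True
    return False


# HTML-only message: this is what the filter would stop accepting.
html_only = EmailMessage()
html_only.set_content("<b>Buy now</b>", subtype="html")

# Typical mail from a formatting mail client: multipart/alternative
# with both a plain text part and an HTML part.
both = EmailMessage()
both.set_content("Lunch on Friday?")
both.add_alternative("<p>Lunch on Friday?</p>", subtype="html")
```

Note that this does nothing about the "There is no plain text version of this message" trick; such a message still has a text/plain part and passes.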
Re:DoS Filter Circumvention by BagOBones · 2003-10-09 09:41 · Score: 2, Informative

In order to render the image it would have to be downloaded.
This is how spammers know that they found a working e-mail address.
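
A filter can at least see which images would trigger a download without fetching any of them. A small sketch with Python's standard html.parser; the class and function names and the tracker.example URL are invented for the example:

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collects the src of every <img> that points at a remote server."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only http(s) sources cause a request that the sender can observe;
        # cid: references point at parts inside the message itself.
        if src.lower().startswith(("http://", "https://")):
            self.remote_images.append(src)


def find_remote_images(html):
    """Return remote image URLs found in an HTML body, without fetching them."""
    finder = RemoteImageFinder()
    finder.feed(html)
    finder.close()
    return finder.remote_images
```

Any URL this returns is one the sender could make unique per recipient, so fetching it tells them the address is live.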

--
EA David Gardner -"... but the consumers have proven that actually what they want is fun."