Slashdot Mirror

Paul Graham on Fighting Spam

Posted by CmdrTaco on Friday August 16, 2002 @04:08AM from the near-and-dear-to-my-heart dept.

Ramakrishnan M writes "Paul Graham, the Lisp guru, is back with a great technique to fight spam. It is based on a trust metric, and he claims that only 5 out of 1000 spams leaked through this system, with 0 false positives. Worth looking at."

2 of 675 comments

Major geek bias there... by Kaa · 2002-08-16 04:21 · Score: 5, Funny

From the article:

Based on my corpus, "sex" indicates a .97 probability of the containing email being a spam, whereas "sexy" indicates .99 probability. And Bayes' Rule, equally unambiguous, says that an email containing both words would, in the (unlikely) absence of any other evidence, have a 99.97% chance of being a spam.
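The 99.97% figure in the quoted passage can be checked with a few lines of arithmetic. The sketch below assumes the usual naive Bayes combination with equal prior odds of spam and non-spam and with the two words treated as independent evidence; the function name `combine` is made up for illustration and is not taken from the article.

```python
def combine(probs):
    """Combine per-word spam probabilities, assuming equal priors
    and independent words (an assumption of this sketch)."""
    prod_spam = 1.0  # product of p_i: evidence that the mail is spam
    prod_ham = 1.0   # product of (1 - p_i): evidence that it is not
    for p in probs:
        prod_spam *= p
        prod_ham *= 1.0 - p
    return prod_spam / (prod_spam + prod_ham)


# "sex" -> 0.97 and "sexy" -> 0.99, the figures quoted from the article
combined = combine([0.97, 0.99])
print(f"{combined:.2%}")
```

With these two inputs the calculation is 0.9603 / (0.9603 + 0.0003), which rounds to the 99.97% quoted above.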

Hmm... take an average adult geek and yes, an email mentioning "sex" or "sexy" can go to /dev/null immediately without so much as a second glance... :-)

On the other hand, if you run the statistics on the email of an average horny teenager, the probabilities might come out a bit different.

--

Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.

False positives... by dillon_rinker · 2002-08-16 04:44 · Score: 5, Funny

From the article:

In the spam filtering business, false positives are your biggest worry...Based on my corpus, "sex" indicates a .97 probability of the containing email being a spam, whereas "sexy" indicates .99 probability...an email containing both words would have a 99.97% chance of being a spam.

False positives could be a HUGE problem in this case...imagine the agony if you missed this email from your wife: "I'm feeling REALLY sexy today - meet me at the motel off 12th street at noon for some lunch-hour sex!"