Plan for Spam, Version 2

Posted by CmdrTaco on Tuesday January 21, 2003 @06:17AM from the bayesian-filtering-for-a-quieter-inbox dept.

bugbear writes "I just posted a new version of the Plan for Spam Bayesian filtering algorithm. The big change is to mark tokens by context. The new version decreases spams missed by 50%, to 2.5 per 1000, even though spam has gotten harder to filter since the summer. I also talk about how spam will evolve, and what to do about it."
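
A minimal sketch of what "marking tokens by context" could look like, for readers who have not seen the article. The `Subject*word` prefix format, the token regex, and the function names are assumptions made for illustration; the summary above does not specify them. The combining rule is the usual naive Bayesian product form.

```python
import re

# Assumed token pattern: letters, digits, and a few punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'!-]+")

def tokenize(headers, body):
    """Return tokens, prefixing header tokens with the header name.

    The "Name*token" prefix is a hypothetical marking scheme: it lets
    "free" in a Subject line be counted separately from "free" in the body.
    """
    tokens = []
    for name, value in headers.items():
        for word in TOKEN_RE.findall(value):
            tokens.append(f"{name}*{word}")
    tokens.extend(TOKEN_RE.findall(body))
    return tokens

def combine(probs):
    """Combine per-token spam probabilities into one message probability."""
    prod_spam = 1.0
    prod_ham = 1.0
    for p in probs:
        prod_spam *= p
        prod_ham *= 1.0 - p
    return prod_spam / (prod_spam + prod_ham)
```

With this scheme the same word can carry different probabilities depending on where it appears, which is the gist of the change described in the summary.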

Re:How is spam that big of a problem? by crawdaddy · 2003-01-21 06:36 · Score: 3, Insightful

Overblown? The fact that you would need more than one email account to keep from having your time wasted by spam proves otherwise.
Spam only cost-ineffective with ISP-level filters by PseudoThink · 2003-01-21 06:39 · Score: 5, Insightful

Spam filters are great, but it seems that only the Net-savvy are using them. Savvy users aren't the people spammers are making all their money from--they are making money off the naive and inexperienced users. These users aren't going to go out and install the latest Bayesian filters on their system, and the major email readers won't (and probably shouldn't) come with them automatically activated.

To make spam cost-ineffective for the spammers, we've got to stop it (or flag it) before it gets to the end-user. It would obviously be a mistake to allow ISPs to automatically delete all email that fails their spam filters, but I think it would be appropriate for them to include something in the headers flagging such email as probable spam. Then future email readers could detect this header and handle it gracefully, like moving it to a "spam" folder on the user's machine. Once this happens and Grandpa no longer gets email asking him to test the latest Viagra alternative, spam may become a thing of the past.
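
A rough sketch of the client side of that idea, using Python's standard `email` module. The header name `X-Spam-Flag`, the value `YES`, and the folder names are assumptions for illustration; the comment does not name a specific header.

```python
from email import message_from_string

# Assumed header that an ISP-level filter would add to probable spam.
SPAM_HEADER = "X-Spam-Flag"

def choose_folder(raw_message):
    """Return the folder a mail reader might file this message under."""
    msg = message_from_string(raw_message)
    flag = msg.get(SPAM_HEADER, "")
    # The mail is only moved, never deleted, so the user can still
    # recover anything the ISP flagged by mistake.
    return "spam" if flag.strip().lower() == "yes" else "inbox"
```

The point of the design is that the ISP only labels the message; the decision about what to do with it stays on the user's machine.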
filtering effectiveness by qoncept · 2003-01-21 06:40 · Score: 5, Insightful

I think I speak for everyone when I say false positives are the only real hindrance to the filtering of spam. I get roughly 20 emails a day, 75% of which are spam. If one of them slips past the filter and I see it, it doesn't bother me so much. Spam is no longer a problem. What is an absolute necessity, though, (and probably less so for me than other people) is that none of my legitimate email is filtered as spam. I'd rather have 100 spams filtered improperly than one legit email.

--
Whale
Re:hopeless by Kallahar · 2003-01-21 06:40 · Score: 5, Insightful

Yeah, 2.5 per 1000 getting through is a proof that his ideas are obviously flawed. Having a working system is the best proof that an idea works :)

Travis
Re:hopeless by ajs · 2003-01-21 06:42 · Score: 4, Insightful

Everyone but the folks at SpamAssassin have been focusing on the idea that any one technique for identifying spam is doomed to diminishing returns.

Over at SpamAssassin, they've been busily creating a system that collects "good enough" tests by the dozens and uses them to collectively score a message and determine its general "spamishness". The system relies on a complex scoring system that is determined, not by the whim of human programmers, but by the results of a genetic training system that pits one set of scores against another until equilibrium is reached for a given set of example spam and non-spam.

See my other post here for how Bayesian filtering will be used to allow this system to feed back on itself and improve as it sees more of your spam and non-spam....
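
The "many good-enough tests, one combined score" idea from this comment can be sketched roughly as follows. The rule names, weights, and threshold are made up for illustration; they are not SpamAssassin's actual rules or scores, and here they are hand-picked rather than produced by any genetic training.

```python
# Each rule: (name, weight, test). All three rules are hypothetical.
RULES = [
    ("subject_all_caps", 1.5, lambda m: m["subject"].isupper()),
    ("mentions_viagra", 2.0, lambda m: "viagra" in m["body"].lower()),
    ("many_exclamations", 1.0, lambda m: m["body"].count("!") >= 3),
]

# Assumed cutoff above which a message is treated as spam.
THRESHOLD = 3.0

def score(message):
    """Sum the weights of every rule that fires on the message."""
    return sum(weight for _, weight, test in RULES if test(message))

def is_spam(message):
    """No single rule decides; only the combined score does."""
    return score(message) >= THRESHOLD
```

A training step, as described above, would adjust the weights against a corpus of known spam and non-spam instead of leaving them fixed.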
Re:Why can't we have legal restrictions on spam? by Steve B · 2003-01-21 07:01 · Score: 5, Insightful

"Because the last thing we need in this country is the government telling us how and when we can send email or make a phone call."
In certain ways, the government does and should do precisely that. If I repeatedly call you at 4 AM to ask if your refrigerator is running or deliberately send you virus-laden e-mail, then you have every right to call upon the long arm of the law to slap down the harassment.
Spamming, being a violation of the recipient's property rights, falls into that category.

--
/. If the government wants us to respect the law, it should set a better example.
Re:AOL or Hotmail adopt? by Anonvmous Coward · 2003-01-21 07:14 · Score: 4, Insightful

"Does anyone think AOL or Hotmail could start using such a system as the one outlined in the article?"

No. My problem's with the senders, not the messages. What Hotmail should do is send back an email saying "Your message has been rejected because you have not been authorized by this user. If you'd like to request authorization, click here and follow the instructions."

When they properly fill out the form, you get a message saying "so'n'so wants to send you a message. Interested?" and you can say yes/no. If you say yes, they get added to your address book and they can email you until you remove them from it.

With this approach, it requires a valid return address before the message can possibly get to you. That means you're able to tell the person to remove you, unlike today's 'send anything to anybody' system.

If Hotmail did that, I'd actually consider paying for their service.
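
The authorization flow described in this comment could be modeled roughly like this. The class and method names are hypothetical, and real mail delivery, the web form, and the notification message are all left out.

```python
class Mailbox:
    """Toy model of a mailbox that only accepts authorized senders."""

    def __init__(self):
        self.address_book = set()   # senders allowed to email the user
        self.requests = set()       # senders who filled out the form
        self.inbox = []

    def receive(self, sender, text):
        """Deliver mail from known senders; reject everyone else."""
        if sender in self.address_book:
            self.inbox.append((sender, text))
            return "delivered"
        return "rejected"

    def request_authorization(self, sender):
        """Sender filled out the form; the user will be asked to decide."""
        self.requests.add(sender)

    def decide(self, sender, accept):
        """User answers yes or no to a pending request."""
        if sender in self.requests:
            self.requests.discard(sender)
            if accept:
                self.address_book.add(sender)

    def remove(self, sender):
        """User revokes a sender's permission."""
        self.address_book.discard(sender)
```

Because a request only reaches the user after the sender responds to the rejection notice, a forged or dead return address never gets past the first step.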
Re:Stop spam? by rograndom · 2003-01-21 08:10 · Score: 4, Insightful

Filtering is nice, I've been using SpamAssassin with reasonable results for the last few months. It has nearly no false positives but has recently been missing more. Perhaps I should update.

Actually, spamassassin has a nice built-in reporting tool:
spamassassin -r < *mailmessage*
And if you set it up to work with Vipul's Razor, it's all automagically updated.

--

Stupid Cheap Guitars
Re:Stop spam? by Deltan · 2003-01-21 08:21 · Score: 4, Insightful

Correction.. spam will never stop... ever.

You say that it will stop if it's fully against the law and people bring legal action to stop it.

Last time I checked, murder was illegal, punishable by death in many states, yet it still occurs.
Re:Stop spam? by CoughDropAddict · 2003-01-21 10:41 · Score: 3, Insightful

"Last time I checked, murder was illegal, punishable by death in many states, yet it still occurs."

People spam because it is rational to do so (or at least spammers make them think so). Very low costs, the possibility of a good return, and nothing to lose since there are virtually no spam laws.

A better comparison than murder is the practice of child labor. While it was legal, it was a rational practice to engage in, because the return was high and the risk was low -- if a kid gets eaten by a machine you just find another kid. Now that it is illegal, the practice is almost completely extinct because it is no longer rational -- the police would come knocking at the door, which impedes the goal of running a profitable business.