Using Email Networks as P2P Spam Filters

Posted by CmdrTaco on Thursday May 12, 2005 @07:02AM from the where-have-i-heard-this-before dept.

Oscar Boykin writes "New Scientist is running a story on using the social network in email as a P2P network. The idea is that email networks have structure that is conducive to a type of search called percolation search. This means email clients could query the social network of email users to filter spam. This story is based on a preprint available."

4 of 108 comments

YahooMail, GMail and Hotmail Do This Already by osewa77 · 2005-05-12 07:19 · Score: 3, Informative

What strikes me is that the idea of "pooling information" isn't really new. When one Yahoo Mail/Hotmail/Gmail user marks a particular mailing as spam, it affects the likelihood that the same email will be marked as spam for other users of the same service. So, the idea of "pooling information about spam" (from the article) is already in use! However, it would be nice to create explicit protocols to allow such data (which mailings I have marked as spam) to be made public, so that people using other email providers or their own mail servers can share in this pool of knowledge. Of course, the big three email providers (Yahoo Mail, Hotmail, and Gmail) would be foolish to make this information public: the spam filtering is one thing that makes a Yahoo/Gmail account more attractive to potential users! Good idea in theory, but bad business prospects. To add insult to injury, there is no way for the researchers to profit from the arrangement.
Sounds Like SpamNet by MBCook · 2005-05-12 07:36 · Score: 2, Informative

That sounds like Cloudmark SpamNet (I think that was what it was called). I used it a few years ago when it was in beta and it worked great. The idea is that people marked the spam they received as spam. When they did that, a hash of the message (or title, or something like that) was sent to Cloudmark's server. When your mail came in, it was hashed and checked against those reports to see if it was spam. It was VERY accurate. It had only one problem:
Cloudmark.

I signed up for the free beta and was told that it would be free forever (they were going to charge businesses, IIRC). Then they changed their mind but said that early adopters/beta users would get it free for life. Then it left beta and they offered me a $5 discount (one time) for their subscription service (or some other pointless trinket offer like that). As far as I'm concerned they ripped me off.

That set me off trying other things, and I eventually found POPFile, which I use to this day (great software). I've posted this to Slashdot before (a long time ago). Some nice guy from an anti-spam company gave me a code for a free version of their product (I never used it; I had found something by then and didn't feel like switching again).

The point of all this is that it is a nice method that really works. If there was an open source project that did the same thing, I would use it. Until then, I've got a solution that works fine.
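
To make the mechanism concrete, here is a minimal Python sketch of the report-and-lookup scheme described above. It is not Cloudmark's actual implementation, which this comment does not document. The names `fingerprint` and `SharedSpamIndex`, the whitespace/case normalization, the use of SHA-256, and the two-reporter threshold are all assumptions made for illustration, and the "server" is just an in-memory dictionary.

```python
import hashlib


def fingerprint(message: str) -> str:
    """Hash a message after collapsing whitespace and case (assumed normalization)."""
    normalized = " ".join(message.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SharedSpamIndex:
    """Hypothetical in-memory stand-in for the shared reporting server."""

    def __init__(self, threshold=2):
        # Number of distinct reporters needed before a hash counts as spam (assumed).
        self.threshold = threshold
        # Maps message fingerprint -> set of users who reported it.
        self.reports = {}

    def report(self, user: str, message: str) -> None:
        """Record that `user` marked `message` as spam; only the hash is stored."""
        self.reports.setdefault(fingerprint(message), set()).add(user)

    def is_spam(self, message: str) -> bool:
        """Check incoming mail against the pooled reports."""
        reporters = self.reports.get(fingerprint(message), set())
        return len(reporters) >= self.threshold
```

Counting distinct reporters rather than raw reports is one simple way to keep a single user from flagging a message for everyone. An exact hash like this is also easy to evade by changing one character per copy, so a real system would presumably need a fuzzier fingerprint.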

But this isn't new (if I'm right about what it is, the article is down).

--
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Re:Wondering if this works for mailinglists by geoffspear · 2005-05-12 07:52 · Score: 3, Informative

If you're sending messages to email addresses that didn't actually subscribe then yes, you're a spammer and you should be blocked.
A well-designed opt-in list won't have any fake addresses on it (although it may have messages to invalid addresses bounce if once-valid accounts stop working), because anyone with half a brain designing an opt-in list would require the addresses it's mailing to be validated by the recipients of the messages before sending them anything.
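
The validation step described above can be sketched roughly as follows in Python. This is an illustration only: the class name `OptInList`, its methods, and the token length are hypothetical, and actually e-mailing the token to the address is left out.

```python
import secrets


class OptInList:
    """Hypothetical confirmed opt-in list: nobody is mailed until they validate."""

    def __init__(self):
        self.pending = {}       # confirmation token -> address awaiting validation
        self.confirmed = set()  # addresses that have proven they receive the mail

    def subscribe(self, address: str) -> str:
        """Register a request and return the token that would be mailed out."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token: str) -> bool:
        """Called when the recipient follows the link containing the token."""
        address = self.pending.pop(token, None)
        if address is None:
            return False  # unknown or already-used token
        self.confirmed.add(address)
        return True

    def recipients(self):
        """Only confirmed addresses ever receive list mail."""
        return sorted(self.confirmed)
```

Because the token only ever reaches the mailbox being subscribed, an address typed in by someone else stays in `pending` and never gets list mail.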

--
Don't blame me; I'm never given mod points.
Re:Wondering if this works for mailinglists by jurt1235 · 2005-05-12 19:35 · Score: 2, Informative

Taking texts out of context is your hobby I guess, anyway a reply:
If you get enough trash back because of users, the nicest way is to let the mailserver handle it. A CPU can do the dumping a lot faster than a person can look up an account and take the person off the mailing list.
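
As a rough illustration of letting the machine do the pruning, here is a small Python sketch. It assumes bounces have already been parsed into plain addresses, and the class name `BouncePruner` and the three-bounce cutoff are hypothetical choices, not something described in this thread.

```python
from collections import Counter


class BouncePruner:
    """Hypothetical helper that drops subscribers after repeated bounces."""

    def __init__(self, subscribers, max_bounces=3):
        self.subscribers = set(subscribers)
        self.max_bounces = max_bounces  # assumed cutoff before removal
        self.bounces = Counter()        # consecutive bounces per address

    def record_bounce(self, address: str) -> bool:
        """Count a bounce; return True if the address was just removed."""
        if address not in self.subscribers:
            return False
        self.bounces[address] += 1
        if self.bounces[address] >= self.max_bounces:
            self.subscribers.discard(address)
            del self.bounces[address]
            return True
        return False

    def record_delivery(self, address: str) -> None:
        """A successful delivery resets the bounce count for that address."""
        self.bounces.pop(address, None)
```

Resetting the count on a successful delivery keeps a temporarily full mailbox from getting someone unsubscribed.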

The SpamAssassin side of the story: We do not like to send out a plain text message, but nicely HTML-formatted messages. We make sure that this requested e-mail is not mistaken for spam by routing it through a filter first, so that the users who requested this mail do not accidentally find us in their spam box. Since we send it from the same address all the time, they can either go to our site, log in with their own account, and disable it, or use a filter rule to dump it in the trash anyway.

Third point: We send everybody a welcome message with a login, so they need to be active to get started. There is, however, a very high rate of AOL/Hotmail addresses which do not live very long, resulting in a lot of trouble.

And no, sending a normal mailing list with limited resources is not like being a spammer; it is more like being spammed, because you have to get rid of all the trash that expiring e-mail accounts cause.

--

My wife's sketchblog Blob[p]: Gastrono-me