Distributed Spam Detection

Posted by CmdrTaco on Saturday December 1, 2001 @05:20AM from the interesting-ideas dept.

A reader writes "There's an interesting project at SourceForge, called "Vipul's Razor", that uses a Gnutella-like system to let users exchange spam "signatures" to filter spam. I work at an ISP in Ottawa, and we have been using it for the last two weeks to stop the bulk of the spam coming to our POP3 accounts. More impressively, it hasn't tagged any valid mail as spam yet. Here's the scoop from its webpage: "Vipul's Razor is a distributed, collaborative, spam detection and filtering network. Razor establishes a distributed and constantly updating catalogue of spam in propagation. This catalogue is used by clients to filter out known spam. On receiving a spam, a Razor Reporting Agent (run by an end-user or a troll box) calculates and submits a 20-character unique identification of the spam (a SHA Digest) to its closest Razor Catalogue Server. The Catalogue Server echoes this signature to other trusted servers after storing it in its database. Prior to manual processing or transport-level reception, Razor Filtering Agents (end-users and MTAs) check their incoming mail against a Catalogue Server and filter out or deny transport in case of a signature match."" Cool idea. I'm up around 80% spam a day on my main mail account. Might be worth a try.
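
For readers who want to see the report/echo/check flow from the quoted description in code, here is a minimal sketch. It is not Razor's actual implementation or wire protocol: the class `CatalogueServer`, the function `signature_of`, and the server names are hypothetical, everything lives in memory, and SHA-1 is assumed only because it produces the 20-byte digest the description mentions.

```python
import hashlib


def signature_of(message: str) -> bytes:
    # Assumption: SHA-1 over the whole message text (20-byte digest).
    return hashlib.sha1(message.encode("utf-8")).digest()


class CatalogueServer:
    """Hypothetical in-memory stand-in for a catalogue server."""

    def __init__(self, name):
        self.name = name
        self.signatures = set()  # digests of reported spam
        self.peers = []          # other trusted servers

    def report(self, signature, echo=True):
        # Store the signature first, then echo it once to each trusted peer.
        if signature in self.signatures:
            return
        self.signatures.add(signature)
        if echo:
            for peer in self.peers:
                peer.report(signature, echo=False)

    def is_spam(self, signature):
        # A filtering agent asks: has anyone reported this digest?
        return signature in self.signatures


# A reporting agent submits to its closest server ...
ottawa = CatalogueServer("ottawa")
elsewhere = CatalogueServer("elsewhere")
ottawa.peers.append(elsewhere)
spam = "Make money fast!"
ottawa.report(signature_of(spam))

# ... and a filtering agent on another server sees the match.
print(elsewhere.is_spam(signature_of(spam)))
print(elsewhere.is_spam(signature_of("Lunch on Friday?")))
```

Only the digest travels between agents and servers in this sketch, which is the property several comments in the thread pick up on.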

14 of 304 comments

Idiotic by Anonymous Coward · 2001-12-01 05:26 · Score: 1, Insightful

90% of spam I get has a subject like:

"New pill reduces debt! 513456"

So, a message digest won't work.
Anyone know where these people live? by oman_ · 2001-12-01 05:28 · Score: 1, Insightful

Just curious.. has anyone compiled a list of known spammers and their home addresses?

--
Rats would be more funny if they could fart.
Great use of p2p by astrashe · 2001-12-01 05:29 · Score: 5, Insightful

This is a great use of p2p -- something that doesn't involve piracy. I wish I had heard of it before.

Are there any other innovative non-piracy p2p apps out there that we should know about?
Authentication with servers? by GlassUser · 2001-12-01 05:30 · Score: 5, Insightful

I read some of the documentation, but I can't find details on a couple of questions. Do the servers authenticate with each other? It was implied, but how deep is it? Are the SHA signatures signed to the originating server (or client/trollbox) too? I think this kind of model is great, but if you don't have some nifty authentication/accountability, it can be wide open for abuse. I'm sure anyone reading slashdot can imagine a vengeful spammer flooding the network with bogus or malicious hashes.

--
funny munging
How about a server frontend approach? by serial+frame · 2001-12-01 05:32 · Score: 3, Insightful

It would be very neat if this were provided as a free service that acts as a front-end to an existing POP3 account. Simply sign up, provide info like your username, POP3 host (but not password; that can be passed from the service to your POP3 server on log-in for safety reasons). Then, point your favourite mail client at the service's POP3 server, and...voila. Same e-mail, minus the spam.
Nothing truly insightful here, just speculation from a convenience freak.

--

-
And the Angel said unto me, "These are the cries of the carrots! The cries of the carrots!"
idea won't work if it reaches critical mass by intuition · 2001-12-01 05:37 · Score: 4, Insightful

Razor catalogs spam by hashing the entire text of the message. Later potential spam is "detected" by hashing entire texts of messages to see if the hash matches any of the existing hashes in the spam catalog.

To get around this, all a spammer has to do is change or add at least one character in each spam. This would make all the hashes unique, and no spams would be detected.
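
The objection can be checked in a few lines. This is only a sketch assuming the signature is a SHA-1 digest of the full text; the helper `digest` and the `catalogue` set are hypothetical names, and the sample text is the subject line quoted earlier in the thread.

```python
import hashlib


def digest(text: str) -> str:
    # Assumption: the signature is SHA-1 over the entire message text.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


original = "New pill reduces debt!"
variant = original + " 513456"  # same pitch plus a per-message number

# The catalogue only knows the digest of the first copy that was reported.
catalogue = {digest(original)}

print(digest(original) in catalogue)  # True
print(digest(variant) in catalogue)   # False
```

An exact-match digest catches identical copies only, so any per-recipient variation defeats it unless the varying parts are stripped before hashing.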
Open for abuse? by robstah · 2001-12-01 05:45 · Score: 2, Insightful

Although I marvel at the theory and the innovative use of peer-to-peer technology to achieve exemplary aims, I have some concerns about the possibilities of abuse. AFAIK the submission system for spam is not moderated in any way. In fact, only the hash is sent to the server and not a copy of the spam. I am therefore concerned that the system could be abused by someone submitting the hash of a legitimate mail, which would then prevent this email from being received by the other hosts. This could be done by a malicious user to prevent the circulation of Bugtraq items, for instance. And as everyone has a different personal opinion about spam and what constitutes it, I think a set of clear guidelines is required, a copy of the mail should be associated with each submission, and a human being should moderate the hashes being submitted. Although I have my doubts about the system, if these were put to rest I would have no hesitation in implementing a system like this.

--
Rob 'robster' Bradford
Debian Planet Guy
We are the apt. You will be packaged. Resistance is futile.
Re: Distributed spam filter by blibbleblobble · 2001-12-01 05:45 · Score: 3, Insightful

It does seem like a remarkably sensible system, just getting email clients to talk to each other about the emails they get.

You can tell if the same email has been sent to hundreds of people (and if you use hashes, you can do that without revealing the email)

You can click a "this is spam" button when you read an email, and anyone who trusts you (i.e. has your public key in their "trusted filtering friends" list) can look for similar messages and filter them.

But, there do seem to be a load of problems:
- Personalised email, as someone already mentioned
- Privacy problems with letting others into the secrets of your mailbox
- If you have the original of a message, you can calculate the hash, then see who else got the message (i.e. works for personal mail as well as spam)
- Relatively easy for malicious users to wrongly label someone as a spammer

Well worth investigating, though...
One way around potential abuse. by chris_7d0h · 2001-12-01 05:56 · Score: 5, Insightful

To eliminate the situation where one person posts a lot of "incorrect" signatures, a ranking system could be applied.
The thought goes like this.
A person submits a signature of "identified" spam mail to a "supernode", for example, and the submission gets a ranking of 1. Each additional submission (by other users) increases the score by a number.

This way, there are several classifications which could be used to filter incoming mail. For the mail providers, they could opt for only removing mail matching signatures with a very high score (thus very likely these will be actual spam) or they could filter anything reported.

The purpose of allowing the use of classifications is that it will take longer to get higher scores, since more people have to report the specific spam mail. Some people wish to eliminate things the least bit suspected, but mileage may vary.
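
A rough sketch of that ranking idea, under some assumptions of mine: the score goes up by exactly 1 per distinct reporter (the post only says "by a number"), repeat reports from the same person are ignored, and the names `ScoredCatalogue`, `report`, `score` and `is_spam` are hypothetical.

```python
from collections import defaultdict


class ScoredCatalogue:
    """Hypothetical catalogue that ranks signatures by distinct reporters."""

    def __init__(self):
        # signature -> set of reporter IDs that have submitted it
        self.reporters = defaultdict(set)

    def report(self, signature, reporter_id):
        # A set ignores repeats, so one user cannot inflate a score alone.
        self.reporters[signature].add(reporter_id)

    def score(self, signature):
        # .get avoids creating an empty entry for unknown signatures.
        return len(self.reporters.get(signature, ()))

    def is_spam(self, signature, threshold=1):
        # threshold=1 filters anything reported; higher values wait for consensus.
        return self.score(signature) >= threshold


catalogue = ScoredCatalogue()
catalogue.report("sig-a", "alice")
catalogue.report("sig-a", "alice")  # repeat report, not counted again
catalogue.report("sig-a", "bob")

print(catalogue.score("sig-a"))                  # 2
print(catalogue.is_spam("sig-a", threshold=1))   # True
print(catalogue.is_spam("sig-a", threshold=5))   # False
```

A cautious mail provider would pick a high threshold and accept that new spam gets through until enough people have reported it.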

Do you see a resemblance with the /. moderation?

--
In a society that believes in nothing, fear becomes the only agenda ~ Bill Durodié
Re:I've managed to filter most spam by LiteForce · 2001-12-01 07:22 · Score: 2, Insightful

This won't work if somebody has sent you a message by way of BCC (Blind Carbon Copy).

--
"Be vewy vewy quiet, I'm hunting wuntime ewwors!" - Elmer Fudd
another effective spam stopping method? by Sarin · 2001-12-01 07:24 · Score: 3, Insightful

I receive about 40 spam messages in my mail account each day and I run my own mail server (qmail). Someone told me about a very basic spam stopping method: just remove the mail account for a couple of weeks and then reconnect it again, and you should get less or no spam after that period.

I receive too many real messages to try this out, and I think most spammers won't bother to actually remove an email address from their database if it doesn't exist. But has someone else tried this with any luck?

This p2p spam filter sounds really nice and I'm going to give it a try asap. I already "lost" another mail account in the flood of spam I got on it, so now it forwards all messages to msnbill@microsoft.com (microsoft domain billing address).
Re:So... by Greyfox · 2001-12-01 08:05 · Score: 4, Insightful

Spammers themselves are generally interested in ways to disrupt those lines of defense. If this project grows in popularity and shows itself to effectively block spam, they'll start gunning for it. Considering potential holes in the system before that starts happening really isn't a bad idea.

--
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Why not a histogram filter? by javaaddikt · 2001-12-01 08:47 · Score: 1, Insightful

The best option would be a word count histogram filter. Then the spammer would have to entirely alter their language or sales pitch, which isn't going to happen. Just like handwriting, it is hard to change unless you make a wholesale effort at changing it. They're too lazy, too.
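
One way such a filter might look, as a sketch only. The post does not say how histograms would be compared, so cosine similarity, the 0.8 threshold, the letters-only tokenizer, and the names `histogram`, `cosine_similarity` and `looks_like_known_spam` are all assumptions of this example.

```python
import math
import re
from collections import Counter


def histogram(text):
    # Word counts over lowercase letter runs; digits and punctuation are dropped.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    # Compare two word-count histograms; 0.0 means no shared words.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def looks_like_known_spam(text, known_spam, threshold=0.8):
    # Flag the text if it is close enough to any previously reported spam.
    h = histogram(text)
    return any(cosine_similarity(h, histogram(s)) >= threshold for s in known_spam)


known = ["New pill reduces debt! Order your pill today and reduce debt now"]
variant = known[0] + " 513456"
unrelated = "Meeting moved to Tuesday, see agenda attached"

print(looks_like_known_spam(variant, known))
print(looks_like_known_spam(unrelated, known))
```

Unlike an exact digest, this needs the text of known spam (or at least its histogram) to be shared, which brings back the privacy questions raised elsewhere in the thread.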
Re:So... by dev0n · 2001-12-01 08:51 · Score: 4, Insightful

Seems like everyone hates spam with a passion, except maybe the spammers themselves

well, i would have to disagree with you on this point.. i work at a web hosting company as the technical support manager, and handling abuse complaints falls into my realm of responsibility... and i have found that a significant number of first time spammers do not KNOW that spam is "wrong", and get quite upset that they were "taken" by companies that send bulk messages on their behalf. i had one gentleman send me an apology letter that actually made me feel sorry for him. he, and many other people on our network, have never been repeat spammers.

i know that there are many people out there who don't care, but we can't automatically assume that all spammers are evil. some of them are just ignorant.