Slashdot Mirror


Using Email Networks as P2P Spam Filters

Oscar Boykin writes "New Scientist is running a story on using the social network in email as a P2P network. The idea is that email networks have structure that is conducive to a type of search called percolation search. This means email clients could query the social network of email users to filter spam. This story is based on a preprint available."

27 of 108 comments

  1. Secure? by geomon · · Score: 4, Interesting

    The authors propose that their system have access to inbound and outbound contacts. For trusted email accounts, that might work. But what about email accounts that people may want to create to shield their identity (political dissidents, whistleblowers)? They would have to live outside of the spam protection network and would, I assume, be seed accounts for spammers.

    Am I missing something in this analysis?

    --
    "Rocky Rococo, at your cervix!"
    1. Re:Secure? by seoYak · · Score: 5, Interesting

      I don't think that I'll trade my privacy for a reduction in spam.

    2. Re:Secure? by Anonymous Coward · · Score: 4, Funny

      "Those who would trade privacy for a reduction in spam deserve neither." Benjamin Franklin.

    3. Re:Secure? by rescendent · · Score: 2, Interesting

      Also, you could not be emailed by people whose email address you haven't already approved; would they have to phone you first?

      For example:
      People who change email address (Gmail, dropping a spammed email)
      People legitimately contacting you (Old friend, people wanting to know more about your website etc.)
      etc.

      It would be like setting your telephone to only accept certain phone numbers and scrapping the phonebook. Bad for people, terrible for business.

      Though I suppose spam is worse because it requires less effort and cost to contact 1000 people than using the telephone would... I've only changed email address 8 times... LOL

  2. Nice...but not necessary by PenguinBoyDave · · Score: 4, Insightful

    Since switching to Thunderbird, I get nearly no spam...maybe one or two per day. I like fancy stuff, but when simple works, go with it!

    --
    I'm not a troll, but I play one on Slashdot.
    1. Re:Nice...but not necessary by winkydink · · Score: 2, Interesting

      One or two per day out of how many? 3? 5? 1000?

      --

      "I'd rather be a lightning rod than a seismometer." -Ken Kesey

    2. Re:Nice...but not necessary by Dukael_Mikakis · · Score: 5, Insightful

      I use gmail, which does an excellent job at filtering spam.

      But I think this could even be a step back. Like the parent says, I think most informed people have solved the issue of filtering spam pretty effectively (Thunderbird, Yahoo, Gmail, Bayesian filters, etc.) and so we don't generally *see* much spam.

      The *REAL* problem with spam is traffic and network pollution. Spam wastes a ridiculous amount of bandwidth and (through spyware) hijacks our systems' cycles to do something that is (with filters) ultimately to no end. This seemingly won't solve the bandwidth consumption issue and might worsen the problem by polling all your friends over the network and then using your personal cycles to scan said email against all the known spam on your friends' computers.

      People forget that the true detriment of spam these days is the traffic it causes, not cluttering your inbox (if you're smart).

    3. Re:Nice...but not necessary by geoffspear · · Score: 2, Interesting

      That's a nice theory, but it seems more likely that the more effective spam filtering gets, the more spam will be sent. If it takes 100x more messages to get the same results, the spammers will just send 100x more messages. And they'll need to turn even more machines into zombies to do it.

      --
      Don't blame me; I'm never given mod points.
  3. Great... by yotto · · Score: 4, Funny

    ...Now the RIAA's going to sue me for getting spam.

  4. Potential for harm by davidwr · · Score: 4, Insightful

    Imagine the potential for harm if I infiltrated a social network and then identified my enemies as spammers, either deliberately or because I or the software agent I use was somehow tricked into doing so.

    Social network-based spam-detection is a part of, not a total, solution, and its limits need to be recognized.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  5. Isn't this basically how Razor works? by forevermore · · Score: 4, Insightful

    Granted, I just skimmed the article, but isn't this exactly how Razor works? (simplified) Communities of people flag messages, senders, etc. as spam, and the mail server (or in my case, spamassassin) compares the messages to the community spam archive for matches before delivery.

    --
    Do you really need reason for beer? Wingman Brewers
  6. Isn't this how Yahoo works by CrazyJim1 · · Score: 3, Interesting

    You mark a multi-user message as spam, and then it turns into spam for everyone else too.

  7. Reduces to a standard spam filter by tdvaughan · · Score: 3, Insightful
    According to the article, the method works by asking its network of email users if they've seen the spam before:
    "Similar software on each computer that receives the query would then check the message against its own spam database, and so on, until a match is found, or the message is deemed original."

    So it can't deal with spam that includes a unique random ID and would tag emails from a mailing list as spam. Once more: nice try, but it won't work in the real world.
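
    A minimal sketch of the objection above, under loud assumptions: it uses a plain breadth-first query over the contact graph (not the percolation search from the preprint) and an exact SHA-256 digest as the "spam database" entry. The names digest, seen_as_spam, contacts and spam_db are made up for illustration. It shows how an exact match is found a couple of hops away, and how a unique random ID appended to the message defeats it.

```python
import hashlib
from collections import deque


def digest(message: str) -> str:
    """Exact-match fingerprint of a message body (an assumed choice)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def seen_as_spam(start, message, contacts, spam_db, max_hops=3):
    """Ask contacts, then their contacts, whether they already hold this spam.

    contacts: dict mapping user -> list of users they exchange mail with
    spam_db:  dict mapping user -> set of digests that user marked as spam
    """
    target = digest(message)
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        user, hops = queue.popleft()
        if target in spam_db.get(user, set()):
            return True  # a match was found somewhere in the network
        if hops == max_hops:
            continue  # do not forward the query any further
        for peer in contacts.get(user, []):
            if peer not in visited:
                visited.add(peer)
                queue.append((peer, hops + 1))
    return False  # nobody within max_hops has seen it: "deemed original"


# Hypothetical three-person network: alice <-> bob <-> carol
contacts = {"alice": ["bob"], "bob": ["alice", "carol"], "carol": ["bob"]}
spam_db = {"carol": {digest("Buy cheap pills now!")}}

found = seen_as_spam("alice", "Buy cheap pills now!", contacts, spam_db)
# Same spam with a unique random ID appended no longer matches exactly.
evaded = seen_as_spam("alice", "Buy cheap pills now! id=8f3a", contacts, spam_db)
```

    Here found is True and evaded is False, which is the weakness described above. Every query also costs network traffic on each hop.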
  8. Ob by lheal · · Score: 4, Interesting

    In Korea, only old people get P2P spam.

    Actually, I think we should find a way to attach the same stigma to spam customers that we do to the spammers. Why do spam customers not have to go to jail? They're as much the problem as the spammers.

    I can see something like having all the spam customers' names published online, so you google for "spam" and "lheal" and up pops my list of purchases. The other spammers then get a very clean list of people to spam. Over time, the net would be segregated into those who like spam and those who don't.

    Yeah, unworkable idea, but so are all the others.

    --
    Raise your children as if you were teaching them to raise your grandchildren, because you are.
  9. Hmmm... by __aaclcg7560 · · Score: 2, Funny

    When you thought it was safe to use email again...

  10. YahooMail, GMail and Hotmail Do This Already by osewa77 · · Score: 3, Informative

    What strikes me is that the idea of "pooling information" isn't really new. When one Yahoo Mail/Hotmail/Gmail user marks a particular mailing as spam, it affects the likelihood that the same email will be marked as spam for other Yahoo users. So, the idea of "pooling information about spam" (from the article) is already in use! However, it would be nice to create explicit protocols to allow such data (what mailings I have marked as spam) to be made public so that people using other email providers or their own mail servers can share in this pool of knowledge. Of course, the big three email providers (Yahoo Mail, Hotmail, and Gmail) would be foolish to make this information public: the spam filtering is one thing that makes a Yahoo/Gmail account more attractive to potential users! Good idea in theory, but bad business prospects. To add insult to injury, there is no way for the researchers to profit from the arrangement.

  11. would that really be good? by overbom · · Score: 2, Insightful

    If I were a spammer:

    I'd modify an email client so that it reports every message from certain folks I don't like as spam, to poison the social network. It would only take a couple of clients out there saying "yup, I've already got a message like that here, and my user marked it as spam".

    think globally, act locally, right?

  12. Not a particularly new idea... by Otto · · Score: 3, Insightful

    This isn't a new idea... except that they propose to integrate it into the mail client and have everybody you've ever sent mail to or received mail from be a potential contact, weighted by frequency that you email them. That's a bit new, but not as effective as it seems.

    For one thing, it would block mailing list messages, which are messages that you probably do share with your contacts.

    For another, it does not consider that most spam has random keywords seeded into every copy sent, so those would have to be ignored somehow, which introduces a fuzzy match algorithm, which means the possibility of false matches exists, and since you're asking others (probably all using the same algorithm against their databases) you have increased the chances of a false match being found.
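
    To make the fuzzy-match point concrete, here is one possible sketch. It is an assumption on my part, not what the article or any real filter uses: it compares overlapping three-word "shingles" with Jaccard similarity. The names shingles, jaccard, is_fuzzy_match and the 0.5 threshold are hypothetical.

```python
def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word tuples from the text, lowercased."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    return len(sa & sb) / len(sa | sb)


def is_fuzzy_match(a: str, b: str, threshold: float = 0.5) -> bool:
    """Treat two messages as the same mailing if they overlap enough."""
    return jaccard(a, b) >= threshold


known_spam = "buy cheap pills now from our trusted online pharmacy today"
variant = known_spam + " xk42q"  # same spam with a random keyword appended
unrelated = "lunch at noon tomorrow with the project team downtown"

variant_score = jaccard(known_spam, variant)
unrelated_score = jaccard(known_spam, unrelated)
```

    The variant still scores high while the unrelated message scores zero. The catch is the threshold: set it low enough to survive heavier randomization and legitimate mail that shares boilerplate with a spam message starts to match too, which is the false-match risk described above.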

    In any case, collaborative networks already exist in a better form. Users mark messages as spam when they get them, a flag is created and sent to some central place that all users check against for matches. The algorithm for fuzzy matching resides in one place and is only used as an indicator in SpamAssassin in any case, not as the sole indicator.

    Large scale systems like Google's GMail can use people flagging messages as spam to filter similar enough messages from other users, sort of thing. I'm pretty sure they do something like this, in fact, as my GMail account has *never* made a mistake in its spam detection.

    And so forth. There's better ways than relying on a random query of your contacts to see what they think.

    --
    - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
  13. Mmmm, buzzwordie by cloudmaster · · Score: 2, Funny

    Sorry, I can't read the article. There were too many buzzwords in the post.

  14. Re:Secure?partially... by spectrokid · · Score: 3, Interesting

    You could collect email addresses in a hashed form, just like passwords are stored on a server. You would be able to check if the sender is in the list, but not be able to "un-hash" the list back into real addresses. The way to get around it would be for spammers to attach their sender addresses to these funny mails people do like to forward to their friends.
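
    A rough sketch of the hashed-list idea, assuming SHA-256 over a trimmed, lowercased address. Both choices are mine for illustration, and the names address_digest, known_senders and is_known_sender are hypothetical.

```python
import hashlib


def address_digest(address: str) -> str:
    """One-way digest of an address; trimming and lowercasing are assumptions."""
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Only digests are stored or shared, never the addresses themselves.
known_senders = {
    address_digest(a) for a in ["friend@example.org", "boss@example.com"]
}


def is_known_sender(address: str) -> bool:
    """Check membership without holding the plain address list."""
    return address_digest(address) in known_senders
```

    For example, is_known_sender("Friend@Example.org") is True and is_known_sender("spammer@example.net") is False. One caveat: unlike a well-chosen password, an email address is easy to guess, so anyone holding a list of candidate addresses can hash them and compare. A plain unsalted digest hides the list from casual reading more than from a determined harvester.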

    --

    10 ?"Hello World" life was simple then

  15. spam filters should reduce network load by sPaKr · · Score: 2, Insightful

    Skipping past the security issues. One of the goals of spam filters should be reducing network load, not increasing it. If we have to send our spam to several different peers to be scored, this would compound the network load problems. Mostly this is a bad idea(tm) from the get go. I think the only thing that will really stop spam is to force something like PGP (GPG) signatures on all mail. Here's hoping the new national ID cards will have public certs encoded on them. It would be cool if someone would step in and get PKI working for the rest of us. Also we should drag the bodies of spammers through major cities behind a horse, while allowing victims to beat the spammer with large sticks like golf clubs.

  16. Sounds Like SpamNet by MBCook · · Score: 2, Informative
    That sounds like Cloudmark SpamNet (I think that was what it was called). I used it a few years ago when it was in beta and it worked great. The idea is people marked mail they got as spam if it was. When they did that, a hash of the message (or title, or something like that) was sent to their server. When your mail came in, it was hashed and checked to see if it was spam. It was VERY accurate. It had only one problem:

    Cloudmark.

    I signed up for the free beta and was told that it would be free forever (they were going to charge businesses, IIRC). Then they changed their mind but said that early adopters/beta users would get it free for life. Then it left beta and they offered me a $5 discount (one time) for their subscription service (or some other pointless trinket offer like that). As far as I'm concerned they ripped me off.

    That set me off trying other things, and I eventually found POPFile, which I use to this day (great software). I've posted this to Slashdot before (a long time ago). Some nice guy from an anti-spam company gave me a code for a free version of their product to be nice (I never used it, I had found something by then and didn't feel like switching again).

    The point of all this is that it is a nice method that really works. If there was an open source project that did the same thing, I would use it. Until then, I've got a solution that works fine.

    But this isn't new (if I'm right about what it is, the article is down).
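
    For anyone wondering what the hash-and-check scheme described above might look like, here is a toy in-memory sketch. It is not Cloudmark's actual design (I don't know it); the class SpamRegistry, its methods, the SHA-256 fingerprint and the min_reports threshold are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict


class SpamRegistry:
    """Toy central registry: users report fingerprints, clients look them up."""

    def __init__(self, min_reports: int = 2):
        # Require several distinct reporters before calling something spam.
        self.min_reports = min_reports
        self._reporters = defaultdict(set)  # fingerprint -> set of user ids

    @staticmethod
    def fingerprint(message: str) -> str:
        """Only this digest leaves the client, not the message text."""
        return hashlib.sha256(message.encode("utf-8")).hexdigest()

    def report(self, user: str, message: str) -> None:
        """Record that this user marked the message as spam."""
        self._reporters[self.fingerprint(message)].add(user)

    def is_spam(self, message: str) -> bool:
        """True once enough distinct users have reported the same digest."""
        reporters = self._reporters.get(self.fingerprint(message), ())
        return len(reporters) >= self.min_reports


registry = SpamRegistry(min_reports=2)
registry.report("user1", "Win a free cruise!")
registry.report("user1", "Win a free cruise!")  # repeat from same user counts once
one_reporter = registry.is_spam("Win a free cruise!")
registry.report("user2", "Win a free cruise!")
two_reporters = registry.is_spam("Win a free cruise!")
```

    With one reporter the message is not yet flagged; with two it is. Counting distinct reporters is a small guard against a single bad actor flagging someone else's mail, though it does nothing against a group of them.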

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
  17. Re:Wondering if this works for mailinglists by geoffspear · · Score: 3, Informative
    If you're sending messages to email addresses that didn't actually subscribe then yes, you're a spammer and you should be blocked.

    A well-designed opt-in list won't have any fake addresses on it (although it may have messages to invalid addresses bounce if once-valid accounts stop working), because anyone with half a brain designing an opt-in list would require the addresses it's mailing to be validated by the recipients of the messages before sending them anything.

    --
    Don't blame me; I'm never given mod points.
  18. Standard Form Letter by Golthur · · Score: 4, Funny

    Your post advocates a

    (X) technical ( ) legislative ( ) market-based ( ) vigilante

    approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

    ( ) Spammers can easily use it to harvest email addresses
    (X) Mailing lists and other legitimate email uses would be affected
    ( ) No one will be able to find the guy or collect the money
    ( ) It is defenseless against brute force attacks
    (X) It will stop spam for two weeks and then we'll be stuck with it
    ( ) Users of email will not put up with it
    ( ) Microsoft will not put up with it
    ( ) The police will not put up with it
    ( ) Requires too much cooperation from spammers
    ( ) Requires immediate total cooperation from everybody at once
    ( ) Many email users cannot afford to lose business or alienate potential employers
    ( ) Spammers don't care about invalid addresses in their lists
    (X) Anyone could anonymously destroy anyone else's career or business

    Specifically, your plan fails to account for

    ( ) Laws expressly prohibiting it
    ( ) Lack of centrally controlling authority for email
    ( ) Open relays in foreign countries
    ( ) Ease of searching tiny alphanumeric address space of all email addresses
    ( ) Asshats
    ( ) Jurisdictional problems
    ( ) Unpopularity of weird new taxes
    ( ) Public reluctance to accept weird new forms of money
    ( ) Huge existing software investment in SMTP
    ( ) Susceptibility of protocols other than SMTP to attack
    ( ) Willingness of users to install OS patches received by email
    ( ) Armies of worm riddled broadband-connected Windows boxes
    (X) Eternal arms race involved in all filtering approaches
    ( ) Extreme profitability of spam
    (X) Joe jobs and/or identity theft
    ( ) Technically illiterate politicians
    ( ) Extreme stupidity on the part of people who do business with spammers
    ( ) Dishonesty on the part of spammers themselves
    (X) Bandwidth costs that are unaffected by client filtering
    ( ) Outlook

    and the following philosophical objections may also apply:

    ( ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
    ( ) Any scheme based on opt-out is unacceptable
    ( ) SMTP headers should not be the subject of legislation
    ( ) Blacklists suck
    ( ) Whitelists suck
    ( ) We should be able to talk about Viagra without being censored
    ( ) Countermeasures should not involve wire fraud or credit card fraud
    ( ) Countermeasures should not involve sabotage of public networks
    ( ) Countermeasures must work if phased in gradually
    ( ) Sending email should be free
    ( ) Why should we have to trust you and your servers?
    ( ) Incompatibility with open source or open source licenses
    (X) Feel-good measures do nothing to solve the problem
    ( ) Temporary/one-time email addresses are cumbersome
    ( ) I don't want the government reading my email
    ( ) Killing them that way is not slow and painful enough

    Furthermore, this is what I think about you:

    (X) Sorry dude, but I don't think it would work.
    ( ) This is a stupid idea, and you're a stupid person for suggesting it.
    ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!

    --
    Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
    1. Re:Standard Form Letter by Linux_ho · · Score: 2, Insightful

      I'll add my own here:

      (X) Similar to DCC and Razor, but far less bandwidth efficient than either

      You should also have checked:

      (X) Users of email will not put up with it
      (X) Requires immediate total cooperation from everybody at once

      --
      include $sig;
      1;
  19. Bigger problem... by Not_Wiggins · · Score: 2, Insightful

    What is one person's spam is another person's desired mail. I'm not talking about advertising, either. For example, I know for a fact that there are a lot of people out there that "knee-jerk" react to service messages from their bank, credit card, whatever... stuff they even signed up for that they mark as spam. Since I want to get my "your payment has posted" email, do I want to rely on the network of people around me that signed up for the same thing with the same company and report it as spam because they're too lazy to just unsubscribe?

    --
    Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
  20. Re:Wondering if this works for mailinglists by jurt1235 · · Score: 2, Informative

    Taking texts out of context is your hobby I guess, anyway a reply:
    If you get enough trash back because of users, the nicest way is to let the mail server handle it. A CPU can do the dumping a lot faster than a person can look up an account and take the person off the mailing list.

    The SpamAssassin side of the story: We do not like to send out a plain text message, but nice HTML formatted messages. We take care that this requested e-mail is not mistaken for spam by already routing it through a filter, so that users who requested this mail do not accidentally put us in a spam box. Since we send it from the same address all the time, they can either go to our site, log in with their own account and disable it, or use a filter rule to dump it in the trash anyway.

    Third point: We send everybody a welcome message with a login. So they need to be active to get started. There is however a very high rate of AOL/Hotmail addresses which do not live very long, resulting in a lot of trouble.

    And no, sending a normal mailing list with limited resources is not like being a spammer, it is more like being spammed, because you have to get rid of all the trash that expiring e-mail accounts cause.

    --

    My wife's sketchblog Blob[p]: Gastrono-me