Slashdot Mirror


Distributed Spam Detection

A reader writes "There's an interesting project at SourceForge called "Vipul's Razor" that uses a Gnutella-like system to let users exchange spam "signatures" to filter spam. I work at an ISP in Ottawa, and we have been using it for the last two weeks to stop the bulk of the spam coming to our POP3 accounts. More impressively, it hasn't tagged any valid mail as spam yet. Here's the scoop from its webpage: "Vipul's Razor is a distributed, collaborative, spam detection and filtering network. Razor establishes a distributed and constantly updating catalogue of spam in propagation. This catalogue is used by clients to filter out known spam. On receiving a spam, a Razor Reporting Agent (run by an end-user or a troll box) calculates and submits a 20-character unique identification of the spam (a SHA Digest) to its closest Razor Catalogue Server. The Catalogue Server echoes this signature to other trusted servers after storing it in its database. Prior to manual processing or transport-level reception, Razor Filtering Agents (end-users and MTAs) check their incoming mail against a Catalogue Server and filter out or deny transport in case of a signature match."" Cool idea. I'm up around 80% spam a day on my main mail account. Might be worth a try.
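To make the report-then-filter flow in the summary concrete, here is a minimal sketch. It assumes the "20-character" signature is a raw SHA-1 digest (SHA-1 output is 20 bytes) taken over the whole message, and it replaces the network of Catalogue Servers with a hypothetical in-memory class. The names `spam_signature` and `CatalogueServer` are made up for illustration and are not Razor's actual API.

```python
import hashlib


def spam_signature(message: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of a message (assumed signature format)."""
    return hashlib.sha1(message).digest()


class CatalogueServer:
    """Hypothetical in-memory stand-in for a Razor Catalogue Server."""

    def __init__(self):
        self.signatures = set()

    def report(self, message: bytes) -> None:
        # A reporting agent submits only the digest, never the message itself.
        self.signatures.add(spam_signature(message))

    def is_spam(self, message: bytes) -> bool:
        # A filtering agent hashes incoming mail and looks the digest up.
        return spam_signature(message) in self.signatures


catalogue = CatalogueServer()
catalogue.report(b"Buy cheap pills now!")

print(catalogue.is_spam(b"Buy cheap pills now!"))  # an exact copy is caught
print(catalogue.is_spam(b"Lunch at noon?"))        # unrelated mail passes
```

Only exact byte-for-byte copies match in this sketch, which is the limitation several commenters raise below.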

304 comments

  1. mail by alanak · · Score: 0, Offtopic

    if only there were a service like this for junk snail mail.

    1. Re:mail by jonathonc · · Score: 1

      Just remember to ensure that you return all postal junk mail that has a return paid envelope back to the sender. Preferably with something in it, like decaying vegetable matter, or worse.

    2. Re:mail by FrostedWheat · · Score: 1

      >Preferably with something it in, like decaying vegatable matter, or worst.

      50 hours free on AOL.

    3. Re:mail by Anonymous Coward · · Score: 0

      Just remember to ensure that you return all postal junk mail that has a return paid envelope back to the sender. Preferably with something in it, like decaying vegetable matter, or worse.

      The telephone book is always a good idea. That is unless you can find something heavier!

      The postage is based on the weight of what you send back, so the heavier it is, the more they have to pay!

  2. Idiotic by Anonymous Coward · · Score: 1, Insightful

    90% of spam I get has a subject like:

    "New pill reduces debt! 513456"

    So, a message digest won't work.

    1. Re:Idiotic by Anonymous Coward · · Score: 0


      You're correct. Most current SPAM (80%) cannot be caught with signature-based methods.

      There are many interesting and not obvious problems with SPAM, actually.

      Now it's time for SPAM:

      I've spent some time writing a slightly unusual SPAM filter: http://www.spafi.com

      It works pretty well for myself and for some of my friends. It is free.

    2. Re:Idiotic by Weh · · Score: 1
      i read the eula of your software, this:

      By entering into this Agreement, using the Service,
      you grant Paul Tchistopolskii and its affiliates a perpetual,
      nonexclusive, world-wide, royalty-free, irrevocable and fully
      sublicensable right and license to store, copy, distribute,
      display and otherwise use your e-mail Headers, and any or all
      information contained therein, for any purpose whatsoever.


      caused me some concern, why don't you give me any sense of security about what you're planning on doing with my email headers ?
    3. Re:Idiotic by ldeviator · · Score: 1

      It's not based on subject but message content.

  3. SpamBouncer by joib · · Score: 5, Informative

    I'm personally using SpamBouncer, a procmail-based spam filter. Works fine for me.

  4. Anyone know where these people live? by oman_ · · Score: 1, Insightful

    Just curious.. has anyone compiled a list of known spammers and their home addresses?

    --
    Rats would be more funny if they could fart.
    1. Re:Anyone know where these people live? by Pr0n+K1ng · · Score: 0, Funny

      I believe that Nazi timothy is doing something like this. Check his journal for more info.

      --

      Oh well, back to downloading pr0n...

      Pr0n K1ng

    2. Re:Anyone know where these people live? by Malc · · Score: 2

      And what good will that be? Are you planning some vigilante action?

    3. Re:Anyone know where these people live? by npietraniec · · Score: 1

      Mod this up, coffee almost just came shooting out of my nose.

    4. Re:Anyone know where these people live? by Anne+Thwacks · · Score: 1

      Perhaps we could find a good use for any cruise missiles left over from Afghanistan.

      --
      Sent from my ASR33 using ASCII
    5. Re:Anyone know where these people live? by Anonymous Coward · · Score: 0

      You want to be careful with that. You should say "Tom Cruise missile" instead. Otherwise the US government will call you a terrorist.

    6. Re:Anyone know where these people live? by JThaddeus · · Score: 1

      Ah, the good old days when you could set up a crontab to mail some bozo a core dump every 10 minutes...

      --
      "Love is a familiar; Love is a devil: there is no evil angel but Love." --William Shakespeare ('Love's Labors Lost')
    7. Re:Anyone know where these people live? by OsamaBinLogin · · Score: 1

      We should compile a hit list of spammers and their addresses, the top 42 of them, on a hit list website. Tell them to stop sending spam or we'll shoot them with a high powered water gun through their kitchen window. Then, as we shoot them, and CNN (conservative news network) reports it nationally, cross each one off the list.

      They'll get the message.

      --
      Marketing-driven companies end up over-marketing their products. Engineering-driven companies end up over-engineering
    8. Re:Anyone know where these people live? by Weh · · Score: 1

      since when is CNN conservative? I thought it was more left-wing if anything.

  5. Great use of p2p by astrashe · · Score: 5, Insightful

    This is a great use of p2p -- something that doesn't involve piracy. I wish I had heard of it before.

    Are there any other innovative non-piracy p2p apps out there that we should know about?

    1. Re:Great use of p2p by Anonymous Coward · · Score: 1, Funny

      P2PQ is a distributed question/answer network.

    2. Re:Great use of p2p by Anonymous Coward · · Score: 1, Informative

      Reptile is a distributed publishing agent.

    3. Re:Great use of p2p by __aawsxp7741 · · Score: 2, Interesting

      How about Freenet? Can be (ab)used for piracy, of course, but neither is that its purpose, nor does it seem its current main use.

    4. Re:Great use of p2p by Sarcasmooo! · · Score: 5, Informative

      Just because most people on a P2P network use it for piracy, it doesn't become a pirate-app. I can, and have, used programs that are under attack by the RIAA to download speeches, text documents, etc. At the early point of the 2000 Nader campaign, when he couldn't get 30 seconds of time on M$NBC (much less a place in the debates later on), I used Napster and Scour to find speeches he'd given. And when the Department of Commerce kicked off its 'Safe Harbor' privacy program by failing to put the confidential information provided by the companies involved on a secure site, I downloaded the pages in a zip file despite the site being closed for a fix. Using programs like Scour, I found reading material on scientology, COINTELPRO, and more, all the way up until the day that lawsuits shut them down.

    5. Re:Great use of p2p by addaon · · Score: 1

      "non-piracy p2p apps"

      How about Morpheus? Or this other one you may have heard about, Napster? Yeah, I've heard rumors that people used them for pirating; I don't know, I used them daily, and never pirated anything. In any case, how is the app itself a piracy app?

      --

      I've had this sig for three days.
    6. Re:Great use of p2p by LionKimbro · · Score: 2

      Not yet, but there will be relatively soon.

      I anticipate that P2P networks will be good as a Free Software server publishing mechanism.

      For example, you download a game, and it uses some popular publishing mechanism for finding or publishing where a game server is.

      I'd REALLY like to see a game construction kit that allows you to easily share your sprites and sounds with others around the world.

      I mean, just think about anything that you can create and share with others...

  6. So... by DagSverre · · Score: 5, Interesting

    ...what stops this from being abused? Say I set up a box that automatically reports all mails on the most popular mailing lists as spam, effectively making the ISPs around the world start to filter out the mailing lists...

    It's a great initiative, I really hope no troll out there takes my word on this and actually does this.

    1. Re:So... by OverCode@work · · Score: 1

      Seems like everyone hates spam with a passion, except maybe the spammers themselves, and from what I've gathered they're generally pretty clueless. Why would people mess up one of the few effective lines of defense?

      Maybe I'm just being naive, but I think this could actually work.

      -John

    2. Re:So... by Angry+White+Guy · · Score: 1

      Then maybe we could create a distributed agent to detect and report the false spam signatures, then block the offenders. Then when people start mis-reporting offenders, we could write a distributed client.....

      Or maybe we could add a trust model? All spam signatures are stamped with the IP or MAC, then you only update from somebody you trust, or somebody that your peers trust. Anything else is asking for trouble, and this would allow smaller networks and limit (not eliminate) points of entry for malicious users.

      AWG
      But what do you do when you can't trust anyone?

      --
      You think that I'm crazy, you should see this guy!
    3. Re:So... by danielpavel · · Score: 1

      Your trust model idea sounds a lot like PGP... And it works just fine, IMO (I use it on a regular basis for professional contacts); so this should work as well. Of course, you'd still need some central database, where everyone would start from...

    4. Re:So... by cascino · · Score: 1

      An idea is to simply keep a tally of how many times (and by how many unique ip's) each entry in the database has been entered - and then only screen all spam addresses above a certain threshold. No user moderation necessary.
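A minimal sketch of that tally, assuming reports arrive as (signature, reporter IP) pairs. The class name `ThresholdCatalogue` and the default threshold of 3 are arbitrary choices for illustration, not anything Razor defines.

```python
from collections import defaultdict


class ThresholdCatalogue:
    """Lists a signature only after enough distinct reporters have submitted it."""

    def __init__(self, min_reporters: int = 3):
        self.min_reporters = min_reporters
        # signature -> set of reporter IPs, so repeat reports from one IP count once
        self.reporters = defaultdict(set)

    def report(self, signature: str, reporter_ip: str) -> None:
        self.reporters[signature].add(reporter_ip)

    def is_listed(self, signature: str) -> bool:
        # .get avoids creating empty entries for signatures nobody reported
        return len(self.reporters.get(signature, ())) >= self.min_reporters


tally = ThresholdCatalogue(min_reporters=3)
for ip in ["10.0.0.1", "10.0.0.1", "10.0.0.2"]:
    tally.report("sig-a", ip)
print(tally.is_listed("sig-a"))  # two unique reporters so far, below the threshold
tally.report("sig-a", "10.0.0.3")
print(tally.is_listed("sig-a"))  # a third unique reporter crosses it
```

A single box flooding the catalogue never gets past one vote here, although someone who controls many addresses still could.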

    5. Re:So... by onepoint · · Score: 1

      >>Why would people mess up one of the few effective lines of defense.

      No offense, people might get pissed off at someone and effectively kill that person's e-mail capabilities.

      Now do I like the idea? Yes, very much so. I was always hoping for something like this. I can see the effectiveness of a well patrolled spam mail catcher.

      I wish the development team the best of luck.

      -onepoint

      --
      if you see me, smile and say hello.
    6. Re:So... by Anonymous Coward · · Score: 0

      Uh huh. You suggest a way for the network to be disrupted, then you come on Slashdot and post it in the hopes that TROLLS won't read it. You -must- be blind.

    7. Re:So... by Greyfox · · Score: 4, Insightful

      Spammers themselves are generally interested in ways to disrupt those lines of defense. If this project grows in popularity and shows itself to effectively block spam, they'll start gunning for it. Considering potential holes in the system before that starts happening really isn't a bad idea.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    8. Re:So... by dev0n · · Score: 4, Insightful

      Seems like everyone hates spam with a passion, except maybe the spammers themselves

      well, i would have to disagree with you on this point.. i work at a web hosting company as the technical support manager, and handling abuse complaints falls into my realm of responsibility... and i have found that a significant number of first time spammers do not KNOW that spam is "wrong", and get quite upset that they were "taken" by companies that send bulk messages on their behalf. i had one gentleman send me an apology letter that actually made me feel sorry for him. he, and many other people on our network, have never been repeat spammers.

      i know that there are many people out there who don't care, but we can't automatically assume that all spammers are evil. some of them are just ignorant.

    9. Re:So... by Anonymous Coward · · Score: 0

      I bet they'll attempt legal means like dmca if they can't do technical.

    10. Re:So... by Zero+Sum · · Score: 1
      Seems like everyone hates spam with a passion, except maybe the spammers themselves, and from what I've gathered they're generally pretty clueless. Why would people mess up one of the few effective lines of defense? Maybe I'm just being naive, but I think this could actually work.

      I use kmail and spam (filter selected) gets piped through ricochet (also Vipul's) and I have received one 'specially crafted' spam which caused my box to send off about a thousand false complaints. That spammer was not incompetent.

      --

      Zero Sum (don't amount to much). [root@localhost]

    11. Re:So... by Suidae · · Score: 2

      Seems like it would be easier to set up a superserver or central server setup similar to Kazaa that requires multiple matching reports from many different sources. That would eliminate the difficulties with trust models (like having to pay certificate providers, and people that obtain certs specifically to poison the data).

      Either way, you need to have spam sigs verified by multiple sources before accepting them.

    12. Re:So... by Erasmus+Darwin · · Score: 2
      "Seems like everyone hates spam with a passion, except maybe the spammers themselves, and from what I've gathered they're generally pretty clueless."

      Given that (some) spammers are smart enough to pull off dictionary attacks for accounts, exploit open relays, and use a non-Internet-based reply method (i.e. something you can't get yanked with an email to abuse@whatever), I question your characterization.

      I hate spammers, and I find what they do reprehensible, but you're deluding yourself if you don't recognize that some people working that side of the fence have technical skill.

    13. Re:So... by Angry+White+Guy · · Score: 1

      That's kind of what I was driving at, in a roundabout way.

      --
      You think that I'm crazy, you should see this guy!
  7. Authentication with servers? by GlassUser · · Score: 5, Insightful

    I read some of the documentation, but I can't find details on a couple of questions. Do the servers authenticate with each other? It was implied, but how deep is it? Are the SHA signatures signed to the originating server (or client/trollbox) too? I think this kind of model is great, but if you don't have some nifty authentication/accountability, it can be wide open for abuse. I'm sure anyone reading slashdot can imagine a vengeful spammer flooding the network with bogus or malicious hashes.

    1. Re:Authentication with servers? by imrdkl · · Score: 1
      imagine a vengeful spammer flooding the network with bogus or malicious hashes

      I don't think this is such a big risk. The typical spam mail is duplicated many times, and is easily identifiable. Its SHA hash, once it's known, could filter it from many of the remaining sites. I often get spam with the exact same header many times.

      Otoh, creating and publishing arbitrary SHA hashes won't really match a single valid mail very often, I guess.

  8. Fabulous Idea! by under_score · · Score: 3, Interesting

    The people who came up with this idea deserve to be considered heroes! This is one of the coolest uses of technology I have seen. (Not to be too gushing: SPAM is a rich man's problem - I hope someone comes up with some cool technological solutions to some of humanity's more basic problems.) I run a server which hosts mail for a number of domains. I haven't yet, cause I just heard of it, but this will be used! There might be some interesting extensions based on possible problems: certain kinds of spam interest certain people. Perhaps a categorization system would be useful so that spam can be filtered based on these categories (for example, some people might like receiving 100 MLM spam messages a day :-P ). Also, there is an (extremely) slim chance that a legit mail might be blocked based on matching hashes. Although this is extremely unlikely, could it be fixed somehow? Finally, some spam comes with very slight differences but is essentially the same spam instance. Chain letters are in a grey area. It would be good to have some heuristic methods of filtering based on content too. I don't know the characteristics of the hashing algorithm used, but perhaps by doing three hashes: start of message, middle of message, and end of message, it may be possible to identify spam even if a small part has been changed. Anyway, just some random thoughts. Kudos again to those who have built this!
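The three-hash idea at the end of that comment could look roughly like this. It is a sketch under stated assumptions, not how Razor works: SHA-1 is assumed as the hash, the message is cut into three equal-length byte ranges, and the helper names `segment_hashes` and `matching_segments` are invented here.

```python
import hashlib


def segment_hashes(text: str, parts: int = 3) -> list:
    """Hash the start, middle and end of a message separately."""
    data = text.encode("utf-8")
    size = -(-len(data) // parts)  # ceiling division so every byte is covered
    return [
        hashlib.sha1(data[i * size:(i + 1) * size]).hexdigest()
        for i in range(parts)
    ]


def matching_segments(a: str, b: str) -> int:
    """Count how many of the corresponding segments have identical hashes."""
    return sum(x == y for x, y in zip(segment_hashes(a), segment_hashes(b)))


original = "A" * 30 + "B" * 30 + "C" * 30
altered = original[:-1] + "X"  # same length, only the last character differs

print(matching_segments(original, altered))   # start and middle still match
print(matching_segments(original, original))  # all three match
```

This only survives edits that keep the length unchanged. Inserting or deleting even one character near the start shifts every later segment boundary, so a real scheme would need boundaries tied to content rather than to fixed offsets.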

    1. Re:Fabulous Idea! by J.J. · · Score: 1

      (Not to be too gushing: SPAM is a rich man's problem - I hope someone comes up with some cool technological solutions to some of humanity's more basic problems.)

      No, spam is a rich man's problem. SPAM is a problem that most definitely spans class boundaries. The world would be a better place without them both.

      Cheers,
      JJ

    2. Re:Fabulous Idea! by mmol_6453 · · Score: 2, Interesting

      I own and operate an ISP, and I will not install this software on my servers, because I refuse to withhold my customers' mail.

      However, I will recommend this software to my customers, so they can use it at their option. That way, they can do what they want. (And I don't get hit with a lawsuit on the off chance a very vital email gets blocked.)

      --
      What's this Submit thingy do?
    3. Re:Fabulous Idea! by Alsee · · Score: 1

      there is an (extremely) slim chance that a legit mail might be blocked based on matching hashes

      Yes, but you probably underestimate the extreme meaning of "extremely" here. The article mentions a 20 character hash code. The characters are probably hexadecimal (restricted to 0123456789ABCDEF). That would be 80 bits. A good rule of thumb is that every 10 bits equals one comma. 80 bits is an eight comma number:
      1,208,925,819,614,629,174,706,176

      That number is the odds against a pair of matching hash codes.

      Assuming you are blocking a million spam signatures, and assuming every person on the planet (six billion) sent one E-mail per day, you would get a false match (blocked E-mail) about once every half million years.

      The "10 bits equals one comma" rule assumes an exact multiple of 10 bits, and is always correct up to 2910 bits. For up to 290 bits it is always "1" immediately followed by a comma.
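The arithmetic in this comment can be checked directly with Python's arbitrary-precision integers. The sketch keeps the comment's own assumptions (20 hexadecimal characters, one million catalogued signatures, six billion mails a day); the helper names are invented for illustration. If the 20 characters are instead 20 raw bytes of a SHA-1 digest, the space is 2**160 and the odds are far longer still.

```python
def comma_count(n: int) -> int:
    """Number of thousands separators when n is written out in full."""
    return f"{n:,}".count(",")


def comma_rule_holds(bits: int) -> bool:
    """True if 2**bits has exactly bits/10 commas (bits a multiple of 10)."""
    return comma_count(2 ** bits) == bits // 10


hex_space = 16 ** 20              # 20 hexadecimal characters
print(hex_space == 2 ** 80)       # same thing as 80 bits
print(f"{hex_space:,}")           # the eight-comma number quoted above

signatures = 10 ** 6              # catalogued signatures (assumption from the comment)
mails_per_day = 6 * 10 ** 9       # one mail per person per day (assumption from the comment)
days_per_false_match = hex_space / (signatures * mails_per_day)
years_per_false_match = days_per_false_match / 365.25
print(round(years_per_false_match))  # on the order of half a million years

# The rule of thumb holds for every multiple of 10 bits up to 2910 and fails at 2920.
print(all(comma_rule_holds(10 * n) for n in range(1, 292)), comma_rule_holds(2920))
```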

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  9. Can it be abused? by LinuxGeek8 · · Score: 1, Interesting

    That's quite interesting.
    But the main question is, can it be abused?
    I'd expect the senders of spam to be wanting this project to be rendered useless, by submitting garbage to the database.

    In return, I guess it is possible to have some sort of moderating system on the submitters of the data, which can filter out most of the abusers.

    --
    Well, don't worry about that. We can get you back before you leave. (Dr. Who)
  10. Stopping bogus entries? by 13013dobbs · · Score: 1, Redundant

    They need 20 words to make the signature. What if some jerk submits a 'spam' signature that contains common words found in normal emails? Or what if these spam words can be used in a non-spam context?

    --

    No replies made to AC posts. Please log in.

    1. Re:Stopping bogus entries? by cwebster · · Score: 2, Informative

      search google for SHA digest, read how it works, then take a good look at your question

    2. Re:Stopping bogus entries? by Anonymous Coward · · Score: 3, Informative
      You don't seem to understand the concept of a hash. A hash function splits a message into fixed-size blocks. The message is padded so that its length is a multiple of the block size. Each block is then mixed into a running internal state by a compression function that's difficult to reverse. For instance, MD5 processes 512 bit blocks and keeps a 128 bit state. Once the last block has been processed, the final state is the message digest, or hash. The chance of two messages having the same hash shrinks exponentially with the length of the hash. The ability of an attacker to find two messages with the same hash depends on the strength of the hash. Hope this clarifies everything.

      .derf

    3. Re:Stopping bogus entries? by cheebie · · Score: 2, Informative

      In the first place, it's not 20 words, it's 20 characters. In the second place, those 20 characters are simply the SHA signature of the offending message. I assume they key on some of the more constant headers and (possibly part of) the body of the text. By the very nature of digital sigs, it would be difficult (impossible?) to key on something like "any post with the word 'carroway' in it".

  11. How about a server frontend approach? by serial+frame · · Score: 3, Insightful
    It would be very neat if this were provided as a free service that acts as a front-end to an existing POP3 account. Simply sign up, provide info like your username, POP3 host (but not password; that can be passed from the service to your POP3 server on log-in for safety reasons). Then, point your favourite mail client at the service's POP3 server, and...voila. Same e-mail, minus the spam.

    Nothing truly insightful here, just speculation from a convenience freak.

    --

    -
    And the Angel said unto me, "These are the cries of the carrots! The cries of the carrots!"
    1. Re:How about a server frontend approach? by crisco · · Score: 2
      Or how about an email client program that logs into your POP mailboxes, downloads mail (without removing it from the mailbox), compares spam signatures and then proceeds to remove spam from the mailbox. Very useful for those of us who don't yet run our own mail servers.

      Might be a little slower for those dial-up users, especially if they are being charged for connect time. But for people with a shell account (I'd love to set a cron job for every hour or so) and an ISP that is unwilling to run a filter, or someone with inexpensive connectivity who would like to reduce spam, it would be a beautiful solution.

      --

      Bleh!

    2. Re:How about a server frontend approach? by Humba · · Score: 0

      Brightmail.com (site is not responding for me now) used to provide just such a free service that worked pretty much as you describe. Alas, last summer it went the way of so many other useful, free services.

      One of its good features (besides getting at least 80% of the spam that hit me) was to send a monthly summary of the mail it filtered with links to retrieve it from their server.

    3. Re:How about a server frontend approach? by budgenator · · Score: 2
      For it to be widely used as a server front end, you would have to convince a lot of network types that it's both effective and secure. Because right now they would view it as another piece of software to config and patch, in an area where they had no software before and no precedent to have any software. Also the legal types tend to worry about bogus claims like Email = free speech, and liability over mistakenly blocked Emails etc. It's easier for them to consider it a user problem.

      About 80% of the spam to our domain gets forwarded to user bitbucket anyways. This is because our domain name is poiuyt.com and a lot of people use it as a FAKE Email domain instead of using example.com; qwerty@poiuyt.com gets tons of spam. Life would be a lot simpler for me if I got off my duff and learned enough about the pop3 protocol to write a script that just found out how many spams to delete and deleted them w/o downloading. Oh well, such is the cost of laziness.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    4. Re:How about a server frontend approach? by serial+frame · · Score: 1
      A wonderful idea, but not everyone runs an open operating system, and readily has available the tools to take advantage of open software.

      Not to mention, my Psion 5mx would have no choice but to be clogged with spam. :(

      --

      -
      And the Angel said unto me, "These are the cries of the carrots! The cries of the carrots!"
    5. Re:How about a server frontend approach? by crisco · · Score: 2
      Yeah...

      I would suggest perl, as it can be made to work on those other operating systems.

      But I don't know about the Psion...

      --

      Bleh!

  12. with distributed counter-attack... by fungus · · Score: 1

    could help the internet to be a better place. :)

  13. What about evil admins? by mehfu · · Score: 1

    Would an evil admin be able to report a "signature" as a source of spam, even if it's not? Probably.

    This project has a great idea behind it, but how does it work? A lot of the mail I get is spam, but the line between spam and wanted mail is often not very distinct.

    This project put a lot of trust to the admins as trusted and intelligent people. I hope this turns out well...

  14. Fighting spam by Brian+Kendig · · Score: 5, Informative

    I'll post my usual public service announcements here:

    SpamCop is a great service for reporting spam; just paste the spam message into the web form, and it'll automatically figure out where the spam came from and send complaints off to the appropriate people.

    The Spam Bouncer is a procmail-based personal spam screening tool. It's got some interesting features, but I haven't used it in a long while.

    The way I avoid spam is to have my mail client screen out any email which contains any of these phrases:

    to be removed
    to be permanently removed
    to get removed
    to get off the list
    to get off this list
    to be taken off
    to remove yourself
    removal instructions
    remove in subject line
    "remove" in subject line
    remove in the subject
    "remove" in the subject
    'remove' in the subject
    S.1618
    S. 1618


    This list by itself catches about 80% of the spam I get.
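As a sketch of what such a mail-client rule amounts to, the phrase list above can be expressed as a case-insensitive substring check. The function name `looks_like_spam` is made up for illustration, and real clients implement this with their own filter syntax rather than Python.

```python
# Phrases copied from the list above, lowercased for case-insensitive matching.
REMOVAL_PHRASES = [
    "to be removed",
    "to be permanently removed",
    "to get removed",
    "to get off the list",
    "to get off this list",
    "to be taken off",
    "to remove yourself",
    "removal instructions",
    "remove in subject line",
    '"remove" in subject line',
    "remove in the subject",
    '"remove" in the subject',
    "'remove' in the subject",
    "s.1618",
    "s. 1618",
]


def looks_like_spam(message_text: str) -> bool:
    """Return True if the message contains any of the listed phrases."""
    lowered = message_text.lower()
    return any(phrase in lowered for phrase in REMOVAL_PHRASES)


print(looks_like_spam("Reply with REMOVE in the subject to be taken off."))
print(looks_like_spam("Patch attached, please review before Friday."))
```

Because it matches anywhere in the text, this check also fires on legitimate list footers and on quoted replies, which is what the replies below argue about.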

    1. Re:Fighting spam by sqlrob · · Score: 2, Informative

      don't forget:

      one time mailing

    2. Re:Fighting spam by invenustus · · Score: 2, Informative
      The way I avoid spam is to have my mail client screen out any email which contains any of these phrases:

      Um, are you on any legitimate mailing lists? Don't those get filtered out? I'd imagine half of Slashdot's readership is on one or more of the Linux development lists. I'm on Yahoo! Groups mailing lists for any number of different interests....
      --
      grep -ri 'should work' /usr/src/linux | wc -l
    3. Re:Fighting spam by kuiken · · Score: 1

      Nice way of blocking all legit e-mail lists as well

      --

      42
    4. Re:Fighting spam by Anonymous Coward · · Score: 0

      make this a filter that only applies to unsorted messages in inboxes (You DO sort your mailing list messages, right?).

      My filter list is much like that, only including common phrases like "XXX" "100% FREE" etc..

    5. Re:Fighting spam by Anonymous Coward · · Score: 0

      Well, most people will have filters for those mailinglists as well, with higher priorities than the spam filters.

      So this *is* a good tip. I'll add it to my filters.

    6. Re:Fighting spam by suwain_2 · · Score: 2, Informative
      I think there's a potential problem with this... Not sure if you'll ever have any actual problems with it, but...

      Suppose you send me mail with the exact text in your post. Now, I don't actually get any spam, but it's not a problem. But let's say I reply, and leave the original text. Suddenly, my mail meets every single criterion that you're filtering.

      --
      ________________________________________________
      suwain_2 :: quality slashdot p
    7. Re:Fighting spam by Thanatopsis · · Score: 2, Interesting

      Not really, you simply change the order in which your filters get checked and filter out legitimate mailing list traffic from SPAM. For example I am a member of various ZDNet lists and development lists. I filter those based on the sender or the from address into my mailbox for them and then I can read them at my leisure.

    8. Re:Fighting spam by FattMattP · · Score: 2

      Also try JunkFilter

      --
      Prevent email address forgery. Publish SPF records for y
    9. Re:Fighting spam by drsoran · · Score: 1

      Not at all. If you're using procmail just put all the legitimate lists above those lines and they'll get dumped into your list folders.
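The ordering argument made here and in the earlier reply, that list rules come first and spam rules only see what is left, can be sketched as a first-match-wins classifier. This is Python rather than a procmail recipe, and the function name, folder names, and addresses are all hypothetical.

```python
def classify(headers: dict, body: str, list_senders: list, spam_phrases: list) -> str:
    """Return the folder for a message; the first matching rule wins."""
    sender = headers.get("From", "").lower()

    # Rule 1: known mailing lists are filed before any spam rule runs.
    for list_address in list_senders:
        if list_address.lower() in sender:
            return "lists"

    # Rule 2: phrase-based spam check, applied only to the remaining mail.
    lowered = body.lower()
    if any(phrase in lowered for phrase in spam_phrases):
        return "spam"

    return "inbox"


lists = ["dev-list@example.org"]           # hypothetical list address
phrases = ["to be removed", "s.1618"]

footer = "To be removed from this list, send mail to the list manager."
print(classify({"From": "dev-list@example.org"}, footer, lists, phrases))
print(classify({"From": "deals@example.com"}, footer, lists, phrases))
```

The same footer text lands in different folders depending only on which rule sees the message first.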

    10. Re:Fighting spam by Restil · · Score: 2

      Ya.. the S.1618 catches a lot. Or just filtering "this is not spam" or "you requested more information" would get a bunch too. :)

      Sad.. I know.

      -Restil

      --
      Play with my webcams and lights here
    11. Re:Fighting spam by mmol_6453 · · Score: 1

      I've noticed that /. often has articles about novelty websites.

      How about a site titled "1001 Ways to Be Filtered as SPAM"?

      --
      What's this Submit thingy do?
    12. Re:Fighting spam by Zero+Sum · · Score: 1
      Suppose you send me mail with the exact text in your post. Now, I don't actually get any spam, but it's not a problem. But let's say I reply, and leave the original text. Suddenly, my mail meets every single criterion that you're filtering.

      Have you ever heard of regular expressions?

      When you reply there will be a "Re:" in front.

      When you forward, a "FWD:". So all you need to do is match on the Subject and correctly anchor your regular expression.

      In this way I never get the same piece of spam twice. It works very well and does not do what you claim. One moment's thought should have dismissed this. Instead it is a +2. Weary, weary...
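One way to read "match on the Subject and correctly anchor your regular expression" is sketched below with Python's `re` module. The helper `build_subject_rule` is hypothetical, and the sample subject is borrowed from an earlier comment in this thread with its trailing number dropped.

```python
import re


def build_subject_rule(spam_subject: str):
    """Compile a pattern that matches the whole Subject line and nothing more."""
    # ^ and $ anchor the match, so a "Re:" or "FWD:" prefix prevents it.
    return re.compile(r"^\s*" + re.escape(spam_subject) + r"\s*$", re.IGNORECASE)


rule = build_subject_rule("New pill reduces debt!")

print(bool(rule.match("New pill reduces debt!")))       # the spam itself
print(bool(rule.match("Re: New pill reduces debt!")))   # a reply is left alone
print(bool(rule.match("FWD: New pill reduces debt!")))  # so is a forward
```

An anchored exact match like this is also defeated by a random number appended to the subject, so it stops repeats of the same spam rather than variants of it.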

      --

      Zero Sum (don't amount to much). [root@localhost]

    13. Re:Fighting spam by csbruce · · Score: 2

      I find that setting aside e-mail that's not actually addressed to me catches a lot of spam.

    14. Re:Fighting spam by Artifex · · Score: 1

      That's a nice screener list, but isn't it too strict? Almost all legitimate mailing lists include removal instructions, now.

      Spamcop is cool, but it only works after the fact, unless you become a paying member. Sure, you can report spammers all day and night, but the biggest ones have their IP space SWIPed to them directly, which decreases Spamcop effectiveness substantially. Also, if you run Outlook, it's very difficult to get the full message out to Spamcop for reporting, anyway.

      Rather than get into what can be done on the client side, I'd like to see more on how to block it on the server side - I have a vanity domain, and I'm going to set up a box with sendmail or postfix or something... I'd like to know what filters work best in that environment.

      My goal is to block the mail as it's trying to hit the server; for example, since the Excite.com admins are refusing to help me reset my password for a free mail account, which is forwarding spam into my vanity domain 30 or 40 a day, I want it to find everything with a from line like @excite.com, or everything with an IP range that matches Excite.com space or a HELO statement with %excite.com% in it, to immediately get a 550 abort line back "sorry, Excite.com admins are refusing to resolve a mail issue with me, and so I'm refusing everything sent through them now. Don't complain to me, complain to them. If you are interested, please read the history of the dispute at my website; my new contact address can be found at the bottom." Or, for everything from PostmasterGeneral (a company that specifically is geared to hosting commercial lists that are supposedly opt-in, but they take their clients' word for it and only remove people when they complain, making them opt-out at best - why would anyone want anything from them?), search for PMGUID (their list tag) in the message, then make the 550 read simply:"Up yours."

      Alternatively, does anyone know how to set up sendmail/postfix/some other linux mail server to bounce offending mail directly to their hosts' operations mail? And I don't mean just abuse@, I mean to their NOC, DNS admin, postmaster, IP admin, AND CEO if I can find it. Surely, if I start clogging their internal mailboxes with "hey... if you don't want to see these mails pile up in your inbox, don't let your customer send them to me - as long as your abuse team continues to ignore me, *you* will get these mails," followed by the spam I am getting, they will have to take me a bit more seriously.

      Any suggestions how to make either alternative work?

      --
      Get off my launchpad!
    15. Re:Fighting spam by adelton · · Score: 1

      And increasingly popular "... cannot be considered spam". It is enough to grep the last 10 or so lines of the mail for this phrase.

  15. Good idea by fritter · · Score: 1

    It seems as though, by using a digest algorithm, it will not only be pretty much immune to deleting valid email, but also good at preventing DoS-type database poisoning by spammers (trying to put in a huge number of randomly generated hashes would likely be detected).

    Of course, it seems that simply randomly changing a few characters in each message could get around this. Suggestions, anyone?

    1. Re:Good idea by Anonymous Coward · · Score: 0

      I was wondering about this too. "Personalized" spam would pretty much bypass this. Someone with a perl script and a bunch of synonyms could produce zillions of similar spams, all of which would slide by this thing.

  16. idea won't work if reaches critical mass by intuition · · Score: 4, Insightful

    Razor catalogs spam by hashing the entire text of the message. Later potential spam is "detected" by hashing entire texts of messages to see if the hash matches any of the existing hashes in the spam catalog.

    To get around this, all a spammer has to do is change or add at least one character in each spam. This would make all the hashes unique, and no spam would be detected.
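
    A minimal Python sketch of that weakness, assuming the signature is a plain SHA-1 over the whole body as described above. The function name and sample text are illustrative only, not Razor's actual code:

```python
import hashlib

def whole_message_digest(body: str) -> str:
    """Return the hex SHA-1 digest of the entire message body."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()

original = "Make money fast! Visit our site today."
mutated = original + " x7"  # a spammer appends a few random characters

# The catalogue holds the digest reported by the first recipient.
catalogue = {whole_message_digest(original)}

print(whole_message_digest(original) in catalogue)  # True: the exact copy is caught
print(whole_message_digest(mutated) in catalogue)   # False: the variant slips through
```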

    1. Re:idea won't work if reaches critical mass by morzel · · Score: 2
      Technically, it would be possible to create hashes for different pieces of the message, which can be combined in one single "signature" to detect potential matches. It would be more complicated for the catalogue server to execute searches, and the answers won't always be absolute (e.g. partial match).

      --
      Okay... I'll do the stupid things first, then you shy people follow.
      [Zappa]
    2. Re:idea won't work if reaches critical mass by DaSyonic · · Score: 2

      Spammers already do this. Both to the subject line and in the email you will often find a series of 6-8 random numbers attached. This does not make it impossible for this plan to work however.

      --

      Linux: Because a PC is a terrible thing to waste.
      James Brents
    3. Re:idea won't work if reaches critical mass by Anonymous Coward · · Score: 0

      You don't seem to be very enlightened either. Hashes of a piece of the message won't match with hashes of the same piece shifted by one byte, so you have to hash pieces starting at every possible position, or at "key" positions as given by some heuristic (words starting with "s" for example). That hack could do the trick. Spammers would have to carefully craft their mails to fool the heuristics, which could be fun to watch.

      The better solution is to measure the distance between two messages by dynamic programming (e.g. the length of a "patch" to go from A to B as given by diff). But it is too expensive to compute a "diff" against all spam for every single incoming mail. Now you get into a very nice research problem: can you speed up an approximation to this by doing preprocessing on both the spam and the mail, preferably using hashes to ensure privacy of the mails? That's where a smart CS grad in your project group comes in handy.
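
      For reference, the dynamic-programming distance could be sketched in Python like this. It uses plain Levenshtein distance rather than diff's patch length, and the 10% threshold is an arbitrary assumption. The nested loop is the reason it is too expensive to run against every known spam:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

def looks_like(known_spam: str, incoming: str, max_ratio: float = 0.1) -> bool:
    """Treat the mail as a variant if the distance is small relative to its length."""
    longest = max(len(known_spam), len(incoming), 1)
    return edit_distance(known_spam, incoming) / longest <= max_ratio
```

      The cost is proportional to len(a) * len(b) for every pair compared, which is where the preprocessing question comes in.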

    4. Re:idea won't work if reaches critical mass by intuition · · Score: 2

      Technically, it would be possible to create hashes for different pieces of the message, which can be combined in one single "signature" to detect potential matches. It would be more complicated for the catalogue server to execute searches, and the answers won't always be absolute (e.g. partial match).

      You would have to define in advance what a "piece" of a message would consist of. Then the spammer simply puts the extra space, unique character, etc. in each "piece" of the message. Then, curiously, morzel is still receiving spams despite his/her modified spam blocking approach.

      The central problem is whatever heuristic they use to define what a spam is, it has to be predefined and well known. This would imply the spammer would have knowledge of said heuristic and would be able to form his emails in such configuration as to avoid detection.

      An AC has replied to your post as well, suggesting an incomprehensible replacement which at one point says "doing preprocessing on both the spam and the mail". OK, buddy, and you are going to force the spammers to properly preprocess their mail so that it will get blocked by the mail server filter......right.

      If you can force people to do preprocessing, a much better (and comprehensible) solution is Hash cash, wherein you force each client to precompute a special value that is costly enough in terms of CPU cycles to deter spamming. This value can be instantly verified by your client, mail server, etc., and the email will be summarily dropped if the value is not of the costly variety. Even if this value had to be checked by the receiver's client itself, if a significant number of clients were configured not to display the email until the value was verified, incentives for sending spam would drop (hopefully to the point where the effort to send the spam outweighs the return to the spammer).
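
      A simplified Python sketch of the proof-of-work idea. It is not the real Hashcash stamp format: the "recipient:nonce" layout, the function names and the 12-bit difficulty are assumptions chosen to keep the example short:

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint(recipient: str, bits: int = 12) -> str:
    """Sender side: try nonces until the stamp's SHA-1 has `bits` leading zero bits."""
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp

def verify(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Receiver side: one hash checks the work and that it was done for this recipient."""
    if not stamp.startswith(recipient + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

      Minting takes about 2^bits hashes on average while verifying takes one, so the cost lands on whoever sends in bulk.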

    5. Re:idea won't work if reaches critical mass by morzel · · Score: 3, Interesting
      It is true that it is not always trivial to pick the pieces in a way that the fragments being hashed start at the same offset, but it isn't always necessary to add extra complexity. Due to the sheer number of copies of the same message being sent by the spammers, it would be quite difficult and time-consuming for them to create a lot of "slight variants" of the same message. Add to that that spammers aren't the only resourceful people on this planet: we can make it difficult for them as well.

      This is how I would do it:

      Strip HTML/markup language, so that we get plain text of the message.

      Strip all "meaningless" characters from the text, keep only alphabetic (or alphanumeric) characters, no spaces or punctuation.

      Uppercase everything.

      We now have one string, with all the meaningful characters of the email, which makes it quite hard for spammers to vary much without mutilating the message they're trying to convey.

      Pick 8 entry points in this string based on the occurrence of a number of well-chosen, predefined two-character combinations that are likely to be found in English text(*) - these need to be defined upfront. There are lots of texts available in Project Gutenberg to analyze to arrive at such a set.

      This is hard: we need to find a good balance between physical location in the string, and the occurrence of the combinations we have defined, so that we can take a broad "sample" of the text. Luckily for us, spammers tend to send long messages :-)

      Now we compute the hash of the fragments, defined by our entry-points and a fixed length. These hashes combined provide a "real big signature" of the spam message. Pick the last two bytes of every hash, and stick them together for a "small signature" that can be used for searching/matching. We need to define our protocol for searching the catalogue in such a way that when a partial match is found using the small signature, we can retrieve the full signature to check further.

      Based on this we have a rating from 0/8 -> 8/8 for the probability of a mail being a spam message. End user settings can define what is destined for the bitbucket, and what goes in your mailbox.
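
      The steps above might be sketched in Python as follows. The bigram list, the 32-character fragment length, the split into eight equal regions and the use of SHA-1 are all assumptions made for illustration; the post leaves those choices open:

```python
import hashlib
import re

BIGRAMS = ["TH", "HE", "IN", "ER", "AN", "RE", "ON", "AT"]  # hypothetical choice
FRAGMENT_LENGTH = 32                                        # hypothetical choice

def normalize(message: str) -> str:
    """Strip markup, keep only alphanumerics, uppercase everything."""
    text = re.sub(r"<[^>]*>", "", message)  # crude tag stripping
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()

def fragment_hashes(message: str) -> list:
    """Hash one fixed-length fragment per region, anchored at a known bigram."""
    text = normalize(message)
    region = max(len(text) // len(BIGRAMS), 1)
    hashes = []
    for i, bigram in enumerate(BIGRAMS):
        start = text.find(bigram, i * region)  # search from the region's start
        if start != -1 and start + FRAGMENT_LENGTH <= len(text):
            fragment = text[start:start + FRAGMENT_LENGTH]
            hashes.append(hashlib.sha1(fragment.encode("ascii")).digest())
    return hashes

def small_signature(hashes: list) -> bytes:
    """Last two bytes of every fragment hash, stuck together."""
    return b"".join(h[-2:] for h in hashes)

def rating(known_spam: str, incoming: str) -> int:
    """Number of fragment hashes (out of at most 8) shared with the known spam."""
    known = set(fragment_hashes(known_spam))
    return sum(1 for h in fragment_hashes(incoming) if h in known)
```

      Changes in markup, spacing, punctuation or case disappear in normalize(), so only edits to the letters inside a sampled fragment lower the rating.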


      In the end, spammers can (and will) try to circumvent these measures, but it would be hard and (hopefully) time-consuming, and it will require them to mutilate their messages to be undetected. Of course, this system only works properly when people are willing to submit spam fingerprints to the catalogue servers.

      Anyway, that's my 0.02 EURO...

      (*)Of course, English isn't the only language being used in spam, but I guess it's the most prevalent here. You can of course apply the same principle to any language. Heck, if you really want to push the envelope, you can try to detect the language (character frequency analysis and checking for very common words).

      --
      Okay... I'll do the stupid things first, then you shy people follow.
      [Zappa]
    6. Re:idea won't work if reaches critical mass by Anonymous Coward · · Score: 0

      Well, you pretty much rewrote "hash pieces starting at key positions as given by some heuristic" using 2 pages, great.

      I guess spammers can defeat this by drawing at random the meaningless adjectives they need (incredible, amazing, great, extraordinary, unsurpassed, remarkable, prodigious, astonishing, marvelous).

      At least if spammers go through all that trouble, it "could be fun to watch".

    7. Re:idea won't work if reaches critical mass by thogard · · Score: 1

      So they need to strip the first three and last three lines of a message if it's over 15 lines long and then do the hash. There are lots of good spam filters from Usenet that have already figured most of this out.
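
      In Python that trimming rule might look like the sketch below. SHA-1 and the function name are assumptions; the line counts come from the post:

```python
import hashlib

def trimmed_digest(body: str) -> str:
    """Hash the body, ignoring the first and last three lines of long messages."""
    lines = body.splitlines()
    if len(lines) > 15:
        lines = lines[3:-3]  # drop the personalized greeting and trailing junk
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()
```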

  17. Hey Taco, reduce your percentage of spam! by Nagash · · Score: 1, Offtopic

    Subscribe to the kernel mailing list.

    Woz

  18. Philanthropic P2P by UM_Maverick · · Score: 1, Offtopic

    Don't forget about Intel's cancer-research P2P system - www.intel.com/cure

    They also have info there about Stanford's protein-folding project (http://folding.stanford.edu)

    1. Re:Philanthropic P2P by FleshWound · · Score: 1

      Ummm...that's not P2P...that's distributed computing. Big difference.

  19. Yes I've posted this before but by 4444444 · · Score: 3, Interesting

    I love costing spammers real money: just go to
    http://goto.com
    and do a search for "bulk email" each link you click will cost the scumbags that sell spam software or spamming services several dollars each
    Also I love this new technology I wish all isp's would use it

    and for more spam fighting ideas please check out
    http://www.lenny.com/spam

    --

    http://Lenny.com
    4 great justice!
    1. Re:Yes I've posted this before but by TMB · · Score: 2

      That goto.com (though it looks like they've changed their name to Overture) link is damn cool... over $8 per click?! Though that only hurts the companies that make the software, not the ones that use it. Still worthwhile though...

      Now I wonder whether they have any limitations for hits from a given IP address? One little perl script could put some of those companies out of business otherwise.... :-)=

      [TMB]

    2. Re:Yes I've posted this before but by bleeeeck · · Score: 2, Interesting
      I love costing spammers real money: just go to http://goto.com and do a search for "bulk email" each link you click will cost the scumbags that sell spam software or spamming services several dollars each

      Here's the link for you lazy people.

      The top few listings are more than $8 each.

    3. Re:Yes I've posted this before but by Idolatre · · Score: 1

      I LOVE hurting those assholes that make the software, and those that make hundreds of dollars selling MY email address on CD :)~

    4. Re:Yes I've posted this before but by Anonymous Coward · · Score: 0

      Now THIS is where we need to get distributed. Admin of a decent network perls up a little script to send an HTTP GET request through overture.com to the top couple sites (hell, all of them... 36 cents adds up fast when there's perl involved).....every hour, 24/7. Maybe have something on an inbound router to filter out the actual pages to keep internal network traffic down. Might be sticky legally, though....

    5. Re:Yes I've posted this before but by Bob+McCown · · Score: 1

      A quick Perl script should be able to do this automatically, and for hours on end...Anyone game?

    6. Re:Yes I've posted this before but by wheany · · Score: 1

      Go now, show the power of a good slashdotting!

    7. Re:Yes I've posted this before but by Anonymous Coward · · Score: 0

      Here's my spamhurt.php file

      <?php
      error_reporting(E_ALL);
      set_time_limit(0);

      $agents = array("Mozilla/4.75 [en] (X11; U; Linux 2.2.16 i686)",
      "Mozilla/4.74 [en] (X11; U; Linux 2.2.10 i686)",
      "Mozilla/4.72 [en] (X11; U; Linux 2.2.12 i686)",
      "Mozilla/4.73 [en] (X11; U; Linux 2.2.14 i686)",
      "Mozilla/4.77 [en] (X11; U; Linux 2.4.3 i686)",
      "Mozilla/5.0 (X11; U; Linux 2.2.16 i686; en-US; 0.7) Gecko/20010105",
      "Mozilla/5.0 (X11; U; Linux 2.2.14 i686; en-US; 0.7) Gecko/20010105",
      "Mozilla/5.0 (X11; U; Linux 2.4.3 i686; en-US; 0.6) Gecko/20001206",
      "Mozilla/4.51 [en] (WinNT; U)",
      "Mozilla/4.72 [en] (WinNT; U)",
      "Mozilla/4.74 [en] (WinNT; U)",
      "Mozilla/4.08 [en] (WinNT; U)",
      "Mozilla/5.0 (Windows; U; Win95; en-US; rv:0.8.1+) Gecko/20010426");

      srand((double)microtime() * 1000000);
      shuffle($agents);
      $agentCount = sizeof($agents) - 1;

      function HTTPGet($url)
      {
      global $agents, $agentCount;
      if(!($fp = fsockopen("www.overture.com", 80))) return FALSE;
      fwrite($fp, "GET $url HTTP/1.0\r\nHost: www.overture.com\r\nUser-Agent: " . $agents[mt_rand(0, $agentCount)] . "\r\n\r\n");
      $html = fread($fp, 100000);
      fclose($fp);
      return $html;
      }

      mt_srand((double)microtime() * 1000000);
      preg_match_all("/<a href=(.*xargs.* ?)>/U", HTTPGet("/d/search/?Keywords=bulk+email"), $urls);
      preg_match_all("/<a href=(.*xargs.* ?)>/U", HTTPGet("/d/search/?Keywords=bulk+mail"), $urls2);
      $urls = array_merge($urls[1], $urls2[1]);
      shuffle($urls);
      $linkCount = sizeof($urls) - 1;

      while(TRUE)
      {
      $html = HTTPGet($urls[mt_rand(0, $linkCount)]);
      if(strstr($html, "HTTP/1.1 302")) echo preg_replace("/^.*Location: http:\\/\\/(.*?\\r\\n).*$/s", "\\1", $html);
      }
      ?>

  20. How do you compute a signature? by cperciva · · Score: 5, Informative

    As far as I can tell from a quick glance at this, it looks like the entire message body is being used to compute the signature. This isn't going to work very well -- over half of the spam I receive is "personalized", and that fraction is growing every day.

    This could work very well, but we need some way of computing signatures which will be invariant across different copies of personalized spam for this to be effective.

    1. Re:How do you compute a signature? by Chagrin · · Score: 2

      If, when creating the signature, you make sure to only use words that are common to spam or dictionary words, you'd be able to avoid the majority of any personalization present.

      --

      I/O Error G-17: Aborting Installation

    2. Re:How do you compute a signature? by FFFish · · Score: 2

      For instance, they could use a Markov chain algorithm to parse their ever-increasing collection of sample spam, and use that to determine the "spamness" of email.

      --

      --
      Don't like it? Respond with words, not karma.
  21. Well.. by Anonymous Coward · · Score: 0

    I'm sorry, but this just doesn't seem all that useful. Is this going to stop spam coming to my Hotmail/ICQMail/other corporately run webmail service, or even the e-mail address from my cable provider? Unfortunately, no. For it to be of any use to most people it would have to be implemented by the companies that are allowing spam through to their users in the first place.

  22. Idea by svara · · Score: 1

    It would be great if common email programs had a function for that - for example, right clicking on a spam mail, choosing "spam", the mail would be moved to a "spam folder" and the relevant info about that email would automatically be submitted to an anti-spam institution of that kind. This would make handling spam a lot easier IMHO.

    1. Re:Idea by Zero+Sum · · Score: 1
      It would be great if common email programs had a function for that - for example, right clicking on a spam mail, choosing "spam", the mail would be moved to a "spam folder" and the relevant info about that email would automatically be submitted to an anti-spam institution of that kind. This would make handling spam a lot easier IMHO.

      I do precisely this with kmail and ricochet.

      --

      Zero Sum (don't amount to much). [root@localhost]

  23. Open for abuse? by robstah · · Score: 2, Insightful

    Although I marvel at the theory and the innovative use of peer-to-peer technology to achieve exemplary aims, I have some concerns about the possibilities of abuse. AFAIK the submission system for spam is not moderated in any way. In fact, only the hash is sent to the server and not a copy of the spam. I am therefore concerned that the system could be abused by someone submitting the hash of a legitimate mail, which would then prevent this email from being received by the other hosts. This could be done by a malicious user to prevent the circulation of Bugtraq items, for instance. And as everyone has different personal opinions about spam and what constitutes it, I think a set of clear guidelines is required, and when submissions are made a copy of the mail should be associated with each one and a human being should moderate the hashes being submitted. Although I have my doubts about the system, if these were put to rest I would have no hesitation in implementing a system like this.

    --
    Rob 'robster' Bradford
    Debian Planet Guy
    We are the apt. You will be packaged. Resistance is futile.
    1. Re:Open for abuse? by Idolatre · · Score: 1

      A legitimate mail wouldn't be sent to multiple people, so it won't have any impact on anyone
      if you start marking your own mail as spam.

      (Unless you are subscribed to mailing lists, then you would have to exclude the mailing lists
      from your filters)

    2. Re:Open for abuse? by Anonymous Coward · · Score: 0

      It is not that bad. If you put razor _after_ your procmail rules for mailinglists.... Good luck for anyone trying to guess what non-mailinglist I'm going to receive.

  24. Re: Distributed spam filter by blibbleblobble · · Score: 3, Insightful

    It does seem like a remarkably sensible system, just getting email clients to talk to each other about the emails they get.

    You can tell if the same email has been sent to hundreds of people (and if you use hashes, you can do that without revealing the email)

    You can click a "this is spam" button when you read an email, and anyone who trusts you (i.e. has your public key in their "trusted filtering friends" list) can look for similar messages and filter them.

    But, there do seem to be a load of problems:
    - Personalised email, as someone already mentioned
    - Privacy problems with letting others into the secrets of your mailbox
    - If you have the original of a message, you can calculate the hash, then see who else got the message (i.e. works for personal mail as well as spam)
    - Relatively easy for malicious users to wrongly label someone as a spammer

    Well worth investigating, though...

  25. ifile - filters out over 90% of my spam by orbman · · Score: 1

    I found ifile - http://www.ai.mit.edu/%7Ejrennie/ifile/ - which learns how to recognize spam from statistics of word usage. I have written some scripts for using it with standard Unix procmail and standard Unix mailboxes. It recognizes over 90% of incoming spam and 99% of legitimate mail, so it works very well for me. According to ifile's author, my scripts will soon be integrated into the ifile package.
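
    To illustrate the general idea of classifying mail by word-usage statistics, here is a small naive Bayes sketch in Python. It is not ifile's code or interface; the class, labels and add-one smoothing are assumptions chosen for brevity:

```python
import math
import re
from collections import Counter

class WordStats:
    """Tiny naive Bayes classifier over word counts (train both labels before classifying)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = Counter()

    @staticmethod
    def tokens(text: str) -> list:
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, label: str, text: str) -> None:
        self.words[label].update(self.tokens(text))
        self.messages[label] += 1

    def classify(self, text: str) -> str:
        vocabulary = set(self.words["spam"]) | set(self.words["ham"])
        best_label, best_score = None, -math.inf
        for label, counts in self.words.items():
            total = sum(counts.values())
            # log prior plus log likelihood of each word, with add-one smoothing
            score = math.log(self.messages[label] / sum(self.messages.values()))
            for word in self.tokens(text):
                score += math.log((counts[word] + 1) / (total + len(vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```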

    1. Re:ifile - filters out over 90% of my spam by madfgurtbn · · Score: 1

      But what about that 1% of real mail that goes to the spam can? I can't risk losing a real email every two or three days, so I'm still going to have to sift through all the spam. If you can get that down to 1 a year or so, then it might be acceptable for my purposes.

      --
      Send lawyers, guns, and money. Dad, get me out of this.
    2. Re:ifile - filters out over 90% of my spam by orbman · · Score: 1

      It goes to ~/mail/spam, which I read once every 2-3 days, so I don't lose any email. I just don't want my computer telling me I've got new mail when I've got new spam ;-)

  26. SpamAssassin uses Razor by wideangle · · Score: 5, Informative
    From http://spamassassin.taint.org/:

    SpamAssassin is a mail filter to identify spam.

    Using its rule base, it uses a wide range of heuristic tests on mail headers and body text to identify "spam", also known as unsolicited commercial email.

    The spam-identification tactics used include:

    • header analysis: spammers use a number of tricks to mask their identities, fool you into thinking they've sent a valid mail, or fool you into thinking you must have subscribed at some stage. SpamAssassin tries to spot these.
    • text analysis: again, spam mails often have a characteristic style (to put it politely), and some characteristic disclaimers and CYA text. SpamAssassin can spot these, too.
    • blacklists: SpamAssassin supports many useful existing blacklists, such as mail-abuse.org, ordb.org or others.
    • Razor: Vipul's Razor is a collaborative spam-tracking database, which works by taking a signature of spam messages. Since spam typically operates by sending an identical message to hundreds of people, Razor short-circuits this by allowing the first person to receive a spam to add it to the database -- at which point everyone else will automatically block it.

    Once identified, the mail can then be optionally tagged as spam for later filtering using the user's own mail user-agent application.

    SpamAssassin requires very little configuration; you do not need to continually update it with details of your mail accounts, mailing list memberships, etc. It accomplishes filtering without this knowledge, as much as possible.

    Call your ISP and ask if they use it.
    1. Re:SpamAssassin uses Razor by portnoy · · Score: 1

      I've just started using spamassassin on my home box. It does a really good job of identifying and classifying the spam -- I recommend it.

    2. Re:SpamAssassin uses Razor by Anonymous Coward · · Score: 0

      Yup.. I've been using this for a while now and I am yet to encounter a mistake in the filtering... although I am also lucky in that I probably receive less spam than most people. :-)

    3. Re:SpamAssassin uses Razor by recklessNomad · · Score: 1

      The Razor technique is definitely best when used in conjunction with other spam identification techniques a la SpamAssassin.

      I was getting 90% spam in my primary mailbox until I started using SpamAssassin. It works great, and is highly recommended.

      I doubt, however, that more than a small subset of SpamAssassin users will manually add caught spam to the Razor's database on a regular basis. I wonder if it's safe/ethical/etc to set it up so that caught spam is automatically added to the Razor...

  27. Sounds tres cool by Saint+Aardvark · · Score: 2
    I came across an ad recently for a commercial system that worked in a similar way; they had a bunch of different pop accounts set up to catch spam, and then created signatures of those messages in real time. You subscribe to their service, and you get an updated list every . Can't remember the name of the company, but I do remember them saying that new spam messages were typically sent out to clients w/in 15 minutes.

    One question about this system that I hope the poster (or someone else using this system) will answer: what's it like on server load? Right now, at the ISP I work at, we're using procmail to filter for spam (check the graphs here: http://selenium.dowco.com/spam/spam.html). It's a good way of doing things, but there are some shortcomings: basically, since it runs on our mailserver, I can't run all the body searches I want; in fact, we had to cut out body searches recently because the load was getting too high and/or email was taking too long to get through. There's some workarounds that I haven't got around to putting in yet (body scanning only when 3k in size, etc), but you can see my point. Anyone?

  28. This is just a temporary solution. by mrsam · · Score: 5, Informative
    Spam generators have been trying to hash-bust these kinds of filters for years now. A four-year-old spam generator automatically appends random junk at the end of the Subject header or at the tail end of the message, in order to defeat the early hash-based spam filters.


    This is probably a 'fuzzy' hash function that should ignore minute variations. However, it goes without saying that if this hash-based spam filter becomes widespread, then the spammers will simply figure out how to hash-bust their way past it.


    To have any hope of working over the long term, this kind of an approach must include the ability to distribute not just the hashes themselves, but the hash function as well, so that the hash function itself can be adjusted, when needed.

    1. Re:This is just a temporary solution. by phkamp · · Score: 1

      We have actually used a signature of "a lot of spaces and 5-7 random characters, possibly in []" at the end of Subject as a very successful spamfilter for over a year.
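
      A rule like that could be expressed as a regular expression. In this Python sketch the threshold of five spaces for "a lot" and the restriction to letters and digits are guesses, since the exact pattern isn't given:

```python
import re

# Five or more whitespace characters, then 5-7 alphanumerics (optionally in
# square brackets) at the very end of the Subject line.
HASH_BUSTER = re.compile(r"\s{5,}\[?[A-Za-z0-9]{5,7}\]?\s*$")

def subject_looks_hash_busted(subject: str) -> bool:
    return HASH_BUSTER.search(subject) is not None
```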

      --
      Poul-Henning Kamp -- FreeBSD since before it was called that...
  29. One way around potential abuse. by chris_7d0h · · Score: 5, Insightful

    To eliminate the situation where one person posts a lot of "incorrect" signatures, a ranking system could be applied.
    The thought goes like this.
    A person submits a signature of "identified" spam mail to a "supernode" for ex. and the submission gets a ranking of 1. Each additional submission (by other users) increases the score by a number.

    This way, there are several classifications which could be used to filter incoming mail. For the mail providers, they could opt for only removing mail matching signatures with a very high score (thus very likely these will be actual spam) or they could filter anything reported.

    The purpose of allowing the use of classifications is that it will take a longer time to get higher scores, since more people have to report the specific spam mail. Some people wish to eliminate anything the least bit suspect, but mileage may vary.

    Do you see a resemblance with the /. moderation?
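
    A rough Python sketch of such a scored catalogue. Counting each reporter only once per signature is an assumption, as are the class and method names:

```python
from collections import defaultdict

class Catalogue:
    """Scores each signature by how many distinct users have reported it."""

    def __init__(self):
        self._reporters = defaultdict(set)  # signature -> set of reporter ids

    def report(self, signature: str, reporter: str) -> int:
        self._reporters[signature].add(reporter)  # repeat reports add nothing
        return self.score(signature)

    def score(self, signature: str) -> int:
        return len(self._reporters.get(signature, ()))

    def is_spam(self, signature: str, threshold: int = 1) -> bool:
        """Each site picks its own threshold: 1 filters anything reported."""
        return self.score(signature) >= threshold
```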

    --
    In a society that believes in nothing, fear becomes the only agenda ~ Bill Durodié
    1. Re:One way around potential abuse. by suwain_2 · · Score: 1
      Furthermore, a user itself could be given a credibility rating. This could be, say, the average number of people who submit the same hash. This can be a "multiplier".

      So, say I've submitted a total of 500 spam messages, and, on average, 12.34 people submit a similar hash. Thus, when I submit a 501st message, the server recognizes me as having a 12.34 submission average. Thus, 12.34 credibility is added, as opposed to one.

      However, you'd have to figure in the number of messages submitted; if I submit one message, and 100 people submit the same one, but then I start flooding it with crap, the server wouldn't give me 100 points.

      You closed by saying "Do you see a resemblance with the /. moderation?" My idea is sorta like karma points, and the +1 bonus you can get after 26 karma.

      --
      ________________________________________________
      suwain_2 :: quality slashdot p
    2. Re:One way around potential abuse. by MindStalker · · Score: 3, Informative

      Why bother. A hash is only going to affect a very specific mail. How often do you get mails that many other people get the same identical mail if it isn't spam. Listservs might be a problem. But I'm sure you could filter for each of your subscribed servs so that they don't get deleted.

    3. Re:One way around potential abuse. by chris_7d0h · · Score: 1

      Mailing lists for example.
      This was pointed out earlier in another post as a potential problem.

      The thing about some kind of scoring system is that it will be very improbable that several spammers will "band together" to submit the sigs of legit mass-mailings. In the unlikely situation that this would actually happen, these people can be identified and their posting access revoked until they mail some admin person explaining their actions. This of course assumes the existence of some kind of super-node which will have some kind of administration (either by ordinary admins or by an early /. type of system where appointed people take turns to do the chores).

      The implementation structure of how moderation will take place is not something I have bothered thinking about. These things have a tendency to work out in a fairly good way as history has shown.

      --
      In a society that believes in nothing, fear becomes the only agenda ~ Bill Durodié
    4. Re:One way around potential abuse. by Suidae · · Score: 2

      It doesn't have to be an MD5/SHA/whatever hash, it can be a signature based on a fuzzy match. The point is, whatever it is, it needs to be submitted by a number of unrelated sites before it's accepted as valid data. Each site can set their own threshold for messages, depending on how much they want to filter.

  30. Bogus hashes won't tag valid mail by morzel · · Score: 4, Informative
    The beauty of a cryptographic hash function is that it's purely one-way: it is very easy to check if two messages are the same (they calculate to the same hash), but it is nearly impossible (or at least very very very hard) to calculate the message for any given hash.

    Injecting random hashes into the network won't result in valid emails being tagged, but can flood/DOS the catalogue machines.

    It would be possible to create hashes for a number of "probable" emails, but diversity in messages is so big, the chances are quite slim to actually stop a legitimate mail.

    --
    Okay... I'll do the stupid things first, then you shy people follow.
    [Zappa]
    1. Re:Bogus hashes won't tag valid mail by JohnPM · · Score: 1

      Er...hello...it's absolutely impossible to derive the message from the hash. The hash may contain millions of times less information than the message. The point of cryptographic hashing is that it should be impractical to fabricate a second message that will produce the same hash.

      Cryptographic signing is the same principle except you encrypt the resulting hash with your private key so that it can be verified using your public key.

      --
      Karma police, I've given all I can, it's not enough, I've given all I can, but we're still on the payroll.
    2. Re:Bogus hashes won't tag valid mail by morzel · · Score: 2
      Hehe... I actually meant that it's very hard to derive a message from the hash. (not the message).

      You are absolutely correct.

      --
      Okay... I'll do the stupid things first, then you shy people follow.
      [Zappa]
    3. Re:Bogus hashes won't tag valid mail by Alsee · · Score: 1

      Injecting random hashes into the network won't result in valid emails being tagged

      Correct, but you misunderstood the attack threat. Injecting the hashes of valid E-mail will block that E-mail. Person to person E-mail would be pretty much immune, but any form of broadcast E-mail is vulnerable - newsletters, mailing lists, alerts etc.

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    4. Re:Bogus hashes won't tag valid mail by morzel · · Score: 2
      Newsletters, mailing lists and alerts all come from "trusted sources". It would/should be trivial to configure your client to accept messages based on certain constraints (email addresses, SMTP servers, PGP signature...)

      E-mail isn't really built for broadcasting: websites and NNTP (to name a few) are far better solutions to accomplish the same goal. Unless the content is personalized, but then injecting hashes wouldn't pose a threat anymore.

      --
      Okay... I'll do the stupid things first, then you shy people follow.
      [Zappa]
    5. Re:Bogus hashes won't tag valid mail by Alsee · · Score: 2

      Newsletters, mailing lists and alerts all come from "trusted sources". It would/should be trivial to configure your client to accept messages based on certain constraints

      Yes, there is a workaround, but it has two major flaws. (1) While you and I may consider it "trivial" to fix, there are a lot of people who find entering an E-mail address to be a challenging task. (2) The effects of this attack are unseen E-mails. You generally don't fix a problem until you notice it exists. Even if you *do* notice you haven't gotten XYZ newsletter in 3 months, it is non-trivial to figure out why.

      P.S. Stealth attack usually refers to an attack that is hard to trace, or that is undetectable until you are harmed. What do you call an attack that harms you, and you have no clue you've been harmed?

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    6. Re:Bogus hashes won't tag valid mail by morzel · · Score: 2
      Tagged mails don't need to be sent to the bitbucket: store them in a separate folder. That way, you won't lose a thing. If a message from someone gets in that folder, you should be able to select it and mark it as "not spam", which will add the source to your profile. (a bit like hotmail works nowadays)

      People who find entering an email address a challenging task most probably won't install this kind of software themselves. They will use it if it's built into their mail client. Adding clear and simple support for marking specific sources/messages would be an obvious feature for such a client.

      --
      Okay... I'll do the stupid things first, then you shy people follow.
      [Zappa]
  31. brightmail? by autopr0n · · Score: 2

    You might be thinking of brightmail. I think that's what they do (too lazy to look it up)

    --
    autopr0n is like, down and stuff.
  32. Why this wont work. by VC · · Score: 1

    Spammers will just modify their spamming programs to very slightly change each message so that they generate completely different hashes.
    Cool idea but won't work. Sorry.

    1. Re:Why this wont work. by glomph · · Score: 2, Informative

      Spammers -have- been doing this for a long time, appending some randomly generated crap characters to the subject line, to avoid hash-recognition.

    2. Re:Why this wont work. by ldeviator · · Score: 1

      It doesn't compute the hash on the subject line, only on the body. Vipul's Razor is therefore immune to this and actually works quite well against it.
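
A rough sketch of what body-only hashing could look like, using Python's standard `email` and `hashlib` modules. This is an illustration of the idea in the comment above, not Razor's actual code: the helper name `body_signature` is hypothetical, and it assumes a simple single-part plain-text message.

```python
import hashlib
from email import message_from_string


def body_signature(raw_message: str) -> str:
    """Hash only the body of a single-part message, ignoring all headers."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()  # a str for single-part messages
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


# Same body, different random junk appended to the subject line.
spam_1 = "Subject: Make money fast xk42\n\nEarn $$$ from home!\n"
spam_2 = "Subject: Make money fast q9z7\n\nEarn $$$ from home!\n"

print(body_signature(spam_1) == body_signature(spam_2))
```

Randomizing the subject line does not change the signature under this scheme, but any change to the body still would.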

  33. Heh, intresting idea by autopr0n · · Score: 2

    I always figured that the major problem with a system like this was randomized messages. I figured a way around it would be to try to make a 'conceptual' hash of the contents, one that tries to analyze the meaning of the text, not just the data.

    The big problem with that is, well, it's not easy :). But redistributing the hash function when spammers figure out the old one is an interesting idea as well. The big problem is with the more technically savvy spammers (yeh, I'm sure there are some out there, unfortunately) who could reverse engineer the hash to figure out what makes it tick.

    --
    autopr0n is like, down and stuff.
  34. Mailwasher by Heem · · Score: 3, Informative

    I'm using Mailwasher and it works well for me. It allows you to preview your message headers and delete, blacklist and 'bounce' anything you don't want to receive. Works well on spam as well as email from your ex-girlfriend.

    --
    Don't Tread on Me
  35. Re:Great use of p2p -- Wont work. by VC · · Score: 2, Interesting

    This won't work. All that will happen is that the spammers will just modify their spam programs to slightly modify each message they send out. This will result in each message having a COMPLETELY different SHA signature.
    Cool idea but won't work. Sorry. Maybe some kind of AI algorithm.

  36. One fix . . . by tmoertel · · Score: 2

    for abusers who report bogus signatures is to count the number of times each signature is reported and only consider a report valid after the count exceeds a threshold value. Real spam mailings would be reported many times each from distinct nodes and would be easy to distinguish from bogus signatures, which wouldn't be as widely reported.
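
The threshold idea above could be sketched roughly as follows. The class name `ReportCatalogue`, the node identifiers and the default threshold of 3 are all assumptions chosen for illustration; nothing here is taken from Razor's actual server.

```python
from collections import defaultdict


class ReportCatalogue:
    """Treat a signature as spam only once enough distinct nodes report it."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # signature -> set of node ids that reported it
        self._reporters = defaultdict(set)

    def report(self, signature: str, node_id: str) -> None:
        # A set means repeat reports from the same node count only once.
        self._reporters[signature].add(node_id)

    def is_spam(self, signature: str) -> bool:
        return len(self._reporters.get(signature, ())) >= self.threshold


catalogue = ReportCatalogue(threshold=3)
for node in ("node-a", "node-a", "node-b"):
    catalogue.report("deadbeef", node)
print(catalogue.is_spam("deadbeef"))  # False: only two distinct reporters so far
catalogue.report("deadbeef", "node-c")
print(catalogue.is_spam("deadbeef"))  # True: three distinct reporters
```

Counting distinct reporters rather than raw reports means a single abuser cannot push a bogus signature over the threshold alone, although one who controls many nodes still could.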

    1. Re:One fix . . . by onepoint · · Score: 1

      Real good idea, but the goal is to prevent the spam from ever reaching the user's box. If you start to add on a qualifier of reports submitted then (I think) a delay occurs. I think I like the idea of peer trust better because in the long run I would be able to filter out the abusers.

      -onepoint

      --
      if you see me, smile and say hello.
  37. Have you looked at Hotmail's new spam filter? by wideangle · · Score: 1

    Don't like Hotmail very much, but their
    spam filter rocks. 100% of spam is binned.

    Today in my unused Hotmail box, I see:
    Inbox 0 (0 new)
    Junk Mail 189 (189 new)

    Why? If you set the option, Hotmail excludes
    anyone not in your address book. In other words,
    anything _NOT_ sent to someone in your address
    book goes directly to the bin.

    So unless you have simon.wong@bulkemail.ca
    in your address book, you should be spam-free.

    ------

    Outlook 2002's new rules manager does the same
    thing. Does anyone know if Eudora, Calypso, or PINE can
    filter by excluding those not in your address book?

    1. Re:Have you looked at Hotmail's new spam filter? by Anonymous Coward · · Score: 2, Interesting

      I wonder if Hotmail is using the same kind of logic. I mean, they allow the user to label which emails the user sees as spam. Then they can set some kind of threshold based on how many labels a signature has received.

      When the threshold is crossed, then the signature will be categorized as spam to all other users.

      It will work beautifully considering how many users they have.

    2. Re:Have you looked at Hotmail's new spam filter? by Anonymous Coward · · Score: 0

      You call that a spam filter? Wrong understanding of the concept. That'll kill a lot of legitimate mail (non-spam, just from anybody you don't happen to have in your address book). Plus faked sender addresses aren't caught that way. I can't believe such crap is marketed as anti-spam. procmail filtering all mail to /dev/null is just as efficient as an anti-spam measure. As is not having e-mail in the first place.

      If you want such a filter, it's trivial to set up using procmail on Unix. The reason it's not in PINE, or other MUA's? The authors must have felt it's fundamentally flawed :-)

      Michael

    3. Re:Have you looked at Hotmail's new spam filter? by quinto2000 · · Score: 1

      I use and love that flexible filtering language, sieve. Designed for Cyrus/IMAP servers, but also works clientside.
      (info about sieve can be found here)

      --
      Ceci n'est pas un post
    4. Re:Have you looked at Hotmail's new spam filter? by wideangle · · Score: 1
      Michael --

      You call that a spam filter?

      Yes. One definition of a filter is:
      "A program or routine that blocks access to data that meet a particular criterion."

      1. Hotmail uses an anti-spam routine.
      2. That routine blocks access to data (spam) ...
      3. that meets a particular criterion (if sender does not exist in address book, filter it)

      That'll kill a lot of legitimate mail (non-spam, just from anybody you don't happen to have in your address book).

      You're right. But considering how easy it is to add someone to your address book, that's not a problem. (Click the "Add to Address Book" button by the person's name.) You only have to do it once.

      Plus faked sender addresses aren't caught that way.

      Nobody I know sends mail to me with a faked sender address. But if they do, see above point.

    5. Re:Have you looked at Hotmail's new spam filter? by Anonymous Coward · · Score: 0
      W: if sender does not exist in address book, filter it

      M: That'll kill a lot of legitimate mail (non-spam, just from anybody you don't happen to have in your address book).

      W: considering how easy it is to add someone to your address book, that's not a problem. [...] You only have to do it once.

      But if you never see the message, how are you going to know that you have to add the address to your whitelist? If you have to look through the "spam" mailbox frequently to make sure nobody new has sent you mail, then you still have to look at all the spam.

      People send me mail from new addresses all the time. Should my spam filter filter out all email replies to my posts on mailing lists, Usenet, or slashdot? What about when I buy or sell something on eBay? Or make an online purchase or open a web account that sends me an email confirmation? What about when my friends get new addresses and send me a change-of-address mail from the new address? What if I meet someone in meatspace and exchange email addresses? All of these things have happened to me in the last month, and a "spam filter" that flagged them as spam would be useless to me.


      While I'm here, I may as well share with the Slashdot community an interesting observation about spam patterns. I post to Usenet fairly regularly. My From: address is not munged, it's user@foo.bar.baz. However, I don't get spam at that address — I get it all at user@bar.baz, which is my "canonical" email address. I assume that the spam harvesters all think that the "foo" is a spamblind, and automatically strip it off without even noticing that that host exists. It's been about a year, and I get well under 1 spam/month to my @foo.bar.baz address. I get far, far more to the @bar.baz address (though there are other places that address could be harvested from, as well). Interesting, no?

    6. Re:Have you looked at Hotmail's new spam filter? by Anne+Thwacks · · Score: 1
      It can't be much good - 50% of spam I get comes from hotmail accounts.

      I forward them all to abuse@hotmail.com, and a robot replies. 6 months later, I get back a note saying they'll do something about it, maybe sometime soon. Another 8 months, and the spammer's account is closed. If only snails were as slow as hotmail's anti spam procedures.

      --
      Sent from my ASR33 using ASCII
    7. Re:Have you looked at Hotmail's new spam filter? by wideangle · · Score: 2
      Yes you need to scan your junk folder occasionally.
      Think of it like dumpster diving -- sometimes you
      discover good stuff in there.

      Seriously, the trick is making legit mail easier to find.
      So in addition to the whitelist, you need more filters:

      1. Filter mail sent from [family addresses] to family folder
      2. Filter mail sent from [friend addresses] to friends
      3. Filter mail sent from [*@work.com] to work
      4. Filter mail with subj [ebay] to ebay
      5. Filter mail with [foreign strings] to junk AND color it Gray
      6. Filter mail with [spam criteria] to junk AND color it Gray
      7. Anything else goes to junk for review
      Key rules are #5 and 6, which color spam appropriately.
      Now it's easier to review your junk folder, because real mail is most likely colored Black.

      Add one more rule, to be checked after you send a _reply_ to a legit msg:

      1. Is recipient in address book? (whitelist) If not, add it.
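
The rule cascade above might be sketched like this. The addresses, the phrase list and the use of a non-ASCII subject as a stand-in for "foreign strings" are all assumptions made up for the example. The first matching rule wins, and the result is a (folder, colour) pair.

```python
from fnmatch import fnmatch

# Hypothetical address book entries and spam criteria.
FAMILY = {"mom@example.org"}
FRIENDS = {"pat@example.net"}
SPAM_PHRASES = ("to be removed", "removal instructions")


def classify(sender: str, subject: str, body: str):
    """Return (folder, colour) for a message; first matching rule wins."""
    sender = sender.lower()
    if sender in FAMILY:                       # rule 1
        return ("family", "black")
    if sender in FRIENDS:                      # rule 2
        return ("friends", "black")
    if fnmatch(sender, "*@work.com"):          # rule 3
        return ("work", "black")
    if "ebay" in subject.lower():              # rule 4
        return ("ebay", "black")
    if not subject.isascii():                  # rule 5 (crude stand-in)
        return ("junk", "gray")
    if any(p in body.lower() for p in SPAM_PHRASES):  # rule 6
        return ("junk", "gray")
    return ("junk", "black")                   # rule 7: left for manual review


print(classify("stranger@example.com", "Hello", "Nice post on Usenet."))
print(classify("bulk@example.com", "Offer", "Reply here to be removed."))
```

Unknown but plausible mail ends up in the junk folder coloured black, so it stands out from the gray likely-spam when the folder is reviewed.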
  38. a solution to abuse by Anonymous Coward · · Score: 0

    i think i got a good idea to add to this.

    seti@home can have abusers on their systems so they build in redundancy, sending the same packet out for analysis. when 1 doesn't go right, they'll resend the thing to 3 others.

    why not have a requirement of a certain # of people sending in an address as spam before it's considered spam? (just an idea, i don't know a lot on how it works or anything, just trying to contribute)

    blue tiger

  39. Unique ID's in most spam by EMIce · · Score: 1

    Most spam I get these days has URLs with unique IDs encoded in them, so they know which email address brought in the user. You'd think this would screw up the hash.

    1. Re:Unique ID's in most spam by EMIce · · Score: 1

      Wait, I should have checked the author's site first. He makes another package called Ricochet that analyzes spam. He probably uses some of the same technology.

      "It (Ricochet) traces the names and addresses of the systems where the spam originated from along with the servers that provide domain name resolution services to these systems (in most cases their ISPs). Then it collects/generates a list of email addresses of tech/billing/admin/abuse contacts of these system and mails them a complaint and a copy of the spam. "

      Ricochet Website

      I bet he uses some of the same techniques as in the Ricochet app, by involving the spam header when generating the hash.

  40. X-YahooFilteredBulk by Malc · · Score: 4, Informative

    I noticed that a lot of spam coming through my Yahoo account had been tagged with the header "X-YahooFilteredBulk". I added this to my Exim system filter and I've gone from 20+ spams a day in my inbox to 2 in a week. Thank you Yahoo!

    Unfortunately, a lot of anti-spam measures (including Exim 3's system filters) only take place after a message has been accepted for delivery. For me, this results in a lot of bounce messages frozen in the queue as they cannot be returned (Hotmail mailbox full, etc). I've switched on features like verifying the sender and the headers, but this doesn't catch them all, and in some cases might even stop some legitimate mail (one of my mailing lists uses incorrect syntax for the "RCPT TO:").

    More effective anti-spam systems need to filter before the message has been accepted. If you wait until afterwards, it is already too late and it is on your system. No, refusing to accept delivery is much more effective IMHO, and forces the MTAs further up the chain to deal with it. They shouldn't have accepted it in the first place! When you get spam, return 550 (or whatever the code is) and let the SMTP client deal with it. In an ideal world, every provider (ISP, or free service like Yahoo) will implement stricter MTAs. If the spam rejection can be pushed far enough up the chain, life for everyone will be easier.

    BTW, according to Philip Hazel (in a reply I received to a question I posed on the Exim mailing list), Exim 4 will offer much more functionality along these lines, including the invocation of C functions after the DATA phase of the SMTP input. I guess this would be the spot to plug in Vipul's Razor, although I don't know what kind of performance hit that would lead to. Mr. Hazel also pointed out that some stupid clients are in contravention of the RFC and will continue to try to deliver a message if they received 5xx after the DATA phase... oh well: they'll be using my bandwidth but they won't be putting any crap on my server.
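
As a sketch of the "reject at SMTP time" idea, the decision made after the DATA phase could look something like the function below. It is plain Python rather than an Exim hook, the name `reply_after_data` and the signature set are hypothetical, and it assumes the check is a SHA-1 lookup against a locally cached set of known spam signatures.

```python
import hashlib


def reply_after_data(body: bytes, spam_signatures: set) -> str:
    """Pick the SMTP reply to send once the message data has been received."""
    digest = hashlib.sha1(body).hexdigest()
    if digest in spam_signatures:
        # A 5xx reply is a permanent failure: the sending side keeps the
        # message and has to deal with it, so nothing lands in our queue.
        return "550 Message rejected: matches a known spam signature"
    return "250 OK"


known_spam = {hashlib.sha1(b"Earn $$$ from home!\r\n").hexdigest()}

print(reply_after_data(b"Earn $$$ from home!\r\n", known_spam))
print(reply_after_data(b"Minutes of Tuesday's meeting\r\n", known_spam))
```

Because the message is refused rather than accepted and bounced later, no bounce message is generated locally and nothing freezes in the queue.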

    1. Re:X-YahooFilteredBulk by sydbarrett74 · · Score: 1

      RE: Yahoo, I think Yahoo does a much better job of filtering spam than Hotmail. Whilst Hotmail lets spam into my inbox and filters legitimate mailing-list posts (which I have added to the safe list, BTW), Yahoo seems to get it right and mostly filter only spam.

      --
      'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman
  41. Virus Scan? by Anonymous Coward · · Score: 0

    Couldn't this same program be used to keep a signature of new email viruses?

  42. a good idea, but... by deander2 · · Score: 3, Interesting



    What stops the spammer from including a unique identifier in each e-mail (such as a count variable), changing the SHA for each e-mail that goes out?

    Just a thought...

    1. Re:a good idea, but... by Animats · · Score: 2
      What stops the spammer from including a unique identifier in each e-mail...

      That's a serious problem with a signature-based spam recognizer. There are spam generators that already make each spam unique. Some just personalize the message. Some add text composed of random phrases to the message. Some append a number to the subject line. Just hashing the text of the message won't work for long.

    2. Re:a good idea, but... by Samrobb · · Score: 1

      Fortunately, there are a number of people who have had to deal with this subject in a different forum - Lycos, Google, AltaVista... search engines deal with a different kind of spam, but it's spam, all the same. Perhaps some current (or former) search engine folks could spend some time trying to improve how they extract relevant text from the message before hashing.
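
One simple form of "extracting relevant text before hashing" is to normalise the message first, so that per-recipient counters and tracking URLs do not affect the signature. The sketch below is an assumption about how that might be done, not a description of Razor or any search engine. The helper name `normalised_signature` is made up, and the normalisation rules are deliberately crude.

```python
import hashlib
import re


def normalised_signature(text: str) -> str:
    """Hash a message after stripping the parts spammers tend to randomise."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs often carry per-recipient IDs
    text = re.sub(r"\d+", " ", text)           # counters and serial numbers
    text = re.sub(r"[^a-z\s]", " ", text)      # punctuation and everything else
    words = text.split()                       # collapses runs of whitespace
    return hashlib.sha1(" ".join(words).encode("ascii")).hexdigest()


m1 = "Hot deal #10452! Visit http://example.com/?id=10452 today"
m2 = "Hot deal #99871! Visit http://example.com/?id=99871 today"

print(normalised_signature(m1) == normalised_signature(m2))
```

This only handles numeric IDs and URLs. Spam that personalises the text or appends random phrases, as described in the comment above, would still produce a different signature.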

      --
      "Great men are not always wise: neither do the aged understand judgement." Job 32:9
  43. I've managed to filter most spam by Rikardon · · Score: 3, Interesting

    I found a clever way to defeat most spam on the webpage of an avid cyclist; unfortunately I can't remember his name or enough information about him to run a Google search and give this method proper attribution. But here goes anyway:

    The key to this method is to realize that most spam has a spoofed "To" address -- RARELY is it addressed directly to you. If you dig in the message headers, you'll usually find it was mailed (or CC'd) to a whole bunch of people at once, for obvious reasons. So you set up your mail filters thusly:

    First, set up a filter allowing any "legal" mailing lists you're on to go to your Inbox.

    Next, a filter to allow any mail sent directly to you (i.e. you@domain.com is in the To or CC lines) to go to your Inbox.

    Finally, a filter that deletes everything else.

    You'd be amazed how effective this is. Since setting this up, I only get maybe one spam message past this system every three or four months.

    Mind you, I also have my email come in via Bigfoot, which has a pretty good spam filter itself. But this has nonetheless proven quite effective.
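
The three filters described above could be sketched with Python's standard `email` module as follows. The function name `route`, the sample addresses and the choice to recognise mailing lists by their To/Cc address are assumptions for illustration. A real setup would more likely be written in the mail client's own filter language.

```python
from email import message_from_string
from email.utils import getaddresses


def route(raw_message: str, my_address: str, list_addresses: set) -> str:
    """Return "inbox" or "trash" using the three rules, in order."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _, addr in getaddresses(headers)}

    if recipients & list_addresses:        # 1. known mailing lists
        return "inbox"
    if my_address.lower() in recipients:   # 2. addressed directly to me
        return "inbox"
    return "trash"                         # 3. everything else


lists = {"some-list@lists.example.org"}
direct = "To: Me <me@example.com>\nSubject: hi\n\nHello!\n"
bulk = "To: someone-else@example.net\nSubject: $$$\n\nBuy now!\n"

print(route(direct, "me@example.com", lists))
print(route(bulk, "me@example.com", lists))
```

Note that legitimate mail sent via BCC would not list your address in To or Cc either, so under these rules it would land in the trash as well.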

    1. Re:I've managed to filter most spam by Spinality · · Score: 1

      I get a ton of spam addressed directly to me. (I think this all stems from a time years ago when I foolishly posted my email address on a bulletin board with a very limited user group. Normally, all the messages scrolled away after a week or so, so I thought I was safe. But unfortunately somebody archived the posts and they wound up somewhere permanent, and the harvesters have had me since then.) :( So though this is a good approach, in my case it doesn't help much.

      --
      -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
    2. Re:I've managed to filter most spam by Rikardon · · Score: 1

      You might consider, in addition to the suggestions I made earlier, using the method Brian Kendig listed above. I'll repost the relevant bit here for your convenience:

      > The way I avoid spam is to have my mail client
      > screen out any email which contains any of
      > these phrases:

      > to be removed
      > to be permanently removed
      > to get removed
      > to get off the list
      > to get off this list
      > to be taken off
      > to remove yourself
      > removal instructions
      > remove in subject line
      > "remove" in subject line
      > remove in the subject
      > "remove" in the subject
      > 'remove' in the subject
      > S.1618
      > S. 1618

      > This list by itself catches about 80% of the
      > spam I get.

      Just make sure you put these filters AFTER the ones allowing your legal mailing lists through =)

      Hopefully, these two methods in tandem should work for you.
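
For what it's worth, the phrase list quoted above boils down to a case-insensitive substring check. A minimal sketch follows; the function name `looks_like_spam` is made up, and the phrases are copied from the quoted list with the differently quoted duplicates collapsed, since the shorter forms already match them as substrings.

```python
# Phrases from the quoted list, lower-cased for case-insensitive matching.
SPAM_PHRASES = (
    "to be removed",
    "to be permanently removed",
    "to get removed",
    "to get off the list",
    "to get off this list",
    "to be taken off",
    "to remove yourself",
    "removal instructions",
    "remove in subject line",
    '"remove" in subject line',
    "remove in the subject",
    '"remove" in the subject',
    "'remove' in the subject",
    "s.1618",
    "s. 1618",
)


def looks_like_spam(message_text: str) -> bool:
    """True if the message contains any of the listed phrases."""
    lowered = message_text.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)


print(looks_like_spam("This mail complies with bill S.1618 Title III."))
print(looks_like_spam("Are we still on for lunch?"))
```

As noted above, this check belongs after the rules that let wanted mailing lists through, since list footers often contain unsubscribe wording of their own.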

    3. Re:I've managed to filter most spam by LiteForce · · Score: 2, Insightful

      This won't work if somebody has sent you a message by way of BCC (Blind Carbon Copy).

      --
      "Be vewy vewy quiet, I'm hunting wuntime ewwors!" - Elmer Fudd
    4. Re:I've managed to filter most spam by Fatal0E · · Score: 1

      Does this muck up legit mailing lists? I'd be lost w/o my bugtraq :)

    5. Re:I've managed to filter most spam by doorbot.com · · Score: 1

      Two comments: I use Eudora on my Mac and I have it set up to filter (to the trash) all mail where the To: field contains some form of "undisclosed recipients". This alone stops 90% of the spam. I'd also like to set up a check on my mail server so that if the reply-to address != the from address, the message is denied at the server. There's got to be a way to do this with Sendmail, but my skills in that department are lacking.

    6. Re:I've managed to filter most spam by psamuels · · Score: 2
      I'd also like to set up a check on my mail server so that if the reply-to address != the from address, the message is denied at the server.

      Don't do this! There are many legitimate uses for Reply-To. Think about it. If Reply-To should always be the same as From, why did the standards even bother to define it?

      Most commonly, some mailing lists set the Reply-To to the list address. This is Considered Harmful, partly because some users have other legit uses for the same field, but some list servers do it anyway.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  44. Re:FP by Anonymous Coward · · Score: 0

    Wrong....look at the timestamp on THIS one

  45. Re:Great use of p2p -- Wont work. by DLG · · Score: 5, Interesting

    >> This wont work. All that will happen is that the spammers will just modify their spam programs to slightly modify each message they send out.

    It will however require them to send each specific message separately rather than sending large cc's or using some sort of relay. That alone is a big step since right now most spammers can get away with sending a single email message and relying on an open relay to retransmit to a larger group.

    Furthermore I have doubts that for the time being this project will concern spammers. In fact I am pretty sure spammers are not really interested in wasting their own time trying to spam people who consider spam a violation. It is more convenient to ignore those people (which is why they don't bother to check if you want spam or not before they send it to you).

    DLG

  46. Re:Great use of p2p -- Wont work. by sporty · · Score: 2

    To a degree, this can work, if the signature were of the text itself. If it was based on long sentences being present within a mail, plus the origin of the mail (based on the connecting IP), this might have a chance.

    Think of it, spammers would have to start hitting multiple mail servers, which creates a lot of overhead and is just silly, to get around this. That and spammers would have to use very very generic text to get by it. Like "Act now. We sell. Porn!. Natalie Portman!" vs "Come see our barely of age teens do really bad stuff."

    --

    -
    ping -f 255.255.255.255 # if only

  47. Virus Detection by doorbot.com · · Score: 5, Interesting

    This seems like it would be a great method for virus detection on a non-Windows machine. For those of you who run *nix mail servers whose mail eventually filters down to Windows clients, it would be nice to have a mail tagged as viral be immediately denied at the server. So I'm assuming all it would take is a smart admin to tag the email as spam, and then it will propagate around to the other servers (less than 1k would transfer!).

  48. Re:Great use of p2p -- Wont work. by sporty · · Score: 2

    Let me make an amendment: a common IP, not necessarily the IP of origin. Someone could be behind NAT :) But then again, the software to figure out the common IP shouldn't be hard...

    --

    -
    ping -f 255.255.255.255 # if only

  49. Spam destruction from a pop3 users side by spagbol · · Score: 1

    I have the problem of many commercial pop3 accounts which supply technical information for the use of end users of our products. Spam chokes up several accounts to such a great degree as to make them unusable. I searched Freshmeat and tried a program called "mailfilter". It goes out to pop3 accounts on the server and deletes mail based on your rules. Believe me, it rules! Spam problem for us is gone. Best wishes to you guys on the other side of the pop3 server with this new technology.

  50. Re:Great use of p2p -- Wont work. by Idolatre · · Score: 2, Informative

    It will however require them to send each specific message separately rather than sending large cc's or using some sort of relay. That alone is a big step since right now most spammers can get away with sending a single email message and relying on an open relay to retransmit to a larger group

    Most spam I get has my real name somewhere in the body of the message, so it doesn't seem like a problem for spammers :(

  51. One flaw, depending on your perspective... by wirefarm · · Score: 4, Interesting

    I spent the last few days hacking together a bulk mailer in perl. I did so with a lot of sensitivity and a bit of trepidation and a lot of social engineering to my employer who wanted to put together a way to send invitations to a party via email, rather than the very expensive snail mail method that we had been using.

    This was emailed to our real customers - our 'A list'. These are the people who get invited to these parties each time - people who come and enjoy the food and drinks, no strings attached.

    But, yet, technically, it *is* bulk email and this first time, unsolicited. A very large percentage of the people responded enthusiastically that they want to remain on the list for this, but a few (8 out of 3500) asked to be removed from the list. One guy seemed annoyed and I typed him a personal apology. (In fact, I doubt that this guy read the email before sending off his remove request.)
    What if that guy had submitted the email as spam to this system?
    In that case, the rest would miss out on coming to a good party.

    I hate spam as much as anyone on slashdot. I was asked to set up a bulk email and found that it could be done in a way that was not offensive in this case. Had it conflicted with my conscience, I would have refused.

    Maybe the system needs some sort of moderation as a filter, too. At least that would allow valid bulk email to survive one trigger-happy end-user.

    Ok, go ahead and tell me that I'm wrong in this...
    Cheers,
    Jim in Tokyo

    --
    -- My Weblog.
    1. Re:One flaw, depending on your perspective... by Spinality · · Score: 1

      You make an interesting point, but I think you have to lose this one. If there were absolutely no legitimate basis for unsolicited bulk email then we'd already have laws against it. The problem, as usual, is that there are pros and cons. But I think we (on /.) have concluded that the cons totally outweigh the pros in this matter. Even though you and your organization may have been on the side of the angels, you could just as well have said this: "We wanted to send party invitations, so we hacked into each of our customers' servers and put a message on their home page. Yeah, lots of crackers are evil, but we were doing this with good intentions, so it was OK."

      If you want to send out party invitations, get your customers to opt-in through your normal commercial channels.

      I would never go to a party announced via spam, even if it were at the Playboy mansion with hot and cold running blondes. If you were my vendor, what you did would have put you on the 'prohibited vendor list' -- we have a strict no-spam policy.

      Others may not agree, but that's how I see it. Needless to say, of course, I'm not flaming you about this -- it sounds like you tried to do the right thing. I'm just saying that, in my company, this would have had direct commercial repercussions against your firm.

      --
      -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
    2. Re:One flaw, depending on your perspective... by Anonymous Coward · · Score: 0


      Just wondering, does your firm find snail mail spam as offensive?

    3. Re:One flaw, depending on your perspective... by Fatal0E · · Score: 2

      Reading your reply I wonder if you have ever run a business. Believe it or not (are you sitting down?) email is a very effective way of keeping your customers in touch with what your company is up to. If you're clever (you're still sitting, right?) those people might even be interested in the products that can supplement the things they already bought! On the other hand, it's my responsibility to take them off those lists at their request. That's a business plan that even predates the internet. Shocking, ain't it? I know!

      Sarcasm aside, it's not much of a leap in logic to assume that people who bought things from you in the past might also be interested in your new products. Most of my vendors that I trust with my money and investment I also trust with my email address. Most, not all. Besides, I like keeping up with their new products. I stay informed that way.

      But isn't that spam, you ask? The answer is no. When Cisco sends me (unsolicited) specs on their firewall after I bought their VOIP gateway I don't take that as an intrusion on my space. When someone from Palm Beach tells me that for 10 grand I could get rich even quicker that is SPAM. See the difference?

      Obviously this is all subjective but I think you're being harsh when you label a party invitation as spam. If I sent you a nice, fancy invitation to my New Years party via snail mail would you call me up and yell at me for sending you junk mail?

      I would never go to a party announced via spam, even if it were at the Playboy mansion with hot and cold running blondes.

      we call this "talkin out (of) your ass" in my neighborhood.

    4. Re:One flaw, depending on your perspective... by Spinality · · Score: 1
      > Just wondering, does your firm find snail mail spam as offensive?

      Good question. It's annoying as hell, and as far as I'm concerned bulk mail is the reason our postal system sucks. It concentrates most of the postal system's resources on the things we usually throw in the trash.

      However, we don't penalize vendors who use bulk mailings for a few reasons.

      1. It costs them (considerable) money to do a mailing. There's thus a built-in incentive to send appropriate, useful, legit information to people who have been selected on some plausible basis. Otherwise they're wasting their money. So there's a chance that a piece of bulk mail might actually be of interest. (Unlike email spam.)

      2. The majority of bulk mailers are legit businesses, and many of them are companies we've actually heard about before, or even done business with. (Unlike email spam.)

      3. There's a long tradition in this country of promotion through bulk mailing. Even though countless AOL mailings may piss me off, it's not sensible to regard bulk mailers as intrinsically evil. (Unlike email spam.)

      4. A good chunk of the bulk mail is coming from companies where we've done business before (due to their selection criteria used in (1) above). So although it may be unsolicited it's not from an unknown source. (Unlike email spam.)

      5. Bulk mail generally identifies the sender, including a valid mailing address. (Unlike email spam.)

      So though a) I waste plenty of time each month going through unwanted bulk mail, b) I can't remember the last time a bulk mailing from an unknown vendor actually generated any business from me, and c) we always opt out from bulk mailing lists and from any sharing of customer data, we still don't penalize vendors who use bulk snail mail. (Unlike email spam.) Not counting expected catalogues from vendors (which though it's bulk mail is not spam), we throw out 99% of the bulk mail that arrives. One piece out of 100 is either funny enough or relevant enough to avoid an instant trip to the dustbin.

      --
      -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
    5. Re:One flaw, depending on your perspective... by Anonymous Coward · · Score: 0

      Actually, the reason bulk mailings are allowed is because the post office profits a lot more from them than from things like first-class mail. Pre-sorted and bundled mailings are much cheaper for them to deliver than an individual letter. Bulk mail actually makes the postal system suck less even though it makes the users' lives suck more.

    6. Re:One flaw, depending on your perspective... by sydbarrett74 · · Score: 1

      My problem with spam is not that it's unsolicited, but that it's mostly irrelevant. After all, when a buddy asks you to join him at the pub for a drink, that is an unsolicited invitation. But most spam is for products and services I care not one jot about. Spammers take the lazy, brute-force approach, instead of doing market research and finding out the interests and wants of potential customers. The result is that I, a male, get constantly bombarded with requests to buy cosmetics, feminine hygiene products, purses, and other shit that I could care less about. And I don't have a girlfriend or wife, so I have nobody else for whom to buy the stuff, either. But this is the more innocuous kind. The truly awful (and majority) portion of spam is for downright illegal or deceptive products. How many emails have we all received about '36% returns on gold futures,' or 'a completely safe, imported marijuana alternative' lately? I know I get those at least once a day. But I guess my big problem with this is that these filters seem to be Band-Aids. We should be stemming the flow of spam at the source, because even with filters, spam still clogs up servers and routers with its detritus.

      --
      'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman
    7. Re:One flaw, depending on your perspective... by sydbarrett74 · · Score: 1

      I forgot to add two suggestions for all the SMTP admins out there: CLOSE YOUR FUCKING OPEN RELAYS!!!!!!!!! -- and -- REQUIRE USERS TO LOG IN TO SMTP SERVERS!!!!!!!!!! In this day and age, there's NO compelling reason to use open relays anymore! About the second point, all potential users of an email server should have a legitimate account on it. Users shouldn't bitch because with any decent client, you can automatically set it up to log you on. But this would cut out a lot of spam. Ditto for news servers. Require users to log in. I lay some of the blame for the spam pandemic on lazy, incompetent system administrators. They should be earning their large salaries and reading up on CERT advisories and implementing those little fixes that, in the aggregate, can really make a difference in making systems more secure.

      --
      'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman
    8. Re:One flaw, depending on your perspective... by The+Pi-Guy · · Score: 1
      Quoth the sender:
      " Ok, go ahead and tell me that I'm wrong in this..."


      You're wrong. ;)

      Ok, mod me down. But before you do, I do actually have something to contribute - you're not wrong. The moderation idea would work, and the end-user should be able to turn it off, or look at the filters themselves.

      Ok, NOW mod me down. Or up. I don't care ;)
      Laters...
      --pi
    9. Re:One flaw, depending on your perspective... by RedHat+Rocky · · Score: 1

      As a customer, allow me to speak to a vendor. Just because I buy a product from you, that certainly does NOT form a relationship, no more than a hand shake makes me your friend. If I want info from your company, I'll ask for it. Otherwise, leave me alone.

      To be blunt, your attitude is that of a spammer, though you may not realize it nor actually participate in the sending of spam.

      Yes, I run a business as well. The attitude you display has nothing to do with running a business, it is rather a poor excuse to make a buck, and, IMNSHO is what is wrong with America today.

      See quotes such as:

      "The customer is always right, even when they're wrong".

      Customers are the life-blood of any business, keeping them happy is the key to success of any business. Respecting them as persons is a good start.

      --
      Anything is possible given time and money.
  52. Some positivism and less bitching please... by tcc · · Score: 3, Funny

    Well, at least it *WILL* filter some of the bad content while leaving the good mail alone. Right now I receive 20 spam mails a day in my hotmail inbox, and the hotmail filter killed *VALID* messages! They keep junk for 2 weeks; I found that out 3 months later because my girlfriend's messages had stopped reaching me for a few days... and she's far from being a spammer.

    There's no perfect solution for spam (aside from killing every single individual who dares to spam people, which unfortunately is still illegal :) ).

    Legislators are too busy removing our civil rights right now to make our lives better (as they should do). So right now, I'd say, ANY technology helping us to reduce spam should be welcomed and helped in a productive way instead of being bashed without even giving it a try. It's an open project and it means that if you can contribute in a POSITIVE way, you should. Else, people, please don't discourage programmers working on something that could eventually come out as being a very good solution.

    --
    --- Metamoderating abusive downgraders since my 300th post.
    1. Re:Some positivism and less bitching please... by befletch · · Score: 1

      right now I receive 20 mails a day of spam in my hotmail inbox and the hotmail filter killed *VALID* messages!

      Hotmail and Yahoo accounts accept WAY too much spam. Which is too bad, because I really like Yahoo's interface, and their reliability is second to none in the two or three years I've used them. I want them to succeed. Hotmail, well, I never got into that system, so I can't say much there.

      What I have found that is very interesting is eiomail.com. US$20/year, so they may never hit a high volume in a world of free email accounts, but no advertising and excellent spam control. They do two things to prevent spam. Firstly, they give you a list of 6 or 8 different spam lists (RBL, MAPS, etc.) that you can choose individually to filter your mail with. Secondly, they use a "target revokable" email scheme, where you create different email aliases for use in different contexts. If you start getting spam to "you-amazon@you.eiomail.com", you know who sold your name on a list.

      Ugh. Sounds like an ad, but no, I don't work there.

      One saving grace about Yahoo; once you have filtered your legit mailing list sources into folders, you can filter all the remaining BCC'd mail into the trash directly. That removes at least 80% of the spam I get there. Now if they would just not raise the 'new mail' flag on my yahoo homepage when the only new mail is in the trash, life would be great.

      --
      If you say, "now I'll be modded down because of X", I'll happily oblige.
    2. Re:Some positivism and less bitching please... by Klaruz · · Score: 2

      Sounds like they use ospam:

      http://omail.omnis.ch/ospam/

      It's qmail only though...

    3. Re:Some positivism and less bitching please... by Kris_J · · Score: 3, Interesting
      What it needs is for someone to set up free email accounts with "nospam" in the domain. myemail@nospam.com, or myemail@yahoo.nospam.com, etc -- then all these new harvest-bots that trim out "nospam" will either get it wrong or discount it completely.

      Just a random thought early on a Sunday morning...

    4. Re:Some positivism and less bitching please... by Anonymous Coward · · Score: 0

      And so would everybody who tries to mail you.

    5. Re:Some positivism and less bitching please... by darkonc · · Score: 2

      I have a friend whose email is something like banganospam@yahoo.com (not her real email addr). She claims that she gets ZERO spam on her account.

      --
      Sometimes boldness is in fashion. Sometimes only the brave will be bold.
  53. What we need by Anonymous Coward · · Score: 0

    For all those of us who use Mickeysoft products, we need such a database driven spam filter. One person says it is spam and the domain/user gets blacklisted, that way we all can help kill spam, not just you linux weenies.

  54. OH NO!! by evilpaul13 · · Score: 2, Funny

    I'll never get another "funny email" from my Mom again!

  55. Not necessarily such a Fabulous Idea! by marxmarv · · Score: 3, Interesting
    The people who came up with this idea deserve to be considered heroes!
    Wouldn't that be BrightLight?
    I don't know the characteristics of the hashing algorithm used, but perhaps by doing three hashes: start of message, middle of message, and end of message, it may be possible to identify spam even if a small part has been changed.
    HTML email provides too many places to hide garbage. Comment tags and unused X- attributes are the obvious ones; finely (or grossly) tweaking COLOR elements, or any number of things done to inlined images, provide an effectively infinite number of variations which will pass any filter based on the usual message digest algorithms.

    Many such tricks can be defeated by only hashing words that appear in some standard dictionary and discarding all else, such that

    <FONT COLOR="#FEFDFA"><BLINK X-515322451412135135>LIVE CO--ED NAKED DRESSED GIRLS, =46REE</BLINK></FONT>
    gets reduced to LIVE NAKED DRESSED GIRLS before hashing. Even then, the smart thing to do is not to block matching mail but to blackhole the sources of matching mail, preferably permanently.
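
The dictionary-filtering idea above can be sketched roughly as follows. This is only an illustration, not code from Razor or from the poster: the tiny `DICTIONARY` set stands in for "some standard dictionary", and the function names `normalize` and `signature` are made up for the example. It strips HTML tags, keeps only runs of letters that appear in the word list, and hashes what is left.

```python
import hashlib
import re

# Hypothetical stand-in for "some standard dictionary"; a real filter
# would load a full word list instead of these few entries.
DICTIONARY = {"live", "naked", "dressed", "girls", "free"}


def normalize(html_body):
    """Reduce a message body to its dictionary words, upper-cased."""
    # Drop anything that looks like an HTML tag, including junk attributes.
    text = re.sub(r"<[^>]*>", " ", html_body)
    # Keep only runs of letters; digits and punctuation act as separators.
    words = re.findall(r"[A-Za-z]+", text)
    return " ".join(w.upper() for w in words if w.lower() in DICTIONARY)


def signature(html_body):
    """SHA-1 hex digest of the normalized body."""
    return hashlib.sha1(normalize(html_body).encode("ascii")).hexdigest()


sample = ('<FONT COLOR="#FEFDFA"><BLINK X-515322451412135135>'
          'LIVE CO--ED NAKED DRESSED GIRLS, =46REE</BLINK></FONT>')
print(normalize(sample))  # LIVE NAKED DRESSED GIRLS
```

Because the tags, the color value, and the `=46REE` fragment never reach the hash, variants that differ only in that padding produce the same signature. The cost is that short messages collapse to very few words, so unrelated mail could collide.
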
    (Not to be too gushing: SPAM is a rich mans problem - I hope someone comes up with some cool technological solutions to some of humanities more basic problems.)
    Humanity's more basic problems are the inability to cope with the concept of a world without scarcity. Would that technology fix that instead of providing the powerful with more ways to create unnatural scarcity.

    -jhp

    --
    /. -- the Free Republic of technology.
    1. Re:Not necessarily such a Fabulous Idea! by Anonymous Coward · · Score: 1

      There's no such thing as a "world without scarcity". Resources can become less scarce thanks to technology, but there will never be an infinite amount of crude oil.

      The argument that eventually technology will go beyond the use of oil, and use fuel cells or solar power or some such doesn't work either. There is a finite amount of hydrogen and oxygen in the universe, and just because it is a huge amount technically doesn't make it not "scarce".

    2. Re:Not necessarily such a Fabulous Idea! by marxmarv · · Score: 1, Offtopic
      There's no such thing as a "world without scarcity".
      Information isn't scarce except for laws that decree it so. Sunlight isn't scarce because we have more of it than we can use.

      The concept of "scarce" applied to an open-ended future is meaningless. Webster's definition of "scarce" (emphasis mine):

      1 : deficient in quantity or number compared with the demand : not plentiful or abundant
      Loosely translated, there exists "enough" of a good when supply exceeds demand. You have no need for oil in 500 years as there's a better-than-even chance you'll be dead by then. The only thing that could inspire demand for oil in 500 years is the progenitor of scarcity, and that is greed (loosely translated, "the drive to acquire more than what one can make legitimate use of").

      Given the enabling technology, it certainly is possible for the average person to have the needs of life and significant creature comforts met with only a modicum of effort (say, 10 hours a month of easy labor).

      Resources can become less scarce thanks to technology, but there will never be an infinite amount of crude oil.
      Thank you, Mr. Cheney. By the time the oil supply runs out, there will be sufficient carbon on the surface to construct "enough" of whatever carbon-based foo we can possibly make use of.
      The argument that eventually technology will go beyond the use of oil, and use fuel cells or solar power or some such doesn't work either. There is a finite amount of hydrogen and oxygen in the universe, and just because it is a huge amount technically doesn't make it not "scarce".
      Sorry, smart guy. Not only are you using a flawed definition of "scarce", but there exists an abundance of hydrogen and oxygen because we can't destroy hydrogen and oxygen without working very hard at it. Furthermore, none of either is lost in the cycle since it essentially returns itself to the source from whence it came:

      Environmental H2O electrolysed to produce H2 and O2: energy + 2H2O -> 2H2 + O2
      H2 and O2 reacted in fuel cell or turbine to produce H2O, vented to environment: 2H2 + O2 -> 2H2O + energy

      For further study, I recommend a web search on "conservation of matter".

      If the Sun will beam enough energy down over a fixed time period (say, a day) to meet the demand of that period (say, a day), with capacity to spare, then there is an abundant energy supply, and therefore any scarcity of energy is due to the human social order imposing scarcity somewhere between the supply point and the demand point.

      Einstein said there's plenty of hydrogen and stupidity in the universe. I leave the conclusions to the reader to draw.

      -jhp

      --
      /. -- the Free Republic of technology.
  56. Re:Great use of p2p -- Wont work. by aminorex · · Score: 1

    A slight modification would fix that problem.
    Hash short segments of the mail. Use higher
    resolution at the beginning and at the end.
    Anything within a certain hamming distance
    matches.
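
One way to read the suggestion above, sketched under several assumptions: the segment sizes (`fine`, `coarse`, `fine_count`), the threshold `max_distance`, and all function names are invented for illustration, and "hamming distance" is taken to mean the number of segment positions whose hashes differ. Short segments give higher resolution at the start and end; longer ones cover the middle.

```python
import hashlib


def split_segments(body, fine=32, coarse=128, fine_count=4):
    """Short segments at both ends of the body, longer ones in the middle."""
    edge = fine * fine_count
    if len(body) <= 2 * edge:
        # Too short to have a separate middle: use fine segments throughout.
        return [body[i:i + fine] for i in range(0, len(body), fine)]
    head, middle, tail = body[:edge], body[edge:-edge], body[-edge:]
    segments = [head[i:i + fine] for i in range(0, edge, fine)]
    segments += [middle[i:i + coarse] for i in range(0, len(middle), coarse)]
    segments += [tail[i:i + fine] for i in range(0, edge, fine)]
    return segments


def fingerprint(body):
    """One truncated SHA-1 per segment."""
    return [hashlib.sha1(s.encode("utf-8")).hexdigest()[:8]
            for s in split_segments(body)]


def distance(fp_a, fp_b):
    """Count differing positions, plus any difference in length."""
    differing = sum(1 for a, b in zip(fp_a, fp_b) if a != b)
    return differing + abs(len(fp_a) - len(fp_b))


def matches(body_a, body_b, max_distance=2):
    return distance(fingerprint(body_a), fingerprint(body_b)) <= max_distance
```

A caveat of this positional scheme: it tolerates substitutions, but an inserted or deleted character shifts every later segment, so a practical version would need content-defined boundaries or some other alignment step.
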

    --
    -I like my women like I like my tea: green-
  57. how I filter spam by scrytch · · Score: 2

    By filtering out mails that contain the phrase "this is not spam"

    --
    I've finally had it: until slashdot gets article moderation, I am not coming back.
    1. Re:how I filter spam by psychosis · · Score: 2

      excellent!!! I'd never thought of that. Bravo!

  58. Cool... by CoolVibe · · Score: 1
    But it's also way different than my anti-spam strategy. Basically, what I do is I just junk everything that comes from AOL, Hotmail, MSN, Yahoo and a whole bunch of other free mail providers, and have a list of people that use these services that _can_ mail me. Cuts down spam a lot for me. I know it is quite rigorous, but hey, it works for me.

    But I'm going to give this a shot too. This sounds seriously cool.

  59. List of server-based spam filter systems by tgeller · · Score: 5, Funny
    A canonical list of server-based spam filtering systems is on the SpamCon Foundation site, along with other sysadmin resources.

    --
    Tom Geller
  60. Foreign spam removal by wideangle · · Score: 5, Informative

    For the many /.ers who:

    a. Use Outlook secretly
    b. Receive loads of foreign spam
    c. Don't know any foreign languages
    d. Don't have any foreign friends
    e. Don't have any friends

    This Outlook rule is for you!

    Apply this rule after the message arrives
    with
    Ô or ¾ or Ç or or É or ½ or Í or ò or Ë or ® or Ä or ã or Ï or Ö or Ô in the subject or body
    delete it
    and stop processing more rules.

    This blocks 99% of foreign spam. Sue Mosher wrote about other effective methods for killing spam in Outlook. Finally, before you reply saying "You dummy, that filter works in any client!" -- You're right.
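
Since the filter is not tied to Outlook, the same check can be sketched as a small function. This is an illustration only; `SUSPECT_CHARS` and `looks_foreign` are names made up here. The characters are the ones listed in the rule (presumably bytes from East Asian encodings displayed as Latin-1), and the one entry that did not survive in the list is left out rather than guessed.

```python
# Characters copied from the rule above; the blank entry in the original
# list is omitted because there is no way to know what it was.
SUSPECT_CHARS = set("Ô¾ÇÉ½ÍòË®ÄãÏÖ")


def looks_foreign(subject, body):
    """True if any listed character appears in the subject or the body."""
    return any(ch in SUSPECT_CHARS for ch in subject + body)


print(looks_foreign("Lunch tomorrow?", "See you at noon."))  # False
```

Note that the list includes letters such as É, Ä and Ö, so legitimate French or German mail would be deleted too. That matches the tongue-in-cheek preconditions above, but it is worth knowing before applying the rule.
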

    1. Re:Foreign spam removal by Anonymous Coward · · Score: 0

      Not being from the USA, I only receive foreign spam (from the US).

  61. an other effective spam stopping method ? by Sarin · · Score: 3, Insightful

    I receive about 40 spam messages in my mail account each day and I run my own mail server (qmail). Someone told me about a very basic spam stopping method. Just remove the mail-account for a couple of weeks and then reconnect it again, you should get less or no spam after that period.

    I receive too many real messages to try this out, and I think most spammers won't bother to actually remove an email address from their database if it doesn't exist. But has someone else tried this with any luck?

    This p2p spam filter sounds really nice and I'm going to give it a try asap. I already "lost" another mail-account in the flood of spam I got on it, so now it forwards all messages to msnbill@microsoft.com (microsoft domain billing address).

    1. Re:an other effective spam stopping method ? by Anonymous Coward · · Score: 0

      what a great idea, advertising to everyone that they should forward their spam to msnbill@msn.com. Instead of stopping the spam you perpetuate it.

    2. Re:an other effective spam stopping method ? by platypus · · Score: 1

      dumb question:

      if you are running your own mailserver, why don't you generate artificial bounces for spam mails?

  62. Re:Great use of p2p -- Wont work. by friscolr · · Score: 4, Interesting
    Maybe some kind of AI algorithm

    everytime spam gets mentioned on slashdot, someone says this, and everytime i respond with the work i've been doing-
    pattern matching spam
    uses word counts and phrase counts from known spam and known good mail to match against incoming mail. requires a certain amount of known spam/not spam, but otherwise it has a good rate of matching spam/not spam and doesn't require the incoming mail to be known at all beforehand.
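
A minimal sketch of the word-and-phrase counting approach described above. It is not the poster's code: treating "phrases" as two-word pairs, the add-one smoothing, and the names `features`, `train` and `spam_score` are all assumptions made for the example. A positive score means the message looks more like the known spam than the known good mail.

```python
import math
import re
from collections import Counter


def features(text):
    """Lower-cased words plus adjacent two-word phrases."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrases = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + phrases


def train(messages):
    """Count how often each feature occurs across a set of messages."""
    counts = Counter()
    for message in messages:
        counts.update(features(message))
    return counts


def spam_score(text, spam_counts, good_counts):
    """Sum of log-likelihood ratios; above zero leans spam, below leans good."""
    spam_total = sum(spam_counts.values())
    good_total = sum(good_counts.values())
    vocab = len(set(spam_counts) | set(good_counts))
    score = 0.0
    for feat in features(text):
        # Add-one smoothing so unseen features do not zero out the ratio.
        p_spam = (spam_counts[feat] + 1) / (spam_total + vocab)
        p_good = (good_counts[feat] + 1) / (good_total + vocab)
        score += math.log(p_spam / p_good)
    return score
```

As the comment notes, the approach needs a reasonable pile of already-classified mail on both sides before the counts mean anything, but it does not need to have seen a particular spam before.
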

  63. What about "good spam"? by mnordstr · · Score: 1

    Let's say Slashdot sends out an email to its users telling them about something very important. The mail every user receives is identical, and most of the users want to receive it.
    Someone wants to abuse the system and reports the message as spam. Now none of the users using razor will receive it.

    Is this possible, or am I just not getting the big picture?

  64. Intentions matter by mgkimsal2 · · Score: 2

    Wow - you're taking that to the extreme. Do you shut out people who approach you in a room to talk to you because you didn't give them permission first? Pretty much the same concept.

    If I email my bills to clients, but they didn't request them first, does that mean it's 'spam' and they don't have to pay it?

    This company had legitimate relationships with current customers. If you can't email a current customer with information about something about your current relationship, then there's something seriously wrong with that definition of 'spam'.

    Hmmm... I guess I'm not allowed to send anyone ANY email ever unless the intended recipient requested it first. If the 'real world' operated like some people want email to operate, the world would be a mighty dull place...

    1. Re:Intentions matter by Spinality · · Score: 1

      I don't think I'm taking this to an extreme, but perhaps I was responding to a different point from the one the original author intended. Let me try an entirely different tack:

      The problem here was that, when this company assembled its email mailing list, it should have been very precise about how the list would be used. When customers provided their email addresses, they should have been able to say "Only respond to my emails," "Send me new product info," "Tell me about parties," "Sell my email address to bulk emailers in Malaysia," etc. So the fault was in how the list was created. By the time they were sending out party invitations, they didn't have a valid mailing list for sending such invitations.

      Now, I agree that it sounds like they proceeded in a reasonable way, and I bet they'll be quite clear in their future list maintenance. But at a philosophical level, was this action spam or not? If you say it wasn't, you've carved out an important exception to our definition of spam. You'd have to say that any vendor can send bulk email if they have a reasonable expectation that you might want those emails. Well, every spammer would claim this.

      This party invitation problem is an extreme case; but it's in the extreme cases that we find the places to draw the line. I'd say that that line was crossed, because email addresses were used for a purpose that hadn't been agreed upon in advance. I think that 'opt in' is the only valid approach to bulk email, because the situation is so badly abused at the moment.

      I hope this position makes more sense to you. I don't think that unsolicited bulk email is at all like talking to people in a room. The assumptions and responsibilities are totally different.

      One other point, though, about sending invoices. If you send an invoice via email, but email hadn't been discussed as an acceptable medium, then no, I don't think your customer has any obligation to pay. IANAL, but there's always lots of steps to go through to set up electronic billing, and that must be for a good reason.

      --
      -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
    2. Re:Intentions matter by Samrobb · · Score: 1

      Question: At what point does it become "bulk" email? 5 people? 10? 20? 50? 100? I've sent out personal party invitations to close to a hundred people; was I a spammer?

      No, I don't think so. I may have been sending unsolicited email to a large number of people, but I had an established, existing relationship with those people already. It would be entirely reasonable for me to presume that, if I had sent them each a personal email message about the event in question, that none of them would have thought it as an inappropriate use of email.

      I think any definition of spam needs to take that into account. We need to differentiate between these two types of email abuse - the blind use of purchased email lists vs. the misuse of legitimately acquired email lists. IMHO, the second is the more disturbing activity, while the first is more annoying, and more deserving of the title "spam".

      --
      "Great men are not always wise: neither do the aged understand judgement." Job 32:9
    3. Re:Intentions matter by Spinality · · Score: 1

      > I may have been sending unsolicited email to a large number of
      > people, but I had an established, existing relationship with those people already.

      I totally agree. It's not the size of the mailing, it's how you got the email addresses. Sending bulk email to people who expect to hear from you is, of course, A-OK.

      I'd like it to be illegal to sell an email list without showing the source of each entry, and without documentation of an opt-in for each entry. There is no situation where I'd want company A to make my information available to company B, unless I explicitly told them they could release my name.

      But that law could never happen, of course, and even if it could it wouldn't protect against international sources.

      --
      -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
  65. Better question by Anonymous Coward · · Score: 0

    I'm in Ottawa and need a job (even contract) Can you help me out ?

  66. I knew my spamcollection would be useful one day by LinuxOnEveryDesktop · · Score: 1

    Having fun right now piping my collection of thousands of spam messages to razor-report...

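    #!/bin/sh

    # Report every regular file in the current directory to Razor.
    # Globbing directly (instead of parsing ls) keeps odd filenames intact.
    for thefile in *; do
        [ -f "$thefile" ] || continue
        razor-report -d < "$thefile"
    done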

  67. Re:Great use of p2p -- Wont work. by DLG · · Score: 2

    I personally have never seen a single spam that has my real name. If I sign up on some website for something then certainly I can't be really surprised if folks from that website opt me in. Giving them my email and name and such is an invitation to receiving email unless they specifically state they will not send anything.

    Much of the spam I do receive is of the type where they are sending mail to all the DLG's out there for instance.

    Also much of the spam I get comes through the email addresses that are on webpages... I in fact will receive the same spam several times a day. The only thing that might change is the subject name. (I have never understood why someone thinks that sending me 20 of the same exact advertisement overnight is wise..)

    In any case, I don't know if this process will reduce all spam for all people, but considering that even with blackholes I still get a sizeable amount of spam, anything is worth trying...

    DLG

  68. A quick glance at the code shows... by imrdkl · · Score: 1
    Vipul does create a hash of the body, after passing it minimally through a tidy() method which removes line-endings, and a very simple method intended to remove signatures, Mail::Internet::remove_sig().

    In general, it seems that the prevailing opinion that this cant stop custom-spam is correct. I suppose that would require some other sorts of additional checks related to sender, etc.
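
The steps described above (strip a trailing signature, remove line endings, hash the body) can be approximated in a few lines. This is a sketch under assumptions, not Razor's Perl code: the `remove_sig` here only imitates the general idea of Mail::Internet's method by looking for a `--` separator within the last few lines, and the `max_lines` default is a guess. SHA-1 is used because its raw digest is 20 bytes, which fits the "20-character" signature mentioned in the story.

```python
import hashlib


def remove_sig(body, max_lines=4):
    """Drop a short trailing signature introduced by a '--' separator line."""
    lines = body.splitlines()
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].rstrip() == "--":
            if len(lines) - i - 1 <= max_lines:
                return "\n".join(lines[:i])
            break
    return body


def tidy(body):
    """Remove line endings so re-wrapping does not change the hash."""
    return body.replace("\r", "").replace("\n", "")


def signature(body):
    """Raw 20-byte SHA-1 digest of the cleaned-up body."""
    return hashlib.sha1(tidy(remove_sig(body)).encode("utf-8")).digest()
```

With this, the same text sent with different line endings or a different short signature hashes identically, while changing even one character of the body itself yields a different digest, which is the custom-spam weakness noted above.
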

  69. Boring by Anonymous Coward · · Score: 0

    Yet another boring subject..

  70. Add one for this: by TomatoMan · · Score: 2

    This ad is produced and sent out by: AdAd Systems, NY, NY 1 1 2 2 2. To be r e m o v e d from our mailing list please email us at
    harold02@musiclover.com.au with r e m o v e in the subject.


    Note the spacing with the word "remove". I wonder if these guys read your post.

    --
    -- http://frobnosticate.com
  71. Perhaps a misunderstanding by Spinality · · Score: 1

    I think you may have misunderstood what I consider spam. If I've given you my email address and have agreed to receive email from you, then of course it's not spam. The question was about unsolicited unexpected email, e.g. messages sent using a mailing list from a third-party source.

    > I wonder if you have ever ran a business

    Yes, I've run my own business since 1980. I've also done marketing, promotion, and support plans for businesses that use direct mail, so I'm familiar with the issues.

    > email is a very effective way of keeping your customers in touch
    > Most of my vendors...I also trust with my email address

    Yes, indeed, email is a great way to keep vendors and customers in touch. But if your customers give you their email addresses and opt-in for mailings, it's not unsolicited email. I agree that this is a great use of technology.

    The whole point of this thread was to discuss whether sending 'nice' spam to people who have not agreed to receive your email is still spam. In my opinion, there is no 'nice' spam.

    > [What if] I sent you a nice, fancy invitation to my New Years party via snail mail...?

    The point is that email is different from snail mail. You wouldn't spend the money to mail me an invitation unless you knew me, or there was some legit reason to invite me (e.g. we live on the same block). Email is swamped with spam -- in my case, 90% of what I receive is spam. Anybody adding to that burden is 'over the line' as far as I'm concerned. I didn't quite get your final comment about talkin' out my ass, but my point is that I follow a strict policy about spam and about telemarketing. I simply will not do business with any spammer, nor will I respond to any telemarketing offer. So no matter what wonderful party invitation might arrive by spam, I wouldn't consider it, as a matter of principle (not that Playboy mansion invitations would be sent via spam!).

    --
    -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
  72. spammer said stopping spam in un-american. by www.sorehands.com · · Score: 2
    I got a spam from one spammer, who gave an 800#.

    I called him, and he was saying that it was un-American to stop spam. In my case, he got the email address from the prairielaw.com website, but it was too expensive for him to pay for advertising.


    Maybe you'd like to discuss it with him.
    Locators, Inc

    888-595-9131 Toll Free

    1. Re:spammer said stopping spam in un-american. by zulux · · Score: 3, Informative

      Watch out! In some cases an 888 or 800 number can act like a 900 number - It can cost you money!

      http://www.bbbsouthland.org/topic110.html
      for more information.

      --

      Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    2. Re:spammer said stopping spam in un-american. by Anonymous Coward · · Score: 0

      I wonder how much advertising cost compared to his 800 phone bill now.

    3. Re:spammer said stopping spam in un-american. by Anonymous Coward · · Score: 0

      He was not too happy when called at 3:00 am his time.

    4. Re:spammer said stopping spam in un-american. by Anonymous Coward · · Score: 0

      Thank you, sir!

  73. Exactly by Anonymous Coward · · Score: 0

    The mail in question has already BEEN delivered. It has used up network bandwidth, and will use bandwidth to deliver the bounce. The failed bounce will waste the time of the postmaster.

    Tis better to dump the connection right away. Me, I have a blanket reject, with a message pointing you to a web page. The web page has a click-thru contract letting you know you will be charged for mail analysis and that you accept the terms to be sued in local (to me) court.

    If they agree, they get added to the 'ok' list. (just like the OK list has my mailing lists etc la)
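
The reject-unless-listed policy above boils down to a lookup made while the SMTP connection is still open. A rough sketch, with everything in it hypothetical: the addresses, the URL, and the function name `smtp_response` are placeholders, and a real setup would hook this into the mail server's own access controls rather than a standalone function.

```python
# Hypothetical 'ok' list: people and mailing lists allowed to deliver.
OK_LIST = {"friend@example.org", "some-list@example.net"}

# Placeholder rejection text pointing at the click-through page.
REJECT = "550 Mail refused; see http://example.invalid/terms to be added"


def smtp_response(envelope_sender):
    """Reply to give for a sender, decided before any message body arrives."""
    if envelope_sender.strip().lower() in OK_LIST:
        return "250 OK"
    return REJECT
```

Refusing during the SMTP dialogue means the sending server, not the recipient's postmaster, ends up holding the failure, which is the point made above about not generating bounces after delivery.
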

    Guess what? I don't get spam. Yet, ppl who REALLY want to use e-mail to talk to me can.

    This works a whole lot better than any other 'system' to date.

  74. Sneakemail by Spunk · · Score: 1

    The Sneakemail service is my favorite spam-fighting tool.

    Log in to Sneakemail every time a website asks for your email address. They give you a unique address (@sneakemail.com) which you then give to the website. At the Sneakemail site, you can configure it to either forward, hold, or trash your mail.

    For instance, you can tell it to trash mail sent to your "RealAudio" address and hold mail sent to your "Google" address. Each time someone sends mail to any of those addresses, it sets up a new filter rule, which you can later change: mail from CmdrTaco to your "Slashdot" address can go through, but j12h31j2hjh@hotmail.com mail to "Slashdot" can be trashed.
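
The per-address, per-sender rules described above can be modeled with a small table. This is only a guess at the behavior as described in the comment, not Sneakemail's implementation; the `Alias` class, its field names, and the example sender addresses other than the one quoted above are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Alias:
    """One disposable address, e.g. the one handed to a single website."""
    label: str
    default: str = "forward"  # one of "forward", "hold", "trash"
    per_sender: dict = field(default_factory=dict)

    def action_for(self, sender):
        """Look up the rule for a sender, creating one from the default
        the first time that sender writes to this alias."""
        return self.per_sender.setdefault(sender.lower(), self.default)

    def set_rule(self, sender, action):
        self.per_sender[sender.lower()] = action


slashdot = Alias("Slashdot")
slashdot.set_rule("j12h31j2hjh@hotmail.com", "trash")
print(slashdot.action_for("cmdrtaco@example.com"))     # forward
print(slashdot.action_for("j12h31j2hjh@hotmail.com"))  # trash
```

Because every website gets its own alias, the label on an alias that starts receiving junk also shows which site leaked or sold the address.
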

    Certainly you can do this yourself if you run your own mailserver, but this is easiest for me.

  75. Opting out by Spinality · · Score: 1

    > On the other hand, it's my responsibility to take them off those lists at their request

    I forgot to comment on this very important point. Opting-out is not a valid mechanism, because as you know many spammers use opt-out responses as a way to maintain their lists of valid email addresses. So opting-out may work for a customer/vendor relationship (but opting-in is just as easy); but it will not work for bulk email received from an unknown party. And even with a known vendor, opting-out can be dangerous, because it would be easy to forge an opt-out mechanism with a legit business name for the purpose of collecting email addresses. "Do you want any more bulk pr0n emails from Microsoft? Click here to remove your name from our list." I've received these, by the way.

    But just to clarify: I don't think these guys were being bad, I just think we need to have a very clear definition of spam that is not dependent on the spammer's intent. I thought that issue was at the heart of the original post.

    --
    -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
    1. Re:Opting out by Fatal0E · · Score: 2

      firstly, I would have emailed you cuz I didnt want to discuss this on /. but since I dont troll I can sacrifice some karma :)

      anyway, I guess the biggest gap in our opinions is over good spam. UCE from respectable, reputable people to me is a good thing. They are the companies I send my money to.

      If someone on the bugtraq list came up with a commercial app that he wanted people on the list to beta test for him I would consider doing it. If he later offered it to subscribers (as before, via the list) at a substantially deep discount I wouldnt mind that either.

      My two examples up there are figurative, as opposed to the literal ones I gave you the first time.

      Victoria's Secret sends catalogs to my g/f, Thinkgeek sends me pamphlets and Cisco sends product announcements and specs. I like em all. I hold them in the same regard. Spam can be good...but not often :)

      Opting-out is not a valid mechanism, because as you know many spammers use opt-out responses as a way to maintain their lists of valid email addresses.

      You are correct, but my point is that for those of us who dont rely on spamming as our sole source of income, it's done responsibly. IOW there isnt a vulture waiting for all those opt out emails to come in so he can sell them at a higher premium. I like to think that for most private corps that do these things (like mine), opt out means opt out.

      from the original post that got us both going...
      This was emailed to our real customers - our 'A list'. These are the people who get invited to these parties each time - people who come and enjoy the food and drinks, no strings attached.

  76. Bulk mail by Spinality · · Score: 1

    This issue gets argued both ways. I've seen lots of claims like yours, that bulk mail is the profit-maker. But I've also seen figures that show first-class mail to be subsidizing bulk mail. I think there's a lot of deliberate obfuscation here. (I've had direct professional involvement with the USPS and with direct marketers at various points, and so I'm talking about well-researched statistics, not the types of claims you'd find in The Onion.) As usual, a good clue is to see who is in the best position to influence the associated policy and laws. The direct marketers have a strong lobby and incredible influence on the postal service. First class mail is less and less important from a political standpoint. Therefore, I choose to believe the statistics that show first class mail to be the victim, because it's in a good position to be a victim.

    But here's another way to look at it. If you got rid of all that bulk mail, you could do the postal service's job with 1/10th the resources. There's a reason that FedEx and UPS can make a profit: We don't mind paying for efficient delivery of the stuff we really want.

    --
    -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
  77. Why not a histogram filter? by javaaddikt · · Score: 1, Insightful

    The best option would be a word count histogram filter. Then the spammer would have to entirely alter their language or sales pitch, which isn't going to happen. Just like handwriting, it is hard to change unless you make a wholesale effort at changing it. They're too lazy, too.

  78. Re:Great use of p2p -- Wont work. by kevinank · · Score: 5, Interesting
    Interesting work, but I notice that you are only examining trigrams, and you are using an even weight factor. To improve selection you probably at least need to use variable weights (a fuzzy logic neural network rather than binary logic) and train the network with more sample spam.

    I've been working on a similar project but using additional factors that help identify spam such as violations of the mail RFC's, and other header indicators, in addition to NLP. I have a prototype that I'm using to score all of my inbox e-mail and am using that to tune the weight factors and add in new factors as I encounter them. It would be interesting to combine your approach with mine I think, since I hadn't thought of analyzing trigrams.

    Anyway, if you are interested send me an e-mail and I'll give you my current perl code.

    --
    LibBT: BitTorrent for C - small - fast - clean (Now Versio
  79. Re:Great use of p2p -- Wont work. by linzeal · · Score: 2, Funny

    I get a lot of email for Ass Hole, Fuck You, Die Spammer, and other such people I've never heard of.

  80. Similar to DCC by bedessen · · Score: 2, Informative

    See also DCC, the distributed checksum clearinghouse. It uses a fuzzy hash so that bulk emails with minor differences are caught. I think the details differ a lot but the idea is more or less the same.

    1. Re:Similar to DCC by jordan · · Score: 1

      Similar to DCC, yes. Matter of fact, same idea. But, the "fuzziness" of DCC's "fuzzy" hash is BS; it's still an MD5. All DCC tries to do is normalize the message before MD5'ing. That isn't fuzzy.

      --jordan

    2. Re:Similar to DCC by vjs · · Score: 1
      Whether the checksum is SHA or MD5 is obviously completely irrelevant to whether the input of the hash is "fuzzy."

      Some people think that SHA may be more secure than MD5. To date that supposed weakness in MD5 is at most a suspicion. For the purposes of detecting spam, it is also completely irrelevant, since the ability of a bad guy to compute collisions is not interesting. It's mostly merely good politics for dealing with people who don't understand or care to think about any relevant threat model to use MD5 or SHA instead of a long CRC. The hash must be long enough to have a probability of collision less than the probability of failures elsewhere, whether in hardware or software. For that you want 64 or 128 bits. There is very common and reasonably fast code to compute MD5, so I chose MD5 for the existing DCC checksums. There is nothing in the DCC protocol that requires the future DCC checksums to use MD5.

      "Normalizing" the message is the essence of "fuzziness." Whether you convert the message to a grammar tree, histogram of words, ignore typical spammer "customizing," or anything else before computing the checksum, you are doing no more or less than "normalizing," at least for any useful meaning of the word I can think of.

      Vernon Schryver vjs@rhyolite.com
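
      The "normalize, then checksum" idea can be sketched in a few lines of Python. This is only an illustration and not the DCC's actual normalization: the function names and the specific rules (lowercasing, dropping digits and punctuation, collapsing whitespace) are assumptions chosen to show how two "customized" copies of one message can end up with the same MD5.

```python
import hashlib
import re


def normalize(body: str) -> str:
    """Reduce a message body to a canonical form before checksumming."""
    text = body.lower()
    # Drop digits, which are often varied per recipient (IDs, counters).
    text = re.sub(r"\d+", "", text)
    # Keep only letters and whitespace, then collapse runs of whitespace.
    text = re.sub(r"[^a-z\s]", "", text)
    return " ".join(text.split())


def normalized_checksum(body: str) -> str:
    """MD5 of the normalized body, as a hex string."""
    return hashlib.md5(normalize(body).encode("utf-8")).hexdigest()


a = "Buy NOW!!!  Offer #12345 ends soon."
b = "buy now! Offer #98765 ends   soon"
same = normalized_checksum(a) == normalized_checksum(b)
```

      Here `same` is true because both bodies normalize to the same string, even though the checksum itself is an ordinary exact hash.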

  81. Re:Great use of p2p -- Wont work. by linzeal · · Score: 1

    Open source philosophy at work, bravo :)

  82. Here's an addy by robogun · · Score: 1

    Send an email to this address: photosport@yahoo.com
    and check out the taunting reply

  83. The death of SpamCop by Animats · · Score: 3, Informative
    I use SpamCop to filter the mail for four domains. SpamCop used to be quite effective, because it used a challenge/response system, sending new mail sources an autoreply E-mail with a URL that had to be visited before the mail was forwarded. While that's a pain for the sender, it's been 100% effective in stopping spam.

    Recently, though, SpamCop switched to a heuristic spam-filter, which is quite leaky. Not only does spam get through, messages from well-known viruses come through. It stops maybe half the spam now.

    So SpamCop is now no more effective than typical procmail filters. So there's no point in paying for SpamCop service any more.

    Anyone know of a good challenge/response alternative to SpamCop?

    1. Re:The death of SpamCop by jayed_99 · · Score: 1

      TMDA
      (Tagged Message Delivery Agent).

      Since I've started using it, I never (literally) see SPAM.

      It has a whitelist and a blacklist. It challenges unknown senders and holds their mail in a pending queue. When you send email, it can generate a new address that's only good for a set amount of time, or only good for the recipient to respond to. It does other neat things as well. It is amazing.

      The only problem that you might find is that you need to use qmail for a lot of the functions to work.
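
      For readers unfamiliar with the pattern, here is a rough Python sketch of a whitelist/blacklist with a pending queue and a challenge token. It is not TMDA's code or configuration; the class `ChallengeQueue` and its method names are hypothetical, and actually sending the challenge mail is left out.

```python
import secrets


class ChallengeQueue:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self, whitelist=(), blacklist=()):
        self.whitelist = set(whitelist)
        self.blacklist = set(blacklist)
        self.pending = {}  # token -> (sender, [held messages])

    def receive(self, sender, message):
        """Return 'deliver', 'drop', or 'challenge:<token>'."""
        if sender in self.blacklist:
            return "drop"
        if sender in self.whitelist:
            return "deliver"
        # Reuse an outstanding token so each sender is challenged only once.
        for token, (who, held) in self.pending.items():
            if who == sender:
                held.append(message)
                return "challenge:" + token
        token = secrets.token_hex(8)
        self.pending[token] = (sender, [message])
        return "challenge:" + token

    def confirm(self, token):
        """Sender answered: whitelist them and release their held mail."""
        sender, held = self.pending.pop(token, (None, []))
        if sender is not None:
            self.whitelist.add(sender)
        return held
```

      After `confirm()` the sender is on the whitelist, so later mail from the same address is delivered without another challenge.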

    2. Re:The death of SpamCop by nicwolff · · Score: 1

      If you can run procmail try my challenge/response script:

      http://angel.net/~nic/spam-x.html

      It just requires new senders to reply to its autoreply, but it's been foolproof so far.

    3. Re:The death of SpamCop by nicwolff · · Score: 1

      Sorry, I meant to link it:

      http://angel.net/~nic/spam-x.html

    4. Re:The death of SpamCop by PigleT · · Score: 2

      "It challenges unknown senders and holds their mail in a pending queue."

      FWIW I find this system pretty stupid. The amount of work that *you* have to do resulting from spam is that you have to press `delete' or deal with it. It is highly unfair to multiply that work off onto all senders of legitimate email - you could reasonably say that that means the spammers have won.

      --
      ~Tim
      --
      .|` Clouds cross the black moonlight,
      Rushing on down to the circle of the turn
    5. Re:The death of SpamCop by Animats · · Score: 2
      The amount of work that *you* have to do resulting from spam is that you have to press `delete' or deal with it. It is highly unfair to multiply that work off onto all senders of legitimate email....

      If that's too much work for a sender, I probably don't want to hear from them anyway. After all, I'm going to have to compose a reply. It's only required on the first e-mail from a new source, so it doesn't bother anyone I hear from regularly.

      Another advantage of the challenge/response system is that it validates the source address. This validates the source of incoming threats.

    6. Re:The death of SpamCop by PigleT · · Score: 2

      "If that's too much work for a sender, I probably don't want to hear from them anyway."

      You're right, you obviously don't want to hear from me, for starters. Tell me *why* I should have to do your work for you?

      "This validates the source of incoming threats."

      No, it validates a chosen outgoing email address of *yours* to someone else who's more likely to change their address and sell yours on as validated.

      --
      ~Tim
      --
      .|` Clouds cross the black moonlight,
      Rushing on down to the circle of the turn
    7. Re:The death of SpamCop by Mike+Van+Pelt · · Score: 2

      It does not challenge "all senders of legitimate email." It challenges (or challenged; according to this thread it apparantly it doesn't do challenges any more) senders of mail from domains that are the sources of significant spam. If you are sending email from AOL, you will get challenged the first time you send email to me. If you are sending mail from a non-spamming company address, you probably won't get challenged.

  84. there are some scripts by 4444444 · · Score: 3, Informative

    you can find some scripts here

    http://www.lenny.com/spam

    --

    http://Lenny.com
    4 great justice!
  85. Anti-Spam with Postfix.. by That+Bajan+Guy · · Score: 1

    I run all my mail through postfix, and utilize the header and body regex checks. My spam level has dropped from a few messages a day to 1 or 2 a week, and those typically come from list serves that I pick up via POP3, not ones that deliver right to my system. Toss in an inline AV filter, and procmail, and my life is pretty much junk free.

    Same applies to my web browsing - Mozilla -> Proxomitron -> Junkbuster -> Squid -> World.

    --
    -- Sapere aude.
  86. Answers to some questions raised on slashdot. by vipul_ved_prakash · · Score: 5, Informative
    Hi,

    Some of you point out that Razor's use of SHA-1 signatures can be defeated by introducing randomness in the message. This is true; SHA-1 will eventually be phased out and replaced by a fuzzy hashing mechanism like nilsimsa. [http://lexx.shinn.net/cmeclax/nilsimsa.html] [http://www.geocrawler.com/archives/3/2539/2001/7/0/6173567/] The protocol is structured to allow hashing algorithms to be changed seamlessly, without breaking the existing system.

    Regarding the possibility of poisoning the database, we are working on a reputation system that will assign credit to honest reporters. Once we have a critical mass of users, it would be hard for dishonest reporters to even join the reporting network, much less be able to mount a DoS attack.

    Some of these issues have been discussed on the razor-users mailing list. The list archives are located at [http://www.geocrawler.com/archives/3/2539/2001/]

    best,
    vipul.
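
    The weakness being conceded here is easy to demonstrate. The Python sketch below hashes a whole message body with SHA-1 (which part of a message Razor actually hashes is not stated in this thread, so that is an assumption) and shows that a few appended junk characters produce a signature the catalogue has never seen. The names `sha1_signature` and `catalogue` are made up for the example.

```python
import hashlib


def sha1_signature(body: str) -> bytes:
    """Return the 20-byte SHA-1 digest of a message body."""
    return hashlib.sha1(body.encode("utf-8")).digest()


original = "Make money fast! Visit our site today."
mutated = original + " xq7z"  # the sender appends a random token

catalogue = {sha1_signature(original)}  # signatures already reported as spam
caught_original = sha1_signature(original) in catalogue
caught_mutated = sha1_signature(mutated) in catalogue
```

    The exact copy is caught and the mutated copy is not, which is the argument for moving to a fuzzy hash.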

  87. Not Gnutella-like at all; it's Napster-like. by jordan · · Score: 2, Interesting

    The comment made in the submission states that Razor is gnutella-like. That is BS too; if anything, it's Napster-like. Razor is a centralized, collaborative filtering system. One could argue that Razor's master servers are distributed and that the entire system is therefore not fully centralized, but this will change shortly to a master/slave model, which will allow the introduction of a reputation management system.

    Keep your eyes peeled.

    --jordan

  88. Sword and shield, bullet and armor... by burbilog · · Score: 1
    ...this fight will continue forever. Anyway, it's very simple to thwart a single hash algorithm by adding junk to the subject/body. So, we need something more reliable. Let's do THIS:

    • Add subject line to the body.
    • Strip down EVERYTHING except words from English/German/Russian/whatever dictionaries. Leave only TEXT.
    • Break these remains into several parts. Let's have three methods -- 20 bytes/chunk, 30 bytes/chunk, 50 bytes/chunk. Calculate a hash for each chunk in each method. Now when we nominate this message as spam and submit it to the p2p network, we pick a RANDOM method and submit its set of hashes.
    • When we receive email, we do the same thing, so we have three hash sets. We search database for matching at least two or three chunks from each method. IF these chunks match, it's most probably spam.

    It will be very difficult to randomize spam to avoid getting two or three chunks of spam text into the database, because you will never know HOW to align text and junk to avoid hashing, and you will soon be out of English words if you try to make different variants of spam with the help of some AI like pornolize.com. Of course, you can insert random words into the text, but then it will be impossible to understand...
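
    The four steps above can be sketched in Python roughly as follows. Everything here is an assumption made for illustration: the function names, the use of SHA-1 for chunk hashes, a plain dict standing in for the p2p database, and a caller-supplied word set standing in for the dictionaries.

```python
import hashlib
import random
import re

CHUNK_SIZES = (20, 30, 50)  # the three methods from the post
MIN_MATCHES = 2             # "at least two or three chunks"


def clean_text(subject, body, dictionary):
    """Prepend the subject, then keep only words found in the dictionary."""
    words = re.findall(r"[a-z]+", (subject + " " + body).lower())
    return " ".join(w for w in words if w in dictionary)


def chunk_hashes(text, size):
    """Hash each full `size`-byte chunk of the cleaned text."""
    data = text.encode("ascii")
    chunks = [data[i:i + size] for i in range(0, len(data) - size + 1, size)]
    return {hashlib.sha1(c).hexdigest() for c in chunks}


def report(subject, body, dictionary, database, rng=random):
    """Submit the hash set for one randomly chosen chunk size."""
    size = rng.choice(CHUNK_SIZES)
    database.setdefault(size, set()).update(
        chunk_hashes(clean_text(subject, body, dictionary), size))


def looks_like_spam(subject, body, dictionary, database):
    """True if any method shares at least MIN_MATCHES chunks with the database."""
    text = clean_text(subject, body, dictionary)
    return any(
        len(chunk_hashes(text, size) & database.get(size, set())) >= MIN_MATCHES
        for size in CHUNK_SIZES)
```

    Note that fixed-size chunks only line up if the cleaned text is identical up to that point, so this sketch relies on the dictionary step removing the junk; inserted real words would shift every later chunk.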

  89. Re:Great use of p2p -- Wont work. by jordan · · Score: 1

    It does work, there's no debating that. The reason is because SPAMers are not yet up against mainstream technophile-type people; the mindless masses still aren't smart enough to use a procmail filter or anything else. In terms of numbers of people reached, the numbers are still in their favor to not even bother modifying their algorithms.

    However, that will change, and so will Razor. Quite soon, in fact, Razor will have a real "fuzzy" match algorithm. Note, not like the bogus "fuzzy" match system that DCC employs, which isn't fuzzy at all but is rather a normalized MD5 (still unique).

    --jordan

  90. I want a SMTP-layer trust metric by Anonymous Coward · · Score: 0

    Yes, I want to be able to grant levels of trust to senders and have it happen at the transaction level when my MTA gets a connection from the sender.

    Anyone I've dealt with myself gets whitelisted, and gets in every time, with no blocklist checks. In the case where someone is unknown to me, if there's a path back to their key from mine with sufficient trust, then I should also allow it in.

    The trick here is that if I have no path back to someone and they're mailing my private address, I should be able to kick back a real-looking (but actually fake) "550 User unknown" to make them think I don't exist.

    This has to happen at the transaction layer, since once you accept the mail, any attempts to bounce it from there (if it's spam) will probably double-bounce right back into your mailbox (you are the postmaster, right?) anyway.

    If I could figure out a good way to convey this in the existing fields like the envelope sender, then it wouldn't be too hard to write as a milter plugin for sendmail. Hopefully someone else has already done this and I won't have to.
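
    The decision logic itself is simple; a hedged Python sketch is below. It assumes the hard part is already solved, namely that the sender's key is somehow known at SMTP time, and it models the web of trust as a plain dict of key to trusted keys. The function names and the `max_hops` limit are illustrative, and this is not a milter.

```python
from collections import deque


def trust_path_exists(trust, me, sender, max_hops=3):
    """Breadth-first search from my key to the sender's key."""
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        key, hops = queue.popleft()
        if key == sender:
            return True
        if hops == max_hops:
            continue  # do not follow trust edges beyond the hop limit
        for neighbour in trust.get(key, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return False


def smtp_reply(trust, me, sender, whitelist):
    """Pick the SMTP reply to give during the transaction."""
    if sender in whitelist or trust_path_exists(trust, me, sender):
        return "250 OK"
    # Unknown and unreachable: pretend the mailbox does not exist.
    return "550 User unknown"
```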

  91. I wouldn't trust this too much. by nstrom · · Score: 1

    I wouldn't trust this too much -- none of the people working on this project appear to be regulars of NANAE. I'd go with DCC over this product -- they seem to do the same thing, and DCC is an already-established project.

    1. Re:I wouldn't trust this too much. by vjs · · Score: 1

      I'm inclined to trust the DCC far more, but only because it is my code. The DCC is completely independent of NANAE. I suspect most DCC users don't know what "NANAE" means.

      Except that both this package and the DCC involve exchanges of checksums, I don't see major similarities between the two. Perhaps that is just my NIH syndrome talking. The DCC has been in use for a bunch of mailboxes since last year.

      I think there is a major problem common to both that I deal with by saying "don't do that." That problem is dealing with bad guys. What happens if a bad guy subscribes to a mailing list you like such as CERT advisories, and submits checksums for those messages? My answer is that if your DCC server accepts checksums from DCC clients not under your personal thumb, then you must whitelist all of your incoming mailing lists because your DCC server only detects "bulkness" and not "unsolicited bulkness."

      If you accept checksums from strangers, then the effectiveness of your system for detecting bulkness increases significantly, but you can't trust people you don't know. Worse, by the time you have a significant number of users, the hassles of bookkeeping force you to assume that at least a few of them are bad guys.

      Then there are mistakes by good guys. What happens if a good guy accidentally submits the checksum(s) of a CERT advisory? The answer for the DCC is the same as for bad guys. If you feed your DCC server with anything except spam traps that cannot receive any legitimate mail, you can consider all hits to be unsolicited bulk email. If you let humans submit checksums of what they think is spam, you cannot trust them to never make mistakes, and so must treat your DCC server as telling you only about "bulkness."
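
      To make the "bulkness, not unsolicited bulkness" distinction concrete, here is a small Python sketch. It is not the DCC protocol or its client API; `BulkCounter`, `classify`, and the threshold are hypothetical names and values. The server only counts reports, and the whitelist is what turns "bulk" into "unsolicited bulk".

```python
from collections import Counter


class BulkCounter:
    """Count how many times each checksum has been reported."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()

    def report(self, checksum):
        self.counts[checksum] += 1

    def is_bulk(self, checksum):
        return self.counts[checksum] >= self.threshold


def classify(checksum, sender, server, whitelist):
    """Whitelisted senders bypass the bulk check entirely."""
    if sender in whitelist:
        return "accept"  # solicited bulk, e.g. a list you subscribed to
    return "reject" if server.is_bulk(checksum) else "accept"
```

      With this arrangement, a bad guy (or a careless good guy) who reports a CERT advisory only raises its bulk count, which does no harm as long as the list is whitelisted.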

      Vernon Schryver vjs@rhyolite.com

  92. Good method, but why use the Inbox at all? by wideangle · · Score: 2
    1. Set a filter that sends "legal" mailing lists to your mailing list folder.
    2. Set another filter that sends friends/family/work/etc to their own folders.
    3. Anything else (spam) gets dumped in the Inbox.

    ------

    If you have Outlook 2002, you can do something similar by whitelisting. "Whitelisting is the opposite of blacklisting. Whereas the latter bans messages from certain senders, whitelisting accepts mail from specific senders."

    "The new feature is an additional Rules Wizard condition: "sender is in Address Book," where you choose the address book--I've chosen my Contacts folder. For a message from a sender found in my Contacts folder, the rule applies a "known sender" category and stops processing the message. The "stop processing" action ensures that the message stays in my Inbox. Another rule at the bottom of the list moves everything that previous rules didn't handle into my Junk Mail folder for later review."

    How do you do this with PINE/procmail? I'd like to stop using Outlook.

  93. Apply it late by Webmonger · · Score: 2

    I don't know how an ISP would accomplish this, but when a user sets it up, it's easy: filter your mailing lists first.

    THEN filter the remaining mail.
    The remaining mail SHOULD NOT contain any mailing lists, or other generic mail, just personal stuff.

    Wait-- here's how an ISP sets it up: don't delete the suspected spam, just add a header. The user's client can filter it, hopefully after it handles mailing-list mail.
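
    A rough Python sketch of that split, using the standard `email` package. The header name `X-Spam-Flag`, the folder names, and both function names are assumptions for the example; the spam test is passed in as a callable so that nothing about any particular detector is implied.

```python
from email.message import EmailMessage


def isp_tag(msg, is_suspected_spam):
    """ISP side: never delete, only add a header the client can act on."""
    if is_suspected_spam(msg):
        msg["X-Spam-Flag"] = "YES"
    return msg


def client_folder(msg, list_addresses):
    """Client side: mailing lists are filed first, the spam flag is checked last."""
    recipients = (msg.get("To", "") + " " + msg.get("Cc", "")).lower()
    for address, folder in list_addresses.items():
        if address in recipients:
            return folder
    if msg.get("X-Spam-Flag", "").upper() == "YES":
        return "Junk"
    return "Inbox"
```

    Because the list rules run first, a list message that was wrongly flagged still lands in its list folder.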

  94. where is all of this heading? by sunhou · · Score: 1

    I occasionally stop to wonder, and think back to the pre-spam days of the internet, and then to the future... We are in the middle of quite an intense evolutionary arms race, the spammers versus the anti-spammers. Whenever the anti-spammers come up with a new trick, the spammers find another way around it.

    What is this system going to look like in another 20 or 50 years? What percentage of general computational resources are going to be devoted to the spam/anti-spam war? Do any of you think any radical revolutionary changes will come along, or the battle will pretty much proceed as it is, just continuing the one-upmanship ad infinitum?

  95. Re:Why not a histogram filter (more) by javaaddikt · · Score: 1

    Let me elaborate a bit...

    Taking a hash or SHA digest is bound to fail. One little character is off and the thing fails. That means customized emails, for example.

    I'm working on a histogram-based filter. You count the number of times each word occurs in an email and you create a hash based upon the top twenty or so words and their occurrence rates. If the spammer changes dear "bob" to dear "fred", bob and fred only occur once or twice at most and are deemed insignificant by the algorithm and do not affect the hash. The more words you accept, the more accurate the fingerprint, although 20 or so seems to be accurate enough. Other configurable parts of the algorithm allow you to bump the word increment once if it occurs two times or three times instead of a one-to-one increment. Minute changes, therefore, will not affect the hash.

    Furthermore, if an exact match is not made, you can keep compiling all histograms by sorting them into groups of like content, and then generate "master" general fingerprints, which can then be used with weights in a fuzzy algorithm to score a message for spamminess. Combined with a threshold (say 50% spamminess) you can decide whether or not you want to reject it. This again is only used if a direct match is not made (that is, if an exact fingerprint has not yet made it into the db).

    This system still plays nice with distributed methods as you are still using a small hash code. If it were employed on one system, it would be easier to keep more detailed records of each hist; not just a hash.
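
    A minimal Python sketch of the exact-match fingerprint described above. It is one possible reading and not the prototype mentioned in this post: the name `histogram_fingerprint`, the parameters `top_n` and `bucket`, and the use of integer division to "bump the increment once per two occurrences" are all assumptions.

```python
import hashlib
import re
from collections import Counter


def histogram_fingerprint(body, top_n=20, bucket=2):
    """Hash the top_n most frequent words together with their bucketed counts."""
    words = re.findall(r"[a-z']+", body.lower())
    counts = Counter(words)
    # Bucket the counts; words seen fewer than `bucket` times drop out entirely.
    bucketed = {w: c // bucket for w, c in counts.items() if c // bucket > 0}
    # Sort by count (descending), then alphabetically, so ties are stable.
    top = sorted(bucketed.items(), key=lambda item: (-item[1], item[0]))[:top_n]
    canonical = " ".join(f"{w}:{c}" for w, c in top)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```

    With `bucket=2`, a name that appears once ("bob" versus "fred") never reaches the histogram, so two personalized copies of the same pitch get the same fingerprint.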

    Maybe one of these days I'll have a decent working prototype in python to share.

  96. Poor use of hashing technology by SumDeusExMachina · · Score: 1
    While I realize that it was probably the most obvious choice, couldn't they do better for signatures than using a hash? If the spammer changes just one character in each spam he sends out (say he puts a unique junk string on the end of each message), the system is totally defeated.

    A better system would be to take random samples of several line groups, and then write a "signature" with the line numbers and the contents next to them. Then, if by some stroke of chance the spammer's random string is contained in one of those samples, one could do a diff between a message and the signature, and if it was pretty close, then it would still count as a match.
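
    Here is a rough Python sketch of that idea using the standard `difflib` module as the "diff". The function names, the sample and group sizes, and the 0.8 threshold are all assumptions for illustration; nothing here comes from the Razor code.

```python
import difflib
import random


def make_signature(body, samples=3, group=2, rng=random):
    """Sample `samples` groups of `group` consecutive lines, keyed by line number."""
    lines = body.splitlines()
    last_start = max(len(lines) - group, 0)
    starts = rng.sample(range(last_start + 1), min(samples, last_start + 1))
    return {s: lines[s:s + group] for s in sorted(starts)}


def similarity(body, signature):
    """Average closeness of each sampled group to the same lines in `body`."""
    lines = body.splitlines()
    if not signature:
        return 0.0
    ratios = []
    for start, expected in signature.items():
        actual = lines[start:start + len(expected)]
        ratios.append(difflib.SequenceMatcher(
            None, "\n".join(expected), "\n".join(actual)).ratio())
    return sum(ratios) / len(ratios)


def matches(body, signature, threshold=0.8):
    """Treat the message as a match if the sampled lines are close enough."""
    return similarity(body, signature) >= threshold
```

    Keying the samples by line number keeps the signature small, but it also means that junk lines inserted at the top of a message would shift everything; a real design would need to handle that.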

    Well, I'm off to their project page on Sourceforge...

    --

    Is your company running tools written by ma
  97. I think you may have missed the point. by MarkusQ · · Score: 2
    Interesting work, but I notice that you are only examining trigrams, and you are using an even weight factor. To improve selection you probably at least need to use variable weights (a fuzzy logic neural network rather than binary logic) and train the network with more sample spam.

    They aren't trying to answer the question "should this particular piece of e-mail be considered spam," but rather "is this particular piece of mail identical (to within some factor) to one that some human considers spam." So they don't need to train anything, they just store the hash-signatures of the spam that is currently making the rounds.

    Even if someone mistakenly identifies a piece of mail as spam, it won't hurt anything; the odds are very low that it will ever match another piece of mail in the entire history of the cosmos.

    -- MarkusQ

    1. Re:I think you may have missed the point. by kevinank · · Score: 1

      Let me explain. I was responding to the post I followed up to; one about AI recognition of SPAM. I was specifically not writing about the community spam identification effort with hashes, as my post clearly shows.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
    2. Re:I think you may have missed the point. by MarkusQ · · Score: 2
      Let me explain. I was responding to the post I followed up to; one about AI recognition of SPAM. I was specifically not writing about the community spam identification effort with hashes, as my post clearly shows.

      I stand corrected. -- MarkusQ

  98. +1 Hackerly on the MQR standard by MarkusQ · · Score: 2
    To have any hope of working over the long term, this kind of an approach must include the ability to distribute not just the hashes themselves, but the hash function as well, so that the hash function itself can be adjusted, when needed.

    Yes! In fact, why not have many fuzzy hash functions floating around at once? That way, their task would be to come up with something that yielded a different hash against all of the hash functions at once, a much harder problem. If some spammer figures out a way to do it, an anti-spammer can devise a function (looking at lots of copies of the spam, which shouldn't be hard to come by) that would catch it, and now that trick won't work any more.

    Distributing the functions with the hash (with a few safeguards, e.g. re: the halting problem) would make this darned near impossible to beat.

    -- MarkusQ

  99. Re:Great use of p2p -- Wont work. by Anonymous Coward · · Score: 0

    Is the code on the web?

  100. don't worry by Erris · · Score: 1
    I really hope no troll out there takes my word on this and actually does this.

    Why worry about the trolls when big ISPs are here for you? ATT would never use this to filter objectionable content, would they? I mean, who would want to block the local LUG? Not MSN? No, everyone is nice enough to astroturf that kind of thing. Stephen Barktoo signs up for the local LUG, then reports the mail as spam. Poof, the local LUG loses its mailing list. Then, when the ISP consolidation is over and ATT and MS own everything, they can block the LUG's webpage like they block incoming port 80 now. Thanks for the insightful thought; there in return is the mechanism for email control and the motive you fear, with a bonus mal thought about web control.

    End the last mile tyranny! Go wireless now.

    --
    DMCA, Hollings, Palladium. What might have sounded like paranoia is now common sense.
  101. How about PGP signatures? by Corrado · · Score: 2

    I had a thought about this a little while ago. What if you only accepted mail from people that included their PGP fingerprint, and then only the particular people that you want to accept mail from?

    This turns your mailbox into an opt-in situation. I realize that this would be hard to do, and that you would have to swap fingerprints off-line, but wouldn't you have to do that anyway? This would also require mail clients to allow you to set up a new X-Header (most will, won't they?) like PGP-Fingerprint or something.

    It certainly would keep unwanted mail out of your mailbox. And if you decided that you didn't want any more mail from a particular person, just remove their fingerprint. This also gets around the problem of someone sending email from different addresses. I personally have 4 or 5 different email addresses that I use for various purposes.

    --
    KangarooBox - We make IT simple!
    1. Re:How about PGP signatures? by Sloppy · · Score: 2

      (Terminology Nitpick: What you want is for the email that you receive to have a PGP signature (not fingerprint).)

      IMHO, that's the best long-term solution. And as a nice side-effect, if people bother to sign their mail, then it'll probably be no extra trouble to encrypt it also. The problem is getting people to do it.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    2. Re:How about PGP signatures? by Corrado · · Score: 2

      Well, actually I *was* thinking of a PGP fingerprint. That way you could really kinda ignore the FROM: address in the mail header. Even better, throw out the FROM: header altogether and just rely on a proper PGP fingerprint! :)

      I agree that the hard part would be getting people used to it, but I had not considered that if they signed their mail, it would just be a short hop to encrypting it as well.

      This idea just keeps getting better and better!!

      --
      KangarooBox - We make IT simple!
  102. I used SpamBouncer for a year by Anonymous Coward · · Score: 1, Informative
    and it was good, but I don't know anything about procmail, so adding my own rules was a pain.

    I use Mail::Audit and Mail::SpamAssassin in a Perl filter script now. Works great and I can add custom rules easily enough.

    See link for details.

    1. Re:I used SpamBouncer for a year by gilgongo · · Score: 1

      Indeed - the trouble with purely procmail-based stuff like this is that it all gets a bit fiddly to set up. I'm now using SpamAssassin as well (which uses Razor, BTW) and it is very nice. A doddle to set up and maintain as well.

      --
      "And the meaning of words; when they cease to function; when will it start worrying you?"
    2. Re:I used SpamBouncer for a year by BACbKA · · Score: 1

      I am trying spambouncer right now, and it seems
      pretty worth the effort. If you have a shell
      account, it works pretty much out of the box,
      provided that you keep an eye on the log
      in the first couple of days.

      I have a funny metaproblem with it, though.
      On the www.spambouncer.org site it's suggested
      to subscribe to the updates list. O.K., so I go for it.

      It looks like updates-request@lists.spambouncer.org
      is not a valid alias at that system
      (forgot to update the majordomo aliases file?)

      O.K., so I mailed
      majordomo@lists.spambouncer.org

      Then I got a response from...
      Majordomo@mail.cliq.com,
      (I just hope that's co-hosting...
      cliq.com does sound very spammy :)
      Oh, what an awful thought! Spammers spammed
      the MX records for the spambouncer.org
      and collect the addresses of those
      who try to subscribe to updates!!)

      It says it has no list called 'updates'.
      If you ask it for all it has to offer, it says:

      Majordomo@mail.cliq.com serves the following lists:

      cliq-users-dsl    Announcements for CLIQ's DSL users
      mjd_ej            Electronic Journal Online Marketing list.
      test              Test list for CLIQ Services Cooperative

      (The Online Marketing feeds my worst suspicions...)

      Anyone out there can point me in the right direction?

      Vassilii

      --

      VKh

  103. Re:Great use of p2p -- Wont work. by jmason · · Score: 1
    I've been working on a similar project but using additional factors that help identify spam such as violations of the mail RFC's, and other header indicators, in addition to NLP. I have a prototype that I'm using to score all of my inbox e-mail and am using that to tune the weight factors and add in new factors as I encounter them. It would be interesting to combine your approach with mine I think, since I hadn't thought of analyzing trigrams.

    Sounds a bit like SpamAssassin, if I say so myself ;)

    SA analyses mail headers and body, and uses RBL and Razor to come up with an aggregate spam/non-spam score, then filters appropriately. Most of its smarts are encapsulated in a Perl module, which means it can be run from virtually anywhere: a procmail filter, a spam-protection SMTP proxy server, a system-wide checking system, etc. (all three of those have been implemented). Its scores are generated using a GA and a large corpus of test mail, too. Hit rates nowadays are fantastic ;)
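
    To make the aggregate-score idea concrete, here is a minimal Python sketch. It is not SpamAssassin's code, rule set, or API: the rule names, weights, and threshold below are all invented for illustration. Each test that fires adds its weight, and the total is compared to a cut-off.

```python
import re
from email import message_from_string

# Hypothetical rules: (name, weight, predicate). Names and weights are
# assumptions for this example, not SpamAssassin's real tests or scores.
RULES = [
    ("missing_date", 1.5, lambda msg, body: msg["Date"] is None),
    ("subject_all_caps", 1.0,
     lambda msg, body: (msg["Subject"] or "").isupper()),
    ("money_phrase", 2.0,
     lambda msg, body: re.search(r"make money fast", body, re.I) is not None),
]
THRESHOLD = 3.0  # assumed cut-off


def score_message(raw):
    """Return (total score, names of the rules that fired) for a raw message."""
    msg = message_from_string(raw)
    body = msg.get_payload()  # plain string for a simple, non-multipart message
    hits = [(name, weight) for name, weight, test in RULES if test(msg, body)]
    total = sum(weight for _, weight in hits)
    return total, [name for name, _ in hits]


def is_spam(raw):
    """True when the summed weights reach the assumed threshold."""
    return score_message(raw)[0] >= THRESHOLD
```

    In a scheme like this the weights are the part that needs tuning against a corpus; the sketch simply hard-codes them.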

    Disclaimer: I'm the maintainer.

  104. Here's a Perl script for hitting spammers' links by Anonymous Coward · · Score: 1, Interesting

    I just wrote a Spam Victims Revange v0.01, it's a little Perl script which hits paid links found on Overture under "bulk email" queries etc. It acts like a real browser, in terms of HTTP_USER_AGENT and random "click" intervals, showing progress of total hits and total bucks. Enjoy.

  105. Whitelisting with procmail by Fweeky · · Score: 1

    > How do you do this with PINE/procmail? I'd like to stop using Outlook.

    Easy enough:

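    # Deliver to Inbox if any sender-related header matches a whitelist pattern
    :0:
    * ? (formail -x From: -x Sender: -x Reply-To: -x Received: | fgrep -iqf ~/.src/procmail/whitelist)
    Inbox

    # Everything else is discarded
    :0
    /dev/null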

    Then fill ~/.src/procmail/whitelist (or whatever file you use) with patterns that match friends, mailing lists, etc.

    It's not hard to repeat this for multiple whitelists, produce blacklists, or have whitelisted stuff get processed further.

    Giblet's procmail stuff is a nice place to start (http://www.linuxbrit.co.uk/procmail/)

  106. Just to beat this dead horse a little more by Spinality · · Score: 1
    I think this issue is important enough that public discussion is worthwhile, and you've made your points well -- again, as it says in my sig, if both sides of the argument didn't have some validity, the whole issue would have been resolved long ago. And, although many /. readers will dismiss what you're saying because you're a sender of UCE, I think your points merit discussion.

    But here's why I think this situation is black-and-white.

    Bad spammers have swamped the medium. Therefore it doesn't matter that there may be good spammers; I can't opt out of *ANY* UCE, because there's no way that I can distinguish between the white hats and the black hats. Therefore opt-out does not help. If we could shut down the spoofers and list churners, and opt-out had a reasonable chance of working, then I'd be singing a different song. But so far we can't.

    For the same reason, I don't equate catalogues and other unsolicited bulk snail mail with UCE. Those mailings are selective, because there's a substantial cost in sending them out. But UCE has been taken over by the bad guys. Therefore, in my opinion, a good guy sending UCE is actually contributing to the spam problem, by further muddying the water. I will NEVER reply to any UCE.

    If we could cut out the anonymous and spoofed crap, and UCE always had a valid sender, opt-out mechanism, etc., I might become a UCE supporter. But at the moment the amount of spam I receive grows steadily each month, the amount of time I spend dealing with it grows accordingly, and the amount of virus-infected spam goes up as well.

    So my view is that if you're not against UCE, then you're part of why we have a spam problem.

    But again, this is just my opinion, and you've made one of the better cases for 'white hat' UCE that I've heard.

    --
    -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
  107. How about voting? by QuickFox · · Score: 1

    You might arrange a voting system. Among the Reporters at each Catalogue Server, calculate the percentage who report a message as spam. Consider it spam if more than X percent report it as spam.

    I think some sort of threshold will be necessary, because even Reporters with very high reputation scores will make occasional mistakes. Also, sometimes pranksters will subscribe people to a legitimate list, making its mail look like spam.

    With the right threshold value, this should give very reliable results.

    You mention a reputation system. Voting can be combined with reputation scores. Each vote might get a higher or lower weight depending on a reputation score.
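
    A minimal Python sketch of that combination, assuming each Reporter's vote arrives as a (reputation weight, says-spam) pair; the function name, the weights, and the default threshold are made up for the example and are not part of Razor:

```python
def spam_vote(reports, threshold=0.5):
    """Decide whether a message counts as spam by weighted vote.

    reports: list of (reputation_weight, says_spam) pairs, one per Reporter
    who saw the message. threshold is the "X percent" expressed as a fraction.
    Returns True only if the weighted share of spam votes exceeds it.
    """
    total = sum(weight for weight, _ in reports)
    if total == 0:
        return False  # nobody with any reputation has voted yet
    spam_weight = sum(weight for weight, says_spam in reports if says_spam)
    return spam_weight / total > threshold
```

    With all weights set to 1 this reduces to the plain percentage vote; a low-reputation prankster barely moves the result.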

    Give a man a fish and you have fed him for one day. Teach him how to fish and he'll eat for a lifetime, all the while calling you a miser for not giving him your fish.

    --
    Terrorists can't threaten a country's freedom and democracy. Only lawmakers and voters can do that.
  108. Other alternatives by jambalaya · · Score: 1

    Of course, you could use the methods I use. I set up mailboxes for all my accounts and rules that filter anything coming to any of those accounts into their own box. Then, my Inbox is usually the only place I get a lot of spam, because usually the To: does not contain any of my mail addresses, so they are dumped into my inbox. And really, all you need is your delete key anyway. It's not that big of a pain. I get hundreds of spams and I simply don't allow it to bother me.

  109. Re:Great use of p2p -- Wont work. by ajs · · Score: 2

    Here's why you don't care about being able to intelligently identify spam:

    1. I get a fair amount of real mail from friends of the form "check this out: http://x.y.z/"... I also get a LOT of spam that looks exactly the same.
    2. You can catch spam the moment it comes out by running honeypot-like mailboxes that belong to no real user, and submitting their addresses to the various places where spammers look for addresses.

    Given these two, spam filters that don't look at real spam constantly are just hobbling themselves. Perhaps your intelligent filter should start off by running razor-check, and then think real hard if Razor says it's not spam...?

  110. Re:Whitelisting with procmail -- Thank you Fweeky! by wideangle · · Score: 1

    Good site too. (linuxbrit.co.uk/procmail)

  111. Target the *websites* by Zilch · · Score: 2
    I want a system like this that targets not just the spam mail, but the actual websites that are more often than not the way spammers actually make money.

    So when I get spammed by some idiot advertising http://www.hot-spammer-teens.com, I can submit that URL, and (assuming enough people submit the same URL from unique IP addresses) others running the same system will get a popup message when they go to http://www.hot-spammer-teens.com, and they can vote their dislike of spam by putting their creditcard away and trying to find another pornsite.
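
    The "enough people from unique IP addresses" check could look roughly like the Python sketch below. The class name, method names, and the default of three reporters are assumptions for illustration; nothing like this exists in Razor as far as this thread says.

```python
from collections import defaultdict


class UrlReports:
    """Track which distinct reporter IPs have complained about each URL."""

    def __init__(self, min_reporters=3):
        self.min_reporters = min_reporters   # assumed threshold
        self._reporters = defaultdict(set)   # url -> set of reporter IPs

    def report(self, url, reporter_ip):
        # A set ignores repeat submissions from the same address.
        self._reporters[url].add(reporter_ip)

    def is_flagged(self, url):
        # Flag only once enough *different* IPs have reported the URL.
        return len(self._reporters.get(url, ())) >= self.min_reporters
```

    Counting unique IPs only raises the bar a little, since one person can still submit from several addresses.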

    This could also work for email addresses maybe, when the spammer is trying to get people to send creditcard numbers for some MAKEMONEYFAST! scheme.

    This would have the advantage that you can also use it for spam that gets sent over ICQ too (which is where most of my spam comes from these days).

    Zilch

    1. Re:Target the *websites* by kindbud · · Score: 2

      I don't begrudge anyone making money. I just don't want the spam in my mailbox. If they make money doing what they were doing before, except that they stopped sending spam, that's fine with me. Likewise, if I can use a spam-filtering *something* that is effective and costs little effort to maintain, that is fine with me too. I don't need to carry the vendetta any further.

      Targeting websites and website hosts is part ideological crusade, and part desperation. If the tools available were better at excluding spam from mailboxes, there'd be no need to carry it any further.

      --
      Edith Keeler Must Die
    2. Re:Target the *websites* by Sloppy · · Score: 2

      IMHO, this sort of thing could be the next Big Thing (and no, I don't mean it in the Cuecat "Holy Toledo!" sense) -- 3rd party comments/annotations to web sites. Kinda like "Third Voice" but w/out centralization, and instead, P2P combined with some sort of reputation system.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
  112. The other war by Anonymous Coward · · Score: 0

    I know it is off-topic, but don't forget the other war that will significantly affect the future of the net: the p2p vs *AA war. So far the *AA has won several battles already. I wonder if they can win them all, or whether the infoanarchists will eventually triumph. Being an infoanarchist I hope we win, but I fear we, and the internet, will lose.

    Keep coding infoanarchists, you are our only hope.

  113. Another spam system by Anonymous Coward · · Score: 1, Interesting

    You can also check out this idea, which works too.

    Basically, you have a group of trusted people who can add entries to the spam list. When they receive a spam, they forward it to a list, signing the message with pgp/gnupg. A Perl engine then verifies the signature to check that the person is allowed to add or remove entries. It then takes the From: header from the forwarded email and adds it to a file that is available on the net. You just have to write a script that fetches the file every 10 minutes and adds its contents to your access list (postfix, sendmail, etc.) with REJECT.

    Scripts are also available for Gnus/Emacs, so you hit F1 and it sends the mail the way it should; reporting spam is one key away. It's important that reporting spam takes no time, or you won't do it, as you probably receive many per day.

    You can also add [domain] to the subject line, which adds the whole domain from the From: header. [rbl:IP] adds the IP to an RBL table.
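
    The step that turns a forwarded spam into an access-list entry might be sketched in Python as follows. This is not the system's actual Perl engine: the function name is hypothetical, signature verification is assumed to have happened already, and the [rbl:IP] case is left out.

```python
from email import message_from_string
from email.utils import parseaddr


def access_entry(forwarded_raw, report_subject=""):
    """Build one "key REJECT" line for an access table, or return None.

    forwarded_raw:  the spam itself, as raw text, taken from a report whose
                    signature has already been checked (not done here).
    report_subject: subject line of the report; "[domain]" in it means the
                    whole sender domain should be listed.
    """
    msg = message_from_string(forwarded_raw)
    _, address = parseaddr(msg.get("From", ""))
    if "@" not in address:
        return None  # no usable From: address
    address = address.lower()
    if "[domain]" in report_subject.lower():
        key = address.split("@", 1)[1]  # list the whole domain
    else:
        key = address                   # list just this sender
    return f"{key} REJECT"
```

    A cron job could then fetch the published file and append such lines to the local access table.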

    Take a look, it's cool.

    1. Re:Another spam system by vjs · · Score: 1
      Such systems can be cool, but they have two major shortcomings. The first is that they cannot start rejecting spam before it has been seen and manually reported by at least one good guy. From my logs, it seems the bad guys like to burst their spews at odd hours, such as when they get home from a hard day begging with a "homeless please help" sign.

      Second, it is practically impossible to maintain a list of more than a tiny number of only good guys. If there is any real incentive, the bad guys will get on the list with as many aliases as they need to skew the system. You must either keep the list tiny enough that all members are known to all other members, or you must assume that bad guys are present. Voting or trust schemes can ensure that no more than 5% or perhaps even 1% of members are secret bad guys, but that's not good enough for an anti-spam system that hopes to have a false negative rate lower than 40% and a false positive rate of less than 1%.

      As I understand it, this Razor can be used with spam traps (addresses that get no legitimate mail) to largely avoid the first problem. If you are extremely careful and lucky about keeping secrets, spam traps can fix the second problem. The need for lucky secrecy comes in keeping the bad guys from knowing about any of your spam traps lest they send them legitimate mail (e.g. CERT advisories).

      A major problem with spam traps is getting the bad guys to spam them. It is easy to build a spam trap that receives some spam, but if you want to reject more than 10-20% of spam, you need more. For example, you need to get the big commercial and political outfits to send their wonderful news to your traps, but they're not going to scrape domain contacts or netnews or use the standard dictionary attack list. (My copy of the standard dictionary attack list is fairly complete. Used with a DCC client, it collects a lot of spam.)

      All of that is why I believe in automated checksum reporting without any humans in the loop. I think you must start rejecting copies of a spew within minutes and ideally seconds of its start. That's why one of the design criteria of the DCC is that servers should send the checksums of a message to their peers within seconds of when its recipient count reaches "bulk."
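
      As a rough Python sketch of counting recipients per checksum until a message becomes "bulk": this is not the DCC's protocol or its checksum algorithm. Plain SHA-1 over the body stands in for the real checksums, and the class name and the threshold of 3 are assumptions chosen to keep the example small.

```python
import hashlib
from collections import Counter

BULK_THRESHOLD = 3  # assumed; a real deployment would pick its own value


class ChecksumCounter:
    """Count how many recipients have been seen for each message checksum."""

    def __init__(self, bulk_threshold=BULK_THRESHOLD):
        self.bulk_threshold = bulk_threshold
        self.counts = Counter()  # hex digest -> recipients seen so far

    def report(self, body, recipients=1):
        """Add `recipients` deliveries of `body` (bytes).

        Returns True once the running total for this checksum has reached
        the bulk threshold, i.e. when peers should be told about it.
        """
        digest = hashlib.sha1(body).hexdigest()
        self.counts[digest] += recipients
        return self.counts[digest] >= self.bulk_threshold
```

      No human is involved: every client reports every message, and only the count decides what is bulk.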

      There is a third problem with Fabien Penso's system as I understand it. That is that none of the SMTP envelope or headers are reliable indications of spam, if you want a low false negative rate. If there is one thing that spammers can invent, it is new usernames.

  114. Re:Great use of p2p -- Wont work. by Doug+Neal · · Score: 0

    Sending large amounts of individual messages wouldn't be an issue. A lot of the time, spammers don't use open relays; they run spamming programs on their own computers which contact the target MX directly. So let's say the person to receive spam had an account @hotmail.com, for example: the program would do a DNS lookup on hotmail.com and see what that domain's mail exchanger is. Then it would open a TCP connection to the MX on port 25 (SMTP) and send it the message and the mailbox it is destined for (this is what SMTP relays do for you, when you go through one).

    You can tell when you've been sent direct-to-MX spam, because there will only be one header that's been inserted by an SMTP server (each server that the mail goes through adds a message to say who it is, the time it got the message, and from where). That header will be the one your mailbox provider adds, for example

    received from host-44-772-9.dialup.spammersisp.com by mail.spam-recipients-isp.com at 13:40 GMT 2/12/2001
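
    Counting those server-inserted headers is easy to sketch in Python with the standard email module. The function names are made up for the example, and a single Received: header is only a hint: mail that legitimately takes one hop looks the same.

```python
from email import message_from_string


def received_hops(raw):
    """Number of Received: headers, i.e. servers that stamped the message."""
    return len(message_from_string(raw).get_all("Received", []))


def looks_direct_to_mx(raw):
    """True when only one server (presumably your own MX) added a header."""
    return received_hops(raw) == 1
```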

    Therefore... it would not be hard to set up a program to send as many individual messages as needed, using a mail-merge style where you have a basic template for a message and swap in the individual data each time. I'm sure programs exist to do this. I could do it in Perl in an hour or two at most! (Not that I'm going to) :P

    If this system relies on spams being absolutely identical to work, then it won't, because each message is quite often different. If it took into account and compared the date, time, and originating IP address of the message, it might make it more reliable, perhaps...
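
    The concern about exact matching can be shown with a short Python sketch. It assumes the signature is a plain SHA-1 digest of the message text (20 bytes, which fits the description of a 20-character SHA digest); how Razor actually computes its signatures is not stated in this thread.

```python
import hashlib


def signature(text):
    """20-byte SHA-1 digest of the message text (an assumed stand-in)."""
    return hashlib.sha1(text.encode("utf-8")).digest()


# A mail-merge style template: only the name differs between copies.
template = "Dear {name}, check this out: http://x.y.z/"
sig_alice = signature(template.format(name="Alice"))
sig_bob = signature(template.format(name="Bob"))

# One changed word gives an unrelated digest, so an exact-match
# catalogue would treat the two copies as different messages.
print(len(sig_alice), sig_alice == sig_bob)
```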

  115. You're the guy that I'm worried about... by wirefarm · · Score: 2

    No offense, hear me out a bit:

    You could just as well have said this: "We wanted to send party invitations, so we hacked into each of our customers' servers and put a message on their home page."

    No, actually, you're wrong. If you go to a restaurant and leave your business card, you are pretty much authorizing the restaurant to use the information to contact you. That's how business has been conducted for quite a long time. You have a reasonable expectation that the restaurant will not abuse your trust and in that regard, I don't think we have at all. As I said, all but a very few people welcomed these invitations. My company is quite well known for throwing a hell of a party.

    Yet with this software, one person can have the ability to block a group announcement that is welcomed by 99 percent of the people.

    Ever click a ThinkGeek banner on Slashdot? What if one reader had the ability to block the ads for everyone? I'd miss them, even though they are technically the same as most any other banner ad and in some people's minds, evil. ThinkGeek seems to be a clueful company that knows its audience and in that is a welcome addition to the community. They also pay the money that keeps the servers running.

    Ever get a catalog in the mail that you actually thought was worthwhile? What if one person could decide that it was junkmail and should be blocked for everybody? That there was no way for you to 'opt-in', because there was no way for you to hear about it in the first place.

    What I'm saying is that one guy who may not even recall opting in can block a perfectly valid email announcement. In that way, the system does have a flaw.

    What if I was on the CERT advisory email list and decided to say that their latest announcement was spam? From my understanding of the system in question, I would have that ability.

    I would love for there to be a good system for controlling the junk email that I get, but I don't think that this is there yet.

    Cheers,
    Jim

    --
    -- My Weblog.
  116. Ha, actually no I'm not. by Spinality · · Score: 1

    > You're the guy that I'm worried about...No offense

    Ha ha. Well, actually, I'm not that guy, and I basically agree with everything you've said in this post. I think I was answering a different question, maybe one that you didn't ask. I had concluded that your customer list had been randomly accumulated, with no opt-in process, and that your customers weren't necessarily expecting to hear from you except in response to their own messages. (Example: When I place an order with amazon.com, I don't want to start receiving ads, and I damn sure don't want them to sell my name to somebody else. An amazon.com party invitation? I think in that case, given the number of their customers and what a small part of their business I represent, it would be an inappropriate use of my address. There's no reasonable expectation that ordering a $10.95 book would somehow put me onto their A-list.) It sounds like you have a much closer relationship with your customers, and so it's not black-and-white. Well, again, as I said earlier, if it were black-and-white then it wouldn't be worth discussing.

    To use your restaurant analogy, rather than collecting business cards I thought you were saving the telephone numbers left when people made reservations -- in that case, there's NO expectation that they'll start getting telemarketing calls.

    I totally agree that this spam blocking approach -- the one this thread is putatively about -- has real weaknesses, and that (depending on how it's implemented) one individual might have the ability to block legitimate mailings. I suppose one approach would be to withhold action until some number of complaints are received -- just like how your cable company won't send out a service truck until 3 calls come in.

    But returning to my original post, I stand by my belief that a bulk mailing is spam unless there's a clear opt-in by the recipients. This opt-in could come from several means: via an explicit opt-in form; by the user manually submitting an email address; or as part of your published terms of service. (Given the abuses of list sharing, I feel that there should be NO way to resell an email address obtained from an outside source -- the only address that you should be able to share is one that you've been told first-hand can be shared. In fact, if this were done, and we got rid of the faked sender addresses etc., opt-out would have a better chance of working. But today, once you get on one list being distributed, you're screwed -- you can never stop the barrage.)

    I hope my position doesn't sound as draconian as my original post made it. I was responding to the philosophical question of "What is spam?" rather than the practical questions "Is this spam blocker a good idea?" or "Were we good guys with this party mailing?" You were good guys; but if I were you, I'd institute procedures to maintain the customer list more carefully, so that NO customer could ever be surprised by an invitation in the future. They'll know whether or not such mailings are likely at the time they leave their email addresses. And that's good customer service.

    --
    -- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
  117. how do I integrate this with qmail? by jshare · · Score: 1

    I'd like to start using this within qmail, but I'm unsure how to put it into my .qmail file.

    Anyone have a recipe?

  118. 99% of foreign mails by Anonymous Coward · · Score: 0

    a. Use Outlook secretly
    b. Receive loads of foreign spam
    c. Don't know any foreign languages
    d. Don't have any foreign friends
    e. Don't have any friends

    Foreign spam removal (Score:5, Informative)

    Maybe Funny would be better, but Informative?

    Anyone who does this to their email client is deleting all foreign mail.

    Delete me for not being from the United States. :'(

  119. Fighting Racism by Anonymous Coward · · Score: 0

    This Comment moderation is racist.

    1. Re:Fighting Racism by Anonymous Coward · · Score: 0

      Chill dude.

  120. Latin America & the anti-non-WASP filter. by Anonymous Coward · · Score: 0

    Latin America is all the countries that were colonized by countries with Romance languages, that is, languages descended from Latin. That filter would cut off a large number of American countries.

    * Canada, for being French-speaking.
    * All of Spanish America and, of course:
    - Many parts of California.
    - Many French-speaking parts of Louisiana.
    - Many Spanish speakers in the USA.
    * Brazil, because of Portuguese.

    Counting those that remain inside, there are:
    * Some people in the USA.
    * Belize

    Regards,
    Tei

  121. Fight The Spammers! by Anonymous Coward · · Score: 0

    Want to know how to hit the companies that provide the bulk email lists, and hit 'em hard? It's safe and legal!

    Just go to your favorite pay-per-clickthrough search engine (like Goto.com), search on keyword phrases like:

    email marketing
    bulk email marketing
    direct email marketing
    bulk email marketing campaign
    email marketing company
    email marketing software
    opt in email marketing
    targeted email marketing
    permission email marketing
    marketing email
    email marketing services
    email marketing tool
    optin email marketing
    online email marketing
    email marketing program
    email marketing list
    email marketing campaign
    free email marketing
    bulk email work marketing
    email marketing strategy
    email marketing solution
    permission based email marketing
    email marketing uk
    marketing email list
    target bulk email marketing
    email marketing consultant
    direct email marketing firm
    precision email marketing
    bulk email marketing software
    marketing bulk email
    marketing email service agent
    direct marketing email


    ...and start clicking away on the paid listings! Some of these companies are paying as much as FIVE DOLLARS PER CLICKTHROUGH for their listing!

    Can you imagine a million slashdotters hitting these search engines? It would shut down most of these guys, and probably discourage future spammers.