Slashdot Mirror


SpamNet: Razor for the Masses

UCRowerG writes "From CNET News on Yahoo!: "Conceived by Napster co-founder Jordan Ritter and open-source developer Vipul Ved Prakash, the company is touting the benefits of democracy, networking and collaboration in the war against unscrupulous e-mail marketers."" Since Prakash is responsible for Razor, hopefully there will be Linux support as well, but once again I gotta throw my props at Spamassassin, which catches over a hundred spams for me each day.

12 of 256 comments

  1. Alanis would love this. by Bilestoad · · Score: 5, Funny

    And the first thing the story about the spam-battling startup does is to load some popup advertising.

    Wonderful.

  2. Re:I need my spam by gentix · · Score: 5, Funny

    And if you follow the instructions of every penis enlargement email you get, you'll soon really be part of something greater than you...

  3. Here's the URL... by EnglishTim · · Score: 5, Informative

    http://www.cloudmark.com/

    ... because the guy who posted this obviously couldn't be bothered....

  4. Nilsimsa's popularity will be its own demise by intuition · · Score: 5, Insightful
    Vipul's razor uses something they call "Nilsimsa" fuzzy signatures.

    The signatures are used to determine how "close" the email that you are testing is in content to known spam. The source code of this hashing algorithm is publicly available.

    If this network ever becomes a real problem for spammers, they will simply use word substitution algorithms or any number of other simple methods to change the email until the Nilsimsa signatures are no longer close enough to flag the email as spam.

    This was the problem with Vipul's razor version 1.0, which was discussed on Slashdot, and this remains the problem in Vipul's razor 2.0.

    1. Re:Nilsimsa's popularity will be its own demise by Matts · · Score: 5, Insightful

      Disclaimer: I'm one of the SpamAssassin developers.

      I'm not really sure how Razor2 is managing to use Nilsimsa (and despite Vipul saying that Razor is open source, we don't get to see the server, so I can't find out easily).

      When I did testing of Nilsimsa for SpamAssassin it turned out that in order to be able to use Nilsimsa you have to use a special comparison function over every single nilsimsa hash in your database. This basically became unusable at about 50K signatures, as when you received an email, you first had to hash it with Nilsimsa, but then you had to use nilsimsa_compare (or whatever the function was called) on each and every one of those 50K entries.
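To make the cost of that full scan concrete, here is a minimal sketch in Python. It assumes digests are 256-bit values stored as 64-character hex strings and that similarity is scored as 128 minus the number of differing bits, which is how Nilsimsa comparison is usually described. The names `compare_digests` and `find_matches` and the threshold of 100 are made up for illustration; this is not the Razor or SpamAssassin code.

```python
def compare_digests(a_hex, b_hex):
    """Score two 256-bit hex digests: 128 means identical, -128 means every bit differs."""
    differing_bits = bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")
    return 128 - differing_bits


def find_matches(candidate, database, threshold=100):
    """Full scan: one comparison per stored signature (threshold is an arbitrary assumption)."""
    return [sig for sig in database if compare_digests(candidate, sig) >= threshold]
```

Because "close enough" is not an equality test, an ordinary hash-table or B-tree lookup does not apply directly, so every incoming message costs one comparison per stored signature.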

      I'd really like to hear how they're doing it. Perhaps Vipul found some way of indexing the search so it wasn't a full scan. If anyone follows the Razor lists and knows how it works, please share.

      --

      Matt. Want XML + Apache + Stylesheets? Get AxKit.
  5. Spamassassin over Spambouncer by waldoj · · Score: 5, Informative

    I've run both Spamassassin and Spambouncer. For the curious, I prefer Spamassassin, and here's why.

    I was very impressed with Spambouncer. It was the first spam-heuristic system that I'd used (previously, I'd relied solely on MAPS, ORBS, ORDB, RBL, etc.). I found that it rejected a lot of legitimate mail until I grepped my "Sent Items" folder, extracted every "To" field and made that my white list. (The assumption being that if I've e-mailed somebody, I don't mind hearing from them.) That worked very well, and I was happy with Spambouncer. The odd piece of spam would get through, and I still had about 1 in 100 legitimate messages put in my spam folder. But it made my life much simpler.
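The "everyone I've written to" whitelist can be sketched in a few lines of Python. This is only an illustration of the idea, not what the poster actually ran (he used grep); `build_whitelist` is a hypothetical name, and how the raw messages get read out of the "Sent Items" folder is left out because the folder format isn't stated.

```python
from email import message_from_string
from email.utils import getaddresses


def build_whitelist(raw_messages):
    """Collect the lowercased address of every To: recipient in the given raw messages."""
    whitelist = set()
    for raw in raw_messages:
        msg = message_from_string(raw)
        to_headers = msg.get_all("To", [])  # a message may carry several To: headers
        for _name, addr in getaddresses(to_headers):
            if addr:
                whitelist.add(addr.lower())
    return whitelist
```

Lowercasing the addresses is a design choice made here so that "Alice@Example.com" and "alice@example.com" count as the same correspondent.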

    Then I tried Spamassassin. The big reason was that I wanted to take part in Razor and know that I was a part of a collaborative process. Also, Spambouncer hadn't been updated in months, which struck me as odd. But I also just wanted to try something different. I found that Spamassassin was better. Not in a way that made Spambouncer look bad; it was just clear that Spamassassin was a superior product. For example, Spamassassin provides a complete scoring in the headers, so you know exactly what criteria caused the message to be blocked. And I never had to set up a whitelist -- it just works. I still get that tiny little bit of spam that gets through, no more or less than with Spambouncer, but that's really not a complaint. It's very, very rare that a legitimate piece of mail gets caught up in the system. Best of all, the nonexistent addresses on my system that spammers have somehow discovered (big@waldo.net, aldo@waldo.net) can be forwarded via my aliases table to Spamassassin's (Or is it Razor's? I forget.) server to be automatically added to their honeypot collection.

    I'll stick with Spamassassin, I think. It appears to be the most mature, stable, simple, straightforward spam filtering product available today. For those looking to set up server-side spam filtering, I highly recommend it.

    -Waldo Jaquith

    1. Re:Spamassassin over Spambouncer by rw2 · · Score: 5, Interesting
      I'm a big fan also, in fact I introduced Taco to it. Folks interested in what the heuristics produce in terms of distribution of SA scores can view a graph of my logs. The three lines are the commonly used thresholds for deciding whether a mail is spam or not. Most folks run at 5, but some that are more paranoid about false positives run at 7 or 10. Myself, I find false positives to be practically non-existent and run happily at five. The missing data is just because I didn't keep statistics on non-spam mails until I had been running for a couple weeks.
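For anyone who wants to try the same threshold comparison on their own logs, here is a rough Python sketch. It assumes a message counts as spam when its score is at or above the threshold; `flagged_counts` is a made-up helper, and the scores in the usage line are invented sample values, not data from the graph.

```python
def flagged_counts(scores, thresholds=(5, 7, 10)):
    """Return {threshold: number of messages scoring at or above it}."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


# Invented sample scores, purely for illustration.
print(flagged_counts([1.2, 5.0, 6.5, 8.0, 12.3]))
```

Raising the threshold can only shrink the flagged set, which is the trade the more cautious users are making: fewer false positives in exchange for more spam getting through.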


      Now for a commercial. Craig Hughes has formed a company to bring spamassassin to Outlook users. And I'm setting up a Hotmail-like service at spamassassin.net to help users that don't have the time or ability to set up spamassassin themselves.

  6. Have Your Cake and Eat It Too by pjrc · · Score: 5, Informative
    I've been using Razor with Spamassassin for many months. All you need to do is install the razor package (and the various perl modules it wants), and then add a line like this in your .spamassassin/user_prefs file:

    score RAZOR_CHECK 5.0

    I've also got the other "network tests" enabled (blacklists), but I assign them low scores since they have a lot of false positives.

    Using spamassassin with razor and the blacklists really works. My spam file has 836 spams automatically filtered between March 1 and today, June 19. Of those 836 messages, 511 have the RAZOR_CHECK string in the "X-Spam-Status" line that spamassassin adds to the header.
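A count like that can be reproduced with a short Python sketch along these lines. It assumes you have already pulled the "X-Spam-Status" values out of the spam file as plain strings; `razor_share` is a hypothetical helper, and the plain substring match is a simplification that would also count any longer test name containing "RAZOR_CHECK".

```python
def razor_share(status_headers, test_name="RAZOR_CHECK"):
    """Return (hits, total, fraction) for headers that mention the given test name."""
    hits = sum(1 for header in status_headers if test_name in header)
    total = len(status_headers)
    fraction = hits / total if total else 0.0
    return hits, total, fraction
```

With the numbers quoted above, 511 of 836 works out to about 61% of the filtered spam having tripped the Razor check.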

    Not too bad, considering Razor uses a rigid message digest that fails if the spammer adds any "random" content to the messages. Sadly, it seems like that's becoming more common. Rumor has it that Razor is someday going to use "fuzzy" matches with one of two algorithms that somehow accomplish such a feat. Anyone know when/if this is supposed to happen??
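The weakness of a rigid digest is easy to show in Python. SHA-1 is used here purely as a stand-in for "some exact message digest"; the comment above doesn't say which digest Razor uses, and `exact_signature` and the sample strings are invented for the example.

```python
import hashlib


def exact_signature(body):
    """Exact digest of the message body (SHA-1 is an assumed stand-in)."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


original = "Buy cheap widgets now!"
mutated = original + " xq7f2"  # the spammer appends a short random token

print(exact_signature(original) == exact_signature(original))  # True: same text, same digest
print(exact_signature(original) == exact_signature(mutated))   # False: a few extra bytes change the digest
```

Any change to the body, however small, yields an unrelated digest, so a database of exact signatures never sees the mutated copy as the same spam.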

  7. Re:Yeah, I remember that discussion by ahrenritter · · Score: 5, Insightful

    I don't believe that computing cycles are the contention point here. The difference is in who is paying for the bandwidth. Consider these two hypothetical cases:

    A. Not worrying about razor
    The spammer loads up their spam program and gives it a dump file of five hundred thousand email addresses. It takes these, and using its knowledge of spam-friendly networks, sends one copy of the spam to 500 different relay servers. Each server receives an identical e-mail with 1000 different bccs. The e-mail body is only 20k, and adding the 1000 addresses gives you another 20k or so, so the spammer spends 20 megs in bandwidth ((20k + 20k) * 500 mails sent).

    B. Worrying about razor
    The spammer loads up their spam program and gives it a dump file of five hundred thousand email addresses. It takes these, and the message to be spammed, and sends a slightly modified message to each group of, we'll say, 10 addresses. This way, if one of the messages gets razor'ed, they only lose 9 possible reads. The spamware sends out 100 emails to each of the 500 spam-friendly servers. The e-mail body is only 20k, and the 10 addresses only add 1k or less, so the total message is only 21k now, but it is sent out 100*500 times. The spammer has spent over 1 gig in bandwidth now.

    That doesn't come cheap.
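The arithmetic in the two cases can be checked with a few lines of Python. The sizes and counts are the ones given above; `spam_run_kb` is just a made-up helper name.

```python
def spam_run_kb(messages_per_relay, relays, body_kb, address_kb):
    """Total kilobytes the spammer uploads for one run."""
    return messages_per_relay * relays * (body_kb + address_kb)


case_a = spam_run_kb(1, 500, 20, 20)    # one 40k message per relay
case_b = spam_run_kb(100, 500, 20, 1)   # one hundred 21k messages per relay

print(case_a, case_b, case_b / case_a)
```

Case A comes to 20,000k (about 20 megs) and case B to 1,050,000k (just over 1 gig), so dodging Razor this way costs the spammer roughly 50 times the upload bandwidth.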

    --

    All I wanted was a rock to wind a piece of string around, and I ended up with the biggest ball of twine in Minnesota
  8. Re:unsolicited [ commercial | bulk | junk ] email by pwagland · · Score: 5, Funny
    by Martin Spamer (Martin_Spamer@NoSpAM.kitv.co.uk)...
    So if you wish to get on my bright side, do not use the term Spam or its derivatives use the term(s) unsolicited [ commercial | bulk | junk ] email.
    Look at the e-mail address and tell me I am not the only person to find this ironic.

    Please?

    :-)

  9. Re:What I find strange by grytpype · · Score: 5, Funny
    Probably for the same reason they think you
    • are heavily in debt
    • have a minuscule penis
    • are impotent
    • have no job
    • have no college diploma
    • have no insurance
    • need prepaid legal representation
    • need to investigate/track down various people in your life
    In short, the spammers are advertising... to themselves!

    --

    - Have a picture

  10. LART THE ISPs! by wowbagger · · Score: 5, Interesting

    The single best thing all of us who know how to run traceroute and whois can do is LART THE ISPS THAT HOST SPAMMERS!

    I've been forwarding every spam I get that comes from a Verio-hosted site, or spamvertises a site hosted on Verio, to Verio and their parent company, NTT. I'm using bitch-list.net to do so, since they have a bazillion email addresses for Verio. I make sure the email has the spam attached, and since Verio has claimed they cannot read attachments (***cough***BULLSHIT****cough***) I also paste the mail headers into the message, along with a WHOIS and traceroute showing it to be a Verio customer. When they complain, I tell them "MY message isn't spam - your customer contacted me, so a prior business relationship exists. You want it stopped, stop the spammer."

    I won't say it is working, but if 10% of everybody who got these spams did as I do, then Verio's help desks would be so clogged that they couldn't HELP but see the damage on the bottom line.