Slashdot Mirror


A New Type Of Realtime Blocklist: The SURBL

Glamdrlng writes "The SURBL, or "Spam URI Realtime Blocklist", represents a nexus of RBLs and content filtering that may bring us one step closer to a spam magic bullet. While traditional RBLs perform a DNS lookup on the connecting mail server, SURBLs take this a step further by parsing the text of the email looking for URIs and doing a lookup on those web servers. They also prevent "joe jobs" by maintaining a whitelist of legitimate web servers whose domain names may show up in spam messages, e.g. eBay, PayPal, Microsoft, etc. The only requirement to implement the SURBL is a plugin on your MTA such as spamassassin that can parse the body of each email. While there is no MTA that directly supports SURBLs without a plugin, the author hints at one being in development."
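The lookup described in the summary can be sketched roughly as follows. This is a minimal illustration, not the SURBL authors' code: the zone name `surbl.example`, the whitelist contents, and the `is_listed` callable are all assumptions made for the example, and the reduction to the last two labels of the hostname is deliberately naive.

```python
import re
from urllib.parse import urlsplit

# Assumption: the list is queried like a DNS blocklist, by prepending the
# domain to a zone name. This zone is a placeholder, not the real one.
SURBL_ZONE = "surbl.example"

# Hypothetical whitelist of domains that are never looked up.
WHITELIST = {"ebay.com", "paypal.com", "microsoft.com"}

URI_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_domains(body):
    """Return the domains (last two labels of each host) found in URIs in the body."""
    domains = set()
    for uri in URI_PATTERN.findall(body):
        try:
            host = urlsplit(uri).hostname
        except ValueError:
            continue  # malformed URI, e.g. a broken IPv6 literal
        if not host:
            continue
        # Naive reduction; real code would need a public-suffix list.
        labels = host.rstrip(".").split(".")
        domains.add(".".join(labels[-2:]))
    return domains


def listed_domains(body, is_listed):
    """Return non-whitelisted domains that the lookup callable reports as listed."""
    hits = set()
    for domain in extract_domains(body) - WHITELIST:
        if is_listed(f"{domain}.{SURBL_ZONE}"):
            hits.add(domain)
    return hits
```

In a live setup `is_listed` could wrap a DNS query, treating an answer as "listed" and a not-found error as "not listed", which is how DNS blocklists are conventionally read. Passing it in as a parameter keeps the sketch testable without network access.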

26 of 219 comments

  1. Is this really a GOOD idea? by beh · · Score: 5, Interesting

    Blocking URLs is an "ACTIVE" measure - and one that opens very bad
    possibilities for abuse. While the whitelist would protect against
    this, it will only protect the BIG players on the market - it can still
    wreak havoc on small/medium enterprises - e.g. a competitor of a
    (pretty much) 'niche' firm could get a spam out advertising the
    COMPETITOR in order to get HIM blocked...

    Or - the other way around - a company gets itself a whitelisting
    (via a "fake" joe-job on itself) and then continues spamming...

    Please stick to PASSIVE measures! They can't be abused...

    1. Re:Is this really a GOOD idea? by beh · · Score: 5, Insightful

      (one minor thing I missed before:)

      The advent of bayesian spamming brought spams that included whole paragraphs of random words - just so that your list would get more and more bloated...

      How long do you think it will take spammers to add dozens of valid - but in the context of the spam nonsensical - URLs just to fill up the black-list and make it useless?

    2. Re:Is this really a GOOD idea? by acariquara · · Score: 5, Interesting
      Whatever happened to Bayesian Noise Reduction/Dobly algorithms? I was hoping for these to become better known and more widespread...

      snip snip from their page

      Feb 24: We broke 99.984% today and caught up with CRM114 =). DSPAM is now around ten times more accurate than a human. According to a study by Bill Yerazunis (CRM114), a correspondence secretary is approximately 99.84% accurate at filtering spam. As of today, DSPAM has classified 3140 spams and 3457 nonspams in my mailbox with only 1 false accept and 1 false reject. The false accept was caused by a bug in the BNR code which was fixed, so depending on how you count it, I am getting either 99.968% or 99.984% accuracy. These are from real mailbox statistics, and not based on some 'test corpus' mail sent in. As spammers continue to try and evade filters, statistical filters such as DSPAM continue to adapt easily maintaining their high levels of accuracy.
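The two accuracy figures in the quoted passage follow from the message counts it gives. A quick check of the arithmetic, using only the numbers quoted above (the function name is just for illustration):

```python
def accuracy_percent(total_messages, errors):
    """Percentage of messages classified correctly."""
    return 100.0 * (total_messages - errors) / total_messages


total = 3140 + 3457  # spams plus nonspams quoted above

one_error = accuracy_percent(total, 1)   # leaving out the bug-induced false accept
two_errors = accuracy_percent(total, 2)  # counting both misclassifications
```

Both results land within a few thousandths of a percentage point of the quoted 99.984% and 99.968%.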

      And no, I am not posting a URL. If you want to get to the page, google for "Dobly" (yes, that is the actual spelling) and go to the first page.

      --
      Dear aunt, let's set so double the killer delete select all
    3. Re:Is this really a GOOD idea? by jelle · · Score: 4, Interesting

      Emails with paragraphs of random words are not very easy to distinguish from emails with paragraphs of actual language in nonspam emails. But emails with dozens of random links are easier to distinguish from nonspam emails, so if the spammers start doing this, then you can filter out their spam even without having to check the SURBL by simply adding some points to the score of emails with a lot of links.

      And if you use the auto-whitelist feature, then it won't increase the false-positive count, except for people who receive a lot of emails with lots of random links from lots of different people.

      Plus, the spam detection software may very well be capable of distinguishing between the decoys and the real spam-links by analyzing the context of the URI. At least that will be a lot easier than analyzing the grammar in an email and detecting the nonsensical paragraphs and the nonsensical/typo-ed words in spam.

      Sure, it's not the final battle, but it looks like a very promising improvement in the fight against spam.
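The "add some points for lots of links" idea, together with the auto-whitelist exemption, might look something like the sketch below. The thresholds and point values are hypothetical tuning numbers picked for the example, not values from any real filter.

```python
import re

LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

# Hypothetical tuning values, not taken from any real filter.
FREE_LINKS = 5          # links tolerated before any penalty
POINTS_PER_LINK = 0.5   # score added per link beyond FREE_LINKS
MAX_PENALTY = 4.0       # cap so this one rule cannot decide alone


def link_penalty(body, sender=None, auto_whitelist=frozenset()):
    """Score contribution for messages stuffed with links."""
    # Senders already known to be legitimate are exempt from the penalty.
    if sender is not None and sender in auto_whitelist:
        return 0.0
    excess = len(LINK_PATTERN.findall(body)) - FREE_LINKS
    return min(MAX_PENALTY, max(0, excess) * POINTS_PER_LINK)
```

Capping the penalty keeps a legitimate link-heavy newsletter from being rejected on this signal alone; it only nudges the total score.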

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    4. Re:Is this really a GOOD idea? by delstar+dotstar · · Score: 4, Interesting
      Words that can be more than one part of speech are used fairly infrequently. Ten in a row is a pretty good giveaway.
      • that:
        1. adj (Not this one, that one)
        2. dem. pron. (Look at that )
        3. rel. pron. (birds that sing)
      • can:
        1. noun (a can of whoopass)
        2. verb (The boss is gonna can your ass)
        3. modal (I can swim)
      • one
        1. adj ( one fine morning)
        2. pron (the one that got away)
      • part
        1. noun ( part of speech)
        2. verb ( part the Red Sea)
        3. adj ( part man, part machine)
      • used:
        1. verb (I used a hammer on the kitten)
        2. adj (a used car)
      • ten
        1. adj ( ten fingers)
        2. pron ( ten in a row)
      • row
        1. noun (ten in a row )
        2. verb ( row your boat)
      • pretty
        1. adj (a pretty girl)
        2. adv (a pretty good giveaway)
      OK, that was a little snarky. Anyhoo, spammers can just extend the stream-of-random-words technique and create "sentences" that are syntactically kosher but semantically empty: Colorless green dreams sleep furiously. Hell, they don't even need to create sentences -- they can just pinch real, human-generated text from any old web site.
  2. Time to dig out this old post. by Anonymous Coward · · Score: 5, Funny

    This article advocates a

    (x) technical ( ) legislative ( ) market-based ( ) vigilante

    approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

    ( ) Spammers can easily use it to harvest email addresses
    ( ) Mailing lists and other legitimate email uses would be affected
    ( ) No one will be able to find the guy or collect the money
    ( ) It is defenseless against brute force attacks
    ( ) It will stop spam for two weeks and then we'll be stuck with it
    (x) Users of email will not put up with it
    ( ) Microsoft will not put up with it
    ( ) The police will not put up with it
    ( ) Requires too much cooperation from spammers
    ( ) Requires immediate total cooperation from everybody at once
    ( ) Many email users cannot afford to lose business or alienate potential employers
    ( ) Spammers don't care about invalid addresses in their lists
    ( ) Anyone could anonymously destroy anyone else's career or business

    Specifically, your plan fails to account for

    ( ) Laws expressly prohibiting it
    ( ) Lack of centrally controlling authority for email
    ( ) Open relays in foreign countries
    ( ) Ease of searching tiny alphanumeric address space of all email addresses
    ( ) Asshats
    ( ) Jurisdictional problems
    ( ) Unpopularity of weird new taxes
    ( ) Public reluctance to accept weird new forms of money
    ( ) Huge existing software investment in SMTP
    ( ) Susceptibility of protocols other than SMTP to attack
    ( ) Willingness of users to install OS patches received by email
    ( ) Armies of worm riddled broadband-connected Windows boxes
    (x) Eternal arms race involved in all filtering approaches
    ( ) Extreme profitability of spam
    ( ) Joe jobs and/or identity theft
    ( ) Technically illiterate politicians
    ( ) Extreme stupidity on the part of people who do business with spammers
    ( ) Dishonesty on the part of spammers themselves
    ( ) Bandwidth costs that are unaffected by client filtering
    ( ) Outlook

    and the following philosophical objections may also apply:

    ( ) Ideas similar to yours are easy to come up with, yet none have ever
    been shown practical
    ( ) Any scheme based on opt-out is unacceptable
    ( ) SMTP headers should not be the subject of legislation
    ( ) Blacklists suck
    (x) Whitelists suck
    ( ) We should be able to talk about Viagra without being censored
    ( ) Countermeasures should not involve wire fraud or credit card fraud
    ( ) Countermeasures should not involve sabotage of public networks
    ( ) Countermeasures must work if phased in gradually
    ( ) Sending email should be free
    ( ) Why should we have to trust you and your servers?
    ( ) Incompatibility with open source or open source licenses
    ( ) Feel-good measures do nothing to solve the problem
    ( ) Temporary/one-time email addresses are cumbersome
    ( ) I don't want the government reading my email
    ( ) Killing them that way is not slow and painful enough

    Furthermore, this is what I think about you:

    (x) Sorry dude, but I don't think it would work.
    ( ) This is a stupid idea, and you're a stupid person for suggesting it.
    ( ) Nice try, assh0le! I'm going to find out where you live and burn your
    house down!

    1. Re:Time to dig out this old post. by tds67 · · Score: 5, Funny
      I think the preceding post:

      ( ) Was funny.
      ( ) Was informative.
      ( ) Was interesting.
      ( ) Was informative and funny.
      ( ) Was interestingly informative.
      (x) Was funny in an informative sort of way.
      ( ) Was rehash.
      ( ) Is itself spam.
      ( ) Is overrated.
      ( ) Gave me gas.

    2. Re:Time to dig out this old post. by interiot · · Score: 4, Insightful
      • (x) Users of email will not put up with it
      We'll see.
      • (x) Eternal arms race involved in all filtering approaches
      One of the few constants is that there will be a way for money to get from the target back to the original spammer or seller. (Well, it's possible something more complex is going on and that's not the real goal of spam, but at the least, it's something that's remained constant for years, which is notable in the world of spam.) So "following the money" is really based on an acceptance of the above criticism, and a realization that the arms race can never get around the money stream.

      Filters may lead to arms races, but does anyone NOT use them right now? There are few alternatives, namely things like making email non-anonymous / PKI, enacting large legal penalties along with huge international support, rejecting email from anyone you don't know, ....

      • (x) Whitelists suck
      Actually, it's a blacklist. Blacklists may suck, but it's possible they suck less than spam, and the proliferation of RBLs kind of implies that.

      Sure, there might be a way to stop spam once and for all and then blacklists would be hated, but the very presence of an antispam-rejection-template implies that there won't be a magic bullet for a long time to come.

      • (x) Sorry dude, but I don't think it would work.
      The only way it CAN'T work is if money isn't the real goal of spammers, or if they make it hard enough to "follow the money" that other methods are easier/nicer.
  3. The whitelist will always be limited by Anonymous Coward · · Score: 5, Interesting

    There are millions of legitimate sites, and most of them aren't major sites like ebay, yahoo, etc. If I want to do a joe-job on an enemy small site, I can cause them a lot of pain by including their link. They'll have a difficult time proving someone wasn't spamming on their behalf.

  4. We adjust the frequency of the shields, by winkydink · · Score: 4, Funny
    They adjust the frequency of the phasers.

    I don't see this as the be-all, end-all for spam, but I do find it a very interesting and potentially very effective arrow for my spam-killer quiver.

    --

    "I'd rather be a lightning rod than a seismometer." -Ken Kesey

  5. SURBL? by Mateito · · Score: 5, Funny

    I mean, really, who comes up with the names for these things?

    Show me one self-respecting spammer who's going to quake in their boots at the threat of being hit with a "SURBL".

    ("Oh no.. please.. not the SURBL. Don't SURBL me.. It's too much... no.. No.. NOOOOOO!")

    Why not just call it a "NERF" and be done with it?

    I propose we come up with Spam deterrents with names like "Knuckle Duster", "Jagged Bottle", "Bloodied Crowbar" and "Bubba the Love Truncheon".

    1. Re:SURBL? by Anonymous Coward · · Score: 4, Funny

      I propose we come up with Spam deterrents with names

      Personally, I like BASTARD:

      Bad Ass Spam Threat And Reduction Deterrent

  6. It's a great idea by Rapid+Home+Offer · · Score: 5, Informative

    Combine it with spamassassin, and you can whitelist emails from companies that you want to receive email from. Heck, with spamassassin you can give it a very small weight, and adjust the results manually. Every bit of extra information helps, and just ignoring it because it is compiled by somebody else doesn't make sense to me.
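Giving the blocklist "a very small weight" amounts to treating a hit as one vote among several rather than a verdict. The sketch below shows that idea in generic form; the rule names, weights, and threshold are made up for illustration and are not SpamAssassin's actual rules or configuration syntax.

```python
# Hypothetical rule names and weights, chosen only for illustration.
WEIGHTS = {
    "surbl_hit": 0.5,        # deliberately small: a hint, not a verdict
    "bayes_spammy": 3.0,
    "forged_headers": 2.0,
}
THRESHOLD = 5.0


def total_score(fired_rules, weights=WEIGHTS):
    """Sum the weights of the rules that fired; unknown rules count as zero."""
    return sum(weights.get(rule, 0.0) for rule in fired_rules)


def is_spam(fired_rules, weights=WEIGHTS, threshold=THRESHOLD):
    """A message is tagged only when the combined evidence reaches the threshold."""
    return total_score(fired_rules, weights) >= threshold
```

With these numbers a blocklist hit alone never tags a message, but it can tip one that other rules already found suspicious.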

    1. Re:It's a great idea by beh · · Score: 4, Insightful

      ...unless I would send out a spam with TONS of valid links on various sites that haven't got anything to do with the rest of the spam...

      Boy - that list will be f***ed up pretty soon...

  7. A plugin? by Pranjal · · Score: 4, Interesting

    The only requirement to implement the SURBL is a plugin on your MTA such as spamassassin that can parse the body of each email.
    Anything which requires extra software on the MTA or client side is not a simple requirement as it cannot be implemented universally. This is doomed to fail.

    1. Re:A plugin? by Mateito · · Score: 4, Funny

      > extra software on the ... client side.. not a simple requirement as
      > it cannot be implemented universally

      Bollocks. Send it to random users as an encrypted zip file with the key in an attached jpeg and a title like "returned mail" from a user called "hg477d762@hotmail.com", and enough people will install it to make it effective.

      Never underestimate the stupidity of end users.

  8. Whitelist maintenance? by tepples · · Score: 4, Interesting

    From the article:

    This is a democratic effect, improved by manual de-selection of legitimate domains by SpamCop users when they submit their reports. More reports means more votes that a given site is indeed spam.

    Though the article's author feels that "most SC users probably make an effort to uncheck legitimate domains to prevent false reporting," I have read reports that some mail server admins claim that SpamCop's users are rather likely to mistakenly report ham as spam. So the domain whitelist becomes important, but what practices have the SURBL administrators put in place to prevent corruption with respect to sites reported to whitelist at surbl dot org?

  9. Then what happens when .... by Anonymous Coward · · Score: 5, Interesting

    Spammers could then post their web sites as search URLs on Google, MSN, etc. If you block those URLs then lots of people would complain that they can't send Google entries. Even if you solved that, then what happens with sites like tinyurl.com? If you block them then you have liability and legal issues to think about. No doubt the spammers will script up a number of ways to cloak the marketeers' site URLs.
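One partial answer to the search-URL trick is to look inside the query string for a second URL and check that one too. A minimal sketch, assuming the target is carried as a plain or percent-encoded query parameter (the function name is hypothetical):

```python
from urllib.parse import parse_qsl, urlsplit


def embedded_urls(uri):
    """Return URLs carried inside the query string of another URL."""
    found = []
    # parse_qsl percent-decodes each value, so encoded targets are recovered.
    for _name, value in parse_qsl(urlsplit(uri).query):
        if value.lower().startswith(("http://", "https://")):
            found.append(value)
    return found
```

This does nothing for shorteners like tinyurl.com, where the destination is not in the link at all; those would require following the redirect, which is a different and more expensive check.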

  10. DOSes and things outside of ones control by Corvar · · Score: 5, Interesting

    This type of system is very abusable.

    I know I have gotten spam reports from places like spam cop because people have included the URL of my website in their spam. My site had nothing to do with the spam other than the spammer was using an article on the site to back up his point of view.

    This type of system could very easily be abused to blackhole many mailing lists.

  11. sendmail internal RBL by mabu · · Score: 5, Informative

    A good way to start if you're running your own mailserver is to use an internal IP-based blacklist such as the one found here. It's incomplete due to Geocities limitations but send e-mail to that account and the guy running it will send you the whole file. It's a list that he's been compiling now for more than a year of IP blocks, mostly class Bs, that have virtually no useful SMTP traffic and should be completely cut off. This generally consists of the vast majority of Chinese, Korean and Brazilian DULs.

    We've been able to effectively stop about 50% of the spam using these lists and save resources and bandwidth. What's left is to start RBL'ing the domestic DUL IP space (Comcast, SWBell, Bellsouth, etc.) on a class B-level until the ISPs start cracking down on their rogue users.
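Checking a connecting address against a list of class B-sized blocks is simple to express. The sketch below uses Python's standard `ipaddress` module; the two networks are placeholders in private address space, not entries from the list mentioned above.

```python
import ipaddress

# Placeholder /16 networks in private address space, for illustration only.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("172.16.0.0/16"),
]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """True if the connecting address falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```

A real mail server would normally do this in its own access tables rather than in application code; the point here is only what "blocking on a class B level" means in practice.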

  12. Re:Spam is unavoidable by rw2 · · Score: 4, Insightful

    We can't ever have a workable spam filter because of the adaptability of spam.

    This is because the solutions of the day focus on content instead of anonymity.

    I've said it before, I'll probably say it again, get rid of unauthenticated email and the spam problem becomes a thousand times easier to fight. SPF and various RMX solutions exist in design today. If people want the spam problem to go away, that can be done today. Unfortunately people would rather piss and moan and call for legislation or perfect solutions than deal with these good ones today.

    In the case of spam the perfect is the enemy of the good enough. We should stop spam today.
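For a feel of what an SPF-style check involves, here is a heavily simplified sketch. It assumes the domain's policy record has already been fetched as a string, and it evaluates only `ip4:` mechanisms and `all`; everything else in a real record (`a`, `mx`, `include`, and so on) is ignored, so treat it as an illustration rather than a usable validator.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_ip4_check(record, client_ip):
    """Evaluate only the ip4 and all mechanisms of an SPF record string."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    address = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term.lower().startswith("ip4:"):
            if address in ipaddress.ip_network(term[4:], strict=False):
                return RESULTS[qualifier]
        elif term.lower() == "all":
            return RESULTS[qualifier]
        # Other mechanisms are skipped in this sketch.
    return "neutral"
```

For example, with the record `v=spf1 ip4:192.0.2.0/24 -all`, a connection from inside 192.0.2.0/24 passes and any other address fails, which is the kind of sender authentication the comment is asking for.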

  13. Counter-attacks are bad-- read this summary by joelparker · · Score: 5, Informative
    Counter-attacks are bad--
    check this summary of spam methods.

    http://netextend.com/junkmail

    ........

    Overview

    • What is Junk Mail?
    • Why Send Junk Mail?
    • How Bad is the Junk Mail Problem?
    • What is Needed?

    Solutions

    • Blacklists
    • Whitelists
    • Greylists
    • Adaptive Filters
    • Challenge-Response
    • Counter-Attacks
    • Tagging
    • Fake Honeypots, Tarpits, Spamholes
    • Sender Policy Framework (SPF)
    • Personal Digital Signatures
    • Internet Mail 2000 (IM2000)

    Conclusion

  14. Re:My proposed solution to spam by hacker · · Score: 4, Interesting
    1. Spam isn't primarily coming from legitimate SMTP relays like Yahoo or Hotmail
    You're kidding, right?

    At least 80% of our incoming spam, brute-force attacks, and other SMTP violations are coming from behind legitimate hosts like AOL, Verizon, Blueyonder, RoadRunner, and so on. Not forged IPs that pretend to be those hosts, but actual IPs that return to those MXs.

    Look at today's list of brute-force attacks so far.. (as of Mon Apr 12 17:55:53 EDT 2004)

    Every single one of these lists gets collected and reported, per day, per provider, and to date, not a single one of them has done anything to stop the abuse. In fact, it keeps increasing every day. The more we block, the faster they come at us.

  15. Already in use by MT-Blacklist by santiago · · Score: 5, Informative

    This exact method is the basis of the MT-Blacklist comment-spam prevention system for Movable Type-based blogs. It works wonderfully, as it identifies spam on the basis of the one feature it must have to be successful--a link back to the spammer's site.

  16. Re:Present problem. by Phroggy · · Score: 4, Insightful
    Presently the only problem with this is that there are no plug-ins for the MTAs themselves yet. The plug-in is for spamassassin. That means that the message has to be transferred and passed on to Spamassassin before it can be dropped or tagged, whereas the other RBLs allow you to drop the connection before the message is transferred. This problem will be solved once there are plug-ins for the MTAs themselves.

    Sorry, but that's not because it's a SpamAssassin plugin vs an MTA plugin. That's because the SMTP protocol doesn't allow for what you describe.

    Let's say I'm an MTA. When you connect to me, the first thing you do is introduce yourself, then tell me the envelope sender and envelope recipient of the message you're about to send, then give me the full message including headers and body. My options for blocking the message are:
    1. Before you even connect, your IP could be blocked at the firewall level, so I'd never see you.
    2. After you connect, before you introduce yourself, I have your IP address, and can check it against a blacklist and/or whitelist, and give you an error and disconnect if I don't like what I find. I can also do reverse and forward DNS queries on your IP to make sure they agree.
    3. After you introduce yourself, I can compare your greeting against your reverse DNS, since that's how you should be introducing yourself. I can give you an error if I don't like it.
    4. After you give me the envelope sender, I can make sure that domain exists, etc. (Side note: Verisign wants to break this; ICANN is currently not letting them.)
    5. After you give me the envelope recipient, I can make sure that e-mail address is OK - if it's my domain name and the username is somebody I know I'll accept it, or if it's a valid domain name somewhere else and your IP is on my LAN I'll relay it. Otherwise I can give you an error.
    6. If we've gotten this far, I must now accept the entire message, including headers and body. If there's something in the headers I don't like, too bad! If there's something in the body I don't like, too bad! I have to let you send the whole message.
    7. After I've accepted the message, if there's a problem, I can generate a bounce message to send back to you, assuming the e-mail address you gave me actually works. If that fails, I'll send an e-mail to my postmaster explaining what happened. Or if that's too annoying, I could just delete your message and not tell anyone.

    Existing RBLs work at step 2. Filtering based on message content can't happen until step 7. You could build it into the MTA, but MTAs are complex enough as it is; using something else (SpamAssassin, Procmail, whatever) is a better idea.
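The staging argument above can be made concrete by listing which checks have the information they need at each point. This sketch only models the steps as this comment describes them; the check names are hypothetical labels, and it assumes step 4 concerns the envelope sender's domain.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Points in the conversation, numbered after the list above."""
    CONNECT = 2      # only the client IP is known
    GREETING = 3     # client has introduced itself
    SENDER = 4       # assumption: step 4 is about the envelope sender
    RECIPIENT = 5    # envelope recipient given
    AFTER_DATA = 7   # full headers and body received


# What each kind of check needs before it can run (labels are hypothetical).
REQUIRES = {
    "ip_blacklist": Stage.CONNECT,
    "greeting_matches_rdns": Stage.GREETING,
    "sender_domain_exists": Stage.SENDER,
    "recipient_is_local": Stage.RECIPIENT,
    "body_uri_blocklist": Stage.AFTER_DATA,
}


def checks_available(stage):
    """Names of checks whose required information is available at this stage."""
    return sorted(name for name, needed in REQUIRES.items() if needed <= stage)
```

Laid out this way, the difference is plain: an IP blacklist can run as soon as the client connects, while a body-URI check has to wait for the whole message no matter where the code lives.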
    --
    $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
    $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
  17. Could be good, could be bad. by autopr0n · · Score: 4, Insightful

    I see one major problem with this, which is that Spammers might now be able to cause problems for legitimate websites simply by including their URL in a Spam.

    I'm a little sensitive to this since a spammer is actually joe-jobbing one of my domains (not autopr0n), and I get hundreds of "user unknown" messages every day, along with a handful of messages telling me "my" email was blocked. It's really irritating.

    But, if it's done right, it could work out pretty well. In fact, this would actually be effective against a lot of the current Spam out there, and kill Spam with off-site images.

    Anyway, let me throw one countermeasure out there. Suppose spammers start including commonly mailed URLs (such as those on hotornot, yahoo, etc) in their spams in order to decrease the usefulness of these things. If this thing gets popular, expect to see a lot of Spam include a lot of random URLs the way they now include lots of random words. You'll also start to see things like "Javascript decryption" and other techniques to prevent machines from figuring out exactly which URL is being advertised and which are just random noise.

    --
    autopr0n is like, down and stuff.