Slashdot Mirror


Stopping Spambots: A Spambot Trap

Neil Gunton writes "Having been hit by a load of spambots on my community site, I decided to write a Spambot Trap which uses Linux, Apache, mod_perl, MySQL, ipchains and Embperl to quickly block spambots that fall into the trap."

127 of 304 comments

  1. Elements of good design I'd missed by Dark+Paladin · · Score: 4, Informative

    Looking at my Day Job and personal web site, other than the very cool technical achievement of the trap (I'll have to see if I can rewrite this for my Checkpoint FW system), there was one thing I learned about good design from this article:

    Eliminate mailto - makes sense. You should have an http based "send me a message system" - force a live person to type stuff in instead of letting a program pick out addresses.

    Eliminating mailto alone would probably help with most of my spam problems (as I have my "contact me" address right on the first page).

    1. Re:Elements of good design I'd missed by hagardtroll · · Score: 5, Interesting

      I put my email address in a jpeg image. Haven't found a spambot yet that can decipher that.

    2. Re:Elements of good design I'd missed by Dark+Paladin · · Score: 2, Informative

      Good point - a large enough business can get sued (I think AOL was once) if its site isn't accessible to the blind. (Americans with Disabilities Act thing.)

    3. Re:Elements of good design I'd missed by carm$y$ · · Score: 3, Insightful

      Eliminating mailto alone would probably help with most of my spam problems

      You're 100% right. And fighting against spambots by relying on UserAgent is akin to... well.... security thru obscurity, albeit somehow in reverse.

      What also looks strange is that he doesn't consider that one can get a link directly to a page on the n-th level: as human browsers don't usually download robots.txt either, sounds like he's gonna ban some poor guys who got a link from a friend...

      --
      -- No sig today
    4. Re:Elements of good design I'd missed by jonbrewer · · Score: 2

      I've found a text file works as well. Spambots don't seem to bother loading "contact.txt".

    5. Re:Elements of good design I'd missed by British · · Score: 2

      AOL's personal ads (not that I visit that) do that already. They just use a GIF that looks just like a regular string of text. Very clever. I'm assuming there's a module out there that can do this easily on the fly?

    6. Re:Elements of good design I'd missed by Permission+Denied · · Score: 2, Informative

      I put my email address in a jpeg image. Haven't found a spambot yet that can decipher that.

      But neither could blind internet users...


      Add an alt tag that describes how to email you. Eg, "The first part of my email address is 'username' and the second part is 'host.com' - the two parts are separated by an '@' sign." I've been doing the jpeg thing for three years; works great.

    7. Re:Elements of good design I'd missed by dattaway · · Score: 2, Funny

      but...but...a blind AND deaf internet user couldn't read your webpage.

      I'm sure you don't want THAT kind of lawsuit.

    8. Re:Elements of good design I'd missed by Technician · · Score: 3, Informative

      I like the way geocaching.com handles the problem. To email a user, you have to click on a link containing the user profile. A link in the profile provides a contact user option which provides a form to fill out - if you are also a registered user of the site. If you are not a user of the site, then you are prompted to log in or become a user. If you are a user and contacting another user, there is a checkbox which, if checked, will also send your real address to the user you are contacting, so that with his permission contact may be made via regular mail. This is useful for sending graphics and attachments. The best part is your address is not given out unless you specifically permit it on a case by case basis. I love it.

      --
      The truth shall set you free!
    9. Re:Elements of good design I'd missed by nathanm · · Score: 2
      I'm assuming there's a module out there that can do this easily on the fly?
      Yes
    10. Re:Elements of good design I'd missed by BlueUnderwear · · Score: 2
      I don't think blind people would be -that- interested in a skating club...

      Dunno about skating, but blind people do ski. They are preceded by a guide who shouts them directions (or uses a wireless intercom, in order to not disturb the other skiers). Have seen such pairs several times at 2 Alpes. It must still be helluva difficult, but they manage to do it anyways.

      --
      Say no to software patents.
    11. Re:Elements of good design I'd missed by prizog · · Score: 2

      Or a blind person.

    12. Re:Elements of good design I'd missed by evil_one · · Score: 2

      It's called Braille.
      A Freshmeat search turns up quite a bit of information about using it on posix OSs.
      Here's the Linux braille tty driver.

      --
      Desperation is a stinky cologne
    13. Re:Elements of good design I'd missed by Webmoth · · Score: 2

      I put my email address in a jpeg image. Haven't found a spambot yet that can decipher that.

      The flaw: OCR.

      Try ASCII art next time. And never use the @:

      \/\/ e |3 /\/\ () + |-| (a) \/\/ e |3 /\/\ () + |-| * ( 0 /\/\

      Warning: if you spam me, you WILL be blocked. We proactively block spammers at our mail server through either the use of ipchains rules or header parsing. Our ipchains are already blocking at least a million addresses in China (only 1,277,730,500 to go).

      --
      Give me my freedom, and I'll take care of my own security, thank you.
    14. Re:Elements of good design I'd missed by Eil · · Score: 2

      - If you "hide" the links to those pages and make it obvious enough to users, then the "friend" will not have gotten the link in the first place.

      - And if a normal user accidentally gets banned, they can send an email to get unbanned.

      - And if they don't want to send an email, oh well, the occasional moron can't visit your site for a full 24 hours.

      - And there's no danger of a search engine finding those pages either, if they follow robots.txt

    15. Re:Elements of good design I'd missed by Anonymous+DWord · · Score: 2

      Put in an alt="whatever@here.com" tag - it'll get read.

      --
      "If he thinks he can hide and run from the United States and our allies, he's sorely mistaken." Bush on bin Laden
  2. /.ed by Anonymous Coward · · Score: 2, Funny

    Looks like you should've written some code to handle an overload from slashdot too!

    1. Re:/.ed by HiQ · · Score: 3, Funny

      The dude fell in his own trap. :-D

  3. Slashbot by Ctrl-Z · · Score: 3, Funny


    "I have a truly marvelous demonstration of this proposition which this bandwidth is too narrow to transmit."

    --
    www.timcoleman.com is a total waste of your time. Never go there.
  4. Block? Are you kidding? by Anonymous Coward · · Score: 5, Interesting

    Why on Earth would you like to block a spambot? So it doesn't get any more useful addresses?
    No way, man.
    If you realize you're serving to a bot, go on serving. Each time the bot follows the "next page" link, you /give/ it a next page. With a nicely formatted word1word2num1num2@word1word2.com, where words and nums are random.
    Give it thousands, millions of addresses this way.
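
    A minimal sketch of that kind of bot fodder, assuming a small made-up word list (WORDS, fakeAddress and fakePage are hypothetical names, not anything from the article). It only shows the word1word2num1num2@word1word2.com shape; serving it as an endless chain of "next page" links is left out.

```javascript
// Hypothetical word list; any list of plausible-looking words would do.
const WORDS = ["blue", "river", "stone", "cloud", "maple", "tiger", "north", "amber"];

// Pick a random element of a list; rand() must return a float in [0, 1).
function pick(list, rand) {
  return list[Math.floor(rand() * list.length)];
}

// Build one bogus address shaped like word1word2num1num2@word1word2.com.
function fakeAddress(rand = Math.random) {
  const digit = () => Math.floor(rand() * 10);
  const local = pick(WORDS, rand) + pick(WORDS, rand) + digit() + digit();
  const domain = pick(WORDS, rand) + pick(WORDS, rand) + ".com";
  return local + "@" + domain;
}

// Build the body of one "next page": many bogus addresses, one per line.
function fakePage(count, rand = Math.random) {
  const lines = [];
  for (let i = 0; i < count; i++) lines.push(fakeAddress(rand));
  return lines.join("\n");
}
```

    Passing the random source as a parameter is only there so the output can be made repeatable when trying it out.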

  5. http-referrer by sofar · · Score: 2


    hmm, just a wild guess, but does this technique involve using the http-referrer to see if there are too many clients coming from just a particular address (which would obviously be a *bad* thingy), and subsequently block them too?

    might explain why we can't see it no more :-(

    I want it too!!! it seems to work pretty good!

    1. Re:http-referrer by DutchSter · · Score: 2, Interesting

      No. The point the author made was that good bots follow the 'robots.txt' standard. A versatile program like this can differentiate. If a robot comes in and plays by the rules on robots.txt, it's welcomed. OTOH, if one comes in and just starts grabbing at everything, it will quickly find itself blocked.

      I believe the exact quote in regards to why robots.txt should still be used is: "Most bad spambots don't even check the robots.txt file, so this is mainly for protection of the good bots."
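
      A rough sketch of that rule, not the author's actual mod_perl code: anything that requests a path robots.txt disallows gets flagged. The prefixes, the request records and the function names are all assumptions made up for illustration.

```javascript
// Hypothetical trap prefixes, i.e. the paths listed under "Disallow:" in robots.txt.
const DISALLOWED = ["/trap/", "/cgi-bin/"];

// True when a request path falls under one of the disallowed prefixes.
function isDisallowed(path, prefixes = DISALLOWED) {
  return prefixes.some((prefix) => path.startsWith(prefix));
}

// Given { ip, path } request records, return the set of IPs that entered a disallowed area.
function offenders(requests, prefixes = DISALLOWED) {
  const banned = new Set();
  for (const { ip, path } of requests) {
    if (isDisallowed(path, prefixes)) banned.add(ip);
  }
  return banned;
}
```

      A well-behaved bot never shows up in the result because it never requests those paths in the first place.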

      Another thing I find appealing is that on a large enough system the DB could be shared amongst several servers to provide common protection for all. I've always taken a don't put an address on the page approach, but it's cool to see someone looking at how these bots operate from a technical standpoint.

      Some ISPs (like mine) have policies against SPAM that stipulate that in addition to not actually spamming people, using their resources to prepare/collect addresses to SPAM is just as bad. The advantage the database gives you is that you can track the most recent offenders. A quick lookup to who owns the address, with hard evidence of one of their subscribers abusing both your system, and their policy will, if nothing else, cause the cost of spamming to rise. The reason SPAM is so popular is because it is VERY cheap to do. Once its costs approach those of 'traditional' marketing, things might get a bit more selective rather than sending my three year old '1-3 inches in 6 weeks!','Stop paying for cable', or 'Get out of debt now!' messages. Hardly directed.

      (Now I don't want anyone marketing to my three year old, but I know it will happen so I'd like to at least think they would be reasonable things, perhaps a bit relevant)

  6. How I track spammers using PHP by Elkman · · Score: 5, Interesting
    I did something rather low-tech: I created a "Contact Us" page on my web server that has an automatically-generated address at the bottom. It says, "Note: The address spamtest.1018617636@example.com is not a valid contact address. It's just here to catch spammers." The number is actually the current UNIX timestamp, so I know exactly who grabbed this mail address and sent me mail.

    As it turns out, I really haven't received that much mail to this address. About the only mail I've ever received to it is someone from trafficmagnet.net, who tells me that I'm not listed on a few search engines and that I can pay them to have my site listed. I need to send her a nasty reply saying that I don't care about being listed on Bob's Pay-Per-Click Search Engine, and that if she had actually read the page, she would have noticed that she was sending mail to an invalid address. Besides, the web server is for my inline skate club and we don't have a $10/month budget to pay for search engine placement.

    I think I've received more spam from my Usenet posting history, from my other web site, and from my WHOIS registrations than I've received from the skate club web site.
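
    For what it's worth, the timestamp trick looks roughly like this. It is a sketch in JavaScript rather than the PHP the page actually uses, and trapAddress and harvestTime are hypothetical names; matching the recovered time against the web server's access log is the part that identifies the visitor.

```javascript
// Build the trap address for a given moment (milliseconds since the epoch).
function trapAddress(nowMs = Date.now()) {
  const unixSeconds = Math.floor(nowMs / 1000);
  return "spamtest." + unixSeconds + "@example.com";
}

// Recover the harvest time from a trap address, or null if it is not one.
function harvestTime(address) {
  const match = /^spamtest\.(\d+)@example\.com$/.exec(address);
  return match ? new Date(Number(match[1]) * 1000) : null;
}
```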

  7. Hammered already.... by cswiii · · Score: 5, Funny

    From the website:
    The Problem: Spambots Ate My Website

    s/Spambots/Slashdot/

    1. Re:Hammered already.... by Technician · · Score: 2

      OK, who is the turkey hunting and clicking the 1 pixel graphic?

      --
      The truth shall set you free!
  8. re: spidertrap by blibbleblobble · · Score: 4, Interesting

    My PHP spider-trap - See an infinity of email addresses and links in action!

  9. Re:problem with not giving an email address ... by wmoore · · Score: 2, Insightful


    The only problem with the idea of using entirely http based "send me a message systems" is that some people, like myself, would much rather have an actual email address to use instead of having to use 50 different layouts and 50 different configurations and 50 different methods of communicating with someone or a company. Every html based contact system has its own quirks and problems; I'd rather just need to learn my email program's issues instead.

  10. removing mailto: a bad solution by bluGill · · Score: 5, Interesting

    Removing mailto: links is a bad solution to the problem. It might be the only solution, but it is bad.

    I hate the editor in my web browser. No spell check (and a quick read of this message will prove who diasterious that is to me), not good editing ability, and other problems. By contrast my email client has an excellent editor, and a spell checker. Let me pull up a real mail client when I want to send email, please!

    In addition, I want people to contact me, and not everyone is computer literate. I hang out in antique iron groups, I expect people there to be up on the latest in hot tube ignition technology, not computer technology. To many of them computers are just a tool, and they don't have time to learn all the tricks to make it work, they just learn enough to make it do what they want, and then ignore the rest. Clicking on a mailto: link is easy and does the right thing. Opening up a mail client, and typing in some address is error prone at best.

    Removing mailto: links might be the only solution, but I hope not. So I make sure to regualrly use spamcop.

  11. Re:Block? Are you kidding? by f3lix · · Score: 5, Interesting

    This isn't such a good idea - for every random (non-existent) domain that you generate, a root DNS server will be queried when an email is sent to this address, which increases the load on the root servers, which is generally a bad thing. How about instead, returning pages with the email address abuse@domain-that-spambot-is-coming-from all over them...

  12. Similar to how the new ORBZ works? by Masem · · Score: 4, Interesting

    After the Battle Creek incident with ORBZ, the maintainer changed the way it worked; instead of being pro-active on checking for open relays, he now has a 'honeypot' like system built around a unique email address that isn't directly visible on the site but still may be harvested by a spam bot. Any server that sends email to that address is automatically added to The List. Mail server admins that believe that they should not be on this list can argue their case to remove their server.

    --
    "Pinky, you've left the lens cap of your mind on again." - P&TB
    "I can see my house from here!" - ST:
    1. Re:Similar to how the new ORBZ works? by toupsie · · Score: 4, Interesting
      he now has a 'honeypot' like system where a unique email address that isn't directly visible on the site but still may be harvested by a spam bot. Any server that sends email to that address is automatically added to

      This is the same method I have been using for a while. I have an e-mail account called "cannedham" that I had posted on several web sites as a mailto: anchor on a 1x1 pixel graphic. Any e-mail sent to that address updates my Postfix header_checks file to protect the rest of my accounts. It works like a charm.

      --
      Strange women lying in ponds distributing swords is no basis for a system of government.
  13. Re:problem with not giving an email address ... by Luyseyal · · Score: 2

    But, if you send him the message once with your return address, he'll know you're for real and when he replies you can use your regular mailer.

    $0.02USD,
    -l

    --
    Help cure AIDS, cancer, and more. Donate your unused computer time to worldcommunitygrid.org. Join Team Slashdot!
  14. Take a look in the mirror by Spackler · · Score: 5, Informative
  15. A tip by anthony_dipierro · · Score: 5, Informative

    Here's a tip for those of you writing spambot traps... How about not blindly responding to the faked Return-Path address?

    Now that should be illegal. You people whine about your 10 spams a day, try 10,000 from 2000 different email addresses. Idiot postmasters should be caught and jailed.

    1. Re:A tip by RollingThunder · · Score: 2

      That's not usually the spambot trap, it's usually the MTA, when the spammer sends to an invalid address.

      Although, the MTA would be looking at the envelope sender if it's any good, but most of the time those are faked too.

    2. Re:A tip by anthony_dipierro · · Score: 2
      Count Microsoft in that list of crappy MTAs...

      Received: from cpimssmtpa20.msn.com ([65.31.179.139]) by cpimssmtpa41.msn.com with Microsoft SMTPSVC(5.0.2195.4905);
      Thu, 11 Apr 2002 15:10:19 -0700
      Message-Id: <5n5k5movjtde50vvgss.io387p65vg1v88f1@cpimssmtpa20.msn.com>
      From: [NOT MY ADDRESS]@usa.net
      Date: Thu, 11 Apr 2002 15:08:40 -0800
      Subject: PERSCRIPTIONS! CHEAP AND PRIVATE!!
      Content-Type: text/html;
      charset="iso-8859-1"
      Content-Transfer-Encoding: 8BIT
      X-Mailer: Mozilla 4.08 [en] (Win98; I)
      To: [NOT MY ADDRESS]@xmail.com
      CC: [NOT MY ADDRESS]@msn.com
      Return-Path: [MY ADDRESS!]
      X-OriginalArrivalTime: 11 Apr 2002 22:10:20.0586 (UTC) FILETIME=[AB7A58A0:01C1E1A5]

      Freaking mke-65-31-179-139.wi.rr.com is sending the mail (I hid the rest of the addresses, which are most likely innocent), and I get the damn (well over 10,000) bounces. That's what I get for being publicly against spam laws on slashdot, I guess. I wonder how hard it would be to subpoena the name and address of the original sender...

  16. he suggests formmail, another spam tool by nwc10 · · Score: 5, Informative
    Interestingly within the article he suggests hiding your e-mail addresses by making a feedback page. One of the programs that he suggests is formmail, and he links to Matt's original version.

    formmail itself (even the most recent version) can still be abused by spammers to use your webserver as a bulk mail relay - see the advisory at
    http://www.monkeys.com/anti-spam/formmail-advisory.pdf

    It's a shame he didn't suggest the more robust formmail replacement at nms which is maintained, and attempts to close all the known bugs and insecurities.

    1. Re:he suggests formmail, another spam tool by KjetilK · · Score: 2
      Yeah, I had a funny incident where my address was put in the From:-field of a pr0n-spam sent using a formmail exploit. I quickly made an autoresponder to the people who complained to me, but it turned out to be just a handful of people (who I then took the opportunity of educating about many things).

      But there are later versions of formmail that are patched, aren't there?

      --
      Employee of Inrupt, Project Release Manager and Community Manager for Solid
    2. Re:he suggests formmail, another spam tool by Chagrin · · Score: 2

      Yes! It's becoming a popular target for spammers. If you have formmail in a common location (like mysite.com/cgi-bin/formmail.pl) it will be eventually scanned for and picked up.

      I've seen it happen to sites I administer a number of times in the past, where individuals apparently using some sort of AOL name harvesting tool were using the formmail.pl scripts to send mass messages. Looking at the User-Agent headers, it looks like there's a VB script out there designed specifically to automate this exploit.

      --

      I/O Error G-17: Aborting Installation

  17. Re:Block? Are you kidding? by BlueUnderwear · · Score: 3, Insightful
    At first glance this might be a good idea but this will be a resource burden on your system.

    Add a couple of sleep(20); into the cgi script that generates the bot fodder. The bot will still stay busy waiting for your webserver's response, but your script will consume exactly zero resources.

    For additional kicks, set up a DNS teergrube.

    --
    Say no to software patents.
  18. Removing the Mailto: may not be the best plan.. by liquidsin · · Score: 5, Interesting

    I've found that a lot of people just won't send email if there's not a link to facilitate it. I've become rather fond of using javascript to write the address to the page. Spambots read the source so they don't piece the address together but *most* browsers will still do it right. Just use something like:

    <script>document.write("<A CLASS=\"link\" HREF=\"mailto:" + "myname" + String.fromCharCode(64) + "mydomain" + "\">mail me<\/A>");</script>

    Seems to work fine. Anyone know of any reason it shouldn't, or have any other way to keep down spam without totally removing the Mailto: ? I know this won't work with *every* browser, but it beats totally removing mail links. And I don't think spammers can get it without having a human actually look at the page...

    --
    do not read this line twice.
    1. Re:Removing the Mailto: may not be the best plan.. by bero-rh · · Score: 2

      This also makes it invisible to anyone who disabled JavaScript, and anyone using a browser that doesn't do JavaScript (lynx, links, etc.)

      --
      This message is provided under the terms outlined at http://www.bero.org/terms.html
    2. Re:Removing the Mailto: may not be the best plan.. by liquidsin · · Score: 4, Interesting

      hell, go one step further:

      <img src="myemailaddress.jpg" alt="me at domain dot com">

      that way people who use browsers that speak (ie. the blind) would still hear your address correctly, so long as spambots don't start to pick up on the spelling out of "at" and "dot".

      --
      do not read this line twice.
    3. Re:Removing the Mailto: may not be the best plan.. by Kiaser+Zohsay · · Score: 2

      In the description of the trap, the author has a warning page just in case a real user hits one of the bogus links. That page would also benefit from a handy javascript history.go(-1). You might consider an HTTP redirect header, but the bot might be smart enough to follow that.

      --
      I am not your blowing wind, I am the lightning.
    4. Re:Removing the Mailto: may not be the best plan.. by e_n_d_o · · Score: 3, Interesting

      On my company's Web site we've had success with this technique. The addresses posted on the Web site have not received any significant amount of spam. I have yet to see a single spam message that hits all four of the addresses on our contact page at once, which I believe would be a likely indicator we've been hit by a spambot.

      We embed this JavaScript code on each page that needs mailtos:

      <script type="text/javascript" language="JavaScript1.3">
      // Anti e-mail address harvester script.
      function n_mail(n_user) {
      self.location = "mailto:" + n_user + "@" + "yourdomain" + "." + "com";
      }
      </script>

      And then make email address links of this form:

      <a href="javascript:n_mail('foo');">foo<!-- antispam -->@<!-- antispam -->yourdomain<!-- antispam -->.<!-- antispam -->com<!-- antispam --></a>

      Our addresses even show up correctly in lynx, but are "clickable" only in JavaScript-enabled browsers.

      Of course, it's probably only a matter of time before spambots can compensate for this code. A more secure approach would be to put email addresses "components" in borderless cells of tables, or as a previous poster suggested, in images.

    5. Re:Removing the Mailto: may not be the best plan.. by liquidsin · · Score: 2

      that's what I did for my company's site as well. I have a linked src file with variables (orders = "steve" or what have you) and then just use a document.write( eval(orders) + String.fromCharCode(64)... and so on as I stated before. I definitely like your idea for borderless tables though. Between javascript, borderless tables, images, and the alt property, we should be able to keep harvesting bots confused. I'm sure there's a way to use the summary property of table tags for more fun, but I'm too tired to figure it out right now (it's 4pm on a friday...)

      --
      do not read this line twice.
  19. Similar setup without SQL requirements by bero-rh · · Score: 4, Interesting

    My setup (catches some of the more commonly used spambots) uses mod_rewrite to send spammers to a trap.
    Setup details at http://www.bero.org/NoSpam/isp.php

    --
    This message is provided under the terms outlined at http://www.bero.org/terms.html
  20. Another way to stop spambots by PanBanger · · Score: 3, Funny

    Have your page linked on slashdot! Page gets slashdotted, problem solved.

  21. Removing email addresses by Mr_Silver · · Score: 2
    I used a very nifty bit of javascript which masks your mailto address. Provided the person has javascript on (and let's face it, nearly everyone who doesn't read /. does) then it works well.

    You can generate the code for your own email address here or, if you want some source code, then you can find an implementation of it here.

    --
    Avantslash - View Slashdot cleanly on your mobile phone.
  22. Simple solution! by Balinares · · Score: 3

    1) Put a link such as: mailto:dedicatedaddress@wherever.com?Subject= [Question] About your site (or whatever)
    2) Trash any email sent to dedicatedaddress that doesn't have the [Question] tag in the subject.

    Hope this helps.
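
    The two steps above can be sketched as follows, assuming the filter runs somewhere that can see the subject line. TAG, mailtoLink and keepMessage are hypothetical names. Percent-encoding the subject avoids relying on the browser to cope with the raw space and brackets.

```javascript
const TAG = "[Question]";

// Build the mailto: URL, percent-encoding the subject so spaces and brackets survive.
function mailtoLink(address, subject) {
  return "mailto:" + address + "?subject=" + encodeURIComponent(TAG + " " + subject);
}

// Keep a message only if its subject still carries the tag.
function keepMessage(subject) {
  return subject.includes(TAG);
}
```

    Checking with includes rather than a prefix match means replies such as "Re: [Question] ..." still get through.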

    --

    -- B.
    This sig does in fact not have the property it claims not to have.
    1. Re:Simple solution! by c=sixty4 · · Score: 3, Insightful
      1. Put a link such as: mailto:dedicatedaddress@wherever.com?Subject= [Question] About your site (or whatever)
      2. Trash any email sent to dedicatedaddress that doesn't have the [Question] tag in the subject.
      Congratulations. You just ensured you can't be emailed by anyone not running Internet Explorer.
      --
      "The good die first." "Most of us are morally ambiguous, which explains our random dying patterns." --- MST3K
    2. Re:Simple solution! by fanatic · · Score: 3, Informative

      Congratulations. You just ensured you can't be emailed by anyone not running Internet Explorer.

      This seems to work fine (the window comes up with the right email address in the to: line and the '[Question]' tag in the subject: line) in Netscape 4.76

      and Lynx Version 2.8.3rel.1

      and Mozilla 0.9.7, which implies Netscape 6.x, and Galeon will work as well, though I haven't tested these.

      --
      "that's not encryption - it's a new perl script that I'm working on..." - from some Matrix parody
  23. Re:Block? Are you kidding? by BlueUnderwear · · Score: 5, Funny
    - for every random (non-existent) domain that you generate, a root DNS server will be queried when an email is sent to this address, which increases the load on the root servers, which is generally a bad thing.

    Why is this a bad thing? They are owned by Verisign.

    How about instead, returning pages with the email address abuse@domain-that-spambot-is-coming-from all over them...

    This is also a good idea. In fact, I have a script which does a traceroute to the IP of the bot, and then looks up the admin contact using whois for the last couple of hops, and returns these. Oh, and for additional fun, throw in a couple of addresses of especially loved "friends"...

    --
    Say no to software patents.
  24. A better solution: obfuscate the mailto: link by rsidd · · Score: 5, Insightful

    Write some of your email address using html code for the ascii characters, like &#114; for "r".
    (Yes, I've posted about this before, but it does work for me.) Browsers render it so users get the address they want, but spambots try to grab it from the raw html and get something meaningless.
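
    A small sketch of the encoding step, with hypothetical function names. It encodes every character rather than just some of them, which is a simplification; a harvester that decodes character references will still read the address.

```javascript
// Encode every character of a string as a decimal HTML character reference.
function entityEncode(text) {
  return Array.from(text, (ch) => "&#" + ch.codePointAt(0) + ";").join("");
}

// Build a mailto link whose address never appears literally in the HTML source.
function obfuscatedMailto(address) {
  const encoded = entityEncode(address);
  return '<a href="mailto:' + encoded + '">' + encoded + "</a>";
}
```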

    1. Re:A better solution: obfuscate the mailto: link by Sangui5 · · Score: 5, Interesting

      Some spambots will render that correctly. They're less likely, though, to render an email address that has had this done to it: it's encrypted through javascript.

      It is a rather impressive piece of work. Uses honest-to-god RSA.

      You could also encrypt all email addresses, and then in your spambot trap, put really really CPU intensive javascript. You'll win either way: either the spambot doesn't do javascript, and it won't get your addresses, or it does do javascript, and they've just spent an eternity wasting time. It would work the same way as a tarpit, but it wouldn't eat nearly so many resources on your end.

      If you're really clever, you could have the javascript do useful work, and then have the results of that work encoded into links in the page. You could then retrieve the results when the spider follows the link.

      There was an idea called hashcash floating around a while back. The idea was that an SMTP server would refuse to deliver email if the sender didn't provide a hash collision of so many bits to some given value. The sender has to expend asymmetrically more resources to generate the collision than it takes the receiver to check it. That way one can impose a cost on sending a lot of email. It's not so much as to be a burden on ordinary users, but if you need to send thousands of emails, it will add up.
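
      A hashcash-style sketch of that asymmetry for Node.js. Real hashcash defines its own stamp format; the "challenge:nonce" string, the choice of SHA-256 and the function names here are assumptions for illustration only. The sender searches for a nonce whose hash starts with a given number of zero bits, and the receiver checks it with a single hash.

```javascript
const crypto = require("crypto");

// Count the leading zero bits of a hash given as a Buffer.
function leadingZeroBits(buf) {
  let bits = 0;
  for (const byte of buf) {
    if (byte === 0) { bits += 8; continue; }
    bits += Math.clz32(byte) - 24; // clz32 works on 32 bits; a byte occupies the low 8
    break;
  }
  return bits;
}

// Receiver side: one hash is enough to check a stamp.
function verify(challenge, nonce, bits) {
  const digest = crypto.createHash("sha256").update(challenge + ":" + nonce).digest();
  return leadingZeroBits(digest) >= bits;
}

// Sender side: brute-force a nonce; the expected work is about 2^bits hashes.
function mint(challenge, bits) {
  for (let nonce = 0; ; nonce++) {
    if (verify(challenge, nonce, bits)) return nonce;
  }
}
```

      Each extra bit roughly doubles the sender's work while the receiver's check stays at one hash, which is where the per-message cost comes from.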

    2. Re:A better solution: obfuscate the mailto: link by Pathwalker · · Score: 2

      I use a little trick that combines both of those techniques.
      It's a little block of RXML that defines a tag called cloak. You use it like this:

      <cloak email='foo@pathwalker.org' />

      If Roxen determines that the client is a robot, or it can't identify what the client is, then they get a graphic.

      If they are detected as a normal webbrowser, then they get a partially entity encoded address.

      If anyone uses Roxen as their server it might be of some use.

  25. Re:Block? Are you kidding? by cperciva · · Score: 4, Interesting

    Add a couple of sleep(20); into the cgi script that generates the bot fodder. The bot will still stay busy waiting for your webserver's response, but your script will consume exactly zero resources.

    Zero resources, except for memory.

    A much better solution would be to point the bot at a set of "servers" with IP addresses where you're running a stateless tarpit.

  26. my spambot trap by romco · · Score: 4, Informative

    The page is already slashdotted. Here is a little
    script that traps bots (and others) that use your robots.txt
    to find directories to look through. Requires an .htaccess file with mod_rewrite turned on

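    robots.txt
    #################

    User-agent: *
    Disallow: /dont_go_here
    Disallow: /images
    Disallow: /cgi-bin

    dont_go_here/index.php
    ############

    <?php
    // Anything that requests this page ignored the Disallow rule above.
    $now = date("h:ia m/d/Y");
    $IP = getenv('REMOTE_ADDR');
    // REMOTE_HOST is often unset (it needs hostname lookups), so fall back to the IP.
    $host = getenv('REMOTE_HOST') ?: $IP;
    $your_email_address = 'you@whatever';

    // Escape the dots and anchor both ends so that 1.2.3.4 does not also match 1.2.3.40.
    $ban_code =
    "\n".
    '# '."$host banned $now\n".
    'RewriteCond %{REMOTE_ADDR} ^'.preg_quote($IP).'$'."\n".
    'RewriteRule ^.*$ denied.html [L]'."\n\n";

    // Append the ban to the .htaccess file (mod_rewrite must already be turned on there).
    $fp = fopen("/path/to/.htaccess", "a");
    fwrite($fp, $ban_code);
    fclose($fp);

    mail($your_email_address, "Spambot Whacked!", "$host banned $now\n");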

    --
    AdFuel
    1. Re:my spambot trap by Captain+Large+Face · · Score: 2

      How about rewriting denied.html each time to contain a list of e-mail addresses in the format:

      abuse@banned_host

      That way, the spammers might actually spam their own ISP's abuse account. Now THAT would be funny! :-)

    2. Re:my spambot trap by romco · · Score: 2

      "How about rewriting denied.html each time to contain a list of e-mail addresses in the format:
      abuse@banned_host"

      I do something like that..

      Denied!

      --
      AdFuel
  27. Re:Block? Are you kidding? by Ralp · · Score: 3, Informative
    Wpoison does this.

    From the website: Wpoison is a free tool that can be used to help reduce the problem of bulk junk e-mail on the Internet in general, and at sites using Wpoison in particular.

    It solves the problems of trapped spambots sucking up massive bandwidth/CPU time, as well as sparing legitimate spiders (say, google) from severe confusion.

  28. Re:Block? Are you kidding? by gclef · · Score: 3, Interesting

    Actually, I've done this w/a bot trap on my site at home. It's a perl script that generates a bunch of weird-sounding text w/some fake email addresses at the bottom and a bunch of database-query-looking links back to the original page.

    The bots don't fall for it anymore. Some dorks in Washington state decided to make a couple requests a second to it once, but in the two years I've had it up, they're the only ones.

  29. Other options.. by primetyme · · Score: 4, Informative

    A pretty good article, but being able to install modules into Apache may not be the best situation for everyone who wants to stop Spambots..

    Shameless plug, but I've got an ongoing series in the Apache section of /. that deals with easy ways that administrators *and* regular users can keep Spambots off their sites:
    Stopping Spambots with Apache
    and
    Stopping Spambots II - The Admin Strikes Back

    Just some more options and choices to help people out!

  30. Re:Block? Are you kidding? by liquidsin · · Score: 2

    I like that idea...look up the originating host, and make links back to abuse@, root@, webmaster@, and whatever else you can think of. Clog their mailservers. The problem is, it would be simple enough (if it's not already in place) to have your spam bot ignore addresses for your own domain.

    --
    do not read this line twice.
  31. Re:Block? Are you kidding? by Martin+S. · · Score: 2

    Give it thousands, millions of addresses this way.

    Liberally sprinkled postmaster@127.0.0.1 and abuse@127.0.0.1.

  32. using images is bad for people with text browsers by hsenag · · Score: 2, Insightful

    If you use images for email addresses, what are people using text browsers supposed to do? Even worse is using them on the "warning" pages - someone with a text browser would have no idea what the image said and therefore nothing to stop them falling into the trap and getting firewalled.

    And of course if he uses ALT text for the images, then he has the same problem he was trying to avoid, of creating something the spambots can read.

  33. Re:Block? Are you kidding? by boky · · Score: 5, Interesting

    I agree. And, come on, how much technology do you need?

    This is my solution to stopping spambots. It's a Java servlet and I am posting it here to prevent my company's site from being slashdotted. It does not prevent the spammer from harvesting emails; it just slows them down.... a lot :) If everyone had a script like this, spambots would be unusable.

    Feel free to use the code in any way you please (LGPL like and stuff)

    Put robots.txt in your root folder. Content:

    User-agent: *
    Disallow: /members/

    Put StopSpammersServlet.java in WEB-INF/classes/com/parsek/util:

    package com.parsek.util;
    import java.io.File;
    import java.io.StringWriter;
    import javax.servlet.ServletContext;
    import java.net.URL;
    import java.util.Enumeration;
    import java.lang.reflect.Array;
    public class StopSpammersServlet extends javax.servlet.http.HttpServlet {
    private static String[] names = { "root", "webmaster", "postmaster", "abuse", "abuse", "abuse", "bill", "john", "jane", "richard", "billy", "mike", "michelle", "george", "michael", "britney" };
    private static String[] lasts = { "gates", "crystal", "fonda", "gere", "crystal", "scheffield", "douglas", "spears", "greene", "walker", "bush", "harisson" };
    private String[] endns = new String[7];
    private static long getNumberOfShashes(String path) {
    int i = 1;
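    // getPathInfo() returns null for a request with no extra path, so guard against it.
    java.util.StringTokenizer st = new java.util.StringTokenizer(path == null ? "" : path, "/");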
    while(st.hasMoreTokens()) { i++; st.nextToken(); }
    return(i);
    }
    // Respond to HTTP GET requests from browsers.
    public void doGet (javax.servlet.http.HttpServletRequest request,
    javax.servlet.http.HttpServletResponse response)
    throws javax.servlet.ServletException, java.io.IOException {
    // Set content type for HTML.
    response.setContentType("text/html; charset=UTF-8");
    // Output goes to the response PrintWriter.
    java.io.PrintWriter out = response.getWriter();
    try {
    ServletContext servletContext = getServletContext();
    endns[0] = "localhost";
    endns[1] = "127.0.0.1";
    endns[2] = "2130706433";
    endns[3] = "fbi.gov";
    endns[4] = "whitehouse.gov";
    endns[5] = request.getRemoteAddr();
    endns[6] = request.getRemoteHost();
    String query = request.getQueryString();
    String path = request.getPathInfo();
    out.println("<html>");
    out.println("<head>");
    out.println("<title>Members area</title>");
    out.println("</head>");
    out.println("<body>");
    out.println("<p>Hello random visitor. There is a big chance you are a robot collecting mail addresses and have no place being here.");
    out.println("Therefore you will get some random generated email addresses and some random links to follow endlessly.</p>");
    out.println("<p>Please be aware that your IP has been logged and will be reported to proper authorities if required.</p>");
    out.println("<p>Also note that browsing through the tree will get slower and slower and gradually stop you from spidering other sites.</p>");
    response.flushBuffer();
    long sleepTime = (long) Math.pow(3, getNumberOfShashes(path));

    do {
    String name = names[ (int) (Math.random() * Array.getLength(names)) ];
    String last = lasts[ (int) (Math.random() * Array.getLength(lasts)) ];
    String endn = endns[ (int) (Math.random() * Array.getLength(endns)) ];
    String email= "";

    double a = Math.random() * 15;
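    // The original chain of "if (a < ...)" branches was lost when this was posted
    // (everything after each "<" was eaten). The three branches below are a guess
    // at the intent: pick one of a few local-part styles based on a.
    if (a < 5) email = name;
    else if (a < 10) email = name + "." + last;
    else email = name.charAt(0) + last;
    email = email + "@" + endn;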

    out.print("<a href=\"mailto:" + email + "\">" + email + "</a><br>");
    response.flushBuffer();

    Thread.sleep(sleepTime);

    } while (Math.random() < 0.9); // threshold lost in posting; 0.9 is a guess
    out.print("<br>");
    do {
    int a = (int) (Math.random() * 1000);
    out.print("<a href=\"" + a + "/\">" + a + "</a> ");
    Thread.sleep(sleepTime);
    response.flushBuffer();
    } while (Math.random() < 0.9); // threshold lost in posting; 0.9 is a guess
    out.println("</body>");
    out.println("</html>");

    } catch (Exception e) {
    // If an Exception occurs, return the error to the client.
    out.write("<pre>");
    out.write(e.getMessage());
    e.printStackTrace(out);
    out.write("</pre>");
    }
    // Close the PrintWriter.
    out.close();
    }
    }

    Put this in your WEB-INF/web.xml

    <servlet>
    <servlet-name>stopSpammers</servlet-name>
    <servlet-class>com.parsek.util.StopSpammersServlet</servlet-class>
    </servlet>
    <servlet-mapping>
    <servlet-name>stopSpammers</servlet-name>
    <url-pattern>/members/*</url-pattern>
    </servlet-mapping>

    Here you go. No PHP, no Apache, no MySQL, no Perl, just one servlet container.
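
    The slowdown in the servlet comes from one line: the delay between links is 3 raised to the number of path segments plus one, in milliseconds. A small Python sketch of just that arithmetic follows. The cap_ms parameter is an addition that is not in the servlet; it is an assumption meant to stop a very deep path from pinning a worker thread indefinitely.

```python
def path_depth(path_info):
    """Count the non-empty segments of the path below the trap root."""
    if not path_info:
        return 0
    return sum(1 for segment in path_info.split("/") if segment)


def sleep_millis(path_info, base=3, cap_ms=60_000):
    """Delay between emitted links: base ** (depth + 1) ms, capped at cap_ms."""
    return min(base ** (path_depth(path_info) + 1), cap_ms)


if __name__ == "__main__":
    for path in (None, "/12/", "/12/34/", "/1/2/3/4/5/6/7/8/9/"):
        print(path, sleep_millis(path))
```

    With base 3 the delay passes one second at a depth of six segments, so a spider that keeps following the numbered links stalls fairly quickly.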

    Ciao

    --
    boky
  34. Re:Block? Are you kidding? by richie2000 · · Score: 3, Informative
    Wpoison basically does that; it serves a page with bogus addresses and adds a nasty delay between pages, keeping the spider occupied.

    However, the instructions for installing Wpoison more or less assume that one has a single website to protect. I have around 20 virtual hosts. So instead of creating a renamed cgi-bin in every DocumentRoot, I added a single

    ScriptAlias /runme/ "/var/www/cgi-bin/"

    to httpd.conf and then linked it like this:

    <A HREF="/runme/addresses.ext"><IMG SRC="pixel.gif" BORDER=0></A>

    I also added a single transparent pixel to the link to keep it invisible but still fool the spiders. Add the runme directory as excluded in the robots.txt and you should be on your way. Muhahahah, and so on.
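
    One way to double-check that well-behaved crawlers will stay out of the trap directory is to run the robots.txt rules through Python's standard urllib.robotparser. This is only a sketch: the ROBOTS_TXT contents below are an assumed example matching the /runme/ alias, not a copy of anyone's real file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /runme/
"""


def is_allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if a crawler honoring robots_txt may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/runme/addresses.ext", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, path))
```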

    --
    Money for nothing, pix for free
  35. Another Method by Captain+Large+Face · · Score: 2

    How about sending a parameter to a page which redirects to the mailto: protocol?

    For example:

    index.html

    <a href="filename.php?x=info">E-Mail Me</a>

    filename.php

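    <?php
    // Read the parameter explicitly and keep only safe characters, so it cannot inject headers.
    $x = preg_replace('/[^A-Za-z0-9._-]/', '', isset($_GET['x']) ? $_GET['x'] : '');
    header("Location: mailto:" . $x . "@mydomain.tld");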
    ?>
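
    The same redirect idea, sketched in Python to show the one check that matters: the parameter ends up inside an HTTP header, so anything outside a small character whitelist should be refused. The function name mailto_location and the whitelist itself are assumptions for illustration.

```python
import re

# Letters, digits, dot, underscore and hyphen only; no CR/LF can get through.
_SAFE_LOCAL_PART = re.compile(r"[A-Za-z0-9._-]{1,64}")


def mailto_location(local_part: str, domain: str = "mydomain.tld") -> str:
    """Return the Location header value, or raise ValueError on odd input."""
    if not _SAFE_LOCAL_PART.fullmatch(local_part):
        raise ValueError("unexpected characters in mailbox name")
    return f"mailto:{local_part}@{domain}"


if __name__ == "__main__":
    print("Location: " + mailto_location("info"))
```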

  36. Take this one step further... by Jason+Levine · · Score: 4, Interesting

    There's a spam-blacklist, so how about a spambot-blacklist?

    You'd have a standardized spambot trap (like the one described in the article) on various webservers. The new spambot info could go into a "New SpamBots" database (which wouldn't be blocked). Once a day, the webserver would connect up with a central database and submit the new spambot info it's obtained. Then the server would download a mirror of the updated "SpamBots" database which it would use to block spambots.

    The centralized SpamBots database would take all of the new SpamBot info every day and analyze them in some manner as to detect abuse of the system (ensuring that only true spambots are entered). E-mails could be fired off to the abuse/postmaster/webmaster for the offending IP address. Finally, the new SpamBot info would be integrated into the regular SpamBot database.

    This way you'd be able to quickly limit the effectiveness of the Spambot-traps across many websites.
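
    A rough sketch of what the central "analyze them in some manner" step could be, assuming the simplest possible rule: an IP is only promoted to the shared list once several independent servers have reported it. The threshold of three and the (reporter, ip) report format are made-up assumptions, not part of any existing service.

```python
from collections import defaultdict


def merge_reports(reports, min_reporters=3):
    """Merge trap reports into a blocklist.

    reports: iterable of (reporter_id, offending_ip) pairs.
    Returns the set of IPs reported by at least min_reporters distinct servers.
    """
    reporters_by_ip = defaultdict(set)
    for reporter, ip in reports:
        # A set ignores repeat reports from the same server.
        reporters_by_ip[ip].add(reporter)
    return {ip for ip, who in reporters_by_ip.items() if len(who) >= min_reporters}


if __name__ == "__main__":
    sample = [("a", "192.0.2.7"), ("b", "192.0.2.7"), ("c", "192.0.2.7"),
              ("a", "192.0.2.9"), ("a", "192.0.2.9")]
    print(merge_reports(sample))
```

    Counting distinct reporters rather than raw reports makes it a little harder for a single hostile submitter to get a legitimate IP listed, though it does nothing against many colluding submitters.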

    --
    My sci-fi novel, Ghost Thief, is now available from Amazon.com.
  37. Re:Block? Are you kidding? by blibbleblobble · · Score: 3, Funny

    Especially loved "friends"...

    Like hotline@mpaa.org, cdreward@riaa.org, senator@hollings.senate.gov for example?

  38. Attn Spambot Authors by NiftyNews · · Score: 5, Interesting

    Dear Spambot Authors,

    Thanks again for your interest. I hope that we were able to help you write the spambots of the future that will be able to detect and sidestep as many of the above protection schemes as possible. We tried to work all of our knowledge into one convenient thread for your development team to peruse.

    Thanks for your interest in SlashDot, home of too much information.

    1. Re:Attn Spambot Authors by Eil · · Score: 2


      Well this is one of the highest scoring trolls I've seen in awhile.

      So, what, website admins like myself are just supposed to sit back and let spambots a) harvest email addresses without consent b) eat up costly downstream bandwidth, memory and resources c) blatantly violate robots.txt directives?

      I'm patiently waiting to hear your opinions on how to stop spambots without actually telling the web server administrators about any of it.

      I suppose next you'll argue that nobody should ever discuss ways to keep from being carjacked or mugged, because you just know that criminals are going to be looking for this thread so they can watch out for said tips when they actually go do their dirty work.

      You know, maybe Apache could release some precompiled binaries with their own techniques for avoiding spambots and keep the source to themselves so the spambot authors can't see exactly what precautions have been coded in. You can't exploit a closed system, right? Just ask Microsoft.

  39. Re:Block? Are you kidding? by dirk · · Score: 3, Insightful

    Why on Earth would you like to block a spambot? So it doesn't get any more useful addresses?
    No way, man.
    If you realize you're serving to a bot, go on serving. Each time the bot follows the "next page" link, you /give/ it a next page. With a nicely formatted word1word2num1num2@word1word2.com, where words and nums are random.
    Give it thousands, millions of addresses this way.

    This would be good to do with known bad addresses, but random addresses only add more unknowing people to the list. You may add 1000 email addresses to the list and slow them down, but if even 10 of those email addresses are real, you've added to the problem. The bad addresses will be taken out as they are found to be bad, and the good ones will be left in. You've signed JoeRandomUser@RandomDomain.com up for all the spam he can handle, even if he has gone to great lengths to keep his email address off the spam lists. In theory this sounds like a great idea, until you're the guy getting your email address randomly fed to the bots.
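
    One way around the "some random addresses will be real" problem is to generate fakes only under names that RFC 2606 reserves (example.com, example.net, example.org and the .invalid TLD), since those can never belong to an actual mailbox. A minimal Python sketch, with fake_address as a made-up name:

```python
import random
import string

# Names reserved by RFC 2606; nothing under them can be a real mailbox.
RESERVED_SUFFIXES = ("example.com", "example.net", "example.org", "invalid")


def _letters(rng: random.Random, n: int) -> str:
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(n))


def fake_address(rng: random.Random) -> str:
    """Return a random address that is guaranteed not to reach a real person."""
    local = _letters(rng, 8)
    suffix = rng.choice(RESERVED_SUFFIXES)
    if suffix == "invalid":
        # Put a random label in front of the reserved TLD.
        return f"{local}@{_letters(rng, 6)}.invalid"
    return f"{local}@{suffix}"


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        print(fake_address(rng))
```

    The obvious trade-off is that a harvester can filter these four suffixes out with almost no effort, so this protects bystanders more than it hurts the spammer.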

    --

    "Information wants to be expensive" - Stewart Brand, the same guy who said "Information wants to be free"
  40. Re:Now, let's fake the other end. by Technician · · Score: 2

    I want a mail relay that refuses to process more than 10 mails from any single IP in a 24 hour period. It would be usable for home residential mail, but useless for bulk mail. As an added benefit it would severely restrict the impact of the latest MS Outlook exploit.
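
    The per-IP limit itself is easy to sketch; the hard part would be wiring it into a real MTA, which is not attempted here. The following Python class is an illustration only (RelayLimiter is a made-up name) and uses a sliding 24-hour window with timestamps passed in by the caller.

```python
from collections import defaultdict, deque


class RelayLimiter:
    """Allow at most `limit` mails per client IP within a sliding time window."""

    def __init__(self, limit=10, window_seconds=24 * 3600):
        self.limit = limit
        self.window = window_seconds
        self._seen = defaultdict(deque)  # ip -> timestamps of accepted mails

    def allow(self, ip, now):
        stamps = self._seen[ip]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True


if __name__ == "__main__":
    import time
    limiter = RelayLimiter()
    print([limiter.allow("192.0.2.1", time.time()) for _ in range(12)])
```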

    --
    The truth shall set you free!
  41. Re:Pollute their database by nochops · · Score: 2, Insightful

    This helps, but not much...

    Think about it. With the scarcity of domain names lately, chances are that while the garbage email addresses may not be valid, more than a few domain names would be valid.

    So then the spammer fills his database with these non-existent addresses on existing domain names. He then sends his spam to these addresses, and their mail servers not only have to process the message to determine that it's an invalid address, but they also have to bounce the message back as undeliverable.

    IMO this is going to use twice the bandwidth, since you now have to consider the bandwidth used by all of those bounces.

    You could always use some non-existent domain names for the garbage email addresses, but the spammer could just as easily check a domain name's validity before sending spam to it, making it trivial to remove all of the trash from his database.

    Remember, the spammer couldn't care less about sending mail to bad addresses, as long as the good addresses are spammed as well. It's left to the poor sysadmin to clean up the mess.

    --
    "A terrorist is someone who has a bomb but doesn't have an air force." -William Blum
  42. Re:Block? Are you kidding? by nathanm · · Score: 2

    Try out the Book of Infinity. It's a CGI that generates an infinite trail of gibberish links. It could easily be modified to add gibberish e-mail addresses to each page.

  43. wonder what this means.. by SethJohnson · · Score: 2
    I was just checking out one of the email harvesting products and saw this in the description:

    Automatically avoids spam trap pages.

    I wonder if this is a lie.. I also think it's funny because the rest of the product literature doesn't refer to it as a spam tool, but then this blurb is straight-up admitting it.

    Here's another funny 'feature'--

    Resume at the same place it left off even if your computer
    crashes.


    Doesn't exactly instill confidence in the stability of this product..
  44. Re:Elements of good design I'd missed - P.Solution by skaldrom · · Score: 2, Interesting

    There is another solution: Usually these SpamBots are not able to execute JavaScript...
    As described at http://www.joemaller.com/js-mailer.shtml you can combine JavaScript and Images to protect your mail. Made very good experiences with this one....

    But, as stated on the Website: this game is an arms race...

  45. Re: SpamBots: PHP Code by blibbleblobble · · Score: 2

    function SeedFakeEmail($Email)
    {
    echo "\n<font size=\"-5\" style=\"display:none\"><a
    href=\"mailto:$Email\"> Please don't email $Email</a></font>";
    }

    SeedFakeEmail("uce@ftc.gov");
    SeedFakeEmail("listme@dsbl.org");
    SeedFakeEmail("hotline@mpaa.org");
    SeedFakeEmail("cdreward@riaa.org");
    SeedFakeEmail("senator@hollings.senate.gov");

    Put that in your pageheader and smoke it!

  46. I use two methods on my site.... by Rahga · · Score: 2

    On rahga.com, I use a custom perl script with a html-based form that is programmed only to send messages to me. Here it is.

    On stuff like my FAQs, I use igPay Latin Encoded Email: ahgaray atyay ahgaray otday omcay
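
    For anyone who wants to generate that kind of address automatically, here is a small Python sketch of a Pig Latin encoder. It assumes the simple rules visible in the example above (leading consonants move to the end plus "ay", vowel-initial words get "yay"); encode_address is a made-up name.

```python
VOWELS = "aeiou"


def pig_latin_word(word: str) -> str:
    """Encode one lowercase word using simple Pig Latin rules."""
    if word[0] in VOWELS:
        return word + "yay"
    # Move the leading consonant cluster to the end, then add "ay".
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[i:] + word[:i] + "ay"
    return word + "ay"


def encode_address(address: str) -> str:
    """Spell out '@' and '.' as words, then Pig-Latin every word."""
    spoken = address.lower().replace("@", " at ").replace(".", " dot ")
    return " ".join(pig_latin_word(w) for w in spoken.split())


if __name__ == "__main__":
    print(encode_address("rahga@rahga.com"))
```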

  47. Re:Block? Are you kidding? by AntiNorm · · Score: 2

    How about instead, returning pages with the email address abuse@domain-that-spambot-is-coming-from all over them...

    Most spambots know better than to send their crap to email addresses containing things like abuse, root, postmaster, .edu, or .gov.

    Also, in regard to the problem of root servers being queried every time a @randomdomain.com is looked up, could you not just use random IP addresses?

    --

    I pledge allegiance to the flag...
    of the Corporate States of America...
  48. Re:Okay... by DNS-and-BIND · · Score: 2
    You cannot defeat that which you do not understand. I think that you really can't talk about spam prevention unless you have one-on-one familiarity with programs like Atomic Harvester. Spammers certainly do (many, many other programs can be found with a google search for email harvester). Without knowledge of who the developers of these programs are, what kind of work they do, their track record in other projects, etc, it's pretty pointless to talk about spam-blocking in an educated manner.

    Matter of fact, I think it'd be a good idea to have an open-source email harvester. . . it'd give the good guys an idea of what works and what doesn't, and of course the open-source version would be free, polite to webservers, and best of all would steal thousands of sales from the real bad guys, the fellows who write spambots. (ObPipeDream) With any luck one of them would steal the code and resell it, and the GPL could get a slam-dunk court test.

    --
    Shutting down free speech with violence isn't fighting fascism. It IS fascism!
  49. Note to self by underclocked · · Score: 2, Funny

    Before announcing new useful project to Slashdot community, create Freshmeat/Sourceforge page first, thereby eliminating the need for my host to shut me down for excessive bandwidth.

  50. What I use by Phroggy · · Score: 3, Interesting

    Take a look at these two bits of code from http://www.slickhosting.com/contact.shtml :

    <A HREF="mailto:hosting%40slickhosting.com"
    onMouseOver="window.status='mailto:hostingslickhosting.com';return true;"
    onMouseOut="window.status='';">hostingslickhosting.com</A>

    <!-- Spam trap
    <A HREF="mailto:abuse@(your domain)">abuse@(your domain)</A>
    <A HREF="mailto:root@(your domain)">root@(your domain)</A>
    <A HREF="mailto:postmaster@(your domain)">postmaster@(your domain)</A>
    <A HREF="mailto:uce@ftc.gov">uce@ftc.gov</A>
    -->

    --
    $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
    $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
  51. other solution: flash 8) by leuk_he · · Score: 2

    Make a big Macromedia Flash site. Let the bots eat that: this is the thing a lot of companies do.

    don't worry, and google will adapt. They read even pdf and .doc files.

    new thought: make a site written in .doc format.

    1. Re:other solution: flash 8) by tomknight · · Score: 2
      I really hope that's a joke....

      Tom.

      --
      Oh arse
    2. Re:other solution: flash 8) by tomknight · · Score: 2
      Right, I don't want to have to suffer any more bloody flash sites than I already do.

      And no, it's not just me. Have you thought about people who can't view flash sites? If you can't think of anyone who might have a problem here, search google for web accessibility and see if you get the picture.

      When I find a "commercial" site that I can't view without flash or images I email the marketing and sales guys and tell them why I have a problem with their site. None of them really seem to give a fuck, but one day it might make a difference.

      Tom.

      --
      Oh arse
  52. Alas, not practical... by OmniGeek · · Score: 2

    Odds are high that this system, should it become sufficiently widespread to be useful, would be vulnerable to poisoning by spammers spoofing spambot traps and causing legitimate IPs (such as Googlebot or large blocks of Net users) to be incorrectly blocked. There are countermeasures against this, but my guess is that the resulting arms race would not result in an adequately-usable system for enough of the time to be worth it. (Remember, the blacklist must update with reasonable frequency for both additions AND expirations, and must have a VERY low rate of false-positives). The authentication of "legitimate" submitters is a serious weakness of such a system. Nice thought, though...

    --

    "My strength is as the strength of ten men, for I am wired to the eyeballs on espresso."
  53. Re:Block? Are you kidding? by LinuxHam · · Score: 3, Interesting

    postmaster@127.0.0.1 and abuse@127.0.0.1

    Good idea but, I'm sure spam software has been rejecting 127.0.0.1 for many years.

    How about a few people volunteering real FQDNs that all resolve to 127.0.0.1? I realize that people would be volunteering horsepower and bandwidth for DNS lookups, but it would be in the name of dramatically reducing spam. Then, keep a list of all the "loopback FQDN's" and let the rest of us feed those FQDN's into spam-trap generators. Eventually, there would be so many real-looking spam trap email addresses that the spam software wouldn't be able to keep up with the list of loopback FQDN's.

    To take it to the next level, you could hide the list of "loopback FQDN's" by making a reverse DNS lookup against a couple of volunteered IP addresses return a random FQDN from the list of loopback FQDN's at the time that the spamtrap page is dynamically generated.

    Spammers would never know the entire list of FQDN's that resolve to loopback.

    --
    Intelligent Life on Earth
  54. Re:Block? Are you kidding? by erc · · Score: 4, Informative

    Way too much work. Here's similar Escapade [escapade.org] code:

    <QUIET ON>
    <html><head><title>Members area</title></head><body>
    <p>Hello random visitor. There is a big chance you are a robot collecting mail
    addresses and have no place being here.
    Therefore you will get some random generated email addresses and some random links
    to follow endlessly.</p>
    <p>Please be aware that your IP has been logged and will be reported to proper
    authorities if required.</p>
    <DBOPEN "SpamFood", "localhost", "login", "password">
    <FOR I=1 TO 100 STEP 1>
    <SQL select * from names order by rand() limit 1>
    <LET FN="$Name">
    </SQL>
    <SQL select * from lasts order by rand() limit 1>
    <LET LN="$Last">
    </SQL>
    <SQL select * from addresses order by rand() limit 1>
    <LET AD="$Address">
    </SQL>
    <a href="mailto:$FN.$LN@$AD">$FN.$LN@$AD</a> <br>
    </FOR>
    </body>
    </html>

    --
    -- Ed Carp, N7EKG erc@pobox.com PGP KeyID: 0x0BD32C9B What I'm up to: http://intuitives.mine.nu
  55. Don't stop spambots, feed them with Sugarplum by dananderson · · Score: 3, Interesting

    I don't stop spambots, I feed them. I feed them phony email addresses and addresses of spammers (gathered from places such as my fake /cgi-bin/formmail.pl). I use http://www.devin.com/sugarplum/, mentioned before on /. to dish it out!

  56. Problem with wpoison... by wideangle · · Score: 3, Informative

    is that some of the fake emails it generates will be real.

  57. Better yet, use a Spam Troll-box by samhart · · Score: 2, Interesting

    We've recently set up a Spam Troll-box using Vipul's Razor on our new Tux4Kids dev server (you can find our troll box here).

    A troll-box gives Spam-bots a place to send their spam. When this box intercepts the spam, it reports it to the Vipul's Razor network, and everyone else on this network becomes aware of that spam (if they are also using Vipul's Razor to filter, which, chances are they are, it will filter that spam if they get it).

    If Vipul's Razor isn't enough, one can even use something like SpamAssassin in conjunction with Vipul's Razor to get even better results.

    Of course, this isn't cutting off Spam-bots at their source... but if enough sites were to cut them off at their source, then I'd imagine the Spam-bot authors would get wise to this and devise a way around it. Whereas with something like a Spam Troll-box, the Spam-bots seem to still be working to those running the Spam-bots ;-)

  58. Let's feed the serpent its own tail by Crash+Culligan · · Score: 2, Interesting
    This morning, after finding a junk fax on the office's voice mail system, I called the removal number (in little text at the bottom of the fax) and reached an automated voice system that would either 1) remove an inputted number, 2) add a new number, or 3) talk to a representative about their service.

    Well, I didn't trust (1), and (3) just got me a voice mail box instead of a person I could chew out, which I didn't use. That left (2), and I had a wicked idea:

    I hit 2, and input the number that I should call if I was interested in the fax (which appeared in BIG text right above the little text). Their own response number should start eventually getting faxes from them or, as I tend to experience, hangups.

    Cute story, I know, but what does this have to do with defeating spambots?

    I went to the page indicated...

    I was just checking out one of the email harvesting products and saw this [getyoursoftware.com]

    And I scrolled to the bottom, and looked at the source code, and noted two faaaaaascinating things:

    First, the HTML on that page is rather clean; I can see no evidence of anti-spambot code on their page.

    And second, the "Contact Us" link at the bottom is a mailto:.

    By all appearances, their page is vulnerable to their own spambot.

    So I had the thought... what if those generated-random-email-address pages were geared to produce not-so-random email addresses? What if the email addresses on those generated-page traps were geared to generate random email addresses at the domains of the various spambot-- (err, I mean) harvester producing companies? Let them see what it's like when less than discerning spammers use their software for evil. Hundreds of Viagra-substitutes! Thousands of hangover cures! Tens of thousands of opportunities to refinance their home mortgage!

    This is just an off-the-top-of-my-head idea. Opinions?

    --
    You cannot truly appreciate Dilbert until you read it in the original Klingon.
  59. Re:Pollute their database by Steev · · Score: 2

    Not if the TLD isn't .com, .net, or .org! There's almost NO chance that it's valid if the TLD is also random.

    Remember, the spammer couldn't care less about sending mail to bad addresses, as long as the good addresses are spammed as well.

    True, but their address lists will depreciate in value because the authenticity of most of the addresses would be in doubt.

  60. the danger of mailing lists.. esp. SuSE user list by SethJohnson · · Score: 3, Informative


    Another way your e-mail address can be susceptible to spambots is if you participate in any mailing list. If the administrator decides to archive the list on a website, in many cases the email addresses of the participants will be there in plain text. I found this out after doing a google search for my own email address and having it turn up on the SuSE web site. I sent an e-mail asking that they do a regsub on the archive to substitute the '@' with [at] or something similar. That was more than six months ago and the SuSE website admin still hasn't done it.
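
    The substitution asked for here is close to a one-liner. A hedged Python sketch follows; the regular expression is a deliberately loose approximation of an email address (an assumption, not a full RFC 5322 parser) and mask_addresses is a made-up name.

```python
import re

# Loose pattern: local part, "@", then a dotted domain ending in letters.
EMAIL = re.compile(r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b")


def mask_addresses(text: str) -> str:
    """Replace the '@' in anything that looks like an address with ' [at] '."""
    return EMAIL.sub(r"\1 [at] \2", text)


if __name__ == "__main__":
    print(mask_addresses("From: jane@lists.example.org wrote:"))
```

    Running this over an existing archive once, and over each new message as it is archived, would cover both old and new posts.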
  61. What about a Terms of Service page by splattertrousers · · Score: 2, Interesting

    What about requiring all of your users to go through a terms of service page before accessing any parts of your site?

    The page could have a form with "Accept TOS" and "Reject TOS" buttons. I wonder how many spambots would submit a form?

    And to catch spambots that did submit the form, your TOS could have some clauses that make it a violation for evil spiders (ones that don't honor "robots.txt") to use the site. Maybe you could make||lose a few bucks suing the spambotters who go through the TOS and still harvest your email addresses.

  62. New Program - Mailwasher by Peale · · Score: 4, Interesting

    Speaking of spam, I've come across this new program called mailwasher. You can check your mail while it's still on the server, and then - get this - fake a bounced message. There are probably other programs that do this, but this is the first one I've heard of.

    Anyway, AFAIK, it's WinBlows only, and available at http://www.mailwasher.com, although right now it seems the site is down, all I get is a 404!

  63. A friendlier solution. by Fweeky · · Score: 2

    Rather than filling the spider with a whole bunch of (potentially valid) addresses and loading your server with bogus clients you don't want, just make it difficult for them to extract the addresses.

    I wrote a bit of PHP a few months ago that applied some spamproofing ala SlashDot (only a bit less aggressive) that some might find useful.

    Highlighted Source

    Raw Source

    It performs the following munging, depending on what you specify:

    freaky@aagh.net

    freaky (at) aagh (dot) net

    freaky@aagh.N0SPAM.net.SPAMN0

    f&#114;&#101;aky@&#97;&#97;g&#104;&#46;n&#101;&#116;

    random one of the above

    random with entity encoding

    all of the above
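
    Two of the munging styles listed above are simple enough to sketch in a few lines of Python. This is not the PHP script being described, only an illustration: the function names are made up, and unlike the sample output above, munge_entities encodes every character rather than a random subset.

```python
import html


def munge_words(address: str) -> str:
    """freaky@aagh.net -> freaky (at) aagh (dot) net"""
    return address.replace("@", " (at) ").replace(".", " (dot) ")


def munge_entities(address: str) -> str:
    """Encode every character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


if __name__ == "__main__":
    addr = "freaky@aagh.net"
    print(munge_words(addr))
    print(munge_entities(addr))
    # A browser decodes the references back to the plain address:
    print(html.unescape(munge_entities(addr)))
```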

  64. Re:Pollute their database by Steev · · Score: 2

    Ok, just make sure the TLD is longer than say, 5 characters and you can be almost certain that randomly created ones don't exist.

    I am fully aware of the non-com/net/org TLDs...just look at *mine* :)

  65. How about trying this by SnarfQuest · · Score: 2, Interesting

    There are "scanner" traps that start up a session and just drop it (not telling the scanner), which ties it up until the scanner software times out.

    How about writing something for these spambots using a special web server that slowly responds to its requests (sends out a small packet every 10 seconds) so it won't time out and won't consume much cpu time, and just feeds it a line or two lines of junk with each packet. Have it randomly generate a never ending supply of useless information to keep the spambot happy. While it's busy with the useless site, it's not bothering other people nor is it getting any real addresses.
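
    The core of such a tarpit is a loop that emits a little junk, waits, and repeats. Below is a minimal Python sketch of that loop as a generator; hooking it up to an actual socket or web server is left out. The names tarpit_chunks and max_chunks are assumptions for illustration, and the sleep function is injectable so the loop can be exercised without really waiting.

```python
import random
import time


def tarpit_chunks(rng, delay_seconds=10.0, sleep=time.sleep, max_chunks=None):
    """Yield one short line of junk at a time, pausing after each line.

    With max_chunks=None the generator never ends, which is the point.
    """
    sent = 0
    while max_chunks is None or sent < max_chunks:
        junk = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz ") for _ in range(40))
        yield (junk + "\n").encode("ascii")
        sent += 1
        sleep(delay_seconds)


if __name__ == "__main__":
    # Short delay and a chunk limit so the demo finishes quickly.
    for chunk in tarpit_chunks(random.Random(), delay_seconds=0.1, max_chunks=3):
        print(chunk)
```

    Each stalled client still holds a connection open on the server side, so this probably only makes sense on an event-driven server where an idle connection is cheap.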

    --
    Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
  66. Better than a honeypot.. by multipartmixed · · Score: 2

    ..howabout a glue trap?

    1. Publish false mailto: addresses on your web pages in the same colour font as your background

    2. Change them to visible, valid addresses by munging them with DHTML properties and a
    JavaScript include file (sorry, Lynx users)

    3. When a recognizable spam-bot comes in, refuse to load the javascript include file. mod_setenvif and mod_rewrite should help out here.

    4. When a probable spam-bot comes in, serve up the page reaalllly slowly, don't close the connection until it goes in CLOSE_WAIT. This ties up sockets on the remote machine and reduces its ability to troll OTHER sites. You can do this by writing a handler for your base directory, checking the browser, and returning DECLINED for friendly people. That should be in, I think the "post read" phase.

    5. When a recognized bad address comes through to your mail server (from step 1), slooooow the SMTP transaction down as much as you can (same idea as step 4), and throw an error at the end of the 354 DATA section a few times (to force him to come back!), etc. (Some sendmail internals hacking required here, although it would be much easier to hack if you don't have any real mail and just ran a script from inetd.)

    6. Those fake email addresses. Make them all point to a common MX or group of MXes that you control the DNS for. Make sure those MX records aren't used by anything legitimate. Slooooow your in.named down for requests to that domain. A cool side effect, besides tying up sockets on the spammer's end, IIRC some OSs can only make one resolver request at a time -- this'll effectively block all of his outbound spam traffic while he's trying to look up your MX record! Also, make sure the TTL is set to about 10 seconds, just to make sure he comes back to the glue trap very often.

    How's *that* for spam countermeasures? I wish I had time to write it. :-)

    --

    Do daemons dream of electric sleep()?
  67. Re:Block? Are you kidding? by realdpk · · Score: 2

    Memory plus an Apache child. Any solution which causes Apache to be put to sleep artificially can and likely will be used as a very effective DoS against your site. Unfortunately.

  68. Re:Block? Are you kidding? by evil_one · · Score: 2

    You don't need to do that.
    MX records do that for you.
    You can actually have email@mydomain.com when you don't have a box providing an ip for mydomain.com
    MX records say "hey, you, all the email for this domain is handled by that host", so you could easily tell your DNS provider to set the MX for any number of hosts to 127.0.0.1

    --
    Desperation is a stinky cologne
  69. Re:Block? Are you kidding? by 56ker · · Score: 2

    Or billgates@microsoft.com

  70. Bzzt. by gblues · · Score: 2
    No soup for you!

    The mailto:address@foo.com?Subject=bar syntax was introduced by Netscape 2.0.

    Nathan

  71. Here's a Javascript that writes mailto: links... by Artifice_Eternity · · Score: 3, Informative

    ...so that you can leave them out of your HTML source:

    http://artificeeternity.com/includes/linkwrite.js

    Instructions for use are included in comments. The script fragment that replaces mailto: links in the page will actually shorten your code -- it only requires entering the username and domain once. Also, the @ sign is added in by the script, so the address itself never appears in your HTML.

  72. Don't use a mailto and uce@ftc.gov by macdaddy · · Score: 2
    Many of the suggestions above say to put a mailto link on a hidden IMG that goes to uce@ftc.gov. THAT'S BAD! Or better put, that doesn't gain you anything more often than not. The reason I say this is because many of the spambots I've gotten my hands on lately automagically search for and remove that specific address. Many also remove all *.gov addresses. The best thing you can do as a server admin to seed an address that goes to uce@ftc.gov is to create a simple mail alias or user with a .forward on your mail server that forwards to uce@ftc.gov. That way these "smart" spambots won't detect that seeded address and remove it.

    Ideally you would actually create a spam trap account for this task and use a procmail recipe to briefly explain what you're doing in the forwarded message. That way the raw forwarded headers can't be misinterpreted as your server sending the spam.

    I do this very thing and have had great luck with it. I seed multiple addresses on key pages so that uce@ftc.gov is guaranteed to receive a number of these pieces of spam. I also send this spam to the newsgroup bot for news.admin.net-abuse.sightings, a newsgroup filled with forwarded spam LARTs for us anti-spammers to search for patterns or previous spamming evidence. You just add "nanas-sub@cybernothing.org" to your recipient list and prepend the forwarded subject line with "(email)". That's it!

  73. Re:Block? Are you kidding? by slamb · · Score: 3, Insightful

    Way too much work. Here's similar Escapade [escapade.org] code:

    Not similar enough. That makes 300 queries per hit against your database, and I don't think you even used prepared statements. His code slowed their software to a crawl by sleeping. Yours will slow your software to a crawl by excessive database traffic.

  74. Robots Exclusion is usually honored by spambots by asackett · · Score: 2
    At least, in my experience, that's the case. I've got bot bait on my site that's been there for many months now, and it has yet to be crawled. I get lots of hits on robots.txt from agents I believe to be harvester bots, but none has yet ventured in. Most of the hits on the bait come from curious slashdotters.

    Just to make sure it gets said: The email address that's listed here on /. is a spamtrap. Don't use it! My user name in my domain is the same as my user name here. I didn't intend for that address to become a spamtrap, but it was soaking up so much spam it seemed wise to put it to good use.

    --

    Warning: This signature may offend some viewers.

  75. Elcomsoft (remember them) sells spamware! by Convergence · · Score: 2

    Hey... Here's something I found out a few days ago:

    http://www.mailutilities.com/aee/

    Elcomsoft, who are the makers of the Advanced eBook Processor (remember Sklyarov?), also make various email utilities. Although some look like they might have legitimate uses, at least one looks to have *no* legitimate use. (When a tool is designed to scan web pages for email addy's, and DESIGNED to pull out real names&email from web forums...)

    Read the above URL and the rest of the site yourself and draw your own conclusion.

  76. Buffer Overflow? by xkenny13 · · Score: 2
    How about things like buffer overflows? Worms/hackers have been exploiting them for years ... would:
    • <a HREF="mailto:abc(insert 1000 characters here)@blahblahblah.com">
    have any detrimental effect?
  77. http://www.mailwasher.net/ by jasonk3 · · Score: 3, Informative
    1. Re:http://www.mailwasher.net/ by Peale · · Score: 2

      Whoops! Thanks!

  78. Re:Block? Are you kidding? by BlueUnderwear · · Score: 2
    Can you post the script you use to do this?

    I'd really like to, but unfortunately, I can't get the script past that lame lameness filter... Yes, I know, I shouldn't have used Perl... If any of the editors are reading this, please consider making that filter less strict. Thanks!

    --
    Say no to software patents.
  79. Build up the mailto with javascript by maddugan · · Score: 2, Informative

    Here is what I do on my website to protect my email address:

    Javascript:
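    // Assemble the address from fragments so it never appears whole in the source.
    function sendmail()
    {
      var address = 'mail';
      address += 'to:';
      address += 'webmaster';
      address += '@';
      address += 'domain';
      address += '.com';
      // Navigate to the mailto: URL; a bare open() would also spawn an empty window.
      window.location.href = address;
    }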

    Usage:
    <a href="JavaScript:sendmail()">webmaster</a>

    This could be expanded to pass in the values needed to build up the email address.
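
    One way that expansion might look, as a sketch only. The names buildMailto and sendmailTo and the three-part split (user, domain, TLD) are assumptions made for illustration:

```javascript
// Assemble a mailto: URL from pieces so the full address never appears
// as one literal string in the page source.
function buildMailto(user, domain, tld) {
  return ['mail', 'to:', user, '@', domain, '.', tld].join('');
}

// Navigate to the assembled address (browser only).
function sendmailTo(user, domain, tld) {
  window.location.href = buildMailto(user, domain, tld);
}
```

    Usage would then be something like:
    <a href="JavaScript:sendmailTo('webmaster','domain','com')">webmaster</a>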

  80. Re:Pollute their database by Sir+Tristam · · Score: 2
    So then the spammer fills his database with these non-existent addresses on existing domain names. He then sends his spam to these addresses, and their mail servers not only have to process the message to determine that it's an invalid address, but they also have to bounce the message back as undeliverable.
    So? The right answer is to make sure that the domain names you are giving out will be valid. Just go to http://www.spamhaus.org/ and compile a list of valid domains of current active spammers. Randomly select from these when generating bogus email addresses. It doesn't get rid of the network bandwidth issue, but it does chew up the resources of the spammers' servers since they will have to process the incoming messages. Let the spammers DOS each other.
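
    The random selection step might be sketched like this. The DOMAINS list is a placeholder using reserved .example names (the comment suggests compiling the real list by hand), and the function names and the injectable random source are assumptions added to keep the sketch testable:

```javascript
// Placeholder list; a real one would be compiled by hand as described above.
const DOMAINS = ['spammer-one.example', 'spammer-two.example'];

// Random lowercase local part of the given length.
function randomLocalPart(length = 8, random = Math.random) {
  let out = '';
  for (let i = 0; i < length; i++) {
    out += String.fromCharCode(97 + Math.floor(random() * 26));
  }
  return out;
}

// Pick a domain at random and attach a random local part.
function bogusAddress(domains, random = Math.random) {
  const domain = domains[Math.floor(random() * domains.length)];
  return randomLocalPart(8, random) + '@' + domain;
}
```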

    Chris Beckenbach

  81. At what point can spam be considered a DoS attack? by Mustang+Matt · · Score: 2

    Can I claim that all the spam these jerks send me is an attempt at a DoS attack?

    --
    The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
  82. Slashbot? by Webmoth · · Score: 2

    OK, so we've got spambot prevention. Now we need some effective form of "Slashbot" protection. I envision a webserver that will detect a high number of referrals from Slashdot and put the server into "low bandwidth" mode, serving pages stripped of formatting and graphics (with links to graphics, of course) in order that content may be delivered in an efficient manner.
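
    A rough sketch of the detection half, assuming a sliding time window over Referer headers. The class name, the threshold of 100 referrals per minute, and the method names are all made up for illustration:

```javascript
// True if the Referer header points at slashdot.org or a subdomain of it.
function isSlashdot(referer) {
  try {
    const host = new URL(referer).hostname;
    return host === 'slashdot.org' || host.endsWith('.slashdot.org');
  } catch (e) {
    return false; // missing or malformed Referer header
  }
}

// Tracks recent Slashdot referrals in a sliding time window.
class ReferralMonitor {
  constructor(threshold = 100, windowMs = 60000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.hits = []; // timestamps (ms) of recent Slashdot referrals
  }

  // Record one request; returns true while the whole server should be
  // in "low bandwidth" mode.
  shouldServeLite(referer, now = Date.now()) {
    // Forget referrals that have fallen out of the window.
    while (this.hits.length && now - this.hits[0] > this.windowMs) {
      this.hits.shift();
    }
    if (isSlashdot(referer)) this.hits.push(now);
    return this.hits.length > this.threshold;
  }
}
```

    The request handler would call shouldServeLite(req.headers.referer) and pick the stripped-down template whenever it returns true.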

    --
    Give me my freedom, and I'll take care of my own security, thank you.
    1. Re:Slashbot? by Eil · · Score: 2


      This has actually already been done. I went to one web page linked directly from slashdot the other day that had the heading (to paraphrase):

      "Welcome Slashdot visitors. You have clicked through a slashdot link to access this page and in an effort to cut down on bandwidth expenses, you have been referred to this page. It is devoid of ads, script, and graphics but otherwise contains the same content. You can click here [link] to view the original page."

      I'm surprised I hadn't seen something like this earlier, but then I guess most people never expect to be slashdotted until it actually happens...

  83. Re:Block? Are you kidding? by Fëanáro · · Score: 2, Interesting
    How about a few people volunteering real FQDNs that all resolve to 127.0.0.1? I realize that people would be volunteering horsepower and bandwidth for DNS lookups, but it would be in the name of dramatically reducing spam. Then, keep a list of all the "loopback FQDN's" and let the rest of us feed those FQDN's into spam-trap generators. Eventually, there would be so many real-looking spam trap email addresses that the spam software wouldn't be able to keep up with the list of loopback FQDN's.
    Slashdot has been doing that for years with warez.slashdot.org. Try it; it resolves to 127.0.0.1
    I always enter postmaster@warez.slashdot.org in spamforms
  84. Better than Loopback - Feed Them Open Relays by billstewart · · Score: 2
    Spamware that sends its own mail probably rejects 127.0.0.1, but spamware that abuses open relays won't notice, because it'll be the abused relay that resolves the domain name. Sendmail may be bright enough not to freak out with 127.0.0.1, though you could have fun with 127.0.0.2. And don't give them whitehouse.gov, but you might give them a FQDN that resolves to whitehouse.gov's IP address.

    But you can do better than that - Give them FQDNs that resolve to Open Relay sites, and use Round-Robin DNS if you can. If you've got your own domain, you can spare plenty of FQDNs, like mail2.mydomain.com.

    • The spammer or open relay will send spam to mail2.mydomain.com, which resolves to [relay1.school1.kr].
    • Relay1.school1.kr will relay it to mail2.mydomain.com, which your round-robin resolves to [relay2.school2.kr].
    • Relay2.school2.kr will send it to mail2.mydomain.com, which your round-robin resolves to [somebox.cn.net].
    • somebox.cn.net will send it to mail2.mydomain.com which ..... ad nauseam, ad erroneum.

    Depending on how you set up the round-robin, and where the relay machines get their DNS resolution done, you may be able to make them run in a tight little loop around the Korean broadband, or burn expensive international bandwidth between China and Sweden.

    Or you could give them random names at various spammer and spamhaus sites, or FQDNs that resolve to the addresses of spammers or spamhausen, or remove-me addresses of other spammers. They may filter out their own, and don't give them obvious addresses like abuse@ or postmaster@, but surely they won't recognize most of them, especially the latest Corrupt Nigerian Official trying to launder embezzled money.

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  85. Better Addresses To Feed Spiders by billstewart · · Score: 3, Informative
    I've posted a separate article about fun tricks with round-robin DNS to feed spammers FQDNs that resolve to open relays, which will forward to other open relays. And if you know machines running Teergrubes, they're excellent addresses to feed spiders.*

    If you're not messing with DNS, though, there are lots of addresses that can cause trouble:

    • sales@spammerdomain.com, where the domain may be your spammer (if you customize your spidertrap) or a random spammer. They'll probably reject abuse@ and other obvious administrators, but names like "sales" and "purchasing" and "marketing" and anything that might get a real user is good.
    • randomjunkuser@spammerdomain.com. If they're not verifying the list before using it, this is good.
    • randomjunkuser@randomjunksubdomain.spammerdomain.com
    • randomjunkuser@spamhausdomain.com, at some site that encourages spammer customers.
    • randomjunkuser@randomjunksubdomain.spammers-ISP.net - does the spammer's ISP check for bad DNS hits?
    • randomjunkuser@othercustomer-of-spammers-hosting-ISP.net. Your mission is to get the spammer's ISP to throw off the spammer. If you want to be much ruder, you can use real-presidents-name@othercustomer-of-spammers-hosting-ISP.net, but both of those attacks require more customization to hit spammers you're having ongoing problems with, as opposed to shotgunning them all.
    • unsubscribeme-address@unsubscribemedomain.com - anything not immediately recognizable as "remove@". Give some other spammer's list builder a bunch of addresses to work with.
    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  86. Teergrubes and other traps for spammers by billstewart · · Score: 3, Informative
    Teergrubes are tarpits to stick spammers in. They look like perfectly correct SMTP servers, e.x.c.e.p.t. t.h.e.y. a.n.s.w.e.r. v..e..r..y.. s..l..o..w..l..y.. and maybe generate lots of error messages requiring repetition, and basically they leave the spammer's machine tied up for a long time with very little effort. A legitimate mailing list server that encounters a teergrube will normally survive, because it's usually multithreaded, or at least has almost all its recipients as legitimate users, so an occasional few minutes of one thread stuck in a trap isn't a major problem. But a spammer who's encountering a large number of teergrubes (especially if he picked them all up at once from a spidertrap) will have lots of threads tied up for a long time and may not have enough spare capacity to bother real targets. There are a number of implementations around.
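
    To make the "answer very slowly" part concrete, here is a toy sketch using Node.js's net module. It only drips out the SMTP greeting one character at a time; a real teergrube would carry on through the rest of the dialogue. The names dripSchedule, dripWrite and createTeergrube, the 2-second delay, and the mail.example.com banner are all assumptions for illustration:

```javascript
const net = require('net');

// Plan when each character of a reply goes out: one per delayMs.
function dripSchedule(reply, delayMs) {
  return Array.from(reply, (ch, i) => ({ atMs: (i + 1) * delayMs, ch }));
}

// Send a reply one character at a time according to the schedule.
function dripWrite(socket, reply, delayMs) {
  for (const { atMs, ch } of dripSchedule(reply, delayMs)) {
    setTimeout(() => {
      if (!socket.destroyed) socket.write(ch);
    }, atMs);
  }
}

// Create (but do not start) a fake SMTP listener that greets very slowly.
function createTeergrube(delayMs = 2000) {
  return net.createServer((socket) => {
    socket.on('error', () => {}); // clients vanish all the time; ignore it
    dripWrite(socket, '220 mail.example.com ESMTP\r\n', delayMs);
  });
}

// To run it (port 25 normally needs root): createTeergrube().listen(2525);
```

    With a 2-second delay the 28-character greeting alone holds the sender's connection for close to a minute, at a cost of a few timers on the trap side.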

    And somewhere out there is a far nastier variant on a teergrube that can keep a typical smtp session up for hours with only a few kilobits/minute, using tricks like setting TCP windows very small, NAKing lots of packets so TCP retransmits them, etc. (It basically works by saying "No, SMTP/TCP/IP isn't a set of protocol drivers in my Linux kernel, it's a definition of a set of messages and there's no reason I should use a bunch of well-tuned efficient reliable kernel routines when I can send raw IP packets myself designed for maximal ugliness.")

    • Spamido is an automated tool for collecting spammers' addresses so they can be fed back to other spammers.
    • Wpoison and Sugarplum are spidertraps that generate lots of fake addresses for a long time.

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  87. Re:using images is bad for people with text browse by Eil · · Score: 2


    Golly gee, let's see here. Ways to thwart the spambots.

    You can URL-encode and un-mailto your address.

    But spambots can still read most plaintext email addresses from the text itself...

    Then encode your email address into a piece of javascript.

    But many normal users don't have javascript turned on...

    Then write your email address into a GIF or PNG.

    But certain types of disabled people and lynx users won't be able to view those images...

    This author would argue that those two are one and the same. But still, you can also obfuscate your address for the user to figure out, providing directions on how to unobfuscate it. (NOSPAM.bob@NOSPAM.hoser.com)

    But there are many users who are too dumb to unobfuscate the address...

    Then write a web page with a form for sending the message... the email address remains hidden.

    But this is insecure / stupid / not fully supported by Mosaic 0.13beta...

    Then whoever can't use one of the above methods can go sod off. I plan to use most of these, grouped together into one contact.html page on my personal web site. If there are a couple of users in the world out of thousands who can't contact me due to technical or mental limitations, then dang them to heck for all I care.

    You see, it's a balancing act of preferences. Would you prefer to let (literally) a couple users slip through the cracks, or would you rather get bombed by potentially hundreds of spambots? Your choice...
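
    For the first option in that list, URL-encoding the address, a minimal sketch might look like the following. The function name is made up, it assumes a plain ASCII address, and it is only a speed bump: any harvester that decodes URLs before matching will still find the address.

```javascript
// Percent-encode every character of an ASCII address, including the ones
// that encodeURIComponent would leave alone.
function percentEncodeAll(address) {
  return Array.from(address, (ch) =>
    '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

// Example: 'mailto:' + percentEncodeAll('bob@hoser.com') for use in an href.
```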