Slashdot Mirror


Stopping SpamBots With Apache Part II

primetyme writes: "To address some of the concerns brought up in the first article about stopping email harvesting spambots with Apache, I've written a follow-up article that details even more methods to keep email-sucking bots off your Apache-based site.
Stopping Spambots II - The Admin Strikes Back continues the epic saga that pits Spambot vs. Administrator."

15 comments

  1. Restarting the server? by epsalon · · Score: 3, Interesting

    The article suggests restarting Apache for every spam address detected. That could make DOSing your web server real easy. Spoof a bunch of IPs and request the honeypot dir. Watch as the webserver restarts over and over.
    Also, this approach would easily block legitimate dialup users and, more problematically, proxies. If the spambot is behind a proxy, you would block the entire user base of that proxy.
    Maybe an X-Forwarded-For based approach? However, that is easily bypassed.

    1. Re:Restarting the server? by primetyme · · Score: 3, Insightful

      Valid point, epsalon, but to clarify: Apache only gets restarted for every *new* IP address detected. As for the spoofing, it would take a lot of IPs to DoS the server, and anyone willing to go to that much trouble just to take down a webserver probably has better ways to do it. Point taken though :)

    2. Re:Restarting the server? by epsalon · · Score: 5, Interesting

      A simple improvement would be to send SIGHUP to the webserver to make it reload the config without restarting. This can still be used for DoS, but less efficiently.
      A better way is to write (or use?) an Apache module that does the logging in memory, with no costly reloads or restarts.
      However, this still does not prevent the proxy and dialup problems illustrated above. Also, you won't catch spambots that don't use robots.txt to find addresses.
      Another improvement would be to deny addresses the moment they ask for robots.txt while identifying as a "Mozilla" user-agent, and to detect clients that do a websuck without requesting robots.txt first and deny them as well. You can detect a websuck by placing a "hidden" link where normal users won't see it and blocking any IP that requests it.
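The two detection rules above, kept in memory so nothing has to be reloaded, might look roughly like this. It is only a sketch in Python rather than an Apache module: the `BotDetector` class, the `/hidden-trap/` path and the sample addresses are all made up for illustration, and it does nothing about the proxy and dialup problem.

```python
from dataclasses import dataclass, field

# Hypothetical paths; the thread does not name the honeypot location.
HONEYPOT_PATH = "/hidden-trap/"
ROBOTS_PATH = "/robots.txt"


@dataclass
class BotDetector:
    """Keeps offending IPs in memory, so no config reload is needed."""

    blocked: set = field(default_factory=set)

    def should_deny(self, ip: str, path: str, user_agent: str) -> bool:
        # An address that was flagged earlier stays denied.
        if ip in self.blocked:
            return True
        # Rule 1: a client claiming to be a browser asks for robots.txt.
        if path == ROBOTS_PATH and user_agent.startswith("Mozilla"):
            self.blocked.add(ip)
            return True
        # Rule 2: a request for the hidden link that normal users never see.
        if path.startswith(HONEYPOT_PATH):
            self.blocked.add(ip)
            return True
        return False


detector = BotDetector()
print(detector.should_deny("192.0.2.10", "/robots.txt", "Mozilla/4.0"))
```

Every later request from a flagged address is refused by the first check, without touching the server config.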

    3. Re:Restarting the server? by primetyme · · Score: 2, Informative

      An actual Apache mod is slated for part 3 of the series.. stay tuned. Thanks again for the feedback!

    4. Re:Restarting the server? by beebware · · Score: 2, Insightful

      Why not just use a .htaccess file in your top-level htdocs directory?
      # Allow is evaluated first, then Deny, so the listed addresses stay blocked
      Order allow,deny
      Allow from all
      Deny from spammers.ip.address.here another.spammers.ip.address

      will _probably_ do it (ie this is an untested example!)
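If the list of addresses comes from a honeypot log, the .htaccess text could be generated rather than edited by hand. A rough sketch, assuming the classic `Order allow,deny` access-control directives (where a matching Deny wins over `Allow from all`); the function name `render_htaccess` and the sample addresses are hypothetical, and newer Apache releases prefer `Require` directives instead.

```python
import ipaddress


def render_htaccess(blocked_ips):
    """Return .htaccess text that denies each address in blocked_ips."""
    lines = ["Order allow,deny", "Allow from all"]
    for raw in sorted(set(blocked_ips)):
        # ip_address() raises ValueError for anything that is not an IP
        # literal, so stray log text never ends up in the config.
        ipaddress.ip_address(raw)
        lines.append(f"Deny from {raw}")
    return "\n".join(lines) + "\n"


print(render_htaccess(["192.0.2.7", "192.0.2.1", "192.0.2.7"]))
```

Since .htaccess files are re-read on each request, writing this output needs no restart, at the cost of the per-request lookup.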

  2. my favorite tactic by augros · · Score: 1

    here's one of the best tactics i've found: http://www.phpconsulting.com/php/hide-email.php

    1. Re:my favorite tactic by glyph42 · · Score: 1

      Right, well now that this cat is out of the bag, it's going to take the spammers three lines of code to get around this trick, that is if they haven't already.

      --
      Music speeds up when you yawn, but does not change pitch.

    2. Re:my favorite tactic by beebware · · Score: 4, Interesting

      Best tactic I've seen is just providing a web-to-email form for people to fill in. After all: if they've got their web browser loaded, do they really need to launch an email client to contact you? Keeps your address hidden, and as long as you don't use something like Matt Wright's formmail.pl script, quite secure. Get the outgoing mails tagged with the sender's IP, browser details etc and it'll help track abusive messages as well...
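A minimal sketch of the message-building half of such a form, using Python's standard `email` package. Everything named here is an assumption for illustration: the addresses, the `X-Sender-IP` and `X-Sender-User-Agent` header names, and the choice to fix the recipient on the server instead of reading it from the form.

```python
from email.message import EmailMessage

# Hypothetical addresses; the recipient never comes from the form itself.
SITE_OWNER = "owner@example.com"
FORM_SENDER = "webform@example.com"


def build_contact_mail(subject, body, remote_ip, user_agent):
    """Build a mail for the site owner, tagged with the sender's details."""

    def one_line(value):
        # Collapse all whitespace, including CR/LF, so form input
        # cannot smuggle extra headers into the message.
        return " ".join(value.split())

    msg = EmailMessage()
    msg["From"] = FORM_SENDER
    msg["To"] = SITE_OWNER
    msg["Subject"] = one_line(subject)
    msg["X-Sender-IP"] = one_line(remote_ip)
    msg["X-Sender-User-Agent"] = one_line(user_agent)
    msg.set_content(body)
    return msg


mail = build_contact_mail("Hello", "Nice site.", "192.0.2.9", "Mozilla/4.0")
print(mail)
```

Handing the result to a mail server (for example with `smtplib`) is left out, since that part depends on the host.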

  3. My trick... by Pathwalker · · Score: 4, Interesting

    I use this little rxml widget on all of the email addresses on my web site.

    If the client is detected as a robot, or the detection fails, the address is displayed as a randomly named graphic.

    If the client is not detected to be a robot, then just a light entity encoding (which I change from time to time) is applied to the address, which is displayed as a mailto link.
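For anyone not running rxml, the "light entity encoding" half can be sketched in a few lines of Python. This is not the widget itself, just one possible encoding (decimal character references); the function names and the sample address are made up, and the robot detection and graphic fallback are not covered.

```python
def entity_encode(text):
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def mailto_link(address):
    """Return a mailto link whose address is visible only after decoding."""
    encoded = entity_encode(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


print(mailto_link("user@example.com"))
```

A browser decodes the references before showing or following the link, while a harvester that greps the raw HTML for something@something finds nothing.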

    1. Re:My trick... by cluge · · Score: 2

      The problem is that smart robot programs can make their robot appear as any client. From experience I can tell you that most of these critters running around harvesting e-mail addresses are telling the server they are WIN98 with IE 5.0.

      I just have one e-mail address on my honeypot page, when you send an e-mail to that address it triggers a script that firewalls the sending IP with iptables/ipchains/ipfw (depending on the server) and logs it. Makes it easy to find open relays and spamhaus servers.
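A rough Python sketch of what such a script might do with an incoming message. It assumes the topmost Received header was added by your own mail server and carries the connecting host's IPv4 address in square brackets, which varies between mail servers; the function names are hypothetical, and the iptables command is only built here, not run.

```python
import ipaddress
import re
from email import message_from_string

# Matches an IPv4 address in square brackets, e.g. [192.0.2.44].
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def sending_ip(raw_message):
    """Return the bracketed IPv4 from the topmost Received header, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    match = BRACKETED_IPV4.search(received[0])
    if match is None:
        return None
    try:
        # Rejects out-of-range octets such as 999.1.1.1.
        return str(ipaddress.ip_address(match.group(1)))
    except ValueError:
        return None


def iptables_drop_command(ip):
    """Build (but do not run) an iptables rule dropping traffic from ip."""
    return ["iptables", "-A", "INPUT", "-s", ip, "-j", "DROP"]


sample = (
    "Received: from relay.example (relay.example [192.0.2.44])\n"
    "\tby mx.example.org; Mon, 1 Jan 2001 00:00:00 +0000\n"
    "Subject: buy now\n\nbody\n"
)
ip = sending_ip(sample)
if ip is not None:
    print(" ".join(iptables_drop_command(ip)))
```

Logging the address before firewalling it, as described, keeps a record of which relays were caught.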

      --
      "Science is about ego as much as it is about discovery and truth " - I said it, so sue me.

  4. My technique... by Anonymous Coward · · Score: 1, Informative

    On my web page I convert email addresses to .gif *images* of email addresses. A real person will be able to see the address, but will have to type it in.

  5. apache module by po_boy · · Score: 2

    I wrote an Apache module in Perl to do a very similar thing. No restarting your webserver.

  6. Cookies by giminy · · Score: 1

    Couldn't you just set a cookie with a site-wide password in it? Then just require the cookie (in effect, password-protect every page). Or do spam crawlers know what to do with cookies these days?
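The check itself would be small. A sketch in Python with the standard `http.cookies` module, where the cookie name `sitepass` and the password value are placeholders; a crawler that stores and returns cookies would pass it, which is exactly the open question.

```python
import hmac
from http.cookies import CookieError, SimpleCookie

COOKIE_NAME = "sitepass"     # hypothetical cookie name
SITE_PASSWORD = "change-me"  # hypothetical site-wide password


def set_cookie_header():
    """Return the Set-Cookie header line to send on a visitor's first page."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = SITE_PASSWORD
    return cookie.output(header="Set-Cookie:")


def has_site_cookie(cookie_header):
    """True if the request's Cookie header carries the site-wide password."""
    cookie = SimpleCookie()
    try:
        cookie.load(cookie_header or "")
    except CookieError:
        return False
    morsel = cookie.get(COOKIE_NAME)
    if morsel is None:
        return False
    # Compare as bytes so unusual characters from the client cannot raise.
    return hmac.compare_digest(morsel.value.encode(), SITE_PASSWORD.encode())


print(set_cookie_header())
print(has_site_cookie("sitepass=change-me"))
```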

    --
    The Right Reverend K. Reid Wightman,