Stopping SpamBots With Apache
primetyme writes: "Sick of email harvesting spam robots cruising your Apache based site? Here's an in depth article that shows one way you can configure a base Apache installation to keep those nasty bots of your site - and the spam out of your Inbox." Anything that helps annoy spammers is a good thing.
Checking the user agent won't work for long - how hard will it be for the spammers to change the user agent to "Mozilla..."
Using some client side Javascript would be harder for them to deal with (although if your browser can view it they will be able to also).
I guess graphics would be next...
Surfing slowly, in the Bandwidth Ghetto
On the other hand, there are ways to fight spambots; they just don't rely on trusting the user. Here's one way:
There are good ways to deal with spammers but this isn't one of them. It *might* work on a small scale and it definitely won't work on a medium or large scale. It's about as useful as the Sendmail "MX/domain validation" trick that Eric Raymond and the rest of the Sendmail team thought would stop spammers dead in its tracks. (It didn't.) Instead he was "surprised by spam."
-CT
It's not exactly what you mean, but something similar is The Book of Infinity. It doesn't generate email addresses, but it does generate an infinite website.
It's called wpoison, and it's found at http://www.monkeys.com/wpoison/. The problem is that it's very easy to detect -- note the lack of punctuation marks, scarcity of two and three letter words, capital letters, verbs... and the fact that there's a four second pause in the same place, page after page... in short, it would be easy enough to spot a wpoison-generated page.
I've coded up an alternative that suffers none of those obvious defects, and instead of throwing out bogus email addresses, it throws out valid spamcatcher addresses. Any SMTP host who sends a message to one of those addresses is blocked (via DJB's rbldns) for a month from sending mail into my domain. The blocklist is self-maintaining, so I never need to mess with it.
It's been in place for about three months now, and my blocklist contains 125 entries right now -- five of which are netblocks I've manually added. The URL, sure to catch a bucketful of bad spiders thanks to this link, is http://www.artsackett.com/personnel/ and it is intentionally as slow as the rectification of sin.
Warning: This signature may offend some viewers.
One big difference - MSN discriminated against valid browsers that were just people trying to view their website. The user agent IDs here (with a coupla exceptions - *cough* wget *cough*) are all things that are only ever used for spam purposes. There is a difference between blocking people because they don't use your software and blocking spam robots.