How to Prevent Form Spam Without Captchas
UnderAttack writes "Spam submitted to web contact forms and forums continues to be a huge problem. The standard way out is the use of captchas. However, captchas can be hard to read even for humans. And if implemented wrong, they will be read by the bots. The SANS Internet Storm Center covers a nice set of alternatives to captchas. For example, the use of style sheets to hide certain form fields from humans, but make them 'attractive' to bots. The idea of these methods is to increase the work a spammer has to do to spam the form without inconveniencing regular users."
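For the hidden-field trick in the summary, the server-side half might look roughly like the sketch below. It assumes the form has a decoy input that the style sheet hides from humans; the field name "website" is made up for illustration and is not from the SANS article.

```python
def looks_like_bot(form, honeypot_field="website"):
    """Return True when the decoy field, which humans never see, was filled in.

    `form` is a plain dict of submitted field names to values.
    `honeypot_field` is a hypothetical name; pick something a bot finds tempting.
    """
    return bool(form.get(honeypot_field, "").strip())


# A human leaves the hidden field empty; a naive bot fills in every field.
human = {"name": "Alice", "message": "Hi there", "website": ""}
bot = {"name": "x", "message": "buy pills", "website": "http://spam.example"}
print(looks_like_bot(human), looks_like_bot(bot))
```

A submission that trips the check can be dropped silently so the bot gets no hint about why it failed.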
Ok, so captchas and other email obfuscation mechanisms are used a lot. Fine, a web designer can choose to do this.
Now, let's bring in US law: the Americans with Disabilities Act. Target is currently being sued for NOT complying with this federal law. I can understand why businesses would be required to comply, but where will the net-boundaries stop?
For example, I have a US corp. I hire an offshore datacenter to handle web processing. Is my website subject to the ADA, or does it not apply due to international boundaries? Yipe.
I hadn't read the article yet, just the summary, and as soon as they said 'hidden fields' that are attractive to spambots, I thought "Why not hide the fields from the spambot instead?"
It's easy: you just have JavaScript create all or part of the form, or modify the form in some way. It would happen before the user even sees the form, and the spambot would have to implement a JavaScript parser to get it. (Or a parser that's unique to your site.)
I would think AJAX would be a huge hindrance to them as well.
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
Encrypt the randomized field names with a private key and include a hidden public-key field. That way, the field names foo, bar, and abacab mean nothing to the bots, but will decrypt to subject, body, and spammer catcher.
I actually like the ones like that.
Instead of obfuscated images, just put in plain text questions.
What is 2+2?
What is the 3rd word in this sentence?
What is the name of my blog?
All of these can be answered by someone using a screen reader, and take less time than figuring out a captcha. Sure, it does not stop manual spamming, but what does?
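A rough sketch of how plain-text questions like these could be checked on the server. The question bank, the accepted answers, and the function names are all invented for illustration.

```python
import random

# Hypothetical question bank: each question maps to its accepted answers
# (compared after trimming whitespace and lowercasing).
QUESTIONS = {
    "What is 2+2?": {"4", "four"},
    "What is the 3rd word in this sentence?": {"the"},
}


def pick_question(rng=random):
    """Choose a question to show next to the comment form."""
    return rng.choice(list(QUESTIONS))


def check_answer(question, answer):
    """Return True if `answer` is one of the accepted answers for `question`."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted


print(check_answer("What is 2+2?", " Four "))
```

In a real form you would probably remember which question was asked on the server side (in the session, say) rather than trusting a question string sent back by the client.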
Do Or Do Not, There Is No Spoon, There Is Only Zuul. Everything in the above post is probably opinion.
Many that I've seen recently actually have an audio key to listen to if you can't read the image.
Good. Cheap. Fast. Pick Two.
My method is to just disallow posting of HTML. I have a simple blog, and if they try to do anything like post too many HREFs or something, then I just deny the post. That seems to work for the most part. The bots usually tried to post URLs on my site, so anything posted with < and > gets denied. They also try posting [link]...[/link], which also doesn't work on my blog, so I just display an error message and let the user fix it. You can still post straight URLs, but that's not much good for spammers, because they usually want a link. I also stop people from trying to post more than 5 URLs in a single post, since I noticed the bots like to do that. I recently upgraded my blog to use AJAX to submit the comments. It adds an extra layer of protection against the bots, but I really haven't needed any since I added the filters mentioned above.
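The filters described above could be sketched something like this. The regular expressions and error messages are guesses at reasonable behavior, not the poster's actual code; only the limit of 5 URLs comes from the post.

```python
import re

MAX_URLS = 5  # the post rejects anything with more than 5 URLs

# Assumed patterns: anything shaped like an HTML tag, [link]-style tags,
# and bare http/https URLs.
HTML_TAG = re.compile(r"</?[a-zA-Z][^>]*>")
BBCODE_LINK = re.compile(r"\[/?link\b[^\]]*\]", re.IGNORECASE)
URL = re.compile(r"https?://\S+", re.IGNORECASE)


def rejection_reason(comment):
    """Return an error message for the user, or None if the comment passes."""
    if HTML_TAG.search(comment):
        return "HTML is not allowed"
    if BBCODE_LINK.search(comment):
        return "[link] tags are not allowed"
    if len(URL.findall(comment)) > MAX_URLS:
        return "too many URLs"
    return None


print(rejection_reason('<a href="http://spam.example">cheap pills</a>'))
print(rejection_reason("See http://example.com for details"))
```

Returning a message rather than silently dropping the post matches the "display an error and let the user fix it" approach, so a legitimate commenter who trips the filter can recover.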
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
None of the spambots that attack my site fetch the comments page before trying to post. There's never (and I do mean never) a GET before a spambot's POST. So I have a hidden field with a meaningless name ("magic"), and the value is set to the server's current time. Comments with timestamps that are too old are ignored.
To make it less obvious that the value is a timestamp, it's XORed with a random number (which is included in the form value) and eight random, meaningless bytes are thrown in for good measure. The end result is 32 seemingly-random hex digits--it looks just like a session ID.
This technique certainly isn't going to fool a determined attacker, but no spammer is going to waste their time trying to figure it out.
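Here is one way the hidden timestamp field could be put together. The byte layout (4 bytes of masked time, 4 bytes of random key, 8 bytes of random filler) and the one-hour cutoff are assumptions chosen so the result comes out to 32 hex digits; the post does not give the real layout or the age limit.

```python
import os
import struct
import time

MAX_AGE_SECONDS = 3600  # assumed cutoff; the post only says "too old"


def make_token(now=None):
    """Build the value for the hidden field: 16 bytes, shown as 32 hex digits."""
    ts = int(time.time() if now is None else now) & 0xFFFFFFFF
    key = struct.unpack(">I", os.urandom(4))[0]  # random XOR mask
    filler = os.urandom(8)                       # meaningless padding
    return (struct.pack(">II", ts ^ key, key) + filler).hex()


def token_is_fresh(token, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if the token decodes to a time within the last max_age seconds."""
    try:
        raw = bytes.fromhex(token)
    except (ValueError, TypeError):
        return False
    if len(raw) != 16:
        return False
    masked, key = struct.unpack(">II", raw[:8])
    ts = masked ^ key  # undo the XOR to recover the timestamp
    current = int(time.time() if now is None else now)
    return 0 <= current - ts <= max_age


token = make_token()
print(len(token), token_is_fresh(token))
```

A bot that POSTs without a prior GET has no token at all, and a replayed token goes stale. As the post says, this is obfuscation rather than real protection: nothing here is signed, so anyone who works out the layout can forge a value.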
If the CAPTCHAs were being defeated by humans, there should have been no change. It had to have been spammers mass-OCR'ing images.
~ roscivs
Shameless plug! I developed a plugin for Ruby on Rails that uses DNSBLs to combat form spam. (begin shameless self promotion)
dnsbl_check rails plugin
Basically what the plugin does is check clients against one or more DNSBLs. You might know them from mail servers. You see, it turns out that the forms are almost always abused by bots. These bots are quite well known. sbl-xbl from spamhaus catches 80% in my setup, spamcop catches the rest. You enable the plugin for key controllers and it really does work.
(/end shameless self promotion) mod me down if you wish
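For anyone curious what a DNSBL check involves, here is a minimal sketch in Python. It is not the plugin's code and says nothing about its API; it only shows the general lookup convention (reverse the IPv4 octets, append the list's zone, and see whether the name resolves). The zone names below are placeholders, and the function names are invented.

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up: reversed octets followed by the zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4, zones, resolve=socket.gethostbyname):
    """Return True if any zone has an entry for the address.

    `resolve` is injectable so the check can be tested without network access.
    A failed lookup is treated as "not listed", so DNS trouble fails open.
    """
    for zone in zones:
        try:
            resolve(dnsbl_query_name(ipv4, zone))
        except socket.gaierror:
            continue  # no record in this zone
        return True
    return False


# Placeholder zones; substitute the lists you actually use.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
```

In a web app you would run this against the client address before accepting a POST on the forms that get abused, ideally with a cache and a short DNS timeout so a slow list does not stall the request.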