Why the CAPTCHA Approach Is Doomed
TechnoBabble Pro writes "The CAPTCHA idea sounds simple: prevent bots from massively abusing a website (e.g. to get many email or social network accounts, and send spam), by giving users a test which is easy for humans, but impossible for computers. Is there really such a thing as a well-balanced CAPTCHA, easy on human eyes, but tough on bots? TechnoBabble Pro has a piece on 3 CAPTCHA gotchas which show why any puzzle which isn't a nuisance to legitimate users, won't be much hindrance to abusers, either. It looks like we need a different approach to stop the bots."
So if the CAPTCHA is doomed, what is the next approach? Letting spam bots go rampant over a site is not an acceptable alternative.
Jumpstart the tartan drive.
...is the point going right over the author's head.
A CAPTCHA works well enough for the same reason greylisting works well enough. They may be trivial to bypass (for some definition of 'trivial'), but many applications only need a tiny speed-bump to make a huge difference in undesirable traffic.
...until AI gets smart enough to answer questions intuitively.
"To err is human, to mod Funny divine."
That's where the issue is.
I've been a nerd since I was born. Grew up with early computers. Watched them evolve until now. But nothing makes me feel dumber than trying a CAPTCHA 5 or 6 times and failing every time. It's a serious annoyance, and I've seen WORSE that I haven't even attempted.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
Block the IP address for 10 minutes, then an hour, then a day.
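A minimal sketch of that escalation, assuming each failed attempt counts as one strike and that the block length simply stays at a day once the list runs out. The class and method names are made up for illustration.

```python
import time

# Escalating block durations from the comment: 10 minutes, 1 hour, 1 day (in seconds).
BLOCK_STEPS = [10 * 60, 60 * 60, 24 * 60 * 60]


class EscalatingBlocker:
    """Hypothetical helper: blocks an IP for longer after each failure."""

    def __init__(self):
        self.strikes = {}        # ip -> number of failures recorded so far
        self.blocked_until = {}  # ip -> unix time at which the block ends

    def record_failure(self, ip, now=None):
        """Register a failure and return the block duration applied, in seconds."""
        now = time.time() if now is None else now
        strikes = self.strikes.get(ip, 0)
        # Assumption: after the last step, every further failure costs a full day.
        duration = BLOCK_STEPS[min(strikes, len(BLOCK_STEPS) - 1)]
        self.strikes[ip] = strikes + 1
        self.blocked_until[ip] = now + duration
        return duration

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        return now < self.blocked_until.get(ip, 0)
```

Note that this keys on the IP alone, so everyone behind one proxy shares the same strikes.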
The Kruger Dunning explains most post on
This troll actually gave me an idea. Why not ASCII art?
Give an ASCII art picture and ask the user to tell what it is.
In this case cock would let you through.
"I don't have to think. I only have to do it. The results are always perfect, but that's old news." - Meat Puppets
... which is another way of saying they really don't work at all. Both annoy legitimate customers and users while still allowing those with nefarious motives to do whatever they wanted to do in the first place.
If someone says he and his monkey have nothing to hide, they almost certainly do.
... you are a computer. Life, er, up-time will be easier.
The world is made by those who show up for the job.
Everyone seems to think that the answer to this is to challenge the user somehow. Why isn't a technical solution possible that doesn't require any interaction from a person?
On my own contact forms, I use a really simple obfuscation technique. It doesn't require any user interaction, and I don't get any spam. I've chosen to name my form elements with meaningless names, because obviously automated spammers rely on field names to fill in the blanks. If they see a form like this:
<input type="text" name="email">
<input type="text" name="subject">
<input type="text" name="message">
Obviously it's pretty easy to fill out. If they see this instead:
<input type="text" name="sj38d74j">
<input type="text" name="9sk2i84h">
<input type="text" name="m29s784j">
Then they probably won't even make it past the email validation part, unless they catch the error that my page is printing and try all combinations (or get lucky).
It makes it even more effective when you use fields with good names, but hide them from users with either CSS or Javascript:
<input type="text" name="email" style="display: none;">
That's a honeypot: if it's filled out, then it's a robot. You can use the same CSS or Javascript techniques to also print messages informing users not to fill those fields out if their browser decides not to run my code and shows them instead.
It's a really simple solution that requires no user interaction and is at least as effective as, if not more effective than, a challenge-and-response type of solution. I don't know why everyone is hung up on a visual challenge when it's a lot easier to distinguish between a real web browser and a scraper that doesn't bother to execute Javascript or apply CSS. I've been saying this for years though, so I don't really expect anyone to start paying attention now.. at least my own inbox is spam-free though.
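On the server side, the check could look roughly like the sketch below. It assumes the submitted form arrives as a plain dict of strings. Which random name stands for which real field is a guess (the comment doesn't say), and the function names are hypothetical.

```python
# Assumed mapping from the meaningless names in the form to their real meaning.
REAL_FIELDS = {"sj38d74j": "email", "9sk2i84h": "subject", "m29s784j": "message"}

# Decoy fields that are hidden from humans with CSS; only a scraper fills them in.
HONEYPOT_FIELDS = ("email",)


def looks_like_bot(form):
    """Return True when any hidden honeypot field was filled in."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)


def extract_message(form):
    """Translate obfuscated names back to real ones, or return None for a suspected bot."""
    if looks_like_bot(form):
        return None
    return {real: form.get(obfuscated, "") for obfuscated, real in REAL_FIELDS.items()}
```

A submission with the decoy `email` field filled in is dropped; anything else is mapped back to `email`, `subject` and `message` for normal validation.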
The author was arguing that one of the primary reasons to do captcha breaking is to get freebie email accounts on GMail/Yahoo to send spam from.
Limit the email the account can send, and you reduce the desire for the account. Reduce the usefulness of the account, and you reduce the desire to crack the captcha on new account signups, or at least the profitability in doing so.
It's one approach that would make a difference, but it's clearly not the only solution.
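A per-account cap of that kind might be sketched like this. The limit of 50 recipients per day is an invented number, and `SendLimiter` is a hypothetical name, not any provider's actual mechanism.

```python
from collections import defaultdict

DAILY_CAP = 50  # hypothetical limit for a freshly created free account


class SendLimiter:
    """Counts recipients per account per day and refuses sends over the cap."""

    def __init__(self, cap=DAILY_CAP):
        self.cap = cap
        self.sent = defaultdict(int)  # (account, day) -> recipients so far

    def try_send(self, account, day, recipients=1):
        """Return True and record the send if it fits under today's cap."""
        key = (account, day)
        if self.sent[key] + recipients > self.cap:
            return False
        self.sent[key] += recipients
        return True
```

With a cap like this, the total a spammer can send is the cap times the number of accounts they control, which is why account creation is still the thing being protected.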
I'm out of my mind right now, but feel free to leave a message.....
Because an open ended question would get a million different responses.
And having the user select a radio button would narrow the probability down to 1/X choices. And when you have a million bots, 1/X is more than enough to get your spam out.
You could use the same questions for every picture, just make them generic:
Example: Picture of cat.
Question 1: Does this fly?
Question 2: Is this living?
Question 3: Would a human be able to pick this up?, etc.
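To put rough numbers on the guessing problem: a bot that picks answers at random passes one X-way choice with probability 1/X, and k independent questions with probability (1/X)^k. The sketch below just does that arithmetic; the function names are made up, and it assumes the questions really are independent and the bot really is guessing blindly.

```python
def guess_success_rate(choices_per_question, questions=1):
    """Chance that uniformly random guessing gets every question right."""
    return (1 / choices_per_question) ** questions


def expected_passes(attempts, choices_per_question, questions=1):
    """Expected number of attempts that get through by pure guessing."""
    return attempts * guess_success_rate(choices_per_question, questions)
```

For example, `expected_passes(1_000_000, 2, questions=3)` covers the three yes/no questions above: one guess in eight succeeds, so about 125,000 of a million attempts get through.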
Copyright 2010. All rights reserved. This comment may not be copied in any way including, but not limited to caching.
has a different take on the subject. Rather than trying to obscure the image with lines or similar measures, it uses a series of letters, some of which are a color. You are then asked to type in the colored letters to proceed.
I don't know if these are static images or generated each time but the owner claims his site has almost no spammers (i.e. people have to do it, not machines).
We will bankrupt ourselves in the vain search for absolute security. -- Dwight D. Eisenhower
All the bot needs to do is do a google search for "site:example.com", hit a random sampling of the results, and then register.
In the grand scheme of things, it probably only adds a few percent of overhead for the bot.
Most CAPTCHAs are hacked because their implementation is amateurish. They are hacked by reusing session IDs or by dictionary attacks, which have nothing to do with the actual image itself. Long story short, CAPTCHAs reduce the amount of spam by more than 50% simply because it's not worth the effort for a spambot to break them; after all, they have the entire internet to spam.
Some are good, some are bad, and most are downright horrible, but you wouldn't want your favorite forum to be trolled by spambots, would ya? Might as well live with it. Nothing works 100%; you should know that by now.
did you forget to take your meds?
Already been done.
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
The more effort someone is willing to put out to prove they are human or are backed by a human willing to be responsible for problems, the more abuse-able services you give them.
For example, e-mail service providers could offer several tiers:
Simple signup/new accounts:
Limited number and size of incoming and outgoing messages.
Verified signup/driver's license with confirmation by paper mail:
Nearly-full, with shutoff or limitations imposed at first sign of abuse.
Verified signup/credit card with confirmation:
Nearly-full, with shutoff or limitations imposed at first sign of abuse.
Established account, with a pattern of usage indicative of a human over a period of several weeks:
Nearly-full, with shutoff or limitations imposed at first sign of abuse.
Credentialed user, backed by a substantial bond or deposit and an explanation of why suspicious behavior really is legitimate:
Full access plus a free pass on "legitimate" suspicious behavior until someone complains, but if it's abused then throttle him and take the costs out of his deposit.
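As a rough illustration, the tiers above could be written down as a lookup table. The proof names are paraphrased from the list, and the daily send limits are invented placeholders; the post only says "limited", "nearly-full" and "full".

```python
# Proof of humanity -> service level, following the tiers listed above.
TIERS = {
    "simple_signup": "limited",
    "verified_license": "nearly_full",
    "verified_credit_card": "nearly_full",
    "established": "nearly_full",
    "bonded": "full",
}

# Hypothetical numbers; None means no cap.
DAILY_SEND_LIMIT = {"limited": 20, "nearly_full": 500, "full": None}


def daily_limit(proof, abuse_reported=False):
    """Return the daily send limit for an account with the given proof."""
    level = TIERS[proof]
    if abuse_reported:
        # "Shutoff or limitations imposed at first sign of abuse."
        level = "limited"
    return DAILY_SEND_LIMIT[level]
```

Taking costs out of a bonded user's deposit is left out here; it would sit next to the `abuse_reported` branch.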
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Solve the following math problem to continue:
1/0 = ?
I have suggested a solution more times than I care to count:
There's your first clue that maybe your solution isn't the be-all-end-all you think it is.
impose default caps on sent emails per account, IP, whatever, until the sender has been established as a legit sender of mass mails.
OK, but who are you suggesting should impose these default caps? ISPs? That's fine, but the only way an ISP can do this is by firewalling outbound port 25 and requiring all their customers to relay mail through the ISP's mail server. A lot of ISPs do this and I wish more of them would, but it can cause problems for customers (if you're required to relay through your company's SMTP server instead and they haven't configured an alternate port such as 587, or if the ISP's SMTP server is poorly configured/overloaded/hacked/broken, then the user can't send mail and the resulting customer service calls are pretty expensive for the ISP and could drive the customer to leave).
On top of that, a lot of people are migrating away from traditional POP3/IMAP/SMTP e-mail accounts, and just using webmail services instead. Webmail services, of course, can impose all kinds of limits on the activities of their users, but these limits only make sense on a per-account basis. You can't put limits on the number of messages sent from one IP address regardless of who's logged in, because there could be 300 different users all connecting through a proxy server on one IP, and you have no way to tell the difference.
So, you have to limit each account. But a spammer can easily sign up for multiple accounts, using an automated program! Then they can get around your restrictions, by logging in on 300 different accounts and sending one e-mail from each of them. How do you prevent this?
By using a CAPTCHA.
Which is what we're talking about.
Thanks for playing!
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
I agree there are ways to circumvent it, but the majority of bots will not go to the trouble of doing that, and that's the key.
Another idea would be to observe mouse movements through Javascript to detect a real user. This would be VERY inefficient for a bot, and probably not worth the while.
This would work great until the majority of websites do it, then it is worth the overhead for the bot to go to the trouble of doing it. When CAPTCHA started it wasn't worth the bot writers' trouble to crack it. They just went to easier sites, but as more and more sites adopted CAPTCHA the value of cracking it became greater. Any successful system will eventually be adopted by a large enough number of websites to make it worth the bot writers' time to crack. At which time they will.
The truth is that all men having power ought to be mistrusted. James Madison
Still won't defeat the army of underpaid workers to do it.
Disclaimer: I am not god.
We may not be created equal
But we can be treated equal.
This is my favorite captcha, some are ridiculous: http://random.irb.hr/signup.php
Refresh the page a bit, fun to see what you can get.
Sturgeon was an optimist.
Most posts on this topic have been along the lines of, "Maybe CAPTCHAs as they are implemented now don't work, but here is a method that is trivial for people but hard for computers."
TFA's best argument, in my opinion, was that it is trivially inexpensive for a spammer to simply hire people to break CAPTCHAs. So, a method that doesn't annoy people but is hard for computers still won't work because the spammer will just use people. This is not a topic I know a lot about (not being a spammer I don't know what kind of revenue they generate) but would like to hear a response to this. Is the TFA off its gourd and better technology really will solve this problem? Or is gate-keeping for free services essentially pointless?
Greylisting only works because many sites don't use it; if everybody used it, it would stop working.
The economics of CAPTCHAs are even less favorable, since the cost of breaking a CAPTCHA is small compared to the cost of what the bot actually does after it has broken it.
Because my Lynx browser doesn't support it!
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
I don't understand why we still use CAPTCHAs or Kitty tests. I have been using a method on my service providing site for the past 5 years that fools any bot.
I simply state "Are you Human" yes/no.
You wouldn't believe the amount of success I have since I took down the earlier CAPTCHA technology I was using.
Almost immediately, the amount of customer emails I get daily increased 453%! Many of these customers were offering me things, such as money or drugs! I was also able to buy Viagra at near wholesale prices (and then turn them for a profit on my business trips to Florida).
My traffic has increased too! The amount of people using my free service almost took down my servers. I had to get 3 more! Of course, I am now operating at a loss, but I sleep well knowing that I made a difference in the world by letting so many people access my great service.
If anyone wants the "are you human" technology from me, I will give it away for free! Just email me. Thanks!
I watched an amazing mini-documentary about Re-Captcha and really like the concept and the end goal. Basically, Re-Captcha shows two words: one is known, and the other is unknown and comes from book digitization efforts. The known word gets you into the site for whatever you are doing; the unknown one comes from a literary work that OCR couldn't figure out. After a large sampling of people have typed the unknown word, the majority answer becomes the text entered in the digitization effort.
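The two-word scheme as described here can be sketched in a few lines. This is only an illustration of the description above, not Re-Captcha's actual implementation; the function names, the case-insensitive comparison and the `min_votes` threshold are all assumptions.

```python
from collections import Counter


def check_and_record(known_word, known_answer, unknown_answer, votes):
    """Admit the user on the known word only; log their reading of the unknown word."""
    if known_answer.strip().lower() != known_word.lower():
        return False
    votes[unknown_answer.strip().lower()] += 1
    return True


def consensus(votes, min_votes=3):
    """Return the strict-majority reading once enough votes exist, else None."""
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count > total / 2 else None
```

Here `votes` is a `Counter` kept per unknown word. Only people who got the known word right contribute a vote, which is what keeps bots from poisoning the transcription.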
My contention is that people like myself who think it is a great cause would happily spend some free/bored time just entering the unknown words on a website without the whole captcha bit. If anyone here is part of the team or knows anyone on it, please bring this idea up.
http://teasphere.wordpress.com - A little spot of tea
Would it really be that hard to have a picture of a rabbit and set it to accept bunny or rabit or even hare?
When you spell it "rabit", it is.
Everyone has a great idea for a CAPTCHA, but very few people know what the hell is really going on. Remember that the machine doesn't need to solve the CAPTCHA every time, that machines are infinitely patient and have huge memories, and that another machine needs to make sure the human gave the right answer!
Ideas that won't work:
Really, it's very easy to think you've come up with a very clever CAPTCHA. When you think that, all you've done is stoked your ego and screwed yourself over. It's the same reason why we don't roll our own cryptography: CAPTCHA-making is a very hard problem, mainly because your problem space must be infinite (to avoid an attacking machine simply memorizing answers), the answers verifiable by a machine, but the problems not solvable by a machine.
How many questions can be checked by machines but not answered by them?
Not many; fewer every day. There are no questions that can't be answered by a computer (and which can be answered by a human mind). The Church-Turing thesis [wikipedia.org] has some validity: the human mind is no more powerful than a Turing machine, and ultimately, computers and our brains are computationally equivalent. There's nothing a computer can't solve: there are just things we haven't figured out yet.
Is this like those things that pop up and ask you to type in what it says? Like letters and numbers? example: htyeopa9876hg.. but it's all fuzzy and you have to try and figure it out?
A CAPTCHA is not a Turing test. A Turing test requires that a person tell a computer and a human apart; the CAPTCHA problem is harder, from a certain point of view, because a computer is required to tell a human and a computer apart.
SPAM is sent from compromised computers. If you make people pay for posts then the owners of compromised computers will be billed - not the real senders of SPAM. Billing would help minimize the problem, but we would still receive a pile of SPAM. And a pile of people who only use their computer once a week would have to foot the bill.
When the PHPBB2 CAPTCHA became completely useless and I was seeing hundreds of bot registrations on a forum I ran, I built something else. I added a simple extra text field to the registration form. I ask a plain English question, giving away the answer, and require the user to write it in the blank.
e.g. What is the common name for a domesticated feline? (Starts with "c" and ends with "at". This is an anti-spam measure.)
The field is checked for the right answer on the post-processing. This stopped 100% of the fake registrations. I ended up doing this on practically every web-accessible form I have built since then, and I've seen the method pop up on other people's websites as well (certainly parallel evolution rather than "they got it from me").
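The post-processing check amounts to a lookup plus a forgiving comparison. A minimal sketch, assuming answers are compared case-insensitively with surrounding whitespace ignored; the `QUESTIONS` table and `is_correct` are hypothetical names, and a set of accepted answers leaves room for synonyms.

```python
# Question text -> set of accepted answers (lowercase).
QUESTIONS = {
    "What is the common name for a domesticated feline?": {"cat"},
}


def is_correct(question, answer):
    """Return True if the typed answer is one of the accepted answers."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted
```

An unknown question accepts nothing, so a bot that posts the form without ever loading the question fails by default.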
While that may be effective for the moment, as soon as a webmail provider starts using it, it'll be cracked overnight.
Limit the email the account can send, and you reduce the desire for the account. Reduce the usefulness of the account, and you reduce the desire to crack the captcha on new account signups, or at least the profitability in doing so.
Doesn't this increase the desire to get more accounts faster?
Sometimes, the captchas are ALWAYS unsolvable, like one site that uses complementary colours of the same intensity. That works well unless you can't read text on a complementary colour background, in which case you're always fscked. I am one of those.
Sounds like an animated captcha could be an alternative approach, since here you could vary the intensity over time. Of course the animated captcha should only be a server-generated series of bitmaps or vectors, and not be client generated (Flash would fail), for obvious reasons.
Jumpstart the tartan drive.
Then I have no idea how you would explain This.
Why is it so hard to only have politicians for a few years, then have them go away?
It's worse than that. Any free or recipient-pays message system is subject to exactly the same amount of abuse. When sending a message costs nothing, the marginal cost of advertising is zero. As long as the marginal gain is non-zero, however small, volume will go to infinity. You can filter and legislate to reduce the volume of this advertising, but you'll never actually eliminate it. These countermeasures just bring the marginal cost of email up to slightly above zero --- but not nearly high enough to discourage spam.
Email isn't special. SMTP is fine. There was fax-machine spam long before even Compuserve. Today, we see text message spam, Facebook spam, MySpace spam, and so on. Email itself isn't the problem. Changing what you call the system doesn't change how it works. It's recipient-pays messaging that's the problem.
Sure, sender-pay systems like the postal service see some volume of advertising, but the volume is kept down by the relatively high marginal cost. Ultimately, I don't see a way of reconciling free anonymous messaging with a spam-free inbox.
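The marginal-cost argument can be made concrete with one line of arithmetic: profit is volume times (gain per message minus cost per message). The figures in the usage note are invented purely to show the sign flip, and `spam_profit` is a made-up name.

```python
def spam_profit(volume, gain_per_message, cost_per_message):
    """Net profit of sending `volume` messages at a fixed marginal gain and cost."""
    return volume * (gain_per_message - cost_per_message)
```

With a hypothetical gain of $0.0001 per message and zero cost, profit is positive and keeps growing with volume, so there is no reason to ever stop sending. Raise the cost to a hypothetical $0.01 per message and the same volume loses money, which is the sender-pays effect described above.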
We all bloody well know how to get rid of spam but nobody ever talks about the real culprits. The credit card companies. The ones who facilitate the way for spammers to make money. Unfortunately the CC companies make money so they don't care, but let's face it, if the CC companies decided to get rid of spam and lose the income, it could be wiped out in a week. All they would have to do is deny any payments to somebody suspected of spam - problem solved - I never hear anybody bitch about the root of the problem which is the ability to receive payments.
Stay tuned for new sig...
It is a task for the United Nations. Spam is causing major damage to the world economy via lost work time, traffic, etc. We need internationally enforceable laws, which would make spam illegal and inevitably punishable worldwide.
It is a big problem and requires a big solution.
Our leaders shall overcome their cultural shock, phase out activities in local organizations, like EU, NATO, CIS, etc., and begin to work in a global setup, the UN, the WTU - world telecommunication union, Interpol, UNICEF, etc.
What is the point of fighting spam in, say, the USA, if it will continue to pour in from, say, Indonesia?
sweatshop ... paying roughly $5/hour
You're doing it wrong.
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.