Building a Better CAPTCHA

Posted by Soulskill on Friday January 23, 2009 @12:04PM from the we-have-the-technology dept.

jcatcw writes "Steven J. Vaughan-Nichols reports that CAPTCHA cracking isn't that difficult these days. It has even become a business. For example, DeCaptcher.com will solve CAPTCHAs for your spamming needs at a rate of $2 per 1,000 successfully cracked CAPTCHAs. In response, newer systems are in development. Both Carnegie Mellon and Penn State (is there something about the water in PA?) are working on image-based systems. ESP-PIX and SQ-PIX both require the viewer to interpret pictures. Imagination CAPTCHA from Penn State has the user find the center of an image. The idea is that humans are better at image recognition than computers, but humans can legitimately disagree on their interpretations and some humans are color blind. Problems remain. For now, sites would be well advised to look at reCAPTCHA (the system that works with Google Books and the Internet Archive to digitize printed texts), which comes with a wide variety of application and programming plug-ins and an open API."

34 of 197 comments

Indecipherable by Bordgious · 2009-01-23 12:06 · Score: 5, Insightful

I know _I_ often have trouble seeing those... Maybe some sort of an animated .gif would be better?
1. Re:Indecipherable by multisync · 2009-01-23 12:10 · Score: 3, Funny
  
  I know _I_ often have trouble seeing those... Maybe some sort of an animated .gif would be better?
  Me too. Wanna go halfers on 1000 CAPTCHAs?
  
  --
  I don't care why you're posting AC
2. Re:Indecipherable by Harik · 2009-01-23 15:35 · Score: 3, Insightful
  
  Pretty much. It's outsourcing your CAPTCHA solving to impoverished third-world solvers. So really, there's nothing they can do to make CAPTCHAs better: humans ARE solving them, it's just an economic imbalance being exploited.
  I use it because I'm sick of CAPTCHAs everywhere and it's dirt cheap. I figure if we break them badly enough, people will stop trying dumb technical solutions to social problems (spam).
Youtube captchas are terrible. by zymano · 2009-01-23 12:08 · Score: 2, Insightful

I speak for everyone. Captchas SUCK.
Get rid of them.
1. Re:Youtube captchas are terrible. by Goaway · 2009-01-23 15:38 · Score: 2, Insightful
  
  Yes, they are. They are not stopping all spammers, but that is very different from not stopping them at all.
Dying Technology by EdIII · 2009-01-23 12:10 · Score: 5, Insightful

The idea is that humans are better at image recognition than computers
C.A.P.T.C.H.A - Completely Automated Public Turing test to tell Computers and Humans Apart.
This is a dying technology.
1) Computers and synthetic systems in general are ONLY going to get better at doing anything a human can do. I mean anything.
2) Humans are a substitute for our lack of a synthetic system to solve a CAPTCHA.
A CAPTCHA gives its owner one of two answers: this is a human, or this is a computer. Humans can be hired to solve CAPTCHA at economically viable rates to meet the demand with a supply. Computers are catching up at solving various CAPTCHAs, creating an "arms race" between developers and those who need to crack CAPTCHAs automatically with high throughput.
The window in which this technology is effective is shrinking rapidly, and it will only be a matter of time before it is nearly impossible to tell, without physical inspection, a synthetic human response from an actual one.
1. Re:Dying Technology by Goaway · 2009-01-23 12:15 · Score: 4, Informative
  
  Humans can be hired to solve CAPTCHA at economically viable rates to meet the demand with a supply.
  Not in general. For high-value targets, yes. For spamming blog comments, no.
2. Re:Dying Technology by Dhalka226 · 2009-01-23 13:43 · Score: 3, Insightful
  
  Using a human being to solve a CAPTCHA is not "cracking" the CAPTCHA, nor does it make the next blog or even the next CAPTCHA any less secure. If the CAPTCHAs are actually successful enough that the only solution is to hire third-worlders to do them for you, a large part of the battle is already won.
  Will it stop all spam? No. Will all spam ever be stopped? Nope, so let's take what we can get while we can get it.
3. Re:Dying Technology by AaronLawrence · 2009-01-23 14:27 · Score: 2, Insightful
  
  And:
  3) As you make it harder to solve for computers, you also make it harder to solve for humans.
  Since current CAPTCHAs are getting quite difficult for humans to solve, the process has already reached its limit. Facebook's CAPTCHAs are difficult enough for me that I have to ask for a new one 5-10 times to get one I'm fairly sure of.
  This one involving optical illusions is absurd; there will be large numbers of people who can never get it right.
  
  --
  For every expert, there is an equal and opposite expert. - Arthur C. Clarke
4. Re:Dying Technology by arbitraryaardvark · 2009-01-23 16:30 · Score: 2, Funny
  
  obligatory xkcd solution to captchas
  http://xkcd.com/233/
5. Re:Dying Technology by AaronLawrence · 2009-01-23 17:53 · Score: 2, Informative
  
  Well actually, systems like the one on facebook do have a kind of "I don't know" which is the "give me another". At least it makes it possible to solve, if extremely annoying ...
  
  --
  For every expert, there is an equal and opposite expert. - Arthur C. Clarke
6. Re:Dying Technology by NinthAgendaDotCom · 2009-01-23 20:04 · Score: 2, Funny
  
  Computers and synthetic systems in general are ONLY going to get better at doing anything a human can do. I mean anything.
  Robot sex slaves, here we come!!!
  
  --
  -- http://ninthagenda.com/
7. Re:Dying Technology by EdIII · 2009-01-23 20:24 · Score: 2, Insightful
  
  Well actually, systems like the one on facebook do have a kind of "I don't know" which is the "give me another". At least it makes it possible to solve, if extremely annoying ...
  That's not what I meant. A Turing test is designed to test subjects and from their answers determine if it is a human or a computer. You are talking about the answer that a subject may give to the test itself. I was talking about the result that the Turing test may give to the researchers or the system. They are two different things.
  Clicking "I don't know" or "Give me another" equates to a failure result from the CAPTCHA's point of view, not a third result type.
8. Re:Dying Technology by CookedGryphon · 2009-01-24 00:10 · Score: 2, Insightful
  
  That's heading towards the voight-kampff test.
How to get around CAPTCHA for Porn? by corsec67 · 2009-01-23 12:11 · Score: 4, Insightful

Even if they had a perfect system that could tell a person from a computer, how can they prevent a CAPTCHA for porn system?
(You make a website offering porn for entering the solution to a CAPTCHA from a 2nd site, and then use that solution on that 2nd site)

--
If I have nothing to hide, don't search me
1. Re:How to get around CAPTCHA for Porn? by Dwedit · 2009-01-23 12:26 · Score: 3, Insightful
  
  Captchas have right or wrong answers, which can be immediately verified.
  Spam or not spam can not. Some imbeciles can just make random selections without caring. Even if you give posts to multiple people to see if they agree, you can get enough imbeciles to ruin the system.
2. Re:How to get around CAPTCHA for Porn? by sexconker · 2009-01-23 13:00 · Score: 2, Funny
  
  But you have to add captchas to your 3rd site to make sure a 4th site isn't spamming your (3rd) site with fake spam/legit answers in an effort to steal your porn (to make their own porn-fueled, captcha-solving farm).
3. Re:How to get around CAPTCHA for Porn? by kohaku · 2009-01-23 13:45 · Score: 4, Funny
  
  It's porn all the way down.
Logical next step by sakdoctor · 2009-01-23 12:12 · Score: 2, Funny

Instead of one little captcha at the end of a web form, the whole site will be a captcha.
All the form labels will be jumbled images, and there will be 9 form submit buttons, 8 with dogs and 1 with a cat.
All textual content can be a mangled image to stop scrapers as a bonus.
Oh and please don't actually build this.
1. Re:Logical next step by sexconker · 2009-01-23 13:03 · Score: 2, Informative
  
  An image capture program will just capture multiple frames and combine them, just like your eye (basically, effectively) does.
  Also, PAL is 50 fields per second, 25 frames per second. Not 25 fields and 12.5 frames.
Worded questions? by DavidR1991 · 2009-01-23 12:18 · Score: 2, Insightful

I thought the ideal captcha would be worded questions presented in the same image-like format as current captchas, e.g. "Two and Two makes?" or "The opposite of day is..?" Whilst the image recognition is now feasible, making a general system to solve this problem would be somewhat more difficult than just improved single-word captchas.
Annoyingly, however, the system to create such captchas cannot really be automated (in terms of creating the questions). So I suppose as long as the captchas are computer-created or can be made automatically, they will also be computer-crackable.
Build a database of inputs and outputs by KPexEA · 2009-01-23 12:24 · Score: 3, Interesting

Any CAPTCHA system can easily be cracked by building a large database of inputs and outputs that were actually solved by humans and then saved for lookup later. The inputs don't need to be text; they can contain images (or hash codes representing images), or CSS, or whatever is needed to define the input data. The only feasible way to stop this kind of caching of answers is to have no duplicate tests. For example, show a large field of randomly colored circles that all vary in size and position and move slowly around, then tell the user to hover the mouse over the largest blue circle, and next have them move the mouse over the green triangle, etc. Then base their "pass or fail" on whether they could move the mouse accurately and fast enough. And change the test often, like "put the mouse over the shape that looks like a bunny", etc.
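
To make the lookup idea above concrete, here is a minimal Python sketch. The class `SolvedCaptchaCache` and the sample challenge bytes are hypothetical, and the sketch assumes challenges repeat byte for byte; it is an illustration, not a description of any real cracking service.

```python
import hashlib


class SolvedCaptchaCache:
    """Hypothetical store of human-solved challenges, keyed by a hash of the raw input."""

    def __init__(self):
        self._answers = {}

    @staticmethod
    def _key(challenge: bytes) -> str:
        # Exact hashing only matches byte-identical challenges.
        return hashlib.sha256(challenge).hexdigest()

    def record(self, challenge: bytes, answer: str) -> None:
        """Save an answer that a human produced for this challenge."""
        self._answers[self._key(challenge)] = answer

    def lookup(self, challenge: bytes):
        """Return the stored answer, or None if this challenge was never seen."""
        return self._answers.get(self._key(challenge))


cache = SolvedCaptchaCache()
cache.record(b"image-bytes-for-challenge-1", "x7kq2")

repeat = cache.lookup(b"image-bytes-for-challenge-1")   # seen before: cache hit
fresh = cache.lookup(b"image-bytes-never-seen-before")  # unique test: cache miss
```

The miss on the second lookup is the point of the "no duplicate tests" defense: a cache like this only pays off when the same challenge comes around again.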
Cylon Detector by fathom108 · 2009-01-23 12:33 · Score: 3, Funny

Will this detect Cylons?
Suck it, Vernor & Kurzweil by Anonymous Coward · 2009-01-23 12:34 · Score: 3, Insightful

No one could ever predict that it would be spammers and porn merchants who would solve the hardest problems in AI.
Stop Comment Spam By Analysing the Actual Content by jwieland · 2009-01-23 12:45 · Score: 2, Insightful

Enough with the annoying captchas. Stop comment spam by just analyzing the content.
Free and works well:
http://defensio.com/
I really hate by BetterSense · 2009-01-23 12:46 · Score: 4, Interesting

I really hate image-based CAPTCHAs, because they discriminate against lynx users. I seriously remember at least one occasion where I was using lynx for whatever obscure reason, and I came upon "enter the text shown in the box at the left". Fail. I like the math problem ones better.
COLORblind? How about BLIND blind? by Ungrounded+Lightning · 2009-01-23 12:54 · Score: 5, Interesting

The idea is that humans are better at image recognition than computers, but humans can legitimately disagree on their interpretations and some humans are color blind.
COLOR blind? Some humans are BLIND blind. Others have various vision or vision processing impairments that would make meatware-visual-coprocessor-test CAPTCHAs reject them.
IMHO most CAPTCHAs are already, and obviously, in violation of the Americans with Disabilities Act. So now, in the info-war between weapons and armor (which weapons always win anyhow), even more of us less-than-Aryan-Supermen become collateral damage.
Dogs are (allegedly) color blind and "... on the Internet nobody can tell you're a dog!". Well, maybe PEOPLE can't. But now the web applications can. B-(
The solution to being attacked by better weapons is not better armor. That's only a stopgap. The solution is to hunt down those who misuse weapons and make them incapable of or unwilling to continue.

--
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
I like how reCAPTCHA is the recommendation... by Stile+65 · 2009-01-23 12:56 · Score: 2, Interesting

...even though CraigsList uses reCAPTCHA and the article talks about a utility that helps spammers automatically post on CL.
Besides, it's fairly easy to set up a Mechanical Turk HIT for users to solve CAPTCHAs for a penny apiece. Assuming you make more than a penny per captcha solved, you're set. If not, make someone successfully solve more than one CAPTCHA per HIT submission.
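
The break-even reasoning can be worked through in a few lines of Python. The two prices come from this discussion ($2 per 1,000 solves in the story, a penny per HIT here); the revenue figure of 0.3 cents per solved CAPTCHA is purely a made-up assumption for illustration, as are the function names.

```python
import math
from fractions import Fraction

# Prices quoted in the discussion, in cents per solved CAPTCHA.
DECAPTCHER_CENTS = Fraction(200, 1000)  # $2 per 1,000 solves
MTURK_HIT_CENTS = Fraction(1)           # one penny per HIT


def profit_per_solve(revenue_cents, cost_cents):
    """Profit in cents for one solved CAPTCHA."""
    return Fraction(revenue_cents) - Fraction(cost_cents)


def captchas_per_hit_to_break_even(revenue_cents, hit_price_cents=MTURK_HIT_CENTS):
    """Smallest number of CAPTCHAs one HIT must cover for revenue to meet its price."""
    return math.ceil(Fraction(hit_price_cents) / Fraction(revenue_cents))


# Hypothetical revenue of 0.3 cents per solved CAPTCHA (not a real figure).
revenue = Fraction(3, 10)
needed = captchas_per_hit_to_break_even(revenue)
```

Under that assumed revenue, one CAPTCHA per penny HIT loses money, which is exactly the case where the post suggests bundling several CAPTCHAs into each HIT.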

--
I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
Re:Pay captcha creators :) by brusk · 2009-01-23 13:45 · Score: 2, Insightful

Presumably the universe of tunes every internet user could be expected to know is quite small, so it would only be a matter of matching to that set. There's already an iPhone app (Shazam, I think it's called) that can identify ambient music and send you to the iTunes purchase link. That's presumably a much harder problem (a vastly bigger universe and probably poorer sound quality), and it's already been solved.

--
.sig withheld by request
Nope, that won't work either. by IdahoEv · 2009-01-23 14:06 · Score: 3, Insightful

Give me the frames of such an animation and I can trivially write a program that simulates persistence of vision by smearing the pixels over time, thus making it solvable by a computer.
In the long run, CAPTCHAs are doomed.
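
A rough sketch of that "smearing" in Python, assuming a toy animation in which each frame shows only a fragment of a hidden pattern plus one speck of noise. The frame generator, the 0.25 threshold, and the tiny 3x4 grid are all invented for illustration; a real animated CAPTCHA would need more work than this.

```python
def smear(frames, threshold=0.25):
    """Average each pixel over all frames, then binarize the result."""
    count = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [1 if sum(frame[r][c] for frame in frames) / count > threshold else 0
         for c in range(cols)]
        for r in range(rows)
    ]


# Hidden pattern (a small ring); the last column is background.
HIDDEN = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]


def make_frames(hidden, count=6):
    """Build frames that each reveal only about a third of the pattern."""
    frames = []
    for i in range(count):
        frame = [
            [value if (r + c + i) % 3 == 0 else 0 for c, value in enumerate(row)]
            for r, row in enumerate(hidden)
        ]
        if i == 0:
            frame[1][3] = 1  # a single-frame speck of noise on the background
        frames.append(frame)
    return frames


frames = make_frames(HIDDEN)
recovered = smear(frames)  # pattern pixels persist across frames, the speck does not
```

No single frame contains the whole pattern, but pixels that recur across frames survive the averaging while the one-off speck falls below the threshold.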

--
I stole this sig from someone cleverer than me.
Build a system that's not spammable. by SanityInAnarchy · 2009-01-23 14:50 · Score: 2, Interesting

I'm not sure how, yet, but I want people to start thinking about it this way.
Just like DRM.
See, with DRM, start with the assumption that all DRM can and will be cracked, and that all software and media can and will be pirated. Your challenge, then, is to make the legitimate product provide at least the quality and value of the pirated copy (something most DRM'd solutions fail miserably at), and ideally make it desirable enough that your price starts to seem reasonable, even when the alternative is "free".
So, the same applies to CAPTCHAs. Start with the assumption that all CAPTCHAs can and will be cracked, even if "cracking" means "using Mechanical Turk and/or a real sweatshop to have humans crack it". Now, start thinking in terms of economics. Build a system which doesn't have sufficiently good payoff for cracking it for anyone to bother -- a system which, by its very nature, can't be spammed.
If you can at least get it to where the only waste is bandwidth and disk space, you're doing pretty good. That's about my current spam situation -- it's a statistical filter which operates on the entire message, but it works incredibly well.
Until then, an automated hack that seems to work well, at least to stop blog spam, is to require AJAX, and send a bit of programmatically generated (but always different) JavaScript, and verify that it was executed. This will stop most automated systems until they start specifically targeting you with embedded Javascript engines. Next: Make it computationally expensive, so that they have to use a botnet if they're to get any real results.
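
One way to read "make it computationally expensive" is a hashcash-style proof of work. That is an assumption about what such a scheme could look like, not something the post specifies, and the sketch below is in Python rather than the browser-side JavaScript the post has in mind. All function names and the 12-bit difficulty are hypothetical.

```python
import hashlib
import secrets


def new_challenge() -> str:
    """Server side: issue a fresh random challenge for each form load."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits


def is_valid(challenge: str, counter: int, difficulty_bits: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a counter; cost doubles with each extra bit."""
    counter = 0
    while not is_valid(challenge, counter, difficulty_bits):
        counter += 1
    return counter


challenge = new_challenge()
answer = solve(challenge, difficulty_bits=12)  # roughly 4,096 hashes on average
```

The asymmetry is the design choice: the poster pays thousands of hashes per comment, the server pays one to verify, and the difficulty can be raised if spam volume grows.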

--
Don't thank God, thank a doctor!
1. Re:Build a system that's not spammable. by SanityInAnarchy · 2009-01-24 06:52 · Score: 2, Interesting
  
  Please, don't suggest something stupid AND already obsolete, we might get saddled with it.
  Fortunately, it has two advantages:
  First, for those who aren't using botnets, or sufficiently large botnets, it's a significant impediment.
  Second, more cycles increases the chance that people will notice their computers slowing down and figure out it's a botnet.
  Finally, it really doesn't matter whether we get saddled with it or not -- since it's just using Javascript, it's no more cumbersome than Slashdot's current comment system. And if it's completely ineffective, it could be turned off with no ill effects.
  
  --
  Don't thank God, thank a doctor!
Re:OCR by Strange+Ranger · 2009-01-23 15:40 · Score: 2, Interesting

I was thinking brute force isn't feasible when every failure generates a new question.
But let me take another stab at it.

What if the question wasn't always "what is in the picture?"
Take a database of 1000 basic images (animals, shapes, fruits, and vegetables), each matched to the word for what it is and its category (animal, fruit, etc.). Now the CAPTCHA shows 6 of them in 6 little squares (~985 quadrillion combinations). It can ask a nearly endless list of questions using simple formulae:

What is the third image?
How many animals are shown? Spell the number.
Type the first 2 letters of each fruit.
Type the shape names using no spaces.

Instead of always asking "what are the 5 digits" now we're asking for an almost arbitrary number of digits. And there are 6 picture images that have to be ID'd.

Did I beat the OCR problem w/o introducing any fatal new ones?
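
For what it's worth, the ~985 quadrillion figure matches the number of ordered layouts of 6 distinct images out of 1000, and the question formulae are easy to sketch. The eight-entry `CATALOG` below is a hypothetical stand-in for the 1000-image database, and the function names are invented for illustration.

```python
import math
import random

# Hypothetical miniature catalog: image name -> category.
CATALOG = {
    "cat": "animal", "dog": "animal", "horse": "animal",
    "apple": "fruit", "pear": "fruit",
    "circle": "shape", "square": "shape",
    "carrot": "vegetable",
}
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five", "six"]


def ordered_layouts(catalog_size: int, slots: int = 6) -> int:
    """Ways to place distinct images into ordered slots."""
    return math.perm(catalog_size, slots)


def make_challenge(rng: random.Random, slots: int = 6):
    """Pick images, then one of the question formulae and its expected answer."""
    shown = rng.sample(sorted(CATALOG), slots)
    animal_count = sum(CATALOG[name] == "animal" for name in shown)
    answers = {
        "What is the third image?": shown[2],
        "How many animals are shown? Spell the number.": NUMBER_WORDS[animal_count],
        "Type the first 2 letters of each fruit.":
            "".join(name[:2] for name in shown if CATALOG[name] == "fruit"),
        "Type the shape names using no spaces.":
            "".join(name for name in shown if CATALOG[name] == "shape"),
    }
    question = rng.choice(sorted(answers))
    return shown, question, answers[question]


layouts = ordered_layouts(1000)  # about 9.85e17 ordered layouts
shown, question, answer = make_challenge(random.Random())
```

Note that the answer space is much smaller than the layout space: a question like "how many animals?" has at most seven possible answers here, so the large layout count alone does not make guessing hard.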

--

Operator, give me the number for 911!
Re:OCR by DamnStupidElf · 2009-01-24 06:23 · Score: 2, Insightful

Who the hell knows that shit??? O_o
Google.
In other news, it's probably a bad idea to base a captcha on something Google will look up for you.