Defeating Captcha
An anonymous reader pointed us at PWNtcha, a package that breaks various online captcha algorithms. The site provides numerous examples of easy captchas (PayPal and an older version of Slashdot make the list) and hard ones. It also links to various sources explaining why captchas are a bad idea.
Entrepreneur : (noun), French for "unemployed"
here
Whew, I had never even heard of Captcha before...
A captcha is a type of challenge-response test used in computing to determine whether or not the user is human.
This is a good study of how hard it is to design secure systems. It's just like a non-cryptographer trying to create their own cipher, only in the visual processing world. Sadly, the article does not touch on non-visual captchas, which are alternatives for the blind. It would also be interesting to see what Jakob Nielsen might have to say on this technology from a usability perspective.
Of course, one of the primary drawbacks is that the concept of a captcha is patented, and the patent language is very broad: US Patent No. 6,195,698.
Also see the Wikipedia article for more information.
I swear this is not a troll. It actually was.
The best part is that *no* advance in captcha technology can really fix this. It's no longer a race against OCR technology; the hole can't be plugged by switching to object-based (rather than text-based) captchas, nor can it be stopped by switching to audio-based captchas.
The main article refers to Inaccessibility of Visually-Oriented Anti-Robot Tests, which deserves a read and commentary.
Among the claims:
- captchas are inaccessible to the blind - true
- a horde of human beings can decode the entire library over time - only true if the images are recycled, not if they are created on-demand or for one-time use.
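To make the "one-time use" point concrete, here is a minimal sketch of how a site might ensure each challenge can be answered only once and expires quickly. The class name `OneTimeChallenges`, the token format, and the 120-second lifetime are all assumptions for illustration, not anything described in the article. Note that this only stops a pre-solved library from being replayed; it does nothing against people solving fresh challenges in real time.

```python
import secrets
import time


class OneTimeChallenges:
    """Hypothetical in-memory store for single-use captcha answers."""

    def __init__(self, ttl_seconds=120.0):
        self.ttl_seconds = ttl_seconds
        self._pending = {}  # token -> (expected answer, expiry time)

    def issue(self, answer, now=None):
        """Register a freshly generated challenge and return its token."""
        now = time.monotonic() if now is None else now
        token = secrets.token_urlsafe(16)
        self._pending[token] = (answer, now + self.ttl_seconds)
        return token

    def verify(self, token, guess, now=None):
        """Check a guess; the token is consumed whether or not it matches."""
        now = time.monotonic() if now is None else now
        # pop() removes the entry, so a token can be checked only once
        entry = self._pending.pop(token, None)
        if entry is None:
            return False
        answer, expires_at = entry
        if now > expires_at:
            return False
        return secrets.compare_digest(
            answer.lower().encode(), guess.lower().encode()
        )
```

Because a wrong guess also consumes the token, a bot cannot brute-force one image; it has to request a new challenge for every attempt.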
It also discusses some of the side-effects of making access to real humans harder, or harder for a class of users such as the visually impaired. For example, I've seen sites that say "If you cannot read this, call this phone number for access." Too bad for you if you don't have a phone.
As alternatives, it offers
- logic puzzles
- sound output
- credit-card validation
- live operators
- limited use of unverified accounts, such as throttling for email
- behavior and heuristic analysis
- already-established credentials, such as single-sign-on systems or public-key-based systems
- biometrics
The article briefly discusses the pros and cons of each.
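Of the alternatives listed, throttling unverified accounts is the easiest to picture in code. The following is only a sketch under my own assumptions: the class name `UnverifiedThrottle`, the limit of 5 actions, and the one-hour window are made up for illustration and are not taken from the article.

```python
from collections import defaultdict, deque


class UnverifiedThrottle:
    """Hypothetical sliding-window limit on actions by unverified accounts."""

    def __init__(self, max_actions=5, window_seconds=3600.0):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # account -> recent action times

    def allow(self, account, now):
        """Return True and record the action if the account is under its limit."""
        recent = self._history[account]
        # forget actions that have fallen out of the window
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_actions:
            return False
        recent.append(now)
        return True
```

A real human sending a handful of emails never notices the limit, while a bot that registers an account to send thousands of messages is stopped after the first few.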
I rate its conclusion
"Visual verification alone is known to create problems with users. It is imperative that site designers take the needs of users with disabilities into account, and it is likewise hoped that one or more of these potential solutions can make that process easier."
as: insightful +5 obvious -1.
The article as a whole gets an "informative +5."
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
I've heard this myth repeated on slashdot many times, but never seen any evidence of it being implemented in the wild.
The W3C proposed in 2003 a number of Solutions for the Inaccessibility of Visually-Oriented Anti-Robot Tests, including logic puzzles, audio captchas, credit card validation, etc. It is interesting that they also show how a federated identity system can help users with disabilities.
http://www.gh-sts.com/captcha.txt
This is what slashdot's previous iteration of a captcha looked like in an in-memory associative array after the intersecting lines had been removed and a de-skewing algorithm applied. There was actually a version of the code after that which properly picked out where the lines actually intersected the letters and didn't erase the intersecting section to create those gaps.
Before they switched to the newest CAPTCHA system, I was breaking their CAPTCHAs with a modified SS.pl script with almost 100% accuracy (it had a little trouble properly splitting up the text when a j or other similar character wrapped partially under another letter).
Of course, the new CAPTCHAs are much harder. I can't even read some of them myself, but the point is that breaking CAPTCHAs that people can easily read usually isn't really that hard.
Yes, I used ImageMagick's Perlmagick library.
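For anyone wondering what "splitting up the text" looks like, here is a rough sketch of one common approach, column projection, in plain Python. This is not the SS.pl code; the function names and the `#`/`.` text format are assumptions made up for this example. It takes a cleaned, de-skewed black-and-white bitmap and cuts it wherever a column contains no ink.

```python
def parse_bitmap(rows):
    """Turn lines of '#' (ink) and '.' (background) into rows of booleans."""
    return [[ch == "#" for ch in line] for line in rows]


def split_columns(bitmap):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    if not bitmap:
        return []
    width = len(bitmap[0])
    # a column has ink if any row has a set pixel in that column
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # entering a run of inked columns
        elif not ink and start is not None:
            spans.append((start, x))  # blank column ends the run
            start = None
    if start is not None:
        spans.append((start, width))
    return spans
```

This also shows why a j that hooks under its neighbour is trouble: the two glyphs share inked columns, so no blank column separates them and they come out as a single span.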
Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
scroll down to the bottom, eegh O_O
It originated as an off-hand remark by someone - maybe Cory Doctorow, I forget - as an example for a theoretical way to break captchas. This was quickly misremembered and blown out of proportion by people wanting to seem clever on Slashdot.
Nice, the site owner probably added it when he added the notice to slashdot readers.
Editors -
Please don't link to the goatse man without at least some warning.
Thanks.
Thanks for linking the Goatse Man image in the article. Oh how I've missed being tricked into viewing thee.
The link is not work safe.
This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
THIS IS ONE GIANT TROLL ARTICLE! LOL!
About 3/4ths down the page there is a goatse picture, and the caption at the top thanks the GNAA. Wake up slashdot.
This is my sig. There are many like it, but this one is mine.
Udi Manber (while he was chief scientist at Yahoo) mentioned it was happening to Yahoo, during a presentation at UCR.
Nothing to see here; Move along.
There are several programs doing the TREC (Text REtrieval Conference) Question Answering track that give you an accuracy of 80% upwards, and that's for hard questions like historical data on a huge corpus.
follow me on Twitter: http://twitter.com/moeffju
1... is... not... a... prime...
For info on why, see the mathworld prime number entry.
Interestingly, it says that, at one time, 1 was considered prime and 2 was not. Pretty amazing, considering the importance of the Fundamental Theorem of Arithmetic.
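The reason it matters: if 1 counted as prime, factorizations would stop being unique, since 6 = 2 × 3 = 1 × 2 × 3 = 1 × 1 × 2 × 3, and so on. A minimal trial-division check that follows the modern convention might look like this (the function name `is_prime` is just my choice for the sketch):

```python
def is_prime(n):
    """Trial-division primality test; 0, 1 and negatives are not prime."""
    if n < 2:
        return False  # excludes 1 by definition
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    d = 3
    # only odd divisors up to sqrt(n) need checking
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True
```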
"It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward