Yahoo CAPTCHA Hacked
Hell Yeah! reminds us of a 2-week-old development that somehow escaped notice here. A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows, and quotes the Russian researchers: "It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100,000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA."
33% of Yahoo capitchas isn't really impressive - you still get a large quantity of negative hits, and unless you have an array of IP addresses (most people don't), there will still be a large quantity of addresses registered from a given IP. Also, a large quantity of negatives would cast doubt on any positive matches from the same IP.
Also, Yahoo captchas aren't that "hard" - they are black text from known font pools on a white background that get slightly warped and have black lines drawn on some characters. This is hardly strong since it doesn't hit all letters within the word (which is done by reCAPTCHA) or use a large font-pool variety.
Even the Slashdot Captcha is harder - it hits the whole image and uses different fonts within the word.
The character outlines are nicely distinct, which means that even basic OCR software should be able to break the CAPTCHA. Since it's so easy to break, you want to hide it from any bots that come by: remove all references to "captcha" from the page source, and you might want to move the HTML for the image away from the HTML for the entry box.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
I'm impressed. That's better than I can do. Some CAPTCHAs take me five or six tries to get right.
-William Brendel
Hence all good modern captchas have moved away from character recognition captchas (such as yours) to segmentation based captchas. You only need to read the wikipedia article on CAPTCHAs to see some examples: http://en.wikipedia.org/wiki/Captcha.
http://news.bbc.co.uk/2/hi/technology/7067962.stm
Here is a link to a BBC article about something like that. It's a Windows program that rewards typing in captchas by showing a woman that takes off progressively more and more clothes.
I don't know exactly how large porn images are, never having looked at them, but if you guess a round number of 0.1 MB per picture, it's only about $0.0001, or 0.01 cent per captcha. I suppose it's better than nothing, but it's not yet very cost-prohibitive.