Yahoo CAPTCHA Hacked
Hell Yeah! reminds us of a 2-week-old development that somehow escaped notice here. A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows, and quotes the Russian researchers: "It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100,000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA."
by having a teenage boy do it in exchange for letting him see porn.
They're used to seeing Cyrillic, the captcha has got to be easier to read!
A few months ago Yahoo introduced a CAPTCHA to prevent bots entering their chatrooms. Within a few days every room on yahoo was filled with bots once more, and still are to this day.
Given the current situation of the chat rooms on yahoo, it comes as no suprise at all that the other parts of the Yahoo system are inadequately protected from bots either.
I've found Yahoo's CAPTCHA to be really annoying. I probably get it wrong about 20% of the time because the picture is so distorted (and I've been surprised that I got it right a lot of the time). I even considered writing them an email complaining about it, but then I realized they probably don't give a crap.
33% of Yahoo capitchas isn't really impressive - you still get a large quantity of negative hits, and unless you have an array of IP addresses (most people don't), there will still be a large quantity of addresses registered from a given IP. Also, a large quantity of negatives would cast doubt on any positive matches from the same IP.
Also, Yahoo captchas aren't that "hard" - they are black text from known font pools on a white background that get slightly warped and have black lines drawn on some characters. This is hardly strong since it doesn't hit all letters within the word (which is done by reCAPTCHA) or use a large font-pool variety.
Even the Slashdot Captcha is harder - it hits the whole image and uses different fonts within the word.
That reminds me of the age check for Leisure Suit Larry back in the day... Who knew that the desire of a horny teen to see pixellated boobs would lead to history research?
What doesn't kill you only delays the inevitable
without a chord is fine... ...it's when you're missing a cord that you need to worry
To register, answer these questions and click the button on the right
What colour are buses in London?
What is three times three?
[Red] [Green] [Blue]
Yes, those are undoubtedly hard questions for a computer. How, exactly, do you plan to generate billions of these questions? For a CAPTCHA to work, it must still be hard even if the generation algorithm is public knowledge.
Did anyone notice that the image recognition code is imported from a binary DLL? I was under the impression that the Russian hackers would provide the source for the recognition code as well. But then, the people who released this are only interested in generating as much spam. Why should you trust them? You would be foolish enough to _not_ execute your test program that imports this dll in a vmware instance instead of your actual machine. Anybody done a comprehensive strace to determine sockets/descriptors opened by using this dll?
Hence all good modern captchas have moved away from character recognition captchas (such as yours) to segmentation based captchas. You only need to read the wikipedia article on CAPTCHAs to see some examples: http://en.wikipedia.org/wiki/Captcha.
Not really. After a couple of (thousand) runs through, the attacker would have a reasonably accurate database of the questions. They can then analyze the text to find the nearest match to one of the questions in its database.
That's true. I've found, however, that introducing custom spam blocking methods, such as this, no matter how easy to break, often does a better job at stopping spam bots than more robust publicly available methods. For a target as big as Yahoo, this probably won't work, but I've found on PHPbb for instance, instead of using any of the publicly available captchas, which are easily defeated by bots, creating a simple question of this sort does wonders for bot-blocking. Even if it's just one question. If your site isn't big enough to be specifically targeted by bot farmers, sometimes a simple solution is better than a more complex one that everybody else is using.
ZuluPad, the wiki notepad on crack
Just put some hard to read perl code in there and ask the user to say what it does. If the answer is correct it's a bot, if the answer is wrong it's probably a human ;)