Yahoo CAPTCHA Hacked
Hell Yeah! reminds us of a 2-week-old development that somehow escaped notice here. A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows, and quotes the Russian researchers: "It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100,000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA."
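The arithmetic behind the quoted claim is easy to check. The sketch below is only a back-of-the-envelope calculation: the 15% and 35% accuracy figures, the 100,000 tries per day, and the one cent per manually solved CAPTCHA come from the summary above, while the function names are made up for illustration and the attacker's own running costs are assumed to be negligible.

```python
def expected_solves(tries_per_day: int, accuracy_percent: int) -> int:
    """Expected number of CAPTCHAs solved per day (integer arithmetic)."""
    return tries_per_day * accuracy_percent // 100


def manual_cost_cents(solves: int, cents_per_captcha: int = 1) -> int:
    """What the same number of solves would cost if paid humans did them."""
    return solves * cents_per_captcha


for accuracy in (15, 35):  # "enough" figure vs. reported figure
    solves = expected_solves(100_000, accuracy)
    dollars = manual_cost_cents(solves) / 100
    print(f"{accuracy}% accuracy: {solves} solves/day, "
          f"worth ${dollars:.2f}/day of manual solving")
```

At 15% that is 15,000 solved CAPTCHAs a day, which would cost $150 a day at the quoted manual rate.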
by having a teenage boy do it in exchange for letting him see porn.
They're used to seeing Cyrillic, the captcha has got to be easier to read!
A few months ago Yahoo introduced a CAPTCHA to prevent bots entering their chatrooms. Within a few days every room on Yahoo was filled with bots once more, and they still are to this day.
Given the current situation of the chat rooms on Yahoo, it comes as no surprise at all that the other parts of the Yahoo system are inadequately protected from bots as well.
I did my own captcha, but I'm not sure how much it's worth - figured any non-standard one is better than none (or a standard one).
Please take a look - are the effects actually helping the recognition process?
--
social bookmarking widget for your site
Natural language processing etc:
To register, answer these questions and click the button on the right
What colour are buses in London?
What is three times three?
[Red] [Green] [Blue]
I've found Yahoo's CAPTCHA to be really annoying. I probably get it wrong about 20% of the time because the picture is so distorted (and I've been surprised that I got it right a lot of the time). I even considered writing them an email complaining about it, but then I realized they probably don't give a crap.
We hate CAPTCHA. Most things they do to make it difficult for computers to decode make it a lot more difficult for humans to decode. Most of them are not usable by text browsers (duh) or by the blind. Some have audio that is hard for people to hear and still easy for a computer to decode. Last, CAPTCHAs are so overused that people just do them without thinking. For all you know, that Porn/ware site is using you to solve CAPTCHAs for them. Not that it is needed. This is just one more nail in the CAPTCHA coffin.
Solving 33% of Yahoo CAPTCHAs isn't really impressive - you still get a large quantity of negative hits, and unless you have an array of IP addresses (most people don't), there will still be a large quantity of addresses registered from a given IP. Also, a large quantity of negatives would cast doubt on any positive matches from the same IP.
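As a rough illustration of how a large quantity of negatives from one IP could cast doubt on its positives, here is a minimal sketch of per-IP bookkeeping. Everything in it is an assumption made for the example: the class name, the thresholds (20 attempts, 50% failure rate), and the addresses, which come from the reserved documentation range. Nothing here describes what Yahoo actually does.

```python
from collections import defaultdict


class AttemptTracker:
    """Counts CAPTCHA attempts and failures per IP address (hypothetical)."""

    def __init__(self, min_attempts: int = 20, max_failure_rate: float = 0.5):
        self.min_attempts = min_attempts
        self.max_failure_rate = max_failure_rate
        self._stats = defaultdict(lambda: [0, 0])  # ip -> [attempts, failures]

    def record(self, ip: str, solved: bool) -> None:
        stats = self._stats[ip]
        stats[0] += 1
        if not solved:
            stats[1] += 1

    def is_suspicious(self, ip: str) -> bool:
        attempts, failures = self._stats.get(ip, (0, 0))
        if attempts < self.min_attempts:
            return False  # too little data to judge
        return failures / attempts > self.max_failure_rate


tracker = AttemptTracker()
for i in range(100):
    # Roughly one success in three, like the cracker described above.
    tracker.record("203.0.113.7", solved=(i % 3 == 0))
print(tracker.is_suspicious("203.0.113.7"))
```

A check like this only helps while the attempts come from a small set of addresses.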
Also, Yahoo captchas aren't that "hard" - they are black text from known font pools on a white background that get slightly warped and have black lines drawn on some characters. This is hardly strong since it doesn't hit all letters within the word (which is done by reCAPTCHA) or use a large font-pool variety.
Even the Slashdot Captcha is harder - it hits the whole image and uses different fonts within the word.
That reminds me of the age check for Leisure Suit Larry back in the day... Who knew that the desire of a horny teen to see pixellated boobs would lead to history research?
What doesn't kill you only delays the inevitable
without a chord is fine... ...it's when you're missing a cord that you need to worry
>To register, answer these questions and click the button on the right
>What colour are buses in London?
>What is three times three?
>[Red] [Green] [Blue]
Yes, those are undoubtedly hard questions for a computer. How, exactly, do you plan to generate billions of these questions? For a CAPTCHA to work, it must still be hard even if the generation algorithm is public knowledge.
What about introducing spelling and grammatical errors? This would be difficult for a computer to interpret, but doable for a human.
Paul Anderson
"I drank WHAT?!" -- Socrates
I'm impressed. That's better than I can do. Some CAPTCHAs take me five or six tries to get right.
-William Brendel
Are you bashing MS just to bash them? Honestly, their so-called 'stupid system' is the best thing I've seen out there. Please enlighten me, wise one, and link me to a better alternative.
It's not a challenge to bash MS, that comes way too easy, but adding some useful content to /. might be a challenge for yourself, wise one.
p.s. How do you know that Gmail accounts haven't been hacked into? Do you have data validating this?
Hey now, be fair...what's the point of bungee jumping if you can't have "Thunderstruck" or similar playing on the way down?
Jumping without a chord would be no fun at all.
I can has sig?
Did anyone notice that the image recognition code is imported from a binary DLL? I was under the impression that the Russian hackers would provide the source for the recognition code as well. But then, the people who released this are only interested in generating as much spam as possible. Why should you trust them? You would be foolish not to run any test program that imports this DLL in a VMware instance instead of on your actual machine. Has anybody done a comprehensive strace to determine the sockets/descriptors opened by this DLL?
Not really. After a couple of (thousand) runs through, the attacker would have a reasonably accurate database of the questions. They can then analyze the text to find the nearest match to one of the questions in its database.
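A minimal sketch of that nearest-match lookup, using Python's standard `difflib`. The database contents, the function names, and the 0.8 similarity cutoff are all assumptions for illustration; the two questions are the ones from the example earlier in the thread.

```python
import difflib

# Hypothetical database harvested from earlier runs: question -> answer.
KNOWN_ANSWERS = {
    "what colour are buses in london?": "red",
    "what is three times three?": "9",
}


def normalize(question: str) -> str:
    """Lower-case the question and collapse runs of whitespace."""
    return " ".join(question.lower().split())


def lookup_answer(question: str, cutoff: float = 0.8):
    """Return the stored answer for the closest known question, or None."""
    matches = difflib.get_close_matches(
        normalize(question), list(KNOWN_ANSWERS), n=1, cutoff=cutoff
    )
    return KNOWN_ANSWERS[matches[0]] if matches else None


# Small rewordings or misspellings still land on the stored question.
print(lookup_answer("What color are busses in London?"))
```

Because the match is fuzzy, deliberately introducing spelling or grammatical errors into the questions would not obviously defeat it.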
Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
Botnets have a whole bunch of IP addresses. Simply deploy your Yahoo CAPTCHA cracker code on a botnet that some other fine internet entrepreneur has assembled, and it doesn't matter how many negatives you generate because they will be from a variety of hosts. Certainly with 33% success rate, you're doing pretty well, especially considering your typical spray-and-pray spam blitz.
Yeah, that would solve the problem until someone developed an automated program to check spelling and grammar, which I'm sure is near-imposible. (By the way, does anyone know why there's a red line under that last word? Is my screen screwed up?)
kthxby
[Fuck Beta]
o0t!
>What about the form that is around the captcha, generally a new account application, etc? What if those were to be made dynamic so the automated software trying to look for a hard-coded form fail?
Even if this were dynamic, there are only so many possible methods of displaying a form while still letting it be decipherable by a human. Given this limited set of possibilities, the programmer of a spam bot needs only to take into account any possible page mutations. More likely though, the spammer doesn't even look at a certain spot on the page; they probably use a little javascript to search the DOM for all text boxes and all images and ignore any images they already have copies of, so the remaining image is likely the captcha. Then they would just search for context clues around the text boxes to see which box is most likely to be the one that accepts the captcha answer.
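The scan described above can be sketched in a few lines. This version uses Python's standard `html.parser` rather than javascript in the page, purely to keep the example self-contained; the class name, the helper functions, and the list of hint words are hypothetical.

```python
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Collects image URLs and the names of text inputs from a page."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.text_inputs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "input" and attrs.get("type", "text") == "text":
            self.text_inputs.append(attrs.get("name", ""))


def find_captcha_candidates(html: str, known_images: set):
    """Return (images not seen before, names of all text inputs)."""
    scanner = FormScanner()
    scanner.feed(html)
    unknown = [src for src in scanner.images if src not in known_images]
    return unknown, scanner.text_inputs


def guess_answer_field(input_names, hints=("captcha", "verify", "code")):
    """Pick the first text box whose name contains a hint word (assumed clues)."""
    for name in input_names:
        if any(hint in name.lower() for hint in hints):
            return name
    return None
```

The point is that shuffling the form layout changes nothing here, since the scan never depends on where an element sits on the page.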
>Or better yet, have questions that modern computer AI has yet to break. Show a picture of a circle and ask "is this round?" or "is this not round?". Generally make the questions a bit more complex as AI gets better.
This also suffers from the problem of a limited number of possibilities. If someone can spend time putting questions in, someone can spend time filling in answers, and they only have to fill in answers once; after that, the bot can remember them for the next time it sees the same question.
If some sort of AI was used that could ask common sense questions, like Cyc, the problem would be that the spammers have access to the very same AI.
The leading thought is that AI is not going to create better CAPTCHAs, but that bots that break CAPTCHAs are going to create better AI.
>I wonder if there could be some sort of AI research project that works in conjunction with a captcha system.
Not exactly AI, but the reCAPTCHA project does use CAPTCHAs to decipher text that OCR programs can't read when scanning books.
Yahoo!'s captcha has been hacked, perhaps not as well, in the past. I've seen open http proxies pounding away at Yahoo to the tune of 100,000 per hour and more. Hotmail's is broken, so are others. The real shame is that the Storm Worm controllers are being protected by a national government and law enforcement system.
So what's the answer?
I'm sure I don't know. I do know that the wild west theory of accepting any kind of behaviour isn't acceptable. I know that some minimum standard of what's allowed and what isn't is going to have to be set. Where these limits are placed is a thing for a global conversation, and there will be differences of opinion.
Is cracking a captcha acceptable? Is phishing and identity theft acceptable? Is fraud and uncontrolled spam acceptable? What limits, and on what actions?
I'm just not that smart. But I think we can agree on a few things. Let's start to find out what those things are... and start acting in concert with other network operators to enforce those standards. Fail to meet them, and your network routing gets dropped...
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.
Segmentation and intersecting arcs can be difficult for automated attacks: http://portal.acm.org/citation.cfm?id=1054972.1055070
You know those annoying flash advertisement games (shoot the monkey for a free iPod)? Well, they could potentially be adapted for CAPTCHAs as well: http://cups.cs.cmu.edu/soups/2006/posters/misra-poster_abstract.pdf
>Not really. After a couple of (thousand) runs through, the attacker would have a reasonably accurate database of the questions. They can then analyze the text to find the nearest match to one of the questions in its database.
That's true. I've found, however, that introducing custom spam-blocking methods such as this, no matter how easy to break, often does a better job of stopping spam bots than more robust publicly available methods. For a target as big as Yahoo, this probably won't work, but I've found that on phpBB, for instance, instead of using any of the publicly available captchas, which are easily defeated by bots, creating a simple question of this sort does wonders for bot-blocking. Even if it's just one question. If your site isn't big enough to be specifically targeted by bot farmers, sometimes a simple solution is better than a more complex one that everybody else is using.
ZuluPad, the wiki notepad on crack
That's still not as good as this solution. I can't understand why it's not widely adopted.
Red lining ( a motoring term) comes from tiping too fast, typing to fist, typing two farst, um, using more than one finger per hand.
The key is to never type faster than your brain's alpha rhythm. Otherwise, you slide into a meditative zone known as 'T-pool bimbo limbo'. On the other hand, I've generally found typists to be saner than managers, so maybe the meditative zone is a defense mechanism. The frontal cortex contemplates what's for dinner tonight while some low reptilian region recognizes scrawled letters and types them.
Which leads back to the main topic.
What is the lowest animal life that could be trained to log into Yahoo?
If you've ever tried the Yahoo chatrooms, you know they're overrun by spam bots. The problem wasn't with the captcha, it was that it challenged users only once and at the beginning of the session. So as long as your spam bot didn't appear idle or lose connection, it could stay on indefinitely. Now with the captcha broken, spammers don't even have to do captchas manually.
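A sketch of the alternative, where a session is re-challenged periodically instead of only once at the start. The 15-minute interval, the `ChatSession` type, and the function names are assumptions for illustration, not anything Yahoo is known to do.

```python
from dataclasses import dataclass

RECHALLENGE_SECONDS = 15 * 60  # assumed interval, chosen arbitrarily


@dataclass
class ChatSession:
    user: str
    last_verified_at: float  # timestamp of the last solved CAPTCHA


def needs_challenge(session: ChatSession, now: float,
                    interval: float = RECHALLENGE_SECONDS) -> bool:
    """True once the last successful challenge is older than the interval."""
    return now - session.last_verified_at >= interval


def mark_verified(session: ChatSession, now: float) -> None:
    """Record that the user just solved a challenge."""
    session.last_verified_at = now
```

This only raises the cost for spammers who solve CAPTCHAs by hand; against an automated solver it just means more tries.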
(if anyone uses this and makes a million, at least cut me in 10% for the idea)
I gather the last frontier for computers is image recognition. I'm not sure of the state of image processing, but you could randomly color simple pictures (one flower, one pen, one cup (NO PUN INTENDED)) in about twenty different shades, collect about a hundred different photos, and rotate two or three new ones in each week. The user then sees a small photo with radio buttons below:
The cup is ()red ()blue ()green ()purple ()orange ()yellow orange
The flower petals are ()orange ()blue ()brown ()black
The pen is ()grey ()black ()yellow
You could even start throwing in random names for the colors (silver, charcoal, etc.) using it in sentences, combine with shape guesses (the longer pens are what color? the biggest cup is what color?) Either that or use tiny bits of flash with motion. (the bouncing flower is what color? the flashing red object is what?)
I say a few thousand different sites armed with the same "screen green" paint and tens of thousands of different photos could throw up somewhat of a roadblock.
What say ye?
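One quick sanity check on the three-question example: how often would a bot that ticks boxes at random get through? The sketch below assumes the cup question has six choices, the flower four, and the pen three, and borrows the 100,000 tries per day from the story summary; the function name is made up.

```python
from fractions import Fraction
from math import prod


def blind_guess_probability(options_per_question) -> Fraction:
    """Chance of answering every multiple-choice question right by luck."""
    return Fraction(1, prod(options_per_question))


p = blind_guess_probability([6, 4, 3])  # cup, flower, pen
print(p)                    # 1/72
print(int(100_000 * p))     # 1388 lucky passes per day at 100,000 tries
```

So with only a handful of choices per question, image recognition is not even needed; more questions or more options per question would be required before the picture itself matters.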
That has been my experience, too. I admin a small bb and was having horrible problems with spam sign-ups. CAPTCHAs didn't slow the spammers down at all. I went to a simple question whose answer will be easily known by all of my target audience but probably won't be known by someone halfway around the world entering CAPTCHAs for a penny apiece, and I allowed any spelling that is even close. I haven't had any spammers sign up for a couple of years now. That obviously won't work for a major target like YAHOO though.
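Accepting "any spelling that is even close" can be done with a similarity ratio. This is only a sketch of one way to do it: the expected answer "stratocaster", the 0.75 threshold, and the function name are invented for the example and are not the poster's actual setup.

```python
import difflib


def is_close_enough(answer: str, expected: str, threshold: float = 0.75) -> bool:
    """Accept an answer whose spelling is similar enough to the expected one."""
    a = "".join(answer.lower().split())    # ignore case and spaces
    e = "".join(expected.lower().split())
    return difflib.SequenceMatcher(None, a, e).ratio() >= threshold


print(is_close_enough("Stratocastor", "stratocaster"))  # misspelled but close
print(is_close_enough("viagra", "stratocaster"))        # unrelated answer
```

The threshold trades off forgiving real users against letting random guesses through, so it would need tuning per question.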
Just put some hard to read perl code in there and ask the user to say what it does. If the answer is correct it's a bot, if the answer is wrong it's probably a human ;)
The topic of "are you human" was covered on Security Now a while back and someone brought up a great point. Tools to deter bots also make things difficult for accessibility software, since it uses many of the same concepts as bots. Even audio captchas are no longer a strong bot deterrent.
With advocacy groups like the National Federation of the Blind suing Target for their inaccessible website it'll be a very tough challenge to develop new good captchas while maintaining accessibility to everyone.
On another note, could an organization representing the mathematically challenged sue companies using math captchas?
You want fun, go home and buy a monkey!
I have a little site, only really intended to share stuff with family and friends, served with custom scripts. I couldn't believe it when it was targeted by spammers. I could even see the test posts they made, checking to see if html was allowed etc., before unleashing the bot to post dozens of links a day.
Worst BBC News Stories
As these CAPTCHAs get more complicated, it becomes more difficult for non-speakers of the language to interpret them.
The saddest poem