Slashdot Mirror


Yahoo CAPTCHA Hacked

Hell Yeah! reminds us of a 2-week-old development that somehow escaped notice here. A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows, and quotes the Russian researchers: "It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100,000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA."
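
The researchers' arithmetic can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and the assumption that each automated success replaces one human-solved CAPTCHA at one cent is ours.

```python
def solved_per_day(tries_per_day: int, accuracy: float) -> float:
    """Expected number of CAPTCHAs solved per day at a given success rate."""
    return tries_per_day * accuracy


def human_cost_dollars(solved: float, cents_per_captcha: float = 1.0) -> float:
    """What the same number of solutions would cost if paid for by hand."""
    return solved * cents_per_captcha / 100.0


# 15% is the threshold quoted by the researchers; 35% is the reported accuracy.
for accuracy in (0.15, 0.35):
    solved = solved_per_day(100_000, accuracy)
    cost = human_cost_dollars(solved)
    print(f"{accuracy:.0%}: about {solved:,.0f} solved/day, worth ${cost:,.2f} of manual solving")
```

At 100,000 tries per day, even the 15% figure yields roughly 15,000 solved CAPTCHAs, which is why the quote treats high accuracy as unnecessary.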

28 of 252 comments

  1. I thought those things were already broken by Anonymous Coward · · Score: 5, Funny

    by having a teenage boy do it in exchange for letting him see porn.

    1. Re:I thought those things were already broken by 2.7182 · · Score: 4, Insightful

      I think the parent is serious. The idea is that your robot goes and grabs the images that need to be decoded. Then each image is presented on another website, where you can see free porn if you type in the word. I've heard of this but never read about it. Sounds like a good idea. Anyone know what this is called, or have some references?

    2. Re:I thought those things were already broken by rthomas6 · · Score: 4, Informative

      http://news.bbc.co.uk/2/hi/technology/7067962.stm
      Here is a link to a BBC article about something like that. It's a Windows program that rewards typing in captchas by showing a woman that takes off progressively more and more clothes.

    3. Re:I thought those things were already broken by kesuki · · Score: 4, Interesting

      that's why it costs 1 cent per 1 captcha, the overall cost of webhosting the porn for exchange boils down to 1 cent per solved captcha. obviously, if you're hosting on root-kitted windows boxes in the us (the highest rate of infection is in the us) the cost is still about 1 cent per one captcha because the cost of paying hackers to keep a bot net sizable enough comes to about the same cost.

      especially with sp3 coming out now, the cost of bot nets is higher, since sp3 offers an 'easy' bot net removal path, and staying off-line long enough to get all sp2's flaws patched is crucial in preventing reinfection. believe me, having a root-kit installed is easy even for a veteran computer guy to miss.

      i have dvds i burned almost 3 years ago that reinfect any windows machine with a root-kit, and are un-readable in linux. apparently the root-kit was using some hooks in nero burning rom to 'randomly' pick a burn project and put the root-kit installer on there, so when windows tried to auto run it would install the root-kit, and then the 'window' that normally shows up on auto-run would show up. the root-kit took an 'extra' session that was transparent, e.g. it would only show when using burning software to read the track data for the burned cd or dvd. no additional files showed up in windows, but the extra session made it unreadable to linux.

      also, the root-kit only runs in a 'blank' screen saver, which it protects and makes sure loads when the system is idle, so it never sends data when the user might be there to notice. and i think it sends the data as like, internet explorer, to bypass firewall rules. since none of the firewalls i tried could block it. i actually only found the original root kit when a second root-kit moved the first root-kit's files to the recycle bin. other than that none of the root kit scanners that were recommended to me could even detect this thing. only the 'symptoms' and the fact i could 'remove them' by staying off-line and not using my old discs were proof that i had a root kit.

      symptoms included auto-run becoming disabled, screen saver always resetting to 15 minutes (only when both root-kits were on there), the 'desktop' showing up 2-3 times a day when in full-screen games (also only with both root-kits), and finding root-kit files in the recycle bin (only found on networked systems with the root-kit, and didn't return on reinstall of both root-kits, likely was a 1 time 'bug' that was fixed later on)

      so yeah, I didn't notice it for 3 years. Not that i usually have to deal with viruses, but I have only ever had to deal with 3 viruses in my 15 years online, and the third one was really a root-kit. I've also been using open-source software for 11 years, so that probably helped. of course, one of the viruses was one that affected my open source software, the other 2 were windows based.

      it's still easy to miss windows root-kits nowadays, especially when hackers have root-kits that aren't published, and they use scripts to make the exes have unique signatures (using compiler tricks) for known root-kits.

    4. Re:I thought those things were already broken by novakyu · · Score: 3, Informative

      that's why it costs 1 cent per 1 captcha, the overall cost of webhosting the porn for exchange boils down to 1 cent per solved captcha.

      Er, where did you get that number? At Nearly Free Speech, it only costs $1 / GB (of transfer), and that's how much it would cost nearly anywhere else (or even less!), if you use a significant amount of bandwidth.

      I don't know exactly how large porn images are, never having looked at them, but if you guess a round number of 0.1 MB per picture, it's only about $0.0001, or 0.01 cent per captcha. I suppose it's better than nothing, but it's not yet very cost-prohibitive.
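
The bandwidth estimate above can be checked with a few lines. This is a rough sketch: the $1 / GB price and the 0.1 MB image size come from the comment, while the decimal gigabyte (1000 MB) and the names used are assumptions made for illustration.

```python
PRICE_PER_GB_DOLLARS = 1.0  # transfer price quoted in the comment
IMAGE_SIZE_MB = 0.1         # the commenter's round-number guess
MB_PER_GB = 1000            # assumption: decimal gigabytes


def cost_per_image_dollars(size_mb: float = IMAGE_SIZE_MB,
                           price_per_gb: float = PRICE_PER_GB_DOLLARS) -> float:
    """Transfer cost of serving one image of the given size."""
    return size_mb / MB_PER_GB * price_per_gb


cost = cost_per_image_dollars()
print(f"${cost:.4f} per image, or {cost * 100:.2f} cents per solved captcha")
```

Under these assumptions the hosting cost is about a hundredth of the one cent per captcha claimed in the parent comment.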
    5. Re:I thought those things were already broken by Anonymous Coward · · Score: 3, Funny

      I don't know exactly how large porn images are, never having looked at them.

      Posting on /. and you've never seen porn? Bullshit.

    6. Re:I thought those things were already broken by nb+caffeine · · Score: 3, Funny

      Maybe he only watches movies

      --

      "Something's wrong with you...and I hope we never do meet again." - Deftones When Girls Telephone Boys
  2. Hey by Misanthrope · · Score: 5, Funny

    They're used to seeing Cyrillic, the captcha has got to be easier to read!

    1. Re:Hey by Janek+Kozicki · · Score: 4, Interesting

      The 3D captcha seems to be a good solution here (that's a link from the Wikipedia article)

      You pick several 3D models, like people, chairs or flowers. Name all their parts, like "chair leg", "human head" etc. The CAPTCHA is generated by placing several 3D models, randomly rotated, on a scene and rendering them with easily readable letters "A", "B" placed on the named parts. The captcha questions are: "what is the letter on human head", "what is the letter on chair leg", etc.

      People can answer pretty easily. The 3D models are always randomly placed and rotated on a scene, so bots have a problem.

      --
      #
      #\ @ ? Colonize Mars
      #
  3. Not really news by Anonymous Coward · · Score: 5, Insightful

    A few months ago Yahoo introduced a CAPTCHA to prevent bots entering their chatrooms. Within a few days every room on yahoo was filled with bots once more, and still are to this day.

    Given the current situation of the chat rooms on yahoo, it comes as no surprise at all that the other parts of the Yahoo system are inadequately protected from bots too.

  4. Re:Gentlemen, start your spambots by xaxa · · Score: 3, Insightful

    Natural language processing etc:

    To register, answer these questions and click the button on the right
    What colour are buses in London?
    What is three times three?
    [Red] [Green] [Blue]

  5. That's really impressive. by heyguy · · Score: 5, Insightful

    I've found Yahoo's CAPTCHA to be really annoying. I probably get it wrong about 20% of the time because the picture is so distorted (and I've been surprised that I got it right a lot of the time). I even considered writing them an email complaining about it, but then I realized they probably don't give a crap.

  6. Only Yahoo? by Sigma+7 · · Score: 4, Informative

    33% of Yahoo captchas isn't really impressive - you still get a large quantity of negative hits, and unless you have an array of IP addresses (most people don't), there will still be a large quantity of addresses registered from a given IP. Also, a large quantity of negatives would cast doubt on any positive matches from the same IP.

    Also, Yahoo captchas aren't that "hard" - they are black text from known font pools on a white background that get slightly warped and have black lines drawn on some characters. This is hardly strong since it doesn't hit all letters within the word (which is done by reCAPTCHA) or use a large font-pool variety.

    Even the Slashdot Captcha is harder - it hits the whole image and uses different fonts within the word.

  7. Re:Gentlemen, start your spambots by SoupGuru · · Score: 5, Funny

    That reminds me of the age check for Leisure Suit Larry back in the day... Who knew that the desire of a horny teen to see pixellated boobs would lead to history research?

    --
    What doesn't kill you only delays the inevitable
  8. Re:Malware by wellingtonsteve · · Score: 4, Funny

    without a chord is fine... ...it's when you're missing a cord that you need to worry

  9. Re:Gentlemen, start your spambots by paeanblack · · Score: 4, Insightful

    To register, answer these questions and click the button on the right
    What colour are buses in London?
    What is three times three?
    [Red] [Green] [Blue]


    Yes, those are undoubtedly hard questions for a computer. How, exactly, do you plan to generate billions of these questions? For a CAPTCHA to work, it must still be hard even if the generation algorithm is public knowledge.

  10. Re:captcha security by Carnildo · · Score: 3, Informative

    The character outlines are nicely distinct, which means that even basic OCR software should be able to break the CAPTCHA. Since it's so easy to break, you want to hide it from any bots that come by: remove all references to "captcha" from the page source, and you might want to move the HTML for the image away from the HTML for the entry box.

    --
    "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
  11. Re:Gentlemen, start your spambots by driftingwalrus · · Score: 3, Insightful

    What about introducing spelling and grammatical errors? This would be difficult for a computer to interpret, but doable for a human.

    --
    Paul Anderson
    "I drank WHAT?!" -- Socrates
  12. 35%??? by wbren · · Score: 3, Informative

    I'm impressed. That's better than I can do. Some CAPTCHAs take me five or six tries to get right.

    --
    -William Brendel
    1. Re:35%??? by GiMP · · Score: 3, Insightful

      I agree, that is better than I normally do as well. Maybe someone could make this a firefox plugin so that mere mortals can actually access webpages that use CAPTCHAs.

      It is sad because with corrective lenses, my vision is 20/20, and I'm highly technical. I should not have any problems with CAPTCHAs. However, my grandmother is another story. She has poor vision, can't figure out how to do a carriage return on her computer, has difficulty understanding the concept of scrollbars, and I'm sure would not be able to deal with even the easiest CAPTCHAs in use today. This is not usability. Granted, given the choice between SPAM or CAPTCHAs, I'll choose the lesser of the two evils...

  13. Warning on playing with the demo by xynopsis · · Score: 5, Insightful

    Did anyone notice that the image recognition code is imported from a binary DLL? I was under the impression that the Russian hackers would provide the source for the recognition code as well. But then, the people who released this are only interested in generating as much spam as possible. Why should you trust them? You would be foolish _not_ to run any test program that imports this dll in a vmware instance instead of on your actual machine. Has anybody done a comprehensive strace to determine the sockets/descriptors opened by using this dll?

  14. Re:captcha security by yani · · Score: 4, Informative

    Although it seems counter-intuitive, character recognition (even with your filtering) is a relatively easy problem for a computer to solve. The hard problem is segmentation. While it is relatively easy for a human to segment characters that are somehow joined together, by artifacts or occlusion, it can be very hard to do with current methods.

    Hence all good modern captchas have moved away from character recognition captchas (such as yours) to segmentation based captchas. You only need to read the wikipedia article on CAPTCHAs to see some examples: http://en.wikipedia.org/wiki/Captcha.
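
To illustrate why segmentation is the hard part, here is a deliberately naive sketch. It is not any real attack or CAPTCHA implementation: the tiny text "images" and the function name are made up, and column projection is just one simple way to split glyphs. It works on cleanly separated characters and collapses as soon as a stray line joins them.

```python
def segment_columns(rows):
    """Split a binary image (equal-length strings, '#' = ink) into runs of
    adjacent columns that contain ink. Returns (first_col, last_col) pairs."""
    width = len(rows[0])
    has_ink = [any(row[x] == "#" for row in rows) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new run of inked columns begins
        elif not ink and start is not None:
            segments.append((start, x - 1))  # the run ended at the previous column
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments


# Two glyphs separated by blank columns.
clean = [
    "##..##",
    "#...#.",
    "##..##",
]
# The same glyphs with a line drawn through both of them.
joined = [
    "##..##",
    "######",
    "##..##",
]

print(segment_columns(clean))   # two separate segments
print(segment_columns(joined))  # one merged segment
```

Once the glyphs merge into a single blob, a recognizer that expects one character per segment has nothing sensible to work with, which is the property segmentation-based captchas try to exploit.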

  15. Gee, Ya THINK by buss_error · · Score: 3, Insightful

    Yahoo!'s captcha has been hacked, perhaps not as well, in the past. I've seen open http proxies pounding away at Yahoo to the tune of 100,000 per hour and more. Hotmail's is broken, so are others. The real shame is that the Storm Worm controllers are being protected by a national government and law enforcement system.

    So what's the answer?

    I'm sure I don't know. I do know that the wild west theory of accepting any kind of behaviour isn't acceptable. I know that some minimum standard of what's allowed and what isn't is going to have to take place. Where these limits are placed is a thing for a global conversation, and there will be differences of opinion.

    Is cracking a captcha acceptable? Is phishing and identity theft acceptable? Is fraud and uncontrolled spam acceptable? What limits, and on what actions?

    I'm just not that smart. But I think we can agree on a few things. Let's start to find out what those things are... and act in concert with other network operators to enforce those standards. Fail to meet them, and your network routing gets dropped...

    --
    Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.
  16. Other interesting work on CAPTCHAs by ChoppedBroccoli · · Score: 3, Interesting

    Segmentation and intersecting arcs can be difficult for automated attacks: http://portal.acm.org/citation.cfm?id=1054972.1055070

    You know those annoying flash advertisement games (shoot the monkey for a free iPod)? Well, they could potentially be adapted for CAPTCHAs as well: http://cups.cs.cmu.edu/soups/2006/posters/misra-poster_abstract.pdf

  17. Re:Gentlemen, start your spambots by omeomi · · Score: 5, Insightful

    Not really. After a couple of (thousand) runs through, the attacker would have a reasonably accurate database of the questions. They can then analyze the text to find the nearest match to one of the questions in its database.

    That's true. I've found, however, that introducing custom spam blocking methods, such as this, no matter how easy to break, often does a better job at stopping spam bots than more robust publicly available methods. For a target as big as Yahoo, this probably won't work, but I've found on phpBB for instance, instead of using any of the publicly available captchas, which are easily defeated by bots, creating a simple question of this sort does wonders for bot-blocking. Even if it's just one question. If your site isn't big enough to be specifically targeted by bot farmers, sometimes a simple solution is better than a more complex one that everybody else is using.
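
The database attack described in the quoted text could look something like the sketch below. Everything here is hypothetical: the stored questions and answers are invented, the cutoff is arbitrary, and Python's standard difflib is used only as a convenient stand-in for "find the nearest match".

```python
import difflib

# Hypothetical database built up after many registration attempts.
known_answers = {
    "what colour are buses in london?": "red",
    "what is three times three?": "nine",
}


def lookup(question, cutoff=0.8):
    """Return the stored answer for the closest known question, or None."""
    key = question.strip().lower()
    matches = difflib.get_close_matches(key, list(known_answers), n=1, cutoff=cutoff)
    return known_answers[matches[0]] if matches else None


print(lookup("What color are buses in London?"))  # reworded, but close to a stored question
print(lookup("Name the largest planet."))         # nothing similar is stored
```

A small rewording does not help the defender here, which is consistent with the point that such questions only hold up on sites nobody bothers to target individually.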

  18. Re:Gentlemen, start your spambots by nazanne · · Score: 3, Interesting

    That has been my experience, too. I admin a small bb and was having horrible problems with spam sign ups. CAPTCHAs didn't slow the spammers down at all. I went to a simple question that will be easily known by all of my target audience but probably won't be known by someone halfway around the world entering CAPTCHAs for a penny apiece, and allowed any spelling that is even close. I haven't had any spammers sign up for a couple years now. That obviously won't work for a major target like YAHOO though.
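
Accepting "any spelling that is even close" might be done along these lines. This is only a sketch of one possible approach, not the commenter's actual setup: the expected answer, the threshold, and the function name are all made up for illustration.

```python
from difflib import SequenceMatcher


def answer_is_close(given, expected, threshold=0.75):
    """Accept an answer whose spelling is similar enough to the expected one."""
    ratio = SequenceMatcher(None, given.strip().lower(), expected.lower()).ratio()
    return ratio >= threshold


# "Socrates" is a hypothetical expected answer for a site-specific question.
print(answer_is_close("Sokrates", "Socrates"))  # a near miss in spelling
print(answer_is_close("Plato", "Socrates"))     # a different answer altogether
```

The threshold trades off tolerance for typos against the chance of accepting a wrong answer, so it would need tuning for whatever question the board actually asks.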

  19. Re:Random Coloration Photos by jsoderba · · Score: 3, Insightful

    I say that a lot of people are color blind.

  20. Re:Gentlemen, start your spambots by aliquis · · Score: 5, Funny

    Just put some hard to read perl code in there and ask the user to say what it does. If the answer is correct it's a bot, if the answer is wrong it's probably a human ;)