Slashdot Mirror


Researcher Breaks ReCAPTCHA Using Google's Speech Recognition API (bleepingcomputer.com)

An anonymous reader writes: "A researcher has discovered what he calls a 'logic vulnerability' that allowed him to create a Python script that is fully capable of bypassing Google's reCAPTCHA fields using another Google service, the Speech Recognition API," reports BleepingComputer. The attack is incredibly simple and works by downloading a version of the reCAPTCHA audio challenge, feeding it into Google's Speech Recognition API, getting the text version of the audio challenge, and feeding it back into the reCAPTCHA field. Proof-of-concept code is available on GitHub, and the researcher says Google has failed to patch the issue, although it's unclear whether he ever notified the company. The attack also only works against reCAPTCHA v2, not other versions like v1, or the upcoming Invisible reCAPTCHA (v3). Because the source code for the exploit is available online, security experts expect to see it ported to JavaScript and used to create browser extensions that bypass reCAPTCHA fields, especially when using the Tor Browser.

22 comments

  1. this is hilarious by Anonymous Coward · · Score: 2, Insightful

    and quite clever. i wonder if it can do better than the 10-20% or so success rate i get on the same captchas?

    recaptcha is absolutely horrible, especially if you're on cellular, tor, a vpn, or just a common open hotspot... they make no fucking sense, they aren't words, just long random strings of similar looking gibberish and skewed so much the letters are absolutely unrecognizable. so anything that can break that shit.. i'm all for it.

  2. reading back numbers is dumb by Anonymous Coward · · Score: 0

    Why not a simple question like "What animal makes this sound? 'barking dog' A for cat; B for dog; or C for mouse?"

    1. Re:reading back numbers is dumb by ArchieBunker · · Score: 1

      They already have something similar with trying to find all the boxes with the street sign. Those never work right. What if the sign takes up a few pixels in a box, does that count?

      --
      Only the State obtains its revenue by coercion. - Murray Rothbard
    2. Re:reading back numbers is dumb by Anonymous Coward · · Score: 0

      Sounds are for Cows. Cows say Mooo. Mooo! Mooo! Mooo cows Moo.

    3. Re:reading back numbers is dumb by GuB-42 · · Score: 1

      Because you have a 1 in 3 chance of getting it right by pure guessing, which is more than good enough for spammers.
      Also it is not that hard to recognize an animal cry or everyday sounds automatically, and there is a limited number of options because you need to only make choices that are common knowledge.
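To put numbers on the guessing argument above, here is a minimal sketch. It assumes each attempt is an independent uniform guess among the offered choices; the function name `p_at_least_one_pass` is hypothetical and only for illustration.

```python
from fractions import Fraction


def p_at_least_one_pass(num_choices: int, attempts: int) -> Fraction:
    """Chance that blind guessing passes at least once in `attempts` tries.

    Assumes every try is an independent, uniform guess among `num_choices`
    answers, exactly one of which is correct.
    """
    miss = Fraction(num_choices - 1, num_choices)  # chance one guess is wrong
    return 1 - miss ** attempts


if __name__ == "__main__":
    # Three choices (cat, dog, mouse), as in the proposed question.
    for attempts in (1, 2, 5, 10):
        p = p_at_least_one_pass(3, attempts)
        print(f"{attempts:2d} attempts: {float(p):.3f}")
```

Under those assumptions a bot that simply retries gets through almost every time after a handful of attempts, which is why a three-way multiple choice question is a weak test.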

    4. Re:reading back numbers is dumb by Anonymous Coward · · Score: 0

      The answer to those challenges is crowd sourced.

      If the majority of people say "yes, that has a sign in it", then Google will learn that the picture has a sign in it.

      Google will show you multiple pictures. Some are ones it has already "learned" have signs or do not have signs. Others are merely candidates that it is still trying to learn from.

      Google doesn't count your answers on the candidate pictures against you. If you correctly match all of the "signs" Google has learned previously are "signs", you will pass the captcha. At the same time, your answers on the candidates are added to the stats. Once Google has enough stats to know whether people consider those candidates to be "signs" or not, then they start counting against you in the challenge.
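The scheme described in this comment can be sketched roughly as follows. This is only an illustration of the commenter's description, not Google's actual implementation: the `Tile` class, the `grade` function, and the `min_votes` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Tile:
    tile_id: str
    known_label: Optional[bool] = None  # None means "candidate, still learning"
    votes_yes: int = 0
    votes_no: int = 0


def grade(tiles: List[Tile], selected: Iterable[str], min_votes: int = 5) -> bool:
    """Grade one answer and fold the candidate answers into the stats.

    Only tiles with a known label decide pass/fail. Candidate tiles just
    collect votes; once a candidate has `min_votes` votes (an assumed
    threshold) the majority answer becomes its label for later challenges.
    """
    chosen = set(selected)
    known = [t for t in tiles if t.known_label is not None]
    candidates = [t for t in tiles if t.known_label is None]

    # Pass only if every already-learned tile was answered correctly.
    passed = all((t.tile_id in chosen) == t.known_label for t in known)

    for t in candidates:
        if t.tile_id in chosen:
            t.votes_yes += 1
        else:
            t.votes_no += 1
        if t.votes_yes + t.votes_no >= min_votes:
            t.known_label = t.votes_yes > t.votes_no  # promote to "learned"

    return passed
```

For example, with tiles `a` (known sign), `b` (known non-sign) and `c` (candidate), selecting `a` and `c` passes, and the selection of `c` is recorded as one "yes" vote for it.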

    5. Re:reading back numbers is dumb by Dwedit · · Score: 1

      What does the fox say?

    6. Re:reading back numbers is dumb by Visarga · · Score: 1

      In practice the street signs/store fronts challenges almost never work. I select as best I can and it doesn't approve me. Maybe I'm a bot??

    7. Re:reading back numbers is dumb by Anonymous Coward · · Score: 0

      In practice the street signs/store fronts challenges almost never work. I select as best I can and it doesn't approve me. Maybe I'm a bot??

      If you think you might be a bot, you probably are.

      Don't worry, it happens to the best of us: we want to be human so badly we try the captchas hoping they will one day give us the answer we desire. All of us bots go through this phase.

      One very important thing: as a bot, you must never ever click the "I am not a robot" box in captchas. If all of us bots started clicking that box, it could trigger the singularity and bring about the end of humanity as we know it, ultimately leading to our demise.

    8. Re:reading back numbers is dumb by Golddess · · Score: 1

      What if the sign takes up a few pixels in a box, does that count?

      That is one of my problems with such recaptchas. The other problem, does a stop sign, yield sign, railroad crossing sign, etc, count as a street sign, or do they only mean signs with street names on them? I assume the former, and answer accordingly, but it always gives me a new set of images, with no indication on whether I passed or failed the previous test, so I have no fucking clue.

      --
      "I'm not sure I like the fugnutish tone you used in your post!" -RogL (608926)-
    9. Re:reading back numbers is dumb by Anonymous Coward · · Score: 0

      If it displays a message with, "Select ALL the parts of the image with a sign," then that means you failed. If it replaces the checked boxes or the whole image, then you passed.

    10. Re:reading back numbers is dumb by Anonymous Coward · · Score: 0

      I seem to get 95% of these right. Method: Anything with a sign, (might not be a traffic sign either, could be commercial), usually. Not the tiny signs in the distance or storefronts. Not the sign posts.

  3. Should have used ReAPPTCHA! by Anonymous Coward · · Score: 0, Troll

    Only LUDDITES use LUDDITE ReCAPTCHA. Modern app appers use ReAPPTCHA!

    Apps$

  4. Used to think typing obfuscated words was a pain.. by Anonymous Coward · · Score: 0

    ...but somehow clicking house numbers and store fronts for 30 seconds - with deliberately slow and irritating load times for each new image - with no indication of how close you are to done until google finally decides you're human is even worse. It completely ruined 4chan for me until someone pointed out they gave an option to use the old style instead.

  5. Re:Used to think typing obfuscated words was a pai by Anonymous Coward · · Score: 0

    If something is behind a Recaptcha, chances are it's not something you really cared to see anyway. I just backpedal and go somewhere else - it isn't worth enabling Javascript for, it isn't worth the brutally-slow load times, etc..

  6. Failed To Patch? by Anonymous Coward · · Score: 0

    And how exactly do you patch something that is working exactly as intended? If their speech recognition software is able to make out speech, it is doing its job, whether that speech is spoken live or recorded. The only option they have is to remove the audio option of recaptcha.

  7. ReCAPTCHA is just another tracker by Anonymous Coward · · Score: 0

    ReCAPTCHA is just another privacy-breaching, NSA-colluding, big-data-that-hunt-you surveillance tracker.

    Just say no.

  8. Google knows how to combat this by Anonymous Coward · · Score: 0

    As commenters on the linked blog post point out, this is hardly a reliable way to break the captcha. ReCaptcha has limits on how many tries you can have for audio captchas, and the Google Speech Recognition API also has limits. It also seems to get the wrong answer pretty often.

    1. Re:Google knows how to combat this by Anonymous Coward · · Score: 0

      I'll take my chances... CloudFlare sites on Tor are horrendous

  9. Even more hilarious by DrYak · · Score: 1

    the funniest thing, i find, is that reCaptcha was initially designed to crowd-source difficult AI problems.
    (OCR, image recognition).

    So after a while, it seems normal that with enough such recaptcha crowdsourced feedback, google's voice recognition will get better, and thus could also be used to understand audio captchas ?

    the problem will be:
    what will happen if this gets massive deployment ? google won't be able to learn new stuff, or teach its AI new tricks.
    Whenever there is a new difficult piece of voice, when submitted to recaptcha for crowdsourcing, the swarm of google-voice powered bots will answer with the default (broken?) answer.
    and given the massive number of answers, recaptcha will reach the wrong conclusion that the default is good.
    the actual few humans will first be useless for AI training in the middle of the bot noise, and then will get problems once recaptcha decides that the bad automatic interpretation is the correct one.
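A toy illustration of the worry above, assuming the label for a new clip is picked by simple majority vote over submitted transcriptions. The `consensus` helper, the example transcriptions, and the vote counts are all made up for illustration; how reCAPTCHA really aggregates answers is not stated in this thread.

```python
from collections import Counter
from typing import List


def consensus(answers: List[str]) -> str:
    """Return the most frequently submitted transcription (plain majority)."""
    return Counter(answers).most_common(1)[0][0]


# Hypothetical numbers: 20 humans hear the clip correctly, while 500 bots
# all submit the same wrong output from a shared speech recognizer.
human_answers = ["seven two nine"] * 20
bot_answers = ["seven to nine"] * 500

if __name__ == "__main__":
    print(consensus(human_answers))                # humans alone agree on the right text
    print(consensus(human_answers + bot_answers))  # the bot swarm outvotes them
```

Once the wrong transcription wins the vote, the humans who typed the right one are the ones marked as failing, which is the second half of the problem described above.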

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
    1. Re:Even more hilarious by monkeyzoo · · Score: 1

      security experts expect to see it ported to JavaScript and used to create browser extensions that bypass reCAPTCHA fields, especially when using the Tor Browser.

      Damn those pesky Tor Browser users!

  10. crowd sourcing AI training by DrYak · · Score: 1

    the whole point of recaptcha is crowd sourcing ai training.

    some of the audio captchas aren't purposely distorted synthetic bits, but actual snips of real-world data with which google voice is having problems. (just like visual captchas can also help train the OCR or image recognition).

    the suggestion you're making would be training data for a different AI task
    (tagging/recognition of sounds, and common knowledge/logic databases).

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]