Researcher Breaks ReCAPTCHA Using Google's Speech Recognition API (bleepingcomputer.com)
An anonymous reader writes: "A researcher has discovered what he calls a 'logic vulnerability' that allowed him to create a Python script that is fully capable of bypassing Google's reCAPTCHA fields using another Google service, the Speech Recognition API," reports BleepingComputer. The attack is incredibly simple and works by downloading a version of the reCAPTCHA audio challenge, feeding it into Google's Speech Recognition API, getting the text version of the audio challenge, and feeding it back into the reCAPTCHA field. Proof-of-concept code is available on GitHub, and the researcher says Google has failed to patch the issue, although it's unclear if he ever notified the company. The attack also works only against reCAPTCHA v2, not other versions like v1 or the upcoming Invisible reCAPTCHA (v3). Because the source code for the exploit is available online, security experts expect to see it ported to JavaScript and used to create browser extensions that bypass reCAPTCHA fields, especially when using the Tor Browser.
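The three steps in the summary (download the audio, transcribe it, feed the text back) can be sketched as a plain control-flow skeleton. This is only an illustration of the shape of the loop, not the researcher's script: the function name solve_audio_challenge and the three callables are hypothetical, and the fake_* stand-ins below exist only so the sketch runs offline without touching any real service.

```python
from typing import Callable, Optional


def solve_audio_challenge(
    fetch_audio: Callable[[], bytes],
    transcribe: Callable[[bytes], Optional[str]],
    submit: Callable[[str], bool],
) -> bool:
    """Run one pass of the download -> transcribe -> submit flow."""
    audio = fetch_audio()          # step 1: download the audio challenge
    text = transcribe(audio)       # step 2: turn the audio into text
    if not text or not text.strip():
        return False               # recognizer returned nothing usable
    return submit(text.strip())    # step 3: feed the text back


# Hypothetical stand-ins so the sketch is runnable without any network access.
def fake_fetch() -> bytes:
    return b"fake-audio-bytes"


def fake_transcribe(audio: bytes) -> Optional[str]:
    return " 4 7 1 9 " if audio else None


def fake_submit(answer: str) -> bool:
    return answer == "4 7 1 9"


print(solve_audio_challenge(fake_fetch, fake_transcribe, fake_submit))
```

Passing the three steps in as callables keeps the flow separate from any particular service, which also makes it easy to see where a failed transcription simply ends the attempt.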
and quite clever. i wonder if it can do better than the 10-20% or so success rate i get on the same captchas?
recaptcha is absolutely horrible, especially if you're on cellular, tor, a vpn, or just a common open hotspot... they make no fucking sense, they aren't words, just long random strings of similar-looking gibberish, skewed so much the letters are absolutely unrecognizable. so anything that can break that shit... i'm all for it.
Why not a simple question like "What animal makes this sound? 'barking dog' A for cat; B for dog; or C for mouse?"
Only LUDDITES use LUDDITE ReCAPTCHA. Modern app appers use ReAPPTCHA!
Apps$
...but somehow clicking house numbers and store fronts for 30 seconds - with deliberately slow and irritating load times for each new image - with no indication of how close you are to done until google finally decides you're human is even worse. It completely ruined 4chan for me until someone pointed out they gave an option to use the old style instead.
If something is behind a Recaptcha, chances are it's not something you really cared to see anyway. I just backpedal and go somewhere else - it isn't worth enabling Javascript for, it isn't worth the brutally slow load times, etc.
And how exactly do you patch something that is working exactly as intended? If their speech recognition software is able to make out speech, it is doing its job, whether that speech is spoken live or recorded. The only option they have is to remove the audio option of recaptcha.
ReCAPTCHA is just another privacy-breaching, NSA-colluding, big-data-that-hunt-you surveillance tracker.
Just say no.
As commenters on the linked blog post point out, this is hardly a reliable way to break the captcha. ReCaptcha has limits on how many tries you can have for audio captchas, and the Google Speech Recognition API also has limits. It also seems to get the wrong answer pretty often.
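To put rough numbers on that objection: if each attempt succeeds independently with probability p and the limits only allow k tries, the chance of at least one success is 1 - (1 - p)^k. The sketch below just evaluates that formula. The values of p and k are illustrative assumptions, not measurements of reCAPTCHA or of the Speech Recognition API, and real attempts may well not be independent.

```python
def success_within(p: float, k: int) -> float:
    """Probability of at least one success in k independent tries."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


# Illustrative (assumed) per-attempt success rates and attempt limits.
for p in (0.1, 0.3, 0.5):
    for k in (1, 3, 5):
        print(f"p={p:.1f} k={k} -> {success_within(p, k):.3f}")
```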
the funniest thing, i find, is that reCaptcha was initially designed to crowd-source difficult AI problems (OCR, image recognition).
So after a while, it seems normal that with enough such recaptcha crowdsourced feedback, google's voice recognition will get better, and thus could also be used to understand audio captchas?
the problem will be:
what will happen if this gets massive deployment? google won't be able to learn new stuff or teach its AI new tricks.
Whenever a new difficult piece of voice is submitted to recaptcha for crowdsourcing, the swarm of google-voice powered bots will answer with the default (broken?) answer.
and given the massive number of answers, recaptcha will reach the wrong conclusion that the default is good.
the actual few humans will first be useless for AI training in the middle of the bot noise, and then will get problems once recaptcha decides that the bad automatic interpretation is the correct one.
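A toy illustration of that failure mode, assuming the crowdsourced label is chosen by a plain plurality vote. Nothing in this thread says how reCAPTCHA actually aggregates answers, so the consensus function, the words, and the counts below are all hypothetical.

```python
from collections import Counter
from typing import List


def consensus(answers: List[str]) -> str:
    """Return the most common answer (plain plurality vote)."""
    if not answers:
        raise ValueError("no answers to aggregate")
    return Counter(answers).most_common(1)[0][0]


# Hypothetical data: a handful of humans hear the clip correctly,
# while many bots all submit the same wrong machine transcription.
human_answers = ["harbor"] * 5
bot_answers = ["arbor"] * 50

print(consensus(human_answers))                # humans only
print(consensus(human_answers + bot_answers))  # humans drowned out by bots
```

With only the human answers the vote lands on their transcription; once the identical bot answers outnumber them, the same vote picks the machine's guess instead.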
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
the whole point of recaptcha is crowd sourcing ai training.
some of the audio captchas aren't purposely distorted synthetic bits, but actual snips of real-world data with which google voice is having problems. (just like visual captchas can also help train the OCR or image recognition).
the suggestion you're making would be training data for a different AI task
(tagging/recognition of sounds, and common knowledge/logic databases).