CAPTCHA Busted? Company Claims To Have Broken Protection System
sciencehabit writes "A software company called Vicarious claims to have created a computer algorithm that can solve CAPTCHA with greater than 90% accuracy. If true, the advance would represent a major breakthrough in artificial intelligence. It would also mean that the internet will have to start looking for a new security system. The problem, however, is that Vicarious has provided little evidence for its claims, though some well-known scientists are behind the work."
That's better than my success rate
I cured cancer, stopped global warming, and found the last missing episodes of Doctor Who.
Just take my word for it.
In Soviet Russia, dot slashes YOU!
I wish I could get CAPTCHAs right 90% of the time.
I'm sorry, but I don't consider CAPTCHA a security system.
I would say it's an anti-spam system.
New things are always on the horizon
Another researcher had a program that solved captchas with better accuracy years ago. He didn't release it "for the common good".
Support my political activism on Patreon.
Well, not me per se. But I live vicariously through these guys.
A software company called Vicarious claims to have created a computer algorithm that can solve CAPTCHA with greater than 90% accuracy.
I just re-serve the CAPTCHAs on my own popular website. Crowdsourcing for the win.
Although "Recursive Cortical Network" sounds really cool, it would be nice to, you know, learn a bit about how it WORKS.
This headline makes no sense. CAPTCHA is just a concept, there are hundreds of implementations. I'm sure some of them are crap and only block bots that aren't even trying, some block 100% of bots (and half the humans, too), and most are somewhere in the middle. So what does it mean to "solve CAPTCHA with 90% accuracy?" Does that mean he's tested it on every system out there, and aggregated the results? That would actually be interesting if he has, but more likely he's just tested it on one kinda-crap system that I could probably write a bot to beat in a week.
It does sound like it's built to be more robust, working with more different types of captchas than perhaps many captcha-busting algorithms, but I doubt it's the first of its kind (maybe it uses a new algorithm, but it's still a captcha-buster, that's not new.)
Time for the reverse CAPTCHA. If you can guess it correctly, you must be a bot.
Security to who? More like an annoyance
did you forget to take your meds?
From the video, I think they used mathematical optimization. Multiobjective vectorial optimization if I had to guess. The big breakthrough here is that instead of OCR'ing the image they tried to rerun the captcha construction algorithm, controlling the random choices the algorithm makes. Each choice is a variable here. Then you implement a function that measures how close these variables get to the CAPTCHA image. Now you use optimization to get to the global minimum of this function.
At least that is how I would have done it.
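A rough sketch of that idea, for anyone who wants to see it in code. This is only a toy and not Vicarious's method: the 3x5 font, the per-glyph vertical shift, and the names `render`, `cost` and `solve` are all made up for illustration, and the "optimizer" is a plain per-slot exhaustive search rather than real multiobjective optimization. It shows re-running a generator under controlled choices and minimizing the pixel mismatch against the target image.

```python
from itertools import product

# Toy 3x5 font (assumption); a real generator has far more parameters.
FONT = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "4": ["#.#", "#.#", "###", "..#", "..#"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}
SHIFTS = (0, 1, 2)  # the generator's "random choice": vertical offset per glyph


def render(chars, shifts):
    """Re-run the toy generator: return the set of lit (x, y) pixels."""
    pixels = set()
    for i, (ch, dy) in enumerate(zip(chars, shifts)):
        for row, line in enumerate(FONT[ch]):
            for col, cell in enumerate(line):
                if cell == "#":
                    pixels.add((i * 4 + col, row + dy))
    return pixels


def cost(chars, shifts, target):
    """Number of pixels where the candidate rendering and the target differ."""
    return len(render(chars, shifts) ^ target)


def solve(target, length):
    """Search (character, shift) for each glyph slot, keeping the best so far."""
    chars = ["0"] * length
    shifts = [0] * length
    for i in range(length):
        best = min(
            product(FONT, SHIFTS),
            key=lambda cs: cost(
                chars[:i] + [cs[0]] + chars[i + 1:],
                shifts[:i] + [cs[1]] + shifts[i + 1:],
                target,
            ),
        )
        chars[i], shifts[i] = best
    return "".join(chars), shifts


target = render("741", [2, 0, 1])
print(solve(target, 3))
```

Because the glyph slots never overlap in this toy, one pass over the slots already reaches the global minimum. With overlapping or warped glyphs the cost function gets much nastier, which is presumably where the hard part lives.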
So we've got OCR nailed. What NP-hard problem do we dupe the spammers into solving for us next? Can we throw halting problem at them, or should we work up to it with traveling salesman first?
See title of comment.
Ran into one the other day asking to only enter the numbers under a little circle. The numbers were distorted as usual, but only some of the numbers had a circle above them. Others had a little square or triangle.
It'd be trivial to extend this to say "only enter numbers where the number of circles around the number corresponds to that number" or similar.
Such minor changes would pose no significant problem for humans, but making sense of the instructions (which might either be embedded in the image or not) would be very hard for an algorithm to do. You wouldn't even need to limit yourself to the same instructions each time.
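For what it's worth, the server side of a scheme like that is small. A minimal sketch, assuming a hypothetical `make_challenge` that pairs each digit with a marker shape and picks which shape the instruction asks for; actually drawing the distorted image is left out entirely.

```python
import random

SHAPES = ("circle", "square", "triangle")


def make_challenge(rng, n_digits=6):
    """Return (items, wanted_shape, answer) for one marker-based challenge."""
    items = [(rng.choice("0123456789"), rng.choice(SHAPES)) for _ in range(n_digits)]
    # Choose the instruction from shapes that actually occur,
    # so the expected answer is never empty.
    wanted = rng.choice(sorted({shape for _, shape in items}))
    answer = "".join(digit for digit, shape in items if shape == wanted)
    return items, wanted, answer


def check(response, answer):
    """Accept the response only if it matches the expected digits exactly."""
    return response.strip() == answer


items, wanted, answer = make_challenge(random.Random())
print(f"Only enter the numbers under a little {wanted}: {items}")
```

Swapping the instruction each time only means changing how `wanted` and `answer` are derived. The checking side stays the same.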
[imagine this as a captcha graphic]
Spell last month.
Or this:
[image]
Type the one that flies:
England Turkey Russia
Or this:
[image]
Type the word for
2 + number of days in a week
Or just to confuse things, split the "challenge" into code + html:
[image]
2 + number of days in a week
[html] What is the number above minus 4, as a word: ___
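Checking answers for the last two examples could look something like this. It is only a sketch: `NUMBER_WORDS` and `check_answer` are names invented here, and it covers the arithmetic challenges only, not the "Spell last month" or "Type the one that flies" ones.

```python
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
]
DAYS_IN_WEEK = 7


def check_answer(response, value):
    """True if the response spells out the expected number, ignoring case and padding."""
    return response.strip().lower() == NUMBER_WORDS[value]


# The part rendered in the image: "2 + number of days in a week".
image_value = 2 + DAYS_IN_WEEK
# The part asked in the HTML: "the number above minus 4, as a word".
html_value = image_value - 4

print(check_answer("Nine", image_value))
print(check_answer("five", html_value))
```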
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
The summary suggests this marks an advancement in AI, but it depends on what AI means. There are generally two areas of AI: 1) artificial "thinking", and 2) using advanced algorithms to get things done. Most people think about #1 when you say AI; however, solving captcha is just an example of #2. I would argue that #2 really isn't "AI" at all. In fact, all advancements in "AI" are of type #2. Attempts at #1, thus far, have been absolute failures.
I sort of hope that the CAPTCHA-busting code is just vapor, and it doesn't get released.
If it does come out and get into widespread use, the likely result is websites going another step up the chain and doing more annoying stuff, such as requiring access through Facebook, demanding a phone number for SMS authentication (of course, said number ends up getting sold to robodialers), or more intrusive means.
I see some CAPTCHA replacement schemes like counting how many cat butts are facing a person in a row of six photos and inputting the number, but those seem at best a stopgap measure, and block out access to the site to the blind.
I wonder if the Turing test is: does the subject attempt to solve something too obscure, or does it spin for another puzzle? Failing on the poorly made ones instead of rejecting them and going on to the next might show which is a human and which is a machine.
Does Download.com have it yet? I need a program like this to help me figure those freaky, wormy wordnumbers out.
Isn't this the point of computers? To do what humans do so humans don't have to do it.
If you read Black Hat World, you find that CAPTCHAs are a solved problem for spammers and fake account creators. The better systems run them through several OCR programs in parallel. That knocks off about 67% of them. There's a lot of special casing involved, but from the spammer's viewpoint, this is a solved problem. Getting from 67% to 90% would be convenient, but humans aren't at 90%. If all the OCR programs give up, the problem is sent to an outsourced service where low-wage people solve CAPTCHAs all day.
The Black Hat forum system itself makes users play and win a short video game to lock out 'bots.
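The control flow described above is basically an ensemble with a fallback. A bare sketch of just that flow, with no real OCR in it: the engines and the human service are hypothetical callables, the majority-vote rule (`min_votes`) is an assumption on my part rather than something the forum posts spell out, and the engines are called one after another here instead of in parallel.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A solver takes image bytes and returns a guess, or None if it gives up.
Solver = Callable[[bytes], Optional[str]]


def solve_with_fallback(
    image: bytes,
    ocr_engines: Sequence[Solver],
    human_service: Solver,
    min_votes: int = 2,
) -> Optional[str]:
    """Return the answer enough engines agree on, else defer to the human service."""
    guesses = [g for g in (engine(image) for engine in ocr_engines) if g]
    if guesses:
        text, votes = Counter(guesses).most_common(1)[0]
        if votes >= min_votes:
            return text
    # Every engine gave up, or they disagreed too much: send it to people.
    return human_service(image)
```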
First reliable text recognition software developed!
http://xkcd.com/810/
If you found the article worthless, you pass. If you found the dancing letters in the video entertaining, you also pass.
Guardian article from 2008 called 'Captcha is broken, now what?', which in turn references a Captcha-breaking algorithm that was created in 2005, "and demonstrated it by posting automated comments to nearly 100 blogs to demonstrate their vulnerability."
http://www.theguardian.com/technology/2008/aug/28/internet.captcha
Palaces, barricades, threats, meet promises
Alternately... use the alternative audio and run speech recognition on it to solve the captcha.
No one thinks outside the box any more...
> In fact, Vicarious's researchers go on to claim that their algorithm works in an analogous way to the human brain.
> He can add more distortions, but we can simply add a few more training data that captures that distortion, if it is not already captured by the existing training examples.
Really, it just sounds like they have a supervised ML algorithm which seems to be performing better than the umpteen others trying to break CAPTCHAs. Unless they release more details proving otherwise, I can't see how this is a breakthrough of any sort.
Between their 90% and my 10%, we could solve them all!
Artificial Intelligence now exceeds human capability.
Secession is the right of all sentient beings.
Most captchas were cracked 17 months ago.
It's time for something that's easier for humans and harder for computers. For example, these images have been tweaked such that the standard routines don't work:
https://bettercgi.com/sb5/
No computer, by definition, can exhibit 'artificial intelligence'. The modern use of the term AI describes computers using massive massive databases of pre-captured data used by 'rule engines' applying standard statistical methods to do some form of pattern processing.
The problem with CAPTCHAs is trivial to explain. A 'Turing Test' requires Human adjudicators, but clearly a CAPTCHA is designed to allow a COMPUTER to judge whether the 'user' at the other end is a Human or a 'machine'. So a machine must generate unique CAPTCHA data, and a machine must judge the 'correctness' of the response, ensuring that it is ALWAYS possible for a machine to correctly solve the CAPTCHA.
How to make CAPTCHAs that avoid this problem? You can't! Use Humans to create unique CAPTCHA questions, and an attack can use any number of methods based on discovering a finite collection of 'questions'. Crowd-source an approach to the questions, and the known statistics of the psychology of crowd responses eliminates the useful Human dimension.
Some people hope that CAPTCHA creation can be ultimately made analogous to the non-symmetric nature of strong encryption methods, but this is a mathematical fallacy. The issue is this: 'common sense' suggests to most of us that an image, obvious to a Human but incredibly hard for a computer vision algorithm to identify, must be easily creatable by sufficient computer power. But 'common sense' ignores the reality of the problem.
The Human vision system, when processing something useful for the CAPTCHA system (usually text) works in a VERY trivial way. Semantic thinking, the impossible part of Human thinking to replicate on a computer, plays no part in such a CAPTCHA. To solve such a CAPTCHA, a computer does NOT have to consider Human thought or perception, simply the pattern processing 'hardware' in the Human eye, and the nerves and brain function immediately connected to the eye. All the 'clever' visual jiggery-pokery to 'hide' the text from the computer 'solver' fails, because WE strip it out with very simple visual 'hardware', hardware a modern computer can very easily replicate.
The best a CAPTCHA system can try to do is "security through obscurity"- in other words constantly change the form of the CAPTCHA, so although it is easily breakable, the software to break it is always one generation behind the software producing it.
Why shouldn't a CAPTCHA company have HUMANS producing new CAPTCHA systems every hour, so the 'crackers' are always out-of-date? The significance of the work of the company named in the article is the claim that even this approach would have limited success, because the 'solving' simply gets to the root of how Human vision systems strip out the obscuring 'noise'. But then, the CAPTCHA companies, with their hourly Human coded permutations of the 'noise' system could simply include varying screen 'patterns' that instruct the user as to the 'order' of the letters/numerals.
The trick is creating an ever changing HUMAN dimension for the display of the CAPTCHA data that the crackers have to learn, and code into their solver system- to keep the 'crackers' constantly one step behind. Dumb linear text, no matter how crapped up with 'noise', will easily fall to a perfected, once-and-for-all, machine method.
There's a new system on the way called BORE - Back Orifice Recognition Engine. They claim no two are alike. A seat is included with the system.
Eternity: will that be smoking, or non-smoking? I Corinthians 6:9-10
Spammers and bots seem to have broken it some time ago. Is this something new?
If you think about it... what we are asking is... show us something you can do that a computer can't do... through a computer. Mildly mind boggling logic puzzle there.
Captcha is worse than the problem it's supposed to fix.
reCAPTCHA from Google has been broken for a while. I had it implemented on my site and got about a dozen spam sign-ups a day.
The moment I switched to a local "mycaptcha", which should have been easier to OCR, they stopped dead.
As if my website (http://asecretspot.com) didn't get enough spam as it was, now bots will be able to solve captcha!? I'm doomed.