Researchers Break Video CAPTCHAs
Orome1 writes "After creating the 'Decaptcha' software to solve audio CAPTCHAs, Stanford University's researchers modified it and turned it against text and, quite recently, video CAPTCHAs with considerable success. Video CAPTCHAs have been touted by their developer, NuCaptcha, as the best and most secure method of spotting bots trying to pass themselves off as human users. Unfortunately for the company, researchers have managed to prove that over 90 percent of the company's video CAPTCHAs can be decoded by using their Decaptcha software in conjunction with optical flow algorithms created by researchers in the computer vision field of study."
Commies vs West
MPAA vs sharers
coders vs decoders (that includes captcha vs decaptcha)
It's fun to observe it when government does not interfere.
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
We need some made up law.
"Anything a computer can generate it can understand."
This is why chat bots still suck. Computers cannot generate context.
Oblig XKCD.
http://xkcd.com/1019/
21st Century Renaissance Man
..if your user can interact with it, they can screw with it. The nature of HTTP and the web is a stateless environment, one has to impress state onto it for things like secure transactions and sessions. Basically, you need to come up with a test that randomly checks to see if the input is coming from a person; all without breaking the experience of the web browser, or the web in general. It's an arms race, and things are even again; another advantage bites the dust.
The CAPTCHA is worthless against an army of Indians being paid just pennies a pop to break them. The only thing they do is annoy the script kiddies. Far better success would be had in doing pattern recognition on sign-ups instead.
Honestly, I fucking hate CAPTCHA and will cheer on its demise. Good luck typing this shit in...
We have to face the fact: CAPTCHA is just a temporary measure anyway. Software is rapidly approaching the ability to do anything online that your average human can. While computers rapidly increase in capability, the average human stays the same. Eventually the only way to tell a computer from a human will be that the humans are easier to confuse.
What could possibly go wrong? v1agra
Daily read for tech news: Freezenet.ca
have anything else to do?
Sorry, had to say it.
I'm not a lawyer, but I play one on the Internet. Blog
http://xkcd.com/810/
At least something good could come out of captchas.
Why oh why real pictures aren't used is beyond me.
Example:
We have a simple room strip, make sure it has a rug, carpet, something big and varied.
We have thousands of pictures of hairy creatures, be it cats, dogs, hamsters, whatever.
Skew, stretch, distort in various ways, even throw random lines through them, every one of them. Rotate them too.
Remove their faces. (since those aren't as hairy and can easily be ID'd as whatever animal)
Place 2 distinct creature types in same picture, several of them.
To solve captcha, you type in the first letter of the animal (it will tell you, such as Cat, Dog, Hamster, Sloth, etc.) for each of those animals.
So, you'd end up with something like CCDDCDDC as the captcha.
Not only does the algorithm have to be capable of deciphering the differences between 2 animal types, they have to be able to get every one of them in sequence.
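A rough sketch of how the answer string above could be derived, and of the odds that a bot typing random letters gets it right. The function names and the eight-animal layout are assumptions made up for illustration, not part of any real captcha system.

```python
def expected_answer(animals: list[str]) -> str:
    """Build the answer from the first letter of each animal, left to right."""
    return "".join(name[0].upper() for name in animals)


def blind_guess_odds(num_types: int, num_animals: int) -> float:
    """Chance that uniformly random letters match the whole sequence."""
    return 1 / num_types ** num_animals


# Hypothetical picture: eight animals drawn from two types.
picture = ["cat", "cat", "dog", "dog", "cat", "dog", "dog", "cat"]
answer = expected_answer(picture)   # "CCDDCDDC"
odds = blind_guess_odds(2, 8)       # 1 in 256
```

With only 2 types and 8 animals a blind guess succeeds about once in 256 tries, so more animal types or longer sequences would be needed to make guessing unattractive.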
It won't take up that much more space than your typical messy captchas.
And the best part? Until we get quantum computers, it is going to take a huge botnet to solve these in any reliable time.
Never allow caching or re-use of a single one of those captchas, ever. These are 2 of the biggest reasons why a lot of systems fail: the captchas are allowed to get too old, and they can be re-used for loads of people.
Well, you COULD re-use them, just don't re-use them a lot. Separate them by region. Generate a new batch of pictures every so often if, god forbid, it does somehow manage to get cracked.
If you have a small-ish site that caters to a niche community where your target audience will share some knowledge that non-target folks don't have, a riddler where you can set the questions can work great. Just structure your questions in such a way that the answer is non-obvious in an automated way to all but the best AI engines.
For example, Phoronix could use a question like this --
Which of these is superfluous? Intel, ATI, NVIDIA, AMD
The CAPTCHA industry is not doing well.
ReCAPTCHA needs to be retired. OCR is getting too good. ReCAPTCHA, remember, is using images from book scanning, ones that the OCR system couldn't recognize. When ReCAPTCHA started, the text presented was usually an English word. Now, if the book scanning OCR system can't figure out something, it's probably not an English word. You're lucky if it's a sequence of characters found on an A-Z keyboard. People have reported ink blots, mathematical formulas, and Cyrillic.
Worse, ReCAPTCHA's idea of the "right" answer is crowdsourced. It's possible for bots to pollute the ReCAPTCHA database, by providing the same wrong answer more than once. You only have to get one of the words right, so if you can read one, a junk response for the other works. This goes into the database as a vote for the "right answer", to be presented to someone else later. I sometimes type "whatever" when one of the images is unreadable.
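To make the pollution problem concrete, here is a toy model of crowdsourced answers. It is not ReCAPTCHA's actual code: the class name, the two-vote threshold, and the normalization are all assumptions. It only shows that repeating the same junk answer, while getting the known word right, is enough to make the junk look like the consensus.

```python
from collections import Counter
from typing import Optional


class UnknownWord:
    """Toy model of a word whose 'right' answer is decided by votes."""

    def __init__(self, votes_needed: int = 2):  # threshold is an assumption
        self.votes = Counter()
        self.votes_needed = votes_needed

    def submit(self, control_ok: bool, guess: str) -> None:
        # A guess only counts when the known (control) word was typed correctly.
        if control_ok:
            self.votes[guess.strip().lower()] += 1

    def accepted_answer(self) -> Optional[str]:
        if not self.votes:
            return None
        answer, count = self.votes.most_common(1)[0]
        return answer if count >= self.votes_needed else None


word = UnknownWord()
word.submit(True, "whatever")   # bot reads the control word, types junk
word.submit(True, "whatever")   # same junk again
word.submit(True, "morning")    # one honest human
polluted = word.accepted_answer()  # "whatever"
```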
Where the hell is Standford University? Is that near Stanford?
The going rate is around $1 per 1000 solved CAPTCHAs.
Of all the things to research, why research this? We know that captchas can be broken, but why does Stanford need to help the spammers get there quicker? I am sure that there are tangential benefits to the research (character recognition etc), but why not focus on that directly without the captcha specificness?
Yes, the captchas will be broken eventually, but at least let's make the spammers figure it out themselves without helping them do it.
Well, the whole CAPTCHA system is itself flawed - it's putting all the data in one place. The only way to make it harder would be to have multiple data sources for users to have to put information through - e.g. not simply one CAPTCHA to verify, but 3 or 4 separately loaded, and all independent of each other. (Even 2 would be an improvement.)
Still, it would only be a matter of time before the bots figured out how to track all the CAPTCHAs and thereby defeat it yet again.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
There are a variety of low-tech techniques that are far more effective than using Captchas or even "security questions". You don't have to annoy your legitimate users, or make them jump through hoops. One trick is to include a "honeypot input" in your form. Give it a tantalizing name attribute such as "username", give it visibility of "hidden", and when validating your form simply check to see if any values have been entered. If it's non-empty, it's a bot.
On my own site, I load my form into the page via an AJAX call, which means that there is no reference to "registration", "form", "username", or any of the other tokens in the page source that a bot is looking for. Bots may be sophisticated enough to figure out Captchas, but they haven't progressed to the point that they can parse, comprehend, and execute javascript.
If a bot ever arises that can thwart both of these techniques used in tandem, it'll be too busy amassing a robot army to bother with your silly little site.
I NEED one of these captcha solver programs. When I try to register for a website or forum, many of the captchas are so unreadable that it takes me 20 minutes of trying to get one right, and there is NO PHONE NUMBER to call their technical support to register me by telephone.
i was thinking one time that instead of using recognition of letters to distinguish between a human and a bot, it might be easier to use object recognition. so you would show a picture, for example, of some objects on someone's desk and ask "what is to the right of the pencil?", and the user would have to say "paper clip" or something like that. it seemed to me like that might be a task that is easy for a human, but harder to code a bot to accomplish.
I had no idea they existed -- haven't seen any video captchas, and don't want to. Does nobody realize how horrible that is for usability? (I guess some people do, since I haven't seen them on any website I use.) At this point, it's become clear that fight is not winnable -- regardless of computer capability, there's just too many people willing to solve captchas too cheap to make it effective. Better to give it up than to continue the arms race where the only losers are legitimate users...
What about charging 10-15 seconds of CPU time with some arbitrarily hard code? It seems like everyone agrees that CAPTCHAs are an arms race that the good guys can't win, so why not make it so that it isn't profitable to solve the CAPTCHA replacement on a large scale?
The state of OCR has changed little in over a decade, at least at the consumer end. I've tried the top software like Acrobat Pro and Omnipage and hardware solutions from Xerox, HP, Fujitsu, etc. The text can be printed clear as day, with no flaws, yet the OCR programs all fail to get above, I'd say, 70% accuracy. Maybe it's different in the commercial world, where one can afford a $25,000 glorified copier, but I've been unable to find anything you can buy from Amazon or the like that will reliably scan a document.
There are a variety of low-tech techniques that can be more effective than using Captchas or even "security questions", especially when you mix and match. You don't have to annoy your legitimate users, or make them jump through hoops. One trick is to include a "honeypot input" in your form. Give it a tantalizing name attribute such as "username", give it visibility of "hidden" (with CSS from a style-sheet), and when validating your form simply check to see if any values have been entered. If it's non-empty, it's a bot. On my own site, I load my form into the page via an AJAX call, which means that there is no reference to "registration", "form", "username", or any of the other tokens in the page source that a bot is looking for. Bots may be sophisticated enough to figure out Captchas, but they haven't progressed to the point that they can parse, comprehend, and execute javascript. If a bot ever arises that can thwart both of these techniques used in tandem, it'll be too busy amassing a robot army to bother with your silly little site.
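The server-side half of the honeypot trick can be as small as the sketch below. It assumes the submitted form arrives as a plain dict and that the real user name lives in a differently named field; `looks_like_bot` and `login_name` are hypothetical names, not from any particular framework.

```python
def looks_like_bot(form_data: dict, honeypot_field: str = "username") -> bool:
    """Return True when the hidden honeypot input came back non-empty.

    Humans never see the field (it is hidden via CSS), so it should
    arrive empty or be missing from the submission entirely.
    """
    value = form_data.get(honeypot_field, "")
    return bool(value.strip())


# Hypothetical submissions: the real name is sent as "login_name".
human = {"login_name": "alice", "username": ""}
bot = {"login_name": "x", "username": "cheap-pills"}
```

Here `looks_like_bot(human)` is False and `looks_like_bot(bot)` is True. Whether to reject flagged submissions outright or silently drop them is a design choice left open here.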
What about charging 10-15 seconds of CPU time with some arbitrarily hard code?
A major obstacle to this is that you have to make the puzzle easy enough that your users on lower-end or mobile devices still have the necessary computation power to complete the puzzle in a reasonable time. Malicious organizations behind the spam will just put more hardware into their attack, typically by using the compromised machines in botnets. They'll also optimize the code, and parallelize the attack by performing the computation for multiple attempts on multiple CPU cores, while your code has to work for single-core machines.
Let's now imagine a perfect world in which you create a check that actually takes 15 seconds to complete. They can still do that 5,760 times per day.
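For reference, a minimal sketch of what such a check might look like, using a hashcash-style hash puzzle as a stand-in for the "arbitrarily hard code" (that choice, the function names, and the challenge format are all assumptions). The last function reproduces the arithmetic above: 86,400 seconds in a day divided by 15 seconds per puzzle.

```python
import hashlib
from itertools import count

SECONDS_PER_DAY = 24 * 60 * 60


def _digest_value(challenge: str, nonce: int) -> int:
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: search for a nonce whose hash has enough leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash confirms the work was done."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


def attempts_per_day(seconds_per_puzzle: float, cores: int = 1) -> int:
    """How many puzzles an attacker finishes in a day, per the estimate above."""
    return int(SECONDS_PER_DAY / seconds_per_puzzle) * cores
```

`attempts_per_day(15)` gives the 5,760 quoted above, and each extra core or compromised machine multiplies that figure, which is the weakness being described.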
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
lol wut?
"They were pure niggers." – Noam Chomsky
You could have panels of comic strips that combine visual information to provide a context, and a partially filled word bubble with one very obvious missing word.
By combining visual cues to provide context for the missing word, you could at least make it harder for algorithms, although the Underpaid Indian People attack still works.
I have nothing to lose but my bindings.
Most annoying piece of shit I've ever seen. Every time I use Ticketmaster I wish the asshole who created reCaptcha the most gruesome death. Oh, it's owned by the Google assholes, of course.
The key with CAPTCHAs is diversification, just like the key to avoiding disease in biological specimens is avoiding a monoculture. If there were 15000 different CAPTCHA methods, it wouldn't be profitable to create CAPTCHA tools that would each only work on some small subset. There are a lot of low population sites I use that check whether I'm a human with some unique set of hoops through which I must jump. The effectiveness of those hoops comes from the fact that they're often unique to that site, not a lump of code used by thousands of different sites. Diverse CAPTCHA breaking might require something like Watson, which isn't going to be available to spammy types in the near future.
"I zero-index my hamsters" - Willtor (147206)
Have the captcha page display some really good porn video footage - drawn from a huge repository of suitable images (say, the rest of the internet). The clips are fairly long (say 3-5 mins or so). To pass the captcha the user merely has to click on a button at the right time. :P
So, if the user clicks right away, it's a bot. If there is a suitable pause (say 3-5 mins), then it's more likely human.
"The first time I got drunk, I got married. The second time I bought a chimpanzee, after that I stayed sober" Arian Seid
I have to wonder just who Stanford is trying to help out with this research. Captchas may be annoying, but when their research makes its way to the script kiddies and the industry comes up with a new solution, does anyone really think the new solution won't be even more annoying?
FTFY:
Oblig XKCD.
http://xkcd.com/810/
There should be an oblig XKCD link for all the bloody times people post oblig XKCD links.
finding a captcha is on the verge of proving that you ARE indeed a robot...
for helping spammers?
Dear northamerican, from Soviet Asia, chinese captcha dooms you!
I always thought that was pretty secure because the machine couldn't tell which picture was a cat? What about combining video and cat captcha. 4 videos, one of which is a cat. But it could be a close video, or a zoomed out one where the cat is running around. A computer really shouldn't be able to decode that. Use a large enough database and they'll never solve it.
I got to chat with Luis von Ahn, co-creator of the Captcha and reCaptcha, and it turns out he's a surprisingly idealistic guy. Taking inspiration from people in gyms pedaling and going nowhere, he hoped to actually *do* something with the brainpower needed to solve a reCaptcha (he said something along the lines of, "actually your brain is doing a pretty amazing thing -- translating an image to text.") Maybe digitizing the archives of the New York Times and ancient manuscripts isn't world hunger or world peace, but it's pretty damn cool. And as you probably know, that's what you're helping to do every time you translate a word in a reCaptcha box.
GeekDad, TED speaker, Wipeout loser, author of Brain Trust
I wonder why most people cannot spell Stanford University's name correctly.
Hopefully they'll start integrating these deCaptcha tools into Firefox and Chrome. Captchas became so hard it's impossible for mere humans to solve them.
Made to look like a captcha, with the text, "What is 2+3?" Spammers read the captcha and submit that back. Normal people type 5. "What site is this?" is another good one. Heck, you don't even need to make it look like a captcha; it's just funnier that way. One site I mod used to get dozens of spam threads a day, until a couple years ago they added a box to the end of their registration with one of a handful of questions like "what do seal clubbers club" (answer: seals), or "what is the first letter of the alphabet". It now gets a few a month, most of which look entirely nonautomated. When bots have evolved to read natural language and spit out correct answers to arbitrary (though easy) questions, well, as xkcd once said, mission accomplished. (Though his mission was a different, albeit related, mission.)
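A bare-bones version of that kind of question box might look like the following. The questions come from the examples above; the accepted-answer sets (such as also allowing "five" or "seal") and the function names are assumptions for illustration.

```python
import random

# Question -> set of accepted answers, compared in lowercase.
QUESTIONS = {
    "What is 2+3?": {"5", "five"},
    "What is the first letter of the alphabet?": {"a"},
    "What do seal clubbers club?": {"seals", "seal"},
}


def pick_question(rng=random) -> str:
    """Choose one of the handful of questions to show on the form."""
    return rng.choice(list(QUESTIONS))


def is_correct(question: str, answer: str) -> bool:
    """Accept the answer regardless of case, padding, or trailing punctuation."""
    normalized = answer.strip().lower().rstrip(".!?")
    return normalized in QUESTIONS.get(question, set())
```

A bot that echoes the question text back, as described above, fails `is_correct`, while a person typing " 5 " or "Seals." passes.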