ReCAPTCHA.net Now Vulnerable to Algorithmic Attack

colours by orange47 · 2010-08-05 09:03 · Score: 2, Funny

"There's probably an excellent Firefox plugin to render this page's color scheme more bearable."
Just select all of the page; it's better.

Re:colours by electrostatic · 2010-08-05 09:17 · Score: 4, Informative

"...an excellent Firefox plugin to render this page's color scheme more bearable."

Yep. Color Toggle

https://addons.mozilla.org/en-US/firefox/addon/9408/

I have it set so Ctrl-Shift-Z sets a light yellow background, black text, and blue links.
Re:colours by commodoresloat · 2010-08-05 09:46 · Score: 1

sweet! Perfect for reading slashdot discussions in this IT color scheme! It's hilarious that a slashdot summary in this section is complaining about color schemes on other pages... glass houses and all
Re:colours by Anonymous Coward · 2010-08-05 09:53 · Score: 1, Informative

Neat, I also use yellow background, black text and bluish links. It is very relaxing.
The color codes are #FFFF00 for the background, #000000 for the text and #00EFFF for the links.
Re:colours by Anonymous Coward · 2010-08-05 09:59 · Score: 1, Informative

View>page style>no style
easy.
Re:colours by SkyDude · 2010-08-05 10:32 · Score: 1

Just hit Ctrl-a and all the text shows up just fine.

--
== First cross river, then insult alligator.
Re:colours by rnturn · 2010-08-05 16:35 · Score: 1

"Just select all of the page; it's better."
Ugh. That looks even worse on my browser. I found that "View->Page Style->No Style" worked much better. At least the text was easier to read. And no time was spent searching for an obscure plug-in.

--
CUR ALLOC 20195.....5804M
Re:colours by delinear · 2010-08-05 23:55 · Score: 1

People with one of the most common forms of dyslexia find it easier to read black on yellow (usually soft/light/pastel yellow) - it's high contrast but doesn't have the glare of pure white - just because that colour scheme doesn't sound more readable to you, that doesn't mean it's not more readable to someone.
Re:colours by jonadab · 2010-08-06 02:03 · Score: 1

I haven't browsed with page-specified colors turned on ever since I realized, back in the nineties, that most web content creators have extremely terrible taste in colors. Web pages appear in the system colors (#FFE6BC on #294D4A), same as everything else. That way I can actually stand to look at them.

--
Cut that out, or I will ship you to Norilsk in a box.
Re:colours by commodore64_love · 2010-08-06 03:49 · Score: 1

(1) If white is too glaring, just turn down the brightness on your monitor.
(2) The Black-on-red style reminds me of the 4-color IBM PCs and Apple IIs, circa 1988. Yeah. Nostalgic?!?!?
(3) I cursed the fact I had to downgrade from the beautiful 4000-color Commodore Amiga to a lowly PC or Apple, but that's what the school used, so I was stuck with it. I can't believe those ugly Apples and PCs "won" the computer war. Where did it all go wrong?

--
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall

Human Success? by Anonymous Coward · 2010-08-05 09:03 · Score: 5, Insightful

So what is the average human success rate? I think mine is only about 50%

Re:Human Success? by Anonymous Coward · 2010-08-05 09:52 · Score: 2, Informative

Mine is 100%. Recaptcha is probably one of the easiest captchas I've ever had to deal with; something is wrong with you, sorry.
Re:Human Success? by wickedskaman · 2010-08-05 10:08 · Score: 1

Maybe that's why it's been figured out at ~30% efficacy. *shrug*

--
Sand's overrated... it's just tiny little rocks.
Re:Human Success? by artg · 2010-08-05 11:22 · Score: 1

So, is there a firefox plugin that fills in captchas for me ?
Re:Human Success? by Kalriath · 2010-08-05 15:41 · Score: 3, Insightful

Yeah, I agree with this. Recaptcha is one of the easiest out there.
Admittedly though, I have around about 3% success rate with vBulletin captchas. Hear that forum owners? I'm not joining your forum because I can't read your captcha!

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Re:Human Success? by delinear · 2010-08-06 00:05 · Score: 1

Recaptcha allows you more mistakes by default than most other captcha solutions, I think. TFA's findings seem to support this: it suggests you can get one word wrong plus one letter of the other word. When I tested that, it was too much, but I have successfully passed the captcha with one right and one wrong word. Really, you're being served one captcha word and one word Google's book scanning project couldn't recognise. You're solving the captcha word; for the other word (usually the harder one to read) you're only adding to the statistical weighting of what the word probably is. In other words, you can afford to be a little wrong and still get the captcha right.

My eyes! by Yvan256 · 2010-08-05 09:04 · Score: 2, Funny

The goggles, they do nothing!

Re:My eyes! by SomeJoel · 2010-08-05 09:40 · Score: 4, Funny

Did you not learn when I explained this yesterday? The quote is: "My eyes! The goggles do nothing!". There is no "they", nor is there any bad pronunciation. Indeed, it is correctly articulated and enunciated, with an accent.
Easy there champ, nobody appreciates a Family Guy nerd correcting everyone's quotes.

--
<Complete your profile by adding a signature!>
Re:My eyes! by Agent0013 · 2010-08-05 10:00 · Score: 1

Unless it's actually a quote from the "Is it a good idea to microwave this" guys on Youtube. (Although I think they actually say the line about the mask and not the goggles.) http://www.youtube.com/watch?v=ewGkH-E_HWA

--

-- ssoorrrryy,, dduupplleexx sswwiittcchh oonn.. -Quote found on actual fortune cookie.
Re:My eyes! by sexconker · 2010-08-05 11:05 · Score: 1

Except those guys are simply bastardizing the Simpsons quote.
Re:My eyes! by billstewart · 2010-08-05 11:39 · Score: 1

Dude, if you're getting old enough to need reading glasses, just get them....
There are some really bad CAPTCHAs out there - reCAPTCHA is one of the more human-readable ones, but sometimes just magnification isn't enough.

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Re:My eyes! by SomeJoel · 2010-08-05 11:53 · Score: 4, Funny

Judging from the other replies, meta-humor is a little hard for you guys...

It works wonders though. For instance, the next time someone is talking about "the force" or jedis and such, tell them "Get a life, Star Trek sucks!". You'll find the reaction much more interesting than if you correctly identify the franchise.

--
<Complete your profile by adding a signature!>
Re:My eyes! by Yvan256 · 2010-08-06 06:05 · Score: 1

Did you take the time to check the website linked in the summary? The goggles reference was to that website, not CAPTCHAs.
Re:My eyes! by billstewart · 2010-08-06 12:52 · Score: 1

Yes, it was pretty ugly, but easy to find the link and look at the article..

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks

OCR improvements? by Anonymous Coward · 2010-08-05 09:05 · Score: 3, Interesting

Can these attack algorithms actually increase the accuracy of normal OCR programs?

Re:OCR improvements? by Chad+Birch · 2010-08-05 09:22 · Score: 1

I haven't RTFA, but that's unlikely. With a captcha, you receive a response indicating whether you were correct or not. When using OCR, there isn't really any automated way to be sure if you've gotten it right.

--
Sturgeon was an optimist.
Re:OCR improvements? by nizo · 2010-08-05 09:24 · Score: 1

Better living through spam!

--
I Am My Own Worst Enemy
Re:OCR improvements? by ottothecow · 2010-08-05 09:33 · Score: 1

recaptcha was created to increase the accuracy of normal OCR programs...
so technically the bots solving them would also be helping proof Project Gutenberg texts so long as they are getting both the test word and the book word correct.

--
Bottles.
Re:OCR improvements? by AusIV · 2010-08-05 10:28 · Score: 2, Informative

They're not. I saw the presentation these guys gave at DefCon (their presentation was about as painful as their website), and they're only getting the test word correct with about 30% accuracy. They're not completely sure about their success rates on book words, but they believe it to be considerably lower than the test words.
Re:OCR improvements? by Garble+Snarky · 2010-08-05 10:32 · Score: 1

Unless of course, all the bots use the exact same algorithm, and they all make the same mistake on the book words. Recaptcha uses consensus, right?
Re:OCR improvements? by Peach+Rings · 2010-08-05 12:56 · Score: 1

IIRC, as part of the marblecake time magazine vote thing, people submitted thousands of PENISes as the book word to try to get it inserted randomly into ebooks. The recaptcha people said they've anticipated such an attack and that it's not possible to influence final book word results.
Re:OCR improvements? by n3ond4x · 2010-08-05 14:31 · Score: 2, Funny

They are not considerably lower because as book words are solved they become verification words. Also, if you didn't enjoy my talk, don't come next time.
Re:OCR improvements? by Sparr0 · 2010-08-05 19:44 · Score: 3, Insightful

The problem is that since you are *probably* solving the verification words with higher accuracy to begin with, you are actually poisoning the data being gathered regarding the book words. So, while a book word becoming a verification word based on your "solutions" will keep your solution rate constant, it actually damages the system when it comes time for humans to solve the CAPTCHA, or worse when the solutions are used as OCR corrections.
To clarify, given a classically OCR-able "foo" and a non-OCR-able-but-human-readable "bar", a human is expected to recognize the slightly-deformed-by-reCAPTCHA "foo" and is trusted to get "bar" right more often than OCR would. This attack only defeats the deformation applied by reCAPTCHA, it doesn't actually improve the OCR on the non-deformed words, which means you are going to submit an answer of "foo ban" every time this pair is encountered (or "blah ban" for a different scenario), and the reCAPTCHA system is eventually going to decide that the book word really is "ban".
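The "foo ban" scenario above can be sketched as a toy simulation. This is only an illustration under loud assumptions: it treats the book-word decision as a plain majority vote over submissions whose control word passed, and the function name `consensus` and the vote counts are invented. How reCAPTCHA really aggregates answers is not described in this thread, and elsewhere in the discussion its operators are said to claim they guard against exactly this.

```python
from collections import Counter

def consensus(votes):
    """Return the most common answer for a book word.

    Simple majority vote; this aggregation rule is an assumption."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical numbers: a few humans read the book word correctly as "bar".
human_votes = ["bar"] * 5

# A bot that beats the distortion on the control word, but misreads the
# book word the same way every time, submits "ban" on each encounter.
bot_votes = ["ban"] * 20

print(consensus(human_votes))              # humans alone
print(consensus(human_votes + bot_votes))  # humans plus the bot
```

With these made-up counts the humans alone settle on "bar", while the combined pool settles on "ban", because the bot's mistake is perfectly consistent and outnumbers the human answers.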
Re:OCR improvements? by AusIV · 2010-08-06 05:08 · Score: 1

Basically, they had written out a manual for breaking reCAPTCHA and copied it into PowerPoint slides. Then they stood in front of the audience and read straight from the slides, providing practically nothing we couldn't have gotten just from reading the slides.
On about the third slide, someone in the audience yells out "Are you guys seriously going to just stand up there and read your slides?" And one of the presenters confirmed that they were indeed going to stand up there and read their slides. Half the audience got up and left. If it weren't for my strong personal interest in breaking CAPTCHAs I would have been in the half that left.

Pretty cool stuff by Monkeedude1212 · 2010-08-05 09:07 · Score: 1

But that just means more spambots, right?

Re:Pretty cool stuff by Kepesk · 2010-08-05 09:09 · Score: 1

Personally, I don't think there will ever be an effective CAPTCHA or similar image-based technology. Someone will always come out with a better algorithm to beat them.

--
Help me fix my brother's injured butt!
Re:Pretty cool stuff by fyrewulff · 2010-08-05 09:14 · Score: 1

On the other hand, it'll be easier to block the known spammers because fewer of them will be able to afford the hardware/sweatshop/botnet setups once the computational brute force needed increases.

--
"We need to get over this notion, that, for Apple to win... Microsoft must lose." - Steve Jobs, 1997
Re:Pretty cool stuff by Anonymous Coward · 2010-08-05 09:23 · Score: 1, Insightful

This won't happen. Many current CAPTCHAs are already hard to solve for humans, and increasing the computational cost to solve a CAPTCHA will also make it harder to solve for humans.
Now, the problem is, computers are getting more powerful every day, while humans aren't. Sooner or later, this simple fact will render CAPTCHAs useless.
Re:Pretty cool stuff by veganboyjosh · 2010-08-05 09:29 · Score: 1

Useless to humans, maybe.

Maybe not so much to Skynet.
Re:Pretty cool stuff by John+Hasler · 2010-08-05 10:41 · Score: 1

> Maybe not so much to Skynet.
Then we just have to hope the spammers piss off Skynet.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Re:Pretty cool stuff by Peach+Rings · 2010-08-05 12:57 · Score: 1

Obviously the evolution of CAPTCHA science will be toward text (or audio) that is more easily recognized by humans but not by bots. It's not just a matter of making captchas really really hard.
Re:Pretty cool stuff by KahabutDieDrake · 2010-08-05 15:40 · Score: 1

This assumes only text CAPTCHA. Iconic based CAPs exist, and are incredibly simple for humans to solve, and very difficult for computers to solve. 4 human face pictures. The question says "which one is smiling?" Now, I realize that even that can be computed, eventually. But the cost is MUCH higher for a computer than for a human. Further, that's the simplest possible example, and hardly encompasses all the possible variations or difficulties. Computers are really good at raw power. They SUCK at pattern recognition (although they are getting better quick) and they REALLY suck at emotional response. The concept of a CAPTCHA is sound, we just need to get away from text based systems. Because as OCR and raw power become more common and more powerful, they won't hold up.
Re:Pretty cool stuff by delinear · 2010-08-06 00:29 · Score: 1

Yes, centralised captcha would seem eventually destined to failure. The better approach would be that each instance be reasonably unique (even if it was only so far as each user uploading their own images), that way a determined spammer might break one site, but he can't use what he did there to attack others, each time he'll have to start from scratch. The obvious downside is the massive cost - instead of one person spending a week linking images in the database, you'd need one person per company. Still, for a reasonably large company for whom spam is a big issue it might eventually still become the best option (and to periodically add to and remove from the database of existing images).
Re:Pretty cool stuff by david_thornley · 2010-08-06 04:15 · Score: 1

One problem with that is that even the dumbest program has a 25% chance of correctly identifying which of four pictures has a smile, or a kitty, or Yog-Sothoth. If I, as a human, want to get into a website, a 25% success rate is very discouraging. For a robot that doesn't require one particular success, and can try several times, it's quite sufficient.
The critical part of a captcha is that the viewer has to enter the correct one of a very large number of choices. It's really hard to do that with pattern recognition.
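The point about retries can be put in numbers. This is a minimal sketch assuming each attempt is an independent random guess among four pictures (p = 0.25) and that the site allows unlimited retries; the function name is made up for illustration.

```python
def p_at_least_one_success(p_single, attempts):
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts

# A blind guesser on a four-way picture captcha.
for n in (1, 5, 10):
    print(n, round(p_at_least_one_success(0.25, n), 3))
```

Under those assumptions a blind guesser gets through about 76% of the time within five tries and about 94% within ten, which is why a small answer space helps a robot far more than it helps a person.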

--
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes

Speaking about re-captcha by imsabbel · 2010-08-05 09:08 · Score: 3, Informative

I recently went to their homepage and looked _really_ hard for any statistics about which books are transcribed. I read their Science paper. Tried all sections.
It's all about the captcha part, and _nothing_ about the RE.
The way they state how it works ("We are using 100.000 unique words") sounds like they have given up on that part long ago and just recycle their old database again and again...

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?

Re:Speaking about re-captcha by icebraining · 2010-08-05 09:14 · Score: 4, Informative

Currently, we are helping to digitize old editions of the New York Times and books from Google Books.
http://www.google.com/recaptcha/learnmore

--
Dilbert RSS feed
Re:Speaking about re-captcha by Mashiki · 2010-08-05 09:17 · Score: 1

Dunno. I've been seeing a lot of unique stuff recently like hebrew, chinese, japanese, and vertical lettering.

--
Om, nomnomnom...
Re:Speaking about re-captcha by MozeeToby · 2010-08-05 09:28 · Score: 1

You don't even need that, the attacker has access to everything, remember? They can just look at the file directly if it's predownloaded on the page or send the page the mouse over event for that element. I highly doubt that the people doing these algorithms are using a full web browser to pull and post data.
Re:Speaking about re-captcha by imsabbel · 2010-08-05 09:49 · Score: 4, Interesting

Hm.
So it's for-profit work for the biggest advertising firm in the world.
Sort of expected project gutenberg or something.
Too bad.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
Re:Speaking about re-captcha by Cyberax · 2010-08-05 11:32 · Score: 1

As far as I understand, this data will be publicly available on Google Books.
Re:Speaking about re-captcha by al0ha · 2010-08-05 12:02 · Score: 1

>> The way they state how it works ("We are using 100.000 unique words") sounds like they have given up on that part long ago and just recycle their old database again and again...

I think they recycle the database of words that are known, not the ones also proffered up that are unknown to the API, so the book building continues.

Entering the known word gets you past the gate, and once you've done that they assume that you also correctly answered the word they don't know. As far as I understand the unknown word is also offered up more than once to validate against multiple responses. The link below is an interesting outline of how some people figured out the known words were being recycled and exploited it.

http://www.theregister.co.uk/2010/03/01/ticket_scalping_hack/

--
Did you ever wake up in the morning, with a Zombie Woof behind your eyes? -- FZ
Re:Speaking about re-captcha by martin-boundary · 2010-08-05 12:08 · Score: 2, Insightful

Google books isn't really public, though. You can only view a small number of pages of each book, which is pretty useless from the point of view of public uses that come to mind.
Re:Speaking about re-captcha by Cyberax · 2010-08-05 12:12 · Score: 1

Google Books allows you to view and download entire books, if they are in the public domain.
Example:
http://books.google.com/books?id=Q_rLGDGlQz0C&printsec=frontcover&dq=Mark+Twain&hl=en&ei=HlNbTOrmKtWN4gb-q9j3AQ&sa=X&oi=book_result&ct=result&resnum=2&ved=0CCsQ6AEwAQ#v=thumbnail&q&f=false
Re:Speaking about re-captcha by bill_mcgonigle · 2010-08-05 15:16 · Score: 2, Insightful

So it's for-profit work for the biggest advertising firm in the world.
Sort of expected project gutenberg or something.
Google's digitizing hundreds of thousands of historic books from some of the great university libraries. What's the problem here, that they won't lose money on the effort?
The NYT archive has been done for at least a year, it made reCAPTCHA a feasible company.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:Speaking about re-captcha by extra88 · 2010-08-05 23:29 · Score: 1

I think you're underestimating the public good from what Google provides but even so, the universities get their own copy of the data for their books so they can do even more with it, as copyright allows.
Re:Speaking about re-captcha by jonadab · 2010-08-06 02:11 · Score: 1

> You can only view a small number of pages of each book, which is
> pretty useless from the point of view of public uses that come to mind.

It's useful to the public for the purpose of search -- being able to *find* what you're looking for, even if it's not in the bibliographic data and only appears embedded in the text someplace. I wish the catalog computers at the public library could do that. It brings us one step closer to LCARS.

--
Cut that out, or I will ship you to Norilsk in a box.

Can the mouse cursor be positioned by a script? by master_p · 2010-08-05 09:11 · Score: 1

If not, then the captcha should only be visible when the mouse cursor is over it.

The key to a successful captcha is to make it accessible only by a user sitting in front of the screen.

Re:Can the mouse cursor be positioned by a script? by machxor · 2010-08-05 09:18 · Score: 1

It's fairly trivial to position the mouse with AutoIt, and it's scriptable.
Re:Can the mouse cursor be positioned by a script? by Lehk228 · 2010-08-05 09:28 · Score: 1

Even if it couldn't be done normally, a hostile client could say the cursor is over the captcha just as easily as it could place the cursor there.

--
Snowden and Manning are heroes.
Re:Can the mouse cursor be positioned by a script? by AusIV · 2010-08-05 10:31 · Score: 1

As a couple of ACs have pointed out, the people breaking CAPTCHAs aren't using browsers, they're using scripts. They don't care if a DOM element is hidden, or if they have to make an extra ajax request of some sort. The scripts will be tailored to the CAPTCHA they're trying to break, and you can't keep a script from getting a hold of something that you plan to show a human.
Re:Can the mouse cursor be positioned by a script? by IBBoard · 2010-08-05 20:09 · Score: 2, Insightful

Remember, iPads and touch-screens can't do hover. Plus there's the whole disability accessibility aspect as well ;)

I'm a computer, apparently by El_Muerte_TDS · 2010-08-05 09:11 · Score: 2, Funny

It looks like that tool is better at deciphering the captchas than I am.

far from it by MagicM · 2010-08-05 09:12 · Score: 3, Informative

I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.

Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.

Re:far from it by NegativeK · 2010-08-05 09:18 · Score: 2, Informative

35% * 35% ~ 12%. And that ignores that one word is a known control, while the other is a word they're trying to OCR.

--
This statement is false.
Re:far from it by BarryJacobsen · 2010-08-05 09:21 · Score: 2, Informative

I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.
Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
My understanding is that only one of the words needs to be correct, but it has to be the "right" one. reCAPTCHA presents two words: one it's very certain about and one it's less certain about. You have to get the one it's very certain of in order to pass.

--
Track your TV Shows with your iPhone - FREE
Re:far from it by MagicM · 2010-08-05 09:33 · Score: 1

All I'm saying is that just because the algorithm got 30% of the words right doesn't mean that it can "solve the current CAPTCHA at an efficacy of 30%".
Re:far from it by MagicM · 2010-08-05 09:39 · Score: 1

Actually I guess that's not what I'm saying, because I said "1%" which was wrong. You may consider my face egged.
Re:far from it by sexconker · 2010-08-05 09:42 · Score: 1

Only ONE word needs to be correct for recaptcha.
There is a known word you are tested against, and an unknown word pulled from a database of shit they scanned.
Solving the known word correctly means you probably also got the unknown word correct. They then pool the "correct" submissions for the unknown words and see what the most common ones are.
I don't know if this is completely automated or if they have an intern monkey clicking "yes" or "no" for unknown words and probable solutions, but the whole "crowd sourcing OCR for a bunch of shit we scanned" is the POINT of recaptcha.
Re:far from it by sexconker · 2010-08-05 09:45 · Score: 1

All I'm saying is that just because the algorithm got 30% of the words right doesn't mean that it can "solve the current CAPTCHA at an efficacy of 30%".
Yes, yes it fucking does.
"Solving" a captcha - to an attacker or a legitimate user - means getting past the damned popup and creating your account, posting your /. obama poop copypasta troll, etc.
Being correct with regards to the OCR means nothing.
Re:far from it by rm999 · 2010-08-05 09:45 · Score: 2, Informative

You are right, there is no need to get both words right.
But, your 35% * 35% calculation assumes the recognition difficulty of the words is independent, which is a bad assumption in this case; the OCR word is one that is known to be hard to guess. It is probably more like 35% * 5% or something.
Re:far from it by IICV · 2010-08-05 09:51 · Score: 1

Not necessarily; I'm not sure exactly how reCAPTCHA works, but in theory they don't know one of the words - in fact, that other word may very well be unknowable, due to smearing or just not being a word (that happened to me the other day actually, I got one word and one thing that looked like a Farsi character). Thus, if you successfully guess the correct thing for the "known" word, it doesn't really matter what you guess for the "unknown" word as long as it's close or at least something a human might type.
Therefore, making the big assumption that this system correctly guesses both "known" and "unknown" words with equal chance, the algorithm's expected "win" percentage would be about 17%, not 1% as you claim.
Of course, I bet you anything that if reCAPTCHA gets a lot of wrong answers from a given IP address, they'll start sending pairs of known words in order to detect this sort of thing and to prevent pollution of their databases. That would give this algorithm a 1% win chance.
Re:far from it by hydrofix · 2010-08-05 09:53 · Score: 5, Informative

Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
Actually, that is incorrect. One word is already positively known by the OCR and serves as a control, while the other is the one that the OCR could not read. It will of course only check the one that it knows, and assumes the other one is then correct as well. So, if you get one of the words correct AND this is the same word that their OCR identified correctly (which is very likely the case), then you pass, but most of the time (99%) give a bad answer for the harder, non-OCR word. Sadly, this leads to pollution of their database in the long run.
Re:far from it by retchdog · 2010-08-05 10:15 · Score: 2, Insightful

Interesting. If this is true as stated, and one knew/modeled OCR performance, you could use this information in some cases to pick out the plum and boost the crack...

--
"They were pure niggers." – Noam Chomsky
Re:far from it by petermgreen · 2010-08-05 10:30 · Score: 1

I seem to remember reCAPTCHA claiming that if they think they are being screwed with, they switch to sending two known words rather than one known and one unknown.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:far from it by mugurel · 2010-08-05 10:37 · Score: 1

35% * 35% ~ 12%.
35% corresponds to P(A or B), not P(A), and you don't know whether P(A)=P(B), nor whether A and B are independent.
P(A or B) = P(A) + P(B) - P(A and B)
.35 = P(A) + P(B) - .01
P(A) + P(B) = .35 + .01 = .36
If we assume that P(A) = P(B) = .36/2 = .18, then that implies that P(A) and P(B) are not independent, since .18*.18 != .01. And if they are independent P(A and B) = P(A)P(B), they can not be equal.
It looks more likely that P(A) and P(B) are independent and unequal, than that P(A) and P(B) are equal and dependent. It could be that the first word is often shorter than the second (or v.v.).
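The "independent and unequal" case can be worked out explicitly. A minimal sketch, using the rounded figures from the comment above (P(A or B) = 0.35, P(A and B) = 0.01) and assuming independence, so that P(A) and P(B) are the two roots of x^2 - (P(A)+P(B))x + P(A)P(B) = 0. The function name is hypothetical.

```python
import math

def solve_independent(p_either, p_both):
    """Recover P(A) and P(B) from P(A or B) and P(A and B),
    assuming A and B are independent."""
    total = p_either + p_both   # P(A) + P(B), by inclusion-exclusion
    product = p_both            # P(A) * P(B), by independence
    disc = total * total - 4.0 * product
    if disc < 0:
        raise ValueError("no real solution: independence cannot hold")
    root = math.sqrt(disc)
    return (total + root) / 2.0, (total - root) / 2.0

p_a, p_b = solve_independent(0.35, 0.01)
print(round(p_a, 3), round(p_b, 3))

# The equal-probability case for comparison: 0.18 * 0.18 is not 0.01.
print(round(0.18 * 0.18, 4))
```

Under independence the two per-word rates come out at roughly 0.33 and 0.03, which fits the idea that one of the two words is much harder for the algorithm than the other.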
Re:far from it by retchdog · 2010-08-05 10:42 · Score: 1

meh. never mind. it'd only take twice as long at most, to just do your best on both. duh.
i guess if there were a limited number of attempts you might use this to decide which ones to attempt vs. reload.

--
"They were pure niggers." – Noam Chomsky
Re:far from it by mysidia · 2010-08-05 11:04 · Score: 1

The order is random... you don't know which word is the first word and which is the less-certain one. Only reCaptcha knows that.
Re:far from it by omnibit · 2010-08-05 12:12 · Score: 1

Mod parent up - I'm out of points!
Re:far from it by Jorl17 · 2010-08-05 13:20 · Score: 4, Informative

This is not informative, as many have said. If you read http://www.google.com/recaptcha/learnmore , you'll get it.

Here is the deal: reCAPTCHA presents two words. One is picked by it and is previously known. The other one is a word from a book that has been scanned. Said word is unknown to the reCAPTCHA system. When the user enters both words, reCAPTCHA checks to see if the known word has been properly recognized. If that is the case, then reCAPTCHA can assume that a human is answering. Given that a human is answering, the second, unknown word given by the human is most likely correct, because he/she will be able to recognize it as well. Using this system, reCAPTCHA works as a CAPTCHA (spam prevention) mechanism and also helps transform old books/papers, such as the New York Times, into digital format.

So, in practice, only one word has to be correct -- the word that reCAPTCHA knows. What's sad is that bots may contribute incorrect second words...

Next time, get informed before going all crazy.

And here is the relevant info, quoted from the aforementioned website:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly. But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
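The check described in the quote can be sketched roughly as follows. This is a hedged illustration, not reCAPTCHA's code: the function name, the image id, the in-memory vote store, and the case-insensitive comparison are all assumptions made for the example.

```python
from collections import defaultdict

# Votes collected per unknown (book) word image, keyed by a hypothetical image id.
unknown_word_votes = defaultdict(list)

def check_submission(control_answer, control_solution,
                     unknown_image_id, unknown_answer):
    """Pass the challenge only if the known (control) word matches.

    The answer for the unknown word is never graded. It is recorded as
    one vote, to be compared later with other people's answers."""
    # Case-insensitive matching is an assumption of this sketch.
    passed = control_answer.strip().lower() == control_solution.strip().lower()
    if passed:
        unknown_word_votes[unknown_image_id].append(unknown_answer.strip().lower())
    return passed

print(check_submission("Morning", "morning", "img-1", "upon"))   # control word right
print(check_submission("mourning", "morning", "img-1", "xyz"))   # control word wrong
print(unknown_word_votes["img-1"])
```

Only the control word decides whether the user gets through; the second answer just accumulates as evidence, which is why a solver that gets the control word right but the book word wrong still passes.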

--
Have you heard about SoylentNews?
Re:far from it by 42forty-two42 · 2010-08-05 16:07 · Score: 1

Since recaptcha only actually checks one of the words, you actually have a 0.35 * 0.50 chance, or 17.5% chance of success. Of course, since google will just plug this back into their OCR algorithms, and recaptcha only uses things their OCR algorithms failed on in its captchas, any such advances are only temporary in nature.
Re:far from it by twocows · 2010-08-05 22:09 · Score: 1

That's not true, though. The way reCaptcha works is that only one word needs to be correctly solved. It's actually relatively easy for a human to tell which one needs to be solved; it's often the longer one or the one with unusual characters.
Re:far from it by twocows · 2010-08-05 22:11 · Score: 1

Er, the longer one or the one with unusual characters that *doesn't* need to be solved, that is.
Re:far from it by delinear · 2010-08-06 00:42 · Score: 1

It's a little confusing but the presentation seems to suggest if you get one of the words wrong too many times in succession - 32 it claims - it will log your IP and force you to get both right (I'm not sure how that works, maybe it uses a word with a large number of identical responses, or maybe it just generates two words it knows). It doesn't say if this is reset after you have successfully got both right, if it was you could still brute force it (the counter would reset after an average 100 entries if there's a 1% chance of matching both, in which case for every 132 entries you'd still get about a quarter of them through which is not a bad hit rate), if not the suggestion is "dynamic IP" - not particularly useful if you plan to spam thousands of these things.
Re:far from it by delinear · 2010-08-06 00:46 · Score: 1

You can make a reasonable guess as a human - the unknown word is usually unknown for a reason (archaic type rendering two letters together, an ink smudge on one letter, etc) which a human can spot with reasonable accuracy once they know the trick. If they introduced some of these tricks on the known words they might make it more difficult (and of course, just because a human can spot the difference reliably, doesn't mean it's easy to do in software, as supported by the results in the video).
Re:far from it by delinear · 2010-08-06 00:52 · Score: 1

That depends - you say you're assuming there's an equal chance of spotting a known and an unknown word. It's more likely that his software is better at recognising the word that has been generated by an algorithm rather than some randomly obscured word (that was then run through the algorithm), so while the success rate for two known words might not be 30% there's a good chance that it would be better than 1%, since that figure is factoring in the ability to recognise a word that even Google's OCR couldn't reliably recognise.

Plugin not needed... by knarf · 2010-08-05 09:13 · Score: 3, Informative

There's probably an excellent Firefox plugin to render this page's color scheme more bearable

No plugin needed:

View->Use Style->None

That is what it looks like in Seamonkey, Firefox will be similar. This more or less always works.

--
--frank[at]unternet.org

Re:Plugin not needed... by interval1066 · 2010-08-05 09:47 · Score: 1

Or if you're using ff 3.6....; View->Page Style->No Style.

--
Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
Re:Plugin not needed... by c++0xFF · 2010-08-05 10:32 · Score: 1

This more or less always works.
You're not joking: it even makes the Time Cube site somewhat readable!

Hmm by Tailhook · 2010-08-05 09:15 · Score: 5, Funny

Should I run the DEFCON presenter's giant SWF or not?

o_O

--
Maw! Fire up the karma burner!

Re:Hmm by machxor · 2010-08-05 09:25 · Score: 2, Funny

Why not? You run Firefox, right? If yes then you have no worries because it's not full of holes like IE is...
Re:Hmm by Chad+Birch · 2010-08-05 09:33 · Score: 1

You are disturbingly misinformed.

--
Sturgeon was an optimist.
Re:Hmm by machxor · 2010-08-05 09:45 · Score: 1

And you don't understand sarcasm... Or maybe I fail at it... Either way, cheers :-)
Re:Hmm by Monkeedude1212 · 2010-08-05 10:06 · Score: 2, Insightful

I'm glad YOUR common sense kicked in before hundreds of others.

Bad Hacking by pz · 2010-08-05 09:16 · Score: 4, Insightful

Why would anyone want to do this? It's like attacking the UN peace keeping troops or the Red Cross. reCAPTCHA is doing good work, digitizing scanned printed books so that the text can be made available for online searching. Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.

Re:Bad Hacking by kyrio · 2010-08-05 09:28 · Score: 2, Informative

4chan already broke it.
Re:Bad Hacking by Dhalka226 · 2010-08-05 09:31 · Score: 5, Insightful

No one benefits from reCAPTCHA being broken. No one.
Spammers.
Re:Bad Hacking by maxume · 2010-08-05 09:32 · Score: 5, Insightful

Actually, it could be of use to reCAPTCHA, they can just pass their test words through this system before they make them public and then use the output to help prevent similar attacks.

--
Nerd rage is the funniest rage.
Re:Bad Hacking by Purity+Of+Essence · 2010-08-05 09:35 · Score: 1

Advertisers benefit. Or rather, people who sell advertising and SEO services and work automated lead/sales referral systems. Their clients are probably hurt by all the forum spam done in their name. Look around you. Wherever there is money being made, there are assholes joining in.

--
+0 Meh
Re:Bad Hacking by rbcd · 2010-08-05 09:37 · Score: 1

The field of AI is advanced as CAPTCHAs are broken (eg: OCR). The great thing is that spammers work on this for us, too. When humans and computers cannot be separated, then we'll have computers that can pass the Turing test. AI research will have finished.
Re:Bad Hacking by Flyne · 2010-08-05 09:42 · Score: 4, Insightful

The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities, since reCAPTCHA by design uses words which current OCR abilities are inadequate for. This is a good thing for AI and computer vision and text digitization.
Re:Bad Hacking by maxume · 2010-08-05 09:45 · Score: 1

Right, because human level intelligence is the obvious upper limit.

--
Nerd rage is the funniest rage.
Re:Bad Hacking by beothorn · 2010-08-05 09:53 · Score: 1

If it can be broken it must be broken.
Re:Bad Hacking by sbayless · 2010-08-05 09:58 · Score: 5, Insightful

No one benefits from reCAPTCHA being broken. No one
You couldn't be more wrong. Sure, breaking reCAPTCHA would create a headache for website admins (including me, for example), but in order to break reCAPTCHA someone has to devise a better text recognition program. And that's great news! This is an example of a general side effect of the cat and mouse game that is captchas. Captchas are a simple form of Turing Test, where website admins are trying to determine who is a computer and who is a real human being. Every time a captcha gets broken, we get a sophisticated new algorithm for doing something that previously only humans could do (or only humans could do well, at least).
Re:Bad Hacking by rbcd · 2010-08-05 10:01 · Score: 1

That's an entirely separate and irrelevant discussion.
Re:Bad Hacking by cant_get_a_good_nick · 2010-08-05 11:07 · Score: 1

It's not about breaking reCaptcha, it's about avoiding the reCaptcha hurdle on all the sites that use it. If a site put up a captcha, there's some resource it's protecting that other people want. This is a way to get it in a bulk way, therefore economically cheaper.
And you think that a person who can benefit with a fat check will care about some abstraction that they're polluting the village well? For money, people sell drugs that kill people. This is nothing compared to that.
Re:Bad Hacking by mysidia · 2010-08-05 11:09 · Score: 2, Insightful

reCaptcha, and indeed all Captchas have a fundamental flaw.... advances in computer vision will eventually render them all obsolete.
Most of the CS knowledge is already around to totally defeat captchas of this sort... it's only an Engineering question. They will most likely get broken when sufficiently unethical engineers are hired by sufficiently wealthy spammers.
It's basically a known fact, that spammers will eventually break conventional captchas totally, by developing algorithms to guess captcha answers. It's only a question of when and how long will it take them to figure out all the systems that matter.
This does not mean it is a respectable thing for people to specifically target Captcha and attempt to hasten its demise.
reCaptcha is a big one... but there are other Captcha systems that matter (like Google's).
And there are other ways around them besides software algorithms... Amazon-style mech turk, for example... find a few thousand folks in certain countries to pay $0.05/hour for breaking captchas, and suddenly reCaptcha is no longer a boundary.
Re:Bad Hacking by Timmmm · 2010-08-05 11:10 · Score: 3, Insightful

The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities
No it isn't. Well, not unless you read books with wavy crossed-out words and don't mind 30% accuracy.
Re:Bad Hacking by mysidia · 2010-08-05 11:12 · Score: 1

A 30% recognition rate is no good for useful OCR. It's only that beneficial when breaking Captchas.
30% just means you have to retry the captcha a few times.
Re:Bad Hacking by mysidia · 2010-08-05 11:14 · Score: 2, Insightful

Except the algorithm doesn't really do that... to defeat the captcha, it only needs to get it right about 10 or 20% of the time, to give the malicious script a "good enough guess" to brute-force the Captcha with 5 or 6 retries.
As long as the number of retries is less than what a fair percentage of humans require....
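To put rough numbers on that, here is a minimal sketch, assuming each retry is an independent guess with the same per-attempt accuracy (which a real site may not allow, e.g. if it rate-limits or serves harder images after failures):

```python
def p_pass_within(p, tries):
    """Probability of at least one success in `tries` independent attempts."""
    return 1 - (1 - p) ** tries

for p in (0.10, 0.20, 0.30):
    chances = [round(p_pass_within(p, n), 2) for n in (1, 3, 6)]
    print(f"per-attempt accuracy {p:.0%}: after 1/3/6 tries -> {chances}")
```

With six retries, 10% per-attempt accuracy already gets through about 47% of the time, and 20% gets through about 74% of the time.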
Re:Bad Hacking by cant_get_a_good_nick · 2010-08-05 11:17 · Score: 1

4chan didn't quite break it, more like they broke Time's form implementation. They did a lot of 'hacks' but most was on how Time handled the poll - they didn't use any CAPTCHA at the beginning, then took the form offline, but not the voting script, so 4chan voted well past the cut off time, with millions of monkeys voting.
see reCaptcha blog and this well written article
Re:Bad Hacking by shird · 2010-08-05 11:45 · Score: 1

No the OP is pretty much right. 4chan has now implemented reCaptcha, yet is still getting hammered with spam. Thus some spammer using 4chan has managed to find a way around it with a pretty good success rate.

--
I.O.U One Sig.
Re:Bad Hacking by Osso · 2010-08-05 12:17 · Score: 1

Lots of books are less than 30% accurate :)
Re:Bad Hacking by lennier · 2010-08-05 13:39 · Score: 1

No one benefits from reCAPTCHA being broken. No one.
But wouldn't a universal algorithmic crack for reCAPTCHA imply an algorithm that could automatically tell the difference between a correct OCR transcription and nonsense? So just fold that algorithm into future open-source OCR libraries and watch the recognition rate soar. We're using black hat hackers to write AI code for us, and everyone wins! *
* Except John Connor, after Skynet starts reading the Hollywood rejected scripts vault.

--
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
Re:Bad Hacking by bill_mcgonigle · 2010-08-05 15:20 · Score: 1

I went to a talk by one of their founders last year. The company philosophy is that the day their business is obsolete will be a great day for humanity.
That's pretty rare.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:Bad Hacking by discord5 · 2010-08-05 19:49 · Score: 1

Why would anyone want to do this?
Because CAPTCHA in essence is a flawed system, also because it's a fun "puzzle" and if you don't try to solve it someone else will.

It's like attacking the UN peace keeping troops or the Red Cross.
That's an overstatement if I've ever seen one, like that guy who posted here saying he felt raped when his employer didn't pay him. People just love to make controversial statements on slashdot. "Oh my god, someone found a bug for SSH and wrote an exploit. It's like that time some guy bashed in the window of my car with a rock and took my laptop, only to look up my personal information, come to my home and abduct my dog for ransom."

Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.
reCAPTCHA is about CAPTCHAs, and using CAPTCHAs to get stuff digitized. No matter how "good" the intentions are, the captcha system is horribly flawed and there is an increasing number of libraries popping up to break captchas. I would say that any real advances made in breaking captchas have the benefit of giving us better OCR in the short run, and in the long run maybe we get to see a better solution for the problems captchas try to deal with.
Anyway, if this guy isn't trying it, a clever spammer will and trust me, they are trying. Captcha protected sites aren't really that impervious to spam lately.
Re:Bad Hacking by xtracto · 2010-08-05 20:32 · Score: 1

That, and a better JDownloader ^_^

--
Ubuntu is an African word meaning 'I can't configure Debian'
Re:Bad Hacking by whm · 2010-08-05 20:34 · Score: 1

No, it's a win-win situation. If they cannot solve reCAPTCHA then we get website security. If they do solve it, it means that we can digitize all of that content without any human interaction. This is great news.
Re:Bad Hacking by DrXym · 2010-08-05 20:37 · Score: 1

Well if some hacker can crack recaptcha then so can some random spammer. I run a website with captcha protection and the ratio of spambots to real humans has reached epidemic proportions. 95% of new applicants are spambots meaning they have managed to crack the image despite it being set to the maximum setting. Fortunately all new registrants require approval so even if they pass the captcha I can still weed them out but it's still a pain in the arse.
Captcha schemes are quite good but they are vulnerable to a class break. I'm currently considering creating a unique challenge for my site to stop this crap. I think if I put a simple form next to it that had to be filled in it would stop the drivebys dead.
Re:Bad Hacking by orasio · 2010-08-06 00:40 · Score: 1

Why would anyone want to do this? It's like attacking the UN peace keeping troops or the Red Cross. reCAPTCHA is doing good work, digitizing scanned printed books so that the text can be made available for online searching. Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.
Good OCR is more valuable than good captchas.
Re:Bad Hacking by delinear · 2010-08-06 01:17 · Score: 1

I think the point is more that, once a computer can simulate the best of human thinking, it will be more productive to have the computer think about these issues. The computer can process the data faster and doesn't need food or sleep, and it's far easier to duplicate a computer process multiple times than to create more genius level humans (not to mention far quicker, none of that messy growing up and puberty stuff to deal with, just load the disk image and go).
Re:Bad Hacking by maxume · 2010-08-06 01:46 · Score: 1

So when I unpack "AI research will have finished." I am supposed to realize that the statement is qualified to humans doing AI research and that the research itself will really have just gotten started?

--
Nerd rage is the funniest rage.
Re:Bad Hacking by pz · 2010-08-06 09:05 · Score: 1

No one benefits from reCAPTCHA being broken. No one
You couldn't be more wrong. Sure, breaking reCAPTCHA would create a headache for website admins (including me, for example), but in order to break reCAPTCHA someone has to devise a better text recognition program. And that's great news! This is an example of a general side effect of the cat and mouse game that is captchas. Captchas are a simple form of Turing Test, where website admins are trying to determine who is a computer and who is a real human being. Every time a captcha gets broken, we get a sophisticated new algorithm for doing something that previously only humans could do (or only humans could do well, at least).
No. No, no, no. Doing research into OCR and publishing the results is fantastic. It makes the world a better place.
Showing that you have written software to cheat a system that is in wide use to benefit society is morally wrong. It is bad.
Devising a better Turing test is good, coming up with a way of cheating the current one is bad.
Devising a better way to recognize counterfeit currency is good, coming up with a new way to counterfeit currency is bad.
Constructive behavior, good; destructive behavior, bad. Do I need to make it clearer?
Breaking reCAPTCHA is bad.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.

Readability by pgn674 · 2010-08-05 09:19 · Score: 1

There's probably an excellent Firefox plugin to render this page's color scheme more bearable.

I like using a Readability bookmarklet in my bookmarks bar: Readability - An Arc90 Lab Experiment

Re:Offtopic by Anonymous Coward · 2010-08-05 09:29 · Score: 4, Informative

No, Firefox addons used to be called extensions, plugins are still plugins.

Re:So many better ways than recaptcha by sugarmotor · 2010-08-05 09:45 · Score: 1

You wrote, "There is more than enough written and audio samples that the world would love to see OCR'ed." -- Where do you get those?

--
http://stephan.sugarmotor.org

Is this related? by Khyber · 2010-08-05 09:48 · Score: 4, Interesting

Anybody that pays attention to 4chan recently knows they had to implement captcha due to a massive spamflood of infected morons. recaptcha got busted thanks to someone in /g/ who leaked the vulnerability in the sound system for reCAPTCHA, and the whole site was again inundated with spam, though not to the degree as the original spam attack.

--
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.

Re:Is this related? by Dice · 2010-08-05 10:36 · Score: 1

The audio vulnerability is unrelated, and more effective than the algorithm presented in TFA.
Re:Is this related? by Monkeedude1212 · 2010-08-05 11:26 · Score: 1

Anybody that pays attention to 4chan
What, all 200 of them?
Re:Is this related? by MostAwesomeDude · 2010-08-05 17:53 · Score: 1

I was going to note this. It's to the point where 4chan is actively exhausting the reCAPTCHA dictionaries, causing interesting words to show up. Half the time, one of the words is some kind of kanji or untypeable symbol now.

--
~ C.

Re:So many better ways than recaptcha by JesseMcDonald · 2010-08-05 09:49 · Score: 3, Informative

There is ZERO reason to use worthless tests like these as opposed to using real identification. That is instead of using computer generated difficult test, use actual pictures of actual 'difficult text' that an OCR agent failed to identify. Each person is given one already tested sample and one unknown sample. If you get the already tested sample, then your answer is accepted as 'probably' correct for the unknown sample.

Congratulations, you've just described ReCAPTCHA! This is exactly how the current system works.

--
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat

Re:The video shows nothing but failures by sexconker · 2010-08-05 09:49 · Score: 1

The percentages shown are a running total of all the captchas tested against in that run.

b is the % of cases where BOTH words were correctly recognized
s is the % of cases where AT LEAST ONE word was correctly recognized

You only need to know ONE word to pass a recaptcha captcha. Though it has to be the CORRECT word, and I don't know if the developers of this program knew which word was known, or if they took that into account when displaying the percentages.

The worst case scenario is that they can solve it about 1/6th of the time (getting one right 1/3 of the time, and having it be the right one 1/2 of those times). It stands to reason, however, that the "known" captchas (the ones recaptcha tests against) are the ones that are easier to solve, and thus, the actual success rate is indeed about 33%.

Re:Offtopic by vlueboy · 2010-08-05 09:52 · Score: 1

From TFS:

There's probably an excellent Firefox plugin to render this page's color scheme more bearable.

Halfway through this sentence I realized someone will now implement a nice little extension such that I never again have to answer these recaptchas. Pretty sure they would break this extension shortly with cunning, though. Anyway, at 30% accuracy now, it's easier to <F5> or click refresh 3 or 4 times than to get my hands off the mouse to type 2 word captchas that sometimes are eye-straining.

You don't have to reply here if you don't want to lose karma with such guilty-pleasure extension, brave spammers^Wcoders! :) I'll be googling the currently "virgin" string "captcha this fox" to find your work posted wherever.

Re:Offtopic by Cougar+Town · 2010-08-05 09:59 · Score: 3, Informative

Wrong. Plugins have been around since Netscape and are still called plugins. They have a different function than an extension (and an extension is what we would want in this case to fix the site's colours).

Both plugins and extensions, along with themes, are collectively referred to as "addons." "Plugin" is the wrong word in the summary. "Extension" or "addon" would have been acceptable.

How is this 30% accurate??? by mwvdlee · 2010-08-05 10:02 · Score: 3, Insightful

When it is claimed to be 30% accurate, I'd expect some 30% of all captchas being correctly guessed. Watching the video, I noticed the algorithm gives itself 30-40% scores for getting just one of the two words right or sometimes even for getting the right length and a few correct letters. Didn't watch it to the end, but in the few minutes I watched, ZERO entire captchas were solved. So that's ZERO% accurate in my book. For instance, actual captcha text "ware readiness", guessed captcha "votarry rehabbed", reported accuracy 38.24%... how the hell is that over 38% accurate? If you had that level of accuracy when trying to get past a captcha (which is pretty much the definition of it being vulnerable, right?), you wouldn't get past a single captcha. It's 30% accurate if it correctly guessed about 3 out of every 10 captchas, not if it fails every single captcha.

--
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?

Re:How is this 30% accurate??? by Timmmm · 2010-08-05 11:11 · Score: 1

You only need to get one (specific) word right.
Re:How is this 30% accurate??? by wdavies · 2010-08-05 13:05 · Score: 1

mod this up. I hadn't gotten the implication of the exploit either until now.
Re:How is this 30% accurate??? by afaik_ianal · 2010-08-05 14:20 · Score: 1

As a number of others have stated, the reCAPTCHA server only knows the answer to one of the words it's giving you. You only need to get the "easy" one right to be passed as a human. Getting the "hard" word right makes no difference in terms of passing the test.
If this were getting 30% accuracy on the hard words, then that would be *real* news.
I suspect this is getting slightly lower success than they're reporting, as that 38% figure is assuming they're only getting the easy words right, but in actual fact they're bound to get only the hard one right every now and then.
Re:How is this 30% accurate??? by yuhong · 2010-08-05 17:44 · Score: 1

That raises a question. If most of the time when this succeeds they only get the easy word right, how would that pollute the digitization of the text?

Re:So many better ways than recaptcha by Lunix+Nutcase · 2010-08-05 10:35 · Score: 1

In other words.... use reCAPTCHA?

Re:Offtopic by CarpetShark · 2010-08-05 10:36 · Score: 1, Funny

A Firefox extension is not the same thing as a plugin.
Firefox plugins ***used*** to be called Firefox extensions. You must just be too young to know this.

It's a bit like watching scary women fight over who was married to some guy first on Ricky Lake.

On the bright side by MadGeek007 · 2010-08-05 10:49 · Score: 1

Maybe this hack can be used to improve book scanning.

Re:Offtopic by easyTree · 2010-08-05 11:27 · Score: 1

Wrong. Plugins have been around since Netscape and are still called plugins. They have a different function than an extension (and divine intervention is what we would want in this case to fix the site's colours).

--
Requiem for the American Dream

Re:Offtopic by MobileTatsu-NJG · 2010-08-05 11:52 · Score: 1

A Firefox extension is not the same thing as a plugin.

B.F.D.

--

"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

Re:Bad Hacking vs Penetration Testing by billstewart · 2010-08-05 12:14 · Score: 1

If reCAPTCHA's too easily breakable, then Bad Guys will figure out how, and will start exploiting sites that use reCAPTCHA for protection.

So we need to know how vulnerable it is, and the reCAPTCHA folks need to figure out how to fix it. It's an arms race, always has been, probably always will be.

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks

%30 seems about right by dilvish_the_damned · 2010-08-05 12:19 · Score: 1

since thats about the accuracy of a human

--
I think you underestimate just how much I just dont care.

New Human Verification Scheme by BlueMonk · 2010-08-05 12:39 · Score: 3, Interesting

Seeing this article gave me an idea to come up with a new human verification process. I created a C# program in about an hour that loads images from Google images based on searching for 3 of 2000+ nouns. It shows 3 examples of each noun and asks the user to pick the correct noun from a list of 6. This program is just a proof of concept of course. Could this become useful? (Binary and source code included.)
http://enigmadream.com/misc/HumanVerification.zip

Re:New Human Verification Scheme by afaik_ianal · 2010-08-05 14:37 · Score: 1

Nice idea, but I can break it trivially with my Android phone: Open "goggles", point at screen, click, and "Similar Images" gives me the answer (or a multi-word answer containing the answer you're looking for).
I have to keep my hand really still as I take the shots though, so perhaps a bit of image distortion would be enough to work around that.
Re:New Human Verification Scheme by KahabutDieDrake · 2010-08-05 15:54 · Score: 2, Interesting

If you used something that wasn't a public resource based around text strings, then yes.

Better still... show a bank of images, ask which one has a happy little girl in it. (all images contain a girl, only one obviously happy). Randomize the backend with a cryptographic routine (so the file names don't give anything away) and you are set for a while. Computers are terrible at such things, people are pretty good at it.
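One way the "cryptographic routine" part might look, as a sketch only: serve each image under a per-challenge keyed hash instead of its real file name. The key handling, the helper names and the 32-character truncation are all assumptions for illustration, and this does nothing about the separate problem of someone hashing the image bytes themselves.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; never sent to the client.
SERVER_KEY = secrets.token_bytes(32)

def opaque_name(image_id, session_nonce):
    """Derive a per-challenge name that reveals nothing about the file."""
    msg = f"{session_nonce}:{image_id}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()[:32]

def build_challenge(image_ids, correct_id):
    """Return shuffled opaque names to show, plus the expected answer."""
    nonce = secrets.token_hex(8)  # fresh per challenge, so names never repeat
    shown = [opaque_name(i, nonce) for i in image_ids]
    secrets.SystemRandom().shuffle(shown)
    return shown, opaque_name(correct_id, nonce)
```

The server keeps the returned answer token in the session and compares it against whichever name the user clicks; the same image gets a different name in every challenge.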
Re:New Human Verification Scheme by BlueMonk · 2010-08-05 23:08 · Score: 1

It would be hard to come up with a bank of images as large as Google images. It has to be quite large because otherwise someone could create a database of picture checksums unless some sort of distortion was applied.
I don't specifically see how detecting a happy girl in a picture is better than picking a description of a picture. Personally I would think picking a description that looks like a good categorization of 3 images (as the program does) is even more difficult to fake. It requires more than a single ability; a large vocabulary of words and experience. Maybe your idea is better in that it's easier for people with a limited vocabulary.
Re:New Human Verification Scheme by BlueMonk · 2010-08-05 23:11 · Score: 1

You didn't look at the program or read my post carefully enough. You have to do this 3 times in a row, which yields less than 1% probability of randomly guessing a correct answer. With a 1 hour delay after 3 bad attempts, this would significantly limit automated passage through the system.
Re:New Human Verification Scheme by BlueMonk · 2010-08-06 00:17 · Score: 1

Interesting. I'm not familiar with that app/feature (I don't have a smart phone). It gives you a list of words along with the similar pictures you say? I wonder what the actual probability of the result containing the same word as the test is. The actual test shows 3 pictures and it's up to the human to pick the common element out of them. I wonder if the android app could/would find the same word in 3 lists (or 2 of the 3).
Yes, distortion is another thing that occurred to me that would probably be easy to add if it would introduce some degree of additional challenge to the automated systems without hindering real humans too much.

Re:Offtopic by Peach+Rings · 2010-08-05 12:42 · Score: 1

reCAPTCHA isn't bad, but Google's captchas are so hard they're probably more easily solved by learning algorithms than actual human beings.

Re:My eye's... by Peach+Rings · 2010-08-05 12:52 · Score: 4, Funny

You know a hacker is hard core when his site is monochrome in a monospace font, and he saves his files as straight up docx.

Let's hope they hit 100% by drinkypoo · 2010-08-05 13:04 · Score: 2, Interesting

Then we can just put reCAPTCHA on all pages being used for spam, and get transcription services for free.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:Let's hope they hit 100% by cmdrwhitewolf · 2010-08-09 03:05 · Score: 1

Bravo! I haven't seen such a display of flaming stupidity in a while. So, you disagree with Drinkypoo's viewpoint and that he can express it before you can, so you decide to devolve into name calling. Did anyone ever tell you that you might just be a bloody slow reader?
Well, at least now we have proof that the mod point system is totally indiscriminate, because it's giving out points to individuals who probably don't deserve them in the first place...

--
[Now, I'm off to lift my le... Um, visit... at another place.]

Re:My eye's... by Peach+Rings · 2010-08-05 13:08 · Score: 2, Insightful

By the way, that wasn't just a facetious comment. TFA isn't a serious paper. It's not even typeset, just typed into Microsoft Word. And god knows why I'm being warned about VBScript macros when I try to open it.

And this isn't a case where the little guy is making real scientific progress right under the nose of the obsolete establishment. The author doesn't even have a freshman understanding of big-O notation, it's completely juvenile.

Don't Worry by flimflammer · 2010-08-05 13:11 · Score: 1

reCAPTCHA is already on the road to beating this. When your images are on the verge of being discovered algorithmically, use Hebrew.

Re:My eye's... by hairyfeet · 2010-08-05 13:41 · Score: 5, Funny

You young ones and your complaining. "Ohhh the colors suck" SO WHAT! You don't remember when the Internet was invaded by those dual demons from hell, Geocities and Comet Cursors! Now THAT was torture buddy! YOU try dealing with a page that looks like it was designed by Unicorns on a crack binge, while having a fricking pocketwatch suddenly appear and hang from your cursor like a ball of snot on a string, all while having your shotgunned modems drug down to 300 baud land thanks to a bazillion puke inspiring GIFs spinning all out of time!

Now THAT is real suffering kid! /wanders off muttering/

--
ACs don't waste your time replying, your posts are never seen by me.

Captcha by Phat_Tony · 2010-08-05 14:26 · Score: 1

When OCR gets so good that recaptcha becomes pointless, my idea for the next step of harder-for-AI captchas is to stop using line art and start using gradients. That is, currently, they use text, which is line art, and then warp it, chop it up, and run miscellaneous clutter through it. It's getting harder and harder for people to read, and machines are still catching up.

I propose that if you start with a photograph, make a selection that's block text, feather the edges, then shift the colors in the selection (hue, saturation, inversion, remapping, whatever), it's going to be easier for humans and harder for computers than some of the stuff we've got now. But generating it can be automated just as easily; I scripted Photoshop to make these in a few minutes.

Here's an example

--
Can anyone tell me how to set my sig on Slashdot?

Re:Captcha by sugarmotor · 2010-08-05 16:33 · Score: 1

Sounds like a good idea. You still have to swap out the background photo though.
-> Bicycle ride captcha, where photos are taken from the video of the ride

--
http://stephan.sugarmotor.org
Re:Captcha by sugarmotor · 2010-08-05 17:33 · Score: 1

Actually "Type this text" is not that easy to read -- the usual problem with CAPTCHA development is finding that balance; when you can't use the same picture over and over again, it'll be difficult, I think.
How do you like the one at http://stephansmap.org/sign_up

--
http://stephan.sugarmotor.org

Re:So many better ways than recaptcha by SmlFreshwaterBuffalo · 2010-08-05 14:59 · Score: 1

...and how in the hell do you OCR an audio sample???

Multiple choice doesn't work for CAPTCHAs by mrnobo1024 · 2010-08-05 15:38 · Score: 2, Insightful

The spammers can just choose a random option until they get in. All that will do is slow them down a bit.

Re:Multiple choice doesn't work for CAPTCHAs by BlueMonk · 2010-08-05 23:02 · Score: 1

That's why you have to pick 1 of 6 choices 3 times in a row correctly. The probability of getting that right with completely random guesses is less than 1%. And if you combine it with a 1 hour delay after 3 bad attempts, that should be a significant impediment.
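
A quick check of that figure, assuming each of the 3 rounds is an independent, uniformly random pick among 6 options (the function name `random_guess_success` is just for illustration):

```python
from fractions import Fraction


def random_guess_success(choices=6, rounds=3):
    """Chance that purely random guessing gets every round right."""
    return Fraction(1, choices) ** rounds


if __name__ == "__main__":
    p = random_guess_success()
    print(p, f"{float(p):.2%}")
```

That works out to 1 in 216, a bit under half a percent.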
Re:Multiple choice doesn't work for CAPTCHAs by Arlet · 2010-08-06 01:53 · Score: 1

Forced delays become meaningless if the attacker has access to a large botnet
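
A back-of-the-envelope sketch of why, assuming the lockout is tracked per host, every host guesses randomly at the 1-in-216 odds from the parent post, and the time spent on the attempts themselves is ignored. The function name and parameters are hypothetical.

```python
def expected_hours_to_break(p_success, hosts=1, attempts_per_lockout=3,
                            lockout_hours=1.0):
    """Rough expected wall-clock hours until the first random success."""
    expected_attempts = 1 / p_success
    # Each host gets a fixed number of tries per lockout period.
    attempts_per_hour = hosts * attempts_per_lockout / lockout_hours
    return expected_attempts / attempts_per_hour


if __name__ == "__main__":
    p = 1 / 216
    for hosts in (1, 100, 10_000):
        print(hosts, "hosts:", expected_hours_to_break(p, hosts=hosts), "hours")
```

Under these assumptions a single host needs about 72 hours on average, while the wait shrinks in direct proportion to the number of hosts.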
Re:Multiple choice doesn't work for CAPTCHAs by BlueMonk · 2010-08-06 11:37 · Score: 1

It's just one ingredient. Better to have a delay that slows down the attackers who don't have a botnet than not to have one at all. The less-than-1% chance of success per attempt is still better than the old method, right?

Re:The Goggles WORK! by JambisJubilee · 2010-08-05 18:00 · Score: 1

Try taking a picture of the CAPTCHA with your phone using the google goggles app. It works... remarkably well!

Re:My eye's... by yuhong · 2010-08-05 18:20 · Score: 1

I downloaded it now and there are no macros, not to mention this is a .docx not a .docm.

Re:My eye's... by Lavene · 2010-08-05 18:41 · Score: 1

I had such a site on Geocities. Flashing gifs, blink tags, a horrible MIDI playing in the background with no way of turning it of. Blue text on pink background and, naturally, CometCursor.
And of course no meaningful content what so ever.

Innocent times...

Comment removed by account_deleted · 2010-08-05 19:50 · Score: 1

Comment removed based on user account deletion

Re:Offtopic by delinear · 2010-08-05 23:45 · Score: 1

The problem is that they're serving up words a computer has failed to recognise as part of their OCR project, so those same words are often impossible for humans to identify too (maybe they're smudged in the original source, for instance). This does result in some incredibly difficult words to read. According to the powerpoint, you only have to get one word right. I tried this: sometimes it worked, other times it gave me an incorrect result. I think the truth is probably more like this (and I'm sure I read once that this is how it works): they serve one recognised word and one unrecognised word, and the only requirement for success is getting the recognised word right; they just compile the results for the unrecognised word to advance their OCR projects. Usually the recognised word is more readable (we know it at least started out readable, whereas we can't make the same assumption for the unrecognised word), so in the majority of cases, so long as you type the word you can read and then make a best guess at the other, you will still solve the captcha. Of course, it might still be easier to hit refresh a few times until you get a more readable pair.

Have to know which is known and which is unknown by Sockatume · 2010-08-06 01:21 · Score: 1

If you get one of the answers right and it's not the known word, you're still stuck, though. So its success rate is closer to 18%: it identifies one word correctly 35% of the time, and on 50% of those occasions it's the known word.
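
The arithmetic, spelled out under the assumptions above: the attack gets exactly one of the two words right 35% of the time, and that word is equally likely to be the known one or the unknown one. The function name is only for illustration.

```python
def effective_success_rate(p_one_word_right=0.35, p_it_is_the_known_word=0.5):
    """Chance the single correctly read word is the one that gets checked."""
    return p_one_word_right * p_it_is_the_known_word


if __name__ == "__main__":
    print(f"{effective_success_rate():.1%}")
```

That gives 17.5%, which rounds to the 18% quoted.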

--
No kidding!!! What do you say at this point?

Re:Offtopic by Peach+Rings · 2010-08-06 03:00 · Score: 1

Yes that's how it works, and everyone already knows that. I'm not even talking about recaptcha though, I'm talking about the impossible captchas you get when failing too many times to log into a Google account.

Re:My eye's... by Alien1024 · 2010-08-06 03:02 · Score: 1

Not only that, look at the email address - 0xn3on@gmail.com.

D4t b t3h stuff h4x0rz 1z m4d3 0f.

Re:So many better ways than recaptcha by neminem · 2010-08-06 08:40 · Score: 1

Step 1: open it in notepad...

Re:My eye's... by tirnacopu · 2010-08-07 03:09 · Score: 1

Judging by the way you spell "off" or "whatsoever", the grammar mistakes (perhaps "I used to have"?) and how you place your commas, you are still as innocent as they come.
