Google Using ReCAPTCHA To Decode Street Addresses
smolloy writes "Apparently some users of reCAPTCHA have recently begun seeing photographs appear in their CAPTCHA puzzles: photos that look very much like zoomed-in house numbers taken from Google Street View. It appears that Google has decided to put the reCAPTCHA system to work helping clean up Google Street View images, and 'according to a Google spokesperson, the system isn't limited to street addresses, but also involves street names and even traffic signs.' A large collection of these has appeared on the Blackhatworld website."
This is an incredibly fascinating and great use of the technology.
Yeah, because those street numbers, designed to tell everyone passing by what number the house is on the street, are meant to be private.
And put your house number in Roman Numerals. Nothing like living in number CLXXIV to screw up the recaptcha. Anyone answering with 174 is likely counted as wrong...
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
I don't find it worrying. The existence of a street address is properly public knowledge. It's not an invasion of privacy until they link the address with who lives there.
Oh shit
http://www.whitepages.com/
I have read the quote from Google about what they are doing several times, and I don't see what everyone else sees. It appears to me that they are using the already known street names and numbers as possible ReCAPTCHA images. What they are NOT doing is using the results given by people to define what the image says. The point of the experiment is to determine whether these images are sufficient to separate people from web-bots. I imagine that they will look at the number of 'wrong' answers from both sides of the test, and see if bots are able to parse the street view images significantly more often than the standard test images.
So... can anyone point to something in the Google quote to show me where I went wrong? From TFA, here is the quote:
We’re currently running an experiment in which characters from Street View images are appearing in CAPTCHAs. We often extract data such as street names and traffic signs from Street View imagery to improve Google Maps with useful information like business addresses and locations. Based on the data and results of these reCaptcha tests, we’ll determine if using imagery might also be an effective way to further refine our tools for fighting machine and bot-related abuse online.
Great. You know what they were previously? OCR for things like libraries.
I think your own answer to them describes what you are.
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Yet Google would have to know what the address number really was in order to validate the reCAPTCHA, so that can hardly be why they are doing it. They don't need to crowdsource an answer that they already know.
No they don't. They also add an altered text image alongside the picture (which presumably they generated), and can use that to validate the CAPTCHA. The street number can be validated by numerical probability (if 70% of them say it is "257", and the numbers "2,5,7" appear frequently in the rest, it is probably "257") even if they don't already know what it is.
"None can love freedom heartily, but good men; the rest love not freedom, but license." --John Milton
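The majority-vote idea in the comment above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Google's actual method: the function name `consensus` and the thresholds `min_votes` and `min_share` are assumptions made up for this example (the 70% figure is borrowed from the comment).

```python
from collections import Counter


def consensus(answers, min_votes=5, min_share=0.7):
    """Return the agreed reading, or None if the crowd has not settled yet.

    min_votes and min_share are hypothetical thresholds, not known values.
    """
    # Ignore blank submissions and stray whitespace.
    cleaned = [a.strip() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # too few guesses to trust any of them
    reading, votes = Counter(cleaned).most_common(1)[0]
    # Accept the most common reading only if enough people agree on it.
    return reading if votes / len(cleaned) >= min_share else None
```

With ten guesses of which eight say "257", `consensus` would return "257"; with only three guesses, or with no reading reaching the threshold, it returns None and the image would simply be shown to more people.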
I don't think you know how reCAPTCHA works. You are always presented with two different items to decode. One of them is always a known answer, and the other is one they are less sure about, but become more sure of after they show it to enough people and get a crowdsourced answer. They don't give you two prompts just to be double sure you are human.
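The two-item scheme described above might look roughly like this in Python. It is a sketch under stated assumptions, not the real reCAPTCHA code: `grade_submission` and all of its parameters are hypothetical names, and the case-insensitive comparison is a guess at how leniently the control item is graded.

```python
def grade_submission(control_answer, control_guess, unknown_guess, votes):
    """Pass or fail on the control item only; record the unknown guess on a pass.

    All names here are hypothetical; the real service's logic is not public
    in this thread.
    """
    # Only the item with a known answer decides whether the user is human.
    passed = control_guess.strip().lower() == control_answer.strip().lower()
    if passed:
        # A user who got the control right is trusted enough to cast one
        # vote on the item whose reading is still uncertain.
        votes.append(unknown_guess.strip())
    return passed
```

The design point is that the unknown item never affects whether you pass, which is why a wrong but popular reading can still win the vote.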
Oh, climb down off that ledge before you get hurt.
reCAPTCHA is for whatever you want to use it for. It's simply a technique for crowdsourcing guesses.
In my estimation, Google Maps and Street View are among the great accomplishments of our time, easily worth every penny Google monetizes out of them.
Sig Battery depleted. Reverting to safe mode.
It's quite noticeable if you use a site which relies heavily on reCAPTCHAs. For example, when you get a word which has an old English S that looks like a modern lowercase F, you're much better off claiming it's an F instead of giving the correct answer.