Slashdot Mirror


Fill Out CAPTCHAs, Digitize Books At The Same Time

alphadogg wrote with a link to a Networld article about a noble endeavor: putting CAPTCHAs to work for the good of humanity. A scientist at Carnegie Mellon is looking to create a new type of security check that will assist in a project meant to digitize and make searchable text from books and printed materials. Above and beyond that, the offering would probably be more secure than most current systems. "Instead of requiring visitors to retype random numbers and letters, they would retype text that otherwise is difficult for the optical character recognition systems to decipher when being used to digitize books and other printed materials. The translated text would then go toward the digitization of the printed material on behalf of the Internet Archive project."

4 of 121 comments

  1. Re:Verification? by greatgregg · · Score: 5, Informative

    From recaptcha.net: "But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct."
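The pairing described in the quote above can be sketched roughly as follows. This is only an illustration of the idea, not reCAPTCHA's actual code: the `votes` store, the `submit` function, the image id, and the case-insensitive comparison are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical store: votes collected so far for each unknown word image.
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers case-insensitively, ignoring surrounding whitespace."""
    return text.strip().lower()


def submit(control_answer, control_truth, unknown_id, unknown_answer):
    """Return True if the user passes the check.

    The user passes only when the control word (answer already known) is
    typed correctly; in that case their reading of the unknown word is
    recorded as one vote. Otherwise the unknown answer is discarded.
    """
    if normalize(control_answer) != normalize(control_truth):
        return False
    votes[unknown_id][normalize(unknown_answer)] += 1
    return True
```

A user who gets the known word right contributes a vote for the unknown word; one who gets it wrong contributes nothing, so the unknown word is only ever judged by people who have shown they can read the known one.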

  2. Better links by Falkkin · · Score: 4, Informative

    The article is lacking some information. Here are some better links:

    Official reCAPTCHA site
    Hide your email address with reCAPTCHA (super easy!)
    A more detailed blog post about how the system works

    Disclaimer: I work with Luis von Ahn, who's the professor running the reCAPTCHA project.

  3. Official reCAPTCHA site by traindirector · · Score: 4, Informative

    I originally missed the link to the official site - D'oh. The article also doesn't mention that the system is already in use! http://recaptcha.net/

    1. Re:Official reCAPTCHA site by caffeinemessiah · · Score: 4, Informative

      There's an interesting solution to this problem -- the "scientist at Carnegie Mellon" is Luis von Ahn, who was recently awarded a MacArthur "genius" award. In optical recognition tasks like this, where the "true" answer is not known, how do you verify that a human agent did the recognition correctly? Just see if a bunch of other users type the same thing. It's a clever twist on consensus voting, and was recently snatched up by Google as "Google Image Labeler" here.

      --
      An old-timer with old-timey ideas.
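The "see if a bunch of other users type the same thing" step from the last comment might look something like this. It is a minimal sketch: the `consensus` function and both thresholds (`min_votes`, `min_share`) are made-up values for illustration, not anything the project has published.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if agreement is too weak.

    min_votes and min_share are illustrative assumptions, not real settings.
    """
    # Ignore empty submissions and compare case-insensitively.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough independent readings yet
    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the most common reading only if enough users agree on it.
    return word if count / len(cleaned) >= min_share else None
```

With these thresholds, a word keeps being shown to new users until at least three have answered and three quarters of them agree.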