Slashdot Mirror


reCAPTCHA Hard At Work, Rescuing Fading Texts

sciencehabit writes "Computer scientists have developed a program, called reCAPTCHA, which is being used in lieu of CAPTCHA by several sites, to help digitize old books and newspapers. The reCAPTCHA takes entries from old and faded texts that optical scanners and digital-text readers have trouble with. So every time you solve that string of crooked letters, you may actually be helping historians digitally reconstruct a page from the 1908 New York Times." The Science Now story links to the longer and more informative article at Ars Technica. (We last mentioned this program last year — and now it's good to get some sense of how well it's working.)

15 of 112 comments

  1. Validate your data, guys! by Anonymous Coward · · Score: 3, Funny

    I can usually tell which of the two words is from a real old text. With high probability (>90%) I can correctly answer the real CAPTCHA and replace someone's OCR'd word with "penis".

    I've only ever done this maybe ten or twenty times, but it could easily become an automatic part of using the system.

    1. Re:Validate your data, guys! by Spasmodeus · · Score: 2, Funny
      As soon as I heard about this project, I figured there'd be people finding ways to abuse it.

      I can see future generations sitting down for a good read:

      MOBY COCK

      Chapturd One

      Call me LOLOLFAG...

  2. Huh? 1908 New York Times? by mschuyler · · Score: 2, Funny

    The New York Times is already online from 1851 onwards. The concept is cool, truly, but why not CAPTCHA something not already accomplished? Oh, I know. That was, like, a metaphor, right?

    --
    How about a moderation of -1 pedantic.
  3. DMCA Violation by Nymz · · Score: 5, Funny

    The feature known as FADING was designed to protect copyright works from being pirated by becoming illegible before the work could fall into the public domain.

  4. Prior art by armanox · · Score: 4, Funny

    I think that erosion on stone tablets predates fading by quite a bit....

    --
    I'm starting to think GNU is the problem with "GNU/Linux" these days.
    1. Re:Prior art by Nymz · · Score: 2, Funny

      I really wish the RIAA (Rock Industry Association of the Archean eon) would update their business model to the current Phanerozoic eon.

  5. Re:Not new by felipekk · · Score: 3, Funny

    Facebook uses reCAPTCHA. I guess you can make something useful out of the millions of useless teenagers wasting their time on Facebook.

  6. Re:Not new by grahamd0 · · Score: 5, Funny

    Facebook uses reCAPTCHA. I guess you can make something useful out of the millions of useless teenagers wasting their time on Facebook.

    That's not fair.

    Plenty of useless adults waste their time on Facebook.

  7. Re:Cool possible uses by burgundysizzle · · Score: 5, Funny

    Or perhaps SLASHDOT-READER:

    OVERWEIGHT

    GEEK

    SPENDS-TOO-MUCH-TIME-USING-COMPUTERS

    ALL-OF-THE-ABOVE

    I fit into the category ALL-OF-THE-ABOVE. The only generalisation that is missing about slashdotters is the one about girlfriends.

  8. Re:One Problem by Anonymous Coward · · Score: 3, Funny

    The following security test allows us to validate you are a human and not an automated script.

    please type the following two words in the text box below

    you moron

    ____________ _____________

  9. Finally logged in by narcberry · · Score: 2, Funny

    Took me a bit to get past the new security measures, but I got a coupon for 5 cents off my next shoe purchase.

    --
    Modding me -1 troll doesn't make me wrong.
  10. Re:AC for the plain old CAPTCHA by grahamd0 · · Score: 4, Funny

    Let me introduce you to my friend, the question mark.

  11. Re:One Problem by RedWizzard · · Score: 4, Funny

    One FUNDAMENTAL problem with this

    ... is that you didn't RTFA.

  12. Re:Image Captchas by Martz · · Score: 4, Funny

    Just use an alt tag.

  13. Re:Not new by Alzheimers · · Score: 3, Funny

    But you...

    *sigh* ...Nevermind. It's Friday. Go have a beer or something.