Slashdot Mirror


Google Docs' OCR Quality Tested

orenh writes "Google has released a Google Docs application for Android, which includes the ability to create documents by OCR-ing photos. I tested the application's OCR quality and found that it's mediocre under the best conditions and poor under real-world conditions. However, I believe that this poor performance is caused in part by an intentional decision by Google."

4 of 99 comments

  1. Um... by Shadow Wrought · · Score: 4, Insightful

    He uploaded the 120 dpi image instead of the 300 dpi image and is surprised the OCR sucks. Really? Lossy compression isn't the concern when you're OCR'ing black text on a white background. Seriously. Think about what the image is actually going to be used for, then make your decision.

    And, seriously, how effective do you really imagine OCR is going to be on a camera phone pic, anyway?

    --
    If brevity is the soul of wit, then how does one explain Twitter?
  2. Re:CAPTCHA Breakers by jewelises · · Score: 3, Insightful

    I don't think that spammers have any amazing tech, they just have different requirements. They can still send spam with a 1% success rate whereas with OCR you'd want a 99% success rate.

  3. 99% success rate is crappy ... by perpenso · · Score: 3, Insightful

    I don't think that spammers have any amazing tech, they just have different requirements. They can still send spam with a 1% success rate whereas with OCR you'd want a 99% success rate.

    I once worked on an OCR project. The client specified a 99% success rate and we strained to restrain our grins. 99% is about one error every one or two lines of text. We got 99.6% in our first implementation before we even began to work on accuracy. Admittedly we had excellent image quality. This was a custom solution that had its own optics.
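The arithmetic behind "one error every one or two lines" can be sketched roughly as follows. This assumes the success rate is measured per character and that a line holds about 70 characters; neither figure is stated in the comment, and the function names are made up for illustration.

```python
def errors_per_line(accuracy, chars_per_line=70):
    """Expected number of misread characters on one line of text.

    accuracy is the per-character success rate (0.99 means 99%).
    chars_per_line=70 is an assumed typical line length, not a
    figure from the original comment.
    """
    return (1.0 - accuracy) * chars_per_line


def lines_per_error(accuracy, chars_per_line=70):
    """Average number of lines between errors (infinite if no errors)."""
    expected = errors_per_line(accuracy, chars_per_line)
    return float("inf") if expected == 0 else 1.0 / expected


if __name__ == "__main__":
    for acc in (0.99, 0.996):
        print(f"{acc:.1%} accuracy: "
              f"{errors_per_line(acc):.2f} errors per line, "
              f"one error every {lines_per_error(acc):.1f} lines")
```

Under these assumptions, 99% works out to a little under one error per line, which matches the commenter's estimate, while 99.6% stretches that to one error every three or four lines.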

  4. Re:/b/ by Super Dave Osbourne · · Score: 1, Insightful

    Slashdot became formulaic and boring quite a long time ago. This is verifiable, and not meant as flamebait. If the mods would stop acting like scripts without some AI built in for content, /. would once again be a viable, worthwhile place to contribute on a regular basis, rather than a place for drive-by, train-wreck contributions.