Slashdot Mirror


Google Releases Tesseract as Open Source

An anonymous reader writes "Google recently released Tesseract as open source. Originally developed at HP Labs from 1985 to 1995, it has been touted as one of the most accurate Optical Character Recognition (OCR) programs available. After the code sat on the shelf gathering dust for many years, Google cleaned up some of its more outdated portions and released it for general consumption. You can download Tesseract over at SourceForge."

2 of 251 comments

  1. Re:As much as I like open source software ... by aweinert · · Score: 5, Informative

    CAPTCHAs are specifically meant to break OCR... and if you RTFA, it says it does poorly with grayscale and color documents. Basically it's meant for reading typed text... like in a book.

  2. Re:NFB owns you by MrNonchalant · · Score: 4, Informative

    You can build accessible CAPTCHAs, using images with a sound backup for blind users. My girlfriend is visually impaired, and non-accessible CAPTCHAs are a real problem for her: she can't register at some sites without assistance.