Best OCR for Technical Texts?
An anonymous reader asks: "I'm scanning in user manuals for older lab equipment. I've never used OCR before today, so I installed the Caere Omnipage 9.0 that came with the scanner. I was pretty happy except for a few things. It doesn't seem to want to recognize engineering symbols like the single-character +/-, square root, and omega, or simple equations; it has trouble with super- and subscripts; and it outputs funky Word files. For example, from an 8.5 x 11 original page scanned in at 1 bit at 300 dpi, the output Word file was 10 inches wide, used tons of Omnipage text styles and didn't match the original text's flow. It did do a good job of italicizing headers and recognizing the various sections in a two-column page. Googling the news and net just backs up my claims but provides no real solution; a Google search for 'best OCR for engineering' turns up nothing useful."
Have you looked at the open-source Clara OCR? I've used it for some very unusual texts in the recent past. Its accuracy is quite good. Besides that, the proofing mechanisms are great!
Go here: http://www.claraocr.org/.
It has very recently been ported to Win32, and the community support (via e-mail lists) is excellent.
Use 8-bit, NOT 1-bit. When I switched from 1-bit to 8-bit on a page of normal text, the dozen or so errors vanished.
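To put a rough picture on the 1-bit vs. 8-bit advice, here is a small Python sketch. It is only an illustration under assumptions: the pixel values are made up, the threshold of 128 is an arbitrary choice, and the helper names `to_one_bit` and `raw_scan_bytes` are hypothetical, not part of any OCR package. One plausible reason 1-bit scans hurt is that the scanner thresholds every pixel before the OCR engine sees it, so faint or thin strokes (subscripts, for instance) can disappear entirely. The second helper estimates the uncompressed size of the 8.5 x 11 page at 300 dpi mentioned in the question.

```python
def to_one_bit(row, threshold=128):
    """Binarize one row of 8-bit greyscale pixels: 1 = ink, 0 = paper.

    Assumes 0 is black and 255 is white; the threshold is an arbitrary choice.
    """
    return [1 if value < threshold else 0 for value in row]


def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of a scan, padding each row to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Made-up pixel values crossing a heavy stroke and a faint, thin stroke.
bold_stroke = [250, 200, 40, 10, 35, 210, 250]
thin_stroke = [250, 230, 150, 140, 160, 235, 250]

print(to_one_bit(bold_stroke))  # [0, 0, 1, 1, 1, 0, 0] -> stroke survives
print(to_one_bit(thin_stroke))  # [0, 0, 0, 0, 0, 0, 0] -> stroke is lost

# Raw size of an 8.5 x 11 inch page at 300 dpi.
print(raw_scan_bytes(8.5, 11, 300, 1))  # 1052700 bytes at 1 bit
print(raw_scan_bytes(8.5, 11, 300, 8))  # 8415000 bytes at 8 bit
```

The 8-bit scan is about eight times larger before compression, which is the price of keeping the grey levels for the OCR engine to work with.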
http://docmorph.nlm.nih.gov/docmorph/
Since Omnipage is up to version 12, perhaps there's been an improvement since your version.
Your Google skills are sorely lacking; the "Hacking Google" book would be a good investment for you. Eliminating the quotes and the word "best" from your search string would help.
Two different free web-based OCR services; just upload a 300 dpi b/w (8-bit greyscale) file:
http://www.expervision.com/webtr6.htm
http
Here are some OCR programs:
http://www.scansoft.com/omnipage/
http://www.abbyy.com/
http://www.newsoftinc.com/redir/digitaloffice_a
More OCR links than you really want:
http://web3.humboldt1.com/~jiva/ocr/_ocr_re