Best OCR for Technical Texts?
An anonymous reader asks: "I'm scanning in user manuals for older lab equipment. I've never used OCR before today, so I installed the Caere OmniPage 9.0 that came with the scanner. I was pretty happy except for a few things. It doesn't seem to want to recognize engineering symbols like the single-character +/-, square root, and omega, or simple equations; it has trouble with super- and subscripts; and it outputs funky Word files. For example, from an 8.5 x 11 original page scanned in at 1 bit at 300 dpi, the output Word file was 10 inches wide, used tons of OmniPage text styles and didn't match the original text's flow. It did do a good job of italicizing headers and recognizing the various sections in a two-column page. Googling the news and the net just backs up my complaints but provides no real solution. A Google search for the best OCR for engineering turned up nothing useful."
Good luck!
I've used a few different versions of OmniPage Pro, and it works OK if the layout is not complicated, it uses standard fonts, the text is clean and clear and it doesn't have too many weird logos or symbols. You still have to proofread everything and correct it by hand, though, so I'm not convinced it's a time saver as much as it is a typing saver.
OmniPage Pro does do a MUCH better job of identifying words than the free version they throw in with scanners because it uses spelling and grammar checkers to help ID words from context. The free version is as close to useless as you can get in the software world - it's really just an ad for Pro.
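To give a rough feel for what "using a spelling checker to help ID words" can mean, here is a toy Python sketch. It is not how OmniPage Pro actually works internally (I have no idea what they do under the hood); it just snaps each OCR'd word to the nearest entry in a word list using the standard library's difflib. The names LEXICON, correct_word and correct_line are made up for this example, and the tiny word list is an assumption standing in for a real dictionary.

```python
import difflib

# Hypothetical mini word list; a real checker would load a full dictionary.
LEXICON = ["voltage", "resistor", "calibration", "oscilloscope", "manual", "current"]


def correct_word(word, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon entry to `word`, or `word` unchanged
    if it is already in the lexicon or nothing is similar enough."""
    lowered = word.lower()
    if lowered in lexicon:
        return word
    # get_close_matches ranks candidates by difflib's similarity ratio
    # and drops anything scoring below `cutoff`.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_line(line):
    """Apply correct_word to every whitespace-separated token in a line."""
    return " ".join(correct_word(token) for token in line.split())


if __name__ == "__main__":
    # A typical OCR slip: the letter "l" misread as the digit "1".
    print(correct_line("vo1tage ca1ibration manual"))
```

Note that anything not in the word list, a stray symbol for instance, just passes through unchanged, so a trick like this only helps with ordinary words.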
Engineering and math symbols are right out.
"Lawyers are for sucks."
- Doug McKenzie