Slashdot Mirror


Best OCR for Technical Texts?

An anonymous reader asks: "I'm scanning in user manuals for older lab equipment. I've never used OCR before today, so I installed the Caere Omnipage 9.0 that came with the scanner. I was pretty happy except for a few things. It doesn't seem to want to recognize engineering symbols like the single-character +/-, square root, and omega, or simple equations; it has trouble with super- and subscripts; and it outputs funky Word files. For example, from an 8.5 x 11 inch original page scanned in at 1 bit at 300 dpi, the output Word file was 10 inches wide, used tons of Omnipage text styles, and didn't match the original text's flow. It did do a good job of italicizing headers and recognizing the various sections in a two-column page. Googling the news and net just backs up my claims but provides no real solution. A Google search for the best OCR for engineering provides nothing useful."

2 of 28 comments

  1. Try spelling superscripts correctly by keesh · Score: 0, Insightful

    That might help slightly...

  2. Re:Use Greyscale by SeanAhern · Score: 2, Insightful

    Your google skills are sorely lacking

    No joke! The link in the post doesn't even connect to Google - it's a Yahoo link.