Paper to XML?
Scott Taylor writes "I have a paper manual that I would like to convert to an HTML-browsable manual and to a text-searchable PDF manual. Most of the pages of the manual use the same table layout (albeit an irregular table). My current thought is to scan in the tables and then somehow use OCR software to convert the data in the tables to an XML marked-up file. From there I can use XSLT and FOP to convert the data to HTML and PDF. The problem is that I don't know how to make the jump from a scanned-in picture of a table to XML. Has anyone out there tried this before? Is there any software that lets one mark up OCR text based on the table cell it was found in? I don't mind spending money on commercial software if necessary (as long as it doesn't cost too much). Is there a better way to solve the problem?"
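For illustration only, here is a rough Python sketch of the last hop in that pipeline: turning per-cell OCR text into XML that XSLT could then transform. It assumes some OCR step has already produced (row, column, text) tuples for each table cell; that input format, the function name cells_to_xml, and the element names table, row, and cell are all made up for this example and do not come from any particular OCR product.

```python
import xml.etree.ElementTree as ET


def cells_to_xml(cells):
    """Build an XML string from (row, col, text) tuples.

    The tuple format is a hypothetical stand-in for whatever an OCR
    tool reports per table cell; it is not a real tool's output format.
    """
    table = ET.Element("table")
    rows = {}
    # Sorting the tuples puts cells in row order, then column order.
    for row, col, text in sorted(cells):
        if row not in rows:
            rows[row] = ET.SubElement(table, "row", index=str(row))
        cell = ET.SubElement(rows[row], "cell", col=str(col))
        cell.text = text
    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    # Cells may arrive in any order; rows may have different lengths,
    # which is one simple way to represent an irregular table.
    sample = [(0, 1, "Torque"), (0, 0, "Part"), (1, 0, "Bolt A")]
    print(cells_to_xml(sample))
```

Because each cell carries its own row and column index, rows with differing numbers of cells are handled without special cases. Merged or spanning cells would need extra attributes that this sketch leaves out.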
I fail to see why you would go through all those steps when Adobe Acrobat already does what you want pretty much automatically.