Converting Word Files to Text for Archiving?
Unknown Relic asks: "Our company has large quantities of old MS Word documents that we are looking to permanently archive. One of the requirements of our archiving process is that the documents be stored in plain text format. Unfortunately, we also have another, conflicting requirement: the text files must retain basic formatting information from the original documents, including bullets, indentation and basic table layout. While all of this formatting is possible in plain text, I have not been able to find any tools that do a decent job of retaining it during conversion. Even Word's 'Save As' option does a horrible job, though I suppose that's not overly surprising. Has anyone undertaken a project similar to this before? If so, what tools did you find or create to make the job feasible?"
Use MS Word to save each document as HTML, then run the HTML through lynx -dump to save it as text.
That said, you may want to give strong consideration to another poster's recommendation of using PDF, particularly since you care about formatting.
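For a large batch, the lynx step could be scripted. What follows is only a rough sketch, assuming the documents have already been saved as HTML from Word into a single directory and that lynx is installed and on the PATH. The function names and the directory layout are made up for illustration, and the -nolist flag (which drops the list of link URLs lynx appends to a dump) is my own addition to the plain -dump suggested above.

```python
import subprocess
from pathlib import Path


def lynx_command(html_path):
    """Build the lynx invocation that renders one HTML file as plain text."""
    # -dump writes the rendered page to stdout instead of opening the browser;
    # -nolist (an addition, not part of the original suggestion) omits the link list.
    return ["lynx", "-dump", "-nolist", str(html_path)]


def text_path_for(html_path, out_dir):
    """Map e.g. report.htm or report.html to <out_dir>/report.txt."""
    return Path(out_dir) / (Path(html_path).stem + ".txt")


def convert_all(html_dir, out_dir):
    """Run lynx over every *.htm* file in html_dir and return the files written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for html_path in sorted(Path(html_dir).glob("*.htm*")):
        # check=True raises CalledProcessError if lynx exits with an error.
        result = subprocess.run(
            lynx_command(html_path), capture_output=True, check=True
        )
        target = text_path_for(html_path, out_dir)
        # Write the bytes exactly as lynx produced them, without re-encoding.
        target.write_bytes(result.stdout)
        written.append(target)
    return written
```

Calling convert_all("html_export", "archive_txt") (both directory names are hypothetical) would leave one .txt file per HTML file. It is worth spot-checking a few documents with tables and nested bullets before trusting the whole batch, since how well the layout survives depends on the HTML that Word emits.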
-Bill