Converting Word Files to Text for Archiving?
Unknown Relic asks: "Our company has large quantities of old MS Word documents which we are looking to permanently archive. One of the requirements of our archiving process is that the documents be stored in plain text format. Unfortunately we also have another, conflicting requirement: the text files must retain basic formatting information from the original documents, including bullets, indentation and basic table layout. While all of this formatting is possible using plain text, I have not been able to find any tools which do a decent job of retaining the above-mentioned formatting during conversion. Even Word's 'Save As' option does a horrible job, though I suppose that's not overly surprising. Has anyone undertaken a project similar to this before? If so, what tools did you find or create to make the job feasible?"
Word files convert YOU!
(oh, and fp r0x0rs my b0x0rs)
You are seriously asking the impossible.
If you want something automated that somehow preserves this information, you'd better find something that understands Word encoding 100%. Nothing besides Word can do that (though things like ClarisWorks come close).
The best you can do is go through the things by hand and transcribe them with whatever concocted plaintext encoding scheme you come up with.
The other solution is to have a deep think about why you are abandoning Microsoft Word in the first place. Is it too difficult to keep one machine around with a copy of Windows for viewing these files?
There's one simple answer: uuencode.
*ducks*
I think the best solution would be to tell them to piss off, and quit.
Otherwise, save everything as HTML. That's text, and it'll retain enough shit to make your loser bosses happy.
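If you go the HTML route, the saved files can then be flattened to plain text by script, keeping bullets, indentation and table columns. Here is a rough sketch in Python using only the standard library. The names PlainTextRenderer and html_to_text are made up for illustration, and it assumes clean, simple markup (p, ul/ol/li, table/tr/td). The HTML Word actually emits is a lot messier, so expect to clean it up first.

```python
from html.parser import HTMLParser


class PlainTextRenderer(HTMLParser):
    """Hypothetical helper: renders lists and simple tables as plain text."""

    def __init__(self):
        super().__init__()
        self.lines = []   # finished output lines
        self.buf = []     # text collected for the current block or cell
        self.prefix = ""  # bullet prefix for the current block
        self.depth = 0    # current list nesting level
        self.rows = []    # rows of the table being read
        self.row = None   # cells of the row being read, or None

    def _text(self):
        # Collapse runs of whitespace and reset the buffer.
        text = " ".join("".join(self.buf).split())
        self.buf = []
        return text

    def _flush_line(self):
        text = self._text()
        if text:
            self.lines.append(self.prefix + text)
        self.prefix = ""

    def _emit_table(self):
        if not self.rows:
            return
        ncols = max(len(r) for r in self.rows)
        rows = [r + [""] * (ncols - len(r)) for r in self.rows]
        widths = [max(len(r[i]) for r in rows) for i in range(ncols)]
        for r in rows:
            line = " | ".join(c.ljust(w) for c, w in zip(r, widths))
            self.lines.append(line.rstrip())
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush_line()
            self.depth += 1
        elif tag == "li":
            self._flush_line()
            # Two spaces of indent per nesting level, then a bullet.
            self.prefix = "  " * (self.depth - 1) + "* "
        elif tag in ("p", "br", "table"):
            self._flush_line()
        elif tag == "tr":
            self._text()  # discard stray whitespace between tags
            self.row = []
        elif tag in ("td", "th"):
            self._text()

    def handle_endtag(self, tag):
        if tag in ("ul", "ol"):
            self._flush_line()
            self.depth = max(0, self.depth - 1)
        elif tag in ("li", "p"):
            self._flush_line()
        elif tag in ("td", "th") and self.row is not None:
            self.row.append(self._text())  # keep empty cells for alignment
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None
        elif tag == "table":
            self._emit_table()

    def handle_data(self, data):
        self.buf.append(data)


def html_to_text(html):
    renderer = PlainTextRenderer()
    renderer.feed(html)
    renderer.close()
    renderer._flush_line()  # pick up any trailing text
    return "\n".join(renderer.lines)


if __name__ == "__main__":
    print(html_to_text(
        "<p>Parts list</p><ul><li>One<ul><li>Nested</li></ul></li></ul>"
        "<table><tr><th>Name</th><th>Qty</th></tr>"
        "<tr><td>Widget</td><td>12</td></tr></table>"
    ))
```

Nested lists come out indented two spaces per level and table columns are padded to the widest cell. Merged cells, numbered list counters and fonts are not handled at all.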