Docvert 3.0 Lessens Reliance On Microsoft Office
An anonymous reader writes "After 10 months of development Docvert 3.0 was released today. This open source web service converts DOC files to Oasis OpenDocument 1.0, and then to HTML, RSS, or any XML format. Try the ODF demo or download the source and install it on your own box. Version 3.0 comes with an MS Word Plugin, FTP/WebDAV upload, and an in-browser document editor."
as:
Object
Oriented
X
M
L
and whimpered at the thought...
To a Lisp hacker, XML is S-expressions in drag.
Despite what Microsoft thinks and how they've been acting in the past with all their 'standards', describing all the exceptions doesn't make something a standard. Describing them in the context of a non-standardized environment makes it even less so.
Although I'm quite sure that Microsoft really doesn't give a and will push this through as 'their' standard that everyone else will have to adhere to to be able to do anything with Mickyshaft generated content anyway.
Whether ISO approves of this or not is inconsequential; the only thing that matters is that M$ can now say: Look, we proposed a standard, it's not our fault 'they' think it's not good enough.
Coz eternity my friend, is a long *ing time.
I solved the issue by writing a program that ran on a Windows PC (an old one that had been discarded and was gathering dust in the closet) that received SMTP mail, detached the Word attachment, started up Microsoft's Word Viewer to read the attachment, then "printed" it to a file in PDF format and finally SMTP mailed it back to the sender.
From then on all we had to do was forward the email to the robot and wait for a readable version to bounce back. As I used Microsoft's own Word Viewer there were no problems whenever a new version of Word came out; I just downloaded the latest viewer :-)
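For anyone wanting to build a similar robot, here is a rough sketch of the mail-handling half in Python, using only the standard library email package. It is not the original program: the function names, the Subject text, and the list of MIME types are my own assumptions, and the Word-to-PDF step (Word Viewer "printing" to a file) is left out entirely, so the reply builder simply takes PDF bytes from wherever they come.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Assumed set of MIME types that mark a Word attachment (not from the post).
WORD_TYPES = {
    "application/msword",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def extract_word_attachments(raw_bytes):
    """Parse a raw message; return (sender, [(filename, data), ...])."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        is_word = (part.get_content_type() in WORD_TYPES
                   or name.lower().endswith(".doc"))
        if is_word:
            # decode=True undoes the base64 transfer encoding.
            found.append((name, part.get_payload(decode=True)))
    return str(msg["From"]), found


def build_reply(robot_addr, sender, pdf_name, pdf_bytes):
    """Build the message that carries the converted PDF back to the sender."""
    reply = EmailMessage()
    reply["From"] = robot_addr
    reply["To"] = sender
    reply["Subject"] = "Converted document"  # hypothetical subject line
    reply.set_content("A readable version is attached.")
    reply.add_attachment(pdf_bytes, maintype="application",
                         subtype="pdf", filename=pdf_name)
    return reply
```

The reply object could then be handed to smtplib.SMTP(...).send_message() to mail it back; receiving the mail and driving the viewer are the platform-specific parts this sketch does not attempt.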
All true, but if it does get adopted as a standard, then MS can use this to ensure the continued use of MS Office by government agencies around the globe. If it doesn't get adopted, MS will be under pressure to provide a supported, native, OOD format.
Microsoft isn't doing this for you silly! The whole intent is likely that it is *hard* for anyone to implement.
...Some people think it's fine that way. A friend of mine, quite pro-MS, told me that all those little strange things in the specification were normal to have for backwards compatibility, and that reading the specification was a waste of time. Instead, he directed me towards a preview of MS Office 2007. Because for him, as for many more, what's important is the final product, the cuteness of the buttons, the way it works and displays its own format. Why bother using a free program that displays Word documents badly, when Office is already perfect, huh? I feel so misunderstood sometimes. What makes me sad is that they don't see the use of a clear, straight-to-the-point format. Maybe only geeks can be horrified by this one.
The second design requirement was that the spec be developed and released quickly, before ODF had time to gain much traction. Between these two objectives, it's hardly surprising that it ended up the way it did...
I am TheRaven on Soylent News
<microsoft_word_document>
(Content of
</microsoft_word_document>
In my opinion there are two reasons Microsoft is trying to create their own standard: PR and government contracts. The PR aspect is obvious. The US government is Microsoft's largest customer (by far) and also the most likely to demand open document standards. Other governments will likely do the same long before corporations demand it. So Microsoft needs to have their own standard which they implement first in order to get the contracts.
They don't have to implement it correctly. They can claim support for a standard for years without actually following it (e.g. CSS, Kerberos, etc.) and still get the contracts. They were actually involved in creating some CSS standards and still didn't follow them.
It's all about the money. Get the big contracts and nothing else matters.
Developers: We can use your help.
Check out the Groklaw article "Searching for Openness in Microsoft's OOXML and Finding Contradictions" for further comments. The article also has links to a couple of wiki pages with further comments.
That's the reason for all the "render like WordPerfect 5.x" options that people have complained about, because they have to allow people to convert to the XML format and then convert back without reducing the document to an unreadable mess.
There is no reason I know of why the XML format cannot support all the features of Word and round trip without relying on nasty hacks like this; it just takes more work. The problem with "Open"XML that I've seen is that it concentrates entirely on supporting only the features of .doc files and their interactions with other programs, to the exclusion of anything else. Rather than "render like WP 5.x" you need to define how WP 5.x renders that feature, then incorporate it into your conversion script in a way that makes sense in general for documents.
The whole format is built upon the assumption that only MS and Word will be using it and it is not designed to abstract word processing documents in general, but to kowtow to the eccentricities of Word.
The alternative is to not support roundtripping and then wait for slashdot headlines like "Users find that the new Office XML format mangles their documents".
No, the alternative is to do it right and build hacks like the ones you mention into the import and export routines, rather than embedding them, without any definition, into the format.
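To make that concrete, here is a toy sketch of what "put the hack in the importer" could look like. Everything in it is hypothetical: the flag name, the ParagraphStyle fields, and especially the numbers are placeholders I made up, not how WP 5.x or Word actually behaves. The point is only the shape: the importer turns an opaque legacy flag into explicit, defined properties, and refuses flags it has no definition for.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ParagraphStyle:
    # Explicit properties the stored format would carry (hypothetical fields).
    line_spacing: float = 1.0
    space_width_em: float = 0.25


# Hypothetical table: legacy flag -> the explicit properties it stands for.
# The values are placeholders, not measured WP 5.x behaviour.
LEGACY_RULES = {
    "render_like_wp5": {"space_width_em": 0.30},
}


def import_paragraph(style, legacy_flags):
    """Resolve legacy flags into explicit properties at import time."""
    unknown = [flag for flag in legacy_flags if flag not in LEGACY_RULES]
    if unknown:
        # Fail loudly instead of passing an undefined flag into the document.
        raise ValueError(f"no definition for legacy flags: {unknown}")
    for flag in legacy_flags:
        style = replace(style, **LEGACY_RULES[flag])
    return style
```

An exporter back to .doc would do the reverse lookup, so the file on disk never needs to mention the old program by name.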