Docvert 3.0 Lessens Reliance On Microsoft Office
An anonymous reader writes "After 10 months of development Docvert 3.0 was released today. This open source web service converts DOC files to Oasis OpenDocument 1.0, and then to HTML, RSS, or any XML format. Try the ODF demo or download the source and install it on your own box. Version 3.0 comes with an MS Word Plugin, FTP/WebDAV upload, and an in-browser document editor."
Ya, I'm on the edge of my seat. It will get adopted as a standard or it won't. Office will use it either way and anyone wanting to interoperate with Office will have to try to implement it as well.
"I use a Mac because I'm just better than you are."
as:
Object
Oriented
X
M
L
and whimpered at the thought...
To a Lisp hacker, XML is S-expressions in drag.
One of the things that bugs me is these 'enormous specifications' that are inconsistent. What we need is not just a document, but the tools necessary to verify a generated file. Not just for valid XML, but for all the little microsofty-bits hidden inside.
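A minimal sketch of the easy half of such a tool, assuming the generated file is a zip-based package whose XML parts end in .xml or .rels. The function name malformed_xml_parts is made up for illustration. It only checks well-formedness; it does not validate against any schema, and it says nothing about the "microsofty-bits".

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def malformed_xml_parts(package_bytes):
    """Return the names of XML parts in a zip package that fail to parse.

    Hypothetical helper: well-formedness only, no schema validation.
    """
    bad = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if not name.lower().endswith((".xml", ".rels")):
                continue  # skip binary parts such as images
            try:
                ET.fromstring(package.read(name))
            except ET.ParseError:
                bad.append(name)
    return bad
```

Checking the rest would need a schema validator plus hand-written checks for whatever the spec describes only in prose.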
--jeffk++
ipv6 is my vpn
Despite what Microsoft thinks and how they've been acting in the past with all their 'standards', describing all the exceptions doesn't make something a standard. Describing them in the context of a non-standardized environment makes it even less so.
Although I'm quite sure that Microsoft really doesn't give a and will push this through as 'their' standard that everyone else will have to adhere to to be able to do anything with Mickyshaft generated content anyway.
Whether ISO approves of this or not is inconsequential, the only thing that matters is that M$ can now say: Look, we proposed a standard, it's not our fault 'they' think it's not good enough.
Coz eternity my friend, is a long *ing time.
I solved the issue by writing a program that ran on a Windows PC (an old one that had been discarded and was gathering dust in the closet) that received SMTP mail, detached the Word attachment, started up Microsoft's Word Viewer to read the attachment, then "printed" it to a file in PDF format and finally SMTP mailed it back to the sender.
From then on all we had to do was forward the email to the robot and wait for a readable version to bounce back. As I used Microsoft's own Word Viewer there were no problems whenever a new version of Word came out, I just downloaded the latest viewer :-)
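For anyone wanting to build a similar robot, here is a rough sketch of the mail-handling part using Python's standard email package. It is not the original program. The names word_attachments and build_reply, the robot address, and the convert callable are all assumptions for illustration; convert stands in for the "open in Word Viewer and print to PDF" step, which is not shown. Receiving and sending over SMTP are left out as well.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def word_attachments(raw_message):
    """Yield (filename, payload_bytes) for each .doc attachment."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        if name.lower().endswith(".doc"):
            yield name, part.get_content()


def build_reply(raw_message, convert, robot_address="robot@example.invalid"):
    """Build a reply to the sender with each .doc converted by `convert`.

    `convert` is a hypothetical callable taking DOC bytes and returning
    PDF bytes; the real conversion step is outside this sketch.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    reply = EmailMessage()
    reply["From"] = robot_address
    reply["To"] = str(msg["From"])
    reply["Subject"] = "Converted: " + str(msg["Subject"] or "")
    reply.set_content("Converted attachments are enclosed.")
    for name, payload in word_attachments(raw_message):
        reply.add_attachment(
            convert(payload),
            maintype="application",
            subtype="pdf",
            filename=name[:-4] + ".pdf",
        )
    return reply
```

The resulting message could then be handed to smtplib for delivery back to the sender.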
It's to be expected as Open XML is a straight transliteration of the DOC "dumps" to XML format.
I wonder how it ended this way: not enough time to properly develop and implement a more proper standard, or by design.
I feel it's both.
...Some people think it's fine that way. A friend of mine, quite pro-MS, told me that all those little strange things in the specification were normal for backwards compatibility, and that reading the specification was a waste of time. Instead, he directed me towards a preview of MS Office 2007. Because for him, as for many more, what's important is the final product, the cuteness of the buttons, the way it works and displays its own format. Why bother using a free program that displays Word documents badly, when Office is already perfect, huh? I feel so misunderstood sometimes. What makes me sad is that they don't see the use of a clear straight-to-the-point format. Maybe only geeks can be horrified by this one.
I wonder if you could get 60 people to review 100 pages each (or divide up chapters or sections in some logical manner). That may be feasible in 1 month. At least the glaring problems would be flagged. I have no idea how to organize this however.....
putting the 'B' in LGBTQ+
Wait a minute, I know this! This is just Phoenix Wright!
This is why we oldsters hate Microsoft. 25 YEARS of this.
That Andy Updegrove, who runs a law firm that works for IBM (which has a massive vested interest in making sure that there is one and only one XML word processing format (ODF)) would submit a story to /. pointing out his own article that is critical of Microsoft's XML word processing format.
Amazing. Who would have thought of something like that.
...when you can have oo-mox?
Chris Mattern
So it looks like the Open Source community is now debugging Microsoft's document format. I am sure Microsoft does not itself know what is going on in here half the time and much of this document was generated by code scrapers looking for structures and interfaces.
Congrats to the world community but they should really submit a bill to Microsoft.
"additional Microsoft technology that must be emulated (but is not covered by the Microsoft patent pledge); elements that can't be implemented without Microsoft technical assistance; dependencies on Windows itself; mandatory bugs; and more. And then there's also the fact that OOXML heavily overlaps ODF -- a platform-independent, already-adopted ISO/IEC."
Pretty much like everything they do.
Wait - where are the virus APIs? Did they leave those out?
Naah...
Gotta be there somewhere. Keep looking.
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Does the OO mean "Object Orientated" or "Open Office"? Microsoft always arrive at the poolside late and attempt to muddy the water and ruin it for everyone else. I propose Microsoft remove the ambiguity by renaming it OMXML, "Obnoxious Monopolist XML" which is still awful because the fact it uses XML is irrelevant. How would Microsoft feel if the open source community began prefixing all their work with "MS"? That might help us get more F/OSS past management, since they may be under the impression we are talking about Microsoft software. Hell it works for Microsoft!
Anyway, the contradiction is that Microsoft want to be seen to be involved, but they participate in community standards for all the wrong reasons. They submit technically unsound proposals and generally waste everyone's time. How is the naming of their proposed "standard" not intended to cause confusion and slow the adoption of ODF and OpenOffice.org?
(Content of
<microsoft_word_document>
(Content of
</microsoft_word_document>
...whether ISO has simply become a dumping ground for people simply wanting to market their stuff as standards (ECMA), or a real standards body.
As it is, there is not a snowball in hell's chance that OpenXML can become an ISO standard. It is simply a dump of the existing awful doc format into a nice incomprehensible 6000 page document, and it doesn't even use existing ISO standards. There's even a set group of banners and bullet points defined in there which can by no stretch of the imagination be called international.
I know Microsoft has managed to butter the ECMA up as their usual standards dumping ground, but I simply cannot see how they can get past the shortcomings in that article. To do so would be a huge amount of work (and Office 2007 is already using this format) and it would threaten their Office monopoly - which is what this obfuscation was about in the first place.
The biggest question to ask is not whether or not Microsoft provides access to the XML, nor whether Microsoft provides access to a schema, the question to ask is, "Is OOXML Truly Open Source?"
The biggest issue I have with the OOXML "standard" (and I use the word quite loosely) is that there are BLOBs (binary large objects) in the OOXML file created by Microsoft. In this BLOB is all the byte code used in the macros, etc. for the file in question (e.g. an Excel file). Since Microsoft has not provided proper instructions (whether it be a schema or source code) on how to read the BLOB containing this information and how to interpret it, I doubt this will ever pass as a true ISO standard, nor be truly accepted as open source (not to mention macros are still programmed using the Microsoft defined, and patented, VBA rather than using an open source standard such as JavaScript).
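You can see the opaque parts for yourself, since the file is a zip archive. A small sketch, assuming XML parts end in .xml or .rels and everything else is binary; binary_parts is a made-up name. In macro-enabled files the macro part is conventionally named vbaProject.bin.

```python
import io
import zipfile


def binary_parts(package_bytes):
    """List the parts of a zip-based package that are not XML.

    Hypothetical helper: it goes by file extension only and does not
    inspect the contents of any part.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [
            info.filename
            for info in package.infolist()
            if not info.is_dir()
            and not info.filename.lower().endswith((".xml", ".rels"))
        ]
```

Listing the parts is as far as this goes; decoding what is inside them is exactly the undocumented bit.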
Check out the article on Groklaw Searching for Openness in Microsoft's OOXML and Finding Contradictions for further comments. The article also has links to a couple of wiki pages with further comments.
Outside Office 2007, who would ever implement this "standard"?
If it's true, is anybody really surprised? This is MS after all.
There is another analysis on Groklaw.
Well, there's spam egg sausage and spam, that's not got much spam in it.
Here's to it not getting adopted, and I hope they kick it back out on the proprietary horse it rode in on.
> there's also the fact that OOXML heavily overlaps ODF -- a platform-independent, already-adopted ISO/IEC.
Couldn't the Microsoft people use the existing standard instead? That way everyone would be able to communicate. Someone should call to let them know about it.
Does the OO mean "Object Orientated" or "Open Office"?
It means neither. OOXML is shorthand for Opaque and Obfuscated eXception-based Markup Language. However, Marketing rejected the longhand name for the format because it didn't test well in developer focus groups. However, marketing found the shorthand OOXML appealing because psychologists have said the roundness of the O's induces a sense of calmness. BillG liked it because legislators could make an (incorrect) association between OOXML and OpenOffice.org (often abbreviated OO.o), and he hopes the confusion could lead to the inadvertent acceptance of MS' pseudo-open file format in government.
So OOXML stays but it officially stands for about as much as DVD does, which is nothing (or whatever you want it to stand for--it is all about "personal freedom" after all, so OOXML stands for what means the most to you ;-).
So, we've had an ODF "production ready" converter ready since May 2006. http://www.groklaw.net/article.php?story=20060504015438308
Where is it? Why won't Gary Edwards of the OpenDocument Foundation say anything?
Hey people, how about some promoting of ODF?
Oh go ahead and call Microsoft if you wish. But I bet they implemented their own phone standard which is incompatible with ours. Embrace, Extend, Exterminate. 'nuff said.
I think, therefore you are.
They already know that everyone is locked-in to their proprietary Office formats. This XML "standard" is not created as a real product that Microsoft hopes to promote. That would be a conflict of interest. Instead, they want to make sure that it is a gigantic convoluted spec that no one can implement. It's designed as a distraction. They want the spec to leave you with a feeling of disgust for open XML formats because:
1. You'll go back to your works-good-enough-for-me Office formats you've already been using (e.g. Word Doc).
2. If you're the typical uneducated business person, you'll get confused between OOXML and ODF and falsely believe that ODF is that bloated mess of a spec you believe you heard about from your fresh-out-of-high-school IT guy. Well he knows about computers, so that ODF (OOXML? Open Office XML? XML? Open Document?) thing must be a bad idea.
Not many people know much about Open Office, even many supposed "techs" in many businesses (at least in the U.S.). Microsoft wants to take advantage of their greater mind-share to control public opinion through their usual tactics of FUD and confusion. They want to make sure that the reputation among developers of XML as being a bloated exchange medium will work in their favor by amplifying that perception thereby killing off ODF and any chance of the industry adopting a common format.
Wikipedia has a good doc outlining the difference between OOXML and ODF:
http://en.wikipedia.org/wiki/Comparison_of_OpenDocument_and_Microsoft_XML_formats
:)
It may not be an ISO standard, but it's a heck of a lot better than the completely proprietary older formats.
How about a good "atta boy" for Microsoft at least?
Doc files don't parse as UTF-8. Try application/octet-stream?
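To illustrate: legacy .doc files are OLE compound files, and the signature they start with is already invalid UTF-8. A quick sketch; the helper names are made up for illustration.

```python
# Signature at the start of every OLE compound file, which is the
# container format used by legacy .doc files.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_legacy_doc(data):
    """True if the bytes start with the OLE compound file signature."""
    return data.startswith(OLE_MAGIC)


def is_utf8(data):
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

The byte 0xD0 opens a two-byte UTF-8 sequence, but 0xCF is not a valid continuation byte, so decoding fails at the very start of the file.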
Or do you mean better for msft?
The old formats are known. OpenOffice and AbiWord can read the old formats - as can older versions of ms-office. With this OOXML cr@p it's right back to square one.
Why is this still being called Oasis OpenDocument? Have you already forgotten that it's an ISO standard?
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
<microsoft_word_document>
<![CDATA[
(Content of
]]>
</microsoft_word_document>
Got to make sure it's valid XML.