Microsoft Word Document ML Schemas Published
Lars Munch writes "On Monday, 17 November, the XML schemas for the Word Document ML, along with documentation, were uploaded to the Infostructurebase (ISB). With the Word Document ML specification anybody can generate, view and process Microsoft Word documents on any format." (Here are the legal terms under which the schemas can be used.) "The Word Document ML is based on the W3C specification eXtensible Markup Language (XML), thereby providing documents that are easy to integrate into a large variety of systems. The Danish Government Infostructurebase is the first schema repository to make the schemas accessible to the public. The Microsoft Office Document ML schemas and documentation can now be downloaded from the ISB Repository." There are more links on this page.
It's NOT reasonable. They don't allow any modifications or derivatives of the schema without permission.
So, Microsoft will be free to continue changing their format with each new release, breaking all the open source programs for a time and costing users time and trouble to upgrade.
We don't like Word formats because they change frequently, and they are developed in a direction that suits Microsoft. How does this change anything?
This is America, damnit. Speak Spanish!
Microsoft is allowing you to license the patent free of charge but not to sublicense it. The GPL requires that you be allowed to sublicense patents applicable to GPLed software. And that's somehow Microsoft's fault?
I'll take this over having to reverse-engineer the specs and deal with potential IP issues. For once, Microsoft did us a favor, even if it does come with strings attached.
> They don't allow any modifications or
> derivatives of the schema without permission
Hm. I guess I'm not sure what would be gained by doing that - i.e., changing the spec and republishing it. Why would that be a good thing to do, even if you could?
> Microsoft will be free to continue
> changing their format with each new
> release, breaking all the open source
> programs for a time
Right... but couldn't the same be said of any API? I mean, if the Apache plugin API changes, I'll need to rewrite my mod_foo module to use the new API.
I already have the ability to save my word processing documents as XML. I already have the ability to transform them into other things I want. So do you. Check it out.
I'm sure someone, someplace is already working on the appropriate xslt to transform Microsoft's stuff into this more open format, and I'm sure Microsoft has some ace up their sleeve technically or legally to push it into a 'gray' area...
But I just cannot imagine anyone having the gall to say that my data is only available to me in a format that they control the terms and conditions on. How successful would a paper company be if they put 'terms and conditions' on the use of their wood pulp?
Why bother with proprietary file formats when you have DRM? Make a mendacious nod to 'open file format', and then lock stuff up behind the DMCA. If you want to read a DRM-encoded Word document, you'll need Word. Period.
--Lawrence Lessig for Congress!
Previously we could reverse engineer their format and use it. Their work was covered by copyright, which was no problem once we created our own implementation.
This schema is patented. Patents are an exclusive right to use an idea. Now if you use their format without upholding their conditions, you're a criminal, even if you figured out the format yourself.
By publishing the format, they can cast doubt on anyone that does reverse engineer it. "I bet you read the spec on line".
Also, being able to view the format isn't much use. It's XML, but that doesn't mean it will be meaningful cleartext. They can simply uuencode a big block of binary data, stick it between two tags, and it's valid XML.
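The point about opaque payloads can be illustrated with a minimal sketch. It uses Base64 rather than uuencode, because Base64 output never contains `<` or `&` and so needs no escaping. The element names `document` and `binaryData` are hypothetical and are not taken from the Word schemas.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary bytes standing in for an opaque, proprietary binary blob.
blob = bytes(range(256))

# Hypothetical element names, chosen only for illustration.
doc = ET.Element("document")
data = ET.SubElement(doc, "binaryData")
data.text = base64.b64encode(blob).decode("ascii")

xml_text = ET.tostring(doc, encoding="unicode")

# The result parses as well-formed XML ...
parsed = ET.fromstring(xml_text)
payload = parsed.find("binaryData").text

# ... but the payload is just encoded bytes; decoding it only gives back
# the original binary, whose internal layout the XML says nothing about.
recovered = base64.b64decode(payload)
```

A parser accepts this document without complaint, yet nothing in the markup describes what the bytes mean.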
Learn from the past. Microsoft are not here to do us favours.
>Hm. I guess I'm not sure what would be gained by doing that - i.e., changing the spec and republishing it. Why would that be a good thing to do, even if you could?
1) All specifications are incomplete. The requirements that it addresses today are not static, and in 10 years there will be new requirements.
2) Microsoft will change their XML schema.
3) Historically, Microsoft has done things that are in the interest of Microsoft. Everyone else must follow along.
4) Therefore, the changes that Microsoft will make to the XML schema have a high likelihood of being advantageous to Microsoft.
When Microsoft keeps all the real control of the format, it turns any open source developer into a sharecropper. We're going to be plowing a field that we don't own, and the price we pay is going to entrench the Microsoft format even further.
This is America, damnit. Speak Spanish!
That certainly is a nice pro-Microsoft spin you put on things, but perhaps you can explain the logic behind your statements. How did they "out-open-source" Open Source software? How can they be more open than what is already completely open?
I am still skeptical that Microsoft has truly made this open. Excuse me, but I don't just blindly accept what Microsoft says at face value. Microsoft has a serious credibility problem from lying about so much for so long. Even if Microsoft has finally caught up to the Open Source community regarding the openness of file formats, that helps OpenOffice and its users. It would make me feel even better about NOT spending hundreds of dollars on an office suite every few years.
Microsoft just cut our legs off over security issues? Do you think opening a Word file format just magically makes all of their security issues go away?
I saw some other Microsoft cheerleader congratulate Microsoft for "leapfrogging" Linux by finally providing a decent (remains to be seen) shell, but this person did not explain how this infant shell surpassed bash, pdksh, or zsh. Just because someone makes some wildly unsubstantiated claim about Microsoft's superiority does not make it true. Why should I believe this is anything more than PR and spin? I'm not convinced they have joined us, let alone beat us, at anything. Honestly, please explain your rationale.
Apart from the legal loopholes in Microsoft's license that are big enough to drive a truck through, much more worrisome is the fact that Microsoft asserts that they are getting a patent on an XML Schema. What is the novelty in that schema? It's a standard XML representation of well-known word processing data structures and concepts.
This would be a very bad precedent. Microsoft is really trying to push the limits of patentability and testing what they can get away with. Their patent application on .NET APIs is a similar trial balloon.
That is something open source and free software developers should really worry about.
Microsoft isn't going to give up the golden strength of a file format lock-in any time soon, even if they let companies use custom indexing tools on their store of documents (which is really what this whole XML business is about).
Unless I'm missing something, I think this does break the lock-in, in large part. With a published, standardized format, non-Microsoft tools can implement support for it, and users can expect it to work reliably. OpenOffice.org, for example, can probably support the new MS format simply by adding a pair of XSLT stylesheets (though they may want to take a different approach for performance).
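As a rough illustration of what such an import filter does, here is a minimal sketch in Python using the standard library's ElementTree instead of XSLT. It assumes the Word 2003 namespace `http://schemas.microsoft.com/office/word/2003/wordml` and the `w:p` / `w:r` / `w:t` paragraph, run and text elements. The sample document and the `text` / `para` output format are hypothetical, and a real filter would also have to handle styles, tables and much else.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the Word 2003 XML format.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hypothetical, heavily simplified sample document.
SAMPLE = """<?xml version="1.0"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world.</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second paragraph.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraphs(xml_text):
    """Return the plain text of each w:p paragraph, joining its w:t runs."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.iter("{%s}p" % W_NS):
        runs = [t.text or "" for t in p.iter("{%s}t" % W_NS)]
        result.append("".join(runs))
    return result


def to_simple_xml(xml_text):
    """Re-emit the paragraphs in a made-up, minimal target format."""
    out = ET.Element("text")
    for para in paragraphs(xml_text):
        ET.SubElement(out, "para").text = para
    return ET.tostring(out, encoding="unicode")


print(paragraphs(SAMPLE))
print(to_simple_xml(SAMPLE))
```

An export filter would be the same mapping in reverse, which is why the comment above speaks of a pair of stylesheets.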
This means that users of non-MS tools will be able to create documents, confident that MS Office users will be able to read them. There are still limitations going the other way, but that still means that non-MS tools only have to write import filters for the old Office formats, halving the work, and that really won't be an issue in the business world, where Office Pro is the norm anyway.
I think this move will prove painful for MS, but probably less painful than sticking with completely closed formats, given the way they've been getting beaten up about it.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.