Stephane Rodriguez Dismantles Open XML
Elektroschock writes "Stephane Rodriguez, a reengineering specialist who became popular for his article on MS Office 2007 binary data, now comprehensively debunks Microsoft's new Open XML format. With small case studies he demonstrates the impossible challenges third-party developers will face. His conclusion: it is 'defective by design.' Next week members of the International Standard Organization are likely to approve the format as a second official ISO standard for office documents, even though most nations have submitted comments. Rodriguez claims he is 'not affiliated to any pro-MS or anti-MS party/org[anization]/ass[ociation].'"
Isn't the ISO committee supposed to check all this before it becomes a standard? Do they test anything or just read through the spec?
Um, isn't the fact that not even Microsoft's own software can handle OOXML, which btw was designed by Microsoft themselves, proof enough that something is seriously wrong with the design of OOXML?
I mean if not even the maker of OOXML can get it to work properly in its own products, how are third parties supposed to do it? And if no one is able to implement OOXML correctly, what is this "standard" good for besides being a great smoke-and-mirrors tactic by Microsoft themselves?
Sent: Saturday, December 5 1998
To: Bob Muglia, Jon DeVann, Steven Sinofsky
Subject : Office rendering
One thing we have got to change in our strategy - allowing Office documents to be rendered very well by other peoples browsers is one of the most destructive things we could do to the company.
We have to stop putting any effort into this and make sure that Office documents very well depends on PROPRIETARY IE capabilities.
Anything else is suicide for our platform. This is a case where Office has to avoid doing something to destroy Windows.
I would be glad to explain at a greater length.
Likewise this love of DAV in Office/Exchange is a huge problem. I would also like to make sure people understand this as well.
I'm not saying this as some Linux nut job, but it's things like that which just drive me nuts. Regardless of whichever OS I prefer, that kind of thinking just boils my blood.
How can any committee deciding on open standards take seriously a company which has been proven time and time again to play by its own rules? Whenever it offers something labeled OPEN, it's about as open as the doors to Fort Knox are to the average person.
I tried to repeat the cell changes experiment but I do not see the Excel error.
I bet Mr. Stephane is not saving the sheet XML in UTF-8.
The header of the XML file says it's UTF-8, but he might be saving it without the UTF-8 BOM header.
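For anyone who wants to check the BOM theory on their own files, here is a minimal Python sketch. The worksheet string and the helper name `has_utf8_bom` are made up for illustration, and nothing here establishes whether Excel actually requires a BOM; it only shows how to tell the two encodings of the same XML apart.

```python
import codecs

# Hypothetical worksheet content, for illustration only.
SHEET_XML = '<?xml version="1.0" encoding="UTF-8"?>\n<worksheet/>\n'


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the 3-byte UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


without_bom = SHEET_XML.encode("utf-8")    # plain UTF-8, no BOM
with_bom = SHEET_XML.encode("utf-8-sig")   # same text with the BOM prepended

print(has_utf8_bom(without_bom))  # False
print(has_utf8_bom(with_bom))     # True

# Decoding with "utf-8-sig" accepts both forms and drops the BOM if present.
print(without_bom.decode("utf-8-sig") == with_bom.decode("utf-8-sig"))  # True
```

To test a real part extracted from a package, read it in binary mode and pass the bytes to `has_utf8_bom`.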
This "OpenXML" stunt is just a smokescreen covering Microsoft's controlled retreat in the office format battle. It only needs to keep parties distracted until Microsoft has reclaimed control over business content by means of vendor lock-in v2.0, aka Microsoft Office SharePoint Server.
http://weblog.infoworld.com/openresource/archives/2007/04/while_you_were.html
http://www.itbusinessedge.com/blogs/mia/?p=198
This is not proof of OOXML being defective by design. It only shows that apparently MS's software isn't able to handle OOXML properly.
OK, let's give MS their choice either way on this one.
If their office tools work well but are not using the OOXML spec, they must be using some other spec, perhaps MOOXML. In which case they are not OOXML compliant.
On the other hand, if they want to be OOXML compliant then I guess Redmond programmers can't read their own spec and thus are having problems being compliant.
Either way, and for whatever reason, Microsoft is not compliant with their own spec. Shall we call this MOOXML? And while I have only read a part of the spec, it is far too "undefined" and thus too ambiguous to be reliably used by itself. A standard needs to be defined well enough that two or more parties could take the standard document specifications, run off and program it from scratch, and have a reasonable chance that their code will interoperate on the same data sets.
Trouble is, if Microsoft cannot do that, how is anyone else?
But might I submit, Microsoft wrote Office and then wrote the spec. It's a poster child for why thinking about and writing the spec before the software is good practice.
No, this is a pretty reasonable thing to point out. It wasn't a value that was undisplayed. When you look at the cell it shows it (in decimal) as 1234.1234 (without the cell rounding). So it shows you that on the screen but doesn't store it properly in the XML file. I would say it's a problem. If it were stored as a binary floating point number in the XML I'd say you might have a point, but if it's displayed on the screen in decimal and then the decimal value in the file is different, that's pretty broken. And it's not just broken, it's now damned hard to work with. What happens if you pull the value from Excel using VBA and then try to change a value in the XML? They're not going to be the same.
Separating the value and the display solves the problem. As long as the value stored is preserved, other programs can work with it without introducing arbitrary changes. That M$ does not store the exact value and relies on the reader to make the same rounding error is crazy. It's a trap for every system that is not M$, and might not even work across different processors for M$.
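To make the mismatch concrete, here is a small Python sketch, assuming the stored number is an IEEE 754 double written out with 17 significant digits (an assumption about the writer, not something taken from the spec). The variable names are mine.

```python
typed = "1234.1234"                # what the user typed and sees in the cell
value = float(typed)               # nearest IEEE 754 double to that decimal

shortest = repr(value)             # shortest decimal string that round-trips
seventeen = format(value, ".17g")  # the same double with 17 significant digits

print(shortest)    # 1234.1234
print(seventeen)   # 1234.1233999999999

# Both strings parse back to the identical double...
print(float(shortest) == float(seventeen))  # True
# ...but a reader comparing text or decimal values sees two different numbers.
print(shortest == seventeen)                # False
```

So a reader only gets the on-screen value back if it parses the stored text as a double and then re-derives the short form, which is exactly the "make the same rounding" dependency complained about above.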
I've run into this problem in my own work, where it did not matter. A data acquisition system I used required Winblows. It could write to either text or some nasty binary format. I chose text with a sufficient number of digits to avoid the binary conversion. This blew up my file size, but made it easy to read. In my case, the extra digits were noise anyway and it only gets read once by other programs. In a bank this clearly would not work. In a place where the values must be read and saved multiple times, this would not work. As a programmer, I'm a relative zero but even I can see how broken the M$ way is.
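What "a sufficient number of digits" means for a double can be sketched in Python like this. The helper `survives_round_trip` is a made-up name, and the sample value is just a convenient one, not data from my acquisition system.

```python
def survives_round_trip(x: float, digits: int) -> bool:
    """Write x as text with the given significant digits and read it back."""
    text = format(x, f".{digits}g")
    return float(text) == x


x = 0.1 + 0.2  # a double that is not exactly 0.3

print(format(x, ".15g"))  # 0.3
print(format(x, ".17g"))  # 0.30000000000000004

print(survives_round_trip(x, 15))  # False: 15 digits changed the value
print(survives_round_trip(x, 17))  # True: 17 digits preserve the double
```

With 17 significant digits the binary value survives any number of read and save cycles, at the cost of longer and uglier text in the file.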
Value storage was only the beginning of OOXML problems. The formula and binary inclusions are even worse. Hopefully, ISO will reject this mess.
Friends don't help friends install M$ junk.
But that's still a problem. Microsoft's implementation becomes the de facto standard and all others must (attempt to) conform to the behavior of that implementation or be judged defective.
I wonder what happens if OOXML is not voted a standard. Will MS simply discard it, and embrace ODF, or will they continue to use .doc as if nothing happened? I guess they will do the latter since it's the most economical option for them. If that happens I'm curious what the EU will think of that, and how long it will take before MS is forced to use ODF as standard, if it ever comes to that.
-- Cheers!
I agree with all that, but I think you're missing something very fundamental: the purpose of a document format is to encode what the user did and what it means. This is the reason why the details of binary floating point arithmetic are irrelevant in this context, and their use in the file a flaw: if the user typed "1234.1234" in the document, the user meant 1234.1234, and the file had better guarantee me, the author of a program that reads it, that I can find out for sure that the user meant 1234.1234. The trivial way to do that, of course, is to store precisely what the user is shown on the screen, because it is the thing that the user manipulates until it looks right to them, i.e., until they judge that the thing they see on the screen is what they mean.
This doesn't apply just in spreadsheets, of course; it applies everywhere.
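A rough Python sketch of what I mean, using the standard `decimal` module. The `Cell` class is hypothetical and is not how any real spreadsheet or file format works; it only illustrates keeping the user's text as the source of truth and deriving the binary value on demand.

```python
from decimal import Decimal


class Cell:
    """Hypothetical cell that keeps the user's literal text as the source of truth."""

    def __init__(self, typed: str):
        self.typed = typed  # exactly what the user entered

    @property
    def exact(self) -> Decimal:
        # Exact decimal value of what the user typed.
        return Decimal(self.typed)

    @property
    def approx(self) -> float:
        # Binary approximation, computed only when arithmetic needs it.
        return float(self.typed)


cell = Cell("1234.1234")
print(cell.exact)                           # 1234.1234
print(cell.exact == Decimal("1234.1234"))   # True
# Converting the double back to a decimal exposes the binary rounding error.
print(Decimal(cell.approx) == cell.exact)   # False
```

A reader of such a file never has to guess which rounding the writer applied, because the decimal text is the value.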
Are you adequate?