Stephane Rodriguez Dismantles Open XML
Elektroschock writes "Stephane Rodriguez, a reengineering specialist who became popular for his article on MS Office 2007 binary data, now comprehensively debunks Microsoft's new Open XML format. With small case studies he demonstrates the impossible challenges third-party developers will face. His conclusion: it is 'defective by design.' Next week members of the International Standard Organization are likely to approve the format as a second official ISO standard for office documents, even though most nations have submitted comments. Rodriguez claims he is 'not affiliated to any pro-MS or anti-MS party/org[anization]/ass[ociation].'"
This is not proof of OOXML being defective by design. It only shows that apparently MS's software isn't able to handle OOXML properly.
-- Cheers!
Isn't the ISO committee supposed to check all this before it becomes a standard? Do they test anything or just read through the spec?
But that's still a problem. Microsoft's implementation becomes the de facto standard and all others must (attempt to) conform to the behavior of that implementation or be judged defective. This is what happened when MS published the MAPI (Mail API) spec and then released an implementation alongside it. Lotus and others could never fully mimic what the MS implementation did, so they eventually languished.
"by design" is of course about motivation which we can know in OOXML from emails, quotes, obtuse or brittle design, and lack of specification.
The document contains all of these. I suggest that you read it.
By the way -- there's newly discovered undocumented Microsoft tech present in OOXML, such as SSPI ("Security Support Provider Interface"), which is a proprietary Microsoft-developed protocol for security providers, and OLE ("Object Linking and Embedding"), which is for embedding (e.g., taking an Excel spreadsheet and putting it into a Word document). These are undefined in OOXML and only available on Microsoft Windows.
Um, isn't the fact that not even Microsoft's own software can handle OOXML, which btw is designed by Microsoft themselves, proof enough that something is seriously wrong with the design of OOXML?
I mean if not even the maker of OOXML can get it to work properly in its own products, how are third parties supposed to do it? And if no one is able to implement OOXML correctly, what is this "standard" good for besides being a great smoke-and-mirrors tactic by Microsoft themselves?
OOXML is a theoretically perfect standard that just happens to have no implementations whatsoever.
http://rocknerd.co.uk
Stéphane is a French male name. The female version is Stéphanie.
It's deliberate. The standard is just a distraction, to keep competitors busy trying to implement it, while documents are actually being created in the Office 2007 variant of OOXML. A few months of legacy almost guarantees a transition to the real OOXML would be an uphill battle, especially with no real documentation of how *either* format works. So even with a supposed 'standard' and a near-enough implementation, the vendor lockin is just as strong as it was with the binary formats.
Sam ty sig.
Sent: Saturday, December 5 1998
To: Bob Muglia, Jon DeVann, Steven Sinofsky
Subject : Office rendering
One thing we have got to change in our strategy - allowing Office documents to be rendered very well by other peoples browsers is one of the most destructive things we could do to the company.
We have to stop putting any effort into this and make sure that Office documents very well depends on PROPRIETARY IE capabilities.
Anything else is suicide for our platform. This is a case where Office has to avoid doing something to destroy Windows.
I would be glad to explain at a greater length.
Likewise this love of DAV in Office/Exchange is a huge problem. I would also like to make sure people understand this as well.
I'm not saying this as some Linux nut job, but it's things like that which just drive me nuts. Regardless of whichever OS I prefer, that kind of thinking just boils my blood.
How can any committee deciding on open standards seriously take a company which has been proven time and time again to play by its own rules, and whenever it offers something labeled OPEN it's about as open as the doors to Fort Knox are to the average person?
I don't believe OOXML should be a standard, but it seems to me to be pretty nit-picking to complain that numeric values are stored with "rounding errors" since that is inherent in converting between ASCII values and any binary format, including IEEE-standard floats. How does ODF handle this? It explicitly defines how the conversions are to be done? Or it caches the string the user typed?
Other than that, most of the other stuff he talks about is rather damning.
I tried to repeat the cell changes experiment but I do not see the Excel error.
I bet Mr. Stephane is not saving the sheet XML in UTF-8.
The header of the XML file says it's UTF-8, but he might be saving it without the UTF-8 BOM header.
This "OpenXML" stunt is just a smokescreen covering Microsoft's controlled retreat in the office format battle. It only needs to keep parties distracted until Microsoft has reclaimed control over business content by means of vendor lock-in v2.0, aka Microsoft Office SharePoint Server.
http://weblog.infoworld.com/openresource/archives/2007/04/while_you_were.html
http://www.itbusinessedge.com/blogs/mia/?p=198
How about everyone in /. email the people from your country and get all your friends to warn them about all the technical problems in the proposed standard!
http://www.iso.org/iso/en/aboutiso/isomembers/MemberCountryList.MemberCountryList/
You are correct.
That's why the title says "Microsoft Office XML Formats? Defective by design"
not "OOXML defective by design"
He is dissing Microsoft's claims of transparency and openness for Microsoft Office XML.
This is not proof of OOXML being defective by design. It only shows that apparently MS's software isn't able to handle OOXML properly.
OK, let's have MS have their choice either way on this one.
If their office tools work well but are not using the OOXML spec, they must be using some other spec, perhaps MOOXML. In which case they are not OOXML compliant.
On the other hand, if they want to be OOXML compliant then I guess Redmond programmers can't read their own spec and thus are having problems being compliant.
Either way, and for whatever reason, Microsoft is not compliant with their own spec. Shall we call this MOOXML? And while I have only read a part of the spec, it is far too "undefined" and thus ambiguous to be reliably used by itself. A standard needs to be defined enough that two or more parties could take the standard document specifications, run off and program it from scratch, and have a reasonable chance that their code will interoperate on the same data sets.
Trouble is, if Microsoft cannot do that, how is anyone else?
But might I submit, Microsoft wrote office and then wrote the spec. A poster child of why you think about and write the spec before the software is a good practice.
Still it's better than the original DOC format.
A DOC is actually a FAT12-like filesystem (called OLE) that has files and clusters. Clusters can be lost and files can be fragmented. One of the files is the document's text; it's not plaintext but rather another obscure binary format, with text chunks separated by some kind of metadata (my brain nearly exploded when trying to understand how to separate text from the metadata and I gave up). Images, videos and embedded objects are stored as separate files in the OLE file.
Instead of a simple *.zip file with an HTML-like text file they invented a completely fucked up format that gives people nightmares. The only point is making third-party compatible applications is extremely difficult, but the plan seems to have backfired because even Microsoft's own Word Mobile doesn't work well with native *.doc files (and ironically, Documents To Go for PalmOS works better with DOC than Word Mobile!)
The relevant code from an ODF spreadsheet:
<table:table-row table:style-name="ro1">
<table:table-cell/>
<table:table-cell office:value-type="float" office:value="123456.123456789">
<text:p>123456.12</text:p>
</table:table-cell>
</table:table-row>
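For anyone who wants to poke at this, here is a minimal Python sketch that reads both values out of the cell above. The `<root>` wrapper is hypothetical and only there so the fragment parses on its own, and the namespace URIs are assumed to be the usual ODF ones (the snippet as posted leaves the declarations out).

```python
import xml.etree.ElementTree as ET

# Assumed ODF namespace URIs; the posted fragment omits the declarations.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Hypothetical <root> wrapper so the row can be parsed standalone.
FRAGMENT = """
<root xmlns:table="{table}" xmlns:office="{office}" xmlns:text="{text}">
  <table:table-row table:style-name="ro1">
    <table:table-cell/>
    <table:table-cell office:value-type="float" office:value="123456.123456789">
      <text:p>123456.12</text:p>
    </table:table-cell>
  </table:table-row>
</root>
""".format(**NS)


def read_cells(xml_text):
    """Return (stored_value, displayed_text) for every cell that has a value."""
    root = ET.fromstring(xml_text)
    value_attr = "{%s}value" % NS["office"]
    cells = []
    for cell in root.iter("{%s}table-cell" % NS["table"]):
        stored = cell.get(value_attr)
        if stored is None:
            continue  # empty cell, nothing stored
        shown = cell.findtext("text:p", default="", namespaces=NS)
        cells.append((stored, shown))
    return cells


cells = read_cells(FRAGMENT)
print(cells)
```

The stored value and the displayed text come back as two separate strings, so a reader never has to guess one from the other.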
quite right, it's primarily a male name in French, but sometimes used as un prénom féminin.
AC sound like caveman.
Well, I suppose that's an improvement over Vista, "defective by nature." I can just imagine Bill Gates stamping his foot and crying. "Defective? I meant to do that!"
Kwisatz Haderach
Sell the spice to CHOAM
This Mahdi took Shaddam's Throne
The rounding errors given by the author in section 2 are wrong:
Entered 12345.12345, stored 12345.123449999999, claimed error o(1e-5)
The rounding error is 0.000000000001 = 1e-12
Though, the overall remark remains valid: the value should (also) be saved exactly as it was entered.
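The size of the error is easy to check with exact decimal arithmetic. A quick Python sketch, using the two values quoted above and nothing else:

```python
from decimal import Decimal

entered = Decimal("12345.12345")          # what the user typed
stored = Decimal("12345.123449999999")    # what the article reports in the XML

error = entered - stored        # absolute difference, computed exactly
relative = error / entered      # relative difference

print(error)
print(relative)
```

The absolute difference is one unit in the twelfth decimal place, which is many orders of magnitude below 1e-5.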
No, this is a pretty reasonable thing to point out. It wasn't a value that was undisplayed. When you look at the cell it shows it (in decimal) as 1234.1234 (without the cell rounding). So it shows you that on the screen but doesn't store it properly in the XML file. I would say it's a problem. If it were stored as a binary floating point number in the XML I'd say you might have a point, but if it's displayed on the screen in decimal and then the decimal value in the file is different, that's pretty broken. And it's not just broken, it's now damned hard to work with. What happens if you pull the value from Excel using VBA and then try to change a value in the XML? They're not going to be the same.
Oh nice! So you mean the W3C took it over?
STEPHANIE DE MONACO!!!!
But that is the surprise:
See
http://www.noooxml.org/
and
http://www.noooxml.org/arguments
The format is broken
Ms Office support of the format is broken
microsoft: Let's break ISO.
There is now an overwhelming literature on why OOXML should be refused acceptance as a standard by ISO.
The real question is why it is about to be endorsed as a standard. Is this just the power of money and influence?
It all beggars belief.
We all clearly benefit from international open standards. It's also clear that a central coordinating authority like ISO can expedite widespread adoption of such standards by virtue of being even-handed impartial experts on such matters. However, when ISO starts publishing so-called "standards" that are nothing more than paid-for advertising, the credibility of everything they do is called into question. If ISO is unable to withstand political and financial pressure, and we can no longer depend on them to impartially adjudicate the standards making process, then ISO has become irrelevant, at least as far as the IT industry is concerned. The ISO directors have a choice: they can either be paid shills, or a respected standards body - but not both.
If Office can't read OOXML files produced by other tools, and other tools can't read Office OOXML files, where do you suppose end users will place the blame?
And what do you suppose users will do when faced with incompatibilities?
It's a brilliant strategy: Define a new "standard" but don't quite implement it yourself, ensuring that no one can implement a competitive office suite that is compatible with yours. Further, make the standard complex and weird enough that you can always blame inconsistencies on the other implementations. Voila! You get to proclaim to the world that your de facto standard office suite supports an open, ISO-blessed international standard format -- but with no worries about losing your lock-in.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Stephane has been trying hard to win his 2,500 euros throughout this process, I'm guessing that this is just his final shot at winning the prize.
http://www.noooxml.org/kayak/
For example, the part about "Entered versus stored values" is certainly valid (though I wonder if that's not a problem with Excel itself, and not the format). The complaint about the date format is also on the money.
However, other things seem either wrong or have a bias towards hand editing of the files, e.g. "International, but US English first and foremost". He complains that it uses U.S. English settings. He may not like the U.S., but it's called picking a canonicalized format. Consider the alternative for implementing this in software, parsing of the values in the XML would now depend on settings also found in the XML. That would be insane.
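To make the canonical-format point concrete, here is a small Python sketch. The function names and the separator settings are made up for illustration; nothing here is taken from the spec.

```python
def parse_canonical(text):
    """Parse a number stored in one fixed form: '.' as decimal point, no grouping."""
    return float(text)


def parse_with_settings(text, decimal_sep, group_sep):
    """Parse a number whose meaning depends on separately stored settings."""
    return float(text.replace(group_sep, "").replace(decimal_sep, "."))


raw = "1.234"

# Same characters in the file, two different numbers depending on the settings.
us = parse_with_settings(raw, decimal_sep=".", group_sep=",")
de = parse_with_settings(raw, decimal_sep=",", group_sep=".")

print(us, de, parse_canonical(raw))
```

With a single canonical form the parser needs no outside information; with locale-dependent storage the same string is ambiguous until the settings have been read.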
Um, isn't the fact that not even Microsoft's own software can handle OOXML, which btw is designed by Microsoft themselves, proof enough that something is seriously wrong with the design of OOXML?
No, because the manually edited document did not conform to the standard. I read what the author did; basically he seems to think that a document conforming to a standard must 1) be easily modifiable by hand, and 2) be something you can open and start editing away without knowing the spec. In the end, he basically is saying "it's too complex to manually modify!!" Which is absurd; a program should be doing the modification, one built to adhere to the specification. Nothing about a standard says that it need work with Notepad just because it's based on XML.
I mean if not even the maker of OOXML can get it to work properly in its own products, how are third parties supposed to do it? And if no one is able to implement OOXML correctly, what is this "standard" good for besides being a great smoke-and-mirrors tactic by Microsoft themselves?
You assume his manual edits were done according to spec, which is not likely given that the spec seems pretty complex. I would say that Excel handles it fine, as long as you obey the specification. A third party will need to invest some time making a standard-compliant reader/writer, but once it's done, it's done.
It's not that you can't implement it properly; it's that doing so in Notepad will prove very time consuming and error prone. But that doesn't surprise me; it's meant to be done in an automated fashion anyway. May as well complain that you can't telnet to a DNS server and manually talk to it successfully.
And you don't know a fucking thing about floating point numbers. Try reading about how they work before bitching about differences that don't actually exist.
I seem to recall Lotus didn't like MAPI and wanted to push their own API called VIM? (http://en.wikipedia.org/wiki/Vendor_Independent_Messaging).
... and if the grandparent still doesn't get it, Stephane is basically the French & Spanish version of "Steven".
http://www.chmodoplusr.com/
"Sometimes" ? technically, yes, apparently (though that comes as a complete surprise to me).
:)
But to be fair, according to aufeminin.com, the highest ever (per year) is 68 individuals while it is 23320 for the male version. 1:343 is not "sometimes" in my book, it's a frigging random occurrence.
That is all she is refering not whole
I never heard Stephane as the Spanish version of Steven, usually it's translated as Esteban; partly because 'v' and 'b' sound the same in Spanish; and also speakers of the language have a phenomenal difficulty pronouncing 'st' alone at the beginning of a word, generally pronouncing it with a leading 'e' whether it is written or not.
Qxe4
This post in your journal is hilarious: http://slashdot.org/comments.pl?sid=251519&cid=19936709
:)
Is "Slam Dunk Networks" your competitor?
That's half right. The Spanish version is Esteban. For many more national variants, see: http://en.wikipedia.org/wiki/Steven.
So yeah, Stéphane Rodriguez has a French first name and a Spanish last name.
"..Next week members of the International Standard Organization are likely to approve the format as a second official ISO standard for office documents.."
Err.. Next week news called, they want their draft story back.
There is no certain outcome of next week's vote, and the fact that we are even discussing the defects of OOXML is proof that the ISO body will have many problems just waving this through. Please refrain from taking sides just because this is a 'Microsoft-standard'.
I'd say it's possible that OOXML will NOT be approved next week. It will probably have to take the long road through the ISO as a real standard proposal, not just a fast-tracked 6000 page gorilla.
"-Who said sit down?!"
-- S. Ballmer @ MSDC 2003.
Oh, I just love being schooled by AC's who don't know what they're talking about.
So, there are numbers that floating point formats do not represent well. However, the world is not floating point numbers. And computer math is not just floating point numbers.
The number is stored in the XML as an ASCII represented decimal real number. They're not stored as binary floating point numbers and they shouldn't have the kind of brain damageness that floating point has.
Let's look at what's going on here.
User enters a number in a decimal format. User sees the number in a decimal format displayed on the screen. Excel apparently does not use floating point, or it's got a lot of compensation, because if you do things like multiply 12345.12345 * 100000000 you get 1234512345000 and not some weird approximation. I would guess that the XML output routine is using floating point (and why would be a good question).
Why is this a problem? Well, we don't know how many digits of precision to work with here or how to round things. If I write an app to work with the spreadsheet I'd probably use something like a Java BigDecimal to handle the numbers. But I don't know how to round things out so that I get the right numbers. If I use a BigDecimal, 12345.123449999999 is going to be 12345.123449999999. If I multiply by 100000000 I will get 1234512344999.9999 instead of 1234512345000 as I would expect from looking at the values that were put into the spreadsheet.
Excel should be putting the proper values out in the XML or the standard should define the form of rounding/conversion to be applied.
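Here is the same experiment as a Python sketch, with `decimal.Decimal` standing in for Java's BigDecimal. The 15-significant-digit rounding in `to_significant` is an assumed policy picked for illustration; the whole complaint is that nothing tells a third party which policy to apply.

```python
from decimal import Decimal

FACTOR = Decimal(100000000)

as_entered = Decimal("12345.12345")         # what the user typed
as_stored = Decimal("12345.123449999999")   # what ends up in the XML

print(as_entered * FACTOR)
print(as_stored * FACTOR)


def to_significant(value, digits=15):
    """Round to `digits` significant digits (15 is an assumption, not from the spec)."""
    return Decimal(format(value, ".%de" % (digits - 1)))


repaired = to_significant(as_stored)
print(repaired == as_entered)
```

Taking the file at face value reproduces the mismatch exactly; guessing a rounding rule happens to repair this particular value, but a reader has no way to know that the guess is the right one in general.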
since Sun doesn't use Java on a single one of their internal projects (it's banned by policy)
Sources please?
I already know how this is going to turn out.
OOXML will be voted in as an ISO standard.
Third-party vendors trying to implement the "standard" will waste time, money and effort and accomplish nothing of import.
MS will continue as normal, claiming support for open standards while locking anyone they can into formats/software they own.
ODF will continue as a marginalized format used by people on the "fringe".
The point is, those values have to be stored in the form in which they were entered. Rounding errors are intolerable when a formula that treats the number as text might be used (which is not uncommon when doing sorts and lookups).
If I construct a spreadsheet in Excel that uses text(), concatenate(), and similar formulas, and I save it as a .xls and again in OOXML, I should be able to open the OOXML in any spreadsheet application that understands OOXML and all the cells would be identical with the .xls version.
But if TFA is true, even if I open the OOXML file with the Excel that created it, I will find that the values of the formulas that rely on text conversion will be different from the .xls version. Saving in OOXML of itself has corrupted the spreadsheet.
The speed with which Microsoft has developed OOXML, the volume of documentation that is used to describe it, and Microsoft's powerful efforts to keep it on a very fast track for adoption, with none of the usual revision processes, taken together strongly suggest that Microsoft is attempting to make a standard that won't work. That MS Office products fail in basic ways to meet this proposed standard is strong supporting evidence that Microsoft has designed OOXML to fail. That it is "defective by design" in every meaning of that phrase.
When you use MSExcel, you type in decimal numbers, represented by ASCII (ANSI?) characters.
You expect to get that stored exactly in the ANSI characters of the XML file.
And you can store IEEE floating point numbers exactly using ASCII characters.
(after all, you can code binary as a series of ASCII "0"s and "1"s)
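A short Python sketch of that last point: three plain-ASCII encodings of a double that all read back bit-for-bit. The sample value 0.1 is arbitrary.

```python
x = 0.1  # arbitrary sample value; not exactly representable in binary

as_hex = x.hex()             # exact hexadecimal text form of the double
as_repr = repr(x)            # shortest decimal string that reads back to the same double
as_17 = format(x, ".17g")    # 17 significant digits are always enough for a double

for text, parse in ((as_hex, float.fromhex), (as_repr, float), (as_17, float)):
    print(text, parse(text) == x)
```

None of the three forms loses information, so storing a binary double in a text file is not the problem in itself; the question is which decimal string gets written.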
The examples given by Rodriguez do indeed only prove that Microsoft's implementation sucks. Parent's assertion is correct.
;-)
On the other hand, a rather lengthy list of objections against the standard itself can be found here:
http://www.grokdoc.net/index.php/EOOXML_objections
So it seems that both the standard and its implementation suck
C - the footgun of programming languages
Well, duh!
No, I don't think so. It will serve Microsoft's purposes better if they too cannot properly implement the OOXML standard. Then their fully proprietary file formats would continue to be used since no one could trust that an OOXML document hasn't been corrupted by the OOXML save process.
This is how Microsoft destroyed the nascent RTF standard that the US Navy wanted to use: they implemented it, but gee there were problems in getting it to work right so maybe all you sailor boys should use Word's native file formats until we get things worked out (which never happened).
Windows just don't belong on a battleship or aircraft carrier. You would have thought the US Navy would have known that, but no, they had to go and try it anyway.
It should also be pointed out that many of his complaints would require application-specific extensions in ODF as well. For example, ODF doesn't define a way to encrypt documents or store filesystem metadata. The same goes where he talks about calculation chains and other aspects that have no equivalents in current ODF documents because of the lack of a spreadsheet formula definition, etc.
Basically, many of his arguments could be made about ODF too (though not all): since ODF doesn't provide a standard way to do those things, they would have to be application dependent.
If you need web hosting, you could do worse than here
Dude. You do realize that OpenOffice also has OLE and SSPI support, right? These are platform specific features, and any office product on Windows has to support them, or they won't be very popular.
You're not coming up with some kind of revelation. It's more of a "Duh, no shit sherlock".
No. What it means is that Office has so much legacy code that they can't rewrite it all to be conformant. Think of OOXML as a target that MS feels they can eventually meet with office, not necessarily what office will actually meet today. After all, much was changed in OOXML after Office 2007 went to bed. One would expect the next version of Office to be much closer to the spec, since they will have had a full design cycle to conform to it.
There was a lot of whining about MS formats not being open, so MS throws everyone a bone. For the most part, Excel does in fact follow the standard, because the standard is BASED ON EXCEL. Everybody knows that there's years of cruft in Excel. Why would you expect anything different?
If you want interoperability between spreadsheet packages, then it's up to each spreadsheet implementor to replicate all of those nasty rules, not the end users. Hell, if they've done an xls format conversion, they're probably halfway there. The difference is that the open format is, well, open (supposedly).
If you want to read values from an OOXML spreadsheet, then it looks like it works just fine, barring the wack rounding errors. If you want to write Excel files, and you're not an implementor, it would be simpler to just buy the doggone software and use .NET to do it. It's a business need, right? You may even be able to get away with grabbing Open Office and doing the same when they get their implementation going.
P.S. Invader Zim ROCKS
Did it ever occur to you that Office 2007 was finished before the OOXML spec was? Remember, there were many changes in the ECMA committee long after Office 2007 was finalized.
Separating the value and the display solves the problem. As long as the value stored is preserved, other programs can work with it without introducing arbitrary changes. That M$ does not store the exact value and relies on the reader to make the same rounding error is crazy. It's a trap for every system that is not M$, and might not even work across different processors for M$.
I've run into this problem in my own work, where it did not matter. A data acquisition system I used required Winblows. It could write to either text or some nasty binary format. I chose text with a sufficient number of digits to avoid the binary conversion. This blew up my file size, but made it easy to read. In my case, the extra digits were noise anyway and it only gets read once by other programs. In a bank this clearly would not work. In a place where the values must be read and saved multiple times, this would not work. As a programmer, I'm a relative zero but even I can see how broken the M$ way is.
Value storage was only the beginning of OOXML problems. The formula and binary inclusions are even worse. Hopefully, ISO will reject this mess.
Friends don't help friends install M$ junk.
Don't forget the delicious language. Instead of the legendary "syntax error", we now get a "catastrophic failure". Do it yourself FUD!
(Scene at office)
ComputerGuy: "Sure, let's open that with GoogleApps."
Colleague: "Why am I getting a catastrophic failure? Maybe I better use Excel."
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
Some of the complaints only indicate that MS sucks at implementing its own standard.
Other complaints are with the format itself, such as numerous different ways of marking up the same thing; dependencies hidden in various files instead of listed up front (forcing a parse of the entire zip file to make a trivial change); inclusion of proprietary, undocumented, or partially documented parts, like VML; inclusion of asinine legacy structure, like the way dates are improperly stored; and on and on.
Imagine this happening to some accounting software...
You have a good point. If OOXML is defective by design because it's not implemented by MS... despite not even being a released and solidified standard, that must mean Java (which Sun itself doesn't use) is defective by design. And likewise, teh Lunix must be defective, since they keep having to rewrite it and come out with new versions. Its high-profile failures are also proof of this, like how Munich's total Lunix conversion has been a disaster, and they've been trying since 2002.
I agree with you about everything except the part about Excel handling it fine. I'm using Excel 2007 and have many spreadsheets with large data tables. Moving the columns around in the table (something the old file format did flawlessly) very often results in the "Unreadable content error" the next time you open the file. An extremely frustrating thing to happen, because when it does it kills your data table.
Your mind looks a little cramped. Why don't you stretch it a little?
But that's still a problem. Microsoft's implementation becomes the de facto standard and all others must (attempt to) conform to the behavior of that implementation or be judged defective.
I wonder what happens if OOXML is not voted a standard. Will MS simply discard it and embrace ODF, or will they continue to use .doc as if nothing happened? I guess they will do the latter since it's the most economical option for them. If that happens I'm curious what the EU will think of that, and how long it will take before MS is forced to use ODF as standard, if it ever comes to that.
I always figured that, since an ODF file is basically zipped, an encrypted ODF is just an encrypted ZIP.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
Part of the reason it is designed that way is that back in the dark ages, when it mattered a bit more, it was faster than a simple zip with an HTML-like text file.
Nerd rage is the funniest rage.
Do not get me started on the ever decreasing competence level at Microsoft. One of my daily tasks is dealing with an MS-provided XML datastream. It's incredibly rigid and the protocol is devoid of even the most basic functions. This is no shock to me, having dealt with M$oft at the byte level since the early 90's. Ugh, why do I even waste time reading about M$oft stupidity.
As a mentor of mine once said: "Microsoft makes money, in spite of itself".
If you want to beat Microsoft at the Office game, build a better word processor. All this nonsense over document formats is just a weak attempt by FOSS to compete through legislation where you fail to compete on merit.
Get off your ass and turn Open Office into something sexy that looks, feels and runs on par with Microsoft Office and then let the market decide. You have Google behind you. You have IBM behind you.
BTW. I use Office 2007. I think Open Office is garbage compared to MS Office. I don't use the OOXML document formats as most people in the business world need the old format.
Stephane has for a long time presented a weak case against Office Open XML.
"1) Self-exploding spreadsheets"
His top issue "1) Self-exploding spreadsheets" has been discussed on Brian Jones' weblog:
http://blogs.msdn.com/brian_jones/archive/2007/08/15/why-there-s-no-microsoft-in-open-xml.aspx
It boils down to this: the fact that it is XML does not mean that you can modify it in any way you want. There are rules for modifying the schema, and Mr. Stephane is not happy with that. Had he followed the actual rules he would have had no issue.
This is a case where two locations must be updated per the spec. He can avoid updating the two locations by removing the calcChain.xml file (which is optional, and which Excel will reconstruct). He later gets upset because, if he does that, he claims performance on load will be slower.
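As a rough illustration of the "two locations" problem, here is a Python sketch that checks a hand-edited sheet against a calculation chain. The XML is a deliberately simplified, namespace-free stand-in (real files carry namespaces and more attributes), and the assumption that the trouble comes from a chain entry pointing at a cell that no longer holds a formula is mine, not something confirmed by the spec or the article.

```python
import xml.etree.ElementTree as ET

# Simplified stand-ins for a worksheet part and the calculation-chain part.
SHEET = """
<worksheet><sheetData>
  <row r="1"><c r="A1"><v>2</v></c><c r="B1"><f>A1*2</f><v>4</v></c></row>
  <row r="2"><c r="B2"><f>B1+1</f><v>5</v></c></row>
</sheetData></worksheet>
"""
CALC_CHAIN = '<calcChain><c r="B1"/><c r="B2"/></calcChain>'


def formula_cells(sheet_xml):
    """Cell references that carry a formula (<f>) in the sheet."""
    root = ET.fromstring(sheet_xml)
    return {c.get("r") for c in root.iter("c") if c.find("f") is not None}


def chained_cells(chain_xml):
    """Cell references listed in the calculation chain."""
    return {c.get("r") for c in ET.fromstring(chain_xml).iter("c")}


def stale_entries(sheet_xml, chain_xml):
    """Chain entries that point at cells which no longer hold a formula."""
    return chained_cells(chain_xml) - formula_cells(sheet_xml)


# Hand edit: turn B2 into a plain value but forget to touch the chain.
edited = SHEET.replace("<f>B1+1</f>", "")

print(stale_entries(SHEET, CALC_CHAIN))
print(stale_entries(edited, CALC_CHAIN))
```

A writer that edits formulas has to either keep both parts in sync or drop the chain part and let the application rebuild it.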
"2) Entered versus stored values"
His second point, "2) Entered versus stored values", is an interesting distinction between entered values and stored values. It reflects the way that Excel works (and so does Gnumeric) by storing the values instead of the data that was entered by the user. This responds to the need of the spreadsheet to do something interesting with the data; for example, when you enter a date, it is stored as a number with a format applied, not as a string. This allows computations on dates to happen based on the underlying numeric value. The feature is used extensively by spreadsheets.
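A minimal Python sketch of the date-as-number idea. It assumes Excel's default "1900 date system", in which day serials effectively count from 1899-12-30 for any date after February 1900; the helper names are made up.

```python
from datetime import date, timedelta

# Assumed convention: Excel's default 1900 date system. Serials after the
# fictitious 1900-02-29 line up with a 1899-12-30 epoch.
EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert a whole-day serial number to a calendar date."""
    return EPOCH + timedelta(days=serial)


def date_to_serial(d):
    """Convert a calendar date to its whole-day serial number."""
    return (d - EPOCH).days


start = date_to_serial(date(2007, 1, 1))
end = date_to_serial(date(2007, 9, 1))

# Date arithmetic becomes plain subtraction on the stored numbers.
print(start, end, end - start)
```

The stored number is what formulas operate on; the date the user sees is just that number with a format applied.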
In the Excel/gnumeric case you have to generate a single value, in the ODF case you must generate and update the two values (which just a point before, Stephane was having a seizure about).
The precision issue that he brings up, I suspect is merely an issue with double format precision. He claims that the data is unusable and there is a loss of precision, but handing that out to a C compiler will produce the expected result with no loss of precision. I do not know how "atof" or the compiler work internally to cope with this issue, but at least my libc/gcc combo does not have this problem.
I would not be surprised if this is an artifact of floating point, someone with more background on doubles and floating point math could probably answer the question with more authority, but a cursory read of "What Every Computer Scientist Should Know about Floating Point" seems to validate that there is no error in the floating point representation for the values that he uses: http://docs.sun.com/source/806-3568/ncg_goldberg.html
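The claim is easy to test without a C compiler. A Python sketch (Python's `float` is an IEEE 754 double on the usual platforms, which is the assumption here) that parses both strings and compares the raw bit patterns:

```python
import struct


def bits(x):
    """Return the 64-bit IEEE 754 pattern of a Python float as hex text."""
    return struct.pack(">d", x).hex()


typed = float("12345.12345")             # the value as entered
from_file = float("12345.123449999999")  # the value as written to the XML

print(bits(typed), bits(from_file))
print(typed == from_file)
```

Both strings parse to the same double, so a reader that works in binary floating point loses nothing; the difference only shows up for a reader that treats the text as an exact decimal.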
3) Optimization artefacts become a feature instead of an embarrassment
His 3rd point is open for debate, like the 1st case, we have a case where he has to handle things differently. Stephane sells a commercial product to handle Excel files and I suspect that his product has to cope with the same patterns in different ways, which has naturally upset him. OOXML might be inspired by Excel's needs, but it does not mean that it has to be a 1-to-1 match.
4) VML isn't XML
VML is labeled as "deprecated" in the OOXML documentation (Section 8.6.2, page 25) and it states: "The VML format is a legacy format originally introduced with Office 2000 and is included and fully defined in this Standard for backwards compatibility reasons. The DrawingML format is a newer and richer format created with the goal of eventually replacing any uses of VML in the Office Open XML formats. VML should be considered a deprecated format included in Office Open XML for legacy reasons only and new applications that need a file format for drawings are strongly encouraged to use preferentially DrawingML."
So the standard basically says "VML is still in use, but it's better to use DrawingML". Stephane misconstrues the above statement and tries to portray this as evil
but Office can read OOXML files produced by other tools; You just have to generate proper files.
As is pointed out in this thread:
http://blogs.msdn.com/brian_jones/archive/2007/08
Stephane basically wanted a shortcut, and gets upset that he cannot use a shortcut. This is equivalent to complaining that a web browser won't display things properly when you feed it invalid CSS.
In addition, Excel happens to recover nicely from the lack of data that Stephane complains so loudly about, you just happen to get a warning if the file you feed it happens to be incorrectly formed and even offers you an option to "repair" it.
...if he didn't use such emotive terms as "exploding," and "minefield." It really doesn't help him sound objective when the topic he's talking about is file formats for office software, rather than undetonated mines which is what, from that wording, you might be expecting him to be talking about with such language.
Of course, it's no accident that he doesn't sound objective...it's because he isn't. I have no problem whatsoever believing that Microsoft's file standard proposals are more than likely harmful, given their track record, but I'd prefer to read an account of such from someone who doesn't sound quite so strongly like a full member of the cult of Richard Stallman.
One other thing I really wish FSF cultists could do is actually come up with your own terminology for things, rather than simply parroting your leader's loaded language ad nauseam.
If, as people have said, it's "disingenuous" of me to refer to the FSF as a cult, then maybe it's also equally disingenuous of the group's supporters to keep acting so much like cult members. You know the old saying...
"If it walks like a duck, and quacks like a duck..."
Seeing that MS controls the platform and the platform calls MAPI it seems like a silly battle to fight.
To go back and reiterate darkatom's comment: Microsoft has always taken 'standards' and extended them to break everyone else's version except theirs. Nothing has ever stopped them except a court order (like JAVA, maybe...) but if they don't dominate and control they always try to take their ball and go home. ("I'll see your JAVA and raise you an Active-X (who cares if it makes using the web uncontrollably dangerous!)")
What an irony that Microsoft is having to embrace and extend its own file format
Microsoft is manipulating many members of the ISO technical committee into voting 'yes', and even rigging the voting process in several countries.
The voting process is 'defective by design' as votes from each country must be unanimous. Microsoft is a member in most (if not all) countries and will always vote 'yes'. This means that the vote can only be 'yes' or (when no unanimous vote can be reached) 'abstain'. All other votes will be declared invalid, and only the 'yes' votes count. Still believe the outcome will be no?!?
To Terminate, or not to Terminate, that's the question - SCSIROB
I agree with all that, but I think you're missing something very fundamental: the purpose of a document format is to encode what the user did and what it means. This is the reason why the details of binary floating point arithmetic are irrelevant in this context, and their use in the file a flaw: if the user typed "1234.1234" in the document, the user meant 1234.1234, and the file had better guarantee me, as the author of a program that reads it, that I can find out for sure that the user meant 1234.1234. The trivial way to do that, of course, is to store precisely what the user is shown on the screen, because it is the thing that the user manipulates until it looks right to them, i.e., until they judge that the thing that they see on the screen is what they mean.
This doesn't apply just in spreadsheets, of course; it applies everywhere.
Are you adequate?
Yep. Brilliant, isn't it. Given a horribly complex and incomplete specification, Microsoft can easily blame any problems on the other tools -- and they can do this with a straight face because they'll be right! (Quietly ignoring the fact that their own tool produces non-compliant OOXML). Even better, they can smugly point out how their tools fix the "errors" caused by other crappy tools, even as the text of their messages frightens users away from trying any tool that doesn't come from Microsoft ("catastrophic failure", no less!).
If MS weren't trying to pull a fast one, they'd have designed a more reasonable format, one that does make it practical to make small edits to the XML and expect reasonable results or, even better, used an existing standard like ODF. If ODF can't fully represent all facets of Office documents, the format has a well-defined technical and procedural path to add any necessary extensions.
By way of comparison, try the same series of experiments with a .ods document, using any of the handful of available applications that supports it, and you'll quickly see how a format that is designed to be straightforward, accessible and specifiable in less than 500 pages compares to the brilliantly-executed monstrosity that is OOXML.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
The motives of the MS critic run the gamut.
Some are pro-Linux as you say.
Others are pro-BSD.
Still others are equally proprietary, from the OS X community for example.
What unites them? The ABM Treaty: Anything But Microsoft.
What part of "level playing field" and "conformance to standards developed in the traditional manner" (e.g. not the OOXML SUV-flanking movement) seems odd, biased, or unreasonable, Your Anonymity?
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
-Q(1) What does Rodriguez's article show?
-Q(2) is OOXML in and by itself flawed?
-Q(3) What's the practical relevance of the question whether OOXML is flawed?
-Q(4) So what's in it for Microsoft? Why do they bother?
-
- Q(1) : What does Rodriguez's article show?
- A(1) : Rodriguez's article shows that the OOXML format written by the latest Microsoft Office applications, among them MS Excel, is:
- sorely defective in that you can't be sure to get your original data back after saving it to OOXML
- impossible to change outside MS Office applications
- tied to the MS Office way of representing internationalised versions of documents because "of the way Microsoft chose to store XML using the US English locale, no matter how good your implementation is, you have to retrofit it to work just like Office does" in order to accommodate internationalised documents
- burdened with MS Office legacy formats that are supported throughout, greatly (and unnecessarily) contributing to the size and complexity of the 6,000 page standard.
- Q(2): Is OOXML flawed in and by itself?
- A(2) : Yes, I think so, partly because of Rodriguez's article, partly because of flaws documented elsewhere: see http://www.noooxml.org/petition The points 2,3,4,5 listed there seem especially crippling to me:
(2) There is no provable implementation of the OOXML specification: Microsoft Office 2007 produces a special version of OOXML, not a file format which complies with the OOXML specification;
(3) There is information missing from the specification document, for example how to do an autoSpaceLikeWord95 or useWord97LineBreakRules;
(4) More than 10% of the examples mentioned in the proposed standard do not validate as XML;
(5) There is no guarantee that anybody can write software that fully or partially implements the OOXML specification without being liable to patent lawsuits or patent license fees by Microsoft;
- Q(3): What's the practical relevance of the question whether OOXML is flawed?
- A(3): Enormous. We currently see that Microsoft is trying to convince the world to accept OOXML as an ISO "standard", whereas it's no such thing. It's too loosely defined, and as opposed to the existing OpenDocument standard there is no open-source reference implementation. So there will be a morass of possible implementations, of which only Microsoft's own implementations will be guaranteed mutually compatible. That's a polite way of saying that Microsoft simply aims at continuing its format lock-in, only this time under the name of OOXML.
- Q(4) : So what's in it for Microsoft? Why do they bother?
- A(4) : Well ... Microsoft has a policy whereby it quite explicitly does not want other people's software, let alone Open Source software, to render MS Office documents correctly.
For reference, see this email, (cited from Rodriguez's article):
Is that
My guess is, yes, it occurred to the poster you were responding to, since I highly doubt that when he wrote exactly that, it was in his sleep. Did it occur to you that reading his post all the way to the end might have resulted in slightly less of your foot being inserted into your mouth?
San Francisco values: compassion, tolerance, respect, intelligence
"A pro/anti-MS ass.....", ha, nice try
just wonder why there are so many anonymous cowards in this world....
That's not the only way to interpret his point. A more charitable reading would be the following: he's arguing that the standard, because it's based too closely on the legacy representations and algorithms used by Excel, is more complicated than it ought to be.
Of course, if that's what he's doing, then he certainly ought to say so more clearly, and even better, explain how the whole thing could be simplified, or illustrate some other document format that does it better.
Are you adequate?
Did I just eat too many syrup coated waffles? He's telling me the rounding error is 10^-4 or 10^-5 on values with more trailing nines than I can count between sugary blinks. Not long ago I came across a slide presentation from a HEP lab concerning C++0x with a slide proclaiming that decimal floating point in hardware was the wave of the future. Now while I don't see any numerical advantage to this change, it will probably reduce the number of floating point gurus who gouge their own eyes out after rubbing shoulders with the ULP-retarded hoi polloi.
Here he comes to save the day! It's WonderMiguel. Always ready to come to the defense of Microsoft.
Otherwise, horrible things could happen, like ODF could be used instead, or it could be extended to include stuff in OOXML and then the world would have one unified standard, instead of two of them even experts can't use that are not interoperable. We couldn't have that.
So the entire FOSS world wishes to thank Miguel for helping Microsoft keep its users locked in. Hey, man, what kind of game are you playing?
It only shows that apparently MS's software isn't able to handle OOXML properly.
The question is whether they have any intention of supporting it "properly".
I say the answer is a big "no". Their XML is just a thin ASCII veneer applied to their existing format.
The only reason for making OOXML was political; they never had any intention of it being useful to anybody except Microsoft.
Users of OOXML will be just as locked in to Office as if they kept right on using the old binary format.
No sig today...
I don't think you intended it that way, but you should be aware of the vast number of people you just insulted. US English and US dates are only "canonical" in the minds of US citizens. If not for Microsoft purposely and determinedly screwing up the implementation of anything but US standards in their software the usage would have no traction at all.
The majority of the "English speaking" world still uses the English language and English formats and standards, not US variant ones. The fact that the USA has seen fit to re-invent English, still refer to that as English, and then foist it on the rest of the world doesn't make it "canonical."
As the author of this article so aptly describes, date formats and language implementations are a multi-stage nightmare in Office. To the point that the majority of users even in English speaking countries like Canada, Australia, New Zealand and the UK itself, often end up using American English and American dates simply because Office is the only game in town and you can only bash your head against the wall on these things for so long. That doesn't make it right, and that doesn't mean that those users wouldn't be happier and more productive if they were not forced to use a US standard when they may have not even traveled to the US.
Any kind of English except the US variant, is severely broken in Office and always has been. Your answer sounds to me a lot like: "So what, they should all be using our standards and language anyway." Not helpful at all, and illogical as well.
Open Office also has support for .doc and .xls files. That doesn't mean either of those are well documented or not proprietary.
From the article:
The unfortunate consequence is being unable to know whether a part relates or not to another part makes it impossible to know, when you delete a part, if you are going to corrupt the document or not.
If this is not a proof of a defective design, then what defective design is?
I've seen this show before. A horrible standard (XML or otherwise) can have repercussions that transcend the evil acts of one company, but it can also backfire on them simultaneously. That company may have an edge if the format takes off, but usually, the way it ends up is with that company supporting its proprietary format under two names: the old proprietary name, and a "public" name. Same crap, new umbrella.
I've been burned by a poorly written XML specification before, which was tightly coupled to one vendor's internal implementation and was heavily promoted by them. Lots of people in the sector were thirsty for a common standard to process experimental data, and an expert group was formed to come up with a common XML format for interoperability- and they jumped on this one. People across the industry wasted a lot of time trying to get it to work. I myself almost developed an ulcer. It was completely unusable- it could simply not be parsed or encoded to/from anything useful, without the involvement of continuous human intervention requiring real-world expert knowledge at every tag. (An XML DOM tree effectively isolated from any meaningful context is not "useful".) Unless you could write code that had a PhD and several years of experience in experiment design and lab work, you couldn't create a valid document. You had to know what to do with a tag reference that would mean "control signal", and another that referred to a standardized enumeration of the types of bedding used in rodent cages. It was ridiculous.
Everyone gave up on it after wasting a lot of time and the "experts" were reduced to sending out emails to software manufacturers, research groups at universities, and device manufacturers, pleading with them to use this standard. It was embarrassing to watch.
This turkey still presents a political roadblock because a lot of MBA types fall for the continued marketing from the guilty company (which benefits by having competitors waste time and resources), the continued reluctance of the experts group to admit to their mistake, and the popular perception of XML as being inherently readable. Anyone who actually has to get work done is still stuck with tab delimited text annotated with English. At least you can create a valid document.
Yes, and Microsoft has every right to do whatever the hell they want to do with their own damn proprietary OOXML format.
I believe it is the part where Microsoft is pretending that OOXML is an open standard and pretending that it is what is being implemented in Office 2007 that people are calling MS on.
Don't run from the real point: the only way to properly implement OOXML is to use MS Office. Any other method, and you can never implement the whole specification.
As was explained in A Beautiful Mind, there's no point everyone hitting up the hot blond who is going to reject you anyway, when your failure to achieve your first objective then compromises your chances to succeed with a second objective. Does anyone here think that a total annihilation of Open XML is in the cards? The point of that scene is not that it's a particularly good expression of Nash equilibrium, but that even a blond can understand it, which indirectly serves as a good example of settling for second best, when best is not in the cards to begin with.
Maybe a better plan would be to hit on the brunette. Microsoft has a long and sordid legal history concerning the display of scare boxes. This gratuitous bonking behaviour goes back at least as far as DRDOS. I've lost count of the number of cases since. It's not like they have much cred to say "we don't do that" or even "we don't do that anymore". Microsoft is like a drug addict where everyone has come to accept that "not any more" translates to "not since the last time".
How about as a condition for accepting Open XML, Microsoft is required to provide BSD licenced source code that scans an Open XML document for every possible defect that any version of any Microsoft software might display to the user with even the faintest whiff that anything is not entirely right with the OOXML document being processed.
Before Open XML can be regarded as an open standard in any significant sense, Microsoft needs to be deprived of their privileged position with regard to validation of Open XML and ability to taint the minds of the users with fear, uncertainty, and doubt.
The complexity of the open source validation suite would itself raise eyebrows concerning the purported openness of the Open XML standard. It is quite likely that Microsoft would refuse to go along with this proposal. That wouldn't reflect well on their intent either.
I'm putting this forward as an example of negotiating by contract: the open source implementation of the validation suite would serve as the contract by which Microsoft agrees on what manipulations of an Open XML might potentially sterilize your children, and what will certainly not, nor be purported as such, except to Microsoft's corporate liability. Personally, I could live with that.
"... Sun doesn't use Java on a single one of their internal projects (it's banned by policy)."
I've heard that too, but I don't have a link. Can anyone help?
From a recent comment: My understanding is that Sun does not allow its own programmers to use Java for important programs because Java is bytecode interpreted, not compiled. That makes Java easy to de-compile. Sun apparently designed the language for other people to use. Microsoft did the same with C#; apparently none of the programs Microsoft sells are written in C#.
Examples of Java de-compilers:
Jad - the fast JAva Decompiler
DJ Java Decompiler
Jode
JReversePro
SourceTec Java Decompiler
I think Sun and Microsoft are far more destructive to the computer world than anyone has analyzed thoroughly. This XML thing is just one example.
Back in 99 or 2000 or so, I could say, with a fair amount of confidence, that the word processor I used (I think it was AbiWord) was far superior to MS Word for that purpose. It was lightning-fast to open and run, had quite a lot of features Word didn't, was free (and a very small download), and often supported old Word formats better than Word itself.
But it wasn't completely compatible. It couldn't actually save a Word document, so it would save to RTF when people told it to, so they would stop getting requests for that functionality. And if you were to ever show anyone one single thing out of place when you imported their document, they wouldn't touch it with a 10 foot pole.
Now, I admit OpenOffice is huge and bloated, and in some ways is behind Word -- although it is ahead in other ways that I doubt you notice, since it sounds like you haven't given it a try. But we've learned our lesson.
The fact is, as long as there isn't a standard format, you cannot compete on merit. Linux was better than Windows in just about every way for a very long time. Windows XP started to make it less of an issue, and Vista finally implements Sudo properly, but Linux and OS X are still better than Windows in quite a lot of visible ways. But since there's no easy, standard way to write portable apps -- and Microsoft has actively tried to kill attempts like Java -- people use Windows because they have apps that need Windows because they use Windows.
Now, you're partly right -- Firefox is where it is because IE 6 sucked so much for so long that just about anything would be better. But it also exists because there was at least an attempt at a standard of HTML, JavaScript, CSS, and so on. If there had only been "Microsoft BinaryML", the Internet would have been Windows only.
Don't thank God, thank a doctor!
He wanted to remove a formula from a given cell. His first attempt was to simply remove the formula and change the value.
Instead, he has to go update all the reference and dependency information, which programs have to generate and update all the time anyway. I can't really think of a good reason this information needs to be saved to disk, and I certainly can't think of a good reason that Excel deletes the cell, rather than updating the dependencies itself to reflect the physical document.
In fact, I can't think of a good reason to store the value alongside the formula, except as an optional cache, which a program can recalculate if needed.
They are using XML in the first place. The point of XML is interoperability and human-readability/editability, not performance.
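The "optional cache" idea can be sketched in a few lines of Python. This is a toy model, not how Excel or any real spreadsheet engine is built: a cell is either a literal number or a formula (here a hypothetical callable that receives a lookup function), and values are recomputed on demand rather than trusted from the file.

```python
def evaluate(cells, name):
    """Return the value of a cell, recomputing formulas instead of trusting a cache."""
    cell = cells[name]
    if callable(cell):
        # A formula reaches the cells it depends on through the lookup function.
        return cell(lambda ref: evaluate(cells, ref))
    return cell

cells = {
    "A1": 2.0,
    "A2": 3.0,
    "A3": lambda get: get("A1") + get("A2"),  # formula: A1 + A2
}

total_before = evaluate(cells, "A3")

# "Removing the formula" is then a one-line edit: replace it with a literal.
cells["A3"] = 10.0
total_after = evaluate(cells, "A3")
```

In this model nothing else has to be updated when the formula disappears, because the dependency information is derived at load time instead of being stored next to the data.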
Don't thank God, thank a doctor!
Only an extremely poor programmer could not understand that it's possible to represent and work with numbers of arbitrary complexity. IEEE floats are fast and easy, but that doesn't mean it's impossible to represent a number that can't be fit into an IEEE float.
We've had lots of well tested relatively fast bignum libraries for years. Introducing rounding errors in a spreadsheet without being explicitly told by the user that such errors are allowed is absolutely unacceptable.
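As a small illustration of the alternative, Python's standard `decimal` module keeps typed decimal values exact where binary doubles do not. This only shows the principle; it says nothing about what any particular spreadsheet does internally.

```python
from decimal import Decimal

# Binary doubles: 0.1 and 0.2 are already approximations before they are added.
float_sum = 0.1 + 0.2
float_matches = (float_sum == 0.3)

# Decimal arithmetic on the text the user typed stays exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = (decimal_sum == Decimal("0.3"))
as_typed = str(decimal_sum)
```

The trade-off is speed: decimal arithmetic is done in software, while IEEE doubles run in hardware.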
> But that's still a problem. Microsoft's implementation becomes the de facto standard
> and all others must (attempt to) conform to the behavior of that implementation or
> be judged defective.
It's worse than that. Since MS defines a number of aspects of the specification solely
in terms of compliance with MS application software, the MS implementation is not only
the -defacto- standard, but the very explicit standard. Not only can no one conform
to a sufficient level to be judged compliant in the marketplace, for all contractual
specifications, -nothing- but MS software can -ever- be 100% compliant.
This means on big, contract driven projects, such as many government projects, MS
and vendors using MS tools are effectively the only possible competitors, unless
the contracts and specifications specifically waive vendor compliance with those
parts of the spec.
And I strongly doubt anyone would ever write a contract like that.
Call me crazy, but unlike Bush I do not divide the world into "them" and "us". I like to live in a world of colors, a world of Pantone if you will, and abandon the black and white mentality.
There are good and bad things about Microsoft. When they do something bad, I point it out, when they do something good, I do not see why I would not point it out. I also try to judge everyone with the same metric, I do not use one metric to judge Microsoft, and another one for us.
Stephane's article touches on a subject that I have plenty of experience on (I originally wrote Gnumeric, and later worked with Sun to open source StarOffice and over the years worked to grow the OOo team at Ximian and later at Novell).
Stephane's criticism lacks meat. If someone had done a review of Linux with this level of quality, we would have rightfully called it bullshit.
Miguel.
I dislike OOXML by default. OOXML was approved in Portugal, using faulty play. The committee was practically entirely made of Microsoft business partners and friends. The chairman of the committee was from Microsoft. Sun and IBM weren't allowed to participate because "there was no room", as they "only had 20 seats". The first meeting had 24/25 participants, one of which was an express guest, an OOXML expert.
Anyhow, fuck you Microsoft.
I was referring to support for these technologies in ODF documents.
If you need web hosting, you could do worse than here
As a manager of mine once said, sometimes people TAKE offense. They take it where none was given. Sometimes people are looking to be offended - I suspect that's true in your case.
A CANONICAL format is generally preferred for storing data (e.g. storing time in GMT and then adjusting for local time). MS picked U.S. English as the canonical format for OOXML. They could have picked Swahili, but not as many people would be able to look at the text and understand it.
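The GMT example can be written down as a short Python sketch: one canonical form goes into the file, and each reader converts it for display. The helper name `for_reader` and the fixed hour offsets are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Canonical form kept in the file: one instant, always expressed in UTC.
stored = datetime(2007, 8, 24, 12, 0, tzinfo=timezone.utc)
canonical = stored.isoformat()

def for_reader(text: str, offset_hours: int) -> datetime:
    """Convert the canonical text to a reader's local time (fixed offset assumed)."""
    local_zone = timezone(timedelta(hours=offset_hours))
    return datetime.fromisoformat(text).astimezone(local_zone)

in_california = for_reader(canonical, -7)
in_paris = for_reader(canonical, 2)
```

Both readers see different wall-clock times, yet the file holds a single unambiguous value; the choice of canonical form is a storage decision, separate from what any user is shown.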
Perhaps if you had taken the chip off of your shoulder and taken a moment to understand what was stated you would see the logic. But I doubt you're capable of that. Fortunately, most of the world is.
Rodriguez must be mistaken! User I'm Don Giovanni told me here that it was an open and freely implementable standard complete with examples of how to implement it! With such an authority, I assumed it was the end-all, be-all of open document standards!
"Rodriguez claims he is 'not affiliated to any pro-MS or anti-MS party/org[anization]/ass[ociation].'"
Yes, but Rodriguez bases all his arguments on real examples. And, everyone knows that reality has a strong anti-Microsoft bias.
"Microsoft's implementation becomes the de facto standard and all others must (attempt to) conform to the behavior of that implementation..."
Didn't Java have a reference standard?
Two vendors can't even implement HTML to render the same results from a given set of pages, since default fonts, sizes, margins, padding, and so on for many elements are implementation dependent.
Just seems like another excuse (not that we need one) to bash MS...
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
I fully expect Novell to change their version of OpenOffice.org to use as the default format OOXML at some point. All focus at Novell seems to be on OOXML and not ODF. Even Novell's participation in the ODF TC at Oasis seems to *always* reference OOXML in some manner.
It's sad really.
There's a big difference between supporting platform specific IPC mechanisms in an application and integrating those mechanisms into file format that you claim can be implemented on other platforms.
-- The act of censorship is always worse than whatever is being censored. Always.
Stephane has been a very vocal opponent of ooxml. As someone just trying to do something useful with it, I see him banging the anti-ooxml drum loudly in the comments sections of most ooxml sites. His input is never that helpful, and I believe his intent is to basically interfere with anyone trying to do anything creative and useful with the format. From what I understand, he works for a commercial company with a vested interest in maintaining the legacy excel formats. Also, when challenged on his technical critiques, his level of discourse often degrades into taking things a bit personally. Thanks to Miguel for breaking down the flaws of his arguments. I myself don't care for MS, but I find ooxml and what they've done with office2007 intriguing and supercool to work with. But I also think xquery is the future and sql's the new cobol, so go figure.
I have great doubt that Microsoft fixed the issues with OOXML in this short period of time so one must ask why is this format being addressed again so quickly?! One has to wonder if the ISO simply realizes that no matter what they do Microsoft will just keep pushing this until they finally get a yes vote. Are they simply caving?
If they agree to this it will simply be plain evidence that they are being influenced or the members are not competent. I'm of the opinion that they need to force a delay between considerations of at least 6 months. This gives the industry time to mull over any possible changes that have been made.
One must ask themselves how on earth one can claim an open standard for a format that is closed, misrepresented to those attempting to implement it, and can be changed by Microsoft at any time to shaft any open source groups thinking of implementation.
What Microsoft will get is simply a format that no one will use that no one will be willing to pay royalties on that closes their customers down so that they are locked into a single platform.
If the ISO members agree to this something is very wrong.
You can lead a man with reason but you can't make him think.
Is that website in the sig yours? I laughed for 10 minutes going through it. Brilliant. Sick, but brilliant.
Put identity in the browser.
Thanks, glad you liked it. I've made more than one slashdot foe because of it, though.
Qxe4
From http://www.ibiblio.org/pub/linux/docs/HOWTO/Advocacy
Ever try disassembly or decompilation yourself?
It is sometimes a bigger intellectual challenge disassembling or decompiling than writing the program yourself. I find disassembly a bit easier than decompilation, but of course it is very, very tedious. Decompilation of C (I've never decompiled C++.) is difficult for me because every little thing is a call to some other area of the program.
The byte code of pseudo-compiled languages like Java is just a coded list of the instructions the programmer wrote. There are no comments, and the variable names can be hidden, so byte code is not as easy to read as the original code. But it is easy to produce a list of the original instructions, with some areas of confusion. If a programmer has written code using an especially excellent algorithm, it is possible to copy the code and change it enough that it is not covered by copyright.
That wasn't the problem. The problem is the numbers are stored with rounding-errors *AND* Excel contains some undocumented method of consistently correcting this and displaying the number as originally entered.
This method is not documented in the standard. Thus *other* programs that want to read Excel-files have to resort to guesswork to do a very basic thing that Excel does: Display a number that was entered by the user, the way the user entered it.
This means if we both get sent a valid OOXML-document, and you open it with Excel to read out some numbers while I open it with a program implemented according to this so-called standard, the end result is that you get the correct numbers, as entered by the user, and I may or may not get the same numbers, depending on luck. This is a problem. It makes everything other than Excel a second-class citizen.
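To show the kind of guesswork involved, here is one rule a third-party reader might try, sketched in Python: format the stored double with 15 significant digits. The 15-digit rule is my assumption for illustration only; the complaint above is exactly that the spec does not say which rule Excel applies, so nothing guarantees this matches Excel in every case.

```python
def stored_text(entered: str) -> str:
    """What a writer might put in the file: the double with 17 significant digits."""
    return format(float(entered), ".17g")

def display_text(stored: str) -> str:
    """A guessed display rule: 15 significant digits, trailing zeros dropped."""
    return format(float(stored), ".15g")

samples = ["1234.1234", "0.1", "2.5"]
recovered = [display_text(stored_text(s)) for s in samples]
```

For inputs with at most 15 significant digits this guess gives back the typed text, but a reader still has to guess it, which is what makes the missing documentation a real problem.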
I remain to be convinced that encryption is a useful concept at the application level. (Note : this is a challenge ; convince me!)
Nope, encryption at the application level simply doesn't convince me as being useful. That's not to say that it's not a buzz-word which suits might be sold on, but that's not an actual need.
Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
What's clear is the current version of the SPEC is intended to document the legacy garbage remaining in Office 2007.
However even allowing for legacy garbage it doesn't succeed in documenting the actual formats Office generates.
The next version of Office won't aim at implementing the spec. Office devs will follow their own roadmap (including removing/changing legacy stuff documented in MS-OOXML) and the MS spec writers will release a new iteration of the ISO SPEC that will attempt to document the new format (with only enough accuracy to be accepted as a new ECMA standard, and enough mistakes to prevent any reliable third-party implementation)
What, you thought MS-OOXML was completed? It's going to be a continuous FUD engine, with implementors trying to catch up with MS specifiers, who themselves will half-heartedly follow Office developments, while ECMA (and maybe ISO) squanders the little respectability it has left blessing the whole process.
OO.o also supports OLE.
Also, Mac Office supports OLE as well, so it's not "Windows-only".
And you claim that OLE is "newly discovered"? It's been around for over 13 years, and was present in the very first OOXML specs.
I don't know about SSPI, but given that your OLE knowledge is so woeful, I feel safe in assuming that your SSPI complaint is FUD as well.
-- "I never gave these stories much credence." - HAL 9000
This shows that neither OO.o nor K-Office handles ODF faithfully, nor are they compatible with each other:
http://develop.opendocumentfellowship.org/testsuite/summary.html
Also, OO.o adds things to its files that are outside of the ODF spec. If MSO's files aren't true OOXML files, then OO.o's files aren't true ODF files either.
Same situation as many other standard formats, such as HTML. Different apps handle formats differently, and often not 100% faithful to the spec.
-- "I never gave these stories much credence." - HAL 9000
A clue insert: useLineWrapLikeWord95 is not specified in the OOXML 6000 page spec. It thus can only be implemented by MS.
OOXML docs can incorporate other, unspecified, MS technologies. Hence, OOXML can only be implemented faithfully by MS itself.
Ummm, that's what I said.
Moving the columns within Excel? Certainly that should not happen, but it doesn't mean the spec is broken (not that you were saying that).
I know I must have hit on something, since a mod simply went with overrated on a normal 1 post..
Can't believe that it has any chance of becoming an ISO standard unless Microsoft has completely bought the entire committee.
http://www.consortiuminfo.org/standardsblog/article.php?story=20070824182623970
This is going to be really overwhelming, because Microsoft does not follow any general programming conventions, only its own. This is what I dislike about Microsoft. I am a programmer, and if you ask me, there is a difference between general coding style and M$ coding style.
I am not sure why you want me to defend their products; I do not use them beyond casual use. The discussion is not about their products but about a specification they published; it is an important distinction.
A few months ago I realized that the criticism of OOXML had gotten out of hand. I had not read the specs, but the flash-mob-like attack on it made it look terrible. I got the feeling it had to be an exaggeration, researched it, and found that what we had was basically a witch-burning party.
Miguel
I love /.
TFA contains criticism of OOXML that contains several logical and factual errors, and all of the "+5 Informative" responses in this thread are just in agreement with the article summary or provide an anecdote about how "M$ is teh suck" or "OOXML is killing software development!"
Someone like Miguel comes along, who has actually worked on spreadsheet development (Gnumeric) and who has researched the spec himself, and can't manage +5 anything. There is a bias against Microsoft here (and anyone who offers an explanation for something /. has already deemed "tainted") that just doesn't make logical sense. If you're going to make an attack, make it factual and rational--the better to win converts to your POV!
It may be offensive to you, but it's also good sense. When storing data, you pick a common independent format to store the data, and the application is then responsible to map the stored data to the local presentation. The internal format could be Klingon for all it matters, it just happens to be US English in this case. The fact that a particular application does a poor job of translation should be looked on as a fault of the application, not the storage format. Try looking around at a couple of Open Source projects. I'd be willing to bet you quite a bit of money they all take the same approach.
Try to think of this from the perspective of somebody trying to implement the spec, rather than somebody just trying to pick away at the document with a couple of quickly hacked up Perl scripts, as Mr. Rodriguez appears to be. Would you rather write a parser for one canonical format, or a parser for every possible localization? Keep in mind that some localizations depend on system settings, so you would need a way to store and check those values on the system at the time that the document was written. Then there is the question of extensibility. Do you write every supported localization into the spec, closing off all future language support forever (or until the next revision)? Or do you leave it open, so that any time another company anywhere in the world writes a software product using your document format in a new language, everyone else in the world has to upgrade to recognize the new language stored in the document?
I'm sorry that you have a grudge against Office for its poor localization support, but writing localization into the storage format itself is just not a viable path to getting what you want.
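The "one canonical format on disk, localization in the application" idea can be sketched in a few lines of Python. This is only an illustration using number formatting as the example; the separator table, the locale names as keys, and the functions to_canonical and to_display are hypothetical and not taken from the OOXML spec or any Office product.

```python
from decimal import Decimal

# Hypothetical per-locale separators, for illustration only.
SEPARATORS = {
    "en_US": {"decimal": ".", "group": ","},
    "fr_FR": {"decimal": ",", "group": " "},
    "de_DE": {"decimal": ",", "group": "."},
}


def to_canonical(value: Decimal) -> str:
    """What goes in the file: plain digits with '.' as the decimal point."""
    return format(value, "f")


def to_display(canonical: str, locale_name: str) -> str:
    """What the user sees: the application maps the canonical text to a locale."""
    seps = SEPARATORS[locale_name]
    us_style = format(Decimal(canonical), ",f")  # e.g. 1,234,567.89
    # Swap separators through a placeholder so the two replacements don't collide.
    return (
        us_style.replace(",", "\0")
        .replace(".", seps["decimal"])
        .replace("\0", seps["group"])
    )


if __name__ == "__main__":
    on_disk = to_canonical(Decimal("1234567.89"))
    for name in SEPARATORS:
        print(name, to_display(on_disk, name))
```

Supporting a new locale here means adding one entry to the application's table; the stored text and every other reader of the file are unaffected, which is the extensibility argument made above.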
As an aside, it appears that almost all of this author's "debunking" falls into one of two categories:
1) Problems with the Office 2007 implementation of OOXML, rather than the spec itself.
2) It's not as easy as he would like to hack up the file by hand, which has no bearing on somebody actually trying to implement the spec in a full product.
#1 would not be an issue for somebody trying to actually write a real implementation of the spec.
#2 shouldn't be a surprise to anyone with the most basic understanding of computer floating point math.
#3 he has a minor point in that the OOXML way is a regression from the binary way, but again, for somebody actually trying to write an implementation of the spec, rather than a quick 'n dirty Perl script, this is not a big deal.
#4 VML is as much XML as SVG. Yes, it's undocumented. It's also deprecated.
#5 it sounds like he might have a valid complaint here, although it may just be another complaint about trying to hack the file by hand. I'll give him the benefit of the doubt...
#6 is bull for the reasons just discussed.
#7 there may be valid reasons for the multiple formats if you look in the spec, but it does sound much more complicated than necessary. In the absence of further evidence, I'll give him credit for this one.
#8 the VariantTimeToSystemTime documentation is a red herring. The only thing that's important here is that MS has chosen a floating point number showing the number of days since 1900 as their internal date format. It's not platform dependent, or undocumented (unless the number of days since 1900 is different for Unix users than Windows users? Didn't think so...)
#9 as long as both documents follow the spec and look the same when you open them in a program that implements the spec, I don't see the problem here. The problem I see is this: "If a business process assumes the existence of a custom structure, it won't work." It sounds like the problem here is that his "business process" is not following the spec.
#10 seems to be more of a complaint about Office 2007's implementation than the document format, but without knowing how the spec addresses encrypted files, I can't really say.
#11 is completely irrelevant to OOXML.
#12 likewise is a complaint about Microsoft's abilit
If I don't put anything here, will anyone recognize me anymore?
And let me be one of the few people to thank you for your insightful response to the criticisms. Anyone who's been around code a few years knows you probably have a good feel for what is and isn't a good specification for the storage of spreadsheet data.
We may or may not all like each others' work, but you've got a head on your shoulders and just as I'd defer to John Carmack about what makes a good 3D video card, I'll defer to you on what makes for a good spreadsheet XML spec.
I suggest everyone else forget this is a Microsoft spec and read it and the responses in light of what it is -- a document that may or may not help or suck for that matter.
- Michael T. Babcock (Yes, I blog)
Thanks, I wasn't ready to put in the effort to write what you just did, but I'm glad you did, starting w/ debunking the localization complaint.
I have a ton of issues w/ Microsoft, and I am skeptical of this new standard, but this article didn't give me many reasons to consider OOXML a technical mess. As you said, it had a couple of good points, but otherwise struck me as someone stretching to find something bad to say.
I do think that it should be possible to edit the document "by hand", or at least not require a full document analysis to modify the document. That seems to me to be one of the advantages of an XML format. To this extent I don't have a problem with having to fix up various references elsewhere in the document when you make a change, but having to parse formulas is too much (if that is really required as point #1 implies).
One of the most distressing things about OOXML to me from this article was that MS seems not to have rationalized similar functionality into a common schema to describe this functionality (i.e. the article states that there are many different ways of describing text formatting).
Mike
I completely agree. I really appreciated your critique because the article felt like someone reaching to say bad things, but other than some obvious areas (such as the complaint about using English as the canonical form) I didn't know enough about the actual spec. I appreciate the comments from someone who does.
This isn't to say that I now think that the OOXML spec is w/o issues. I'm still pretty skeptical about it, but this article hasn't made me more suspicious, and your response has helped.
Mike
My guess is, yes, it occurred to the poster you were responding to, since I highly doubt that when he wrote exactly that, it was in his sleep. Did it occur to you that reading his post all the way to the end might have resulted in slightly less of your foot being inserted into your mouth? ;-)
Well put.
Why not? A Naval vessel is (presumably) a closed system. There should be little, if any, concern about interoperability with the outside world. There should be few, if any, points of egress/ingress for data on a Naval ship. Maybe a dedicated satellite link for updating position and orders, in a strictly controlled manner. Anything else would seem to be a huge security risk. Presumably you can trust the sailors on the ship to some extent.
It is inherent in converting between text (ASCII or otherwise) formats and IEEE-standard floats, to be sure, or between text and any fixed-length floating point representation more generally, but there are "binary formats" that can take arbitrary precision decimal reals and store them without rounding errors.
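A small Python sketch of that distinction, using the standard library's float (an IEEE 754 double) and decimal.Decimal (a decimal representation with configurable precision). It only illustrates the general point; it makes no claim about what any spreadsheet format actually stores, and the helper names are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction


def float_is_exact(text: str) -> bool:
    """True if the double parsed from `text` equals the decimal value exactly."""
    return Fraction(float(text)) == Fraction(Decimal(text))


def decimal_round_trips(text: str) -> bool:
    """True if storing `text` as a Decimal and printing it gives the same text."""
    return str(Decimal(text)) == text


if __name__ == "__main__":
    print(float_is_exact("0.1"))       # False: 0.1 has no finite binary expansion
    print(float_is_exact("0.5"))       # True: 0.5 is a power of two
    print(decimal_round_trips("0.1"))  # True: stored digit for digit
    # The error also accumulates differently in arithmetic:
    print(sum([0.1] * 10) == 1.0)
    print(sum([Decimal("0.1")] * 10) == Decimal("1.0"))
```

The trade-off is the usual one: decimal storage keeps what was typed, while binary doubles are fixed-size and fast but need a display rule to hide the conversion error.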
Why don't you actually answer the question? He asked you for a URL, and you replied with a nice PR-department-sanitised piece of emptiness with words like "triage the feedback". Saying that Apple, OOo and whoever else supports OOXML when all they're doing is trying to ensure import/export compatibility is fucking disingenuous. You know that no one else will ever use OOXML as a main format.