Slashdot Mirror


Office 2007 Fails OOXML Test With 122,000 Errors

I Don't Believe in Imaginary Property writes "Groklaw is reporting that some people have decided to compare the OOXML schema to actual Microsoft Office 2007 documents. It won't surprise you to know that Office 2007 failed miserably. If you go by the strict OOXML schema, you get a 17 MiB file containing approximately 122,000 errors, and 'somewhat less' with the transitional OOXML schema. Most of the problems reportedly relate to the serialization/deserialization code. How many other fast-tracked ISO standards have no conforming implementations?"

97 of 430 comments

  1. What's the Problem? by eldavojohn · · Score: 5, Insightful

    If you can change a vote of "no with comments" to "yes" I don't see why you couldn't change "fails with 122,000 errors" to "passes." I mean, when your standard passes through sheer lobbying and politics with little technical analysis, it's going to take a lot to surprise me with how epically it fails.

    --
    My work here is dung.
    1. Re:What's the Problem? by Finallyjoined!!! · · Score: 5, Funny

      Repost.
      OOXML: "The best Standard money can buy"

      --
      If I had an Ass, I'd call it Fanny Bottom, then I could slap my Ass; Fanny Bottom, on the Arse.
    2. Re:What's the Problem? by bhtooefr · · Score: 4, Informative

      Diebold voting machines run Windows CE.

    3. Re:What's the Problem? by eldavojohn · · Score: 4, Funny

      All I have to say is that it's a good thing Microsoft isn't running the 2008 Presidential Election! Diebold voting machines run Windows CE. Please press any key to start voting!

      >> [Enter]

      Are you sure you want to vote today?
      (Allow/Deny)

      >> Allow

      *An anthropomorphic paper clip appears*
      "Hi! I'm Clippy, I see you're trying to vote!"
      "Let me help you with that! Which of these do you enjoy the most:"
      A) Fear Mongering
      B) Economy Stunting Taxation ...

      Yeah, I can't wait to vote this year ...
      --
      My work here is dung.
    4. Re:What's the Problem? by CodeBuster · · Score: 3, Insightful

      which is why it doesn't really matter. The standards which can actually be implemented and have an open source reference implementation, such as the Open Document Format (ODF), will become the de-facto standards at least for archive and long term storage. Also, there will be tremendous pressure on Microsoft to at least implement ODF for their Office products and probably to make that the default save format as well. However, it would be nice if the standards could allow for optional extensions which are not required (I believe that the TIFF format for images allows this) but could be used by programs which want to add enhancements, but allow readability and editing in other programs which only meet the minimum standards. Perhaps this is already a feature or could someone with more detailed knowledge about ODF comment?

    5. Re:What's the Problem? by Bu11etmagnet · · Score: 5, Funny

      The standards which can actually be implemented and have an open source reference implementation, such as the Open Document Format (ODF), will become the de-facto standards at least for archive and long term storage.
      I find your lack of realism...disturbing
      --
      Life is complex, with real and imaginary parts.
    6. Re:What's the Problem? by danskal · · Score: 4, Insightful

      "B) Economy Stunting Taxation ... " BZZZZZ.... wrong!!! There's nothing stunted about the Scandinavian economies (other than the US economy & subprime crisis dragging them down slightly at the moment), and they have some of the highest tax rates in the world.
      If tax money is used to lubricate the wheels of commerce, by ensuring a fit, well-educated, flexible, motivated work force, and by ensuring that infrastructure just works, that monopolies aren't abused, etc., then there is no reason for taxation, within reason, to be a problem. I guess the logic is that sometimes, an intelligent government, voted for by the people and working for the people, can spend/invest the people's money more wisely than they can themselves.
    7. Re:What's the Problem? by fbjon · · Score: 4, Insightful
      The real problem is not with how much taxes are collected, it's the "intelligent government" part. I think a part of the problem is that the larger the government or governing structure is (in terms of people and country size, not legislation), the more it becomes an inefficient sieve rather than funnel.


      On one hand, a person should indeed be free to live as one sees fit, including spending. But on the other hand, people are stupid, so electing smart people and raising taxes seems like a win to me. That just leaves the "election" part, then. Now what to do about that.....

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    8. Re:What's the Problem? by h4rm0ny · · Score: 3, Insightful

      I'd be happy to continue this via e-mail.

      Please continue things like this on Slashdot. Many of us come here mainly so we can read the debates that go on and it's a shame if an interesting one retires to private email discussion. That was a fascinating post.
      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    9. Re:What's the Problem? by GauteL · · Score: 2, Informative

      Next, the middle class does not have more money than the top 5%. You are falsely stating this as fact. In fact, the top 1% holds 33% of all wealth and the top 20% holds 51% of all wealth. The middle and lower class - the 80% of the country - hold just 16% of the wealth. I want to preempt anyone complaining about your maths. What you mean is that the "rest of the top 20%, apart from the top 1%" holds 51% of all wealth. Otherwise you'd be very wrong in adding the 33% to the 51% to get 84%.

      But the figures I assume you cite (*) do indeed support that the bottom 80% owns only 16%.

      (*) Edward N. Wolff at New York University (2004).

      In my opinion democracy is an illusion as long as 20% of the people own 84% of the wealth.

      The bottom 80% simply have no way of making informed opinions based on sources that aren't owned by the top 20%.
    10. Re:What's the Problem? by QuoteMstr · · Score: 2, Interesting

      Thank you for your fascinating post. I find myself wondering though, why you are a "hardcore libertarian" despite your solid grasp of the economics. Clearly, a very high top tax rate, strong corporate regulation, and an extensive public welfare system lead to an equitable society. What is the downside, and why would you oppose these kinds of regulations?

  2. Does anyone know if Open Office is compliant with by notaprguy · · Score: 4, Interesting

    the Open Document Format? Just curious.

  3. Technical Details by Enderandrew · · Score: 5, Insightful

    Technical details mean absolutely nothing in this discussion. I thought we established this.

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    1. Re:Technical Details by Iamthecheese · · Score: 2, Funny

      122,000 Errors? That's, what, one error per 100,000 lines of the standard? I'd say they did a damn good job!

      --
      If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
  4. So are most MS Word files by EmbeddedJanitor · · Score: 4, Funny

    You just use this conversion tool called Open Office

    --
    Engineering is the art of compromise.
  5. Stop using MiB by hedleyroos · · Score: 3, Insightful

    Men in Black? What happened to good old megabytes? The article says 17MB!

    1. Re:Stop using MiB by Anonymous Coward · · Score: 4, Funny

      Men in Black? What happened to good old megabytes? The article says 17MB!

      Maybe, but I make this shit look GOOD.
    2. Re:Stop using MiB by Richard Steiner · · Score: 3, Funny

      Shh... The submitter is trying to impose those trendy "base 2" SI prefixes on us in spite of 40+ years of prior art to the contrary. Another case of ivory tower types not being sophisticated enough to grok current industry usage, methinks...

      And don't even get me started on folks who assume a byte is always eight (b) bits. There's a reason folks in the Real World use the term "octet", people. Really.

      Sheesh! :-) :-)

      --
      Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
      The Theorem Theorem: If If, Then Then.
    3. Re:Stop using MiB by Digi-John · · Score: 2, Insightful

      I see a lot of this happening in Wikipedia articles lately, too. Someone let the hyperpedantic nerds out of their basements to confuse every normal person on the fucking planet.

      Similar to the new prevalence of BCE and CE vs. BC and AD. Come on, you must admit that "Anno Domini" is far cooler than "Current/Christian Era". Up next, we change "Wednesday" to "Threeday", because references to Odin are just far too Euro-centric. That is, assuming we stick with that Judeo-Christian concept about Sunday being the seventh day.

      --
      Klingon programs don't timeshare, they battle for supremacy.
    4. Re:Stop using MiB by Yvan256 · · Score: 3, Insightful

      Just because people have been misusing SI prefixes so that "kilo" means 1024 for 40+ years doesn't mean they're right.

      Also, "octet" is the French word for "byte", so it's also 8-bit. :P

    5. Re:Stop using MiB by hardburn · · Score: 2, Funny

      More like fixing 40+ years of hard drive manufacturers lying to us about storage space.

      --
      Not a typewriter
    6. Re:Stop using MiB by Richard Steiner · · Score: 4, Informative

      Language is typically defined by usage, not the other way around. Unless you're the French, perhaps. :-)

      Remember that "kilo" *did* (and does) mean 1024 in a computing context. Everybody understood that who was involved on a technical level. Everybody. There was no miscommunication in the general case ... except when it came to laypeople who largely didn't understand what was described in the first place. When that happened, we just told them that bigger is better and moved on...

      Your comment about octet confuses and annoys me. Go away. :-)

      --
      Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
      The Theorem Theorem: If If, Then Then.
    7. Re:Stop using MiB by Schraegstrichpunkt · · Score: 4, Informative

      40+ years of prior art to the contrary

      "1 MW" has always meant 1,000,000 watts. "9.6 kbps" has always meant 9,600 bits per second. A "500 GB" hard drive still means 500,000,000,000 bytes.

      There are relatively few places where this is screwed up, most of which fall into these categories:

      • RAM or things derived from RAM (e.g. page sizes) where the physical layout implies powers of 2
      • Microsoft

      The latter doesn't even get it consistent. "1.44 MB" floppies are actually 1440 * 1024 bytes.

      Another case of ivory tower types not being sophisticated enough to grok current industry usage, methinks...

      "Current industry usage" is to be ambiguous; 17 MB means "somewhere between 16 and 18 megabytes". The people you call "ivory tower types", including the IEC, are trying to use more precise language.

      And don't even get me started on folks who assume a byte is always eight (b) bits. There's a reason folks in the Real World use the term "octet", people.

      The term "octet" does exactly the same thing that the binary prefixes do: They indicate more precisely what is being talked about.

      As someone else in this thread said, "just because some people made the mistake, decades ago, of choosing to equal kilo to 1024 doesn't mean they were right."
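
      To put numbers on the ambiguity, here is a minimal sketch of the arithmetic. The constant names are made up for illustration; only the 1440 * 1024 figure and the "17 MB" example come from the post above.

```python
# Sketch: the same byte count read with SI (decimal) and IEC (binary) prefixes.
MB = 10**6    # SI megabyte: power of 10
MIB = 2**20   # IEC mebibyte: power of 2

# The "1.44 MB" floppy mixes both conventions: 1440 * 1024 bytes.
floppy_bytes = 1440 * 1024
floppy_in_mb = floppy_bytes / MB     # neither value is 1.44
floppy_in_mib = floppy_bytes / MIB

# "17 MB" read both ways: the gap is the ambiguity being complained about.
seventeen_mb = 17 * MB
seventeen_mib = 17 * MIB
gap_bytes = seventeen_mib - seventeen_mb

print(floppy_bytes)                   # 1474560
print(floppy_in_mb, floppy_in_mib)
print(gap_bytes)                      # 825792
```

      So the floppy is "1.44" only in a hybrid unit of 1000 * 1024 bytes, and the two readings of "17 MB" differ by roughly 0.8 million bytes.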

    8. Re:Stop using MiB by k33l0r · · Score: 2, Informative

      Uh, to quote Wikipedia (all hail the omniscience of it.):

      In Jewish and Christian tradition, the first day of the seven day week is Sunday.
    9. Re:Stop using MiB by psychodelicacy · · Score: 3, Insightful

      To be fair, we don't use "hour" to mean "sixty minutes" in every context except computing, where it means "fifty-eight and a half minutes". The rationality lies in the removal of confusion, as much as in the units themselves.

      --
      A closed mouth gathers no foot.
    10. Re:Stop using MiB by Yvan256 · · Score: 4, Insightful

      If language is defined by usage, does that mean that copyright infringement now equals theft? ;-)

      You have never seen the confusion of metric users entering the CS field, have you? Ever seen a teacher struggle with the very same point we're having right now?

      As I said, in the rest of the world, kilo means 1000, not 1024. And here you're saying it becomes something else because a particular field has abused it for 40 years?

      Also note that both hard drive manufacturers and digital telecommunications, in a computing context, use 1000 for kilo.

      So your argument becomes "if you're in a computing context BUT not talking about hard drives OR telecommunications, then kilo means 1024"...

      I'd rather use KiB=1024, thank you very much. :-)

    11. Re:Stop using MiB by BKX · · Score: 2, Informative

      Two mistakes:

      1. It's "Common Era".
      2. Replace Judeo-Christian with Christian. The Jews aligned their calendar with the Romans such that the Sabbath (Saturday) fell on the last day of the week (which, according to the Romans, was Saturday). The Christians decided that they would celebrate their new prophet on the first day of the week (since most early Christians were originally of the Roman religion rather than Jewish, they equated Jesus with Sol, their Sun god, who was worshipped weekly on Sunday). Later that celebration merged with the Sabbath concept, but the day of Sunday stuck, only now most people erroneously think of it as the end of the week.

    12. Re:Stop using MiB by Digi-John · · Score: 2

      Referring to the Wikipedia article, I see it referred to as Common, Current, and Christian. Pick your favorite.

      --
      Klingon programs don't timeshare, they battle for supremacy.
    13. Re:Stop using MiB by gbjbaanb · · Score: 3, Insightful

      Also note that both hard drive manufacturers and digital telecommunications, in a computing context, use 1000 for kilo.

      Also note that both hard drive manufacturers and digital telecommunications, in a marketing context, use 1000 for kilo.

      There, fixed that for you.
    14. Re:Stop using MiB by benwaggoner · · Score: 4, Interesting

      Except "computing" isn't a clear-cut domain. Take my field, compression: does that count as "computing" (power of 2) or telecommunications (power of 10)? It's unclear.

      So, we had a problem where different tools and formats defined it different ways. For a number of years, QuickTime used K=1024, while Windows Media and RealMedia used K=1000. Unless you were using Sorenson Squeeze, which "corrected" its Windows Media and RealMedia values by 1.024 so they matched the QuickTime file sizes!

      Horrible.

      Fortunately, the compression world has standardized on power-of-10 numbers, since that's what the MPEG standards and, well, all the professionals use.

      So, now we have to deal with complaints about the mismatch between encoding a file that should be "4 GB" but doesn't fill up "4 GB" of drive space...

      Sorry, 1024's got to be a KiB. No other feasible solution at this point, unless we decide to stop having computers talk to each other...
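
      The 1.024 factor and the "4 GB" mismatch both fall out of simple arithmetic. This is a rough sketch only: the 500 kbps setting and the helper name are hypothetical, not taken from any of the tools mentioned.

```python
# Sketch: how the meaning of "K" changes a data rate, and why "4 GB" looks short.
GB = 10**9     # power-of-10 gigabyte
GIB = 2**30    # power-of-2 gibibyte

def kbps_to_bps(rate_kbps, k=1000):
    """Convert a 'kbps' setting to bits per second under a chosen meaning of K."""
    return rate_kbps * k

rate = 500  # hypothetical encoder setting
decimal_bps = kbps_to_bps(rate, k=1000)
binary_bps = kbps_to_bps(rate, k=1024)
ratio = binary_bps / decimal_bps   # the 1.024 correction factor

# A file of exactly 4 * 10^9 bytes, as reported by a tool that divides by 2^30.
file_bytes = 4 * GB
reported = file_bytes / GIB

print(decimal_bps, binary_bps, ratio)
print(round(reported, 2))
```

      The file is the size the encoder promised; the tool reporting it is simply dividing by a different constant, so the displayed number comes out below 4.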

    15. Re:Stop using MiB by hakr89 · · Score: 3, Funny
    16. Re:Stop using MiB by fbjon · · Score: 2, Interesting

      In fact, do we even need to express filesizes in powers of 2 at all? Is there any reason to continue this practice other than tradition?

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    17. Re:Stop using MiB by Jesus_666 · · Score: 2, Interesting

      Where does 1024 follow from a byte having eight bits? 1024 is not a power of eight. It's divisible by eight, but so are more reasonable numbers like 512, 4096 or 32768.

      But still, if we assume a byte to be one unit we can as well use powers of ten.


      Of course you could argue that the tendency of certain things (like RAM chips) to have sizes that are powers of two might imply using a power of two in language usage. But then again, lots of other things don't use powers of two (e.g. most storage media and almost everything transmission-related). Who prevails? Do we follow RAM usage and have non-fitting storage and transmission? Do we follow storage/transmission and have non-fitting RAM? Do we follow xkcd and settle on 1012 bytes per kilobyte?

      Or, of course, we just use unambiguous prefixes so people know which base we use. If you don't like "kibibyte" you can lobby IEC to instead adopt "computer science (not storage) kilobyte (CSkB)" and "general standard kilobyte (GSkB)".

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
  6. A heck of a job, Brownie! by llamafirst · · Score: 5, Funny

    In a blog posting this week, Alex Brown, leader of the International Organization for Standardization (ISO) group in charge of maintaining the Office Open XML (OOXML) standard, revealed that Microsoft Office 2007 documents do not meet the latest specifications of the ISO OOXML draft standard. "Word documents generated by today's version of Microsoft Office 2007 do not conform to ISO/IEC 29500," said Brown in a blog post recounting the process of testing a document against the "strict" and "transitional" schema defined in the standard.

    Ahem. Let me be the first to say:
    Brownie, you're doing a heck of a job!

  7. Re:You're missing the point of an ISO standard by Richard Steiner · · Score: 5, Insightful

    Without a reference implementation, how do you know a standard is valid?

    --
    Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
    The Theorem Theorem: If If, Then Then.
  8. Duh by Arreez · · Score: 4, Funny

    Seriously......anyone not see it coming? Office 2007 being submitted to this test is like submitting to a "Will it float?" test with your hands tied and the good ol' cement shoes strapped on.

    1. Re:Duh by jnik · · Score: 2, Funny

      Great. Now I just want to know...will it blend?

  9. You're missing the point... by voislav98 · · Score: 5, Funny

    which is that it's the standard that's deficient. I'm sure that the standard will soon be "improved" so it conforms with Office 2007

  10. OOXML is such a Fraud! by Nom du Keyboard · · Score: 5, Insightful

    OOXML is such a fraud that it's disgusting that we continue to waste such time on it. If it could win on the merits it wouldn't need such underhanded tactics by its (very few) supporters. It's clearly intended as an ODF-killer by creating an unnecessary parallel "standard".

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
    1. Re:OOXML is such a Fraud! by BearRanger · · Score: 2, Interesting

      No. It's intended to sway governments that have passed laws requiring all documents to be created using open standards. This is all about Microsoft being able to sell Office to European countries and (soon) California.

  11. Impressive by rumith · · Score: 5, Insightful

    While it's hardly unexpected that the Office 2007 document format isn't *cough* ISO compliant, 122k errors for a 60Mb file works out to a remarkable ~500 bytes of markup per error.

    I really do not understand where Microsoft is heading. They've rammed their miserable OOXML format through - supposedly so they could advertise their product as ISO compliant. But what's their advantage now that their product is shown to be so horribly incompatible?
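
    The "~500 bytes of markup per error" estimate can be checked with a quick sketch. Which unit "60Mb" was meant to be is an assumption, so all three readings are computed.

```python
# Sketch: bytes of markup per reported error for a "60Mb" file with 122,000 errors.
errors = 122_000
size = 60  # the "60Mb" figure from the post; the intended unit is an assumption

per_error_si = size * 10**6 / errors          # if it means 60 * 10^6 bytes
per_error_iec = size * 2**20 / errors         # if it means 60 * 2^20 bytes
per_error_bits = size * 10**6 / 8 / errors    # if "Mb" literally means megabits

print(round(per_error_si), round(per_error_iec), round(per_error_bits))
```

    Either byte-based reading lands close to the 500 quoted above; only the literal megabit reading is far off.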

    1. Re:Impressive by SatanicPuppy · · Score: 2, Insightful

      If the open standard is bloated and buggy, then people will keep using the closed formats.

      Microsoft has zero percentage in having a good, workable, open format.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    2. Re:Impressive by daveime · · Score: 2, Insightful

      This IS XML we are talking about ... even transmitting a boolean yes or no which should in principle take 1 bit becomes :-

      <xml schema="http:fuckingxml.com">
      <myboolean>
      TRUE
      </myboolean>
      </xml>

      On that basis, 500 bytes per error probably equates to around 1.152 bits of "useful" error information.

      Rather than standardize even more bloated crap, on this occasion I applaud MS for committing OOXML to the early grave it deserves, by failing to even pass the tests on a standard they effectively created (and paid a lot of money) to get approved.
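
      For anyone who wants to measure that overhead, a minimal sketch follows. The element names and schema URL are placeholders standing in for the snippet above, not a real format.

```python
# Sketch: how many bits of markup carry one boolean in a tiny XML document.
import xml.etree.ElementTree as ET

# Placeholder document shaped like the example in the post.
doc = (
    '<doc schema="http://example.com/schema">\n'
    '<myboolean>\n'
    'TRUE\n'
    '</myboolean>\n'
    '</doc>'
)

# The payload: a single yes/no value, i.e. one bit of information.
value = ET.fromstring(doc).find("myboolean").text.strip() == "TRUE"

# The cost: every byte of the serialized document, counted in bits.
markup_bits = len(doc.encode("utf-8")) * 8

print(value, markup_bits)
```

      In practice OOXML files are ZIP containers, so compression recovers much of this on disk; the sketch only measures the uncompressed text.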

    3. Re:Impressive by PitaBred · · Score: 4, Insightful

      Except that open standards are usually government mandated. Microsoft would have otherwise ignored it completely, going with the lock-in you describe since they "own" the office landscape. They submitted OOXML because they didn't want to be locked out of new gov't initiatives requiring more accessible data formats, so they forced their crap through trying to call it open, while not really being so.

  12. HTML by WK2 · · Score: 4, Interesting

    It's not a fast-tracked ISO standard, but HTML and CSS have no conforming implementations. I'm not sure, but links might conform to HTML.

    --
    Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/
    1. Re:HTML by pembo13 · · Score: 4, Insightful

      And see how well that turned out.

      --
      "Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
    2. Re:HTML by Schraegstrichpunkt · · Score: 3, Insightful

      The current HTML specs are trainwrecks for the same reason. That's what HTML 5 is attempting to fix.

      Incidentally, the W3C specs are actually called "Recommendations". There's probably a reason for that.

    3. Re:HTML by AdamKG · · Score: 2, Insightful

      Okay, help me out here. Do you mean "how well that turned out" in the sense that HTML has been a huge success (you know, what with being the medium that we're using to display our comments right now ...) or in the sense of being a huge disaster?

      I mean, I can sympathize with both views. I'm just wondering which one I should sympathize with in the context of your post.

      --
      groupthink: It's good for self-esteem.
  13. 122,000 errors sure but... by msh104 · · Score: 5, Insightful

    I don't want to destroy the mood that the Slashdot editor wanted to create by posting this sensational piece of propaganda, but this is not 122,000 bugs, is it? This is a parser generating 122,000 error results. Sure, it's bad, but anyone who has ever tried to make code W3C-compliant or debug any piece of code will know that just one error can produce many, many error results. So this (despite my wish for it to be so) does not really give you much insight into Microsoft's compatibility with its own standard.
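
    One way to tell "122,000 bugs" from "122,000 reports" is to collapse the validator output by message type. A minimal sketch, assuming a made-up message format; real validator output will look different.

```python
# Sketch: group validator messages by signature to count distinct problems.
import re
from collections import Counter

# Hypothetical validator output, invented for illustration.
messages = [
    "line 10: attribute 'val': 'on' is not a valid value",
    "line 42: attribute 'val': 'on' is not a valid value",
    "line 97: attribute 'val': 'off' is not a valid value",
    "line 120: element 'foo' is not expected",
]

def signature(message):
    """Drop the line-number prefix so repeats of one problem collapse together."""
    return re.sub(r"^line \d+: ", "", message)

counts = Counter(signature(m) for m in messages)

for sig, n in counts.most_common():
    print(n, sig)
```

    The number of distinct signatures is usually a far better estimate of how many things are actually wrong than the raw message count.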

  14. As bad as it may seem... by HetMes · · Score: 2, Insightful

    ... it's actually worse. We're all agreeing here, it's who comes up with the most ludicrous comparison or the most disturbing details about the case what counts. So, the question is: What can any of us do about this?

  15. You're doubly missing the point by EmbeddedJanitor · · Score: 4, Informative
    Developing a standard without having a working example is very foolish. Stuff that looks cool in a standard often does not work out well in real life (theory != practice). Technically, it is far better to survey the landscape for things that work well and standardise those. There are problems with this approach: the companies that have implemented the winning standards often have a competitive advantage, lobbying can wreck the process and the standards might be burdened with patents (and standards users need to pay royalties to the patent holders).

    For one example where this has worked well, consider vehicle networking. Bosch invented/designed the Controller Area Network (CAN). This was standardised by SAE as part of the in vehicle networking specification. ISO then just adopted the SAE stuff and extended it in some new areas. The stuff all works well and is based on proven technology (ie. the technology existed before the standards).

    --
    Engineering is the art of compromise.
    1. Re:You're doubly missing the point by Anonymous Coward · · Score: 2, Insightful

      "In theory, theory and practice are the same. In practice, they are not."

  16. down with mebibytes! by jollyreaper · · Score: 2, Insightful

    you get a 17 MiB file

    This whole mebibyte thing seems like an April Fool's prank that's been carried on for too many years. I can't believe people are actually using it now.
    --
    Kwisatz Haderach
    Sell the spice to CHOAM
    This Mahdi took Shaddam's Throne
  17. I Remember When... by Nom du Keyboard · · Score: 2, Interesting
    I remember back in the good old days of the IBM EGA (640x350 6-bit color) adapter, when semi-clone cards were made, they were all rounded up and tested against the IBM "standard". The IBM card had a couple of flaws at the time: two of the bottom scan lines were interchanged, and it interfered with the computer's (IBM PC) ability to Warm Boot. Each card was given a percentage rating of how well it compared to the IBM Standard, and comments on whether or not the bugs in the original were fixed, or kept for compatibility reasons. Also, for less money, all of the clone cards came with the maximum 256KB of memory, while the IBM EGA only had 64KB standard, with the rest able to be added through a daughter card.

    What most made me smile was that the IBM EGA card was included in the matrix of results, showing a rating of 100% compatibility with itself.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  18. Validates better against the TRANSITIONAL spec by dominator · · Score: 4, Interesting
    Speaking as an OOX implementer, this is pretty bad. But it's not quite as bad as the headline makes it seem - the meat of the story is linked a few blogs deep:

    The expectation is therefore that an MS Office 2007 document should be pretty close to valid according to the TRANSITIONAL schema.

    Sure enough (again) the result is as expected: relatively few messages (84) are emitted and they are all of the same type.

    <m:degHide m:val="on"/> where "val's" values are supposed to be "true|false".

    [snip]

    Making them conform to the TRANSITIONAL will require less of the same sort of surgery (since they're quite close to conformant as-is)


    In other words, if you're validating against the TRANSITIONAL spec, the OOX documents aren't horribly far off. And it's wrong in such a way that's easy to compensate for in code (i.e. check for "true|on" for a truth value). That's a markedly different situation than described by the headline's "'somewhat less' with the transitional OOXML schema" claim.
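
    The "check for true|on" workaround can be sketched in a few lines. The accepted token sets, the function name and the namespace URI below are assumptions for illustration, not values taken from the spec.

```python
# Sketch: a lenient reader for on/off style attributes such as m:val="on".
import xml.etree.ElementTree as ET

# Assumed token sets: the schema's boolean values plus the "on"/"off" spellings.
TRUE_VALUES = {"true", "1", "on"}
FALSE_VALUES = {"false", "0", "off"}

def parse_on_off(raw):
    """Map an attribute string to a bool, tolerating both spellings."""
    token = raw.strip().lower()
    if token in TRUE_VALUES:
        return True
    if token in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognised on/off value: {raw!r}")

# Placeholder namespace URI; the real one is defined by the standard.
M_NS = "http://example.com/m"
snippet = f'<m:degHide xmlns:m="{M_NS}" m:val="on"/>'

elem = ET.fromstring(snippet)
deg_hide = parse_on_off(elem.get(f"{{{M_NS}}}val"))
print(deg_hide)  # True
```

    A reader written this way accepts what Office 2007 emits today and what the schema asks for, at the cost of being more permissive than a strict validator.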

    And in case anyone claims that ODF doesn't have the same sort of problem, I refer you to AbiWord bug 11359/OpenOffice bug 64237. This one is a show-stopper.
    1. Re:Validates better against the TRANSITIONAL spec by aug24 · · Score: 2, Insightful

      in case anyone claims that ODF doesn't have the same sort of problem

      FFS, ODF isn't a fast-track ('multiply implemented and widespread') standard. It's perfectly acceptable for a proposed standard to be ahead of current implementations - it's only proposed after all. Implementations should be expected to be playing catch-up.

      OOXML on the other hand is claimed to be already implemented and widespread and thus eligible for fast track. So it is a big deal if it turns out it isn't. Not to mention that you're selectively pointing out that the transitional version nearly works, blithely ignoring the fact (in the same blog) that strict is well fucked. So the strict version of the 'standard' should be thrown out even harder than the transitional.

      I'm beginning to wonder if this concept is just too hard to grasp for many slashdotters or if there're just too many people drinking Norway brand Kool-aid.

      Justin.

      --
      You're only jealous cos the little penguins are talking to me.
  19. Re:You're missing the point of an ISO standard by dvice_null · · Score: 4, Insightful

    > Wha? Valid in what respects?

    Valid as in possible to implement. How could a standard not be possible to implement you ask? Well that is simple. E.g. write a program that follows this standard:
    1. It must print "1" on exit
    2. It must print "2" on exit

    As you can see, it would not be possible to implement a program according to that standard. That is why someone would need to write a reference application implementing the standard to notice errors like this, before the standard is given to the whole world to be implemented.

    It is better that only one implementer has to puzzle over the errors in a standard, rather than the whole world.
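
    The two-rule example can even be checked mechanically. A minimal sketch, assuming each rule means "the complete output on exit is exactly this string"; under a looser "contains" reading, printing "12" would satisfy both.

```python
# Sketch: search for any output that satisfies every requirement of the "standard".
# Assumption: each rule constrains the complete exit output to an exact string.
requirements = [
    lambda out: out == "1",   # rule 1: must print "1" on exit
    lambda out: out == "2",   # rule 2: must print "2" on exit
]

# A few candidate outputs an implementer might try.
candidates = ["", "1", "2", "12", "21"]

conforming = [c for c in candidates if all(rule(c) for rule in requirements)]
print(conforming)  # []
```

    No candidate passes, which is exactly the kind of contradiction a reference implementation would hit on day one.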

  20. Re:You're missing the point of an ISO standard by hardburn · · Score: 5, Interesting

    You need at least one coded reference implementation or else you'll end up with something in the standard which is difficult/impossible to implement. Especially in a 6,000+ page standard.

    ISO would be well advised to take the method the IETF uses, which is to have two independent teams implement the standard based on the documentation before an RFC can reach a Draft Standard status. I suspect ODF would have only benefited from this process by cutting down its rough edges, while OOXML would have been so cumbersome that it would be simply dropped.

    --
    Not a typewriter
  21. hmmm... 122k errors by SlshSuxs · · Score: 3, Interesting

    After the first error, are the remaining errors meaningful, or are they false positives? I believe most errors after the first are false positives caused by the first one.

  22. Re:You're missing the point of an ISO standard by Schraegstrichpunkt · · Score: 4, Insightful

    And why is that an issue? The job of ISO is to develop the standard in an implementable fashion. Top down.

    That explains why OSI is such a trainwreck compared to IP.

    Not a bottom up

    So why was ODF approved, then? Or ISO C?

    adopt the lowest common denominator of whats already out there

    "Lowest common denominator" is not equivalent to bottom-up design.

  23. Re:122,000 errors... by Dunbal · · Score: 3, Funny

    Obligatory: 122,000 errors should be enough for anybody.

    --
    Seven puppies were harmed during the making of this post.
  24. Up with mebibytes! by JustinOpinion · · Score: 4, Insightful

    Ha!

    Then there are those of us who think the prank is the people who refuse to use it (and who trot out the tired "hard drive manufacturers are stealing my disk space" myth/meme).

    Seriously, the one thing we can agree on is that there is often confusion regarding whether someone meant "1000" or "1024" when they used a prefix. The difference in approach between the two camps is:
    1. Stick with the status quo (where one tries to guess the convention being used based on context). That is, just accept the confusion/inaccuracy.
    2. Use SI units in the original SI sense (powers of 10) and use new binary prefixes when you really mean it (power of 2). That is, create a convention and adhere to it.

    Interesting that in a discussion about standards (and failures thereof) you would argue that a standard meant to reduce confusion is a prank! I agree, by the way, that "mebibyte" sounds kinda silly... but who cares? It gets the job done. ("Quark" was a silly name, but it's now deeply ingrained in science and no one thinks twice about it.)

    For what it's worth, many software products now use the binary prefix notation (e.g. Konqueror).
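
    To make the "1000" versus "1024" gap concrete, here is a minimal sketch of the two conventions. The `convert` helper and the 250 GB drive size are assumptions chosen for illustration; only the 17 MiB figure comes from the story summary.

```python
# Decimal (SI) and binary (IEC) prefixes as powers of 1000 and 1024.
SI = {"kB": 1000**1, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
IEC = {"KiB": 1024**1, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def convert(value, from_unit, to_unit):
    """Convert between any two of the units above via a byte count."""
    units = {**SI, **IEC}
    return value * units[from_unit] / units[to_unit]


# A hypothetical drive sold as 250 GB, expressed in GiB.
print(round(convert(250, "GB", "GiB"), 1))  # 232.8
# The 17 MiB error log from the summary, expressed in decimal MB.
print(round(convert(17, "MiB", "MB"), 2))   # 17.83
```

    The gap grows with each prefix step: about 2.4% at kilo, 4.9% at mega, and 7.4% at giga.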

    1. Re:Up with mebibytes! by menace3society · · Score: 4, Funny

      You're forgetting one thing: people have already adapted to the "old" usage. Dictionaries already exist saying that "mega-" can mean a factor of 1048576 units of computer data. If we change the system now, what will not happen is that everything disambiguates itself, and the hard disk companies stop lying to customers. What will happen is that

      1) Seagate et al. will continue to market their products in terms of GB and TB.
      2) Users will be outraged that their 232GiB hard disk only has 231 or so GiBs of usable space due to formatting, thus leaving the problem unsolved.
      3) People will lose good slang abbreviations like Meg and Gig to Kib, Mib, Gib (or Jib), Tib, and Pib, which not only sound stupid but will also be hard to distinguish in normal conversation.
      4) PHBs will misuse the binary-only versions as if they were base ten, especially if it catches on that "mebi-" is more than "mega-".
      Techie: Hey boss we've got new computers with 100 mebibytes of L1 cache.
      PHB: How much is a mebibytes?
      Techie: 1048576 bytes.
      PHB: Oh, so it's about a million then. Cool.
      Next Day
      PHB: Hey guys, we shipped nearly 2 mebi-units of dongles this quarter.
      Board: What's mebi-units?
      PHB: Well, it's.... Proceed into incorrect explanation that convinced Board of Directors that Boss is "with it"
      5) As a corollary to 4), people will start using those prefixes to refer to everything in a computer. The new chip is 3.2 GiHz, it draws 25 kiW of power, it weighs 21 Kig, etc.
      6) People will always think you are a douchebag.

      And that's not even getting into the confusion caused by having two different sets of prefixes for slightly different multipliers, maybe, during the transition.

      Ask any Brit: How much is a trillion?

  25. Re:You're missing the point of an ISO standard by Skrapion · · Score: 5, Insightful

    Not a bottom up, adopt the lowest common denominator of whats already out there

    Sure, the ISO does that a lot, and it's a fine approach. But that takes time, which is why the fast-track process was designed for standards which have already been implemented.
    --
    The details are trivial and useless; The reasons, as always, purely human ones.
  26. Re:You're missing the point of an ISO standard by Anonymous Coward · · Score: 5, Funny

    write a program that follows this standard:
    1. It must print "1" on exit
    2. It must print "2" on exit

    onExit() {
          print("1");
          print("2");
    }

    What's so hard about that?
  27. Re:You're missing the point of an ISO standard by davidkv · · Score: 5, Insightful

    There's a fundamental difference between the IETF and ISO. IETF makes standards of stuff that has been proven to work (or at least be implementable), whereas ISO wants to write specs to tell people what should work.

    A bit like comparing tcp/ip and whatsitsname (x400?). It doesn't really matter how nice something looks on paper if there's no good implementation of it.

  28. Re:Curiousity by Feyr · · Score: 2, Insightful

    microsoft was more than happy to play that game,
    until some governments stepped in and said any documents submitted to them in the coming years have to be in an open standard.

    so they bought their way to one and voila. their documents still don't conform in practice, but in theory it's an open standard

  29. Re:Curiousity by Fast+Thick+Pants · · Score: 2, Insightful

    You've fallen victim to Microsoft's water-muddying strategy -- They gave their new file spec the ridiculous name of "Office Open XML" (abbreviated OOXML) just so it would be conflated with OpenOffice.org's software and file formats.

    So this is not a case of a third-party compliance test like the Acid tests for web browsers; this is Microsoft failing to conform to their own standard.

  30. uhhhhh by niteice · · Score: 3, Funny

    Most of the problems reportedly relate to the serialization/deserialization code.
    um

    Isn't that what file formats do?
    --
    ROMANES EUNT DOMUS
  31. Referenced article promotes a bogosity. by Ungrounded+Lightning · · Score: 4, Informative

    The referenced article claims that "the English had imposed GMT on the rest of the world by force when Britain was a big colonial power", which is bogus.

    The English had a major sea trading infrastructure, at a time when improvements in clocks finally made accurate determination of longitude by celestial navigation practical for trans-Atlantic voyages.

    They established an observatory at a major port (Greenwich) to provide a time-hack for ships in port (both military and commercial) to set their clocks, and distributed navigational charts with that observatory's longitude as the basis for the coordinate system (thus simplifying navigational calculations).

    This quickly became the de facto standard on a voluntary basis among commercial shipping, along with the cities that grew up around major seaports (with offsets to approximate local noon, typically multiples of an hour, sometimes of a half- or quarter-hour), just as the coordinate system became the standard for shoreline mapping in other locations (to simplify navigation near shores by ships using the Greenwich meridian for their ocean charts). Then when railroads drove time standardization it spread from the seaport cities to inland locations.

    Of course the empire's military and government used it internally. But the rest of the world adopted it voluntarily.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
    1. Re:Referenced article promotes a bogosity. by Trails · · Score: 2, Interesting
      It was also encouraged in some part by the fact that the first clock which worked reliably on a ship, which you refer to, was invented by the Englishman John Harrison. This book, which discusses the inventor and his invention, is quite interesting and worth a read.

      The original meridian went through Paris, but shifted to Greenwich as a result of the aforementioned invention, and the naval political pull the British earned as a result.

  32. ODF wasn't fast-tracked by Xtifr · · Score: 5, Insightful

    Does anyone know if Open Office is compliant with the Open Document Format? Just curious.

    I don't know, but if none of the multiple (big difference already) vendors behind ODF have implemented it properly yet, then that just means that it shouldn't have been on the fast-track.

    Oh wait! It wasn't!

    The fast-track is for de-facto standards which are already so widespread (i.e. supported by multiple vendors) and consistent that there's little point in trying to push a divergent standard out, even though a divergent standard might be better. Something like TCP/IP would be a good example of the sort of thing where the fast track might be appropriate. ODF wasn't fast-tracked, so the standards committee came up with the best standard, irrespective of what might actually be out there in the wild. Now it's up to the vendors to catch up. That's the usual way this is done (e.g. the C++ standard, where most vendors took a few years to catch up, or the C standard where most vendors took a few months to catch up, and MS took a few years).

    Of course, if MSOOXML had gone through the regular track, it probably would have taken years to finish (since it's so large, complex, and poorly defined), and MS couldn't afford to wait. So instead they bought themselves a standards committee or twelve.
  33. Re:Curiousity by MightyMartian · · Score: 4, Insightful

    And that's what's been going on. However, a lot of governments and other organizations are now realizing that leveraging all that data they've been gathering for the better part of two decades on a closed, proprietary standard could lead to disaster. That's the whole point of trying to get an internationally recognized open standard that anyone can implement. ODF is supposed to fulfill the function of a published, implementable office document standard so that, theoretically, in 2100AD, when someone needs to open a document created in 2010, it's in an openly available format that, at the very worst, someone has to reimplement, but at least has clear, concise documentation that isn't thousands of pages long and doesn't include references to proprietary standards.

    The problem with that is that an open document format standard is a direct threat to Microsoft's near-monopoly in the office app department. If anyone can implement a document format that's cross-compatible, then they can easily implement a competitor to Office, and if they decide to undercut Office or (as with OO.org) give the damn thing away, then Microsoft's monopoly is one breath from collapse, and believe me, if Microsoft loses Office, they're in serious, serious trouble within five years. So OOXML, a "standard" that not even Microsoft can implement, gets pushed through the ISO using all sorts of peculiar and ultimately nefarious methods, which now means Microsoft and its partners can go around telling Small Town, USA that Office saves in an ISO standard. In reality, the poor bastard in 2100AD who needs to open this file is going to be spending many months trying to figure out this monster, which is in direct violation of the whole notion of an open standard.

    That you have no problems is irrelevant. That's not what the point of an open standard is.

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
  34. Re:You're missing the point of an ISO standard by MountainMan101 · · Score: 5, Funny

    The microsoft implementation would print "1" on Vista Home, "2" on Professional and "12" on Premium. It prints "4" on Linux just to prove it's linux that is broken. On Mac OS X it would print "1" and then "2" if you paid $50 more.

    Actually, what am I saying. A M$ program exiting cleanly.... ha ha

  35. What is the idea behind the "fast track" process? by walterbyrd · · Score: 3, Insightful

    I thought the idea behind the fast-track was to have a less-fussy way of ratifying standards, when those standards were already widely used.

    If that is correct, then how does the MSOOXML standard qualify? This is a "standard" that is used by absolutely nobody, not even the creator of the standard uses this standard.

    Do I not understand the idea behind the fast-track process?

  36. Re:That is an improvement by Tony+Hoyle · · Score: 4, Informative

    Do ya think?

    Governments started demanding documents in open formats.. that threatened their monopoly, so they paid to get their XML schema called one.. now governments go back to buying exclusively Office again... MS Wins.

    End users don't give a shit about open. Governments do but only on paper.. once it comes down to the buying decision all they need is a checkmark on a list. It doesn't actually have to mean anything (cf. Posix compatibility in NT4.. damned near useless but it was a requirement at the time).

  37. Re:That is an improvement by inTheLoo · · Score: 4, Insightful

    The point of the article is that there are no conforming implementations. There never will be a conforming implementation and everyone knows it.

    --
    No calls now, I'm ...
  38. Yes, I think so. by inTheLoo · · Score: 3, Interesting

    ODF is the tip of a very big iceberg. It's an important and public facing tip but it is a small part of both government and business wasting money on the upgrade treadmill and all the intentional waste of M$ Office. It's all downhill from here.

    --
    No calls now, I'm ...
    1. Re:Yes, I think so. by Allador · · Score: 4, Informative

      What is this 'upgrade treadmill' you're referring to?

      Most .gov orgs at least here in the US that I've seen are using everything from Office 97 to Office 2003, but none are using Office 2007.

      That suggests to me that there is no 'forced upgrade' or 'upgrade treadmill'.

      What is it that you're seeing that indicates otherwise to you?

    2. Re:Yes, I think so. by Allador · · Score: 3, Informative

      Easy solution.

      1. Tell them to send you .doc or .pdf.

      or

      2. Install the free, simple, easy to install compat pack from MS.

      Nothing you've said here translates into MS forcing you to upgrade. In fact they've given you tools that make it easy and simple to NOT upgrade, and made them free to download.

      This is not to suggest that MS doesn't WANT you to upgrade, of course they do. But many, many businesses and orgs are still running quite successfully on older versions of MS Office.

    3. Re:Yes, I think so. by Anonymous Coward · · Score: 3, Insightful

      The compat. pack won't install on my BeOS desktop, what am I doing wrong?

    4. Re:Yes, I think so. by Knuckles · · Score: 4, Informative

      Of course the compat pack only covers features that are shared between the different Office versions. If someone sends you an *.xlsm file with 66,000 rows, you are out of luck even with the compat pack.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
  39. Re:That is an improvement by willyhill · · Score: 3, Informative
    Anyone posting on this thread should be aware that "inTheLoo", "gnutoo" and "westbake" are sockpuppet accounts of the person who posted the original troll comment, twitter.

    twitter now has six known accounts on Slashdot, three of which have negative or near-zero karma.

    --
    The twitter monologues. Click on my homepage and be amazed.
  40. well... by sentientbrendan · · Score: 3, Interesting

    >How many other fast-tracked ISO standards have no conforming implementations?

    C++?

    Try out the "export" keyword next time you write any C++.

    1. Re:well... by david_thornley · · Score: 4, Informative

      C++ wasn't fast-tracked.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  41. At least one other by ribuck · · Score: 3, Informative

    How many other fast-tracked ISO standards have no conforming implementations?

    ISO 25436 describes a version of the Eiffel programming language that has never been fully implemented. The standard contains lots of "blue-sky" "would-be-nice-to-have" sections which are planned to be implemented in the future.

    ECMA gives the document author a lot of control, so things can become ECMA standards that would not become ISO standards. But then the fast track ISO process (for existing ECMA standards) makes it easier for them to become ISO standards.

  42. Only 122,000 proprietary extensions by flyingfsck · · Score: 2, Funny

    I don't see any problem. Under the standard, proprietary extensions are allowed...

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!
  43. Re:Does anyone know if Open Office is compliant wi by makomk · · Score: 4, Interesting

    As far as I know, Open Office produces valid ODF documents (with the odd extension for things like spelling and grammar checker options that are application-dependent), but it doesn't necessarily implement 100% of the latest version of the ODF spec. (In fact, IIRC sometimes other word processors add support for new ODF features before it does.) Since ODF is a committee-developed standard not based on what any one word processor does, this really shouldn't be surprising.

  44. Re:That is an improvement by Allador · · Score: 4, Interesting

    I wouldn't agree with your statement.

    The point of the article is that MS Office isn't conformant to the STRICT version. This shouldn't come as a surprise: the change from the original OOXML to the strict version happened, but no new versions of MS Office have been released since. The best thing anyone could reasonably expect of a company is that they would update it in the next Office 2007 service pack.

    Office comes in a 2-4 year release cycle, and the change in ISO from the transitional version to the strict version happened after Office 2007 SP1 was already done.

    How could MS have known in advance the changes that would happen to the standard? They can't see into the future.

    Don't forget here that the STRICT version is NOT representative of what any version of Office produces. We already knew that.

    It was an ISO evolution of the submitted version (the transitional one). The vendor would need some time and a release cycle to adapt their products to it.

    What _will_ be interesting is how/when/if MS does conform to the strict format.

    On the other hand, the MS Word conformance to the transitional format seems reasonable. TFA only noted one problem, where an attribute value was using on/off rather than true/false. This is minor and easily fixed and/or recorded as a known issue.

  45. Facts? by argent · · Score: 3, Informative

    Facts? Try this fact: this is not an external standard that Microsoft is supposed to bring their software into line with, this standard was presented by Microsoft as accurately describing what their software actually did. That's the whole reason it was "fast tracked", because it was supposed to be a description of a conforming implementation.

    If it's not, then it shouldn't have been "fast tracked", it should have gone through the same process as current HTML standards... you know, the ones Acid3 is testing...

    That is, the issue is not whether Office conforms to the standard, but that Microsoft lied about its status.

  46. Re:Does anyone know if Open Office is compliant wi by Kalriath · · Score: 2, Insightful

    I've heard elsewhere in this Slashdot discussion that apparently there is a point where OO.o blatantly violates the specification - using the exact opposite value for hidden text as it's meant to. So it's almost valid.

    --
    For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
  47. Acid3? WTF? I thought this was about OOXML? by walterbyrd · · Score: 2, Insightful

    WTF does the acid3 test have to do with any of this?

    However firefox does with the acid3 has nothing to do with ISO corruption, does it?

  48. Re:You're missing the point of an ISO standard by naer_dinsul · · Score: 2, Funny

    Or even better... We could spawn two threads, one to handle each print. That way we could never be quite sure what order they'd appear in!
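
    A minimal sketch of that two-thread variant, assuming each "print" is collected into a list so the result can be inspected; `run_exit_handlers` and `emit` are hypothetical names. The lock keeps each append intact but does nothing to fix the order, which is left to the scheduler.

```python
import threading


def run_exit_handlers():
    """Run two hypothetical exit handlers on separate threads."""
    output = []
    lock = threading.Lock()

    def emit(text):
        # The lock protects the append; it does not impose an order.
        with lock:
            output.append(text)

    threads = [threading.Thread(target=emit, args=(digit,)) for digit in ("1", "2")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return output


# Both values always appear; which one comes first is not guaranteed.
print(run_exit_handlers())
```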

  49. Re:What kind of BS is that? "Strict Standard?" by Allador · · Score: 2, Insightful

    You either conform to a standard or you don't.

    That's a nice theory but not really practical. ISO OOXML (strict version) was created between MS product releases.

    How long should it have taken for MS to release a version that matched ISO OOXML strict? One hour? One day? One year? More?

    Companies don't have the magical ability to instantly create a released product the day that the standards group settles on something. That's just absurd.

    A standard that allows non conforming versions is no standard.

    Standards don't allow or disallow implementations. That's not how it works. Standards exist. Implementations try to be compliant to them.

    According to TFA, Office 2007 OOXML is very conformant to ISO OOXML Transitional. But it's not very conformant to ISO OOXML Strict.

    This should not be a surprise. For example, the Strict version removes VML as a vector graphics markup. But MS has a decade or more of investment in VML, and their currently released products use VML. It will take a while for MS to change Office to not use VML (assuming they do choose to).

    If it would take 2 to 4 years for M$ to properly implement and document their crappy little standard, it should take 2 to 4 years for people to believe they had a standard worthy of ISO approval.

    I agree that it shouldn't have been fast tracked. That was a bit of an abomination. But let's be clear that MS didn't create a new standard, and then implement it. They just continued to develop their existing implementation, and documented what they already had. OOXML is not a fresh creation ... it's a documentation of something that has existed and been evolving for 10-15 years.

    Standards that come from mature, crufty old de-facto standards (e.g., OOXML) are always going to be uglier than standards that were created to be a standard from day one (e.g., ODF). That's just reality. Expecting it to be clean and pretty is not reasonable.

    But the world where OOXML and the previous binary .doc .xls, etc formats are documented (ie, the world we're in now) is better than the one we were in before, where none of it was documented.

    PS, thank you Twitter for being reasonably coherent and making a post that, littered with the M$ nonsense that it is, at least was a reasonable discussion.
  50. Re:What kind of BS is that? "Strict Standard?" by Allador · · Score: 2, Informative
    Your quote was from the Groklaw summary, which used some creative editing of the original blog posting to create drama and brouhaha. It's important to go to the actual article that Groklaw was quoting and get the information from the source.

    Based on the actual root article, the results from the transitional version were nearly perfect, with 84 instances of the same (very minor) class of error.

    From TFA:

    The TRANSITIONAL conformance model is quite a bit closer to the original Ecma 376. Countries at the BRM (rather more than Ecma, as it happened) were very keen to keep compatibility with Ecma 376 and to preserve XML structures at which legacy Office features could be targeted. The expectation is therefore that an MS Office 2007 document should be pretty close to valid according to the TRANSITIONAL schema.

    Sure enough (again) the result is as expected: relatively few messages (84) are emitted and they are all of the same type complaining e.g. of the element:

    <m:degHide m:val="on"/>
    since the allowed attribute values for val are now "true", "false", etc. This was one of the many tidying-up exercises performed at the BRM. This is a simple (and very common in this sort of thing) error, and not too surprising or worrisome. It's basically a very minor erratum.

    This is actually quite impressive, given that the transitional version is not the same as what MS originally proposed, and so there was also little expectation that a document format created in the past would be conformant. It looks like the groups went to some effort to make sure that the transitional version was nearly 100% compatible with what MS Office 2007 actually emits.
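
    For anyone who wants to see what that class of error looks like mechanically, here is a rough sketch that scans a fragment for `m:val` attributes still using on/off. It is not what jing does (jing validates against the full schema). The namespace URI, the `m:root` and `m:other` elements, and the `find_legacy_booleans` helper are placeholders made up for illustration; only `m:degHide` and the "on" value come from TFA.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, used only for this illustration.
M_NS = "http://example.invalid/math"
# Legacy boolean spellings; TFA shows "on" being rejected.
LEGACY_VALUES = {"on", "off"}

SAMPLE = f"""<m:root xmlns:m="{M_NS}">
  <m:degHide m:val="on"/>
  <m:degHide m:val="true"/>
  <m:other m:val="off"/>
</m:root>"""


def find_legacy_booleans(xml_text):
    """Return (local element name, value) for each m:val using on/off."""
    hits = []
    for elem in ET.fromstring(xml_text).iter():
        value = elem.get(f"{{{M_NS}}}val")
        if value in LEGACY_VALUES:
            local_name = elem.tag.split("}", 1)[-1]
            hits.append((local_name, value))
    return hits


print(find_legacy_booleans(SAMPLE))
```

    A fix on the producing side would amount to writing "true"/"false" wherever this scan reports a hit.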

    And it shouldn't be surprising to anyone that Office 2007 doesn't conform to the strict version. The strict version was semi-major surgery on what MS proposed. And it was developed long after Office 2007 was released.

    More from TFA:

    Validating against the STRICT model

    The STRICT conformance model is quite a bit different from Ecma 376, essentially because most of that format's most notorious features (non ISO dates, compatibility settings like autospacewotnot, VML, etc.) have been removed. Thus the expectation is that existing Office 2007 documents might be some distance away from being valid according to the strict schemas.

    Sure enough, jing emitted 17MB (around 122,000) of invalidity messages when validating in this scenario. Most of them seem to involve unrecognised attributes or attribute values: I would expect a document which exercised a wider range of features to generate a more diverse set of error messages.

    Again, to restate: the strict version of ISO OOXML (what causes all the errors in validation) is NOT based on the current version of MS Office 2007. Therefore there is no reason to expect that Office 2007 docs would be fully compliant. The strict version did not exist when Office 2007 was created, therefore it was not possible for them to be conformant to it.

    To do so would have required them to predict into the future the path that ISO would take.

    Now the interesting question will be whether MS aligns with the strict ISO OOXML in a future Office 2007 Service Pack, or even if they clean up that one minor issue found here (on/off vs. true/false in attributes).

    The strict version breaks a lot of backwards compatibility with legacy documents that were created in much older versions of Office and forward converted. Given that, I'll be interested to see what MS does with this over the next year or two as their releases catch up to the ISO standards.

  51. Comment removed by account_deleted · · Score: 3, Insightful

    Comment removed based on user account deletion

  52. Re:Really? by willyhill · · Score: 2, Interesting
    For someone with a 1.2M+ UIN and a grand total of five posts, you sure are versed in Slashdot lore.

    You created this account as a clever variation on westlake, just like your Mactrope troll account was intended to be confused with Macthorpe.

    That makes it six sockpuppet accounts so far. To repeat what I've been asking you, how long do you figure this can go on?

    --
    The twitter monologues. Click on my homepage and be amazed.