Slashdot Mirror


Microsoft to Open up Office Formats

Been on TV writes to tell us that Microsoft is expected to announce on Tuesday the opening of their Office file formats, according to the Financial Times. From the article: "Microsoft will submit its Office file formats to Ecma International, the standards body, which will develop the documentation and make it available to the industry. The move is being supported by a number of organizations including Apple Computer, Barclays Capital, BP, Intel and Toshiba."

7 of 451 comments

  1. Define "open up" by Psionicist · · Score: 4, Interesting

    So.. Will they really open everything, or just wrap their proprietary implementation inside XML and therefore claim their format is "open"?

    I hope they really open up the format. Otherwise it'd be as bad as RIAA promoting DRM "for freedom". Sigh.

  2. 18 months? by dada21 · · Score: 4, Interesting


    It seems odd that it will take 18 months to develop documentation for the file formats. Sure, the formats must be complex, but it seems like maybe this documentation organization might not be a truly independent standards body.

    Ecma's wiki and site seem to pretty much confirm that they're composed of manufacturer members. I wouldn't consider them the equivalent of ANSI or UL. 18 months of work by a collusive industry is more throwing those governments a bone than actually getting the work done right.

    I guess there should be some applause for getting the ball rolling. Uphill?

    1. Re:18 months? by jc42 · · Score: 4, Interesting

      Ecma's wiki and site seem to pretty much confirm that they're composed of manufacturer members. I wouldn't consider them the equivalent of ANSI or UL.

      A related point that I'm wondering about: When the standards specs are complete, how will I get them? Will they be online? Or will I have to pay and sign an NDA to get a copy?

      This isn't an idle distinction. I well remember, back in the 1980's, working on networking projects where we really wanted to get the OSI stuff up and running alongside IP, to compare them. A problem was that the OSI specs weren't online; they could only be ordered in print. By the time we got a purchase order approved, an order sent, and the docs delivered, we had long since downloaded the RFCs for the internet equivalent and implemented it all. And part of the problem was that we had to hand-type the stuff from the OSI specs, leading to lots of typos and extra time to spot the typing errors. The IP docs could be directly copied to the code without error. (And yes, I am one of those weirdos who writes perl scripts that read spec docs and spit out code. I've gotten all sorts of funny reactions from people when they first discover those entries in my makefiles. ;-)

      The end result was that our OSI code could never catch up with the IP code. It couldn't even come close, simply due to the delays in dealing with for-pay, on-paper specs when the competitor was instantly available online in machine-readable form.

      If we'd had to sign NDAs for the OSI stuff, we'd never have gotten anywhere. But then, I guess we really didn't anyway, because all that OSI code is now dead and forgotten.

      I can see ECMA using a similar approach to delay us "open source" geeks, so they can hold it semi-private while oh-so-innocently pretending to have opened it all. It'll likely be open in the same sense as the OSI specs, but maybe with NDAs. With MS's marketing clout, the effect won't be to eliminate those formats from the market. The main effect will be a big drag on developers' time, as they try to jump through all the hoops required to get something working.

      I do expect that 6 months from now, we'll be hearing a lot of "Hey, we opened the formats, but nobody else has implemented them. Our competitors must be intentionally ignoring them; or maybe they're just incompetent." No mention of the fact that the specs haven't been published yet. And, if computing history is any guide, that 18-month estimate means at least 3 years, probably more.

      This sort of thing isn't what you'd call efficient. But I don't suppose anybody ever called software a rational market.

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  3. Seems To Only Count For Writing by EXTomar · · Score: 4, Interesting

    They are fully and openly specifying how to write all of the Office formats. While this is good, it does nothing for the other important half, which is reading. They clearly don't want all applications to perfectly read files generated by some software. This tactic seems to guarantee that at least one product will read "clean" as well as special Office formats: Office itself.

    I suppose people can take the information on how to write a valid "clean" Office format to make better format translators, but we are still hosed for various random files that will be generated and only readable in sanctioned applications.

  4. Re:Perhaps, but I suspect more of the same... by masklinn · · Score: 4, Interesting
    In this case, I suspect they'll end up releasing, but still maintaining control over the office formats.

    Adobe has been doing it for years with the PDF format, and most people are ok with it.

    Keeping control over the evolution of the format but having the specs fully open so as to allow completely compatible products is a good thing, I myself would appreciate it if MS did that.

    --
    "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
  5. Re:Write not read by dereference · · Score: 4, Interesting
    Do not a single one of you idiots understand a binary file format?

    Well, on the admittedly limited chance that you're not trolling, and that instead you'll actually consider a reasoned response with an open mind, I'll try one more time. First, yes, we "idiots" do know exactly what we're talking about, so I honestly hope you'll bother to read and possibly even learn something.

    The fine example you gave is a trivially simple and quite static format, similar to an image. It is far from complex and dynamic enough to describe any useful arbitrary document. If you'd actually re-read the post to which you replied, you'd find a much more relevant example, that of HTML. HTML can't be described as a basic C-style structure like your example, but a formal grammar such as BNF (or a DTD for XHTML) could be used. However, you can very easily omit many optional flags/features when describing how to write a valid document in any such format. As noted, I might tell you only about the head, title, and body tags, and perhaps the h1-h6 tags as well.

    Is it possible for me to neglect to tell you about all the other formatting tags (like b and i and friends) and even "forget" to mention the whole hyperlink concept with the "a" tag? Sure it is. Can you write a valid document? Sure you can. Now, can you really read all possible documents, including those that use the tags I so conveniently neglected to describe? No.
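
    As a rough illustration of that asymmetry, here is a minimal Python sketch using the standard library's html.parser. The DOCUMENTED_TAGS set, the SubsetReader class and the undocumented_tags helper are hypothetical names made up for this example; they stand in for "the part of the spec I chose to publish" and are not taken from any real specification.

```python
from html.parser import HTMLParser

# Hypothetical "published" subset: only the tags the spec author chose to describe.
DOCUMENTED_TAGS = {"html", "head", "title", "body",
                   "h1", "h2", "h3", "h4", "h5", "h6"}


class SubsetReader(HTMLParser):
    """Reader built only from the published subset; records tags it was never told about."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us lower-cased tag names.
        if tag not in DOCUMENTED_TAGS:
            self.unknown.append(tag)


def undocumented_tags(document):
    """Return, in order, the start tags in `document` that the subset does not cover."""
    reader = SubsetReader()
    reader.feed(document)
    reader.close()
    return reader.unknown


# A document written from the subset, and one written by the vendor's own tool.
mine = "<html><head><title>t</title></head><body><h1>Hi</h1></body></html>"
theirs = "<html><body><h1>Hi</h1><b>bold</b> <a href='x.html'>link</a></body></html>"
```

    With these inputs, undocumented_tags(mine) comes back empty, while undocumented_tags(theirs) lists the b and a tags: everything I write is valid, but I cannot fully interpret what the other side writes.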

    Let's even use your own example, with a modification:

    long version 0x0100
    long number of strings 0x0002
    long string length
    string
    long string length
    string
    long number of options 0x0001
    int option_num
    int option_length
    byte [] option_data
    EOF

    Here you see that I've told you how and where to add multiple options. However, I've not told you what options are valid. I might only tell you about some of the options and not others. You can always still write a document given that format, but you can't read all documents unless you've been told all the possible valid options.
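
    The same point can be sketched in Python for the layout above. This is only an illustration under stated assumptions: every "long" and "int" is taken to be a 4-byte little-endian unsigned integer, strings are taken to be UTF-8, and the option numbers (1 as a documented "title" option, 7 as an undocumented one) are invented for the example. None of this describes any real Office format.

```python
import struct

# Assumption: every "long" and "int" field is a 4-byte little-endian unsigned integer.
U32 = struct.Struct("<I")

# Hypothetical option table: the only option the published spec describes.
DOCUMENTED_OPTIONS = {1: "title"}


def write_doc(strings, options):
    """Serialize strings and (option_num, option_data) pairs in the layout above."""
    out = [U32.pack(0x0100), U32.pack(len(strings))]
    for s in strings:
        data = s.encode("utf-8")
        out += [U32.pack(len(data)), data]
    out.append(U32.pack(len(options)))
    for num, data in options:
        out += [U32.pack(num), U32.pack(len(data)), data]
    return b"".join(out)


def read_doc(blob):
    """Parse a document; return (strings, understood options, opaque options)."""
    pos = 0

    def u32():
        nonlocal pos
        (value,) = U32.unpack_from(blob, pos)
        pos += U32.size
        return value

    def take(n):
        nonlocal pos
        chunk = blob[pos:pos + n]
        pos += n
        return chunk

    version = u32()
    if version != 0x0100:
        raise ValueError(f"unsupported version {version:#06x}")
    strings = [take(u32()).decode("utf-8") for _ in range(u32())]

    understood, opaque = {}, {}
    for _ in range(u32()):
        num = u32()
        data = take(u32())
        if num in DOCUMENTED_OPTIONS:
            understood[DOCUMENTED_OPTIONS[num]] = data
        else:
            # The framing is known, so the bytes can be skipped, but not interpreted.
            opaque[num] = data
    return strings, understood, opaque


# The vendor's writer uses option 7, which the published table never mentions.
blob = write_doc(["hello", "world"], [(1, b"Report"), (7, b"\x01\x02")])
strings, understood, opaque = read_doc(blob)
```

    The reader gets both strings and the "title" option back, but option 7 ends up in opaque: it can be stepped over thanks to option_length, yet its meaning is unknown, so the document is only partially read.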

    So, really I hope this hasn't been a waste of time, and that you can see that Microsoft can choose to give out any arbitrary amount of detail for how to write a proper and valid document, without giving sufficient tools with which to parse all possible valid documents.

  6. Re:Hold on... by SeventyBang · · Score: 4, Interesting



    They're opening their file formats because they still have a trump card. Or has anyone forgotten about this?

    A quick patch or two to Microsoft Office (now one of their biggest or the biggest ca$h cow - 1/3 of their profits?) and MS Office suddenly reads|writes XML format only. They aren't about castrating themselves voluntarily. They still have shareholders to keep happy, but more importantly, they want to be the trendsetters, no matter what.

    How does this impact Open Office? Open Office can then read the XML Format because it's declared in the patent. But what O^2 won't be able to do is write the MS Office XML Format [except to violate the patent]. This means: no interoperability and any business which wants to migrate away from a closed system (MS Office) to Open Office can do so only as a one-way trip, burning the bridge behind them. And the company can't communicate both directions, so that forces a move en masse.