Slashdot Mirror


Microsoft Releases Pre-2007 Binary File Format Specs

An anonymous reader writes "Microsoft has released the specifications for the binary file formats used by pre-2007 Microsoft Office applications. They're accurate this time! Honest! While the documents are enormous (Word alone requires 533 pages; Excel runs over 1000 plus another 850 pages for the Office 2007 binary format), they hopefully will be useful to developers trying to create or extract information from Microsoft Office files (which, despite their flaws, have been the de facto standard in many fields for some time now)."

27 of 269 comments

  1. How freaking "open" of them... by clang_jangle · · Score: 5, Insightful

    ...to finally share proper doc of the old standards. This just means they feel confident that MS Office 2007 will take firm enough root to ensure that the old game of catch up for FOSS projects will stay the same.
    And wasn't it just yesterday some twits had an article about how MS is changing/will change? I sure wouldn't hold my breath!

    --
    Caveat Utilitor
    1. Re:How freaking "open" of them... by _xeno_ · · Score: 2, Insightful

      It's useful for people who want to generate Word documents. A project I worked on wanted to generate Excel spreadsheets as a way to download reports from a web application. We got it to work using Apache POI's HSSF, which, while it doesn't implement everything, has reverse-engineered enough of the format for it to work.

      ...Wait a moment. Allowing people to generate documents using old formats that work with the current Office actually helps Microsoft's Office monopoly, doesn't it? And here I thought they were just being kind.

      --
      You are in a maze of twisty little relative jumps, all alike.
    2. Re:How freaking "open" of them... by Anonymous Coward · · Score: 2, Insightful

      A project I worked on wanted to generate Excel spreadsheets as a way to download reports from a web application.

      Or you could NOT be a fucking retard and just use CSV.

      But then it would be interoperable with every spreadsheet and you wouldn't be able to make the Microsoft bash. So I guess that wouldn't serve your purpose.

    3. Re:How freaking "open" of them... by Lord Crc · · Score: 3, Insightful

      ...to finally share proper doc of the old standards. This just means they feel confident that MS Office 2007 will take firm enough root to ensure that the old game of catch up for FOSS projects will stay the same.

      I guess that whole ISO voting stuff on OOXML just passed you by?

    4. Re:How freaking "open" of them... by neokushan · · Score: 5, Insightful

      If they keep hold of the spec and don't release it, you'll bitch about them not being very friendly.

      If they release the spec to everyone and promise not to go after any Open Source projects that may take advantage of it, you'll bitch about them still trying to line their own pockets.

      Really, Microsoft has no chance of pleasing you, do they? Just accept that it's good for everyone to have open standards, regardless of the possible ulterior motives involved.

      --
      +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
    5. Re:How freaking "open" of them... by Anonymous Coward · · Score: 2, Insightful

      not playing devil's advocate here but csv is just that: "comma separated values." he might want to include formatting, simple formulae, etc. in the generated excel file.

    6. Re:How freaking "open" of them... by Z34107 · · Score: 2, Insightful

      Sigh. Microsoft can never do anything right, can they?

      A week or so ago people were whining that they wouldn't release the specs. Well, they've started external documentation for the 2003 binaries - and your link has documentation links for 2007 as well.

      At least they warn you that they might have patents - this isn't some kind of submarine patent trolling operation. For commercial products, they even give you a link to some Nice People who will help you wade through the minefield.

      Not perfect, amazing, miraculous, or complete, but surely we can agree that this is a Good Thing. It definitely doesn't hurt anything.

      --
      DATABASE WOW WOW
    7. Re:How freaking "open" of them... by Xtifr · · Score: 5, Insightful

      It is important to note that open source developers, whether commercial or non-commercial, will not need a patent license for the development of implementations of these protocols or for the non-commercial distribution of these implementations,

      So...commercial developers can develop as long as they don't distribute. Boy, that's helpful/useful. About as helpful and useful as a kick in the nuts. :)

      I still say the idea that a protocol can be patented is silly to the point of almost being an oxymoron. We can, perhaps, debate whether an implementation of a protocol can be patented, but the idea that the protocol itself can be patented seems like blatant abuse of the patent system, even if you're one of those who believes that software or business-method patents are a valid notion.

      Fortunately, it does seem to be getting easier to challenge patents. Now if only we could get MS to admit what patents they think various open source projects might be violating, so we can start the search for prior art.... :)

      (Alternatively, maybe we can keep them muttering vague threats about their patents without being specific long enough that we can ask for estoppel or laches if they ever do try to get specific. The rumblings help because that way they can't pretend that they didn't know about the supposed violations all along, a vital point in raising a defense of laches.)

    8. Re:How freaking "open" of them... by DickBreath · · Score: 5, Insightful

      > Sigh. Microsoft can never do anything right, can they?

      They *could* do something right, but they choose not to. It would work against their business model.

      They *could* release specs unencumbered by patents. They simply don't want to.

      True interoperability is the last thing that they truly want.

      This has happened before. It will happen again. See IBM decades ago. The entrenched monopolist is never in favor of true interoperability -- never mind whatever they may say. Everybody else who lives on the scraps is in favor of interoperability. Who you think is right depends on whether you think the monopolist currently in power has the God-given right to be the only one in the business.

      --

      I'll see your senator, and I'll raise you two judges.
    9. Re:How freaking "open" of them... by smittyoneeach · · Score: 2, Insightful

      The rest of the captives and I are keen on feeling the Rorschachian "Yes We Can" zeitgeist so prevalent in modern politics and this Microsoft announcement, as we sit chained to the oar.

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    10. Re:How freaking "open" of them... by SirSlud · · Score: 2, Insightful

      Theoretically, you can't sell a product that operates natively on their data formats. But if free tools are available to translate from MS format to format X, that seems to free up commercial software to provide free translators to translate to native or other open formats.

      --
      "Old man yells at systemd"
    11. Re:How freaking "open" of them... by clang_jangle · · Score: 3, Insightful

      I believe he meant that this announcement from MS would give some "hostages" who wish to feel more optimistic about MS becoming less evil some false hope to brighten their dreary day.

      --
      Caveat Utilitor
    12. Re:How freaking "open" of them... by dreamchaser · · Score: 2, Insightful

      You are making a mountain out of a molehill. There is no reason a standalone utility couldn't be distributed for free that works with any of those platforms.

      MS also couldn't use GPL code in their OS without issues. It works both ways. MS didn't have to release these specs at all. I'm glad they did; it's a step in the right direction. No need to be politically militant about it.

  2. Re:So that's only about 2400 pages! by peragrin · · Score: 3, Insightful

    actually that's in addition to the 6,000 pages for the OOXML spec, since the OOXML spec references that data.

    --
    i thought once I was found, but it was only a dream.
  3. Honest Attempt by clampolo · · Score: 4, Insightful

    I honestly believe that they are trying to give out complete information. It's just that they have 20 years of spaghetti code to somehow shape into an API document. I doubt if anyone at Microsoft really knows how the code works.

    With a 1000 page document describing how to list off spreadsheet information, I shudder to think about how organized their kernel is.

    1. Re:Honest Attempt by syousef · · Score: 4, Insightful

      Joel on Software my arse. I do wish people would stop quoting that shill. He's a Microsoft apologist who in the past has managed to present Bill Gates' unprofessional attitude (swearing at staff etc) as some kind of misunderstood genius. No Joel, your boss was an unprofessional asshole.

      As for this article. No intern should have been working on Microsoft's flagship product even 15 years ago. That's 1992 we're talking about, not 1982. It's entirely possible to write efficient code that isn't unreadable spaghetti and it's not always a good solution to use Office automation to read office documents.

      --
      These posts express my own personal views, not those of my employer
  4. Kudos to them by Enderandrew · · Score: 4, Insightful

    I can't understand the negativity. Sure Microsoft has an unpleasant past, but this is a good move on their part and should be met with nothing less than praise.

    We want to encourage more behavior like this.

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  5. 2 things though... by hee gozer · · Score: 4, Insightful

    a) Does this mean the standard GNU response is now invalid?

    b) If someone writes a FOSS implementation of a .doc/.xls viewer, does that mean MSFT could more easily throw their weight to declaring .doc a standard? (Since a standard ought to have multiple implementations, although maybe office 2003 and 2007 counts as two, or office and word/excel/powerpoint viewer :p )

    1. Re:2 things though... by Sloppy · · Score: 2, Insightful

      I don't think it has made GNU's response invalid, just a little weaker. It used to be somewhere in between legally impossible and nearly impossible, to implement Microsoft's format. Now it is "merely" pragmatically impossible. It's still a joke-of-a-format, with absurdly-unnecessary complexity.

      I don't think anyone will ever write a reliable and complete (*) viewer for these formats, but I guess I shouldn't misunderestimate the amount of money someone like Novell or Sun might throw into something like that. I ain't holding my breath, though.

      (*) I have to throw in those qualifiers, because OO does often do an amazing job. The key word is "often."

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
  6. Re:unusually bloated ? by IamTheRealMike · · Score: 2, Insightful

    If you think Word is only dealing with "saving text" you need to spend some time learning what it can do. The format specs are big because their users' needs are big.

  7. Yes, kudos for this ... but not for MS's past by KWTm · · Score: 4, Insightful

    I can't understand the negativity. Sure Microsoft has an unpleasant past, but this is a good move on their part and should be met with nothing less than praise.
    We want to encourage more behavior like this.

    You are right. This is a great step forward. However, I think the Slashdot community, with its cynical eye on Microsoft, is reminding us to take this in the proper context. It remains to be seen whether this is the beginning of a slow but steady change of course for the world's largest software company, or whether this is a fake-out to fool people into thinking that Microsoft is nice.

    Personally, I suspect that this reflects internal conflict within Microsoft, with some portions of the behemoth trying to do something good, while another faction is still trying to squeeze money out of Microsoft's unique position in the software world.

    In any case, remember how some people would say, "You always complain about Microsoft! What would it take for you to admit that Microsoft is doing something good?"

    #2 on the list was: Stop hijacking the HTML standard and make a compliant browser! Then they put out IE7. (Not perfect, but a heckuva lot better than IE6!)

    #1 on the list was: Open up the Word document file format. Okay, so they've done that. (Again, not perfect, but a heckuva lot better than what went on before!)

    Congrats, Microsoft. You did it. A little late in coming, and you really didn't impress us with your OOXML fiasco waving that money around, but I'm willing to adopt a wait-and-see attitude to see whether it's still those same money-grubbing upper level managers that are in control, or whether this really is a new day at Microsoft.

    --
    404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
    [GPG key in journal]
  8. Visio by llzackll · · Score: 4, Insightful

    Where is Visio ?

  9. free software .. by rs232 · · Score: 2, Insightful

    "This is definitely useful for app developers of free software"

    You mean as in you work on the implementation for free and Microsoft benefits from any commercial developments.

    --
    davecb5620@gmail.com
  10. Still, don't expect a converted Scrooge from that. by Spy der Mann · · Score: 2, Insightful

    I'm sure this move was somewhat forced to please the European Union or something.

    In any case, I'm sure this would be just what Sun needs to make OpenOffice(.org) more compatible with MS Office than MS Office itself :)

  11. Meh.. /.-ers by comm2k · · Score: 4, Insightful
    for all those thinking that this has anything to do with Gates leaving - you're wrong, it's neither right nor interesting AND CERTAINLY NOT 5+ INSIGHTFUL.
    Microsoft releases API/protocol specs | Feb. 2008
    http://www.theregister.co.uk/2008/02/21/microsoft_goes_open/
    Microsoft releases further specs | April 2008
    http://www.theregister.co.uk/2008/04/08/microsoft_posts_protocol_documents/

    And they state that more will come after gathering feedback between then and June.

    Between now and June it will garner feedback from the developer community. Then, at the end of June, Microsoft will publish the final versions of technical documentation - along with definitive patent licensing terms.

  12. Re:By following the links.... by temcat · · Score: 2, Insightful

    You can't say it promises nothing if you haven't actually ATTEMPTED an implementation.

    This does not make sense. Their promise or non-promise is in no way contingent on my actions. It's me who has to consider what they promise before acting. If I find ambiguities, I'd better not act until these are clarified. And there are plenty of ambiguities. If you really can't see what they are, there are links to the analysis on Groklaw.

    I personally saw the ambiguities immediately when I read the CNS. And remember, it's non-lawyers who are going to implement the spec, so the covenant must be as clear as possible.

  13. Re:So that's only about 2400 pages! by kestasjk · · Score: 3, Insightful

    Because "pages" are a great way to measure a spec's size..

    What about line spacing, detail of information, number of examples? If the spec is clearest when fully expanded who cares if they can squeeze it onto a single page in microfilm by cutting out helpful documentation?

    Rather than looking at the number of pages why not look at the number of distinct node types/attributes? Surely that would give a better idea of spec size?

    --
    // MD_Update(&m,buf,j);
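
A rough sketch of the counting idea in the comment above, assuming a sample document (or a schema) is available as XML text. It tallies distinct element names and distinct (element, attribute) pairs using Python's standard xml.etree.ElementTree. The function name count_distinct_nodes and the sample markup are hypothetical and only for illustration; real OOXML parts use namespaces, which ElementTree reports inside the tag name as {uri}name.

```python
import xml.etree.ElementTree as ET


def count_distinct_nodes(xml_text):
    """Return (distinct element names, distinct (element, attribute) pairs)."""
    root = ET.fromstring(xml_text)
    elements = set()
    attributes = set()
    # root.iter() walks the root and every descendant element.
    for node in root.iter():
        elements.add(node.tag)
        for attr_name in node.attrib:
            attributes.add((node.tag, attr_name))
    return len(elements), len(attributes)


if __name__ == "__main__":
    # Hypothetical sample markup, not taken from any real spec.
    sample = "<doc><p style='a'>x</p><p style='b' id='1'/><table/></doc>"
    print(count_distinct_nodes(sample))
```

A count like this only measures the vocabulary that appears in the input, so it is at best a rough proxy for spec size, and says nothing about how much prose each node needs to be documented properly.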