Slashdot Mirror


Microsoft Releases Pre-2007 Binary File Format Specs

An anonymous reader writes "Microsoft has released the specifications for the binary file formats used by pre-2007 Microsoft Office applications. They're accurate this time! Honest! While the documents are enormous (Word alone requires 533 pages; Excel runs over 1000 plus another 850 pages for the Office 2007 binary format), they hopefully will be useful to developers trying to create or extract information from Microsoft Office files (which despite their flaws, have been the de facto standard in many fields for some time now)."

18 of 269 comments

  1. ,,, or undo file corruption? by MickLinux · · Score: 5, Interesting

    I know it's old hat by now, but back in the Office 98 days, file corruption was a big deal.

    I wonder what was going on, but it occurs to me that now I could conceivably actually back out
    the errors, and figure the thing out.

    --
    Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
    1. Re:,,, or undo file corruption? by Anonymous Coward · · Score: 2, Interesting

      It might have been software state corruption unrelated to the file format, and so this might not help (I'm not asserting it does help either way).

      If this is anything like their previous documentation it will be full of errors and omissions. Wait until this has been reviewed by engineers who reverse engineer their formats and then you'll know if this is more useful than (for example) the KOffice source code, or OpenOffice.org, Abiword, Gnumeric, etc.

    2. Re:,,, or undo file corruption? by Architect_sasyr · · Score: 2, Interesting

      Work in an office with other people using the same stuff. It happens all the time. I just got back into my own office from being upstairs repairing a designer's OS X.5 permissions. It happens everywhere, but because we all detest Microsoft we make more of a note of it.

      Continuing off topic for a moment: I actually notice that there are a stack of bugs I come across all the time on my Debian or CentOS boxes that I just fix and move on without ever really registering that they occurred. It's a technical skill thing, I think.

      --
      Me failed English...
      FreeBSD over Linux. If my comments seem odd, this may explain...
    3. Re:,,, or undo file corruption? by Anonymous Coward · · Score: 1, Interesting

      I wonder why it's so much easier to fix on these? Oh wait! It's because of the documentation!

    4. Re:,,, or undo file corruption? by PitaBred · · Score: 3, Interesting

      Considering that the Office files are almost binary dumps of the software state, you're saying the same thing ;)

  2. So that's only about 2400 pages! by Anonymous Coward · · Score: 5, Interesting

    A far cry from the 6,000 pages for OOXML ..

    1. Re:So that's only about 2400 pages! by Anonymous Coward · · Score: 2, Interesting

      You're not counting the documents for PowerPoint and various other supporting components (VBA, Forms, etc.). When all of that is included, the total is around 5000 pages. And I don't think that that counts the OLE file format specification.

  3. interesting... by AmaDaden · · Score: 5, Interesting

    Did anyone else notice this is coming out on the first business day at MS that is Gates-free...?

  4. wouldn't touch it with a ten foot pole... by advocate_one · · Score: 3, Interesting

    the "license" conditions no doubt will contain several pitfalls for anyone who actually wants to use it to implement a file input/output filter in conjunction with free software... and the other problem is once having seen the specification, you'll never be able to safely work on other free software projects again...

    --
    Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
  5. Holy Crap! by erroneus · · Score: 4, Interesting

    Or is it Wholly Crap?

    I guess we'll see. I'm rather shocked by this. This is a kind of "giving in" gesture that is MOST uncharacteristic of Microsoft. Is this what the "Post-Gates" Microsoft will be like? How much more cooperative spirit will the community enjoy?

  6. It's a trap by symbolset · · Score: 1, Interesting

    It's always a trap.

    --
    Help stamp out iliturcy.
  7. Re:Honest Attempt by kentrel · · Score: 2, Interesting

    It's just that they have 20 years of spaghetti code to somehow shape into an API document. I doubt if anyone at Microsoft really knows how the code works

    Really? Care to provide some evidence for that "20 years of spaghetti code" comment? If MS can make Office 07 faster and more efficient for me to use than OpenOffice with its painfully slow operation, then surely it's a miracle that they can do that despite using 20-year-old spaghetti code.

  8. Why the documents are so long by SnappyCrunch · · Score: 2, Interesting

    Raymond Chen (well-known Microsoft blogger) linked today to a Joel on Software article about why the MS Office file formats are so complicated.

  9. Re:Flaws by Enderandrew · · Score: 2, Interesting
    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  10. Re:How freaking "open" of them... by Kjella · · Score: 1, Interesting

    It is important to note that open source developers, whether commercial or non-commercial, will not need a patent license for the development of implementations of these protocols or for the non-commercial distribution of these implementations,

    So...commercial developers can develop as long as they don't distribute. Boy, that's helpful/useful. About as helpful and useful as a kick in the nuts. :)

    Maybe someone with a law degree could sort it out, but I thought it simply meant that a commercial company like Novell, Canonical or Red Hat could develop code as long as the distribution of the implementation itself is non-commercial. In short:

    1. Give this away for free
    2. Get more users and support for your distro
    3. Profit
    4. ??? (sorry)

    --
    Live today, because you never know what tomorrow brings
  11. Re:How freaking "open" of them... by K. S. Kyosuke · · Score: 2, Interesting

    F***ing bullshit, I say! Nice of them to give us precise royalty rates, but "patented" and "applied for patents" ticks instead of patent numbers? Is there *any* sane way to get to the list of USPTO patent numbers in question at all? For me, this is another FUD along the lines of "pay for something but do not ask for what you are paying (and why) otherwise we might sue you". I am so happy to live in Europe (and, at the same time, afraid that this might change really soon with all those US companies' attempts to export this crap).

    --
    Ezekiel 23:20
  12. Re:How freaking "open" of them... by smittyoneeach · · Score: 2, Interesting

    Hostyle's sig, "If Caesar were alive, you'd be chained to an oar" inspired me.
    We're all galley slaves in this modern economy, so, as with the kool-aid vendors in the presidential campaign with their smarmy little ads, we should accept this MS announcement and decide to feel good about it.
    And that, my friend, is the Straight Audacity of a Hope Talk Express for Change You Can Wonder About.

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
  13. Re:CSV is crap by tjstork · · Score: 3, Interesting

    Wise man say building all corporate data on excel spreadsheets is building a house of cards.

    I couldn't agree with you more, but the more recent trend is to use Excel as the presentation layer, which is much, much safer. You build a web site that pumps the data out of the database, create Excel sheets dynamically, and you got a lot of happy Excel junkies.
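A rough sketch of that pattern, for illustration only: the data stays in the database and a sheet is generated on request. Everything named here is an assumption and not from the comment above: the `sales` table, the `rows_to_spreadsheetml` and `build_report` helpers, the in-memory SQLite database standing in for the real one, and the choice of the Excel 2003 XML Spreadsheet format as the output. A real site would return the string from a web handler or use a dedicated Excel library.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def rows_to_spreadsheetml(headers, rows, sheet_name="Report"):
    """Render a header row plus data rows as an Excel 2003 XML Spreadsheet string."""

    def cell(value):
        # Numbers become numeric cells; everything else is escaped text.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return f'<Cell><Data ss:Type="Number">{value}</Data></Cell>'
        text = "" if value is None else str(value)
        return f'<Cell><Data ss:Type="String">{escape(text)}</Data></Cell>'

    lines = [
        '<?xml version="1.0"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"',
        '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">',
        f' <Worksheet ss:Name={quoteattr(sheet_name)}>',
        '  <Table>',
    ]
    for row in [headers, *rows]:
        lines.append('   <Row>' + ''.join(cell(v) for v in row) + '</Row>')
    lines += ['  </Table>', ' </Worksheet>', '</Workbook>']
    return '\n'.join(lines)


def build_report(conn):
    """Query the (hypothetical) sales table and return it as a sheet."""
    cur = conn.execute("SELECT region, total FROM sales ORDER BY region")
    headers = [col[0] for col in cur.description]
    return rows_to_spreadsheetml(headers, cur.fetchall())


# Demo data: the database remains the source of truth, Excel is only the view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 1200.5), ("West & Central", 980.0)],
)
report = build_report(conn)
```

Because the sheet is rebuilt from a query each time, edits made in a downloaded copy never flow back into the data, which is the safety argument being made here.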

    --
    This is my sig.