Slashdot Mirror


Microsoft Releases Office Binary Formats

Microsoft has released documentation on their Office binary formats. Before jumping up and down gleefully, those working on related open source efforts, such as OpenOffice, might want to take a very close look at Microsoft's Open Specification Promise to see if it seems to cover those working on GPL software; some believe it doesn't. stm2 points us to some good advice from Joel Spolsky to programmers tempted to dig into the spec and create an Excel competitor over a weekend that reads and writes these formats: find an easier way. Joel provides some workarounds that render it possible to make use of these binary files. "[A] normal programmer would conclude that Office's binary file formats: are deliberately obfuscated; are the product of a demented Borg mind; were created by insanely bad programmers; and are impossible to read or create correctly. You'd be wrong on all four counts."

12 of 259 comments

  1. Office Doc Generation on the Server by VosotrosForm · · Score: 5, Informative

    I would like to point out another good option Joel doesn't have on his list: a product called OfficeWriter, from a company named SoftArtisans in Boston. When I last checked (back when I worked there), it could generate Excel and Word docs on the server, and I believe PowerPoint support was coming relatively soon. Creating a product that can write Office documents isn't quite as impossible in terms of labor as Joel is saying... but it's still way beyond any hobby project. Plus, he suggests using Excel automation or the like through scripts to create documents on the server, which is a decent suggestion if you want Excel or Word to constantly crash and lock up your server, and you enjoy rebooting it every day. If you want to do large-scale document generation on a server, you are going to need something like OfficeWriter. -Vosotros/Matt

  2. Re:patent promise doesn't sound very good by ContractualObligatio · · Score: 5, Informative

    "If there are any optional parts of the spec, those parts aren't covered."

    RTFA. That's in the FAQ. Yes they are.

    "If the spec refers to another spec to define some part of the format, that part isn't covered."

    In other words - if you do something related to a spec that isn't covered, it isn't covered. How could it be any different?!

    I'm not saying that there aren't any flaws, but this kind of ill-informed, badly thought-out comment (a.k.a. "+5 Insightful", of course) has little value.

  3. Re:first post? by julesh · · Score: 3, Informative

    I'd assume it has something to do with the antitrust action the EU was taking. Didn't they order that Microsoft had to open all their protocols/formats?

    As far as I remember, they only insisted on protocols (it was on the basis of a complaint from server OS vendors that MS was tying their market-leading desktop OSs to their server OSs and gaining an unfair advantage).

  4. Re:Joel by zootm · · Score: 4, Informative

    I'm not going to say anything against the Microsoft doc; he's pretty much absolutely right and it's a great introduction to why older formats are how they are in general to boot.

    The Hungarian thing – no, I still don't see it. Hungarian should not be used in any language which has a reasonable typing system; it's essentially adding unverifiable documentation to variable names in a way that is unnecessary, in a language which can verify type assertions perfectly well. The examples in the article are just ones where good variable naming would have been more than sufficient. It's not good enough.

    Oh god I've started another Hungarian argument.
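
    To make the "let the language verify it" position concrete, here is a minimal Python sketch that carries a distinction in a type rather than in a name prefix. The names `RawText`, `SafeHtml`, `escape` and `render` are hypothetical and not taken from Joel's article, and Python only enforces this at runtime (or through an external type checker), so treat it as an illustration of the idea rather than a definitive design.

```python
from dataclasses import dataclass
import html


@dataclass(frozen=True)
class RawText:
    """Untrusted text exactly as the user typed it (hypothetical type)."""
    value: str


@dataclass(frozen=True)
class SafeHtml:
    """Text that has already been HTML-escaped (hypothetical type)."""
    value: str


def escape(raw: RawText) -> SafeHtml:
    # The only sanctioned way to turn RawText into SafeHtml.
    return SafeHtml(html.escape(raw.value))


def render(fragment: SafeHtml) -> str:
    # Reject anything that is not SafeHtml, so an unescaped string
    # cannot reach the page by accident.
    if not isinstance(fragment, SafeHtml):
        raise TypeError("render() needs SafeHtml, got " + type(fragment).__name__)
    return "<p>" + fragment.value + "</p>"


comment = RawText("<script>alert(1)</script>")
page = render(escape(comment))
```

    Passing `comment` straight to `render` raises a `TypeError` instead of relying on a reader to notice a prefix.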

  5. Re:One possible reason for releasing the specs now by Chief+Camel+Breeder · · Score: 5, Informative

    Actually, I think they're releasing it now because they were ordered to in a (European?) court settlement, not because they want to.

  6. Re:Joel by mhall119 · · Score: 4, Informative

    "Programmers didn't understand why Hungarian originally used his famous notation"

    It wasn't created by some guy named "Hungarian"; it was created by Charles Simonyi.

    http://en.wikipedia.org/wiki/Hungarian_notation
    --
    http://www.mhall119.com

  7. Re:patent promise doesn't sound very good by jsight · · Score: 5, Informative

    "Hurr hurr. The Microsoft implementation of Java wasn't buggy: far from it, it was actually superior to the Sun implementation. It was faster and integrated better with Windows."

    Among other issues, the BorderLayout layout manager did not behave properly in MS's implementation. It was buggy in incompatible ways, but you're right that that in and of itself wasn't the big problem. The big problem was their insistence on both not fixing the bugs and not going along with major initiatives (such as JFC/Swing).

    "But back in the day, the Microsoft J++ development environment was far superior to anything Sun had to offer. We're talking a good 10 years ago. Sun has finally managed to catch up in the past two or three years, but still, Sun's problem wasn't that the Microsoft implementation was worse: their problem was that it was better."

    If by "2 or 3 years" you mean about 5 years, then I'd agree. Java development tools didn't really reach maturity until things like Eclipse came onto the scene about 5 years ago.

  8. Re:Joel by encoderer · · Score: 3, Informative

    "Programmers didn't understand why Hungarian originally used his famous notation"

    Uhh.. There was never a "Mr. Hungarian" ....

    It was invented by Charles Simonyi, and the name was both a play on "Polish Notation" and a nod to Simonyi's fatherland (Hungary), where the family name precedes the given name.

  9. Re:Joel by encoderer · · Score: 3, Informative

    It's not the language that makes it obsolete, it's today's IDEs.

    First, understand that nearly every bit of "Hungarian Notation" you've ever seen is misused. The original set of prefixes suggested by Simonyi was designed to convey the PURPOSE of the variable, not simply the data type. It added semantic data to the variable name.

    This is still valuable today.

    However, in the days of lesser IDEs, the more common use of Hungarian Notation was still helpful, as it was a lot more work to trace a variable back to its declaration to identify the type.
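
    A small sketch of what "purpose, not data type" can look like in practice. The `rw_` and `col_` prefixes and the `cell_name` helper are assumptions chosen for illustration, not something quoted in this thread; the point is only that both values share a data type, so a type-only prefix could not tell them apart.

```python
def cell_name(rw: int, col: int) -> str:
    """Return a spreadsheet-style name such as 'B3' for a zero-based row and column."""
    # Only handles columns A..Z to keep the sketch short.
    if not 0 <= col < 26:
        raise ValueError("sketch only supports columns A..Z")
    return chr(ord("A") + col) + str(rw + 1)


# rw_ = a row index, col_ = a column index. Both are plain ints, so a
# type-only prefix (i_total, i_cursor) could not tell them apart.
rw_cursor = 2
col_cursor = 1

name = cell_name(rw_cursor, col_cursor)

# A call like cell_name(col_cursor, rw_cursor) still runs, but the mismatched
# prefixes make the swap visible to a reader.
swapped = cell_name(col_cursor, rw_cursor)
```

    Nothing stops the swapped call from running; the prefixes just make the mistake easier to spot in review.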

  10. Re: "compound documents." oh no, run away! by Thundersnatch · · Score: 4, Informative

    "Anyways, it's no surprise that it's all the OLE, spreadsheet-object-inside-a-document, stuff that would make it difficult to design a Word killer. (How often do people actually use that anyway?)"

    At my company, our users do that every day. Excel spreadsheets embedded in Word or PowerPoint, Microsoft Office Chart objects embedded in everything. It's what made the Word/Excel/PowerPoint "Office Suite" a killer app for businesses. MS Office integration beat the pants off the once best-of-breed and dominant Lotus 1-2-3 and WordPerfect. When you embed documents in Office, instead of a static image, the embedded doc is editable in the same UI, and can be linked to another document maintained by somebody else and updated automatically. It saves tremendous amounts of staff time.
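
    For anyone curious what "compound document" means at the byte level: the legacy binary .doc/.xls/.ppt files are wrapped in an OLE2 compound file container, which begins with a fixed 8-byte signature. The sketch below checks only that signature; the function name `looks_like_compound_file` is hypothetical, and the code does not attempt to parse the streams or embedded objects inside the container.

```python
import io

# Signature that opens every OLE2 / Compound File Binary container.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(stream) -> bool:
    """Return True if the binary stream starts with the compound-file signature."""
    header = stream.read(len(CFB_MAGIC))
    return header == CFB_MAGIC


# Demo on in-memory bytes instead of a real .doc or .xls file.
fake_doc = io.BytesIO(CFB_MAGIC + b"\x00" * 504)
plain_text = io.BytesIO(b"just some text")

is_compound = looks_like_compound_file(fake_doc)
is_plain_compound = looks_like_compound_file(plain_text)
```

    With a real file you would pass `open(path, "rb")` instead of the in-memory streams.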

  11. Re:patent promise doesn't sound very good by ozmanjusri · · Score: 4, Informative

    "Microsoft implementation of Java wasn't buggy: far from it, it was actually superior to the Sun implementation. It was faster and integrated better with Windows."

    Ah, marketing. Where would we be without it?

    Microsoft developed J/Direct specifically to make Java non-portable to other OSs. The MS JVM wasn't better than Sun's, it was just tied heavily into the OS, and code developed for it broke if run on any other VM.

    J++ was another lock-in tool to ensure any "Java" developed in Microsoft's IDE would only run on Microsoft OSs. JBuilder was always a better package anyway.

    --
    "I've got more toys than Teruhisa Kitahara."

  12. Re:One possible reason for releasing the specs now by Jugalator · · Score: 3, Informative

    "So while I'm not a conspiracy nut, I do believe one of Microsoft's goals here is to assist the process of those binary formats becoming obsolete, to drive Office 2007/2008 adoption."

    Not a chance. Microsoft is bound to release Office 2003 security updates until January 14, 2014.
    --
    Beware: In C++, your friends can see your privates!