Slashdot Mirror


Massachusetts Adopting 'Open Format' Software

XopherMV writes "A Massachusetts state senator who had complained about the state government's effort to promote open-source software at the expense of proprietary software has hailed the state's effort to reach a compromise over future software purchases by the state. The latest iteration of the state's policy emphasizes 'Open Formats' such as TXT, RTF, HTM, PDF, and XML." And if file formats for state use must be in truly open and free formats, then it matters much less what OS or application is used to create or open them. (On the other hand, XML and other TLAs don't always mean free or open formats.)

18 of 273 comments

  1. True, but... by professorhojo · · Score: 4, Insightful

    >On the other hand, XML and other TLAs don't
    >always mean free or open formats.

    This is true, but XML documents themselves are also considerably more open than their binary counterparts. Anyone can parse a well-formed XML document, and validate it if a DTD is provided. While companies may still create XML that behaves in a specific way bound to their application, the data in the XML document is available to any application. While developers could create obfuscated DTDs or encrypt their data in a proprietary manner, they would lose most of the benefits of using XML. XML doesn't bar the creation of proprietary formats, but its openness is one of its greatest advantages.
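A minimal sketch of the point above, assuming Python's standard-library parser: a program that knows nothing about a vendor's vocabulary can still check well-formedness and pull the text out. The element names (`vendorDoc`, `para`) and the helper functions are hypothetical, made up for illustration. Note that `xml.etree.ElementTree` checks well-formedness only; validating against a DTD needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical vendor-specific document; the element names are invented.
SAMPLE = """<vendorDoc version="1">
  <para style="heading">Public hearing notice</para>
  <para>The committee meets on Tuesday.</para>
</vendorDoc>"""


def is_well_formed(xml_source):
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


def extract_text(xml_source):
    """Collect the non-blank text of every element, ignoring the schema."""
    root = ET.fromstring(xml_source)
    return [el.text.strip() for el in root.iter() if el.text and el.text.strip()]


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))
    print(extract_text(SAMPLE))
```

Getting the text out this way says nothing about what the vendor's attributes or elements mean, which is the gap the replies below argue about.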

    1. Re:True, but... by fireboy1919 · · Score: 4, Informative

      While developers could create obfuscated DTDs or encrypt their data in a proprietary manner, they would lose most of the benefits of using XML.

      I think you're missing what Microsoft would consider the benefits of XML. Namely, that they could create obfuscated DTDs and encrypt their data in a proprietary manner while still using it, thus convincing the masses that they're using an open format while not actually using one. They're actually doing this with their HTML exporter now.

      Another thing they like to do is put bugs and workarounds into their code that no one else knows about (of course, they only do this in places where they own the market share). Their RTF encoder is riddled with these.

      So...I think the only fair thing to do is to make an open format and make the government-approved reference implementation open source.

      --
      Mod me down and I will become more powerful than you can possibly imagine!
    2. Re:True, but... by SilentChris · · Score: 4, Insightful

      "Namely, that they could create obfuscated DTDs and encrypt their data in a proprietary manner while still using it, thus convincing the masses that they're using an open format while not actually using one."

      But they won't. They can't. Microsoft has a history of sticking with the original file format they created along with 1.0 of the application. Today's Word docs have a lot "tacked on", but they still have the basic structure openable by the original Word.

      WordML (Microsoft's XML structure for Word docs) is fairly clear-cut. They can "obfuscate", but they won't, because people will want those original files openable in 10-15 years. Backwards compatibility is a huge goal at MS.

    3. Re:True, but... by fireboy1919 · · Score: 4, Insightful

      Today's Word docs have a lot "tacked on", but they still have the basic structure openable by the original Word.

      Word documents are not backward compatible, except in a few lucky cases, despite the fact that most of the functionality is the same. Have you even tried this? Word XP documents don't work in Word 97; 97 don't work in 95, and I would assume it goes back even farther.

      I think a better claim would be "backwards compatibility is a huge thing to avoid at MS", considering that almost no new functionality has been added to Word in the past 10 years and yet the document format has changed.

      --
      Mod me down and I will become more powerful than you can possibly imagine!
    4. Re:True, but... by purplemonkeydan · · Score: 4, Informative

      Have you even tried this? Word XP documents don't work in Word 97

      Have you tried this? On a recent trip for work, my company laptop had Word XP (2002) installed, the machines at the client site used Word 97. There were no problems whatsoever with compatibility.

      Office is generally pretty good with forward and backwards compatibility.

    5. Re:True, but... by theguyfromsaturn · · Score: 5, Informative

      You must be using extremely simple documents... basically plain text. My supervisor and his other grad students use different versions of Word (I'm not sure which ones), but all the figure positions get screwed up, equations get put everywhere, and it's a general mess. I manage to maintain compatibility with both of those guys by not using Word but OpenOffice instead. It's actually this lack of compatibility between Word versions that got one of the other grad students to switch to OpenOffice, which was better at handling different versions of Word documents than Word itself.

      --
      I like my dinosaurs feathery, and my pterosaurs hairy (or is it pycnofibery?)
  2. Ploy On Price Negotiation? by Manip · · Score: 4, Insightful

    This is not the first time we at /. have seen states and countries go this route, but they almost always end up back with Microsoft, only with a discount on their licence.

    I don't know about you guys, but I won't believe it until I see office workers using it. Until then, it is just a negotiation ploy to save some money with Microsoft. (Why else announce it early?)

  3. Re:PDF by TheRaven64 · · Score: 5, Informative

    PDF is an open format. The specifications are available for free download and no license fee is required to implement it. It is controlled by a single entity (Adobe), rather than by a committee (e.g. the W3C), but it is no less open.

    --
    I am TheRaven on Soylent News
  4. Re:Open Formats by TheRaven64 · · Score: 4, Informative

    Google is your friend. The complete PDF specification is available for download from Adobe's website.

    --
    I am TheRaven on Soylent News
  5. HTM? HTM? by Anonymous Coward · · Score: 5, Informative

    HTM is the filename suffix that broken operating systems like Windows used to assign to HTML files. The document format is called HTML.

  6. if they are serious about it, that's enough by idlake · · Score: 4, Insightful

    If they are serious about enforcing open document formats, that's good: open source can compete and win if formats are open. The big concern is that companies like Microsoft will try to portray their proprietary formats as "open". For example, the DOC format has been documented by Microsoft, but it isn't truly open because it keeps changing and because it is under Microsoft's control. In particular, XML is not an open format--it isn't a format at all; XML is a standard in which people can define formats, both open and proprietary.

    A format isn't open until it has actually been standardized by an independent body that can guarantee that it is free from patent or other claims, and until it has been demonstrated that it can be implemented independently by actually doing so.

  7. Re:HTM? HTM? by Karma+Sucks · · Score: 4, Funny

    Similarly, RTF is RTFM.

    Geez.

    --
    (Please browse at -1 to read this comment.)
  8. e-government and our Boston City Council by dsaklad · · Score: 4, Informative

    Boston City Council sends by email public hearings notices for council committees like the Human Rights Committee. But our Boston City Council is unwilling to send the email as plain ASCII text instead of the .doc formatted public notices that are not so compatible.

    Maybe they want to preserve enbolded text as if that enbolded text was some sort of legal document. Maybe they want to preserve the image of a seal of the city. At the expense of wider more compatible distribution of important information our city council is even unwilling to put the full text of public hearings notices on the web site at http://cityofboston.gov/citycouncil

    An online calendar at the website does list the meetings minimally with no details. The full explanation for the purpose for holding the public hearing needs to be posted every time with an archive for reviewing past hearings.

    So much for a mandate of so-called e-government!

  9. Re:PDF by martinX · · Score: 4, Funny

    Odd. I was expecting it to be a link to a PDF file...

    --
    When they came for the communists, I said "He's next door. Take him away. Goddam commies."
  10. It'll open, but not look the same by Sylvius · · Score: 4, Interesting

    The last time I was sending out resumes (a lot of places want a doc file), I opened it in multiple versions of Word. The file always opened, but the formatting got changed. Sometimes it all fit on one page as intended, other times it would spill over onto two pages, etc. So for times when formatting is critical, Word is not truly backwards compatible. You are better off exporting to PDF...

  11. TXT is not a format by Free+Bird · · Score: 5, Informative

    A .TXT file is nothing more and nothing less than a plain text file. Ironically, it's only because of MS, champion of closed standards, that using the .TXT extension for these files has now become a de facto convention, but in the DOS age, other extensions such as .DOC or extensions that were basically part of the name (like README.1ST) or the total absence of an extension were also very common.

  12. What sort of "open" are they talking about??? by Bloody+Peasant · · Score: 4, Informative

    Obligatory disclaimer: I wrote this humble file formats FAQ and it represents my personal and professional opinion (not necessarily my employer's).

    That said, can someone in MA please ask the movers and shakers there to read that document? It's probably in the class of "common sense" to most of us here, but clearly we've done a less than stellar job so far of imparting this clarity to those in political circles.

    For the impatient: the conclusion I reached is that RTF and PDF are very questionable if you want to use them as truly interchangeable formats in a heterogeneous environment. This is an empirical finding, based on real life experience.

    --
    -- This .sig intentionally left meaningless.
  13. Re: This is bad news, not good news by tyen · · Score: 4, Insightful

    Policy, Not Mechanism.

    They are very close, but need some additions to nail this right. Everyone freaking out over XML being cited should read the article. Reading the original article, I note that they defined "open format" by policy and not mechanism:

    specifications for data file formats that are based on an underlying open standard, developed by an open community, and affirmed by a standards body; or, de facto format standards controlled by other entities that are fully documented and available for public use under perpetual, royalty-free, and nondiscriminatory terms.

    This means they really don't care about the actual format, they care about the terms of access to the format. Microsoft can't drive a DTD with encrypted blocks through a mechanism-based loophole simply by declaring, "Hey! Look! XML!".

    However.

    It is said that even the largest companies bear the imprint of their founders. Gates was raised by lawyers, and his company operates like one. Unless you adversarially test this legislation before it passes, I guarantee you Microsoft will find a perfectly legal way to protect their crown jewels if it passes. There are other big players who will fight tooth and nail against this legislation, too. Oracle. IBM's DB2 folks.

    It is unfortunate that I could not find on their web site a full explanation of what they meant by "open format". However, going by that small excerpted blurb, if I was thinking of legal and marketing workarounds, here are some things I can come up with off the cuff.

    1. Dilute or pervert one of the definitions of "open standard", "open community", or "standards body". No definition legislated, easy enough to do. Control the standards group, control the standard.
    2. Note that the clauses separated by a semicolon (;) stipulate access terms for the latter but not the former. Sure, place it with a standards group, but make it expensive to obtain the standard "to cover distribution costs". The EIA standard for racks, for example, costs over $50 to obtain an electronic copy. Perfectly open, perfectly standard, but certainly not "royalty-free".
    3. Play the Internet Explorer bundle game again, on a different playing field. Make the default format of the application a proprietary format, and allow saving as the standards format as long as the user takes additional steps to configure it or specify the standard format. By default, the vast majority of users will deploy with the default setting, killing any standard format in the crib through sheer inertia.
    4. Sure, there is an open format. It just doesn't support all the features of the application.
    5. Twist the definition of "fully documented" because that term is not nailed down. Yup, it's fully documented. "The 'dynamic_index' field stores dynamic indexes". There. It's fully documented. What? You want to know what a dynamic index is? Oh, but that's a trade secret. Or here is the full format of the dynamic index data structure, "fully documented". Leave out an adequate description of the semantics, and you can bamboozle nearly everyone, including yourself. Why do you think Microsoft themselves can't get their own Word format consistent across versions? You can take the Microsoft-is-Evil theory that they do this to "entice" their customers to upgrade, but I tend to think it is because the format is documented ambiguously enough that even their own smart programmers trip up on the specifications.
    6. Supply an open standard, but your implementation of the standard is different from the outside world's implementation(s). Hey, bugs happen. No reference implementation that everyone standardizes upon, not a problem to be just barely incompatible enough (without any need for evil conspiracies) to annoy users enough to make them stick with the original application. Coders who hack EDI systems can sympathize with me here; even when everyone agrees upon an implementation standard, a "data format dissonance" tend