Public Request For Microsoft To Release Deprecated File Formats
SgtChaireBourne writes "NLnet, a Dutch foundation for an open information society, has publicly called for Microsoft to release its deprecated formats into the public domain. The maker of Office has made large efforts during the last year to move against the OpenDocument Format (ISO/IEC 26300). These efforts have been producing a lot of commentary regarding the amount of data bound up in the Redmond-based company's proprietary specifications. It's a nasty situation to end up with files that cannot be read because the sole vendor with the documentation for the files has withdrawn permission. ODF is the way forward, or a step forward at the least, with new documents. But for the old documents in the legacy formats, they cannot be read without supporting software and that support requires full access to the specifications."
Last time I checked "many different versions" of doc, xls, and ppt are NOT old, obsolete file formats. They're essentially asking MS to not only open up their old file formats (such as Word 97 and older doc files), they're also asking them to hand over the full specifications on all their EXISTING modern formats--a move that would allow competitors to develop Office clones at will.
This is a thinly disguised shot at MS and closed source formats, not some noble attempt to help out archives. If it weren't, they would have limited this to older files only and also called on other companies that make other older, proprietary formats (like Corel, Adobe, etc.) to release all their specs too.
SJW: Someone who has run out of real oppression, and has to fake it.
If you do that, people will never learn and will continue to use closed formats. It's too easy to fall for a closed format for your crucial documents and then go whining when the company stops supporting them. Let people pay the price of their mistake; then open document formats will pick up steam.
\u262D = \u5350
i see only 2 solutions:
1. release a converter. (it's available)
2. support legacy via providing the converter instead of actually reading the deprecated formatted document.
we want to move forward and adopt a standard - give deprecated formats some time by supporting them until a deadline, and provide conversion tools for free.
nobody wants a html fiasco when it comes to other file formats.
The reason - they don't have any documents describing the formats.
The code is the description of the format.
When Microsoft was forced to disclose information about the SMB format to the EU anti-trust department, they tried to give them the source code, complaining that it would cost them too much to describe the format.
So, sadly, they are asking for something that doesn't exist.
Just saying it like it are.
Technically, it would be sufficient for the sake of old documents to provide a free tool that is able to read those documents, or a tool that would convert them to an open document format. This tool wouldn't need to have its source published.
The worst proprietary 'hooks' such as 'footnoteLayoutLikeWW8', 'lineWrapLikeWord6' and 'useWord97LineBreakRules', appear now to have been documented - see this link. This in effect means that some of the quirkier behaviour of old versions of MS Office may now have been made public (difficult to say for sure as the ECMA resolution is behind a passworded site).
Microsoft would make their, and everyone else's, lives a lot easier if they went the whole way and documented all of the deprecated Office formats, allowing others to write filters to correctly interpret them. This would also give them a foothold in claiming that the tags above truly do point to an open format, since the behaviours they refer to would be openly documented.
But let's not hold our breath.
Is crushing a suspect's child's testicles illegal?
John Yoo: "No, [if] the President thinks he needs to do that."
Isn't it quite obvious that there are no specs? The OOXML specs are probably the best they can do when they have to reverse engineer the code into documentation. Don't expect any better than that and furthermore, don't expect them to even try (which they at least have when it comes to the OOXML documentation).
As long as the company will sell you a conversion tool, there is no such thing as an obsolete format.
This issue is a bit more complicated than you think.
Microsoft may not have the formats formally specified anywhere...Many, many years ago, shortly before my book was published, Microsoft actually wanted to hire me to write the official documentation for the Segmented Hyper-Graphic (SHG) file format because their own in-house documentation for the format was for an even older, unsupported version.
I mean, think about it, if you write code to store a document, do you sit down and write the byte-layout of that file? I suppose you could, but it's generally not necessary for the coders. My guess is that MS doesn't even have this stuff lying around. They'd probably have to have someone actually piece it together from the code.
I think that something people don't get is that there are not and never were comprehensive specifications for these formats. The specification is likely the code and nothing more. The document formats weren't conceived as a de jure standard; they are things that grew over time and evolved. Somewhere at the core you're going to find things like C structs - from some old and forgotten compiler - being copied verbatim to disk.
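To make the "struct copied verbatim to disk" point concrete, here is a minimal sketch. The header layout, field names, and values are all made up for illustration and are not taken from any real Office format. The bytes on disk carry no description of themselves; the only "spec" is the layout string in the code that wrote them.

```python
import struct

# Hypothetical header, laid out like a C struct (NOT a real Office structure):
#   struct Header { uint16_t version; uint16_t flags; uint32_t text_len; };
# "<HHI" = little-endian, two 16-bit fields and one 32-bit field, no padding.
HEADER = struct.Struct("<HHI")


def write_header(version, flags, text_len):
    """Dump the in-memory record to bytes exactly as it would land on disk."""
    return HEADER.pack(version, flags, text_len)


def read_header(raw):
    """Recover the fields. This only works if you already know the layout."""
    version, flags, text_len = HEADER.unpack(raw[:HEADER.size])
    return {"version": version, "flags": flags, "text_len": text_len}


raw = write_header(6, 0x0001, 1234)

# A reader that guesses the wrong byte order gets plausible-looking garbage.
wrong_guess = struct.unpack(">HHI", raw)
```

Nothing in `raw` says which of the two readings is right, which is why a format defined this way has to be reconstructed from the code that produced it.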
Asking Microsoft for the spec will not mean simply taking an existing doc off the shelf and handing it over. It will mean either handing over the code for the old products that read and write those formats or spending person-years of effort combing through that code, constructing a specification, and then, somehow, testing the spec.
I wouldn't hold my breath for either.
As noted in another post about this article, it may be that there is no "format" other than "the code". If so, then the only free tool that is cheap to make is a wrapper around a complete application that just calls only part of that application. If so, making the wrapped tool free means giving away the entire program, not just the file part. In effect, then, this amounts to requesting that old versions be made free. Any difference between your proposal and asking old versions of their editor to just be made "free" (for whatever free you might be meaning) is just words, I suspect--nothing semantic.
Of course, this comes back to the question of whether there should be software patents at all, and whether software copyrights should have the immensely long durations that they do. Indeed, at some point, probably much shorter than happens now, having old tools be free so they can be recycled for other purposes may not be bad. It might even give vendors a kick in the pants to move faster to make newer tools be different enough that the old tools didn't threaten them. But bypassing a proper change in software copyright and patent law and instead just beating up on certain people who have things one wants does not seem the best approach to me.
Kent M Pitman
Philosopher, Technologist, Writer
Regardless of whether this particular initiative succeeds or fails, it would be wonderful if community pressure could lead corporations to adopt a "Community Standard" for their proprietary file formats:
Either support your format, or publish a full specification if you abandon it. (Do neither, and you suck, publicly.)
The world is currently headed towards a rather worrying future in which a staggering number of valued documents and other file resources of many types are destined for demise by corporate abandonment. Maybe it's time for communities to stand up and proclaim:
"We're not merely point-in-time consumers of your product. When we invest in your proprietary format by using your tools, we need its longevity safeguarded. Either support the old format, or publish full specs for it so that we can seek that support elsewhere."
I guess it's wishful thinking, but hey, we're paying them, not the other way around. Ultimately, if enough people and their wallets want something done, it will be done.
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
It's 'way past time for governments to make the commitment to open source software for information storage. It's the only way to ensure that public data gathered at taxpayers' expense is freely available to members of the public or their elected representatives.
I've calculated my velocity with such exquisite precision that I have no idea where I am.
In other news, the Dutch open information society requested knowledge of what you are thinking, noting "there is something you can't hide."
It is rare that I agree with Slashdot articles on such things, but on this, even the most pro-Microsoft zealot cannot really disagree... everybody wins if these specs are released... They're no longer supported, don't compete with Microsoft's newer formats, and would -heavily- show all the entities investigating Microsoft's monopoly that they can "do the right thing".
It would also be a superb PR move (even though they don't deserve the publicity for something they should have done on their own long ago): it would reassure clueless CEOs. "See?? We can use closed source software, because once Microsoft doesn't support it, they'll just open it up!!!". It is far from true, but enough would think that way to make it worth it.
So come on MS, do it.
Why else would you have stuff like "break lines like W95" in the OOXML spec? Because you don't have an actual description, it means "call that legacy W95 code".
Live today, because you never know what tomorrow brings
It's not NEARLY as clean as Apple's AppleScript solution, but since you can script OLE Components, you should be able to set up a computer to migrate the documents. If they are on a file server, you should be able to set up a machine with whatever is the last version of Office that can read the old files, and have it walk through your document tree, looking for each appropriate document. Then it should be able to load each one in Office and save it in a newer format.
That would get all your documents in the latest (Office 2003 or something), then you adapt the script to run on a machine with Office 2007 and do the same thing. Presto-chango, your documents are up to date and safe.
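The tree walk described above can be sketched roughly as follows. This is only an outline under stated assumptions: the extension list is a guess at what counts as "legacy", and `convert` is a hypothetical callable you would supply yourself (for example, something that drives Office to open the file and save it in a newer format). None of the Office scripting is shown here.

```python
import os

# Assumption: which extensions count as legacy documents worth migrating.
LEGACY_EXTENSIONS = {".doc", ".xls", ".ppt", ".wri"}


def find_legacy_documents(root):
    """Walk the document tree and yield the path of every legacy file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in LEGACY_EXTENSIONS:
                yield os.path.join(dirpath, name)


def migrate_tree(root, convert):
    """Call convert(path) on each legacy file.

    `convert` is a hypothetical, user-supplied function that opens the file
    in the old application and saves it in a newer format. Failures are
    collected rather than aborting the whole run.
    """
    converted, failed = [], []
    for path in find_legacy_documents(root):
        try:
            convert(path)
            converted.append(path)
        except Exception:
            failed.append(path)
    return converted, failed
```

Keeping the list of failures matters in practice: those are the files that need a human, or an even older version of the application, before the next hop to a newer format.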
Regarding formatting... if you're talking about documents not updated in 5-10 years, you probably don't care that much. You might want the content (I need to go through old hard drives and rescue any high school and college papers I care about, that are now hitting the 10 year old point), but if you haven't used it in years, and then want to use it, you can take the time to reformat. You're preserving because 1% of those documents might be needed in the future, which makes it worthwhile to bring them forward with an automated solution.
I'm beginning to think that a lot of the worry over old file formats becoming inaccessible in the future is overblown. With the continuing advances of emulation and virtualization technology, it seems highly unlikely that we'll lose all access to documents in old file formats. Emulation of the proper platform and installation of the appropriate software are all that's needed. The real trouble rests with obsolete physical storage media. I still have 5.25" floppies that I haven't been able to read for many years now, but that's hardly Microsoft's fault! And if there's a market for it, someone will be happy to copy all of your old media onto something more modern.
To the making of books there is no end, so let's get started
Cannot agree with you here. Obviously you feel you can continue running Windows 98SE with Office 97 in a virtual partition essentially forever - and in that case, you probably can.
However, the moment you get to Windows XP and recent versions of Office, you hit the dreaded Product Activation bugaboo. Now you're dependent on MS, Adobe, or whoever to continue supporting activation servers as you migrate old software and operating systems to newer virtual platforms. Also, EULAs exist that prevent using software in virtual environments. You may well find that running Office 2003 on Windows XP can't be done, legally at least, on the machine that follows your next one. Then where are you?
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
It seems to me that the simplest solutions in this case are:
- Don't use MS formats in the first place for your master documents, or
- Keep the appropriate software around and accessible.
I see little incentive for Microsoft to release this information except as a show of good faith to its consumers.
Microsoft Office products do not scale well into the future at all, so don't be surprised when they don't.
Even if all we get out of this is .wri compatibility, I'll be tickled. (I have maybe 2MB of .wri files which are nigh on unreadable on my Mac; the reason they're .wri is because I wrote them fifteen years ago on my parents' computer that barely ran Doom, and they weren't about to pay for Word because it was $600 and I was 13 and had already produced some nice stuff in Write.)
"Why Subscribe?" Good question...
All that is necessary is that the formats either be published or that they be made available to interested parties on reasonable terms.
"On reasonable terms" is probably a cut of any commercial product and/or a flat fee for commercial users and an NDA for commercial and non-commercial users.
This will foreclose true open-source implementations but it will allow free implementations and it will allow governments and others a chance to write their own implementations.
Now, personally, I think it would be a very wise business decision for MS to publish all file formats that are not faithfully supported either by its current products or by its current products with MS-provided plug-ins. If Word 2007 renders the file differently than Word 1.0 for DOS then it's not faithful support and the format must be made available so another person or company can try to do it right.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Between the enterprise and government sectors, there could be an awful lot of pressure applied. When you look at the amount of resources that they are committing to a format, it would be in their best interests to contractually ensure that it will still be viable down the road, even after the information has sat on the server so long the format or even its replacement has been deprecated.
While the information could of course still be recovered in such a situation, there is no reason large companies or the government should accept the potential expense of having to convert a potentially massive amount of information from legacy formats, potentially through several others using antiquated software. They have a major vested interest in ensuring the operability of the formats they use and the power to demand it. They just need to realize it and combine their influence to achieve it.
or they could just switch to using open standards, of course.
Based on my previous successes in getting Microsoft to release the source code to the deprecated MS-DOS 4.x (i.e. before the MS-DOS 5.0 complete re-write) under a free / open-source license, I'm confident that Microsoft will be happy to release deprecated file formats under a similar license.
Oh, wait ...
Ok, reasonably - think about the importance of compatibility flags like 'breakFootnotesLikeWW95'. The text of the footnotes is fully available; the only issue is that they might be rendered slightly differently. Asking Microsoft to exactly specify how Word for Windows 95 broke footnote lines is specious. Nobody is ever going to implement it, and holding up the spec process for the first file format they tried to fully specify and slapping them with major penalties is just ridiculous. Especially when folks are asking them to document and free up their other file formats. Documenting OOXML is enough, because you can upconvert from any older format to OOXML.
I bet some eager mods thought i talk about ODF :)
Of course, i wrote about MS's older binary formats, heh.
Patents Drive Free Software as Hurricanes Drive Construction Industry
created with MY time and effort on the computer I bought, using the OS I bought and the application I bought. It's MY document.
So why the hell should MS get even more money so that I can read my document?
And what would MS lose by documenting their formats? Control over me? Well, that's not supposed to be a profit centre and I don't remember seeing it on the purchase order or in any sort of discount. So where do I get paid in a quid-pro-quo manner for ceding control to MS over me?
OLE2 filez:
OOO - http://www.openoffice.org/ - OSS Office
POI - http://poi.apache.org/ - Java API To Access Microsoft Format Files
MS Access:
Jackcess - http://jackcess.sourceforge.net/changes-report.html#a1.1.11 - Jackcess is a pure Java library for reading from and writing to MS Access databases.
http://www.kexi-project.org/
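For anyone sorting through a pile of old files before feeding them to tools like the ones listed above, a quick way to tell whether something is an OLE2 compound file is to check its first eight bytes. This is a minimal sketch: the function name is made up, and a match only identifies the container, not which application wrote the file or whether any particular library can read it.

```python
# The 8-byte signature at the start of every OLE2 compound file
# (the container used by the binary .doc, .xls and .ppt formats).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data):
    """Return True if the given bytes start with the OLE2 signature."""
    return data[:8] == OLE2_MAGIC


def file_looks_like_ole2(path):
    """Same check, reading only the first 8 bytes of a file on disk."""
    with open(path, "rb") as handle:
        return looks_like_ole2(handle.read(8))
```

Files in the newer XML-based formats are ZIP archives and start with `PK` instead, so the same check also separates old binary documents from new ones even when the extension is wrong.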
would be to do what the hell you thought was appropriate.
How can MS prove it's wrong? It's not in the spec and WW95 produced different output depending on the printer driver attached. In fact, MSOOXML isn't being obeyed by MS either, so any test taken would show MS being wrong too. Or show it up as being an open definition that can be boiled down to "do it like us".
In general, it is safe and legal to kill your children. -- POSIX Programmer's Guide
Msft is for-profit, and that is fine with me.
As I understand it, msft sells software, not file formats. HTML is an open format, yet msft, and many others, sell HTML editors. Same with ASCII, RTF, and PDF.
It would not cost msft anything to open their formats, so what is the problem? People buy ms-office because it's better than any other office product, not because they are locked in to a proprietary format, correct?
Also, isn't msft saying that OOXML is wide-open? But OOXML refers to old .doc formats - and those formats are closed. Certainly you don't think that msft should lie about the openness of OOXML, do you?
So you're saying that Microsoft could spend time writing such docs (assuming they don't actually exist) but they're currently getting away with not being compelled to do that. By the way, besides Microsoft's say so do you have any evidence to justify this belief that these docs don't exist? I'm not one to believe an organization that spends so much time being duplicitous and illegally leveraging their monopoly.
Digital Citizen
Then this request comes right in time -- the problem is currently an inconvenience solvable by virtualization, and comes with its own handy cautionary tale. If they immediately stop using Office products that require activation or run on platforms that do, it's obvious that they're saving themselves from something worse than a simple inconvenience down the line.