Microsoft Partially Opens Proprietary XML Format
eschasi writes "Groklaw has an article up reporting that Microsoft is going to open up their XML representation of the DOC format in response to Massachusetts' demand for open formats. According to Groklaw there are some interesting caveats involved in the move. From the license: 'We are acknowledging that end users who merely open and read government documents that are saved as Office XML files within software programs will not violate the license'. While opening up the format even partially is a good idea, it's still a far cry from folks being able to write programs that create DOC-compatible files."
Mind you, this is - as I understand it at Groklaw - merely an opening to make GPL-applications able to read (not write!) government made (nothing else) documents, without interfering with MS patents. 'Open' might not be the best word for this...
Proprietary XML? Leave it to Microsoft to completely miss the whole damn point.
Sugapablo
The right to own your data was lost with the closed format, since it required a license to read something you might have produced yourself. For a private person, that might be sad. For a corporation that needs its archives of past correspondence, it can be catastrophic. That Microsoft opens up their format for reading, and specifies parts of it, makes it possible to write software to convert this data to an open format, or index it and such. Therefore, we can still save in MS format, but have much less tie-in.
I'm only wondering how far it goes: does it go as far as to say that I'm allowed to make a non-MS-certified open-source bot that crawls my disk and indexes Office XML files? And what if a corporation does so, will they be allowed?
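For what it's worth, the kind of bot described above is technically small. Here is a minimal sketch, assuming the files are well-formed XML with an .xml extension; the function names are made up for illustration, nothing here is specific to Microsoft's schema, and it says nothing about whether the license would allow it:

```python
import os
import xml.etree.ElementTree as ET


def extract_text(xml_path):
    """Return all text content of one XML file, joined with spaces."""
    root = ET.parse(xml_path).getroot()
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def build_index(root_dir):
    """Map each lowercase word to the set of XML files that contain it."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith(".xml"):
                continue
            path = os.path.join(dirpath, name)
            try:
                text = extract_text(path)
            except ET.ParseError:
                continue  # skip files that are not well-formed XML
            for word in text.lower().split():
                index.setdefault(word, set()).add(path)
    return index
```

Calling build_index on a directory returns a dictionary you can query by word. The point is only that reading is the easy part once the format is plain XML; the open question in this thread is the legal one.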
Assembling etherkillers for fun and profit
"We are acknowledging that end users who merely open and read government documents that are saved as Office XML files within software programs will not violate the license."
It seems that the ability for a citizen to read and access government documents should surpass all other interests, regardless of licensing issues. In other words, even if a government employee was boneheaded enough to save a document in a proprietary format, my ability to access the information in that document should be guaranteed no matter what, licenses be damned.
Bill Clinton: Pimp we can believe in. - The Shirt!!!
Ah, and once again Microsoft do what they do best: create a solution to a demand which doesn't actually solve the problem but your average politician can point at and say "they've cooperated". Bit like their server licencing and the judgement against them in the EU, it's providing a solution which is useless yet looks good on paper.
The MS Game "Allegiance" was actually 100% open-sourced by MS a while ago, just for your info too. I know it's not a document format, but MS (especially the developers section) does open-source stuff on occasion.
-Jesse
Nothing says "unprofessional job" like wrinkles in your duct tape.
Yes, and no. There are some issues with formatting and the positioning of content. I hope that this partial release of information can help the OO.org team to improve OO write.
If they open up the format just enough so we can read it, it will be a nice enough start, so we can officially open the documents and then save them in a fully open format.
As much as I would love them to be made to play fair and open the format fully,
opening it enough to make it easy to parse gives us all we need in case of the disappearance of Word, or MS trying to force an upgrade by breaking compatibility in some way.
The only things certain in war are Propaganda and Death. You can never be sure which is which though
I'm a little confused on the whole .DOC being a closed format issue. If OpenOffice can write documents in the proprietary .DOC format, why can't other programs? Am I missing the picture completely? Thanks for any explanation!
Equally, this still presents a problem for QUANGOs. Non-government organisations that perform the delegated work of governments will not be able to produce documents without restriction on which programs can read them. This could cause huge confusion for end users, who can't be expected to know where that blurry line between organisations lies.
Does anybody really want to keep this format going? Let Microsoft do whatever the hell they want and focus on moving people to open source one person at a time.
Laws are for people with no friends.
Sort of but that's not the point. They are crappy work arounds for the proprietary format. If the XML isn't all fucked up like MSFT probably wants it to be then anyone can easily view the documents (and write them) in any current AND future program that can read standardized XML.
If MSFT can't close the document format, and any program can correctly read/write documents in the way they were intended, what advantage does MSFT have?
That's why MSFT doesn't want this and everyone else does.
Yes, but they did this through reverse-engineering. IANAL but this probably leaves them open to DMCA lawsuits.
Microsoft has simply left this alone because OO.o and the others aren't yet a threat. If they ever become one, you'll see the floodgates open.
Clippy!
now if i could only find that old copy of microsoft bob...
Simple. He simply uses Windows Calculator, and translates from binary 10001100001111110101011000010111101011111010010 to hex. Simple when you think about it.
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
Toilets. I believe toilets are as ubiquitous as Microsoft Word.
Laws are for people with no friends.
using proper English grammer and spelling.
How about, you handle the grammar and I handle the, Spelling. "OK"
-mkb
What, the party invented the aeroplane?
Can someone explain (I'm not trolling here) how the heck did M$ manage to shove a patent in on a public format that's been around for ages?
Or, is it some other issue than patents this time? I mean, XML-based formats are easily hackable, so M$ doesn't really need to spec it for you to write a converter, even though for a state government it would be logical to ask for a spec.
The problem with the MS implementation, as I understand it, is that Microsoft has used XML as a transport for their proprietary DOC format, not defined their DOC structure in XML. There's a difference here. The former being the case, yes, you can get to the XML and "see" the DOC, but it is just an ASCII-encoded binary... so, you really get nothing more than the old proprietary stuff, AND an extra layer of obfuscation! Hardly what XML was supposed to be about.
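To make the transport-versus-structure distinction concrete, here is a small sketch. Both XML snippets and all element names are invented for illustration; neither is Microsoft's actual schema, and whether Office 2003's output really looks like the first case is disputed elsewhere in this thread.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical "transport" style: the XML merely wraps an opaque binary blob.
blob = base64.b64encode(b"\xd0\xcf\x11\xe0 opaque binary payload").decode("ascii")
transport_xml = f"<document><binData>{blob}</binData></document>"

# Hypothetical "structured" style: the document itself is expressed as elements.
structured_xml = "<document><para><run bold='true'>Hello</run> world</para></document>"


def readable_text(xml_string):
    """Concatenate every piece of text a generic XML parser can see."""
    return "".join(ET.fromstring(xml_string).itertext())


print(readable_text(structured_xml))  # the actual words of the document
print(readable_text(transport_xml))   # only the base64 string, still opaque
```

Any XML parser accepts both, but only the structured one gives a third-party program something it can use without knowing the underlying binary format.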
...but I'm a little confused. Suppose I get a copy of a document in a format with a closed license. In what way am I bound to that license? When did I agree to it? Why would I ever need permission from the creator of the format to read it? Is there some mysterious EULA that I accepted by being born? Or does this license only apply to people who create the documents with a Microsoft application, who have presumably agreed to some byzantine concept of ownership?
I always got the impression most of the remaining work was being bug-for-bug compatible with the Word layout engine, eg agreeing on what margins are and so on rather than actually reading the file data itself.
I agree with your assessment of toilet seats being cheaper than Microsoft seat licenses but shouldn't we wait until Microsoft releases a study on the total cost of ownership between toilets and Microsoft Office?
Laws are for people with no friends.
Wouldn't it just be SO easy to ditch DOC and start using HTML? All you'd need to do is have major corporations remove other options from the menu, so HTML would be the only option. Welcome to the new format. It's really easy. It's 100% compatible with even the most basic text editors. Although, Office does seem to produce butchered HTML (but only with images). Until they resolve this issue, I can dream.
Silence is golden... and duct tape is silver.
Reverse-engineering for compatibility purposes is still legal under the DMCA. Reverse-engineering is OK as long as you don't do it to infringe upon copyright.
Source, The text of the DMCA, Chapter 12, Section 1201.f (find within page for "reverse engineering")
The previous sig has been removed due to
Also, any software that implements this is violating the spirit of the GPL. The license explicitly restricts use of the patents to reading and writing MS documents. No one may take such an application and modify it to make their own XML document format.
- krafty
Here is my question: the MS patent on this XML format has not been fully accepted, right? The patent office is awaiting public comment. Has anyone gone to make a comment?
Also, I don't even see how you can patent using an open standard. I mean, XML was designed as a method of storing data, among other things. How could the patent office possibly accept a patent where XML is simply being used to do what it was designed to do?
I mean, to draw a parallel: the 110V outlet in the US is an industry standard, right? Everyone can make plugs and outlets royalty-free, and all the appliances and devices can plug into them for power. MS patenting XML to store a word processing file is like Sony patenting a TV that uses the 110V outlet, thereby blocking anyone else from doing it even though they didn't invent the outlet or the TV. The same holds true here. MS didn't invent XML, they didn't invent the word processor, nor did they invent storing a word processing file in XML. So, how in the hell can they apply for a patent on it? Just by paying money?
...is 100% closed.
if i'm a grammar nazi, you're an illiteracy nazi.
I think that the reason it became so popular was the closed file format.
Whaaaa? Cart or horse, which comes first?
Dude, Word did not get popular because of a proprietary file format. Users don't give a rat's ass about file format until they need to export/import from one to the other. That the file format is commonly used is a result of the program's popularity. Word got popular for other reasons such as aggressive marketing, aggressive pricing, aggressive positioning, feature richness, usability, blah, blah.
I work for the municipal healthcare dept. of the city of Rio de Janeiro. Here in Brazil the federal gov. has established a deadline to change most software to open-source or free equivalents by 2007.
;-)
So, we started by enforcing the use of OpenOffice on every desktop. The process is simple: if someone wants that old 450MHz Duron replaced by a new 2GHz Athlon, they must use OpenOffice instead of MSOffice. It's amazing how well this argument works!
Mind you, we don't forbid the installation of MSOffice on these new machines. No sir, anyone can BUY and DONATE the license to the city, so the software can be installed legally on the computer. Heh, imagine how often that happens!
The next step was to replace Lotus Notes (argh!) with PostFix + Cyrus running on Debian, and to install ThunderBird on every desktop. Most users just loved the change, because the Lotus Notes client really sucks.
To add a nice touch, every DOC file that passes through the email system is converted into a PDF, for the sake of virus prevention... The only way to pass an editable document, though, is to use the OpenOffice native format!
One day, I dream of replacing all the W2k desktops with Ubuntu Hoary... and telling people it's just a new version of WindowsXP. With most of the users already using OpenOffice, ThunderBird and Firefox, I guess none of them will notice the change!
---- You know how some doctors have the Messiah complex - they need to save the world? You've got the "Rubik's" complex
You forget one thing: it's not their document that people want to read, it's the customers', just stored in their format. It's like the guy who built my house refusing to tell me what size bricks he used, so that I have to hire him to do all the repairs.
I am trolling
This at least gives us the right to our own data back, since we can then convert it to a more useable format. So it seems like we've won the first battle, but not the war!
You never lost the right to your data, you could always output your data into something else. Text, RTF if you wanted to preserve formatting. RTF's specification and a sample reader are published by Microsoft, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnrtfspec/html/rtfspec.asp. You have won nothing, you do know that Microsoft used to publish Word and Excel formats on their website? It did not impede MS's dominance, it did not help the competition.
If you cared, and few really do, you could always have written an RTF file with Word. RTF is documented and sample readers are available from Microsoft, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnrtfspec/html/rtfspec.asp. Word and Excel formats used to be published; it hardly mattered with respect to Microsoft achieving dominance or helping the competition.
This at least gives us the right to our own data back, since we can then convert it to a more useable format...
That Microsoft opens up their format for reading, and specifies parts of it, makes it possible to write software to convert this data to an open format, or index it and such. Therefore, we can still save in MS format, but have much less tie-in.
You seem to be under the impression that ".DOC" documents use something other than eight bit ASCII characters to store data. Try this: Open up WINWORD.EXE, type in "abcdefg", save the file as "abcdefg.DOC", then open up "abcdefg.DOC" with NOTEPAD.EXE.
Guess what? NOTEPAD.EXE will show you that your data, the string "abcdefg" is there in the file just as it ought to be.
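The Notepad experiment can be approximated in a few lines, much like the Unix strings tool. This is a minimal sketch that assumes the text sits in the file as plain 8-bit ASCII; the sample bytes are made up, and text stored in a wider encoding would not show up this way:

```python
import re


def ascii_runs(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Made-up stand-in for a binary document with some text embedded in it.
sample = b"\xd0\xcf\x11\xe0\x00\x00abcdefg\x00\x01\x02"
print(ascii_runs(sample))
```

This recovers the characters, but none of the structure around them, which is the part the rest of the thread is arguing about.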
There is no loss of data when using WINWORD.EXE; rather, there is a gain of typesetting [or markup, or "formatting"] structure that other typesetting [or markup, or "formatting"] programs might not be able to understand.
Microsoft owns the rights to their own proprietary typesetting [or markup, or "formatting"] algorithms, but they make absolutely no claims whatsoever on the underlying data that those algorithms act upon.
If you don't like Microsoft's typesetting algorithms, then use Corel's [WordPerfect], or IBM's [Lotus Word Pro], or Apple's [iWork], or hell, even Donald Knuth's.
And after you've tried those other proprietary algorithms, ask yourself whether Microsoft's proprietary typesetting algorithms failed to offer you any value for your money.
Besides, even if none of what I've said is true, you can still always take your ".DOC" documents, open them in WINWORD.EXE, and click on "File | Save As... | Save as type | Text Only (*.txt)" and never have to deal with Microsoft for the remainder of the life of your data.
It's also to force people to migrate from text files and commandlines to full blown GUIs by turning simple stuff like:
to (That's me trying to format it nicely, btw.) There appears to be nothing at http://schemas.microsoft.com/office/word/2003/wordml
wordml. Is that like Manimal? Never mind. :)
Office outputs real XML with no base64-encoded or CDATA blocks.
Have you ever been to a turkish prison?
Could someone point me to a reference about the laws regarding proprietary standards in the first place? Can't I write my own program to manipulate files in any format, whether or not the file format was created by someone else? Then is it illegal to create a program where Ctrl+C means copy, since ____ (Apple?) invented that?
Patenting something is good, to protect innovation. If Microsoft has created an invention which allows amazingly weird, complex data structures to be stored in a hierarchical structure easily, then they can patent that, but that wouldn't be a patent on the XML file which stores the resulting structure.
This patent seems to be on the arrangement of data. If that arrangement was chosen so a specific process can work on the data, then patent that process with the data arrangement. If not, then this patent is for one thing and one thing only, anti-competitive behaviour, and as such shouldn't be granted.
> Why does Microsoft have to open up their file format anyway?
They don't have to. Let them keep it.
On the other hand, I want the right to participate in my country's politics without having to pay the Microsoft tax. Hence, government must use open standards.
I personally believe that government should avoid software that uses proprietary formats from the outset. Some people, however, believe otherwise, and they are lobbying for a compromise that will make it legit for government to use Microsoft software.
XML is a W3C recommendation (not an open standard; W3C makes that distinction for a reason). It is based on SGML (not UML). XML is a meta-markup language like SGML; it is a means of specifying markup languages such as HTML or WML (not a markup language like HTML). Being a W3C recommendation, XML is copyrighted... by the W3C (not "it cannot be copyrighted"). Patenting and licensing of XML schemas or DTDs (which is what Microsoft did) is not the same thing as copyrighting anything (tools, formats used by tools, whatever). As for "You can write anything on paper but it still doesn't make it true"? I couldn't agree more. In fact that statement is as true of Slashdot comments as it is of paper. Jeez, I hate Microsoft as much as any Slashdotter, but at least get your facts straight!
It's a penny for your thoughts, but you put in your two cents worth. Somebody, somewhere is making a penny. SteveWright
Office 2003 XML Reference Schemas:
http://www.microsoft.com/office/xml/default.mspx
-- "I never gave these stories much credence." - HAL 9000
It's hard to say, but I'd read this to say that I can write GPL'd software, but anybody who wants to create a derivative work would have to go to the Microsoft web site and agree to the license.
This is probably splitting hairs, but unless the format is released into the public domain or into an open licensed format, there is nothing that says Microsoft couldn't change their mind later and stop granting licenses. My license may be perpetual, but anyone who doesn't make it in the gate may be out of luck.
Furthermore, this might allow Microsoft to halt distribution of GPL'd implementations of their formats to people using the program for non-government purposes. Note this clarification:
So, you can distribute your OpenOffice filter to people, but presumably only under the condition that they use it to read government documents.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
"This at least gives us the right to our own data back, since we can then convert it to a more useable format."
Not correct. "We" will have no right to read or write data in their format. Only "Government documents" may be read. That doesn't give most of us shit.
The race isn't always to the swift... but that's the way to bet!
specifically of GPLed software. They are putting loads of effort to get around that but GPL software is creeping up everywhere and they don't know how to stop it.
Microsoft can now say, "Office XML file format is available for anyone to read. This proves Microsoft is promoting open standards."
Decision makers who don't care about the nuances of open standards or this issue, will put a check mark next to Open Standards in their features matrix.
Meanwhile, MS develops MSXML solutions to extend their reach into lucrative corporate markets now populated by small companies.
Don't mod me down (again) for the following, because this is the harsh reality.
Alternative office suites may be able to read and write M$ XML all they want some day. Microsoft simply doesn't care because they aren't a real threat to their bottom line. *No* Office application competitor redefines the broad market or adds new overwhelming feature/value to the broad Office applications market. Period. You can imagine what MS would do if such a thing existed.
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
Microsoft cannot prevent you from writing software that creates .doc files. What they can do is prevent you from writing software that creates .doc files if you read the .doc file specification. They own the specification, and can put any conditions on it they like (up to those permitted by law). You then have to choose between reverse engineering the format (assuming you live somewhere where it is still legal), or getting a copy of the spec and only adding read support.
I am TheRaven on Soylent News
I think the courts should under no circumstances let this pass. This is a bunch of BS, and I think that unless Microsoft complies fully and delivers the complete format of the data files, they should be fined not $5,000,000 per day until they comply, but $5,000,000 per day until at least 100 independent open source computer programs exist that can handle Microsoft document files in their entirety, with no major user complaints about the functionality of these programs.