MS Office XML Format Now In TextEdit
computerdude33 writes "Apparently, Apple heard of Microsoft Office changing to XML formats. If you have OS X 10.4.2, you can save documents in TextEdit in Word XML Format. They are saved with a *.xml extension, and are riddled with references to Word. Here is an example of one of these documents."
Now you just have to find a Microsoft product to read the future Microsoft Word XML file!
So a simple two word text file has the following 33 XML tags pasted here with the greater and less than signs removed...
/ 2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibra ry/2003/2/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/c ore" xmlns:wx="http://schemas.microsoft.com/office/word /2003/2/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C1488 2" xmlns:st1="urn:schemas-microsoft-com:office:smartt ags" xml:space="preserve"o:DocumentProperties/o:Documen tPropertiesw:fontsw:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"//w:fontsw:docPr/w:docPrw:bodywx:sectw:pw:pP r/w:pPrw:rw:rPrw:rFonts w:ascii="Helvetica" w:h-ansi="Helvetica" w:cs="Helvetica"/wx:font wx:val="Helvetica"/w:sz w:val="24"/w:sz-cs w:val="24"//w:rPrw:tHot time!/w:t/w:r/w:pw:sectPrw:pgSz w:w="12240" w:h="15840"/w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440"//w:sectPr/wx:sect/w:body/w:wordDocum ent
?xml version="1.0" encoding="UTF-8" standalone="yes"?
?mso-application progid="Word.Document"?
w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word
http://tinyurl.com/4ny52
OpenDocument from OASIS
I don't really see the problem with "bloated" xml, when the files are zipped by default. Instead of smushing your efficiency requirements in with your readability and standardization requirements (and screwing all three), you first handle readability and standardization and then rap it in a standard efficiency layer. The upshot is, not only are the files often *smaller* than the old Word equivalent, but I can also hack through them using a couple of standard perl packages that have come with linux, OS X and cygwin for years.
Where's the downside?
OpenOffice 2.0 Beta already supports WordML.3 3450
http://www.openoffice.org/issues/show_bug.cgi?id=
"Taligent is still pure vapor. Maybe they'll be the last who jumps up on Openstep... "
One thing to note is that the Microsoft XML formats and schemas, either those exported by TextEdit or by the .docx format, are not necessarily done by Microsoft by choice. They're not even in response to OpenOffice.org. In my opinion, they are the result of "government forced technology", similar to how the California clean air regulations back in the 70s started to force Detroit to pour more money into catalytic converters and environmentally friendly cars.
There have been numerous government proposals and mandates that require open document formats. Some of the Massachusetts proposals come to mind. I believe the EU also has proposals on the table that require the use of open document formats. The trick with the EU proposal is that it actually mentioned XML (I believe it's the ISIS proposal, but may have the wrong acronym). Governments are large Microsoft customers and Microsoft doesn't want to lose their business. Including the ability to save in publicly documented XML formats gives them a loophole to continue selling to governments, even if all of the open document format requirements are adopted.
The ability of OpenOffice.org (and NeoOffice/J) to support these formats really is dependent on two things. First, the schemas are licensed from Microsoft on non-OSS compatible terms. Each individual person or application has to enter into a licensing agreement with Microsoft individually. This is directly against the terms of either BSD style or GPL style licensing. Secondly, Microsoft may have software patents involved with their schemas according to their licensing terms. While the patentability of a schema itself is questionable, they seem to have several patents revolving around the interpretation of XML schemas that may apply to their Office schemas. This goes against the CDDL style licensing Sun is now fond of.
Because of these terms, the only ways that OOo/NeoOffice could legally support them would be if either the schemas are clean room reverse engineered from example documents or if Microsoft turns a blind eye to open source folk using their schemas. Since I wouldn't want to rely on Microsoft's generosity, the clean room solution is the only way I can see. Sun won't be the one to clean room them either; they don't have to. StarOffice (and Sun built OpenOffice.org for Linux/Solaris/Win) would be covered under Sun's cross-licensing arrangements with Microsoft as a result of their settlement. Those licenses don't extend to non-Sun OOo developers like me, however, so we're all up shit creek.
Just because you can read it and the format is "open" doesn't mean it's "free". You can be sure that Microsoft's lobbyists will make sure that all of those government directives still refer to "open" and no "free" gets snuck in there by mistake.
ed