XML Support In Office 2003 Isn't For Everyone
0x0d0a writes "Unfortunately, it seems that Microsoft's recent campaign to promote Office 2003 based on its XML support may be a bit misleading. Only the Enterprise and Professional releases will have this support -- not Standard. Microsoft will still be leveraging file format compatibility for at least another Office release."
This is not reliable source! This is US led propaganda campaign!
Seriously, though, who here could not have predicted this?
Compared to war, all other forms of human endeavor shrink to insignificance. God, how I love it. - Gen. George Patton
but XML support in OpenOffice is.
------------
This is guaranteed to not be the first post.
Right, like we couldn't have seen this coming from a looong way off.
I've given up on Office completely. I even try to reject .DOC files completely - thanks to .PDF, it's been mostly successful.
"Compatibility" is still a bitch's game.
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
The sun rose this morning; sunset predicted for later today!
This is one reason I use OpenOffice (openoffice.org) at home, as it supports most Word versions flawlessly, without prompting me to "insert office cd 2" to install the feature.
We're only gonna die from our own arrogance, that's why we might as well take our time...
The entire business world is still being held hostage or pushed around by a proprietary file format. How sad, annoying, and wasteful.
I always said during the DOJ trial all I wanted was to have the Office file formats opened. That would have really led to some change.
Btw, in case you're new here, try OpenOffice; you might like it.
www.openoffice.org
If you wanna get rich, you know that payback is a bitch
Microsoft announced that only the Enterprise and Professional versions of Office 2003 would support the feature of saving files to industry-standard media such as IDE and SCSI hard disks. The Standard version of Office 2003 will allow the user to save document files only to Microsoft Zippo (TM), a new proprietary USB-based external removable media device. "We believe this is an innovative way to provide extra value to our customers." said Microsoft spokesman Hugh Jass.
"And this is my boy, Sherman. Speak, Sherman." "Hello." "Good boy."
If they ever do make it general they'll encumber the components with so many patents and copyrights that it will be a proprietary format in spite of being XML based.
The people running Microsoft might not be "nice", but they certainly aren't stupid. Moving to an open file format would immediately saw one of the legs out from under their monopoly. Expect them instead to vaporize the file format issue and drag it out as long as possible, so that people and companies tempted to switch to a WP with an open format will think they can get the open formats without switching, if only they wait a little longer and pay for a few more upgrades.
Sheesh, evil *and* a jerk. -- Jade
We have to remember that this is Microsoft we are talking about here. Any time they say "we are going to switch to an open format", there's always a catch to it.
Is Microsoft ever going to switch to an open format? No, why would they? They will only lose money. As for the people complaining about competition, why should a company with 90 - 95% of the desktop Office suite market care?
People with little or no knowledge about what Microsoft has done in the past might think that Microsoft is taking a great step forward. But remember, this isn't going to be complete XML, it is "Microsoft XML"
All this about Microsoft doing a great thing by switching to an "Open XML base" is all hype, nothing more.
Read my journal here.
Swimming with a very big shark is always guaranteed to be interesting, not necessarily good or bad. This is just throwing a few drops of blood in the water to spice things up.
Text in Office 2003 files stored in XML format might be viewable in other desktop programs, but all document formatting would be lost
Actually, this is entirely the point of XML. XML is not Yet Another Word Processor Format. It's intended to store "content" as opposed to "presentation", leaving "presentation" up to the app, much as was the original intent of HTML. Rather than an evil Microsoft plot, they are in fact conforming to the spec when they produce such a file.
The semi-trailer truck sized hole in the notion is, of course, that "presentation" isn't really entirely separable from "content", especially in a modern document. All that graphic-artist stuff like layout and font choice and formatting actually affects the value and usefulness of the document. That's why we put it in in the first place. And that's why everyone always whines when Word strips out all the "presentation" they've spent all that effort putting into the document and leaves them with just the raw XML "content" -- a bunch of text.
The flaw here is in the attempt to erect too high of a wall between presentation and content, not in Word.
By the time you get fine-grained enough control over the presentation to create documents that actually look the way you want, the "content" usually becomes illegible. Alternatively, you have only coarse control over the presentation, in which case the content most often looks like crap. This problem is easily seen in any number of web pages that feel obliged to include some little rant at the top about bloated HTML and how they concentrate on "pure content", which usually means a sea of unreadable and undifferentiated Times Roman.
The flip side is if you actually do break up the content enough to get control over the presentation. The last time someone tried to create a human-readable ASCII-text format for documents, they wound up with PostScript. A typical document actually looks something like:
[556 0 24 -19 541 703 ]
AddEuroGlyph
} if
F
F4S53 Ji
688 1320 M ( )S
F2S53 Ji
800 1518 M (802.3z Gigabit Eth)[42 42 42 21 42 36 21 60 23 41 37 42 23 23 21 51 23 0]xS
1431 1518 M (ernet local)[37 28 41 37 23 21 23 42 37 37 0]xS
1781 1518 M (-)S
1809 1518 M (side interface)[32 23 42 37 21 23 41 23 37 29 27 37 37 0]xS
2255 1518 M ( )S
F3S53 Ji
650 1620 M S
F4S53 Ji
688 1620 M ( )S
F2S53 Ji
800 1620 M (Supports f)[46 41 42 42 42 28 23 32 21 0]xS
1145 1620 M (ull Gigabit line rate)[41 23 23 21 60 24 41 37 42 23 23 21 23 24 41 37 21 28 37 23 0]xS
1795 1620 M ( )S
F3S53 Ji
650 1722 M S
F4S53 Ji
688 1722 M ( )S
F2S53 Ji
800 1722 M (Operates in either media convert)[60 42 37 28 37 23 37 32 21 23 41 21 37 23 24 41 37 28 22 63 37 42 23 37 21 37 42 42 42 37 28 0]xS
1888 1722 M (er)[37 0]xS
1953 1722 M ( or line)[21 42 28 21 23 23 41 0]xS
2189 1722 M (-)S
2216 1722 M (card )[37 37 28 42 0]xS
800 1817 M (mode)[63 42 42 0]xS
Here's a hint. The "content" is clearly delimited by parentheses (instead of, oh, ""). Easily readable by humans, right? A cinch to import into other applications, right? Guess what: a real XML word processing document that kept the presentation information isn't going to be any more readable. You're not just going to whip out vi and fix it up any more than you can do that to your PostScript documents now.
XML is not magic application pixie dust that makes all features transparently interoperable when you sprinkle it on.
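To make the PostScript point concrete, here is a rough Python sketch that pulls the parenthesized strings back out of a few lines of the fragment quoted above. It assumes the simplest possible case (no nested or escaped parentheses, no font re-encoding), and the names `PS_FRAGMENT` and `extract_text` are made up for illustration. Even when it works, all the positioning and width data is thrown away, which is the whole complaint.

```python
import re

# Five lines copied from the PostScript fragment quoted in the comment above.
PS_FRAGMENT = r"""
800 1518 M (802.3z Gigabit Eth)[42 42 42 21 42 36 21 60 23 41 37 42 23 23 21 51 23 0]xS
1431 1518 M (ernet local)[37 28 41 37 23 21 23 42 37 37 0]xS
1781 1518 M (-)S
1809 1518 M (side interface)[32 23 42 37 21 23 41 23 37 29 27 37 37 0]xS
2255 1518 M ( )S
"""

def extract_text(ps_source):
    """Join every (...) string literal, ignoring coordinates and glyph widths.

    Naive on purpose: assumes no nested or backslash-escaped parentheses.
    """
    return "".join(re.findall(r"\(([^()]*)\)", ps_source))

print(extract_text(PS_FRAGMENT))
```

The text comes back as one run with the word "Ethernet" stitched together from two separate show operations; the layout that made it a heading is gone.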
Microsoft will still be leveraging file format compatibility for at least another Office release.
Here we go again. "If Microsoft would just use an open format like XML then anyone could read the documents with any program and the world would be a better place."
XML is a format for creating data formats. It is not a data format. The fact that a particular format is XML compliant says nothing for its readability, it simply means that it can be parsed into a document tree by an XML parser. That doesn't mean that anybody can determine what the tree represents, only that it can be created. My favorite analogy: "If Microsoft would just start using 8-bit bytes, then anybody could read their file formats."
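A small sketch of that point, using Python's standard `xml.etree.ElementTree`. Both sample documents are invented for illustration; neither is a real Office or OpenOffice format. Each one parses into a tree without complaint, but only one of the trees tells you anything.

```python
import xml.etree.ElementTree as ET

# Two made-up documents. Both are well-formed XML.
READABLE = "<letter><to>Alice</to><body>Hello</body></letter>"
OPAQUE = "<x1><x2 a='7'>4f 2a 91</x2><x3 a='9'>c0 00 1e</x3></x1>"

def outline(xml_text):
    """Parse the text and return the element names in document order."""
    root = ET.fromstring(xml_text)
    return [elem.tag for elem in root.iter()]

print(outline(READABLE))  # element names a human can guess at
print(outline(OPAQUE))    # parses just as well, means nothing without the vendor's docs
```

The parser is equally happy with both. Whether anyone outside the vendor can use the second tree depends on documentation, not on the angle brackets.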
Microsoft has made it clear that the dollar value of secret file formats isn't lost on them. They will continue to use secret file formats, even if they're XML-based, until someone makes them stop. At the same time, they'll be able to harvest the stupidity of PHB's who will claim that Microsoft file formats are open because they're XML. It's surprising how many people on Slashdot foolishly believe the same.
Michael
Do you have ESP?
I'm not counting on the MS Office suite to provide me with an XML editor. Here are some alternatives:
DocSoft's W2XML Version 2
Authentic by Altova
i4i Tagless Editor
XMLWriter by Wattle Software
Opensource Extensible XML Modeling Application
If you know of any other GUI based XML modeling/editing apps, please feel free to add them to this list.
Consensus is good, but informed dictatorship is better
Develop once, sell many times...
IANAL, but imagine a beowulf cluster of in Soviet Russia all your belong are base to us welcoming the new SCO overlords.
It actually makes sense, for usefulness.
If you xlink to another XML document or some binary data, then you need the "other document". If you need the DTD, or stylesheet information, you need the other document as well.
Zipping a single XML document has space saving as its only advantage. But for many, keeping the documents in the same place ensures you don't get errors interpreting them and their required children/siblings/parents.
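A minimal sketch of the bundling idea with Python's standard `zipfile` module. The member names mirror the `content.xml` / `meta.xml` pair that OpenOffice uses, but the contents here are placeholders and `bundle` / `list_members` are hypothetical helper names.

```python
import io
import zipfile

def bundle(members):
    """Pack a {name: xml_text} mapping into one in-memory zip package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in members.items():
            zf.writestr(name, text)
    return buf.getvalue()

def list_members(package):
    """Return the member names stored in a zip package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()

# Placeholder documents: content.xml refers to its sibling by relative name.
package = bundle({
    "content.xml": "<doc><see href='meta.xml'/></doc>",
    "meta.xml": "<meta/>",
})
print(list_members(package))
```

Because the reference is relative and both files travel in the same archive, whoever opens the package always has the sibling document on hand.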
--
"I'm not bright. Big words confuse me. But Wanda loves me and that should be enough for you." - Cosmo
This is as good-a-time as any to migrate away from Microsoft Office. Open Office 1.1 is about to come out and it looks brilliant!! (the beta is currently available at http://www.openoffice.org/ ) It supports open standards (eg. XML), Microsoft Documents (word/excel/powerpoint) and exports to PDF (both text and graphics) at the press of a button! It also manages to count page numbers correctly when printing (* cough - word, cough *).
:)
On the other hand, my wife prefers Word and I prefer Open Office. The only time she likes open office is when she asks me to convert a document from one word format to another - because word won't do it at all, or word converts it very badly.
Also, I save several hundred dollars every few years
AC
lemme see ...
...
... see Bill, see Bill emulate.
... ;)
there's MS Java, then there's the other version
there's MS HTML, then there's the other version
there's MS VC++, then there's the other version
there's MS OS's then there's the other OS
same ol same ol
Nope, nuthin new here folks, move along
Words to men, as air to birds.
Even if XML was supported in all versions of Office, would that mean that Office would suddenly have an open file format? I don't think so. It's perfectly possible for me to write anything in XML in a way that you will not be able to read it.
Which is normal. XML is a way to describe data. If you have the DocType Definition (DTD) of an XML file, the only thing you know is whether that XML file is structured correctly, and how you would create another XML file that would look like the same thing for an XML parser. Nothing more.
In the long run, XML is nothing more than a standard you can use to base other standards on. XML can be put in the same row as ASCII, bytes, the file concept, or even SGML: it's a standard intended for the creation of other standards.
Nothing more, nothing less
Therefore, I think the argument that Microsoft Office will 'support XML' is just a marketing joke. It won't do anything out of the ordinary...
Microsoft will still be leveraging file format compatibility for at least another Office release.
They'll do this as long as they have a monopoly (or near-monopoly). The XML support isn't about making file formats compatible with competitors, or even about pretending to. It's just one more feature that MS has added to Office, in an attempt to persuade existing users to upgrade. It means that Office can be used to edit XML documents. It doesn't mean that Office's proprietary file formats are disappearing.
XML editing is a useful feature for some people, and from what I've heard it works better than the horrible HTML support in previous versions of Office, but it's still a niche. (True, it can be used to help with cross-platform compatibility, but so can RTF and other existing "save as" options.) Most users just want to write a letter or design a presentation, and aren't concerned with markup languages.
It's great to see someone else gets it. PostScript is actually a language which describes layout -- really, nothing is stopping you from doing all your work in it. Same with TeX. Of course, both languages (and they are true languages) are extremely complex and generally benefit from a middle-ground tool to do the real work (LyX, Texinfo, Acrobat, Dia (?) etc).
:)
Treat XML like a database. It has rules of operation, but what you contain and how you describe the data are completely arbitrary.
That said, if Office were really aiming for interoperability, they would publish the XML schema and layout rules. However, as most of us already know, it's just yet another business with the desire to put "XML" on their "Corporate Resume" to make them look more "open".
Sorry for all the double-quoted words.
Microsoft's Leach emphasized that this change in positioning doesn't negate that "customer-defined XML schema support is a feature of Pro." On the other hand...
Cool, they've actually appointed a corporate leach. Perhaps that explains why MS Office came out with XML support after it was released in OpenOffice.
Stop-Prism.org: Opt Out of Surveillance
Treat XML like a database. It has rules of operation, but what you contain and how you describe the data are completely arbitrary.
...
Anyone who has used XML knows perfectly well that it's entirely possible to describe the complete dataset for content, layout, and presentation, within an XML document, in a form which can be easily parsed by humans and software alike. Completely. Using open standards, even.
Consequently, it's also possible to wrap it all up in 'parseable', yet 'unhandleable-unless-you're-on-the-inside' data blobs which mean nothing to no-one, yet still use 'XML' as a wrapper.
It's a liability of having such an open design, and Microsoft are exploiting this fact, in the context of *CLEAR* market-division tactics.
*They* created the artificial 'Professional/Enterprise/Standard' labels. Not the Users.
MS' use of XML here is perverted. It serves no purpose other than to give MS an opportunity to blag press release points about how their software uses 'the latest open standards' to people who have *NO CLUE* what they're talking about
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
So wait a second - the original post stated that XML is ALL about the content and specifically NOT the presentation. Now you are saying that XML is apparently *self documenting* and the USER decides how the content should be displayed.
So, according to your post, Microsoft is correct when their XML file output includes the *content* and the *user* can display it however they want.
XML is a text-based system for data storage and retrieval, intended to be *self documenting*. In other words, the details on what fonts are used, what settings The User has set for individual parts of the documents, the parameters for those setting, etc. ARE ALL SUPPOSED TO BE STORED IN READABLE FORMAT WITHIN XML TAGS, CONFORMING TO A KNOWN, PUBLISHED DOCUMENT DESCRIBING THE CONTENT.
No it's not. XML is not supposed to store information such as 'font' and other presentational features. This is the job of the XSL stylesheets or CSS etc. XML is designed to store data in a structured way. So for instance you may have a <chapter> tag, but what font to use for chapter tags is only supposed to be specified in the XSLT. If I did an XML export of my word document, I would expect (hope for) an XML document, and either an XSLT stylesheet transforming the XML to HTML, or an XSL:FO stylesheet so that I can turn the XML into a pdf or postscript file. However, the stylesheets would be the 'icing on the cake'. The essential item is the XML formatted data, not the presentational information.
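Here is a rough illustration of that split in Python. It is not XSLT: the standard library has no XSLT processor, so a plain dictionary stands in for the stylesheet. The sample document, the two style tables and the `render` function are all invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# The data: structure only, no fonts anywhere.
CONTENT = "<book><chapter>Introduction</chapter><para>XML stores structure.</para></book>"

# Stand-ins for two stylesheets: element name -> (HTML tag, inline CSS).
PRINT_STYLE = {"chapter": ("h1", "font-family: serif"),
               "para": ("p", "font-family: serif")}
SCREEN_STYLE = {"chapter": ("h2", "font-family: sans-serif"),
                "para": ("p", "font-family: sans-serif")}

def render(xml_text, style):
    """Turn each top-level child into HTML using the given style table."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        tag, css = style[child.tag]
        parts.append(f'<{tag} style="{css}">{escape(child.text or "")}</{tag}>')
    return "\n".join(parts)

print(render(CONTENT, PRINT_STYLE))
print(render(CONTENT, SCREEN_STYLE))
```

The same `CONTENT` produces two different presentations, and swapping the style table never touches the data. A real XSLT or XSL-FO stylesheet plays the role of the dictionary here.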
For comparison, here is the equivalent (empty) document in OpenOffice.
content.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE office:document-content PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "office.dtd">
<office:document-content xmlns:office="http://openoffice.org/2000/office" xmlns:style="http://openoffice.org/2000/style" xmlns:text="http://openoffice.org/2000/text" xmlns:table="http://openoffice.org/2000/table" xmlns:draw="http://openoffice.org/2000/drawing" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:number="http://openoffice.org/2000/datastyle" xmlns:svg="http://www.w3.org/2000/svg" xmlns:chart="http://openoffice.org/2000/chart" xmlns:dr3d="http://openoffice.org/2000/dr3d" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:form="http://openoffice.org/2000/form" xmlns:script="http://openoffice.org/2000/script" office:class="text" office:version="1.0">
<office:script/>
<office:font-decls>
<style:font-decl style:name="Arial Unicode MS" fo:font-family="'Arial Unicode MS'" style:font-pitch="variable"/>
<style:font-decl style:name="HG Mincho Light J" fo:font-family="'HG Mincho Light J'" style:font-pitch="variable"/>
<style:font-decl style:name="Nimbus Roman No9 L" fo:font-family="'Nimbus Roman No9 L'" style:font-family-generic="roman" style:font-pitch="variable"/>
</office:font-decls>
<office:automatic-styles/>
<office:body>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
</text:sequence-decls>
<text:p text:style-name="Standard"/>
</office:body>
</office:document-content>
meta.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE office:document-meta PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "office.dtd"><office:document-meta xmlns:office="http://openoffice.org/2000/office" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:meta="http://openoffice.org/2000/meta" office:version="1.0"><office:meta><meta:generator>OpenOffice.org 1.0.1 (Linux)</meta:generator><!--SRC641_[7663]_LINUX_INTEL__stripples.devel.redhat.com_at_9/10/02_8:50:05 --><meta:creation-date>2003-04-14T09:09:00</meta:creation-date><dc:language>en-GB</dc:language><meta:editing-cycles>1</meta:editing-cycles><meta:editing-duration>PT0S</meta:editing-duration><meta:user-defined meta:name="Info 1"/><meta:user-defined meta:name="Info 2"/><meta:user-defined meta:name="Info 3"/><meta:user-defined meta:name="Info 4"/><meta:document-statistic meta:table-count="0" meta:image-count="0" meta:object-count="0" meta:page-count="1" meta:paragraph-count="1" meta:word-count="0" meta:character-count="0"/></office:meta></office:document-meta>
That is only 2 out of the 4 or 5 files openoffice saves. Oh, and for all those who made sucky Base64 jokes about MS WordML, take a look at this:
<config:config-item config:name="PrinterSetup" config:type="base64Binary">ugL+/0dlbmVyaWMgUHJpbnR lcgAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAU0 dFTlBSVAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWAAMAAAIAAAAA
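For the curious, a quick Python sketch of what is inside that blob. It decodes only the first quoted line of the `PrinterSetup` value, on the assumption that the embedded spaces are wrapping residue from the posting rather than part of the data. `QUOTED` and `decode_prefix` are names made up for this example.

```python
import base64

# First line of the PrinterSetup value as quoted above (spaces included).
QUOTED = "ugL+/0dlbmVyaWMgUHJpbnR lcgAAAA"

def decode_prefix(text):
    """Strip whitespace, trim to a whole number of 4-character groups, decode."""
    compact = "".join(text.split())
    usable = len(compact) - len(compact) % 4  # drop any trailing partial group
    return base64.b64decode(compact[:usable])

blob = decode_prefix(QUOTED)
print(blob)
```

A few binary header bytes, then a plain ASCII printer name, then zero padding. It sits inside well-formed XML, but it is a binary structure that an XML parser can only hand back as an opaque string.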