Microsoft Just Says No to .Doc Replacement Panel
Schlemphfer writes "OASIS is a nonprofit consortium backed by top technology companies, and the purpose of this organization is to set open standards for desktop and business software. They've just announced a working group that will create an XML-based document format standard for openoffice.org. And even though Microsoft is a member of OASIS, they aren't going to be taking part in this group. It's a logical move on Bill's part, considering that standardized XML docs are sure to weaken the hold that Microsoft's proprietary .doc format has on business software."
Yes, MS isn't going to open up one of its proprietary licenses. Especially one that is so widely used. If this comes as a surprise, you need to soak your head.
But, I guess everyone will have a great time bashing MS for doing the obvious...
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Why should Microsoft change formats? They are presently in a position (with regard to office software) where they can force their own "standards" on everyone else. They continue to dominate because there are no reliable, transparent converters. If they were to adopt a document format where other companies' software could edit documents created by Word, there would be little reason to stay with Office. I personally always use plaintext wherever I can; I don't want to rely on any document format (no matter how common) continuing to exist for long periods of time.
Office 11 will have an XML format available, but the default will still be .doc.
Under capitalism man exploits man. Under communism it's the other way around.
Proven in court. Why would they turn away from monopolistic behavior when their punishment for it is negligible?
Office is the cash cow, and they have done their best to eliminate viable competition.
The only reason that Corel Wordperfect lives on is the legal community, and a few bullheaded supporters that will not change. (not that refusal to change is bad in this case.)
Why would anyone logically think that they would embrace a standard that will put their competitors on an equal playing field?
A standard that they cannot "extend" easily at this point without lots of bad publicity.
I think that they are going to "wait and see" if it flies, then embrace and extend it after it sticks. It is in their benefit to wait for it to fail, or for more time between their conviction and their extension of this standard. They don't want to get their hand slapped again so soon.
Cuchullain
"If sharing a thing in no way diminishes it, it is not rightly owned if it is not shared." -St. Augustine
This just in, Sun says No to Java standardization! My point being...BFD? Of course Microsoft isn't keen to join up. Just like any other for-profit company wouldn't join a committee whose purpose was to weaken their market position...
First, although XML seems more 'open', in reality it is simply a higher-level encoding that may or may not be easier to understand but is guaranteed to both take longer to parse and take up more space than the conventional .doc format because of the size of the tags, making this a downgrade 'optimization' of both speed and size -- where is the win here?
Funny. I just made a "hello world" document using Word 2000 and it was 19 KB. ;-)
Lack of features -- there's a reason people are still using .doc and .pdf instead of HTML, and giving HTML a fancier name for the new millennium isn't going to change it. Anything tougher than bold, italics, and tables has been proven to be an O(n^2) representation in HTML and has been neglected because nobody wants to download a meg of webpage.
If you seriously think that XML is just a fancy HTML, then there's no hope of you understanding why this open standard is a good thing in the first place.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
IMHO, Microsoft's closed office formats are the basis for its monopoly in the Office market. I love LaTeX and use it when I author articles myself. But when I work with others, guess what? I have to use Word.
I've tried using LaTeX with several groups and each group has decided to move back to Word. It is just too familiar, too standard.
The sad part is that I absolutely hate Word as much as I dislike any other program. It has nothing to do with my feelings towards MS. Word is just a poorly done program.
In the real business world, Office will be king until MS opens its format. StarOffice (which I've used quite a bit) is nice, but at 99.5% compatibility, it just isn't good enough. No one wants to lose a business deal because they don't use the standard.
I highly doubt MS will ever release its hold on the Office formats. Of course, they are going to XML, but that doesn't mean the format will be open and readable to competitors.
..People are going to have a much easier time smashing through the Microsoft document formats when they switch to XML.
As it stands, should Microsoft be worried? Nope.
MS Office has no equal. Some things are based on their ownership of the most widespread desktop OS - such as MS Office opening damned near instantly.
The rest, though.. Open Office? Please. Show me the equivalent of Excel, Access and PowerPoint on Linux, or on any other office suite. You won't be able to.
Word.. Word can easily be replaced with even AbiWord. The word processor is a battle that is meaningless. Even MS Office allows you to save in .rtf (among other formats), and have it open fine in any other well-coded word processing program.
Where some may insist that Microsoft has an operating system monopoly due to dirty tactics (OEM threats, anyone?), MS Office is king because it kicks ass. It has no equal.
Once a standards-based XML document format is formalized, Microsoft will boldly announce that Office has full support for reading and writing the format. What they won't tell their customers is that when Office writes out such documents, it will most likely embed "custom" features that "extend" the standard, which non-MS applications will have difficulty understanding and without which things just won't quite look right, thus locking MS Office users in to the same dilemma they have now.
BUT, this can be avoided IF the standards committee carefully structures the standard in such a way as to prevent custom incompatible extensions, and so that any application not adhering to the standard cannot advertise itself as compliant or able to read/write such documents. A good trademark owned by the standards body would assist in enforcing this. Then Microsoft would have to choose either to implement it openly, or not fully support it. This would at least force them to be honest.
Isn't Microsoft .doc format based on XML already?
Yes, but this doesn't really help a whole lot. XML is a standard for designing document formats, it is not a format in its own right. The fact that Microsoft's format is "based on XML" really only says that they will use HTML-like tags <foo>some text here</foo>; it doesn't say how their word processor will interpret those tags, or even what the tags will be, etc.
What's wrong with RTF or straight-up ascii?
Try embedding a spreadsheet in RTF, and get back to us (is this question for real ?)
I was under the impression that Microsoft Office 11 was promoting their own??? version of XML. If that is the case, I am sure that BillG wouldn't want anything else as a standard
No, Microsoft are using their own document format. It's not a "version of XML", XML is a specification for writing document formats. It isn't a format in its own right. Bill couldn't care less if something else became standard, but the issue here is convenience. Microsoft may want to be able to add tags to their document format, as they add features to their software. It's really a case of the "not invented here" syndrome -- everyone likes to invent their own format. Even with standards like POSIX, C++, C, and HTML, any vendor of consequence adds their own vendor extensions.
Yes, MS isn't going to open up one of its proprietary licenses. Especially one that is so widely used. If this comes as a surprise, you need to soak your head.
"Proprietary licenses" are not the issue here. Microsoft are moving to an XML based format, and they already allow developers access to documentation for their formats. Moving to XML will make their formats more accessible -- it might not make much difference to a serious implementor, but it will make it much easier for the average Perl hacker to do something with their documents.
The issue is that MS don't want someone else controlling the format that their software uses. It's simply more convenient if you have complete control over the specifications of your format. Compatibility requires some discipline, and possibly a certain amount of inconvenience. Whether or not that inconvenience is worthwhile depends on the merits of the format, which is why Microsoft are playing "wait and see".
In any case, I doubt Microsoft would use a standard format as their native format, at best they would base their native format on a standard and add a bunch of vendor extensions to it.
The article said they're taking a "wait and see" approach. They are free to join at a later date.
Why do people think XML is a panacea for proprietary document formats?
[Insert binary blob of data that is currently a .doc file here]
Lookeee! Now it's XML. Isn't that so much better?
No, I don't think MS is going to do anything that awful, but realize that XML is not magic. It does nothing by itself to make a document more open. If you have lookup table values in the XML data then you're still screwed unless you know what the actual lookup table is. You can have utterly meaningless tags with random data in it. If you don't have agreements on what fields actually mean then all you have is content without value. Yay.
Frankly, all XML really does is explode a file's size by encapsulating data with tags. Whoop de doo. You have to have a rigorous and complete document specification, and while a DTD may fulfill that need it doesn't always. With a rigorous and complete spec though then XML is redundant - you can just as easily parse a binary file at that point. And look! You can do it with less memory and CPU. Funny that.
I fully expect MS version of "XML based .doc" to be simply a base-64 encode of the .doc we have today, enclosed in a pair of XML-tags.
Thinking "Oh, it's XML! Then we can all understand what it says!" is naive.
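To make the "blob in a pair of tags" worry concrete, here is a minimal Python sketch. The <document> element name and the eight bytes standing in for a .doc file are invented for the example; nothing here reflects what Microsoft actually ships.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for an opaque binary document; these bytes are made up for the example.
blob = bytes([0xD0, 0xCF, 0x11, 0xE0, 0x00, 0x01, 0x02, 0x03])

# Wrap the base64-encoded blob in a hypothetical <document> element.
wrapped = "<document>" + base64.b64encode(blob).decode("ascii") + "</document>"

# Any off-the-shelf XML parser accepts the result as well-formed XML...
root = ET.fromstring(wrapped)
print(root.tag)

# ...but all it can hand back is the same opaque bytes we started with.
recovered = base64.b64decode(root.text)
print(recovered == blob)
```

The file is perfectly good XML, and a reader still learns nothing about the layout of the bytes inside it.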
Belief is the currency of delusion.
Microsoft Office file formats are the linchpin of their dominance of the computer software world. Because everyone has Office, no one can switch since the de facto exchange format is MS Office docs. Small companies/organizations can effect wholesale change to some degree but still have difficulty trying to interact with other businesses. Non-techs don't understand why you can't read their Word doc b/c what else could you be using? This causes pain for anybody who tries to switch and the quickest relief of pain is to fork out a few hundred smackers for a copy of Office.
Microsoft also enforces its planned obsolescence in the same way. Since new machines only come with the new version of Office, any existing organization is eventually infected with the 'upgraded' versions (complete with their 'smart' features that are either annoying or useless to 99% of the consumer base). Once these documents begin to float around and not open quite right in old versions of Office, everyone needs to upgrade. Otherwise, countless billable hours will be lost to futzing with file formats. $400 for an Office license quickly pays for itself when you're billed out at $50-$100 per hour. It's not the most desirable path, but for a struggling business, it's the quickest pain relief available.
File formats also further entrench the Windows operating system. Clearly, Linux and Unix are out with no native MS Office suite. While I admire the open source projects and their ability to continually reverse engineer the moving target of MS file formats, it is impossible to keep up and they can never provide 100% compatibility, which is imperative for a working daily interaction with MS Office users. Even on the Mac with Office X (touted by MS ads for its full compatibility), there are roadblocks to easy transition. My wife uses Office at work because she has to interact with others who do. She recently tried to move to Mac but couldn't because her files weren't quite right. The symbols didn't translate correctly, which might not bother business folk, but as a scientist, it meant that all her technical papers would require endless fixing just to do a little work at home. So she's back to a Microsoft Windows box. How fortunate for Redmond that the software they supplied wasn't capable enough for her to make the 'switch'.
All of this hinges on the ability of Office to maintain a closed file format. It keeps users trapped in Office due to compatibility with their coworkers and colleagues. It forces users to upgrade their perfectly good software and shell out more $$$ to MS just because someone else in the office has a new machine. It locks users into the blessed Windows OS again solely for the sake of compatibility and ease of document exchange. MS will never agree to a default open file format for its applications as it would break their stranglehold on both office productivity software and operating systems, the only two profitable portions of their business. Even the new XML formats that promise self describing data storage will only pay lip service to the critics as they wrap up their proprietary binary formats in easy to read, text tags.
And this time it would be very simple. Once the XML document standard has been settled, the US government needs to mandate that any word processing software used by the government must use the XML open standard, no exceptions. Give the industry one year from the adoption of the standard to implement it in their software. After which, any document processing software which does not conform is automatically excluded from any consideration by the government. No one is forced to open up their proprietary systems. It's their choice. Choice is good, even for arrogant companies like Microsoft.
-- Will program for bandwidth
So if these guys are all for standards...why don't they adopt the XML format used by KOffice? They'd probably want to extend it somewhat, but KOffice seems to Work Here and Now.
Please correct me if I got my facts wrong.
"robust features of .doc"? such as? .doc is a simple dump of the memory state, with little to no internal integrity. Probably one of the *least* robust file formats I know of
People who think they know everything are a great annoyance to those of us who do.
Microsoft XML Architect and W3C XML Standard Co-creator Jean Paoli announces XML integration with "Office 11" on November 14th...
Open Source community (no doubt led/prodded/cajoled/wrangled by Sun's Scott McNealy) tries to upstage W3C's work on XML by producing their own standard on November 20th.
Can you say "wanna-be"?
Also, I think the "editors" of /. should be lynched for turning an honest response from Microsoft into a "we-don't-play-that-no-mo" response. Microsoft NEVER said that they weren't going to work within that working group. CowboyNeal et al. are just a bunch of freakin' gits who love to "sucker-punch" anyone they can.
I think /. should change their background color to "yellow" - because this STINKS of "Yellow Journalism"
ScottKin - looking for CowboyNeal so I can PUMMEL him into consciousness.
I don't give a rat's behind about "karma" here or anywhere else. Don't like what I have to say here? Deal with it!
How many times is this joke going to be posted and modd'ed +5 funny? Yes, we get it! HA!
Forget the whales - save the babies.
I have been using OpenOffice on a consistent basis for the past three months, and I have to say that I like it quite a bit. I know, there are some things to be worked out on version 1.0.1, but I just enjoy being able to create documents without having to worry whether I will be able to send them to others and them not be able to read them in MS Office. Plus, I have both Linux and Win machines, and I can move files between them without having to worry about trying to open them up on the receiving machine.
In the bigger scheme of things, this could be interpreted as another Sun vs. Microsoft battle. MS has been trying to stick it to Sun and Java over Web Services, and this could be Sun's way of responding. Boys, boys, can't you learn to play nice together? The truth is, OASIS has lent OpenOffice some credibility by talking about XML file formats and trying to create a standard using OOo as an example.
Always look on the bright side of life! (whistle, whistle)
XML buys you a few things:
* it can build on other XML standards (e.g. there's an XML spec for address books).
* you don't have to worry about byte ordering (endianness) or the representation of chars/ints
* XML can be converted to other document types (e.g. text, another XML format, RTF, PDF) via XSLT
* XML has a schema which allows validation
* there are several XML based tools out there.
* because it's human readable, it's easy to make quick and dirty manual patches or modifications if they are needed.
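The byte-ordering point in the list above is easy to see in a few lines. This is only an illustrative sketch: the <count> element is a made-up name, and the 32-bit integer stands in for any field a binary format would store.

```python
import struct
import xml.etree.ElementTree as ET

value = 258  # 0x00000102

# The same integer has two different binary encodings depending on byte order.
little = struct.pack("<I", value)
big = struct.pack(">I", value)
print(little == big)

# A reader that guesses the wrong byte order silently gets a different number.
misread = struct.unpack(">I", little)[0]
print(misread == value)

# As XML text there is only one spelling, whatever machine wrote it.
doc = "<count>258</count>"  # hypothetical element name
parsed = int(ET.fromstring(doc).text)
print(parsed == value)
```

The cost, as others in this thread point out, is that the text form is bigger and slower to parse than four packed bytes.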
RTF and PDF are both good binary formats that have a lot of support from tools, but they lack the other advantages of XML. Of course, it should be possible to create a binary version of XML that had all these properties (except human readability), but it'd take years for it to reach the maturity and the XML library code base that XML has and there's no guarantee that it'll be accepted by industry.
XML is more than a glorified HTML. It's a standard format that we've all agreed upon, much like English. Languages like Loglan are a lot better than English, but it doesn't matter. No-one uses Loglan, so there's no use for Loglan outside the small circle of Loglan enthusiasts.
It's no secret that the /. crowd is microsoft-hating all day long (and me too), but flat out lying or manipulation is still not ok - I thought we left that for Bill to do.
Microsoft Just Says No to .Doc Replacement Panel
And from the article:
Microsoft, [snip], has decided to take a "wait and see" approach with the working group, said Simon Marks, product manager for Office. Microsoft is an OASIS member and can join the working group at a future date, he said.
"If this turns out to be something that we feel (is necessary) for customers, we can join, but currently we'll just wait and see," he said.
I also believe this more or less means no, but it doesn't say so! I like to think hackers (I guess most /.ers aren't hackers (by the definition of RMS, anyway)) and such have a critical eye, and won't blindly swallow stuff without actually questioning (unless RMS says so of course :).
You always need some sort of code to work with a given format of XML data.
Of course, but I think you're missing the point. Obviously, any XML interpreter needs to be programmed in order to make sense of a document's structure and content. My main concern is that MS will completely disregard any suggested standards, in the same manner as they treated Java. All this does is add extra work to something that could easily be standardized.
Imagine if I write an XML application that follows the suggested standardized specs. I can read and write XML documents as I please. I might produce something like this:
(xml)
(data)
Blah blah blah.
(/data)
(/xml)
Now someone else is using some XDocs application. As a trivial example, MS programmers might give XDocs the functionality to allow for 'shortcuts' -- why not shorten the data tag, since it's used so frequently? The data tag would be kept, but a second tag is introduced to simplify typing. So the same document might turn into:
(xml)
$Blah blah blah.
(/xml)
Now, my XML app might choke and die trying to read this, although MS's app would (no doubt) be able to read mine.
Sun will end up reading and writing XDocs.
Now we're back to multiple file-formats again, which XML was trying to ease away from in the first place. How is this an improvement?
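A rough Python sketch of that failure mode, assuming a reader that only knows the agreed <data> tag. The <doc> root element and the read_data helper are invented for the illustration; the "$" shortcut is the hypothetical vendor extension from the example above.

```python
import xml.etree.ElementTree as ET

def read_data(document):
    """Return the text of every <data> child, the only tag this reader knows."""
    root = ET.fromstring(document)
    items = [(child.text or "").strip() for child in root.findall("data")]
    if not items:
        # Well-formed XML, but nothing the agreed vocabulary can interpret.
        raise ValueError("no <data> elements found")
    return items

standard = "<doc><data>Blah blah blah.</data></doc>"
extended = "<doc>$Blah blah blah.</doc>"  # the vendor 'shortcut' form

print(read_data(standard))

try:
    read_data(extended)
except ValueError as err:
    print("standard reader gave up:", err)
```

Both documents get through the XML parser without complaint; the breakage is in the vocabulary, which is exactly the part XML itself does not standardize.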
Frankly, all XML really does is explode a file's size by encapsulating data with tags. Whoop de doo. You have to have a rigorous and complete document specification, and while a DTD may fulfill that need it doesn't always. With a rigorous and complete spec though then XML is redundant - you can just as easily parse a binary file at that point.
That's just false. With a rigorous and complete specification for a language you still have to write a parser for that language. But with XML, you use one of the dozen off-the-shelf parsers, including the one that probably ships with your operating system or browser. Guess what, these office documents will probably work _out of the box_ with pre-existing XML browsers (Mozilla, IE 6) and CSS stylesheets. Or at worst, an XSLT could do the transformation on either the client or server side. The virtue of standards is that you can leverage standard tools.
A binary file format would typically need a binary plugin.
I was just trying to think what people would say if MS had participated on the panel, and I think it would be "sabotage." If MS did indeed participate on the panel they would have a chance to undermine the standard that was produced as well as get earlier info about the developing standard to try and circumvent it sooner.
I don't think MS would do this, but I think that there are worse things MS could have done than simply not participate.
"Not knowing when the dawn will come, I open every door." - Emily Dickinson
If my name was William H Gates III, I wouldn't even SUPPORT the new format unless they twisted my arm (and then I'd implement the buggiest, shoddiest parser you'd ever seen for it). If MS Office doesn't support a format, it doesn't exist. Simple as that.
Joe Consumer won't ever know it existed, and my megacorp can continue plodding its way to world hegemony, or Wherever It Wants To Go Today(tm).
Word Perfect used to have a >90% hold on the marketplace. Now no one gives a damn about them (which is a shame as WP was once the best wordprocessor out there).
Word may have a stranglehold on the marketplace right now, but nothing lasts forever. Nothing.
Boobies never hurt anyone. - Sherry Glaser.
That's a cool idea, but unfortunately it will never happen. Have a look at AdBusters. They've got a number of great ads ready to air, but no network will show them because they run against the commercial grain of the rest of the sponsors. Rest assured, the media giants do *not* want to waste all their hard work kissing Microsoft's ass just to throw it away for a few million worth of ad revenue.
My deviantArt site
One of the problems with WYSIWYG markup is that it is visual, everyone likes Word (or whatever) because they can make things Look Right. But this is also its biggest problem, as it removes the structural/semantic information. We've now trained a whole pile of people to believe that what they think looks good must (obviously) look good to anyone. (To see the validity of that, just look at what those attitudes have done to the web. "I like blinkies, so everyone must." Ewww.)
But now the document is non-portable, and in some sense digitally unusable. Hard to index, hard to grab bits of for the next time you need almost that same thing. Indeed, something like the oft vaunted "mail merge" is relatively simple in Scribe, LaTeX, or XML (a shell script and sed), but it tends to be hard in WYSIWYG documents.
Why? Because semantic markup is necessarily domain-centric. A business letter doesn't have the same kind of content as an invoice. Even when they're part of the same communication.
That's a good thing for indexing, cataloging, analyzing and all that.
It's also a good thing for those who need to produce a lot of documents that look a lot alike. Hence document templates (available in any decent word processor).
Even better, using XML allows a nice separation of powers. The person writing the business letter does not need to know what it will look like, and the person defining its look does not need to know its content. Since the writer is not concerned with the look, editing actually gets easier. For example, I often use LaTeX (also HTML and XML increasingly) and emacs and know them both relatively well (and I use both under both Windows and Unix) and when I need to switch to something WYSIWYGish, I tend to get very cranky. "What do you mean, you can't put every sentence on a line by itself?"
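For anyone who hasn't seen how small a text-based mail merge can be, here is a sketch in Python rather than shell and sed. The recipient records, field names and letter wording are all invented for the example; the only point is that the content and the look live in separate places.

```python
from string import Template

# Content: who gets a letter and what they owe. No layout decisions here.
recipients = [
    {"name": "Alice", "amount": "120.00"},
    {"name": "Bob", "amount": "75.50"},
]

# Presentation: the wording and layout of the letter, maintained separately.
letter = Template("Dear ${name},\nYour invoice total is ${amount} USD.\n")

# The merge itself is one substitution per record.
letters = [letter.substitute(record) for record in recipients]
for text in letters:
    print(text)
```

Changing the look means editing the template only; changing the data means editing the records only.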
Now everyone with a grain of sense knows all this (so I apologize for repeating it). Or do they?
Microsoft does. XML based documents are going to be the future, they say. OASIS does (but then they're SGML oriented anyway).
But not everyone does. That secretary down the hall doesn't. And he's going to fight like hell having to do things in a true XML oriented way (show him an xml editor and wait for him to threaten to quit). (Why do you think SGML never caught on?) He doesn't care about saving work - he wants to get paid for his 40 hours. And his boss is going to hear him loud and clear since he sits right outside her door. Even though putting him into that XML re-education camp is very likely to save a whole pile of money in the long run, the noise and screams and the short run cost is going to make it very hard to push in any kind of organization.
Which means we might end up with an XML representation of that WYSIWYG text. This would be a real mess. There is a thing called the "Rainbow DTD" (a quick web search turned up no live copies of this). This was an SGML (it predated XML) markup that essentially represented WYSIWYG markup. So there were elements like "". Yech.
As a proof of concept, a while back, I cobbled together a script that would read this and guess at the user's "meaning" (we were dealing with a relatively small target domain). It worked, but quite badly; to get it to work well would have taken expert system or statistical inference kinds of code. The idea was not supported by my boss, because it would have required iterations and feedback from the original authors to tune the translations. He said, "They like WYSIWYG, let's not bother them." It was clear that it would have worked though, and with tools like XSLT, it would not have been all that hard.
So now I wonder, are the OASIS folks going to do a "rainbow dtd" type thing? Perhaps at a slightly higher level of abstraction? Or will it be a metalanguage for document definition (hey, I thought that was what XML was). And the MS folks, what does their XML look like?
Cuz, one way or another, with XSLT and a bit of hackery, someone will find a way to translate one to the other. And back. The only question left is how hard it will be and how much semantic information will carry across.
(And, yes, I know where and why it doesn't work, but it had to be done at least this far.)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text"/>
  <!-- One template for every element -->
  <xsl:template match="*">
    <xsl:choose>
      <!-- Empty element: emit just its name -->
      <xsl:when test="count(node()) = 0">
        <xsl:value-of select="name()" />
      </xsl:when>
      <!-- Otherwise: ( name children... ) -->
      <xsl:otherwise>
        ( <xsl:value-of select="name()" /> <xsl:apply-templates /> )
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
The transformation in the other direction is left as the traditional exercise for the interested reader.
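For readers without an XSLT processor handy, roughly the same idea can be sketched in Python. This is a loose approximation of the stylesheet above, not a port: it treats whitespace-only text as absent, and the to_sexpr name and the sample letter document are made up for the illustration.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Render an element as its bare name if empty, else as ( name children... )."""
    has_text = bool(elem.text and elem.text.strip())
    if len(elem) == 0 and not has_text:
        return elem.tag
    parts = [elem.tag]
    if has_text:
        parts.append(elem.text.strip())
    for child in elem:
        parts.append(to_sexpr(child))
        # Text that follows a child element still belongs to the parent.
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return "( " + " ".join(parts) + " )"

doc = ET.fromstring("<letter><to>Bob</to><body>Hi <b>there</b></body><end/></letter>")
result = to_sexpr(doc)
print(result)  # ( letter ( to Bob ) ( body Hi ( b there ) ) end )
```

Like the stylesheet, it throws away attributes, so the round trip back to XML is lossy.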
If that's the case then kindly explain to me why anybody uses MS? It's marketing, right? 90% of computer users in the world stick to MS because of marketing, or maybe you'll say the BSOD, or the monopoly status forces users against their will.
Kindly explain to me one thing that a lot of businesses do that you can't do with MS software. If there were a lot of things in this category then people would ditch MS like a dirty whore. Unfortunately for your argument, there aren't many.
Show me an environment that's as rich feature-wise as MS. I have never encountered a problem that couldn't be fixed, and I can very quickly solve most projects.
The era of adding genuinely useful features to productivity software is long past. I defy you to find any company (including Microsoft) where more than 5% of the people use more than 5% of the features in MS-Office. Feature creep in that product is addressing a de minimis minority. Sure, you can do all kinds of clever stuff with VBA - who actually needs to? Very few people.
The one and only time in recent memory I have tangled with VBA was to borrow from a colleague a script which implements a basic feature that MS-Access (2000) is simply missing - save a table as CSV. That's right, it can't do it. It can put it on the clipboard, but as any non-techie who wangs data around using Excel will tell you, the world stops at row 65,536. Lame.
Why do people upgrade from MS-Office 97 to 2000 to XP? Not for features, for one of two reasons - (a) they get a new computer and the old version won't run, or far more commonly, (b) they start receiving too many .DOC files by email that their software won't read. MS not only has the sense to stick with the impenetrable binary format, but to make an incompatible change to the default save format each release to force the upgrade path. Forget XML - the .DOC is the lingua franca of non-techie document exchange. There is a 3-way tie for second place between .PPT, .XLS and those little winmail.dat calendar thingies from Outlook.
I use StarOffice 5.2 for day to day munging of MS-Office files, for which it is fine, and it has come a long way from earlier versions, but it still needs work in the one word processor feature that really matters - handling .DOC. Nowadays it supports even fancy stuff like change tracking, and fonts are mostly there, though it suffers from more "layout creep" than exchanging files from one setup of MS-Word to the next (what a bunch of lameness, making layout depend on the print driver, Word's worst bug IMHO).
ISTR that MS was originally proposing to use XML in Office 2000 when it was first on the drawing board. Some PM pulled that piece of business suicide away right quick.