Microsoft Claims OpenDocument is Too Slow
SirClicksalot writes "Microsoft claims that the OpenDocument Format (ODF) is too slow for easy use. They cite a study carried out by ZDNet.com that compared OpenOffice.org 2.0 with the XML formats in Microsoft Office 2003. This comes after the international standards body ISO approved ODF earlier this month." From the ZDNet article: "'The use of OpenDocument documents is slower to the point of not really being satisfactory,' Alan Yates, the general manager of Microsoft's information worker strategy, told ZDNet UK on Wednesday. 'The Open XML format is designed for performance. XML is fundamentally slower than binary formats so we have made sure that customers won't notice a big difference in performance.'"
But how fast a document opens is one of my last concerns here.
What I didn't see mentioned in this article was the fact that back in March, Microsoft joined a subdivision of INCITS (the V1 Text Processing: Office and Publishing Systems Interface group within the International Committee for Information Technology Standards), which is the group that more or less decides whether or not it should be widely adopted. Being ISO certified is one thing, but it doesn't mean everyone's going to use it as a standard.
There was much speculation that Microsoft had joined INCITS with the intent to slow down or stop the spreading use of ODF and insert their own standard. Sounded like another Microsoft power trip to me.
I predict that Microsoft will bitch and bitch about ODF and then release study after study suggesting some other patent laden format (probably Open XML) over ODF. This is just the first complaint against ODF--too slow. Perhaps next they'll complain that it's not documented well enough, some of their apps just can't support it, it gives their developers arthritis, it looks too ugly, etc.
My work here is dung.
If I was an MS shill (like so many in these forums seem to be), I would be deeply, deeply ashamed that the company I pimped myself out for was incapable of distinguishing between a document format and an application.
(read the 'study')
But I am sure the shills will pipe up with "easier to use", "people are used to it", "no one forces people to use MS" and other such irrelevance.
There are shills on slashdot. Apparently, I'm one of them.
It's not a game loading complex 3D worlds and sound effects, it's a load of text being displayed on screen. What difference does a few milliseconds here or there make? OpenDocument could be ten times slower and the benefits of an open document format would still vastly outweigh the effects of loading time.
"Any performance limitations now will be resolved as Moores Law continues"
Not that I like the argument.
Open Source Drum Kit, LPLC deve board - mjhdesigns.com
Anytime Microsoft complains about OpenDocument, I just remember back to when they were on the Technical Committee at OASIS forming the standard. They then left that committee. If they truly cared about OpenDocument, they would have stayed on the TC and made changes to it.
I see this as an attempt by Microsoft to slander this format and try to further their own semi-OpenXML format.
--
Jason Faulkner
Eastern US Press Contact
OpenDocument Fellowship
Jay | http://oldos.org
I can just see Microsoft's new slogan for Office 12:
"Microsoft, saving your life, one microsecond at a time..."
Since when is a format slow? I could write an interpreter for the MS format that is 3x as slow as the one for ODF. What are they defining as unsatisfactory, and on what kind of documents?
"You will do foolish things, but do them with enthusiasm." - S. G. Colette
You only need to write it to disk when you hit "save." When the document is open, and living in RAM, it doesn't even have to be kept in ODF!
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
...is impossible, due to the therblig frammisating thingumbob.
Well, actually, now that you mention it, a professor and his student did remove it, but you can't call it successful, because um, performance, sure, that's right, in our labs our very own scientific technical unbiased tests showed that because of ferthbernder sprocket-flange snap-toggle linkage, when you removed IE using the professor's techniques, it reduced Windows performance by a lot of percent. No user would accept this, any more than they would accept the reduced performance of Windows on a year-old PC.
We will now show you just how severe this performance problem is.
Right here. In this very courtroom.
With a faked demo^h^h^h^h^h^h^h^h^h^h a dramatic, animated illustration presented right on the screen of an actual PC.
"How to Do Nothing," kids activities, back in print!
If Microsoft are saying that they can't read XML documents efficiently then I guess we have to believe them, but if that's really true it says more about their lack of programming skill than the difference between reading a binary vs text (or XML flavor #1 vs flavor #2) document on a modern processor.
If a Windows-capable PC has enough oomph to render clippy in 3-D translucent splendor for Vista, then it's certainly fast enough to load an XML document.
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.
He had a humongous spreadsheet (a couple hundred megabytes) and was tracking the load time.
He whined about the memory OO takes, and didn't mention that MSOffice pre-loads its stuff on startup, so you are loading MSOffice stuff whether you need it or not.
You mean to tell me that parsing a file at an average of 200k of data is too slow on 1.0+GHz processors?
OPTIMIZE YOUR CODE!
I know that there are many variables here, but seriously... how slow can it be? I use OpenOffice 2.0 on an Athlon64 3200+ and I have no issues, in fact, I find it much quicker than M$ Office
The ODT format is basically a set of XML files packed into a ZIP archive. One of them is "the biggie" (content.xml) and the others support it. Images etc. are saved/packed in subdirs. Now, to open it, OOo apps have to unpack the whole package and parse the XML, keeping all its contents in memory (a presumption, but highly likely). Maybe not a big deal if all you handle is a two-page memo, but keep in mind that OOo's spreadsheet and database(!) programs work the same way. And for something like a 20-30 page spec sheet on a sub-1GHz machine, OOo works noticeably faster when handling DOC format documents than when handling its "native" ODT documents. Saving/autosaving can be a pain too (as you have to dump your whole document to XML and pack it, unlike MSOffice, where the storage formats work like a database).
All in all - OOo's file formats are a nice and simple solution for exchanging reasonably sized documents (if you don't mind usual XML-namespace-hell structure) but for editing/working on larger documents/spreadsheets you may find yourself using MSOffice document formats (from within OOo). Pity they don't provide their own "scratch-pad/database-in-a-file" formats.
So - for once, Microsoft is kinda right here.
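For anyone who hasn't poked inside one of these files, here is a rough Python sketch of the "XML files in a ZIP" layout described above. It builds a toy package in memory and reads the paragraphs back out of content.xml. The helper names `build_toy_package` and `read_paragraphs` are made up for illustration, and the package is deliberately not a complete, valid ODT (no manifest, no styles); it only shows that the whole content.xml gets decompressed and parsed before you can get at any text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by OpenDocument content.xml.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_toy_package(paragraphs):
    """Build a toy ODT-like ZIP in memory (hypothetical helper, not a full ODT)."""
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for p in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = p

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()


def read_paragraphs(package_bytes):
    """Unzip content.xml, parse all of it, and return the paragraph texts."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return [p.text or "" for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    pkg = build_toy_package(["First paragraph", "Second paragraph"])
    print(read_paragraphs(pkg))
```

Whether a real application has to hold everything in memory like this is an implementation choice, not something the sketch proves either way.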
I guess the loyal crowd has already reeled in +5 Insightful mods by railing against MS, but it might not be a bad idea to actually read the article.
Mr. Yates says OpenXML has been designed with performance in mind, whereas ODF is not. A binary format such as .doc definitely has a few speed advantages over an XML format, hence it'd be good to have the replacement XML schema designed for performance.
I wouldn't know if this was actually the case; however, it would be good to investigate if the claims were true. OpenOffice could very well do with a major performance boost. A lean, well-designed XML schema cannot hurt.
I mean, really. Is it such a shock that MS is trying to damage the reputation of a rival format? Actually, they're talking more about OpenOffice as an application rather than the ODF format, which is a very dishonest bit of FUD. I'm sure there will be more propaganda against ODF from the company we love to hate in the near future.
Perhaps next they'll claim that ODF is so slow that it's causing Vista to be late to market.
Transistors and Beer!!
This brings to mind something that Microsoft did in the mid 1990s. When MS Word was trying to wrest market share from WordPerfect, Microsoft apparently coded speed bumps into Windows that only their programmers knew how to avoid. Microsoft then claimed that MS applications were "better" because they were faster, though we didn't understand that it was because of intentional handicapping of their rivals' software until they'd pretty much crushed WordPerfect in the market.
It kind of makes me wonder if they'll try the same approach to make ODF look "slower," by optimizing MS apps to work with Open XML and fumble around with ODF files.
TLR
A man no more knows his destiny than a tea leaf knows the history of the East India Company
Oh noes! That document took 5.3 seconds to load and 10.2 seconds to save! Sure, I've been working on this document for 20 hours straight, but that's a LONG time to wait!!!
It's been a long time.
Actually the problem is not binary versus non-binary; it's fixed-length versus variable-length fields and records.
With old-style formats, you knew that the header was 512 bytes followed by 600 bytes of metadata, followed by the document sections, which all indicate their size (or have some way of calculating it based upon the block type).
With XML, you get a tag opening and have to parse until the closure; this adds a lot to the complexity of reading.
Writing is slightly different, and should in fact be simpler with XML even though it may be more verbose: you don't need to buffer the entire block or rewrite the section header to indicate the length, you just happily do a sequential write.
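To make the fixed-length versus variable-length point concrete, here is a small Python sketch. The binary layout (a 4-byte magic, a record count, then 8-byte records) and all function names are invented for this example; they have nothing to do with any real Office format. With fixed-size records the reader can compute an offset and jump straight to record N, while the XML reader has to parse its way through the document first. On the writing side, the XML version is a plain sequential append with no lengths to know up front.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical layout: 4-byte magic + uint32 record count, then fixed
# 8-byte records of two little-endian uint32 values each.
HEADER = struct.Struct("<4sI")
RECORD = struct.Struct("<II")


def write_binary(pairs):
    """Serialize pairs into the toy fixed-length layout."""
    out = [HEADER.pack(b"TOY1", len(pairs))]
    out += [RECORD.pack(a, b) for a, b in pairs]
    return b"".join(out)


def read_record(blob, index):
    """Jump straight to record `index` by computing its byte offset."""
    offset = HEADER.size + index * RECORD.size
    return RECORD.unpack_from(blob, offset)


def write_xml(pairs):
    """Sequential write: no length has to be known or patched in afterwards."""
    parts = ["<recs>"]
    for a, b in pairs:
        parts.append(f'<r a="{a}" b="{b}"/>')
    parts.append("</recs>")
    return "".join(parts)


def read_record_xml(text, index):
    """The whole document is parsed before record `index` can be located."""
    for i, el in enumerate(ET.fromstring(text)):
        if i == index:
            return int(el.get("a")), int(el.get("b"))
    raise IndexError(index)


if __name__ == "__main__":
    pairs = [(1, 2), (3, 4), (5, 6)]
    print(read_record(write_binary(pairs), 2))
    print(read_record_xml(write_xml(pairs), 2))
```

This says nothing about how big the difference is in practice; it only illustrates why random access is cheap in one layout and not in the other.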
liqbase
In fact, until this very day I didn't even realize that performance was even in Microsoft's dictionary, and like so many other words Microsoft uses I don't think it means entirely what they think it means. Newsflash, Microsoft, "innovation" does not mean "steal other people's ideas." "Security" does not mean "It'll be taken over before you can download the first update for it." And "performance" doesn't mean "the entire fucking system stops for 30 seconds when some application decides to stop handling its windows controls." Now STFU and go back to pushing your poison kool-aid on unsuspecting consumers before Apple eats your lunch.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
The OpenOffice implementation might be a little slow. In my opinion this is probably due to the cross-platform nature of OpenOffice itself, or it might just be slow.
The ZDNet article wasn't comparing formats, it was comparing OO.o to MS Office 2003. If they really wanted to do it right, they would add Abiword and K Office.
In my limited, subjective testing, the new version of K Office is much faster than either OO.o or MS Office at reading in documents.
-Matt
+5 Insightful? Oh PLEASE!
ODT XML files are binary files. So are old Word 2003 .doc files. So are Microsoft's new XML files. So it's pointless to claim that a "binary" file format is faster than an XML file format.
When people say "binary files" they mean this as opposed to "text files", a separation that stems from the ability to open a file in "binary" or "text" mode in several APIs. It has to do with, amongst other things, the interpretation of control codes such as ^Z.
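A quick Python illustration of that binary-mode versus text-mode distinction, for what it's worth. The function name `read_both_ways` is made up. It shows only the newline-translation side of text mode as Python implements it, not the control-code handling mentioned above: the bytes on disk are identical in both cases, and only the API's interpretation differs.

```python
import os
import tempfile


def read_both_ways(raw):
    """Write raw bytes to a temp file, then read them back in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Binary mode: bytes come back exactly as stored.
        with open(path, "rb") as f:
            as_bytes = f.read()
        # Text mode: bytes are decoded and line endings are translated on read.
        with open(path, "r", encoding="ascii") as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


if __name__ == "__main__":
    print(read_both_ways(b"line one\r\nline two\r\n"))
```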
The other big mistake: file formats aren't fast or slow. The algorithms for reading and writing them are (or aren't) slow.
*slaps cheek* NO WAI!
You fail to see the point of what they're saying. They're saying a binary file, with a header and fixed data structures, is a lot easier to read and parse than an XML file, which consists of structures of variable length, needs to be interpreted, etc. etc. etc. This is a problem with XML.
I'm Rocco. I'm the +5 Funny man.
Everything that OpenOffice needs must be included with the application and loaded when the application loads. The opposite is true with Office. The majority of the application is already in the operating system. This is why OpenOffice is cross platform and Office is not.
OpenOffice has its own fonts and font engine, though it can utilize others. Office uses the OS's font engine but adds fonts to the OS during installation.
OpenOffice has its own engine to place, draw, clip... windows/forms. Office uses the OS's.
OpenOffice has its own database engine, though it can use several others. Office uses jet which is part of the OS.
The list goes on...
If the file format was supposed to be tested for performance then they should have used the two different formats with the same application.
Having to work for a living is the root of all evil.
http://www.groklaw.net/article.php?story=20051125144611543
Using a text editor, would you rather try to fix a bug in an ODF or an MS XML file?
MS XML
<w:p>
  <w:r>
    <w:t>This is a </w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:b/>
    </w:rPr>
    <w:t>very basic</w:t>
  </w:r>
  <w:r>
    <w:t> document </w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:i/>
    </w:rPr>
    <w:t>with some</w:t>
  </w:r>
  <w:r>
    <w:t> formatting, and a </w:t>
  </w:r>
  <w:hyperlink w:rel="rId4" w:history="1">
    <w:r>
      <w:rPr>
        <w:rStyle w:val="Hyperlink"/>
      </w:rPr>
      <w:t>hyperlink</w:t>
    </w:r>
  </w:hyperlink>
</w:p>
OpenDocument
<text:p text:style-name="Standard">
This is a <text:span text:style-name="T1">
very basic</text:span> document <text:span
text:style-name="T2"> with some </text:span>
formatting, and a <text:a xlink:type="simple"
xlink:href="http://example.com">hyperlink
</text:a>
</text:p>
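As a side note on how tool-friendly the OpenDocument fragment is, here is a hedged Python sketch that pulls the plain text and the link target out of it with nothing but a stock XML parser. The `<root>` wrapper and its namespace declarations are added only so the fragment parses standalone, the whitespace inside the fragment has been normalised by hand, and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

TEXT = "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}"
XLINK = "{http://www.w3.org/1999/xlink}"

# The paragraph from above, wrapped in a dummy <root> (an assumption of this
# sketch) that declares the two namespace prefixes it uses.
FRAGMENT = (
    '<root xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<text:p text:style-name="Standard">This is a '
    '<text:span text:style-name="T1">very basic</text:span> document '
    '<text:span text:style-name="T2">with some</text:span> formatting, and a '
    '<text:a xlink:type="simple" xlink:href="http://example.com">hyperlink</text:a>'
    '</text:p></root>'
)


def plain_text(fragment):
    """Concatenate all text inside the first paragraph, ignoring styling."""
    para = ET.fromstring(fragment).find(f"{TEXT}p")
    return "".join(para.itertext())


def links(fragment):
    """Collect the href of every text:a element."""
    root = ET.fromstring(fragment)
    return [a.get(f"{XLINK}href") for a in root.iter(f"{TEXT}a")]


if __name__ == "__main__":
    print(plain_text(FRAGMENT))
    print(links(FRAGMENT))
```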
---
There is something true in that study, indeed.
Personally, I have already seen numbers like these, even though I've never bothered to measure them myself.
Why? Simply put, because it matters very little.
Compared to Windows 3.11, Windows XP needs 100 times more disk space, 10 times more RAM and 10 times more time to boot.
Compared to MS Word 5.5, MS Word 2003 is slower and bigger.
Today I wouldn't revert back to Windows 3.11 and would not choose Word 5.5. What'd be the most important features expected in a document file format? In my opinion:
1. compactness
2. openness
3. flexibility
No "access performance", though.
Because the time needed to load a document, when you do real office work, weighs far less than the time you spend working on it.
And when someone sends you a file written with a different version of the software or even with a different software, how much time do you spend to make that file readable and printable?
Maybe Computers will never be as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]
Here is a fast new algorithm to compress XML in such a way that browsing and searching the tree can be done without uncompressing it. This should make Word definitely faster when handling ODF. I really think Microsoft should start implementing some of this stuff instead of whining and complaining.
My first program:
Hell Segmentation fault
This reminded me of this paper, "The Psychology of Learning". In it the writer describes the act of people who don't want to learn new things: "As long as everybody around them use tools, techniques, and methods that they themselves know, they can count on outperforming these other people. But when the people around them start learning different, perhaps better, ways, they must defend themselves. Other people having other knowledge might require learning to keep up with performance, and learning, as we pointed out, increases the risk of failure. One possibility for these people is to discredit other people's knowledge. If done well, it would eliminate the need for the extra effort to learn, which would fit very well with their objectives."
This issue is about Microsoft defending their turf rather than not wanting to learn something new. But it's basically the same motive at work: find ways to undermine the new to benefit the old.
It goes on, "This model of learning also explains other surprising behavior that I frequently observe. I have seen novices in software development with knowledge of a single programming language explain to experienced expert developers why their choice of programming language was a particularly bad one. In one case, I talked to a student of computer science who told me why a particular programming language was bad. In fact he told me it was so bad that he had moved to a different university in order to avoid courses that used that particular language. When asked, he admitted he had never written a single program in that language. He simply did not know what he was talking about. And he was willing to fight for it. With respect to programming languages, negative opinions about a language that a person does not know, are usually based on very superficial aspects of it. To people obsessed with performance lack of such in a programming language is a favorite reason to advocate its eradication (even though performance is not a quality of a language, but of a particular implementation)."
The positive lesson to take away from this is that MS is undoing itself. It's turning to cheap, nasty, suit-driven mentalities to defend its turf, unlike the old days when it would just go out and write something new and nasty. It's become an unwieldy beast. I read about the Vista delays yesterday and briefly thought "Will anyone notice - who uses Windows these days". To an extent it shows what a bubble I live in. But it's true - *all* of my regular contacts use linux, freebsd or mac os x. As they should. After all - friends don't let friends use Windows.
Believe with me, my saplings.
I read this article a couple of months ago that compares M$ XML and OpenDocument:
http://www.groklaw.net/article.php?story=20051125144611543
For those who are too lazy to read, here's a brief summary of the main differences between the two formats:
- m$ tags are 2-3 letters long and not readable
- m$ format looks more like a dump of the binary structure, and makes no attempt to separate content and style
The author was already feeling the size argument coming for the m$ format, which is nonsense because both formats are compressed anyway and XML should be readable... but somehow, he was not expecting the "speed" issue.
Come on. If you wish something "efficient", use a binary format. If you start having a textual XML + compression, then obviously speed is not your concern. What's your concern then? Readability, processing by third party tools. In that case separation of content and style is more important. Who cares that "stuff" is written in Helvetica 12 black. I personally prefer to know it's a "title". And so on..
As for the speed, on today's computers, which are virtually 1000x faster than required for typesetting a document, this is laughable. In addition, for large documents, I know many "word" addicts who split documents into 100-page portions or so, because they become impossible to handle...
What I think about m$ XML is that, well, it's not that bad. Even though not really "open", it's still better than before. But come on. This was done in a "rush", to fight back against the OpenDocument initiative. And in that case, dumbly dumping the "internal binary structure" into an XML document made more sense for them. There's nearly no development cost involved (no research whatsoever) and it could be implemented very quickly.
Then Yates comes and talks about "customer experience" (cf. the ZDNet article)... This is laughable.
Regarding "customer experience", when will word support a real vector image format (no WMF crap please). like let's say EPS/PS/PDF... ? I personally hate having to make a raster of my images and make the word document explode in size (when i'm FORCED to use word).
2030?
Or rather - partly true. I don't know whether they have specs/docs, but I assume they do - incomplete ones. But yes, at least in Word 2, the file was essentially a memory dump, and later doc "formats" at least fill large parts of the file with binary dumps straight from memory.
The compatibility issues of course arise when you have a completely different memory layout in a later version, and basically need to replicate the one from the previous version (bug for bug) to load older files. It's insane.
"When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
Because "free" still means more to me than an additional 1.7 seconds.
------ The best brain training is now totally free : )
If it were technically true, so what?
Why the hell does a text editor need to block the UI while writing to disk?
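It doesn't have to, at least in principle. Here is a minimal Python sketch of saving off the UI thread: take a snapshot of the text, write it to a temporary file on a worker thread, then swap the finished file into place. The function name `save_in_background` and the `.tmp` suffix are assumptions of this example, and a real editor would have to snapshot its own buffer structure and report errors back to the user, which this sketch skips.

```python
import os
import threading


def save_in_background(text, path):
    """Write a snapshot of `text` to `path` on a worker thread; return the thread."""
    snapshot = str(text)  # Python strings are immutable, so this is a safe snapshot

    def worker():
        tmp = path + ".tmp"  # hypothetical temp-file naming
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(snapshot)
        os.replace(tmp, path)  # swap the finished file into place

    t = threading.Thread(target=worker)
    t.start()
    return t  # the caller (the "UI") keeps running; join() only when it must wait


if __name__ == "__main__":
    thread = save_in_background("some document text", "example_doc.txt")
    print("UI still responsive while saving...")
    thread.join()
```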
Were that I say, pancakes?
At Tech-Ed 2005 here in Brazil I saw one of the MS evangelists showing a table comparing speeds for MS Office (I don't remember the version) and OpenOffice, showing differences of 20x or more...
I use both office suites at work and at home, and the speed difference is on the order of 2x at most for the first loading of the program and almost no difference after that (anything below 1 second is just "fast enough" for me). And my computer is rather outdated.
I think MS Office is fair software, not worth the price, which is really expensive in Brazil, but they don't need to lie this way to sell it...
MS did this right again.
They deliberately confuse the application with the file format.
Psychologically reinforcing the perception that everything in a computer is vertically oriented and "incompatible" unless it comes from our application.
They understand the immense threat that a viable alternative (file format in this case) presents. PHB gets idea, "If this is interoperable, gee I wonder what else is?"
Beautiful.
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
I seem to remember a rather depressing benchmark with respect to how fast OOo was able to save and re-open a large spreadsheet, and how much memory was required to do so. The results were not pretty, and would have definitely qualified as something that goes into the "must improve asap" category. I use primarily open source apps, but I have to admit that this performance benchmark was a little disappointing. Here's a link to a related ZDNet article: http://blogs.zdnet.com/Ou/?p=119
This stuff doesn't even make sense.
OpenOffice uses ODF. Office uses binary formats. The performance analysis quoted doesn't compare ODF and OpenXML. It states right in the article:
Here is a comparison with the standard 16-sheet SXC and XML sample file I've been using. The sample is in compressed XML format because it is smaller and easier for you to download. You'll have to convert the XML file to XLS and the SXC file to ODS to run the following test yourself.
XLS is a binary format. This study is irrelevant to the statements made. And it's the only data given to substantiate the claims made. So there is no data given at all.
All you can conclude from this is that OpenOffice 2.0, retrofitted recently for ODF, is much slower in a windows environment than Office 2003 using binary file formats. A far cry from any statements made either by Yates or by the summary.
What a pile of crap journalism.
-1 Uncomfortable Truth
"We'd support it but it's too slow"
:(
This means they'll cut off Vista support?