OpenOffice Bloated?
cygnusx writes "ZDNet's George Ou has been writing a series of posts about Open Office bloat. Includes some interesting system usage comparisons" From the article: "Even when dealing with what is essentially the same data, OpenOffice Calc uses up 211 MBs of private unsharable memory while Excel uses up 34 MBs of private unsharable memory. The fact that OpenOffice.org Calc takes about 100 times the CPU time explains the kind of drastic results we were getting where Excel could open a file in 2 seconds while Calc would take almost 3 minutes. Most of that massive speed difference is due to XML being very processor intensive, but Microsoft still handles its own XML files about 7 times faster than OpenOffice.org handles OpenDocument ODS format and uses far less memory than OpenOffice.org."
Consider that Intel owns a big chunk of CNET and then you see a possible conflict of interest brewing over an article possibly designed to sink Open Office. Now consider the author, George Ou, who has also posted such titles as "Is the Honeymoon with Firefox Over?"
Seeing a bit of a pattern forming.
The dangers of knowledge trigger emotional distress in human beings.
Could it be the GUI? Excel uses native widgets and I'm sure is heavily optimized towards MFC (after all, it's their API!). I don't think OO has that luxury. I doubt that's the entire issue but it could partially explain it.
It's interesting today to see the bloat and memory hog complaints leveled against the non-Microsoft product while showing MS' version as lean and mean.
I can't defend the numbers, they do look huge, but we're seeing about one or two articles a week in the trade rags about the latest memory, cpu, cache, etc. advances. Technological advances render all but the most dramatic processing demands almost moot.
Unfortunately, the numbers and benchmarks in this article are one of the more dramatic instances. I'm always willing to wait a little longer to open an application or a file if other factors offset it. In this case, free vs. whatever Office goes for now is typically enough of an offset, but maybe not for a large company where that extra "time" and computer resources add up big, and the pricing is likely to be more heavily discounted through volume licensing.
Interesting numbers on the two different speeds on processing XML. Does anyone know or conjecture the difference in the true internal XML data for the comparison? I thought OpenOffice was the more pure in the sense that it is true human readable data in the XML while Microsoft's format is more of an envelope architecture for binary proprietary Office payloads. And, I wonder what the specifics in this test were around that.
Bottom line for me: I'm still going with OpenOffice, I've been a fan for years.
I don't use Windows and haven't since '98. At one point, I ran Linux, but kept a dual boot system with Windows, just for opening complex Word documents. Then, I started using Crossover and that saved me a lot of time and I eventually wiped Windows off my box for good.
Now I got into OS X, and I run MS Office on it. I must say though, without bias, that MS Office has to be their greatest product. It just works and I haven't ever had any issues with it at all. It is fast, user friendly, stable and usable. Let's face it: when coders code a word processor they will always look at MS Office for implementation ideas. On the Powerbook, MS Office just flies.
A few weeks ago, I tried to run Openoffice on my Debian box, and there was a huge performance decrease, when compared to running MS Office. It was certainly noticeable. It took a while for a document to open up.
Though, Office has been around for a long time and Openoffice hasn't, so I'm sure there will be lots of features and performance gains in the coming years for the latter. I'm definitely going to keep an eye on Openoffice.
I'd chance my arm and say a fair bit.
I made the mistake of opting for x86-64 Gentoo for one of my desktop boxes ("upgrading" it to 32bit this weekend), meaning I have to use the 32bit precompiled OpenOffice binaries. But these need hooking into a 32bit JRE which x86-64 Gentoo doesn't have, since making 32bit apps available through Portage is seemingly something that Gentoo Won't Do Because You Should Be Happy With 64bit. So whenever you start OOo it spends about a minute looking for a JVM (and failing) before you can do anything. I could have manually installed Sun's 32bit JRE, but I can't be bothered.
Disable Java in the options and it starts in 1-2 seconds on the same machine.
By way of comparison, I tried the same trick on my 32bit box (similar spec but with slower HDDs) and OOo was as snappy as hell and opened like the proverbial soil off a shovel.
If there's any functionality I miss through disabling Java, I haven't encountered any yet. And please note I'm not saying that Java is slow to execute (it isn't), it's just appallingly slow to load.
Moderation Total: -1 Troll, +3 Goat
With the exception of Outlook, Office 2003 has never crashed on me, even when handling huge files. On the other hand, when we evaluated Open Office, we couldn't get it to stay up for more than 1/2 hour, and when it did work it was unacceptably slow.
Well, according to the Misco catalogue I received this morning MS Office standard costs £300.
At my local computer shop, RAM costs £75/GB, so I could have 4GB of RAM for my machine.
On a price performance comparison MS Office uses 7MB and OO.org uses -3960MB.
Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. (Einstein)
I've tried to use Open Office on my machine at home (dual-P3 800 MHz, 1 GB RAM) and have always gone back to KOffice. OO has always felt "bloated" to me. It takes much too long to start up, and everything seems to slow down a little on my machine.
On the opposite end of the spectrum, Abiword and Gnumeric load very fast and seem to fly during use. KOffice is a touch slower than Abiword/Gnumeric but still light years ahead of Open Office. It also has a very snappy feel to it. Abiword works on Windows, Mac and Linux. Yes, I know, this doesn't address databases or presentation software.
IMHO, there should be no question mark, but more of an exclamation point.
-Charles
Learning HOW to think is more important than learning WHAT to think.
If I remember correctly, after compiling OO2, it ran very well and fast. The precompiled bins were def slower. My guess is that these tests were run on a Windows machine. So just switch to nix and compile, it's that easy.
1. It is not fair to compare based on file size. Not only are OOo files compressed, but different data that is the same size uncompressed can have drastically different processing times. Think of the difference between one page full of vector graphics, tables and a little text, and 3 or 4 pages of text.
2. It is a known problem that OOo takes a while to start. StarOffice (at the point when Sun bought it) was made by a German company. Most of the internal functions are named in German, and use abbreviations that are not obvious. The fact is that each version of OOo has been getting smaller and faster. OOo 2.0 is the same. If you run OOo 1.1.4 and OOo 2.0 side by side on Windows, the 2.0 version uses about 10MB less memory when both have nothing open.
3. Since it uses more memory, it has a higher chance of being swapped out when you switch to another program for a while. A good way to see this in a short period of time is to run a torrent in the background (seeding or just downloading). Leave an OOo window open and use another program for 20 or more minutes. When you switch back to OOo it can take 10-40 seconds (depending mostly on the speed of your hard drive and amount of memory available) for the window to redraw.
If you are using OOo often enough to keep it in memory it is very snappy. But if it gets swapped out, then you will notice a speed degradation.
4. In my experience with small files (fewer than 200 records in a spreadsheet and 1 - 4 page documents) OOo takes longer to open and save files. I usually work with .csv, .xls, .doc, and of course .odt and .ods files.
There: Something at a specific location.
Their: Owned by someone.
Please make sure your English compiles.
One common cause for this discrepancy is that Windows does pre-caching and pre-binding for commonly used applications. When you first install Firefox or OO, it will be slower, but if you don't use IE or Office for 6 months, while you use the alternatives regularly, the Microsoft apps will be slower after a while. IE takes *forever* to load on my laptop on the rare (once or twice a year) occasions I fire it up.
A few years back, there were some articles showing that Linux was significantly slower than Windows on 4-CPU systems. At first, there was some questioning of the results from the Open Source community. Once the results had been verified, the Linux kernel developers set about to remedy the situation. They were quite successful, and Linux has beaten Windows in every such test since.
Software sucks. Open Source sucks less.
I don't like Microsoft. I don't like Windows. I do, however, like Office. It's been a good office suite for a very long time. It's been very easy to use since I first started playing with Office 4.2. If Microsoft would actually release a version of MS-Office for Linux then I would probably purchase it.
Before everyone starts ranting about how this isn't good for GPL, or how I'm being bad by saying this, remember, the point of the GNU OS is for application developers to have a level playing field. Microsoft, like any other consumer software maker, would be just as correct to participate in that kind of market as anyone else.
I use Open Office, but I don't agree that it's the best productivity suite. It is the best free productivity suite for Linux at the moment. Since Microsoft's product will always cost money, Open Office undoubtedly will remain the best free productivity suite; it will serve as a baseline. If vendors wish to make a commercial product that is better than Open Office and charge for that product it's their right to do so.
Do not look into laser with remaining eye.
I'm not sure how feasible it is to profile such a large program, but I'm sure Microsoft profiles the daylights out of their stuff. Do OOo developers profile things like the start-up time? After all, you can't start optimizing things unless you figure out exactly what is slowing it down. Is it the Java run-time engine? Is it because it needs to load a lot of libraries that MS Office does not need to (because of dynamic linking to Microsoft DLLs)? Maybe when loading certain data sets, the program goes into a pathological state, creating hundreds of thousands of small objects? I don't know.
But things like analyzing profiling data and then optimizing are not fun to most people. Even more so if it means that an algorithm needs to be re-written. After all, if the "open file" operation needs a complete re-think + re-write, who's going to do it? It's not "fun". After all, the "open file" operation already exists. Generally, I think programmers like to build *new* things as opposed to fixing old things. And in this case, it's not even a matter of "fixing". It's a matter of rewriting. I presume that at Microsoft, if Word's "open file" operation (run with me on this for a minute) is uber-slow, then somebody is going to *have* to fix it, or not get a good performance review/etc. However, in the case of OOo if no one makes it faster, well, it does not negatively affect the person who wrote the slow version in the first place (not to discredit OOo authors or anything. They've done a phenomenal job given that they do this for fun and not profit).
Of course, there are an equal number of programmers who like to fix security holes and so forth, but patching a security hole is one thing, while re-writing major algorithms in a large program is another. There are of course some programmers who love optimizing code (Michael Abrash?). But I think they are few and far between. Very often, once something works, an attitude sets in that "It's working. Now don't break it". And optimization in its early stages will often break things.
I was wondering how much of the RAM footprint difference was due to Office relying on Windows code. So just for the fun of it I fired up Excel on my Mac. 22.94 MB of real memory being used for Excel, 34.14 for Word. Compare that with 7.10 and 9.81 for Excel and Word on Windows and 37.54 and 37.66 for Calc and Write on Windows. Anyone running OpenOffice on a Mac want to add another data point where MS doesn't have code "hidden" in the OS?
Here's why more people don't use OpenOffice:
1. THEY'VE NEVER HEARD OF IT. Most people don't know jack squat about computers or programs. They use what everyone else does or what they've seen elsewhere. That would be MS Office because that's what they have at school or work. They don't know that there are any other office suites even out there.
2. If they do know about OpenOffice, they don't like it because when they tried it, the commands were in slightly different places on the menus than in the version of Office they use at work. This made it "too hard to use" because they had to re-learn a few locations of functions. (Interestingly enough, most people I know HATED Office 2003 when it first came out because the commands and menus were a little different than in Office 2000. They said it was impossible to use! Same thing for Windows 98 users that went to XP.)
3. They opened up the most heavily-formatted Office 2003 document they could find, with lots of macros and such. It didn't open up quite right in OpenOffice, so they concluded that it was junk, never minding that Office 2000 or XP would have barfed on it worse. I used my Linux-running computer to display a read-only PPT 2003 presentation off of a USB stick after the presenter's computer crashed. He was SO pissed that one hyperlink didn't work right (it linked to a nonexistent file on the "E:/" drive, but the rest of the presentation was *perfect*). So he used somebody else's computer with Office 2000. The text boxes were all over the place and his background was gone when they displayed it...
4. You can get MS Office for free from peer-to-peer or by sharing an original disc.
Just "gittin-r-done," day after day.
Just go ahead and admit it, they both suck for different reasons. We need a third player.
Patience, young padawan. So far the biggest problems with OO have been the lacking features compared to M$ Office as well as interoperability with M$ Office. We're obviously getting somewhere now that people start benchmarking and complaining about memory usage. Seriously, five years ago no one would've even bothered to check memory usage when comparing those products; there wasn't much to compare.
For the record, I'm not saying OO ain't bloated, so it seems, and perhaps there's been too much pressure to reach interoperability and feature richness, but it's too early to condemn it. Time will tell whether their internal design is good or not. Can it be made faster/leaner/meaner without too much sweat and tears...
1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW
It would be interesting to know if someone has investigated using the symbol hiding capabilities in the newer versions of gcc to eliminate some of the shared object related bloat that most probably afflicts OpenOffice. When you use shared objects for everything, every function name gets put in the dynamic symbol table by default. The only ones that actually need to be there are the ones called from the main program and other shared objects. All of the functions and global data that are only referenced by other code in the same shared object don't need to be in the dynamic symbol table or linked at run-time. Windows has used explicit exporting of symbols from the dawn of time; you can explicitly hide or export symbols in newer versions of gcc, 3.4 in particular. I think KDE takes advantage of it on gcc 3.4 compiles.
/opt/OpenOffice.org/program/*.so
You can look at the dynamic symbols that ARE loaded when the shared object loads with something like:
objdump -T
The bloat is especially acute in C++ code because the mangled function names can be quite long.
All those symbol names are loaded and scanned to link the shared objects at run time. That causes slowness at startup, which OpenOffice certainly has, and you take a big memory hit for stuff that is not useful code.
Manually keeping track of which symbols need to be exported and which do not is a pain, and is a pain in Windows DLLs. You would almost be better off on something as big as OpenOffice to write scripts to process objdump output and figure out which symbols are actually called outside the shared object and need to be in the dynamic symbol table.
On the other hand, it's kind of good discipline to create a clean API for each shared object which defines the public interface to the shared object. It helps improve modularity, reusability, testability and discipline in general, and it eliminates bloat when you realize that nothing is actually calling dead code.
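For what it's worth, the "script to process objdump output" idea could look roughly like the Python sketch below. It is only an illustration under some assumptions: it expects plain `objdump -T` output (no `-C` demangling, so each symbol name is a single token), it treats rows whose section is `*UND*` as imports, and the function names and the sample tables are made up for the example, not taken from a real OpenOffice build. It also cannot see symbols that are looked up by name at run time via dlopen/dlsym, so the result is a list of candidates to review, not a list that is safe to hide blindly.

```python
import string
import subprocess
from typing import Dict, Set, Tuple

# (symbols the object defines, symbols it imports from elsewhere)
SymbolTable = Tuple[Set[str], Set[str]]


def _is_hex(token: str) -> bool:
    """True if the token looks like the address column of a symbol row."""
    return bool(token) and all(c in string.hexdigits for c in token)


def parse_dynamic_symbols(objdump_text: str) -> SymbolTable:
    """Split `objdump -T` text into defined and undefined symbol names."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    for line in objdump_text.splitlines():
        fields = line.split()
        # Symbol rows start with a hex address; skip headers and blank lines.
        if len(fields) < 4 or not _is_hex(fields[0]):
            continue
        name = fields[-1]  # the symbol name is the last column
        if "*UND*" in fields:
            undefined.add(name)
        else:
            defined.add(name)
    return defined, undefined


def hide_candidates(tables: Dict[str, SymbolTable]) -> Dict[str, Set[str]]:
    """For each object, list defined symbols that no other object imports."""
    result: Dict[str, Set[str]] = {}
    for obj, (defined, _) in tables.items():
        wanted_elsewhere: Set[str] = set()
        for other, (_, undefined) in tables.items():
            if other != obj:
                wanted_elsewhere |= undefined
        result[obj] = defined - wanted_elsewhere
    return result


def dump_dynamic_symbols(path: str) -> str:
    """Run `objdump -T` on one file (needs binutils on the PATH)."""
    done = subprocess.run(["objdump", "-T", path],
                          check=True, capture_output=True, text=True)
    return done.stdout


# Hand-written sample rows in the shape objdump -T prints (illustrative only).
SAMPLE_LIBA = """
liba.so:     file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 printf
0000000000001120 g    DF .text  0000000000000017  Base        _Z6publicv
0000000000001140 g    DF .text  0000000000000017  Base        _Z8internalv
"""

SAMPLE_APP = """
DYNAMIC SYMBOL TABLE:
0000000000000000      DF *UND*  0000000000000000              _Z6publicv
0000000000001100 g    DF .text  0000000000000020  Base        _Z4mainv
"""

SAMPLE_TABLES = {
    "liba.so": parse_dynamic_symbols(SAMPLE_LIBA),
    "app": parse_dynamic_symbols(SAMPLE_APP),
}
SAMPLE_CANDIDATES = hide_candidates(SAMPLE_TABLES)
```

On a real tree you would build the table by calling `dump_dynamic_symbols` on the main program and on every shared object it loads, then feed the parsed results to `hide_candidates`. Leaving the main program out would make every symbol it calls look unused.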
@de_machina
I think he's referring more to Windows' trait of moving the data for the most commonly used programs to defragmented sectors on the outer edge of the hard disk platter. The quickstarter may pre-load parts into memory, but it doesn't improve disk performance.
If you don't know where you are going, you will wind up somewhere else.