Why OpenOffice.org? Open Document Formats
Jem Berkes writes "In this current article about OpenOffice.org (also covered at Linux Today), I try to make a point about OpenOffice's commitment to open document formats and interchange as the strongest selling point - never mind cost. The OOo developers are putting a lot of effort into their XML format; will this pay off, and will users notice the significance of OpenDocument/OASIS document formats?" This can't be said enough: file formats are what determine whether and how easily data is portable, or whether the user is just stuck.
Speaking of superior file formats, has anyone else noticed just how much smaller OOo files are than the comparable MS Office documents? I routinely have to export files to MSO formats for peer review, and I have always marvelled at the amount of space a .doc takes by comparison.
+++++++
"Look, dear, it's a crazy hairy scary man!"
This is great news. I use OpenOffice in my small town law practice, and I'm so happy to be liberated from the tyranny of proprietary licensing fees. Lack of compatibility between software packages (office, accounting, case mgmt., etc.) is an even bigger problem for law offices in rural areas, like mine, that want to explore open source but lack support services.
I'm learning --- ever so slowly --- more about Linux and Samba so I can complete the office transformation some day. It's hard to find patient teachers, and tech understanding comes slowly to some of us. It's worth the effort though.
I'm laughing at clouds.
If they don't need to edit the file, why not save it as PDF?
The main one that most people overlook is the ability to edit a section of a document and only have that section change. With binary files, like MS Word, if someone opens it up and makes one small change, then the whole file gets changed. This difference comes into play when you start considering the ability to diff files, and to use these diffs for applications such as LBFS (low bandwidth file system), or log based file systems. There is a lot of technology out there that could lead to great improvements on network/disk usage if non-binary filetypes are adopted more regularly. Currently you can only use text based files in these systems. Imagine if you could use CVS with binary files (and actually harvest the benefits of using such a system). Just my 2 cents though.
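To illustrate the point about diffs, here is a minimal sketch in Python. It assumes the document has already been unpacked to line-oriented XML text; the helper name `changed_lines` and the sample markup are made up for the example and are not taken from any real OOo file.

```python
import difflib
from itertools import islice

def changed_lines(old_text, new_text):
    """Return only the added/removed lines between two text versions."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), lineterm="", n=0
    )
    # Skip the two file-header lines ("---" and "+++"), then keep only
    # the lines that were actually removed ("-") or added ("+").
    return [line for line in islice(diff, 2, None) if line.startswith(("+", "-"))]

# Hypothetical three-paragraph document with one small edit.
old = "<p>one</p>\n<p>two</p>\n<p>three</p>"
new = "<p>one</p>\n<p>2</p>\n<p>three</p>"
print(changed_lines(old, new))
```

Only the edited paragraph shows up in the result, which is the property a diff-based system such as CVS relies on. The same comparison on an opaque binary blob would have nothing line-shaped to work with.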
Why do I love software that saves as XML? You can edit the saved files with a simple text editor (vim!), and that saved my ass once: I had to do a rather complex layout with the great DTP program Scribus, and (it being still in development) some bug made it crash. Luckily Scribus saved the file before/while crashing, so I hadn't lost everything, but every time I opened it, Scribus would crash.
With a proprietary data format, I'd have been lost. With an XML format, I just opened the file in a text editor, checked what had happened since my last (regular) save, and copied and pasted the changes step by step into the old file until it crashed.
Then one step back, analyze the problem, send bug-report to Scribus-developers and be a happy man.
In another project, I use a similar technique to visualize raw data given as CSV (e.g. Adsense data). It saves me a bunch of work I'd otherwise have to do manually in Excel.
Magic like this would not be possible with proprietary file formats. OOo's XML file format has made my life easier. And I love OOo for it :)
Screw the FSM - Real geeks believe in the Invisible Pink Unicorn
I wonder how feasible it would be for other word processors, such as AbiWord, to use this format natively. Or, at least appear to use the format natively.
That is, after all, what happens in other areas: MS owns the market leading, proprietary, format/protocol, and then the others rally around an open alternative.
BTW, I don't think that the XML encoding is important. What matters is that the format is legally open, that it is published with good documentation, and that there is nothing hidden in it to tie people to OOo.
It's by design. When MS Word was being pushed by Microsoft as "industry standard" (back in the late '80s, early '90s), it came with dozens of import filters for about any word processor format known to Man. So the MS sales person could always point out that no one would lose any old data, because Word was pretty capable of reading the format in question.
With the later versions, the number of file formats MS Word supported shrank. Today it is reduced to old MS Word formats (none of them handled as well as other office suites manage) and a number of well-documented formats (RTF, HTML, plain text). I remember when the company I was working for was converting from OS/2 to Windows NT 4.0 and the old Ami Pro documents were no longer readable. It was quite an effort to finally find an old copy of Winword 6.0a to import the Ami Pro files, because the later incarnations of MS Word weren't able to read them directly.
PDF? Proprietary? Only if you mean Adobe's implementation. There are thousands of tools out there for generating and viewing PDF content in the open source world. Calling PDF proprietary simply because Adobe doesn't provide a viewer for all platforms would be like calling multicast DNS proprietary because at least initially, stock versions of Rendezvous wouldn't compile under Linux.
Based on that same definition, Postscript is proprietary. Oddly enough, Ghostscript is sometimes known to open encapsulated postscript files generated by Adobe Illustrator that Adobe's own Photoshop can't. When the open source software exceeds the quality and reliability of the reference implementation, it can no longer reasonably be described as proprietary, even if the reference implementation happens to be, IMHO.
That said, I would no more recommend people posting PDF or OOo docs than Word docs, for exactly the same reason. You have to download special software to view it. Even if Firefox had a plug-in in the shipping version, most people wouldn't have that version. For that matter, most people don't use Firefox.
The web is a powerful platform for deployment of information precisely because there are a very limited number of standard formats for contents, and a single standard environment for viewing them. It pisses me off to no end when I see a PDF file without an HTML version alongside it. The last thing I want to do is deal with a whole different environment to view content---whether it's Acrobat or a viewer plug-in makes no difference. Ditto for Word, OOo, etc. (As I always say, "Repeat after me: 'HTML is for Viewing, PDF is for Printing'.")
And I hope I -never- have to read something that some clueless person uploaded in Postscript again. Yes, there's software for every platform, but no, most people don't have it installed, and it's a pain in the ass to distill to PDF just to view something that's usually mostly plain text anyway. And before you ask, yes, sometimes I have been known to just read the Postscript file in vi.
Bottom line, if in doubt, HTML. If HTML won't work because the person posting it is too anal about formatting... HTML anyway, and post a nice, neat, formatted PDF for the three other people in the world who are as anal as they are. ;-)
</rant>
We now return you to your regularly scheduled discussion of open formats.
120 character sigs suck. Make it 250.
I recently worked as a consultant for a biotech company. They were developing health care drugs for the American market, and one of the FDA regulations they had to follow was that all documents regarding a substance or drug must remain available for at least 10 years.
This was a big reason they did NOT adopt OpenOffice, because in their corporate world (that is the opposite of real life) Microsoft Office was the guarantee that their documents would be accessible in 10 years or more. I disagreed and argued with them about the importance of open formats, but in the end they chose Microsoft Office, because in the corporate world, Microsoft is king.
I believe they made the wrong choice and (IMO) the correct way of following FDA regulations, etc, is to use open formats for data/documents/etc. However this has not yet been realized by the industry (or FDA, I believe).
However, when the industry DOES realize this, all open formats will be in a very nice spot compared to Microsoft Office/closed document formats.
Nope. Zip files can be recovered either entirely or in part, depending on the damage. A minor amount of corruption may not lead to any data loss -- something that isn't true if the original uncompressed data is damaged by the same amount.
Since the contents of the zip are text files, at worst they could be edited by hand to correct them. I can't think of a more stable document format that doesn't involve having multiple copies of the document.
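A rough sketch of what "recovered in part" can look like, using Python's standard `zipfile` module. The function name `salvage_members` is made up for this example, and it only covers the mild case where the archive's directory is still readable and the damage sits inside one member's data; worse corruption needs a dedicated repair tool.

```python
import io
import zipfile
import zlib

def salvage_members(archive_bytes):
    """Split an archive's members into readable ones and damaged ones.

    Returns (recovered, damaged): a dict of name -> bytes for members that
    still decompress and pass their CRC check, and a list of names that do not.
    """
    recovered, damaged = {}, []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            try:
                recovered[name] = zf.read(name)  # read() verifies the CRC-32
            except (zipfile.BadZipFile, zlib.error, EOFError):
                damaged.append(name)
    return recovered, damaged

# Build a small hypothetical package, then flip one byte inside content.xml.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("content.xml", "<doc>hello world</doc>")
    zf.writestr("styles.xml", "<styles>plain</styles>")
raw = bytearray(buf.getvalue())
raw[raw.find(b"hello world")] ^= 0xFF

recovered, damaged = salvage_members(bytes(raw))
print(sorted(recovered), damaged)
```

The undamaged member comes back intact, and the damaged one is at least identified, so it can be pulled out by other means and fixed by hand as described above.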
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
"WtF?!" you might ask :) A colleague tried switching to OpenOffice. We got into swapping a PowerPoint document back and forth, and at some point I started getting .ppt files that PowerPoint97 could not open, claiming that the file had been created by a future version of PowerPoint. So something is broken in OpenOffice's "export to PowerPoint" that is emitting files that PowerPoint97 cannot read.
Oh, the irony. Forced to upgrade to Office 2003 because someone in my organization tried OpenOffice :(
Crispin
Many people are telling me that OpenOffice could be faster and less demanding on memory, and these are areas where our own products shine. Have you never wanted OpenOffice to start a little quicker?
My personal feeling is that even open source products are not beyond the realm of criticism in areas where they fall down. Mind you, I am seeing that our little PlanMaker/OpenOffice comparison page is causing the OOo developers to improve their product. So, even if you never use TextMaker or PlanMaker, you profit from our little row.
Apart from that, I am still convinced that open document formats are the way to go if we all (united and apart) want to break Microsoft's monopoly.
SoftMaker Office for Windows|Linux|Android
Boss wanted me to create a PostScript version of our corporate logo, so it could be scaled as needed.
Source: a poorly rendered GIF.
Equipment: one Linux machine, with OpenOffice.org installed.
I found the matching font, got the dots lined up, converted it to a traced object, found the right "burnt sienna" color... but that pukey-green was nowhere in any color selector I could find.
After hunting for nearly a half hour, for an edit box that would let me enter an arbitrary hex triplet, I just saved the file and quit OOo. Then I unzipped the document, opened the style sheet in NEdit, and changed the hex triplets by hand. Save, exit, re-zip, and open it in OOo to see if the changes were correct. Voila!
I never, never ever would have been able to do that in a Microsoft product. I will grant that Microsoft may have made the hex triplet entry somewhat more obvious, but that doesn't mean I would have been able to find it any more easily. They absolutely control how the user accesses the document. OOo lets you access it any way you want.
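For anyone who would rather script the unzip/edit/re-zip routine than do it by hand, here is a minimal sketch in Python. The function name `replace_in_member`, the member name `styles.xml`, and the two hex triplets are assumptions for illustration only; check which file inside your own document actually holds the colour before relying on it.

```python
import zipfile

def replace_in_member(src, dst, member, old, new):
    """Copy the zip package `src` to `dst`, replacing the text `old` with
    `new` inside one member and copying every other member unchanged.

    `src` and `dst` may be file paths or file-like objects.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == member:
                # Assumes the member is UTF-8 encoded XML text.
                data = data.decode("utf-8").replace(old, new).encode("utf-8")
            # Keep the original entry order and each entry's compression
            # method, so an uncompressed "mimetype" entry stays that way.
            zout.writestr(info, data, compress_type=info.compress_type)

# Hypothetical usage: swap one colour for another in the style sheet.
# replace_in_member("logo.sxd", "logo-fixed.sxd", "styles.xml", "#808000", "#6b8e23")
```

Writing to a new file rather than overwriting the original means a bad edit costs nothing: if OOo refuses to open the result, the untouched document is still there.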