Slashdot Mirror


Why OpenOffice.org? Open Document Formats

Jem Berkes writes "In this current article about OpenOffice.org (also covered at Linux Today), I try to make a point about OpenOffice's commitment to open document formats and interchange as the strongest selling point - never mind cost. The OOo developers are putting a lot of effort into their XML format; will this pay off, and will users notice the significance of OpenDocument/OASIS document formats?" This can't be said enough: file formats are what determine whether and how easily data is portable, or whether the user is just stuck.

25 of 478 comments

  1. Sam Hiser, OpenOffice.org - interviewed at LW by Anonymous Coward · · Score: 4, Informative

    There's a cool interview with Sam Hiser of OpenOffice.org here

  2. file size by Morthaur · · Score: 5, Interesting

    Speaking of superior file formats, has anyone else noticed just how much smaller OOo files are than the comparable MS Office documents? I routinely have to export files to MSO formats for peer review, and I have always marvelled at the amount of space a .doc takes by comparison.

    --

    +++++++
    "Look, dear, it's a crazy hairy scary man!"
    1. Re:file size by figleaf · · Score: 5, Informative

      It is a compressed zip file.
      Rename it to zip and extract the files.
      The extracted files are usually larger or about the size of Word documents.
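
      A minimal sketch of that comparison in Python, assuming only that the document is an ordinary zip archive. The helper names and the sample archive are hypothetical stand-ins; a real OOo file has several members besides content.xml.

```python
import io
import zipfile


def archive_sizes(data):
    """Map each member name to (compressed_size, extracted_size) in bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {
            info.filename: (info.compress_size, info.file_size)
            for info in archive.infolist()
        }


def build_sample_document():
    """Build a small stand-in archive (hypothetical content, one member only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        body = "<p>The quick brown fox jumps over the lazy dog.</p>" * 200
        archive.writestr("content.xml", "<doc>" + body + "</doc>")
    return buffer.getvalue()


sample = build_sample_document()
for name, (packed, unpacked) in archive_sizes(sample).items():
    print(f"{name}: {packed} bytes compressed, {unpacked} bytes extracted")
```

      To inspect a real file, read its bytes from disk and pass them to archive_sizes in place of the sample.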

  3. Re:Righto Mate by PincheGab · · Score: 4, Insightful
  4. Stability by scrote-ma-hote · · Score: 4, Insightful

    I wish people would stop touting stability as a superiority of software products. I use OO and MS Office regularly, and both have crashed on me, or done very flaky things, such as refusing to save a file for some unknown reason. I'm a more than average user, but not some elitist who has configured my machine perfectly, and if I can't get things not to crash, then your average user isn't going to be able to either. They'll try the program, excited by its superior crash record, it'll crash once, and then they'll get burned, blame the software and never try again. There are plenty of good reasons to use OSS software, but stability-wise it's no better, and note no worse, in my books than an MS product.

    1. Re:Stability by DeTHZiT · · Score: 4, Informative

      When you experience many random crashes, or seemingly random results from a program, there's usually a problem with your system memory (RAM).

      Try using Memtest86 to diagnose your system. It may be nothing, bad luck, or some other component of your system misbehaving, but it's usually bad memory.

  5. Why do I like OO.o formats? by Realistic_Dragon · · Score: 4, Funny

    So for once the unwashed are coming to _me_ saying 'I can't read this'.

    If it ever goes away I shall have to switch back to mailing them raw TeX files again.

    --
    Beep beep.
  6. Too Bad OO Sucks So Bad by Crispin+Cowan · · Score: 5, Insightful
    I love the open document format concept. I think it is vitally important. I can't believe that enterprises and governments are willing to store critical archival documents in Microsoft Office format, and put themselves at risk of being unable to open these documents as little as 10 years hence.

    However I have tried hard to switch to OpenOffice. Even our business people have tried to use it. And the sad truth is that it just sucks. There is no way in hell that OpenOffice competes with Microsoft Office for usability. The PowerPoint clone is especially weak: in PP, common buttons like "make the font bigger" are prominently displayed, while in OO you have to hunt hard for the button in the customization menus, and even then it doesn't work right.

    This is not to say that OO is not a valuable asset. Clearly a lot of people have worked hard on it. But let's not kid ourselves, this beast has a long way to go yet just to compete with MS Office 97, never mind 2003.

    Crispin

  7. OO in law offices by ir0b0t · · Score: 5, Interesting

    This is great news. I use OpenOffice in my small town law practice, and I'm so happy to be liberated from the tyranny of proprietary licensing fees. Lack of compatibility between software packages (office, accounting, case mgmt., etc.) is an even bigger problem for law offices in rural areas, like mine, who want to explore open source but lack support services.

    I'm learning --- ever so slowly --- more about Linux and Samba so I can complete the office transformation some day. It's hard to find patient teachers, and tech understanding comes slowly to some of us. It's worth the effort though.

    --
    I'm laughing at clouds.
  8. The sad thing is... by beeglebug · · Score: 4, Insightful

    ... almost every file I save in Open Office gets saved as a .doc/.xls rather than an OOo format (I can't even think of the file extensions off the top of my head, that's how infrequently I use them). If the file I am saving has to be sent to anyone, or opened on a machine other than my own, I have to go with Microsoft compatibility, even though it annoys me intensely.

    1. Re:The sad thing is... by scrote-ma-hote · · Score: 4, Interesting

      If they don't need to edit the file, why not save it as PDF?

    2. Re:The sad thing is... by bladesjester · · Score: 4, Insightful

      Because, for whatever reason, most people specifically ask for doc and xls files. They tend to get snippy when you send them pdfs.

      When dealing with businesses that you wish to continue dealing with in a positive manner (be it for commerce, looking for a job, or any other reason), you try not to do things to annoy them overmuch. Just shrug, show them what they want to see while you do what needs to be done in the background. Most of them will be happy as long as they get the results that they wanted and what *they* see is what they expected to (there are exceptions to this, but as a general rule it's not a bad guideline).

      --
      Everything I need to know I learned by killing smart people and eating their brains.
  9. How to speed OpenOffice file-format adoption by CdBee · · Score: 4, Insightful

    Write a Firefox Extension that enables OpenOffice documents to be viewed in the browser, or edited if OOo is present on the system? (yes, this would be a lot of work)

    Suddenly you have an alternative to the traditional recipe of using .Doc files and the free MS Word Viewer to distribute written documents.

    --
    I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
    1. Re:How to speed OpenOffice file-format adoption by mkldev · · Score: 4, Interesting

      PDF? Proprietary? Only if you mean Adobe's implementation. There are thousands of tools out there for generating and viewing PDF content in the open source world. Calling PDF proprietary simply because Adobe doesn't provide a viewer for all platforms would be like calling multicast DNS proprietary because at least initially, stock versions of Rendezvous wouldn't compile under Linux.

      Based on that same definition, Postscript is proprietary. Oddly enough, Ghostscript is sometimes known to open encapsulated postscript files generated by Adobe Illustrator that Adobe's own Photoshop can't. When the open source software exceeds the quality and reliability of the reference implementation, it can no longer reasonably be described as proprietary, even if the reference implementation happens to be, IMHO.

      That said, I would no more recommend people posting PDF or OOo docs than Word docs, for exactly the same reason. You have to download special software to view it. Even if Firefox had a plug-in in the shipping version, most people wouldn't have that version. For that matter, most people don't use Firefox.

      The web is a powerful platform for deployment of information precisely because there are a very limited number of standard formats for contents, and a single standard environment for viewing them. It pisses me off to no end when I see a PDF file without an HTML version alongside it. The last thing I want to do is deal with a whole different environment to view content---whether it's Acrobat or a viewer plug-in makes no difference. Ditto for Word, OOo, etc. (As I always say, "Repeat after me: 'HTML is for Viewing, PDF is for Printing'.")

      And I hope I -never- have to read something that some clueless person uploaded in Postscript again. Yes, there's software for every platform, but no, most people don't have it installed, and it's a pain in the ass to distill to PDF just to view something that's usually mostly plain text anyway. And before you ask, yes, sometimes I have been known to just read the Postscript file in vi.

      Bottom line, if in doubt, HTML. If HTML won't work because the person posting it is too anal about formatting... HTML anyway, and post a nice, neat, formatted PDF for the three other people in the world who are as anal as they are. ;-)

      </rant>

      We now return you to your regularly scheduled discussion of open formats.

      --
      120 character sigs suck. Make it 250.
  10. A non binary filetype has many more perks as well by licamell · · Score: 4, Interesting

    The main one that most people overlook is the ability to edit a section of a document and only have that section change. With binary files, like MS Word, if someone opens it up and makes one small change, then the whole file gets changed. This difference comes into play when you start considering the ability to diff files, and to use these diffs for applications such as LBFS (low bandwidth file system), or log based file systems. There is a lot of technology out there that could lead to great improvements on network/disk usage if non-binary filetypes are adopted more regularly. Currently you can only use text based files in these systems. Imagine if you could use CVS with binary files (and actually harvest the benefits of using such a system). Just my 2 cents though.
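
    A rough illustration of the diff idea, using Python's standard difflib. It assumes the XML has already been extracted from its archive and written one element per line; the two sample versions and the changed_lines helper are hypothetical.

```python
import difflib

# Two hypothetical versions of the same text-based document.
old_version = [
    "<text:p>Open formats matter.</text:p>",
    "<text:p>Binary formats are hard to diff.</text:p>",
    "<text:p>Just my 2 cents.</text:p>",
]
new_version = [
    "<text:p>Open formats matter.</text:p>",
    "<text:p>Binary formats are very hard to diff.</text:p>",
    "<text:p>Just my 2 cents.</text:p>",
]


def changed_lines(old, new):
    """Return only the removed and added lines of a unified diff."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    return [
        line
        for line in diff
        # Keep "-"/"+" lines, but drop the "---"/"+++" file headers.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(old_version, new_version):
    print(line)
```

    Only the edited paragraph shows up in the result, which is the property a version control system or a low-bandwidth file system would rely on.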

  11. Re:Who cares if its XML? by Cecil · · Score: 4, Insightful

    Not necessarily true. Reverse-engineering XML (at least, XML that is not purposely obfuscated) is orders of magnitude easier than reverse engineering binary formats, because it is a self-descriptive format. Each piece of data has a name associated with it automatically -- the name of the tag -- as well as a rough structure (clearly this 'size' is for font size, not page size, since it's within a font tag). And just as importantly, XML tells you exactly where an 'array' of items ends because it has a /tag. With a binary format, the count for the array will typically precede the array, but does not have to... in a particularly complex format the length of the array can be implied by other parameters, and you have to use multiple samples to find out how exactly it is implied where it ends, and even when you think it's figured out it probably isn't, and the files that don't fit your assumptions will crash or produce garbage when read in.

    A proprietary XML file is not at all proprietary compared to a binary file. They're easy for even a novice programmer to figure out how to read.
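
    As a sketch of how little code that exploration takes, the Python below walks an undocumented XML file and prints every element path with its attributes. The sample document is invented for illustration and does not come from any real product.

```python
import xml.etree.ElementTree as ET

# Hypothetical file in an undocumented XML format.
SAMPLE = """
<document>
  <font name="Helvetica" size="12"/>
  <page size="A4"/>
  <items>
    <item>first</item>
    <item>second</item>
  </items>
</document>
"""


def describe(element, path=""):
    """Yield (path, attributes) for every element, depth first."""
    here = f"{path}/{element.tag}"
    yield here, dict(element.attrib)
    for child in element:
        yield from describe(child, here)


root = ET.fromstring(SAMPLE)
for path, attributes in describe(root):
    print(path, attributes)
```

    The paths alone separate the font size from the page size, and the closing tag of items marks where the list ends without any length field.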

  12. XML Formats rock! by Anonymous Coward · · Score: 5, Interesting

    Why do I love software that saves as XML? You can edit its saved files with a simple text editor (vim!), and that saved my ass once: I had to do a rather complex layout with the great DTP program Scribus, and (being still in development) some bug made it crash. Luckily Scribus saved the file before/while crashing, so I hadn't lost everything, but every time I opened it, Scribus would crash.
    With a proprietary data format, I'd have been lost. With an XML format, I just opened the file in a text editor, checked what had happened since my last (regular) save, and copied and pasted the changes step by step into the old file until it crashed.
    Then one step back, analyze the problem, send a bug report to the Scribus developers and be a happy man.

  13. 50 years from now by mslinux · · Score: 4, Insightful

    Open, well-documented formats will allow governments and businesses to access documents/info many years from now. It's unfortunate that most IT managers don't realize how closed formats will hinder them in the future.

  14. Re:Who cares if its XML? by MrBandersnatch · · Score: 4, Informative

    OMG the parent was modified up as insightful!!!

    The point of XML isn't that it's human readable. It's that it's machine PARSABLE and that one can use a rather large number of tools in order to process the CONTENT without having to deal with all the proprietary ***** that is normally in there.

    Being able to apply XSL alone to a document incredibly simplifies the process of converting from one format to another WITHOUT having to learn YA proprietary format/tools.

    And to give you an idea of the value of this - I've just spent 3 weeks converting a LARGE Word document to XHTML (properly, i.e. it's accessible, well formed etc etc). If this document had been written in OO (or if it had been possible to import it into OO without OO having convulsions on many of the tables), I'd easily have shaved a week off that work.
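
    A minimal sketch of that kind of conversion. Python's standard library has no XSLT processor, so this is not XSL: it does the mapping by hand with ElementTree, which is enough to show the idea. The source element names and the TAG_MAP table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are invented.
SOURCE = """
<doc>
  <heading>Open formats</heading>
  <para>File formats decide whether data is portable.</para>
  <para>XML can be processed with ordinary tools.</para>
</doc>
"""

# Hypothetical mapping from source element names to XHTML element names.
TAG_MAP = {"heading": "h1", "para": "p"}


def to_xhtml(source_xml):
    """Rewrite the mapped children of the source root into an XHTML body."""
    source_root = ET.fromstring(source_xml)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    body = ET.SubElement(html, "body")
    for child in source_root:
        target_tag = TAG_MAP.get(child.tag)
        if target_tag is None:
            continue  # Skip anything the mapping does not cover.
        ET.SubElement(body, target_tag).text = child.text
    return ET.tostring(html, encoding="unicode")


print(to_xhtml(SOURCE))
```

    An XSL stylesheet would express the same mapping declaratively, and could be handed to any conforming processor instead of being tied to one script.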

  15. Data Interchange with Open File Formats by DoktorTomoe · · Score: 5, Interesting
    Funnily, I'm currently working on a bunch of projects to incorporate external data sources using Perl and OOo "template" files. E.g. it should be possible to write invoices from a database by copying a template, opening it, entering the data (address and billing information) into the right fields within the OOo file and saving it to disk. The user should then be able to review/print/PDF it and send the results to the customer. Modern accounting software already does this automagically, but my approach allows using the powerful OOo WYSIWYG for form design - for example, any secretary would be able to add a season's greeting to the December template in no time.

    In another project, I use a similar technique to visualize raw data given as CSV (e.g. Adsense data). It saves me a bunch of work I'd otherwise have to do manually in Excel.

    Magic like this would not be possible with proprietary file formats. OOo's XML file format has made my life easier. And I love OOo for it :)
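
    The original work was done in Perl; the sketch below shows the same template idea in Python using only the standard library. It assumes the template is a zip archive whose content.xml contains {{key}} placeholders as contiguous text. The placeholder syntax, the helper names and the sample template are all hypothetical.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_template(template_bytes, values, member="content.xml"):
    """Return a copy of the archive with {{key}} placeholders filled in."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as source:
        with zipfile.ZipFile(output, "w") as target:
            for info in source.infolist():
                data = source.read(info.filename)
                if info.filename == member:
                    text = data.decode("utf-8")
                    for key, value in values.items():
                        # Escape &, < and > so the data cannot break the XML.
                        text = text.replace("{{" + key + "}}", escape(value))
                    data = text.encode("utf-8")
                # Keep each member's original order and compression method.
                target.writestr(
                    info.filename, data, compress_type=info.compress_type
                )
    return output.getvalue()


def build_template():
    """Build a tiny stand-in template (a real one has more members)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "content.xml",
            "<doc><p>Invoice for {{customer}}</p><p>Total: {{total}}</p></doc>",
        )
    return buffer.getvalue()


invoice = fill_template(
    build_template(), {"customer": "Smith & Sons", "total": "120.00"}
)
with zipfile.ZipFile(io.BytesIO(invoice)) as archive:
    print(archive.read("content.xml").decode("utf-8"))
```

    In a template designed in the word processor itself, a placeholder may end up split across several XML elements, so a production version would probably need to work on the parsed tree rather than on raw text.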

  16. Re:Who cares if its XML? by arendjr · · Score: 5, Insightful

    I'm sorry, but here you are a bit mistaken. Most importantly there are 2 things which make XML special in this area:

    • Namespaces. XML allows you to use different XML schemas within one document. This makes it possible to embed for instance SVG data within an OpenOffice.org document (which it actually does if you're adding images). So, no need to reinvent the wheel here.
    • XSL. A technique making it possible to transform a document from one XML schema to another with very little programming effort. This makes XHTML export and import/export filters for Office 2003 XML files much less of a hassle. Again, this is actively being taken advantage of by OpenOffice.org. No need to reinvent all the parsing and generation code again.

    To say the fact they're documenting the format is more important than the fact it's in XML is true, but that doesn't make it unimportant they're using XML.
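
    The namespace point can be illustrated with a short Python sketch. The document vocabulary (urn:example:document) is invented for the example; only the SVG namespace URI is the real one. Both vocabularies define a title element, and the namespaces keep them apart.

```python
import xml.etree.ElementTree as ET

DOC_NS = "urn:example:document"        # hypothetical document vocabulary
SVG_NS = "http://www.w3.org/2000/svg"  # the real SVG namespace

SAMPLE = f"""
<doc:document xmlns:doc="{DOC_NS}" xmlns:svg="{SVG_NS}">
  <doc:title>Report</doc:title>
  <doc:paragraph>Text with an embedded drawing.</doc:paragraph>
  <svg:svg width="100" height="50">
    <svg:rect width="100" height="50"/>
    <svg:title>Box</svg:title>
  </svg:svg>
</doc:document>
"""

root = ET.fromstring(SAMPLE)
prefixes = {"doc": DOC_NS, "svg": SVG_NS}

# Same local name "title", two different vocabularies.
doc_title = root.find("doc:title", prefixes).text
svg_title = root.find("svg:svg/svg:title", prefixes).text
rects = root.findall(".//svg:rect", prefixes)

print(doc_title, svg_title, len(rects))
```

    A program that only understands the document vocabulary can ignore the SVG subtree entirely, while an SVG-aware tool can pick it out without knowing anything about the surrounding format.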

  17. Not to be negative but...Look here. by Anonymous Coward · · Score: 5, Insightful

    There's SVG support. It's just not particularly good.

    http://graphics.openoffice.org/svg/svg.htm

    However someone is working on it, and there's enough documentation out there, you can too.

  18. Important for government work as well. by jbn-o · · Score: 4, Informative

    For Peruvian Congressman Villanueva, use of free software and free formats was critical--his letter to Microsoft on why he was rejecting their arguments explains how important not being locked in is to doing transparent government work in addition to treating citizens well. I'm sure he's not the only one, but his letter to Microsoft is well worth reading.

  19. Re:Formatting Woes by Sique · · Score: 4, Interesting
    ...but the point is Office just can't handle anything that wasn't originally created by MS.

    So, is that because of incompetence, or by design?


    It's by design. When MS Word was being pushed by Microsoft as "industry standard" (back in the late '80s, early '90s), it came with dozens of import filters for about any word processor format known to Man. So the MS sales person could always point out that no one would lose any old data, because Word was pretty capable of reading the format in question.

    With the later versions, the number of file formats MS Word supported shrank. And today it is reduced to old MS Word formats (and none of them as perfect as other office suites) and to a number of well-documented formats (RTF, HTML, plain text). I remember when the company I was working for was converting from OS/2 to Windows NT4.0 and the old Ami Pro documents were no longer readable. It was quite an effort to finally find an old copy of Winword 6.0a to import the Ami Pro files, because the later incarnations of MS Word weren't able to read them directly.
    --
    .sig: Sique *sigh*
  20. Re:[OT] devolution of MS Office by Crispin+Cowan · · Score: 4, Interesting
    I'm curious why people have bothered to upgrade MS Office past 97 or 2000 at all.
    Good question. I am still running Office 97 (on VMware on my Linux laptop) and until very recently I had no motive at all to upgrade. The new motive: OpenOffice.

    "WtF?!" you might ask :) A colleague tried switching to OpenOffice. We got into swapping a PowerPoint document back and forth, and at some point I started getting .ppt files that PowerPoint97 could not open, claiming that the file had been created by a future version of PowerPoint. So something is broken in OpenOffice's "export to PowerPoint" that is emitting files that PowerPoint97 cannot read.

    Oh, the irony. Forced to upgrade to Office 2003 because someone in my organization tried OpenOffice :(

    Crispin