Slashdot Mirror


KOffice To Use Open Office File Format

InodoroPereyra writes "This article at The Dot indicates that the KOffice developers decided to switch to the Open Office file format (OASIS) for their next major release. Excellent news both for KOffice, which will benefit from OpenOffice's excellent filters, and for the GNU/Linux Desktop users in general, who will benefit from a unified file format standard between these office suites."

48 comments

  1. That other office suite by __past__ · · Score: 5, Interesting

    Let's see how long it takes that other office suite vendor to see the light. After all, they are an OASIS member themselves...

    1. Re:That other office suite by tsa · · Score: 1

      They will never see the light. Seeing the light will cost them a LOT of money.

      --

      -- Cheers!

    2. Re:That other office suite by norweigiantroll · · Score: 1

      Nope.. no AbiWord in that list.

  2. Let's hope this will be the new trend by tsa · · Score: 5, Insightful

    This is very good news. Finally we have a choice between different word processors that use the same format. I think this can certainly help organizations in their decision to migrate or not to migrate to Linux. Let's hope this will be the new trend for the future.

    --

    -- Cheers!

  3. One format... by shfted! · · Score: 5, Funny

    One format to rule them all,
    One format to find them,
    One format to bring them all,
    And in the saving lose all formatting.

    --
    He who laughs last is stuck in a time dilation bubble.
  4. That all? by Anonymous Coward · · Score: 0

    In other news, KOffice to use OpenOffice code base.

  5. Abiword by aderuwe · · Score: 5, Insightful

    I guess we should be poking the Abiword developers now to do the same.

    1. Re:Abiword by BrokenHalo · · Score: 3, Insightful

      True. Abiword has lots of things going for it (not least the fact that it's *much* quicker to load than OpenOffice) but being based around yet another file format can be a real show-stopper.

    2. Re:Abiword by dominator · · Score: 4, Informative

      Actually, we do have fairly decent support for the OpenOffice file format. It's just not the default file format, nor is it likely to be. If you're interested, please read:

      http://abisource.com/mailinglists/abiword-dev/2003/Apr/0167.html
      http://abisource.com/mailinglists/abiword-dev/2003/Apr/0183.html

      Basically, while it makes sense for us to support the OOo file formats as best as possible, it is not desirable for us to make them the default file formats. If anything, RTF is a much better choice for this particular job, as I don't believe MS will be supporting the OOo file format anytime this century. However, both support RTF, and RTF is capable of preserving ~100% of the content and data that DOC is, albeit oftentimes more verbosely.

      That said, it might make sense for upstream packagers (RedHat, Ximian, ...) or individual users to change Abi's default file format to RTF or OOo to meet their needs. It's a matter of changing 1 line of code, or altering 1 line in a configuration file. It's intentionally easy.

      This all boils down to different worldviews - Abi and OO won't ever have a 1:1 mapping of features, nor will we agree on how to represent those features in any single file format. The best you can hope for is "really close" conversions. Loss of content or presentation markup is unacceptable in a "native" file format.

      IMO, the best solution is to all have a "common tongue", which may well be the OOo format, or say RTF. We should all use the common language when we want to speak to each other, and hope nothing gets lost or misinterpreted on either side during the translation (remember a translation from Abi -> KOffice using the OOo format as an "intermediary" has at least 2 points of failure instead of just 1). Unfortunately, that's all unavoidable. But when we're speaking "at home," we really want to speak our mother tongue. There's less ambiguity and a higher level of precision.

      For those reasons, I don't think that the KOffice folks are necessarily making the best decision here, though I continue to wish them the best of luck and success.

      Best regards,
      Dom Lachowicz, AbiWord maintainer

    3. Re:Abiword by akvalentine · · Score: 2, Interesting
      But when we're speaking "at home," we really want to speak our mother tongue. There's less ambiguity and a higher level of precision.

      But now the OASIS format will BE KOffice's native tongue...

    4. Re:Abiword by dominator · · Score: 2, Insightful

      That's only completely valid if there can be a perfect translation from KWord's file format to OOo, or if KWord redesigns how it does certain things so that it matches OOo's expectations of the world. This is what I highly doubt, and I speak from some considerable experience when saying so.

      Dom

    5. Re:Abiword by Anonymous Coward · · Score: 0

      Ack. It was my understanding that KOffice had a frame-based document layout model, I really don't see how they're going to reconcile that with OpenOffice's word-like layout, without sacrificing what makes KOffice good (it was once stated that its aim was to be comparable with Adobe, not crappy-ass MS Word).

    6. Re:Abiword by Anonymous Coward · · Score: 0

      Well David Faure is on the OASIS workgroup and actually tweaks the format instead of the apps...

      // Peter "psn" Simonsson (KOffice developer)

    7. Re:Abiword by Anonymous Coward · · Score: 0

      OpenOffice has some really good frame based functionality. While I haven't looked at the source code, it feels very much like a frame based word processor that simply defaults to a single frame per page.

      Also, OpenOffice supports frame based HTML editing by mapping its internal frames to DIVs and CSS.

    8. Re:Abiword by Anonymous Coward · · Score: 0

      I guess it is time to FORK abiword in order to make the OpenOffice.org file format the default...

  6. Additional XML benefits by neglige · · Score: 4, Interesting

    Using an XML based (and documented!) file format has additional advantages. First and foremost, the documents can be easily used by other applications, e.g. full text indexer. Generating meta data has never been easier ;)

    Or use a stylesheet on the document and adapt it for, say, mobile devices (my favourite topic, I must admit). XML->HTML, XML->WML, XML->cHTML ... no problem. It's even possible to extract an abstract, collect hyperlinks from the document and present them separately, leave out the graphics (or convert them)...

    Is this possible with .doc? I'd guess so. As easy as with XML? Don't think so.

    --
    My cats ate my karma. They also wrote this comment.
    1. Re:Additional XML benefits by Anonymous Coward · · Score: 2, Interesting

      The problem is that the xml files are saved in a .zip archive holding all of the document's parts, where a part can be the actual document, an image included in the document, an embedded spreadsheet document, or whatever else. The xml files would be a lot bigger than a binary format, but the zipping process manages to get it down to about the same size again most of the time..
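For anyone who wants to poke at such a package, here is a minimal sketch of pulling one XML part out of a zipped document with Python's standard library. The member names (`content.xml`, `Pictures/logo.png`) and the tiny demo package are illustrative assumptions, not taken from a real file.

```python
import io
import zipfile


def read_member(package_bytes, member="content.xml"):
    """Return one XML member of a zipped document package as text."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(member).decode("utf-8")


def make_demo_package():
    """Build a tiny in-memory package (hypothetical contents) to read back."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", "<doc><p>Hello, world</p></doc>")
        package.writestr("Pictures/logo.png", b"\x89PNG placeholder")
    return buffer.getvalue()


demo = make_demo_package()
print(read_member(demo))
```

So the zip wrapper costs one extra step before any XML tool can be pointed at the document, nothing more.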

    2. Re:Additional XML benefits by raffe · · Score: 3, Informative

      I have been playing around with the new XML format in the Word 2003 beta. It works very nicely. We make reports from our system to Word XML. We can open it in Word, we can transform it further to PDF, Crystal and so on. The format is ok and not f*cked up CDATA stuff.....

    3. Re:Additional XML benefits by swillden · · Score: 3, Interesting

      The xml files would be a lot bigger than a binary format, but the zipping process manages to get it down to about the same size again most of the time..

      Incorrect. Go try it on a few documents. In practice, I see that OOo Writer documents (without images) are less than half the size of their Word counterparts, and OOo is not (yet) very careful about the XML it spits out, tending to save lots of style and other information that isn't even used in the document.

      The zipping process makes the files a lot *smaller* than you normally get out of a binary file format. Why? Rather simple, really. In most binary file formats (e.g. Word), the formatting information is fairly compact, but the content isn't compressed in any way. Given that English text has about one bit per character of entropy and given that (hopefully!) there's much more content than formatting, there's a lot of room for compression to do its work. In the case of embedded images, it really doesn't matter what format you use, they don't compress, but the XML doesn't add a significant amount of overhead to them, either.
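The compression argument above can be checked roughly with a few lines of Python. This is a sketch under stated assumptions, not a measurement of real Word or OOo files: the sample text is one sentence repeated (which compresses far better than real prose), and random bytes stand in for already-compressed image data.

```python
import os
import zlib

# Assumption: a repeated sentence stands in for uncompressed document text.
text = ("All work and no play makes Jack a dull boy. " * 200).encode("ascii")

# Assumption: random bytes stand in for already-compressed image data.
image_like = os.urandom(len(text))


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


print("text ratio:       %.3f" % ratio(text))
print("image-like ratio: %.3f" % ratio(image_like))
```

The text shrinks to a small fraction of its size while the image-like data does not shrink at all, which is the pattern described above.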

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    4. Re:Additional XML benefits by daniel_yokomiso · · Score: 3, Informative

      There's one benefit you ignored. If, in future versions, they want to put in additional information (i.e. new tags or attributes), the older software versions will still be able to read the new documents, keeping forward compatibility. This is the most important feature for organizations that work with standardized software: they're able to keep using old, verified versions of the applications.
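A minimal sketch of that idea: an "old" reader that only looks for the tags and attributes it knows about and silently skips everything else. The element and attribute names (`doc`, `p`, `style`, `revision-log`, `lang`) are made up for illustration and do not come from any real office schema.

```python
import xml.etree.ElementTree as ET


def read_paragraphs(xml_text):
    """Old reader: knows only <p> and its 'style' attribute, ignores the rest."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.findall("p"):
        style = p.get("style", "default")
        paragraphs.append((style, p.text or ""))
    return paragraphs


# Document written by the old version of the (hypothetical) application.
OLD_DOC = '<doc><p style="body">Hello</p></doc>'

# Document written by a newer version: one new element, one new attribute.
NEW_DOC = (
    '<doc><revision-log author="x"/>'
    '<p style="body" lang="en">Hello</p></doc>'
)

print(read_paragraphs(OLD_DOC))
print(read_paragraphs(NEW_DOC))
```

Both calls return the same paragraphs. This only works when the new information is purely additive; a newer version that changes the meaning of an existing tag would still break the old reader.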

      --
      Disclaimer: If I disagree with you I'm probably trolling...
    5. Re:Additional XML benefits by Anonymous Coward · · Score: 1, Interesting

      SBC gives us our monthly phone bill [it's a sizable business] in .DOC format. It typically weighs in at about 25 meg (lots of tables and overly verbose "descriptions" of the surcharges and fees). When I save the file in OOo, it saves down to just over 700k.

  7. So this one format it: by SHEENmaster · · Score: 0, Flamebait

    ASCII!

    Does anyone know of a good ANSI editor for X?

    --
    You can't judge a book by the way it wears its hair.
    1. Re:So this one format it: by BrokenHalo · · Score: 2, Insightful
      Does anyone know of a good ANSI editor for X?

      Yes. It's called emacs. And you don't even have to be running X, or even Linux for that matter.

      Yes, I know it takes a couple of days to get productive with it, but once you've got past the initial learning curve it's very easy, and quick to use with TeX if you're into real typesetting.

    2. Re:So this one format it: by Haeleth · · Score: 2, Funny

      Notepad runs quite well under Wine. 64 kb should be enough for any file!

    3. Re:So this one format it: by Anonymous Coward · · Score: 0

      Actually, Notepad starting at least in NT5 did not have the 64kb limit. It could have been something higher, I suppose, but I never encountered it.

  8. And in other news... by Anonymous Coward · · Score: 2, Funny

    New app announced: KOffin.

    1. Re:And in other news... by paultt · · Score: 1

      no, it's called "Mozilla Koffin"!!!

  9. Old files by SgtChaireBourne · · Score: 4, Insightful
    As I interpret it, the one OASIS member that is not participating has too much of an interest in not participating. Its word processor is part of the two things not yet making a loss, and it has historically relied on lack of forward compatibility to drive a rack of new purchases - HW, OS, misc. apps.

    Basically, that "other vendor" is facing irrelevancy. Especially looking at the proposed changes with DRM, server lock-in, a proprietary XML schema and the software as subscription model.

    The OASIS format supported by KOffice, StarOffice, and OpenOffice.org is not only cheaper and more flexible, but safer in the long run because it's open. That means you're not locked into one platform, one vendor, or even one package. Though the differences are not so dramatic in a word processor, package independence means that individuals can choose the tool that works best for their needs or work methods and still collaborate.

    Being an open format means you don't have to depend on the goodwill of a monopoly to keep your format alive. Nor is there a risk of breaking the DMCA or EEA, committing a computer-related crime, and violating several patents when you try to read that 5-year-old file.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
    1. Re:Old files by BigBir3d · · Score: 3, Insightful
      Basically, that "other vendor" is facing irrelevancy. Especially looking at the proposed changes with DRM, server lock-in, a proprietary XML schema and the software as subscription model.
      No, actually they are facing an uninformed public, and higher profits through new revenue streams. As long as Dell et al bundle software for decent rates, MS can't help but make money hand over fist.
  10. Great News?! by metalmaniac1759 · · Score: 1

    Is this great news? How is this going to affect the end users? I guess it's because whenever someone mails me a .DOC file - I have to first open it in OpenOffice. If it's a simple file - then I try opening it in KWord and save it to a PS file - so that I don't have to wait forever for it to open in OpenOffice.

    Gtg for my class - will continue this post later.

    Nandz.

    1. Re:Great News?! by Anonymous Coward · · Score: 0

      Oh, I see you are still using StarOffice 5.2.

  11. Electronic Publishing by SgtChaireBourne · · Score: 5, Interesting
    Yes, let's hope this will be a new trend. The last round of open standards (e.g. TCP/IP, HTTP + HTML) brought a lot of good, especially HTML. I'm curious to see where this step will lead.

    I suspect that it is also a big step closer to electronic documents with a long shelf life. This may lead towards electronic publishing where well-formed and, possibly, valid documents become the norm. Even if the structures are rudimentary, this still will help portability and retrieval.

    Right now, [X]HTML and PDF are only part way there. PDF is useful for rapid dissemination, but can more or less be thought of as a compact form of paper. Most HTML documents are neither well-formed nor valid and often too dependent on transient constellations of technologies. So, a format like this will let organizations choose tools suited for their specific needs and tasks.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  12. Clarification... by Danious · · Score: 4, Informative

    They will be using the OASIS file format, this doesn't mean they will be using the OOo MS import/export codebase. There MAY be a common library in the future, but that is not clear yet. Also, this is not for the coming release, but for the one after that (v2.0?) that is slated for say middle of next year.

  13. OASIS by Eberlin · · Score: 1

    I guess it's a great reason to celebrate...break out the champagne supernova! Call everyone using the unix server...use (Wonder) $wall

    I'm glad we can all finally agree on one brow...um, file format.

    Sorry, that's all the gallagher references I've got without smashing a watermelon.

  14. Which openoffice format? by Anonymous Coward · · Score: 2, Interesting

    Since OO decided to screw everyone and change formats between 1.0 and 1.1, does that mean now KOffice is also just like Microsoft in abandoning people who've supported them in the past?

    1. Re:Which openoffice format? by Rick+BigNail · · Score: 1

      As long as KOffice can open and save to the old format, it should not be a big problem.

    2. Re:Which openoffice format? by Anonymous Coward · · Score: 0

      Funny... that would make KOffice more OO compatible than OO itself, then.

  15. "Excellent Filters"???!!! by fm6 · · Score: 2, Interesting
    People keep prattling about how great the filters are in Open Office. Come on, people, let's be a little more objective. Parroting the OO party line is not good for the open source movement.

    From my experience, OO's filters are decent, perhaps a little better than Microsoft's, but hardly anything to get excited about. The last time I read a Word file in OO, it screwed up a very simple bulleted list. Face it, it's very, very hard to write a really good word processor filter, especially for a file format as messy as Microsoft's.

    The OO native file format is pretty good, or at least the current version is. I have some issues with it, like throwing in every obscure XML namespace that has some silly feature that somebody likes. And there's still too much device-specific information. But I guess you can always just ignore the noise, especially since it's more neatly separated out than in previous formats.

    OK, I'm cynical about attempts to challenge Word's workplace dominance. But here's a scenario/fantasy that's worth thinking about: Bush II loses the '04 election, despite his carrier landing skills. An "anti-business" Attorney-General revives the anti-trust actions against Microsoft. This time, they ignore silly outdated remedies like splitting off the application divisions (multiple monopolies, great) and come up with something that's ahead of the curve. Like forcing Redmond to work harder at standards compliance. Hey, you say Word dominates because it's better? Prove it: have it read and write OO format! Then you can compete on features, rather than locking out the competition with format crap.

    1. Re:"Excellent Filters"???!!! by burns210 · · Score: 1

      Well ya, them bundling an OOo filter would be handy, but it would just be YET ANOTHER FEATURE the general masses wouldn't need. By the way, I believe it is free to create filters for Office (they wouldn't be bundled with every Office sale, of course) to read new formats, only no one has done it yet...

      In my humble opinion, one (or all!) of these would be a better punishment:

      *Publish ALL their APIs... MS tried to avoid this citing some security reasons, but in reality is there even a downside to this?

      *Document all aspects of the MS office file formats(and future formats) to an extent that a third party can write a filter equal to that of the native MS Office filter.

      *Make it illegal (more illegal!) for MS to bully OEMs by saying 'if you ship that product line with Linux (or other OS) then we can't keep giving you the current deal on the Windows OEM version, it will now cost 10x of what you are currently paying per machine'... that is anticompetitive and wrong.

      *Require .NET to have a port (equal to the native port) on 3 platforms other than Windows (including Windows Alpha), or just make them port it to Mac, Linux, and Solaris. ... any others that I am missing?

  16. sorry by SHEENmaster · · Score: 1

    I'm a hardcore VI(M) zealot. Any other ideas?

    --
    You can't judge a book by the way it wears its hair.
  17. Obligatory Nitpicks by fm6 · · Score: 1
    It's funny how everybody insists that ASCII is a universal character set, when very few people actually use it any more. What most people use is Microsoft Latin1, with ISO Latin1 a distant second. Yeah, both these character sets are supersets of ASCII, but when a Pound Sterling symbol can be entered with the right keystroke, you've broken any pretence at backward compatibility.

    Not that it really matters, except that the A in ASCII is "American", so Western Europeans will accuse you of being U.S.-centric.

    Despite the name, the "ANSI Character set" was never any kind of standard. Microsoft claims they call Microsoft Latin1 "ANSI" because it's based on an ANSI draft that eventually became ISO Latin1. But I think it has to do with the "ANSI" software that used to be in MS-DOS. This emulated an "ANSI Terminal" (better known as a DEC VT-100) and allowed the ANSI graphics BBS people used to be so fond of. Not the same character set as Latin1, of course, but it's not surprising that the tech writers would confuse the two.

    Further confusion: when I was documenting Delphi, explaining the exact meaning of the AnsiString character type took some skill. Its characters were never any kind of "ANSI" character set. In fact, it's not even a single-byte character set. It is, in fact, UTF-8, which is also a superset of ASCII, but which uses multi-byte characters to represent the more exotic stuff.

    Yet further confusion: Slashdot seems to use ISO Latin1, but sends an HTTP header claiming to use UTF-8! Doesn't matter most of the time...
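The Pound Sterling case makes the superset problem easy to see. A small sketch using Python's built-in codecs (the codec names `ascii`, `latin-1` and `utf-8` are the standard Python spellings; nothing here inspects what Slashdot actually sends):

```python
pound = "\u00a3"  # the Pound Sterling sign

latin1_bytes = pound.encode("latin-1")  # one byte in ISO Latin1
utf8_bytes = pound.encode("utf-8")      # two bytes in UTF-8

# Plain ASCII has no code for the character at all.
try:
    pound.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 bytes read as Latin1 turn into two wrong characters.
misread = utf8_bytes.decode("latin-1")

# Latin1 bytes labelled as UTF-8 are not even valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(latin1_bytes, utf8_bytes, ascii_ok, repr(misread), latin1_is_valid_utf8)
```

As long as a page sticks to the ASCII range, all three encodings produce identical bytes, which is why a mislabelled header "doesn't matter most of the time".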

  18. Not that simple by fm6 · · Score: 1
    Or use a stylesheet on the document and adapt it for, say, mobile devices
    That only works if you're transforming XML that imposes some kind of structure on the document. OO XML doesn't. Here's some pretty typical OO XML:
    <text:p text:style-name="Standard" >All
    <text:span text:style-name="T1" >work</text:span>
    and no
    <text:span text:style-name="T2" >play</text:span>
    makes
    <text:span text:style-name="T3" >Jack</text:span>
    a dull
    <text:span text:style-name="T4" >boy</text:span>.
    </text:p>
    As you can see, they only use a couple of tags (P and SPAN) for most content. Formatting is kept elsewhere (in CSS sheets), which does make the file easier to process. But it doesn't impose any kind of structure, since nothing identifies the particular pieces except the style-name attribute -- and most of these are chosen arbitrarily. So there isn't enough information to transform the content into other markups. You can always just strip out the formatting information -- but you can do that with any word processor format!
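To make the point concrete, here is a sketch that parses the fragment above with Python's standard library. The fragment omits its namespace declaration, so the URI bound to the `text:` prefix below is a placeholder assumption, not the real OOo namespace.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the real one is declared elsewhere in the file.
TEXT_NS = "urn:example:text"

SNIPPET = f"""<text:p xmlns:text="{TEXT_NS}" text:style-name="Standard">All
<text:span text:style-name="T1">work</text:span>
and no
<text:span text:style-name="T2">play</text:span>
makes
<text:span text:style-name="T3">Jack</text:span>
a dull
<text:span text:style-name="T4">boy</text:span>.
</text:p>"""

paragraph = ET.fromstring(SNIPPET)

# Stripping the markup is easy: join all text nodes, normalize whitespace.
plain = " ".join("".join(paragraph.itertext()).split())

# What structure is left: a list of (style name, text) pairs.
spans = [
    (span.get(f"{{{TEXT_NS}}}style-name"), span.text)
    for span in paragraph.iter(f"{{{TEXT_NS}}}span")
]

print(plain)
print(spans)
```

The plain text comes out fine, but the only hint about each span is a generated name like T1 or T2, so a transform has nothing to tell emphasis from a product name from a heading.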
    1. Re:Not that simple by neglige · · Score: 1

      You are right. It would be possible to assume the structure of the document from the style names, but this is a pretty dirty hack unless all your documents follow strict formatting rules (assign the correct paragraph 'style definition' to each paragraph).

      --
      My cats ate my karma. They also wrote this comment.
    2. Re:Not that simple by fm6 · · Score: 1
      Which is exactly what people already do to create structured documentation in Word and "unstructured" Framemaker. It works as long as you keep a close eye on how people are using styles. The minute they get careless, the whole thing breaks down.

      The OO people are experimenting with support for XML export by associating styles with XML tags. The latest version has a beta implementation of "simplified DocBook". But again, this is pretty much what people already do with Word and Framemaker. The difference is that OO is less of a mess than Word, and the OO people will certainly do a better job of defining a mapping mechanism than the one that comes with "structured" Framemaker. That last thing is very badly designed!

      When you're creating a structured document, you really want an authoring tool that writes directly to whatever XML application you're using. The problem with that is that you have to define this application before you can write a single word -- which makes this approach impractical for an ordinary productivity suite like Open Office.

      The one tool I really like for this is XMetal. You can feed it a DTD and a set of stylesheets, and it's instantly ready to do WYSIWYG editing of XML. Unfortunately, XMetal now belongs to Corel, which seems determined to destroy it in the name of .NET support.

      The leading XML/SGML editor is the Arbortext product (I forget what they changed the name to after the last rebranding). Unfortunately this one is expensive. Also, defining an XML or SGML application is non-trivial.

      I got all excited when I heard about LyX, an open-source "Document Processor" that supports structured documents. Unfortunately its native format is not structured. Once again, you have to define some kind of mapping between stylistic elements and your document structure.

      Interesting thread here.

  19. Uninformed public by SgtChaireBourne · · Score: 1
    No, actually they are facing an uninformed public...
    Yes, for right now. But people and businesses eventually start to ask themselves why they are running low on money. Or they ask themselves why they are spending so much time just trying to get/keep the MS machines running when all the other brands do just fine.

    I suppose the prohibition on product reviews and general criticism is contributing to the problem by preventing informed decisions. Likewise when the media refer to the Microsoft Worm / Virus of the Week as an "E-mail Worm" or "Internet Virus", they are effectively providing spin/damage control for a single company's design and production defects by neglecting the scope of the problem.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  20. Schema, DRM, server dependency - unresolved issues by SgtChaireBourne · · Score: 1
    There are still some serious unresolved issues with the way MS-Word 2003 handles XML and even with MS-Word 2003 itself.

    First, it seems that only two of the 6 flavors of MS-Word 2003 get the XML as touted. Second, the schema is still proprietary. Third, the application uses DRM so earlier versions are not compatible and must buy upgrades. Fourth, the DRM is dependent on MS-Server 2003 with expensive per-seat client licenses.

    So, at first glance, to use MS-Word 2003's XML format it looks like you need at least one installation of MS-Server 2003 plus client licenses. Then you will need new copies of Office 2003 across the board. Then to be allowed the privilege of accessing your data, you must keep paying the licensing fees. That allows a single point of failure to take down your whole workplace at either of two points: the server or the licensing fees.

    Making tools that go around all that would be a violation of copyright/DMCA and the Economic Espionage Act of 1996, and a computer crime, at the very least.

    So at first glance, it looks safer, more flexible, and less work to go with the OASIS format.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.