Slashdot Mirror


Alternatives To .DOC As Standard WP Format?

D. C. Sessions asks: "I'm on the Software Task Group of a standards body (JEDEC) which is, among other things, responsible for the DDR memory standard. You may have heard of it. Currently standards drafts must be submitted in an editable word processing format, which right now is interpreted as FrameMaker or MS Word. I find this not only offensive, but dangerous, given that these standards -should- outlive the current MS software that can manipulate them. I've gotten some sympathy on 'bit rot' from the rest of the committee based on showing what current flavors of Word do to documents saved with older versions, but the problem is this: What do I propose as a replacement?" Two that come to mind off the top of my head are LaTeX and, of course, HTML. Are there any other formats that can work just as well as .DOC in most situations and are cross-platform to boot?

"It should (obviously) be an open file format, preferably with an open source tool to access it. It absolutely must be usable on LoseBlows, should be usable on Mac, and (for my own sake) on Linux and Solaris. It must be capable of structured documentation, numbering, tables, and embedded vector graphics. I just don't know of such a beast at present."

6 of 205 comments

  1. No. by Eloquence · · Score: 4
    • HTML print results are unpredictable, formulas are hard to lay out, and page design is impossible.
    • LaTeX is bad at handling images, and there are no easy editors for the Windows platform.
    • RTF has been killed by Microsoft with dozens of different implementations. (Some of them omit important things like footnotes.)
    • SDW (Star Office) is just as proprietary as Microsoft's DOC, but supported by fewer platforms.
    • PDF is a print format, text extraction is more difficult, and it's bad for PDAs.
    • TXT is insufficient for most tasks.

    XML may be a way out, but there's no XML-based document format on the horizon. (I don't know about this Open E-Book stuff, though.) All in all, the OSS community has failed to provide an open, flexible document format that could compete with MS Word. I'm as unhappy with that as you are, but if you want to change it, all word processor developers must get together and formulate a standard. Is this ever going to happen? Note that most closed-source word processors want to bind their users to their product by using a proprietary, closed format.

  2. SGML/XML/DocBook by tobyjaffey · · Score: 5

    Use a nice SGML/XML application like DocBook. Tools for manipulation are free, and anyone can write DocBook, with or without specialist tools (it looks a lot like HTML to the layman).

    Don't use HTML; at the very least use XHTML, making sure that you segregate style from content. If you must use HTML, use stylesheets so that formatting is consistent.

    But, my recommendation would be to use DocBook (SGML) and use stylesheets and nice free parsers to output TeX, ASCII, RTF, HTML and whatever else people want.
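
    As a rough illustration of the "one source, many outputs" idea, here is a minimal sketch using only Python's standard library. The element names follow DocBook, but the fragment is an illustrative subset rather than a validated DocBook document, and the `to_text` and `to_html` helpers are made up for this example. Real DocBook toolchains use stylesheets instead of hand-written renderers.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny DocBook-style fragment; element names follow DocBook, but this
# is an illustrative subset, not a validated DocBook document.
SOURCE = """<article>
  <title>DDR Draft</title>
  <section>
    <title>Scope</title>
    <para>This draft covers timing.</para>
  </section>
  <section>
    <title>Signals</title>
    <para>Clock and strobe are differential.</para>
  </section>
</article>"""


def to_text(article):
    """Render numbered plain text from the parsed article."""
    lines = [article.findtext("title"), ""]
    for number, section in enumerate(article.findall("section"), start=1):
        lines.append(f"{number}. {section.findtext('title')}")
        for para in section.findall("para"):
            lines.append(f"   {para.text}")
    return "\n".join(lines)


def to_html(article):
    """Render the same tree as minimal HTML."""
    parts = [f"<h1>{escape(article.findtext('title'))}</h1>"]
    for number, section in enumerate(article.findall("section"), start=1):
        parts.append(f"<h2>{number}. {escape(section.findtext('title'))}</h2>")
        for para in section.findall("para"):
            parts.append(f"<p>{escape(para.text)}</p>")
    return "\n".join(parts)


article = ET.fromstring(SOURCE)
print(to_text(article))
print(to_html(article))
```

    Section numbers are generated at render time, so the source carries structure only and each output format decides how to present it.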

  3. Some Suggestions by bhurt · · Score: 4

    Consider using TeX/LaTeX, PostScript, or an XML/SGML variant, like DocBook or HTML.

    Basically, what you want is a format that fits the following criteria:
    1) The original text can be easily gotten out of the format. This way even if the programs that read the file go the way of the dodo, future programs could still recover the data.
    2) The specification is fully open and documented, and preferably stable and mature.
    3) At least one open-source program handles displaying/converting the format. I would recommend storing a copy of this program in the same place as the standards themselves- including shipping source with standards CDs.

    You've gotten over the hardest part already- you've realized you have a problem.

    Brian

  4. Re:It may seem incredibly redundant... by dbarclay10 · · Score: 5

    I'm sorry, but I have to disagree with you.

    XML is nothing more than a concept - you store data and text within "tags". The tags can be of pretty much any name. The data can be anything. This isn't a standard, it's not even a format.

    Basically, XML boils down to: store it in a text file, delimit data, fields, and content by tags. Sorry, that doesn't cut it. You have to do more.

    No, if you want to think about using XML for this, you need to talk about the DTD, not XML itself.

    So, the question becomes, which DTD? In order to compete with the competition (LaTeX, HTML, PostScript), it has to be: device-independent, easily rendered, easily edited, and extremely comprehensive.

    Don't shout "XML!!". XML, without a DTD, is almost useless, especially for this application. The DTD has to be all those things I mentioned, plus (for this application), it needs to be standard.
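
    To make the point concrete, here is a minimal sketch showing two documents that are both well-formed XML, only one of which fits an agreed vocabulary. Python's standard library does not validate against DTDs, so the `ALLOWED_CHILDREN` table and `find_violations` function below are a hand-rolled, hypothetical stand-in for what a DTD expresses, not real DTD validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: each element maps to the child elements it
# may contain. A real DTD also constrains order, counts, and attributes.
ALLOWED_CHILDREN = {
    "article": {"title", "section"},
    "section": {"title", "para", "section"},
    "title": set(),
    "para": set(),
}


def find_violations(element, path=""):
    """Return a list of messages for elements the model does not allow."""
    here = f"{path}/{element.tag}"
    if element.tag not in ALLOWED_CHILDREN:
        return [f"{here}: unknown element"]
    problems = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            problems.append(f"{here}: <{child.tag}> not allowed here")
        else:
            problems.extend(find_violations(child, here))
    return problems


good = ET.fromstring(
    "<article><title>T</title>"
    "<section><title>S</title><para>p</para></section></article>"
)
bad = ET.fromstring("<article><title>T</title><blink>oops</blink></article>")

print(find_violations(good))
print(find_violations(bad))
```

    The parser accepts both inputs without complaint; only the shared content model tells a reader or a tool that the second one is not a document of the agreed type.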

    Dave

    Barclay family motto:
    Aut agere aut mori.
    (Either action or death.)

  5. LoseBlows by aphr0 · · Score: 5

    Thanks for showing the maturity everyone has come to expect from the linux community.

    Hey linsux users - grow up.

  6. It may seem incredibly redundant... by Gendou · · Score: 4

    ...but I think XML is the clear answer here. XML is already very mature, can be used in a number of situations, and can incorporate more than just text.

    You can even embed binary data in an XML document (with a tiny bit of creativity) for all those people who like to populate their files with custom fonts, clipart, graphs, etc. (This is accomplished through something, say... <BINARY CLIPART><DATA>[image data]</DATA></BINARY CLIPART>. You get the idea.)
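
    One common way to do this is to base64-encode the bytes and store the result as element text. The sketch below uses Python's standard library; the `binary` element and its `kind` and `encoding` attributes are invented for the example. Note that XML element names cannot contain spaces, so the "clipart" part is carried as an attribute here.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in bytes; a real document would read these from an image file.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# Element and attribute names here are made up for the example.
doc = ET.Element("document")
clip = ET.SubElement(doc, "binary", {"kind": "clipart", "encoding": "base64"})
clip.text = base64.b64encode(image_bytes).decode("ascii")

serialized = ET.tostring(doc, encoding="unicode")
print(serialized)

# Round trip: parse the text back and decode the payload.
parsed = ET.fromstring(serialized)
recovered = base64.b64decode(parsed.find("binary").text)
assert recovered == image_bytes
```

    Base64 grows the payload by roughly a third, which is the usual price for keeping the whole file plain text.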

    How about special configuration parameters? You could incorporate tags that would handle the way a document is viewed by different people ("are you a techie, marketing drone, webbie, etc" -> certain data becomes visible).
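
    A minimal sketch of that idea, assuming a hypothetical `audience` attribute on paragraphs: content with no attribute is shown to everyone, and tagged content is shown only to the matching reader. The `visible_text` function and the element names are invented for the example.

```python
import xml.etree.ElementTree as ET

SOURCE = """<document>
  <para>Shared summary.</para>
  <para audience="techie">Timing tables.</para>
  <para audience="marketing">Positioning.</para>
</document>"""


def visible_text(root, reader):
    """Return text of paragraphs that are untagged or tagged for this reader."""
    return [
        para.text
        for para in root.findall("para")
        if para.get("audience") in (None, reader)
    ]


root = ET.fromstring(SOURCE)
print(visible_text(root, "techie"))
```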

    The biggest advantages here are obviously the standards provided by XML (thank you W3C). Its uses are broad. It's got high quality interpreters on ALL platforms (especially JAXP for Java - it's a joy to work with *g*).

    The only standards we'd really have to focus on would be which tags would be considered "key" tags.

    What else do you need? Doesn't OpenOffice already use XML as its standard document type?

    Sure I could be wrong on this, so don't berate me too much. I've just had a lot of positive experience working with XML for sooo many different applications.