Slashdot Mirror


Opera CTO Hits Back at Microsoft's Standards Push

Michael writes "Opera CTO Håkon Wium Lie hit back today at Microsoft's push to fast track Office Open XML into an ISO standard, in a blistering article on CNET. He also took a swipe at Open Document Format: 'I'm no fan of either specification. Both are basically memory dumps with angle brackets around them. If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML). But I think there is a better way.' The better way being the existing universally understood standards of HTML and CSS. Putting this to the test, Håkon has published a book using HTML and CSS."

19 of 246 comments

  1. fsck'n ugly by Anonymous Coward · · Score: 5, Insightful

    Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX. I hope that isn't the "solution" to this standards "problem". Let's face it, the average Joe is going to use whatever Microsoft pushes at them. Case closed.

    1. Re:fsck'n ugly by AKAImBatman · · Score: 5, Insightful

      Yeah, but that "book" is fsck'n ugly. It doesn't even compare to a professionally typeset book, or something produced in LaTeX.

      You don't typeset with Microsoft Word, either. Which makes the entire argument specious. Word processors like MS Word and OOo Writer are for creating common documents like letters, memos, and maybe the occasional flyer. Neither one is particularly good at anything even close to professional publishing work. Even the book authors just use Word (or surprisingly, OOo Writer!) to do the text content. That text is then exported to a more sophisticated program, where the actual typesetting and page layouts are done.

      I think this fellow's point is that HTML/CSS formats can store any information that a Word Processor might need to store, with no need to invoke new technologies. To a certain extent, he may be correct. Unfortunately, HTML/CSS may make a good intermediary format, but it is not particularly good from a performance or usability perspective. Then again, XML formats in general are fairly poor choices for the same reason.

      I think if we want to break this conundrum, the industry is going to have to learn how to keep local data stores that are of high performance, while exporting intermediary formats when emailing or uploading to external computers. The only problem is finding a way of doing this so that it's completely transparent to users. The mythical "mom" doesn't want to worry about emailing a document in the right format, or having the right program to read the attachment she received. She just wants it to do what she tells it, with no bloody prompting with questions she has no answers for.
    2. Re:fsck'n ugly by EvanED · · Score: 5, Informative

      html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)

      This would have to be done by the tool displaying it, same as a self-updating TOC in a Word or OpenOffice Writer document. The information is present in a correctly-structured HTML document in the form of Hx tags.

      Hell, how can you even tell the page numbers in a html "document" anyways?

      The same way you would in a Word document. It doesn't make sense if you're looking at it as a web page in your browser, but if your editor used HTML it would work the same way. (This also partially alleviates the rendering issues.)

    3. Re:fsck'n ugly by Anonymous Coward · · Score: 5, Insightful

      I don't see (x)html + css as being the answer either:
      Only because you can't tell the difference between "XHTML + CSS" and "web pages".

      -too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
      So? Pick one as your word-processor standard, and rule all the others out. The existence of too many versions of MS Word doesn't seem to have hurt the .doc format.

      -different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
      What does browser support have to do with word processing? We're talking about word processors, not web sites.

      -too many rendering engines, css hacks required so the content displays the same in most of them, etc
      And this is different from word processors how? Microsoft's XML format is absolutely crammed full of hacks to duplicate obscure rendering features of obsolete versions of Word, WordPerfect, etc. And it would surprise me very much if the rendering of ODF was pixel-identical between all the products that support it.

      -html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)
      You're thinking of web pages, not HTML. HTML used for a document could easily have an auto-generated table of contents. Remember that we're talking about using HTML as the file format for a word processor. A word processor can trivially parse the DOM for header tags and update a table of contents without requiring any JavaScript at all. It's kind of what word processors are for.
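
      To make the "parse the document for header tags" point concrete, here is a minimal sketch in Python rather than JavaScript. It assumes the document body is available as an HTML string; the names HeadingCollector and build_toc are invented for illustration and do not come from any real word processor.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Collapse internal whitespace so wrapped headings stay on one line.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_toc(html):
    """Return the table of contents as indented plain-text lines."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


sample = """
<h1>Introduction</h1><p>Some text.</p>
<h2>Background</h2><p>More text.</p>
<h1>Conclusion</h1>
"""
print("\n".join(build_toc(sample)))
```

      An editor could rerun something like build_toc after every change to a heading, which would keep the table of contents in sync without any script stored in the document itself.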

      Hell, how can you even tell the page numbers in a html "document" anyways?
      By looking at the little "Page N of N" display in your word processor, I would assume.

      -while word/OOo formats aren't real typesetting (like InDesign CS2 would do), at least they have half-way decent typography. Yeah, no fancy glyphs or super precise kerning, but it's still usable. On the web there's only a handful of "just OK" fonts one can use (unless everything is rendered server-side as images).
      What does "on the web" have to do with word processors? We're not talking about the web here. We're talking about word processors, which will have access to all the fonts the user owns, just like any other application.

      -if people use html/css, there would basically be no standards *at all* or anything even resembling it (much like anything we see on the web).
      Why not? We're talking about word processors, not the web. We're talking about computer-generated HTML, not something some 13-year-old hacked together by copying-and-pasting examples into Notepad. It would be trivial to enforce valid XHTML 1.1 + CSS2.1, for example.
    4. Re:fsck'n ugly by Lost+my+low+ID+nick · · Score: 5, Insightful

      So, McSmarty, how do I
        - position an image on page 4 of my document?
        - add footnotes?
        - embed fields (date, last editor...)?
        - mark the embedded TOC as TOC so that it gets regenerated on reload?
      etc.

      And on the CSS side, there are quite a lot of shortcomings, too.

      Of course, all of this would work with custom XML tags or special id/class conventions, BUT then you'd have to specify those. And getting this below 700 pages won't be easy.

      So repeat after me:

      HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.

    5. Re:fsck'n ugly by TheRaven64 · · Score: 5, Interesting
      I had a little go at using HTML for this kind of thing a few years ago. One thing that you might not be aware of is that CSS has a few things related to pagination. While you can't say 'put this image on page 4,' you can say 'if you need to put a page break in, put it before or after this div, so that this text and this image are on the same page.' For the table of contents, I wrote some ECMAScript that scanned the DOM tree for h1-4s and built a set of nested lists to display it, with links to the real headings. It didn't print the page number because, although this is possible with CSS, it wasn't implemented in any browsers when I tried it. The embedded fields are already supported by meta tags in the document head. Footnotes, however, are a tremendous pain to get right with HTML.

      I just dug out the template I wrote, and the pagination and ToC worked fine in Safari. The auto-numbering of headers, however, didn't. This is due to a lack of support for counters in generated content, and the same problem with Mozilla was a significant reason for abandoning the whole idea in the first place; the only browser everything worked in was Opera.
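
      For anyone wondering what the auto-numbering of headers amounts to, here is a rough Python sketch of the bookkeeping: one counter per heading level, incremented when a heading at that level appears and dropped when a shallower heading comes along. The function name number_headings is made up for this example, and the sketch is only an approximation of what counters in generated content would produce, not a reimplementation of the CSS rules.

```python
def number_headings(levels):
    """Return hierarchical labels ("1", "1.1", ...) for a sequence of heading levels."""
    counters = []   # counters[i] holds the current count for level i + 1
    labels = []
    for level in levels:
        if level < 1:
            raise ValueError("heading levels start at 1")
        # Going deeper: create any missing counters, starting at zero.
        while len(counters) < level:
            counters.append(0)
        # Coming back up: forget the counters of deeper levels (a reset).
        del counters[level:]
        # Count this heading at its own level (an increment).
        counters[level - 1] += 1
        labels.append(".".join(str(n) for n in counters))
    return labels


# Heading levels as they might appear in a document: h1, h2, h2, h1, h2, h3.
print(number_headings([1, 2, 2, 1, 2, 3]))
# prints ['1', '1.1', '1.2', '2', '2.1', '2.1.1']
```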

      Another significant reason for abandoning this idea (not entirely relevant when talking about document formats being generated by tools) was that HTML is a huge pain to type, and XHTML is even worse. Something semantically equivalent to XHTML but using S-expressions would have been fine, but typing XHTML just involves spending far too much time hitting > and < keys (not to mention the redundancy of close tags having the full tag name). I turned to LaTeX, which is easier to type and also (being a Turing-complete programming language) much easier to extend than HTML.

      --
      I am TheRaven on Soylent News
    6. Re:fsck'n ugly by EsbenMoseHansen · · Score: 4, Informative

      So, McSmarty, how do I
      - position an image on page 4 of my document?

      You don't, nor do you want to. But you can anchor, float or bind the images to the text easily enough. This would be handled by CSS... for the HTML side, it would just be div and object tags --- not that you would ever see them, since this is a word processor.

      - add footnotes?

      <p class="footnote">My footnote</p> with the appropriate CSS rule (presumably something like float: page or whatever.)

      - embed fields (date, last editor...)?

      Using XML entities, presumably

      - mark the embedded TOC as TOC so that it gets regenerated on reload?

      Regenerated on reload? Come on, have some ambition... it should be in sync at all times. Anyway, by keeping track of the header tags, presumably.

      HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.

      XHTML+CSS would need some extensions... but probably not many. A good layout program probably doesn't care about the device, but if it did, there are already @media rules to handle these situations. There are also a couple of other truly dedicated layout namespaces at the W3C to consider.

      But all this matters not. This is politics. Sadly.

      --
      Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
  2. Classic quote for the books, gotta love XML play by Tablizer · · Score: 5, Insightful

    "Both are basically memory dumps with angle brackets around them."

  3. huh? by User+956 · · Score: 4, Funny

    Putting this to the test, Håkon has published a book using HTML and CSS.

    Uhm. I'm no expert, but isn't a book that uses HTML and CSS called a website?

    --
    The theory of relativity doesn't work right in Arkansas.
    1. Re:huh? by 8-bitDesigner · · Score: 5, Informative

      Actually, one of the highlights of the CSS spec is support for non-standard display types, such as screen readers, projectors, PDAs, and yes, print. CSS is a rather brilliant standard, but since the W3C hasn't really seen fit to publish a reference platform for it, there's no real compliance checking in the major browsers.

  4. CSS for Documents? by zaydana · · Score: 5, Insightful

    Having a word processor act more like a web browser would be awesome. Ever since I started using word processors (which for me was a long time after I started using web browsers), I've always thought, why doesn't updating this style make all text with that style update? Why do I always have to change the same thing over and over again?

    While turning word processors into web browsers would be stupid, things like CSS would be awesome to have in word processors.

  5. I don't know that I agree completely by Evardsson · · Score: 5, Insightful

    While I do agree that the ISO doesn't need more than one standard for printable documents, I don't think that Håkon Wium Lie is on the right track with HTML/CSS for print.

    Sure, it works, with enough tweaking, and CSS3, and a $350 download of a product to turn HTML/CSS3 into a PDF. This is better how? What about LyX, LaTeX, or even OpenOffice if you are just going to convert to PDF?

    The whole HTML/CSS-to-print thing shoots the real argument in the foot.

    --
    Death looks every man in the face. All any man can do is look back and smile. - Marcus Aurelius
  6. Re:Is it mature enough? by willy_me · · Score: 5, Informative

    I'm a latex junkie. Latex though is a PITA to create templates and styles for. Someone willing to take up the task to modernize latex or completely replace it?
    Done. It's called ConTeXt.
  7. How come? by ShaunC · · Score: 5, Funny

    If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML).
    So I'd ask Håkon, "how come?" :)
    --
    Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
    1. Re:How come? by PCM2 · · Score: 4, Informative

      So I'd ask Håkon, "how come?" :)

      Since nobody gets it, I'll spoil it: That's how Håkon advises people to pronounce his name. It's even on his business card.

      --
      Breakfast served all day!
  8. Re:Is it mature enough? by indiechild · · Score: 4, Informative

    Tables are not obsolete. Tables are still used for tabular data, which is what they were originally intended to be used for, and that has not changed.

    Tables shouldn't be used for page layout -- that's what CSS is for. It's as simple as that.

  9. Re:Is it mature enough? by MrNaz · · Score: 4, Funny

    You mean you display tabular data *without* tables? Dude, you missed the point in a big way. Like say for example Andre Agassi was serving a tennis ball at you, by "missed" I mean he was serving the ball on a court in California while you were standing waiting to receive on a court in Florida.

    --
    I hate printers.
  10. Open Office Heresy (sold here) by IBitOBear · · Score: 5, Interesting

    I use OpenOffice. I support Open Document Format over MS/XML and .doc.

    That said, ODF kind of blows. Really.

    I write novel-length "books" and it is FREAKING IMPOSSIBLE to do some very basic things in any/every ODF based word processor I have tried to date.

    Exercise for the Interested:

    Make a "Book" with an automatic table of contents, said table to contain an "Authors Note", "Prologue", auto-numbered chapters 1 to N with their associated chapter titles (where the actual chapter number is the chapter number internal variable), and finally "Epilogue" all at the same level of the index.

    This simple task is essentially impossible. The flaw is caused by the fact that everything goes through the "styles" and the styles don't inherit their list membership properties. You should be able to make a style "TOC Entry" that is assigned to a particular table of contents level (e.g. level 1) then make a sub-style "Chapter Heading" based on "TOC Entry" but with the chapter numbering magic attached, and in so doing, create "different styles" that go to the same level/point in the list.
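
    The behaviour being asked for can be sketched in a few lines of Python. This is purely a hypothetical model of the wished-for inheritance, not how ODF or OpenOffice actually represents styles; the Style class, the toc_level and numbered properties, and level_one_toc are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Style:
    """A hypothetical paragraph style; unset properties fall back to the parent."""
    name: str
    parent: Optional["Style"] = None
    props: dict = field(default_factory=dict)

    def resolve(self, key):
        """Look up a property here first, then up the chain of parent styles."""
        style = self
        while style is not None:
            if key in style.props:
                return style.props[key]
            style = style.parent
        return None


# "TOC Entry" carries the list membership; "Chapter Heading" inherits it
# and adds only the chapter-numbering behaviour.
toc_entry = Style("TOC Entry", props={"toc_level": 1})
chapter_heading = Style("Chapter Heading", parent=toc_entry, props={"numbered": True})

paragraphs = [
    ("Author's Note", toc_entry),
    ("Prologue", toc_entry),
    ("The Beginning", chapter_heading),
    ("The Middle", chapter_heading),
    ("Epilogue", toc_entry),
]


def level_one_toc(paragraphs):
    """Build level-1 TOC entries, numbering only paragraphs whose style asks for it."""
    entries = []
    chapter = 0
    for title, style in paragraphs:
        if style.resolve("toc_level") != 1:
            continue
        if style.resolve("numbered"):
            chapter += 1
            entries.append(f"Chapter {chapter}: {title}")
        else:
            entries.append(title)
    return entries


print("\n".join(level_one_toc(paragraphs)))
```

    Because "Chapter Heading" never sets toc_level itself, both styles land at the same level of the table of contents, which is the "different styles, same point in the list" outcome described above.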

    Exercise for the Interested:

    Make a "Book" with each chapter, and the prolog, and the epilog in separate sub documents. The linkage thing is a mess, it is hard to move "the pile of files" around especially if you want to use subdirectories (etc). If you have a custom style in the master document style list you have to _USE_ it in the master document if you want it to be pushed into the created sub-documents. Once the sub-documents are created it is a royal pain (read effectively impossible, or "supremely hidden feature required") to update those styles in those sub documents if you change that style.

    Exercise for the Interested:

    Put three separate "outlines" into one ODF Document. In ODF the outline is a function of the style headers, they only exist as implications of structure instead of first class abstractions. This is largely the fault of Microsoft Word, since the Word folks totally messed this up when they supplanted WordPerfect (which did this inset outline/object sort of thing right).

    ODF was, IMHO, poisoned by the slavish attempt by someone trying to make a Word killer instead of a "good word processor."

    And there are stacks more of these issues.

    And all that said, I *STILL* use ODF (Open Office etc) because I CATEGORICALLY REFUSE to _RENT_ the right to access my own work from a third party. Microsoft has plainly stated that such rental model is their intended business plan, which makes them a non-starter.

    In my opinion, having used both Word and OpenOffice for years, and having used WordPerfect and WordStar before them, ODF is a "workmanlike effort" to create a document format suitable for "normal business purposes". There is a reason that the legal profession never moved over to Word, and they likewise will not move to ODF: when you need to get to a tightly prescribed document format, both Word and ODF have a "you can't get there from here" fundamental limitation. Both formats simply refuse to represent some things because the designers "know" that a different format is better. Neither ODF nor Word has any allowances for _art_, professional or poetical.

    So, governments should use ODF because it is "no worse" than Word in terms of the ability to represent the documents it can represent, and given that congruence, the shorter, 100% open standard is, or should be, a hard minimum requirement.

    In terms of ODF being the be-all and end-all of document representation, I'd have to say "hardly!" I looked into the OpenOffice code base a while back to see if adding/changing the format to allow for "a book" would be reasonable. It didn't appear to be. Too many of the original StarOffice assumptions about document structure seemed pathologically uninspired. It was like looking at a big pile of Visual Basic. Everything in the standard is way too global, nothing "nests organically" it all nests pedagogically. (Every

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  11. Um... NO by salesgeek · · Score: 4, Informative

    ODF is not about web pages or word processing. It's a standard for office documents including spreadsheets, presentations and word processing. That's a big difference from what Opera's CTO is talking about. CSS/HTML might make a good format for one part of the suite (word processing) with a lot of work on the standard. The issue: that's not what is needed for a standard. It's about doing for office documents what HTML did for websites. ODF is actually an opportunity for Opera: extend the browser to support ODF so people can post ODF documents, make dynamic applications render to ODF and so on. It takes the web to the next level and further erodes the big monopoly.

    --
    -- $G