Opera CTO Hits Back at Microsoft's Standards Push
Michael writes "Opera CTO Håkon Wium Lie hit back today at Microsoft's push to fast track Office Open XML into an ISO standard, in a
blistering article on CNET. He also took a swipe at Open Document Format: 'I'm no fan of either specification. Both are basically memory dumps with angle brackets around them. If forced to choose one, I'd pick the 700-page specification (ODF) over the 6,000-page specification (OOXML). But I think there is a better way.' The better way being the existing universally understood standards of HTML and CSS. Putting this to the test, Håkon has published a book using HTML and CSS."
Actually one of the highlights of the CSS spec is support for non-standard display types, such as screen readers, projectors, PDAs, and yes, print. CSS is a rather brilliant standard, but since the W3C hasn't really seen fit to publish a reference platform for it, there's no real compliance checking in the major browsers.
Such things exist. TeX provides a decent base for such things, so it's a matter of finding a TeX-centric editor. LyX would be a good example, and indeed it has the sort of functionality and general approach to document creation that you seem to be after. Of course it doesn't necessarily have all the other features that other word processors might have (like mail merge or what have you).
Tables are not obsolete. Tables are still used for tabular data, which is what they were originally intended to be used for, and that has not changed.
Tables shouldn't be used for page layout -- that's what CSS is for. It's as simple as that.
You're entirely right. Word/OOo aren't used for pro typesetting and page layout. But if we exclude that, then we still have many, many other formats, like RTF too (or why not even BBCode while we're at it?). Yes it's quite ugly, but I don't see (x)html + css as being the answer either:
;)
-too many versions of html (4, and perhaps 5 soon) and xhtml (1.0, 1.1, strict, transitional, etc)
-different versions of CSS, browser support for it varies quite a bit (and is pretty much non-existent for CSS3)
-too many rendering engines, css hacks required so the content displays the same in most of them, etc
-html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!) Hell, how can you even tell the page numbers in a html "document" anyways?
-while word/OOo formats aren't real typesetting (like InDesign CS2 would do), at least they have half-way decent typography. Yeah, no fancy glyphs or super precise kerning, but it's still usable. On the web there's only a handful of "just OK" fonts one can use (unless everything is rendered server-side as images).
-if people use html/css, there would basically be no standards *at all* or anything even resembling it (much like anything we see on the web). And I'm not sure the W3C is really going to help much here... Not that their recommendations are implemented very quickly (so many nice standards, but with basically no support e.g. xforms). And I'm not sure they're really being too helpful anymore either - more like slow and misguided IMO.
At least with the new formats you're starting fresh, with the chance to have most features (like a Table of Contents), and have them implemented properly. Mind you I'm not saying the new word/OOo XML formats are perfect - nor even the answer to the problem in the first place...
And yeah, it's not like (x)html has angle brackets either
Looks to me like Opera has only one tool: a hammer (or is that a web browser?) and everything is strangely starting to look an awful lot like a nail?
html/css sucks at MANY things - how about a self-updating TOC? (don't even try to say some javascript parsing the DOM for header tags with certain IDs to generate it dynamically!)
This would have to be done by the tool displaying it, same as a self-updating TOC in a Word or OpenOffice Writer document. The information is present in a correctly-structured HTML document in the form of Hx tags.
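To make the point about Hx tags concrete, here is a minimal sketch of how a tool could rebuild a TOC from the headings alone. It uses only Python's standard `html.parser`; the names `HeadingCollector` and `build_toc` are made up for illustration, and a real editor would also need anchors and page numbers.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1..h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs, in document order
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # Tags arrive lowercased; match h1..h6 but not e.g. <hr>.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None

    def handle_data(self, data):
        # Keep text only while a heading is open (nested inline tags included).
        if self._level is not None:
            self._parts.append(data)


def build_toc(html):
    """Return one indented line per heading, two spaces per nesting level."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return ["  " * (level - 1) + text for level, text in parser.headings]


if __name__ == "__main__":
    doc = "<h1>Intro</h1><p>x</p><h2>Why <em>CSS</em></h2><h1>End</h1>"
    print("\n".join(build_toc(doc)))
```

Rerunning `build_toc` whenever the document changes is all "self-updating" would mean here; no IDs on the headings are needed.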
Hell, how can you even tell the page numbers in a html "document" anyways?
The same way you would in a Word document. It doesn't make sense if you're looking at it as a web page in your browser, but if your editor used HTML it would work the same way. (This also partially alleviates the rendering issues.)
I don't know if there's an automated way, especially because you run into the problem of differences in rendering. But, if you are on Linux, just install CUPS-pdf or on Windows, use PDFCreator (http://sourceforge.net/projects/pdfcreator/). Both are print drivers so you can use the HTML/CSS rendering engine of your choice (pick a browser), then print.
File | Export to Web? Am I missing something here?
Yes, the fact that he used a program called Prince to generate a reasonably professional-looking "book". Not "printed web page". Book.
Funny, he didn't mention that he "wrote" the book in HTML, just that he "published" it in HTML.
"It is now possible, even feasible, to use HTML as the document format for books." (Granted, that's two links off the /. summary.) But "To prove how powerful it can be, the authors decided to use CSS in the production process" is following only one link.
That PDF posted above was generated entirely from an HTML + CSS document.
The problem with using HTML for publishing is that to this day there is no viable downloadable font system. So you are limited to a lowest-common-denominator list of 2-3 fonts like Verdana and Times New Roman. With Flash and PDF you can do a lot more, but obviously authoring becomes a problem.
It can be done, some of the time, but it's very, very easy to mess up. I have tried numerous times to get Japanese support, using one of the several special Japanese versions that exist (it seems it simply can't be done with standard TeX), and only once did I manage to generate a DVI - which I was unable to convert to a usable format, because doing so always stripped out all Japanese text, for some reason I never managed to fathom.
And this is all fair enough, because TeX was written to scratch Knuth's itch, and therefore it does what Knuth needed very well: it's brilliant for typesetting English and mathematics. Unfortunately that doesn't make it the solution to all the world's typesetting problems.
I hate to say it, but "inferior" products like MS Word, OpenOffice.org, etc. have supported Arabic, Hindi, Chinese, and Japanese perfectly for as long as I can remember. Largely because they use Unicode internally, rather than one of the numerous inadequate and non-standard encodings that TeX and its derivatives rely on.
To be fair, there's a Unicode version of TeX called Omega or some such. I'd doubtless have found it very useful if I'd ever managed to get it to work at all.
And it worked out great.
http://software-libre.rudd-o.com/
Used MediaWiki to write the chapters, wrote a small Python proggie (available there) to consolidate the wiki into a single HTML file (mostly conforming to the Boom! microformat), then used Prince and Hakom's book CSS to generate the PDF.
Great typesetting, collaborative book editing, screw LaTeX!
Hakom was right.
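For anyone wondering what the "consolidate into a single HTML file" step might look like, here is a rough sketch. It is not the script from the site above; `consolidate`, the `chapter` class, and the `book.css` filename are all assumptions for illustration, and the chapter bodies are taken to be HTML fragments already exported from the wiki.

```python
from html import escape


def consolidate(title, chapters, stylesheet="book.css"):
    """Join (heading, body_html) pairs into one HTML document string.

    Headings and the title are escaped; body_html is assumed to be
    trusted, already-valid HTML and is inserted unchanged.
    """
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        f'<link rel="stylesheet" href="{escape(stylesheet)}">',
        "</head>",
        "<body>",
    ]
    for heading, body_html in chapters:
        # One wrapper per chapter so the print stylesheet can target it.
        parts.append('<div class="chapter">')
        parts.append(f"<h1>{escape(heading)}</h1>")
        parts.append(body_html)
        parts.append("</div>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    book = consolidate("My Book", [("One", "<p>a</p>"), ("Two", "<p>b</p>")])
    with open("book.html", "w", encoding="utf-8") as fh:
        fh.write(book)
    # The resulting file would then be handed to a formatter such as Prince.
```

The point of the wrapper div is that page breaks, running headers and the like can then be expressed purely in the print CSS rather than in the script.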
Rudd-O - http://rudd-o.com/
If you want to stay in LaTeX use the memoir document class.
Since nobody gets it, I'll spoil it: That's how Håkon advises people to pronounce his name. It's even on his business card.
An example of the HTMLDOC specific code used in the conversion
ODF is not about web pages or word processing. It's a standard for office documents including spreadsheets, presentations and word processing. That's a big difference from what Opera's CTO is talking about. CSS/HTML might make a good format for one part of the suite (word processing) with a lot of work on the standard. The issue: that's not what is needed for a standard. It's about doing for office documents what HTML did for websites. ODF is actually an opportunity for Opera - extend the browser to support ODF so people can post ODF documents, make dynamic applications render to ODF and so on. It takes the web to the next level and further erodes the big monopoly.
I can also define short commands like \code{} for inline code snippets (e.g. variable or structure names) and then decide how I want them typeset later. I have a \note{} command defined, that puts the note in square brackets, blue underlined when I compile a draft, and doesn't display it at all when I compile a final version.
The other nice thing about LaTeX is that it works with all of my standard development tools. I can keep my document in subversion, and have human-readable output to svn diff. I have a Makefile that generates all of my inclusions (e.g. graphs from gnuplot, images from OmniGraffle) and then typesets my document.
The only problem with LaTeX is that it's not really a well-defined format. A LaTeX document is basically a program that generates a document. My source code is pretty easy to read, because I use English words for typesetting commands and then define them in my document class. Without the document class file, someone would be able to extract most of the semantic information from my source, but they would find it hard to generate my output.
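As a rough illustration of that last point, semantic commands can be pulled out of the source without the class file, even though nothing here can typeset them. This is only a sketch: `semantic_markup` is a hypothetical helper, the command list is limited to `\section` plus the `\code` and `\note` commands mentioned above, and nested braces are deliberately not handled.

```python
import re

# Matches \section{...}, \code{...} and \note{...} with brace-free arguments.
COMMAND = re.compile(r"\\(section|code|note)\{([^{}]*)\}")


def semantic_markup(source):
    """Return (command, argument) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in COMMAND.finditer(source)]


if __name__ == "__main__":
    sample = r"\section{Intro} Set \code{count} first. \note{check this}"
    for command, argument in semantic_markup(sample):
        print(command, "->", argument)
```

Getting from these pairs back to the original layout is the hard part, since that lives in the macro definitions.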
I am TheRaven on Soylent News
- position an image on page 4 of my document?
You don't, nor do you want to. But you can anchor, float or bind the images to the text easily enough. This would be handled by CSS... for the HTML side, it would just be div and object tags --- not that you would ever see them, since this is a word app.
- add footnotes?
<p class="footnote">My footnote</p> with the appropriate CSS rule (presumably something like float: page or whatever.)
- embed fields (date, last editor...)?
Using XML entities, presumably.
- mark the embedded TOC as TOC so that it gets regenerated on reload?
Regenerated on reload? Come on, have some ambition... it should be in sync at all times. Anyway, by keeping track of the header tags, presumably.
HTML is *not* a description language suitable for word processing in its current state, and it is unclear it can be made so without sacrificing device independence.
XHTML+CSS would need some expansions... but probably not much. A good layout program probably doesn't care about the device, but if it did, there are already @media rules to handle these situations. There are also a couple of other truly dedicated layout namespaces on w3 to consider.
But all this matters not. This is politics. Sadly.
I suggest that you read a bit about both formats and how they were developed. And actually look at the XML samples of both. Google it, it's not so hard.