Could LaTeX Replace HTML?
Acheon asks: "I recently learned to use LaTeX and I wondered why it couldn't be turned into the next standard for online documents. After all, most features of LaTeX make it either easier or more powerful than HTML, such as pagination (pure HTML 4.0 is a nightmare to code by hand) and scientific notation. It is much more suited for scripting, much more standard and readable, as well as more versatile. Also, HTML to LaTeX transcription is already feasible, so the only big feature missing for LaTeX to be supported in browsers would be linking, perhaps object embedding. On the other hand I don't know of any project going in that direction, which is quite a surprise to me given the huge interest in LaTeX and the omnipresence of such documents in many areas."
This is a rather naive question- have you used LaTeX at all? I say this as a dedicated LaTeX user: LaTeX just isn't suited for web applications for a huge number of reasons.
First of all, you say HTML is a nightmare to code in. Perhaps if you are trying to go all the way with CSS, sophisticated visual layout, and so forth, but I can knock out a simple, standards-compliant web page in 15-20 minutes. Not a pretty one, but a functional one. I can do that with LaTeX, but only with a library of templates which I have built up over the years. You just can't do LaTeX quick-and-dirty. It's not designed for it.
Second, there is the issue of visual formatting. LaTeX and HTML both, in theory, are based on the principle of content-based markup- you specify the data in content terms, and the browser/LaTeX engine determines how best to format it for display. Anyone who has ever used either of these languages knows that this is a total lie, especially for HTML. All professional HTML work centers on various hacks to achieve direct visual formatting of the page, something which HTML is fortunately quite amenable to. LaTeX, on the other hand, is a huge pain in the ass if you're trying to control the look and layout of a document- the LaTeX engine knows what's best, and it's sure as hell not going to take advice from you! You can do visual formatting the proper way, by redefining commands and LaTeX variables to get LaTeX to understand the visual format you are looking for. However, this is an enormous time outlay, and is completely impractical for anything less than, say, a book.
More fundamentally, LaTeX and HTML, although they were originally conceived for similar purposes (content markup for visual display of academic papers), have evolved in radically different directions. While LaTeX has stuck pretty close to that original intent, HTML has become almost a GUI specification language, with all kinds of capabilities which LaTeX simply doesn't have. The proof is in the pudding: Show me a LaTeX version of the Amazon page. Or the Slashdot main page. Even ignoring the issues like linking that you mention, it is for all practical purposes impossible. It would require literally weeks of dedicated LaTeX hacking, and the result would be a horrific kludge. LaTeX is, and is likely to remain, a language for typesetting documents for the purpose of conventional, dead-tree publication. Any other application of it would be a gross violation of a fundamental principle of hacking: the right tools for the right job.
In short, LaTeX and HTML have only their theoretical conception in common. For all practical intents and purposes they are so vastly different that using LaTeX as a general web language is inconceivable. There is, however, a new language emerging which promises to clean up the blurred boundaries of content and visual formatting, and get rid of the most flagrant horrors of HTML. If you want to see an HTML alternative, go look into XML.
"Never let your sense of morals prevent you from doing what is right" -Salvor Hardin
For example, I have a LaTeX macro which will quote and cite from a source in the margin of my document. The Web has no concept of a margin. Sure, I could make Netscape 4.76 lay out a web page as if it were a technical paper, but why should I have to "flip pages" on the Web? And what if I want to read this super LaTeX-enabled web page in lynx? on my Visor? on my cell phone? with a screen reader?
Sure, you can simulate a lot of physical markup items with style sheets, but that's not the point. The point is, HTML is designed to embellish text with simple, logical markup; one of HTML's greatest strengths is that it can be rendered faithfully by a variety of different tools with myriad differences in capability. LaTeX, OTOH, is designed to target one medium: a DVI file which is tied to a particular page size. So you have some logical markup, but in general a lot of the "logic" is tied to physical realities of the page. (how many times have you typed \vspace{1.0cm}, for -- albeit a trivial -- example?)
In addition, LaTeX doesn't lend itself to interpreting -- the more powerful features, like indexing, citations, and TOCs all require multiple passes. Add to this that it's a LOT harder to parse and (to be honest) to write than semi-valid HTML, and it's just not a viable standard. The final nail is inertia. The web is based on HTML, and it has been for a long time. People are OK with extending HTML in bizarre ways to give them an approximation of TeX-like control over their document's appearance, so there's no room for a better, cleaner language. :-)
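To make the multiple-pass point concrete, here is a rough Python sketch of why a forward reference can't be resolved in one read of the source. It is a toy, not how the real LaTeX engine works: the `typeset` function, the `SOURCE` sample, and the dictionary standing in for the auxiliary label file are all made up for illustration.

```python
import re

# Toy document: the \ref appears BEFORE the \label it points to.
SOURCE = r"""\section{Intro}
See section \ref{results} for numbers.
\section{Results}\label{results}
Numbers go here."""

def typeset(source, labels):
    """Run one pass: resolve \\ref from the previous pass's label table
    and collect the labels seen during this pass."""
    new_labels = {}
    section = 0
    out = []
    for line in source.splitlines():
        if line.startswith("\\section"):
            section += 1
        found = re.search(r"\\label\{(\w+)\}", line)
        if found:
            new_labels[found.group(1)] = section
        line = re.sub(r"\\label\{\w+\}", "", line)
        # Unknown labels print as "??", as they do after a first LaTeX run.
        line = re.sub(r"\\ref\{(\w+)\}",
                      lambda m: str(labels.get(m.group(1), "??")), line)
        out.append(line)
    return "\n".join(out), new_labels

first_pass, labels = typeset(SOURCE, {})       # label table is empty
second_pass, _ = typeset(SOURCE, labels)       # label table now known
print(first_pass.splitlines()[1])
print(second_pass.splitlines()[1])
```

The first pass can only print a placeholder for the reference; the second pass fills in the section number. A browser rendering a page as it streams in doesn't get that second pass for free.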
~wog
LaTeX (based on TeX) is a fine typographic markup language. That is, it is specifically designed for describing pages of text in an elegant fashion.
SGML is a markup language designed to describe a document's contents, not layout. The layout of an SGML document is determined by a stylesheet.
HTML was based upon SGML because the idea behind HTML was not to design a page description language, but a document description language. A language that describes the elements of a document and not how they are to be displayed on the screen or be printed. Unfortunately, thanks to the commercial interests of Netscape and Microsoft, it failed to separate layout and content.
XML is an attempt to simplify SGML, eliminating the more esoteric features. XML documents do not describe layout, but rely upon stylesheets to determine how a page is laid out. This proves superior to LaTeX because a separation between content and layout can be made.
The idea is, you can mark up data with XML, and then using a stylesheet, change how it is presented to the user. Even more impressive, the content's presentation (or stylesheet) can be modified dynamically through scripting.
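A small Python sketch of that idea, for what it's worth. Real XML would be styled with CSS or XSLT; here two plain dictionaries (`HTML_STYLE` and `TEXT_STYLE`, both invented for this example) stand in for stylesheets, and the element names in `DOC` are equally hypothetical. The point is only that the same marked-up content comes out two different ways without being touched.

```python
import xml.etree.ElementTree as ET

# Content only: no layout information in the document itself.
DOC = ("<article>"
       "<title>Could LaTeX Replace HTML?</title>"
       "<para>LaTeX targets the printed page.</para>"
       "</article>")

# Stand-in "stylesheets": element name -> output template (assumed, not a real format).
HTML_STYLE = {"title": "<h1>{}</h1>", "para": "<p>{}</p>"}
TEXT_STYLE = {"title": "== {} ==", "para": "{}"}

def render(xml_text, style):
    """Apply a style table to each child element of the document root."""
    root = ET.fromstring(xml_text)
    return "\n".join(style[child.tag].format(child.text) for child in root)

as_html = render(DOC, HTML_STYLE)   # for a graphical browser
as_text = render(DOC, TEXT_STYLE)   # for lynx, a PDA, a screen reader...
print(as_html)
print(as_text)
```

Swapping the style table changes the presentation; the content in `DOC` never changes.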
XHTML is HTML represented in terms of XML. It is the future, and as time progresses we can hope that XML and stylesheets will eventually replace HTML.
LaTeX is not the answer for HTML. The goal of LaTeX is for the final presentation to be printed pages. LaTeX does a splendid job of that. The goal of XML is data-description. Add stylesheets and you have the means to present content in many ways.
XML is the replacement for HTML. XHTML is the gateway from HTML to XML.
Adding the new feature should take "only a few more weeks" according to the team, although there were suggestions that LaTeX support would also be added to the mail client, further delaying the browser's release. Another programmer noted that "we might also want to make this LaTeX thing skinnable".
Users waiting for Mozilla to release seemed surprisingly unsurprised by the announcement, although one Slashdot reader was heard to say "it's a pity - i might have even used mozilla if IE crashed."