HTML to be 'Incrementally Evolved'
MrDrBob writes "It has been decided that HTML is going to be incrementally updated, as the W3C believe that their efforts with XHTML are going unnoticed or unused by many websites out there. HTML is going to be worked on in parallel with XHTML (but with no dependencies), with the W3C trying to evolve HTML to a point where it's easier and logical for everybody to transition to XHTML. However, their work is still going to attempt to improve HTML in itself, with work on forms moving towards transitioning into XForms, but bearing in mind the work done by Webforms. In addition, the W3C's HTML validator is going to get improved, with Tim Berners-Lee wanting it to 'check (even) more stuff, be (even) more helpful, and prioritize carefully its errors, warning and mild chidings'. This looks like a nice step forward for the W3C, and will hopefully leave all the squabbling and procrastination behind."
HTML should go in a direction where content and form are truly separated. Have a document (or part of a document) mark the content in a purely logical fashion (like XML) and another document (or another part of the document) describe a presentation and which parts of the content to use where in that presentation.
HTML relies too much on the order of the content for presentation. It should be more like the workflow in a DTP program: Add a text box to the layout, then fill it with text.
What practical effect will this have? As long as browsers will render junk (X)HTML most people won't bother with an updated standard any more than they do the present one. Learning any proper coding system is work. What's the incentive other than pride in the craft? Firefox, IE, etc. make learning standards optional, which is just another word for more work.
I don't really see how this will improve the chances of their standards being adopted. It's not exactly like the leap from HTML to XHTML is all that confusing as is. This will just be even more confusing. Good luck getting all of the major browsers to support all of these incremental changes when they can't even keep up with the standards suggested years ago.
I think the way the W3C are going to try and go about it means that they'll gradually upgrade HTML so that there will eventually be a clear and simple transition path to XHTML, and therefore more websites will make the jump into the land of order.
Then make sure that the content added by the user is well-formed before adding it to the site.
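A minimal sketch of that kind of check, assuming the user content arrives as a string fragment and that "well-formed" means "parses as XML". The function name `is_well_formed` and the dummy `root` wrapper are made up for illustration, not taken from any particular site's code. Note that a plain XML parser knows nothing about HTML named entities such as `&nbsp;`, so a real site would have to handle those separately.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a dummy root element."""
    try:
        # Wrap in a single root so fragments with several top-level
        # elements (or bare text) can still be parsed.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        # Unclosed tags, unquoted attributes, stray '&', etc. end up here.
        return False


if __name__ == "__main__":
    for sample in ('<p>Hello <b>world</b></p>', '<p>Hello <b>world</p>', '<br>'):
        print(repr(sample), is_well_formed(sample))
```

Anything that fails the check can be rejected or bounced back to the user before it ever reaches the page.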
HTML doesn't serve its purpose, because it doesn't mandate a separation between content and style. For one, that means that it's difficult to process HTML pages with semantic tools. One of my favourite recent reads has been Visualising the Semantic Web, ed. Geroimenko and Chen (Springer Verlag, 2005), which shows the rich possibilities of extracting information and transforming it, such as into a graphical display, or reorganizing it. This is all a cinch with any valid XHTML Strict page, but as long as we're stuck in HTML 4.01, these abilities will never be widely available to us.
Furthermore, creators of accessibility software are constantly marching uphill. Just yesterday the BBC had a report on how hard it is for blind users to use most plain HTML websites.
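To make the "cinch" claim about valid XHTML concrete, here is a rough sketch that pulls the title, headings, and links out of a page using nothing but a stock XML parser. The sample page, the function name `extract_outline`, and the choice of which elements to collect are all assumptions for the example. It also assumes the page is well-formed and declares the XHTML namespace.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page, used only for this example.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body>
    <h1>News</h1>
    <p>See the <a href="http://www.w3.org/">W3C</a> and
       <a href="http://example.org/about">about</a> pages.</p>
  </body>
</html>"""


def extract_outline(xhtml: str) -> dict:
    """Collect the title, top-level headings, and links from an XHTML page."""
    root = ET.fromstring(xhtml)  # raises ParseError on tag soup
    title = root.findtext(f"{XHTML_NS}head/{XHTML_NS}title")
    headings = [h.text for h in root.iter(f"{XHTML_NS}h1")]
    links = [(a.get("href"), a.text) for a in root.iter(f"{XHTML_NS}a")]
    return {"title": title, "headings": headings, "links": links}


if __name__ == "__main__":
    print(extract_outline(PAGE))
```

The same extraction against tag soup needs a forgiving HTML parser first, which is the extra step the comment is complaining about.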
I've done parsers that "scrub" HTML for constructs that might cause security risks or mess up the site layout too, that had to accept almost all "sane" html, and even that isn't particularly hard, though quite a bit more work.
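For a rough idea of what such a scrubber can look like, here is a whitelist-based sketch built on Python's standard `html.parser`. It is not the parser described above. The allowed tag and attribute lists, the `Scrubber` class, and the `scrub` function are all assumptions chosen for the example, and a production filter would need far more care (URL schemes, nesting rules, character encodings).

```python
from html import escape
from html.parser import HTMLParser

# Illustrative whitelist; a real site would tune these.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
VOID_TAGS = {"br"}
DROP_CONTENT = {"script", "style"}  # drop these tags and their contents


class Scrubber(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of cleaned markup
        self.open_tags = []  # allowed tags currently open
        self.skip = 0        # >0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if tag not in ALLOWED_TAGS or self.skip:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()) or value is None:
                continue  # drops onclick, style, and friends
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue  # drops javascript: and other schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{''.join(kept)} />")
        else:
            self.out.append(f"<{tag}{''.join(kept)}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if tag in self.open_tags:
            # Close anything left open inside this tag so output stays balanced.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))


def scrub(html_text: str) -> str:
    """Return html_text reduced to whitelisted, balanced markup."""
    parser = Scrubber()
    parser.feed(html_text)
    parser.close()
    # Close any tags the author never closed.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


if __name__ == "__main__":
    print(scrub('<p onclick="x()">Hi <b>there<script>alert(1)</script></p>'))
```

The point of the whitelist approach is that unknown constructs are dropped by default, so the scrubber accepts sloppy input but only ever emits markup it knows to be harmless.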
HTML at the moment is solid, robust, and gets the job done. As it has evolved it has gained additional features and power at each step, including CSS integration, better javascripting, DHTML, etc, thus leading at every step to a better end-user experience.
XHTML, for all practical purposes, is HTML but with more errors. With XHTML, you get the power of being told that you have to put an end tag on all tags. And, umm, not a lot else. The benefits of switching to XHTML are mostly theoretical.
The W3C needs to break the focus on validation, and get back to trying to work with developers and users to get what THEY want into specifications. It sounds like they realized that XHTML will not overtake HTML any time soon, and that they need to provide some sort of reason or reasons to make that change.
The ______ Agenda
HTML has been and continues to be "Good Enough". If there were some truly compelling reason to upgrade to something else, most already would have. When image tags were introduced, people abandoned lynx rather quickly; the same goes for transparent GIF support, CSS, etc. It's nice to try to bring order or whatever the goal of XHTML is, but frankly, if it's got the ability to slap some text on a page, embed an image, and throw in a pretty background, it's good enough for most people. They know it, they are comfortable with it, and they aren't going to change without a really compelling reason.
XHTML 1.0 doesn't need the XML MIME type, only 1.1 does. The IE7 team's rationale is that they don't want to support the MIME type until the XML renderer is capable of processing XHTML properly. Just "accepting" the XML MIME type and then using the HTML engine is kludgy, and I agree with them. But yeah: IE7 doesn't support XHTML 1.1. This is still planned for later (IE8?)
Jeremy
Develop a few *actual* applications where the XML-compliance of XHTML is actually useful in an observable way, and everybody will start producing XHTML compliant code for new websites, lest they be left out from a new revolution on the web.
So far the benefits are just hypothetical (with XHTML somebody could develop useful parsing applications based on commodity XML parsers). Try actually developing some such apps that generate real, observable value today, and you'll start convincing people who don't care about standards for their own sake.
I do generally try to stick to XHTML 1.0, since I care about standards and ease of parsing, but the majority of people don't, and they are the target audience the W3C needs to work on convincing.
If you don't fail at least 90 percent of the time, you're not aiming high enough. (Alan Kay)
HTML is dead. It's been superseded by XHTML for years now.
HTML was a good idea with some rough edges. It took XHTML to smooth some of them out. Specs that are less vague, more complete, and leave less to interpretation will fix more problems in the future.
XHTML is simpler than HTML (contrary to popular belief) because the syntax and structure are more consistent than HTML's. You don't have to wonder whether you need a closing tag: all tags get closed. All attributes get quoted. All tag names and attributes are lower case. It's really not that hard; if you don't want to do it because you can't read it anymore (you capitalization whore), that's what syntax highlighting is for. You just have to put forth a tiny bit of effort to turn these rules into instinct.
There are two reasons why the transition to XHTML hasn't happened:
As long as browsers try to interpret messy markup, few people are going to care. It's the "good enough" attitude. "Quirks mode" is the big bad here. Browsers and visual authoring tools need to tell users that the page they are looking at is non-conformant and warn that it may not behave correctly. No other software on the planet is as forgiving of the data it handles as web browsers.
If GCC still compiled C code when curly braces, parentheses, and quote marks are omitted at random, how much shittier would all the C code in the world be?
At least the W3C is doing something about the quagmire, but working in parallel is just a waste of time. Let HTML be, it's old and busted. XHTML is the new hotness. The W3C can spew out all the Recommendations (the flimsiest of terms) it wants, but no one is going to care unless there's some enforcement at the other end of the line.
One thing the W3C needs to do is get off the semantic web high horse; it's putting the cart before the horse. They need to evangelize correctness, and the semantic web (plus other aspects) will follow naturally.
So, all you so-called "developers" and "designers", keep on churning out your HTML 4.01 Transitional pages (or let Dreamweaver do it for you) with bloated table layouts. You'll keep contributing to the problem.
And whose fault is that? Certainly not the W3C's. They've been advocating the usage of CSS for style for years. Many of the HTML tags that only provide styling have been marked as deprecated in favor of CSS.
You could argue that they should have outright removed all of the HTML tags that only provide styling, but we all know that won't stop browsers from rendering them for compatibility.
If you really think XHTML is better, think again. While it drops one presentational tag, it still supports many others. Also, since that strange browser from Redmond doesn't support the proper MIME type, XHTML is rendered as HTML anyway, making the effort useless.
No, the real issue here is bad web designers.
The advantages of consistency *ALWAYS* prevail when you're dealing with computers. Anything else is just begging for errors. And I'm quite certain that if I ran any sites you wrote in plain HTML through three or four browsers, I would get very different results.
Separating content from presentation on the client side is just a bad idea. It pushes too much complexity to the client, which users don't care about anyway. Browsers should only have to support a simple presentation format, with a little simple customization of basic things like line wrapping. Let the server side worry about chugging through various data sources and formatting templates to create a good-looking presentation, but don't try to standardize all that on the client side. It just hasn't worked.
That is the LaTeX attitude in a Word world.
Presentation is everything. Humans are emotional, not logical.
PDF and Flash are damn close to what people want. The main thing holding them back is that they aren't as integrated into the browser as HTML.
The tag.