HTML to be 'Incrementally Evolved'
MrDrBob writes "It has been decided that HTML is going to be incrementally updated, as the W3C believe that their efforts with XHTML are going unnoticed or unused by many websites out there. HTML is going to be worked on in parallel with XHTML (but with no dependencies), with the W3C trying to evolve HTML to a point where it's easier and logical for everybody to transition to XHTML. However, their work is still going to attempt to improve HTML in itself, with work on forms moving towards transitioning into XForms, but bearing in mind the work done by Webforms. In addition, the W3C's HTML validator is going to get improved, with Tim Berners-Lee wanting it to 'check (even) more stuff, be (even) more helpful, and prioritize carefully its errors, warning and mild chidings'. This looks like a nice step forward for the W3C, and will hopefully leave all the squabbling and procrastination behind."
We cannot have new HTML without upgrading the best part of the web.
Example of server side blink
Wonderful!?
liqbase
What practical effect will this have? As long as browsers will render junk (X)HTML most people won't bother with an updated standard any more than they do the present one. Learning any proper coding system is work. What's the incentive other than pride in the craft? Firefox, IE, etc. make learning standards optional, which is just another word for more work.
I don't really see how this will improve the chances of their standards being adopted. It's not exactly like the leap from html to xhtml is all that confusing as is. This will just be even more confusing. Good luck getting all of the major browsers to support all of these incremental changes when they can't even keep up with the standards suggested years ago.
I think the way the W3C are going to try and go about it means that they'll gradually upgrade HTML so that there will eventually be a clear and simple transition path to XHTML, and therefore more websites will make the jump into the land of order.
You know, that's a good idea
To laymen like me, this sounds rather cryptic. Could any of you web gurus please elaborate, and/or list other advantages of XHTML?
With it being XML, it's easier to read with other tools - using an XML library makes it trivially easy to write code to turn an XHTML web-page into a highly structured, tree-like associative array which contains everything the original page contains.
In layman-speak - instead of mashing through the 'view source' equivalent (one big string), it becomes a mightily detailed tree, with every section of the page as another branch, twig or leaf. And to keep with the arboreal metaphor - when one has finished with one's web-page topiary, pruning or grafting, it's really easy to convert it back into XHTML - without losing anything in the process.
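To make the tree idea concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The sample page and the `to_tree` helper are made up for illustration, and the nested dict is simplified (it ignores text that follows a child element), so treat it as a rough picture rather than a complete converter.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page, well-formed XHTML.
page = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>Hello <b>world</b></p><p>Second</p></body>"
    "</html>"
)

def to_tree(elem):
    """Turn an element into a nested dict: tag, attributes, text, children."""
    return {
        "tag": elem.tag.split("}")[-1],  # drop the "{namespace}" prefix
        "attrib": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [to_tree(child) for child in elem],
    }

root = ET.fromstring(page)
tree = to_tree(root)

# "Pruning": remove the second paragraph, then serialize back to XHTML.
ET.register_namespace("", XHTML_NS)  # keep tags unprefixed on output
body = root.find(f"{{{XHTML_NS}}}body")
body.remove(body[1])
round_trip = ET.tostring(root, encoding="unicode")
```

The same `fromstring` call on typical tag-soup HTML would raise a `ParseError`, which is the whole point being made here: a generic XML library only works when the page is well-formed.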
Tedious Bloggy Stuff - hooray?
XHTML is VERY strict. That makes it very easy to parse. But that same facet makes it very tough to write by hand. What I mean is that with HTML you've got all your tags, but many people don't write them correctly. How often do you write a closing P tag? Do you close your IMG tags like you should (<IMG SRC=... />)? Most people don't. If you did that in XHTML, your page would be wrong and if the browser is in strict mode, things die with an error. Improper nesting can also cause this (<P>Some <B>stuff</P> things </B>).
This adds serious complexity for some people. While Dreamweaver can easily handle that, can you imagine what it would take to make /. XHTML? You would have to write little bits to parse out every comment and story submission that's in HTML and then output it into valid XHTML. That's a TON of work. Otherwise, one single error and /. could stop rendering at all (if the browser does what, IIRC, it should).
However, the fact that tags are always opened/closed correctly, always nested correctly, etc makes XHTML very easy to parse for a computer. This would make things like screen reading, data scraping, automatic transformations (like with XSLT), much easier.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Then make sure that the content added by the user is well-formed before adding it to the site.
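One way to do that check, sketched with Python's standard XML parser. The function name `is_well_formed` and the `<div>` wrapper are assumptions for illustration. This only tests well-formedness; it does not validate against the XHTML DTD and it is not a security filter.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in one root."""
    try:
        # A comment is a fragment, so give it a single enclosing element.
        ET.fromstring(f"<div>{fragment}</div>")
        return True
    except ET.ParseError:
        return False

ok = is_well_formed("<p>Some <b>stuff</b> things</p>")
bad_nesting = is_well_formed("<p>Some <b>stuff</p> things </b>")
unclosed_img = is_well_formed('<img src="a.gif">')
```

A site could reject a submission (or run it through a cleanup tool first) whenever this returns False, so that one bad comment never reaches the page.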
The advantage of XHTML is that it's easier for machines (and developers) to process and work with after it's written. Additionally it supports emerging useful new standards such as XForms, and other XML processing tools.
The disadvantage of XHTML is that it's harder to write initially and has stricter rules. Most people write broken HTML 4 transitional pages that, quite honestly, work fine for their audience (web only).
Parsing HTML is a bitch; working with it is, quite simply, difficult. Additionally, XHTML supports embedding XML formats other than XHTML within it, like MathML (a way to display formatted math equations), SVG (Scalable Vector Graphics), and any other arbitrary XML-based format, in an easy-to-understand way via namespacing.
There's a whole suite of tools built around XML (XPath and XSLT for example) that enable one to deal with MathML and SVG as easily as XHTML. It makes things simpler.
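As a rough illustration of namespacing, the sketch below embeds a small SVG drawing in an XHTML page and queries both vocabularies with the limited XPath subset in Python's `xml.etree.ElementTree`. The document and the `x`/`svg` prefixes are made up for the example; only the two namespace URIs are the real ones.

```python
import xml.etree.ElementTree as ET

# Prefixes are local shorthand; the URIs are what identify each format.
NS = {
    "x": "http://www.w3.org/1999/xhtml",
    "svg": "http://www.w3.org/2000/svg",
}

doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="10"/>
    </svg>
  </body>
</html>"""

root = ET.fromstring(doc)

# One parser, one query syntax, two XML vocabularies.
paragraphs = [p.text for p in root.findall(".//x:p", NS)]
radii = [c.get("r") for c in root.findall(".//svg:circle", NS)]
```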
XHTML is, however, a lot harder to write. HTML tolerates a lot of errors; XHTML technically tolerates none, though browsers usually overlook this.
For my job, where I have to create sometimes copious amounts of HTML that will be seen only by IE or Firefox on Windows or a Mac, and oftentimes be deleted within a few months, I just write HTML4 transitional and don't really worry about validating. I test both browsers and leave it at that.
For my personal site, or shit that I have extra time to do I write XHTML because I like to make neat, clean things, but honestly there's never a tangible payoff from it for most applications.
I will say this however, people who know XHTML are the people who know how to really write a web page. The people who've never heard of it are the ones that are a bitch to work with and slow you down with their ugly ass tag soup pages with no CSS.
Photos.
HTML doesn't serve its purpose, because it doesn't mandate a separation between content and style. For one, that means that it's difficult to process HTML pages with semantic tools. One of my favourite recent reads has been Visualising the Semantic Web ed. Geroimenko and Chen (Springer Verlag, 2005), which shows the rich possibilities of extracting information and transforming it, such as into a graphical display, or reorganizing it. This is all a cinch with any valid XHTML Strict page, but as long as we're stuck in HTML 4.01, these abilities will never be widely available to us.
Furthermore, creators of accessibility software are constantly marching uphill. Just yesterday the BBC had a report on how hard it is for blind users to use most plain HTML websites.
I've done parsers that "scrub" HTML for constructs that might cause security risks or mess up the site layout too, that had to accept almost all "sane" html, and even that isn't particularly hard, though quite a bit more work.
HTML at the moment is solid, robust, and gets the job done. As it has evolved it has gained additional features and power at each step, including CSS integration, better javascripting, DHTML, etc, thus leading at every step to a better end-user experience.
XHTML, for all practical purposes, is HTML but with more errors. With XHTML, you get the power of being told that you have to put an end tag on all tags. And, umm, not a lot else. The benefits of switching to XHTML are mostly theoretical.
The W3C needs to break the focus on validation, and get back to trying to work with developers and users to get what THEY want into specifications. It sounds like they realized that XHTML will not overtake HTML any time soon, and that they need to provide some sort of reason or reasons to make that change.
The ______ Agenda
I know CSS, but that's a far cry from what I want. I want something like "box1.top=page.top+header.height*1.1; box1.height=box2.height=auto; flow=box1,box2; flow.src=http://domain/document.xml#article.text.main", if you understand what I mean. CSS relies too much on the position of the element in the document. This leads to the static layouts that you see everywhere today. Once you've positioned something "out of flow", only its children can be positioned in relation to that element.
HTML has been and continues to be "Good Enough". If there were some truly compelling reason to upgrade to something else, most already would have. When image tags were introduced, people abandoned lynx rather quickly; the same goes for transparent gif support, CSS, etc. It's nice to try to bring order or whatever the goal of xhtml is, but frankly if it's got the ability to slap some text on a page, embed an image and throw in a pretty background, it's good enough for most people. They know it, they are comfortable with it, and they aren't going to change without a really compelling reason.
xhtml 1.0 doesn't need the xml mimetype, only 1.1 does. The IE7 team's rationale is that they don't want to support the mime type until the xml renderer is capable of processing xhtml properly. Just "accepting" the xml mimetype and then using the html engine is kludgy, and I agree with them. But yeah: IE7 doesn't support xhtml 1.1. This is still planned for later (IE8?)
Jeremy
I believe that this is a response to the actions of the WHATWG (Web Hypertext Application Technology Working Group) (X)HTML 5 and to Bjoern Hoehrmann leaving the W3C QA.
So it's not a new pie-in-the-sky idea like XForms or XHTML2, but something much more likely to be useful to web developers that need to work in a world where IE is (still) the biggest fish.
There's a hidden treasure in Python 3.x: __prepare__()
If your web site is part of a federal contract, it has to be compliant.
http://en.wikipedia.org/wiki/Section_508
The masses are the crack whores of religion.
For one, that means that it's difficult to process HTML pages with semantic tools. One of my favourite recent reads has been Visualising the Semantic Web ed. Geroimenko and Chen (Springer Verlag, 2005), which shows the rich possibilities of extracting information and transforming it, such as into a graphical display, or reorganizing it. This is all a cinch with any valid XHTML Strict page, but as long as we're stuck in HTML 4.01, these abilities will never be widely available to us.
Well said. This is exactly why we need XHTML.
// TODO: Insert Cool Sig
tag and a single line feed into a
tag. For obvious reasons, this behavior needs to be addressed in the parser.
As for language, I don't really care, but it'd have to be able to sort out things like:
Missing close tags:
[url=http://www.slashdot.org/][b]Cool slashdot article!!![/url]
Out of order properties (the img open and close tags should not be converted to HTML because their input is invalid, but the url's input is valid and should be parsed):
[img][url=http://www.example.tld/]http://www.example.tld/image.gif[/url][/img]
Lists (I'll let you figure out how this one should look):
[list=a]
[*][url]http://www.example.tld[/url]
[*][img]http://www.example.tld/image.gif[/img]
[list]
[*]Coffee
Out of order close tags (close tags should be reordered properly):
[url=http://www.example.tld/faq/momal.shtml#Mu-Blue-7][b][color=blue]μ-(b)-7[/b][/url][/color]
I had more examples, but the power just flashed out here a moment ago, and I lost them.
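For the missing and out-of-order close tag cases listed above, a stack-based pass is one possible approach. The sketch below is only that: the tag grammar in `TAG` (lowercase names, optional `=value`) and the function name `balance` are assumptions, since the post does not specify a syntax. It does not handle the list or img-around-url cases, and it only rebalances BBCode without converting anything to HTML.

```python
import re

# Assumed grammar: [name], [name=value], [/name] with lowercase names.
TAG = re.compile(r"\[(/?)([a-z]+)(=[^\]]*)?\]")

def balance(bbcode: str) -> str:
    """Close unclosed tags and re-nest out-of-order close tags."""
    out, stack, pos = [], [], 0
    for m in TAG.finditer(bbcode):
        out.append(bbcode[pos:m.start()])  # plain text before this tag
        pos = m.end()
        closing, name = m.group(1), m.group(2)
        if not closing:
            stack.append(name)
            out.append(m.group(0))
        elif name in stack:
            # Close everything opened after `name`, innermost first.
            while stack:
                top = stack.pop()
                out.append(f"[/{top}]")
                if top == name:
                    break
        # A close tag with no matching open tag is silently dropped.
    out.append(bbcode[pos:])
    # Anything still open at the end gets closed in reverse order.
    out.extend(f"[/{name}]" for name in reversed(stack))
    return "".join(out)

fixed_missing = balance(
    "[url=http://www.slashdot.org/][b]Cool slashdot article!!![/url]"
)
fixed_order = balance("[b][color=blue]x[/b][/color]")
```

Once the tag stream is balanced like this, turning it into well-formed markup is a straightforward one-to-one substitution.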
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
One of the best texts I've read on this subject can be found here... http://www.hixie.ch/advocacy/xhtml
HTML doesn't serve its purpose, because it doesn't mandate a separation between content and style.
Maybe HTML doesn't serve your purposes, but it certainly serves my purposes.
Personally, I couldn't care less about fuzzy concepts like the separation of content and style; I just want to be able to write webpages in nano which look decent to most visitors.
Tarsnap: Online backups for the truly paranoid
Develop a few *actual* applications where the XML-compliance of XHTML is actually useful in an observable way, and everybody will start producing XHTML compliant code for new websites, lest they be left out from a new revolution on the web.
As long as the benefits are just hypothetical (with XHTML somebody could develop useful parsing applications based on commodity XML parsers), try actually developing some such apps that generate real, observable value today, and you'll start convincing people who don't care about standards for their own sake.
I do generally try to stick to XHTML 1.0, since I care about standards and ease of parsing, but the majority of people don't, and they are the target audience the W3C needs to work on convincing.
If you don't fail at least 90 percent of the time, you're not aiming high enough. (Alan Kay)
Without iframes (currently supported by IE, Firefox and Opera, at least) 4.01 Strict isn't workable for most sites that rely on third-party content for advertising -- eg, ads from Amazon. And that's a large chunk of the web.
Ph-nglui mglw'nafh Gates M'dna wgah'nagl fhtagn.
> Isn't XHTML suppose to be a transition path to XML?
No, no, and still no. It is a specific application of XML.
HTML is dead. It's been superseded by XHTML for years now.
HTML was a good idea with some rough edges. It took XHTML to smooth some of them out. Specs that are less vague, more complete, and leave less to interpretation will fix more problems in the future.
XHTML is simpler than HTML (contrary to popular belief) because the syntax and structure are more consistent than HTML's. You don't have to wonder whether you need a closing tag: all tags get closed. All attributes get quoted. All tag names and attributes are lower case. It's really not that hard; if you don't want to do it because you can't read it anymore (you capitalization whore), that's what syntax highlighting is for. You just have to put forth a tiny bit of effort to turn these rules into instinct.
There are two reasons why the transition to XHTML hasn't happened:
As long as browsers try to interpret messy markup, few people are going to care. It's the "good enough" attitude. "Quirks mode" is the big bad here. Browsers and visual authoring tools need to tell users that the page they are looking at is non-conformant and warn that it may not behave correctly. No other software on the planet is as forgiving of the data it handles as web browsers.
If GCC still compiled C code when curly braces, parentheses, and quote marks are omitted at random, how much shittier would all the C code in the world be?
At least the W3C is doing something about the quagmire, but working in parallel is just a waste of time. Let HTML be, it's old and busted. XHTML is the new hotness. The W3C can spew out all the Recommendations (the flimsiest of terms) it wants, but no one is going to care unless there's some enforcement at the other end of the line.
One thing the W3C needs to do is get off the semantic web high horse; it's putting the cart before the horse. They need to evangelize correctness, and the semantic web (plus other aspects) will follow naturally.
So, all you so called "developers" and "designers", keep on churning out your HTML 4.01 Transitional pages (or let Dreamweaver do it for you) with bloated table layouts. You'll keep contributing to the problem.
This requirement isn't just bureaucratic mumbo jumbo. Ensuring that all (valid) XML documents follow rules like this is what makes them so easy to parse quickly and unambiguously.
There are automated tools (e.g., Tidy) that will do most of the work for static pages. But there really aren't "thousands of pages" to deal with; the HTML to XHTML conversion process is pretty simple. The real problems with XHTML are:
- It makes some common idioms, notably including embedded Javascript code, much more awkward to write correctly.
- There's no payoff for most sites.
Item 2 is the real killer. If everyone is happily parsing "tag soup" HTML, which is often not compliant to any standard, why jump through the hoops (however easy those jumps might be) to comply with a standard that brings no immediate benefit?
When all you have is a hammer, everything looks like a skull.
You didn't read the document, did you? You've got the W3C's blessing to serve XHTML as text/html, but there are differences in the way Javascript and CSS are processed when it's served on a page as application/xhtml+xml.
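Since the MIME type decides which engine handles the page, some sites pick it per request from the browser's `Accept` header. The sketch below is a simplified, assumed version of that idea. The function name `pick_mime_type` is made up, the header strings in the usage lines are illustrative rather than captured from real browsers, and a production server would use a fuller content negotiation routine.

```python
def pick_mime_type(accept_header: str) -> str:
    """Return application/xhtml+xml only if the client lists it with q > 0."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # default quality when no q= parameter is given
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not accepted"
        return "application/xhtml+xml" if q > 0 else "text/html"
    # Client never mentioned the XHTML type: fall back to tag-soup mode.
    return "text/html"

# Illustrative headers: one client that advertises XHTML, one that does not.
xml_capable = pick_mime_type("text/html,application/xhtml+xml,*/*;q=0.8")
legacy = pick_mime_type("image/gif, image/jpeg, */*")
```

With the XML type, a single well-formedness error stops rendering; with text/html, the same markup goes through the forgiving HTML parser, which is why the scripting and CSS differences mentioned above show up.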
And whose fault is that? Certainly not the W3C's. They've been advocating the usage of CSS for style for years. Many of the HTML tags that only provide styling have been marked as deprecated in favor of CSS.
You could argue that they should have outright removed all of the HTML tags that only provide styling, but we all know that won't stop browsers from rendering them for compatibility.
If you really think XHTML is better, think again. While it doesn't support , it still supports , , , , , and more presentational tags. Also, since that strange browser from Redmond doesn't support the proper MIME type, XHTML is rendered as HTML anyway, making the effort useless.
No, the real issue here is bad web designers.
The W3C position is that Google, for example, should not be required to enjoy or research the web. Their view is that the web pages themselves should provide the context and relevance information that Google currently works out. They want discrete, well-formatted information that's relatively unchanging. Another example is Wikipedia. The current version is a database app with a web page front end. The W3C would prefer to see the site as discrete pages so every page is a complete indexable document just like a book. Tim especially is much like RMS in his views that information should be "free", and freely accessible: the user should figure out how THEY want it, not be told.
The current trend of web apps as database front ends is what corporate customers and server vendors want because it provides more control. More than that, the actual information is locked up so you have to have "permission" to view it. Many of the W3C specs are kind of designed to sabotage that approach, which only complicates matters. They need to get closer to web app designers versus academic content providers.
Separating content from presentation on the client side is just a bad idea. It pushes too much complexity to the client, which users don't care about anyway. Browsers should only have to support a simple presentation format, with a little simple customization of basic things like line wrapping. Let the server side worry about chugging through various data sources and formatting templates to create a good-looking presentation, but don't try to standardize all that on the client side. It just hasn't worked.
Nonsense, on two counts:
You might also argue that the fact that Microsoft didn't ever embrace Pascal (opting for BASIC and C instead) played some part in Pascal's demise. I'm not sure I buy this theory, however, since even environments which had enthusiastically embraced Pascal (the Apple Macintosh, for instance, where most of the OS APIs were designed to be called from Pascal) changed course as soon as standard C and C++ compilers were available.
The simple fact of the matter is that Pascal, even with a host of proprietary extensions, was not all that great a general purpose language. It was verbose and restrictive and it didn't give you much in return for those flaws. C just felt more expressive, partly because it used a more generalized expression syntax (where assignment operators could be freely mixed with arithmetic operators) and partly because it didn't require you to write too much excess verbiage (in other words, it was cryptic). Then, when you compiled your C code it just flew in comparison to your Pascal code.
just a ghost in the machine.
That is the LaTeX attitude in a Word world.
Presentation is everything. Humans are emotional, not logical.
PDF and Flash are damn close to what people want. The main thing holding them back is that they aren't as integrated into the browser as HTML.
The tag.
It's actually better to use HTML, even after all these years of xHTML in the wild.