HTML V5 and XHTML V2
An anonymous reader writes "While the intention of both HTML V5 and XHTML V2 is to improve on the existing versions, the approaches chosen by the developers to make those improvements are very different. With differing philosophies come distinct results. For the first time in many years, the direction of upcoming browser versions is uncertain. This article uncovers the bigger picture behind the details of these two standards."
Both standards are being worked on by the W3C standards group. Microsoft, along with all other major browser developers, is a member.
You have to hand it to the W3C, they keep supplying web designers with rope.
I've been trying to get them (and browser people) to include a security-oriented tag to disable unwanted features.
Why such tags are needed:
Say you run a site (webmail, myspace (remember the worm?), bbs etc) that is displaying content from 3rd parties (adverts, spammers, attackers) to unknown browsers (with different parsing bugs/behaviour).
With such tags you can give hints to the browsers to disable unwanted stuff between the tags, so that even if your site's filtering is insufficient (doesn't account for a problem in a new tag, or the browser interprets things differently/incorrectly), a browser that supports the tag will know that stuff is disabled, and thus the exploit fails.
I'm suggesting something like:
<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />
browser ignores features except for java and safe-html.
unsafe content here, but rendered safely by browser
<restrictoff lock="wrong_string" />
more unsafe content here but still rendered safely by browser
<restrictoff lock="Random_hard_to_guess_string" />
all features re-enabled
(safe-html = a subset of html that we can be confident that popular browsers can render without being exploited, e.g. <em>, <p>).
It doesn't have to be exactly as I suggest - my main point is HTML needs more "stop/brake" tags, and not just "turn/go faster" tags.
Before anyone brings it up, YES we must still attempt to filter stuff out (use libraries etc), the proposed tags are to be a safety net. Defense in depth.
With this sort of tag a site can allow javascript etc for content directly produced by the site, whilst being more certain of disabling undesirable stuff on 3rd party content that's displayed together (webmail, comments, malware from exploited advert/partner sites).
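To make the matching rule in the proposal concrete, here is a minimal sketch in Python. It is an illustration only: restricton/restrictoff are the hypothetical tags from the comment above (no browser implements them), and the scan function with its regex tokenizing is an assumption made for the example, not how a real HTML parser works. It marks each chunk of a page as restricted or not, and ignores a restrictoff whose lock does not match.

```python
import re

# Hypothetical tags from the proposal above; no browser implements them.
TAG = re.compile(r'<(restricton|restrictoff)\s+lock="([^"]*)"[^>]*/?>')

def scan(page):
    """Split a page into (restricted, text) chunks using the lock rule."""
    chunks = []
    active_lock = None  # lock string of the restriction currently in force
    pos = 0
    for match in TAG.finditer(page):
        name, lock = match.group(1), match.group(2)
        chunks.append((active_lock is not None, page[pos:match.start()]))
        pos = match.end()
        if name == "restricton" and active_lock is None:
            active_lock = lock
        elif name == "restrictoff" and lock == active_lock:
            active_lock = None
        # Anything else (wrong lock, nested restricton) is ignored, so
        # injected content cannot switch the restriction off by guessing.
    chunks.append((active_lock is not None, page[pos:]))
    return chunks
```

With a page that contains a restrictoff carrying the wrong lock, the text after it is still reported as restricted; only the matching lock ends the restricted region.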
All the browser vendors have already said they will support HTML 5 (yes, that includes MS) and all but MS have said they won't support XHTML 2 (MS hasn't made much of an effort to suggest they will support it either).
As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.
This also seems to be the case whenever somebody bitches about web designers changing fonts, using JavaScript, or doing something to make their page look nice. You visit the websites created by the "changing the font at all, even in the stylesheet, is evil" or the classic "why are you trying to use two columns? two columns are evil" religious zealots and all their pages look really dull and boring. Long streams of Times New Roman. I guess this is our future, eh?
HTML 5 is aiming to support various things needed for web applications (in fact, the current draft is formed of two documents: Web Applications 1.0 and Web Forms 2.0). Also, see http://www.w3.org/2006/appformats/admin/charter.html.
Why not just go with XHTML all the way? I always thought that the best way of "fixing" all the broken and horribly written HTML out there on the web would be to build a proxy that could translate from broken HTML to nicely formed XHTML and then send that to the browser, cleaning up the double rendering paths in the browsers (unless I misunderstood something) etc. XHTML really could be enough for everyone, and having two standards instead of one certainly isn't working in anyone's favor.
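As a rough idea of what the translation step in such a proxy might look like, here is a small Python sketch built on the standard library's html.parser. The names Tidy and to_xhtml and the short VOID list are made up for this example. It only balances tags and escapes text; it drops comments and doctypes and does not follow the real HTML error-recovery rules (for instance it nests a second <p> inside an unclosed one), so treat it as a toy, not a working proxy.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have content; an assumed, incomplete list.
VOID = {"br", "hr", "img", "input", "meta", "link"}

class Tidy(HTMLParser):
    """Re-emit tag soup with every element closed and text escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []  # currently open non-void elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") get their name as value.
        attr_text = "".join(
            ' %s="%s"' % (k, escape(v if v is not None else k, quote=True))
            for k, v in attrs
        )
        if tag in VOID:
            self.out.append("<%s%s />" % (tag, attr_text))
        else:
            self.stack.append(tag)
            self.out.append("<%s%s>" % (tag, attr_text))

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def to_xhtml(soup):
    parser = Tidy()
    parser.feed(soup)
    parser.close()
    while parser.stack:  # close whatever the author left open
        parser.out.append("</%s>" % parser.stack.pop())
    return "".join(parser.out)
```

For example, to_xhtml('<p>Hello <b>world<br><i>again</p>') closes the inner elements before the paragraph and self-closes the line break.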
am I the only developer that's sick of this html / css / javascript mess??
people/companies are trying to develop rich applications using a decade-old markup language that's improperly supported by different browsers (even firefox doesn't fully support css yet) and is a very ugly mix right now. it's like squeezing a rectangular plasticine object through round, triangular and star-shaped holes at the same time
the web needs a reboot
we need a programming language that:
*works on the server and the client
*something that makes making UIs as easy as drag and drop
*something that does not forgive idiot html "programmers" who write bad code
*something that doesn't suffer from XSS
*something that can be extended easily
*something that can be "compiled" for faster execution
*something that's implemented the same way in all browsers (or even better, doesn't require a browser and works on a range of platforms)
http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(HTML5) covers HTML 5. Nobody has started trying to implement XHTML 2 AFAIK (definitely nobody major).
Most of the web is not well-formed, so it's variations of HTML 4 with non-standard components. An HTML 5 that remains a non-XML language presents a reasonable way forward for "web sites." Without the need to be well-formed, the authoring tools are easier to build and can be sloppy, particularly for moderately administered sites. Creating a new HTML 5 might succeed in migrating those sites. If you avoid most breaks with HTML 4, beyond the worst offenders, browsers could target an HTML 5, and webmasters would only need to change 5%-10% of the content to keep up. That would mean a less degrading "legacy" mode than the HTML 4 renderers we have now.
So while the HTML 4 renderers floating around wouldn't be trashed, browser makers could leave them as is and focus on an HTML 5 one. Migrating to XHTML is non-trivial for people with outdated tools and a lack of knowledge. You can't ignore those sites as a browser maker, but HTML 5 might give a reasonable path to modernizing the "non-professional" WWW.
XHTML has some great features. Because it is well-formed XML, you can use XML libraries to parse the pages. This makes it much easier to "scrape" data off pages and handle inter-system communication, which HTML is not equipped for.
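A short Python sketch of the scraping point, using the standard library's xml.etree.ElementTree. The sample page, its URLs and the extract_links helper are invented for the example; the only real constant is the XHTML namespace URI.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up sample page; the paths and titles are placeholders.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Stories</title></head>
  <body>
    <ul>
      <li><a href="/story/1">HTML V5 and XHTML V2</a></li>
      <li><a href="/story/2">Another story</a></li>
    </ul>
  </body>
</html>"""

def extract_links(xhtml):
    """Return (href, link text) pairs from a well-formed XHTML document."""
    root = ET.fromstring(xhtml)
    ns = {"x": XHTML_NS}
    return [(a.get("href"), a.text) for a in root.findall(".//x:a", ns)]
```

No HTML-specific parser or guesswork is involved: any generic XML library can walk the document, which is the advantage being described.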
It's interesting that HTML and XHTML look almost identical (for good reason: XHTML was a port of HTML to XML) but are technically very different, HTML being an SGML language and XHTML an XML language. Both languages have their uses. HTML is "easier" for people to hack together because if you do it wrong, the HTML renderer makes a best guess. XHTML is easier to use professionally, because if there is a problem, you can catch it as an invalid XML document. Professionals worry about cross-browser issues, amateurs worry about getting it out there.
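The "best guess" versus "catch it" difference can be seen with Python's standard library, which ships both a forgiving HTML tokenizer and a strict XML parser. This is only an analogy for browser behaviour; the helper names html_tags and is_well_formed are made up for the sketch.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record start tags; never complains about unclosed elements."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)

def html_tags(markup):
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.seen

def is_well_formed(markup):
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

broken = "<p>An <em>unclosed paragraph"
fixed = "<p>A <em>closed</em> paragraph</p>"
```

The HTML tokenizer reports the p and em tags in the broken string without any error, while the XML parser rejects that same string and accepts the fixed one.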
XHTML "failed" to replace HTML because it satisfies the needs of professionals to have a standardized approach to minimize cross-browser issues, but lacks the simplicity needed for amateurs and lousy professionals.
Rev'ing both specs would be a forward move that might simplify browser writing in the long term while giving a migration path. XHTML needs a less confusing and forward looking path, and HTML needs to be Rev'd after being left for dead to drop the really problematic entries and give people a path forward.
Both standards are being worked on by the W3C standards group.
According to the IBM paper, HTML 5 is being done independently of the W3C. "In April 2007, the W3C voted on a proposal to adopt HTML V5 for review" is about as much involvement as the W3C has had with HTML 5.
Have you ever used anything with Ajax, e.g. Google Maps? You can thank MS for bringing that out of JavaScript...
MS ain't the devil for development. Sometimes they drive new features and functionality that would take forever to incorporate otherwise. Do they always do it in the best of ways? No, but they do bring out good things from time to time...
Anyone thinking of clicking on the parent's link (to vumit.com) should realize that it's a goatsex-style shocker page.
The worst thing about W3C standards is the lack of a reference implementation. If you can't produce a computer program that implements 100% of the specification you are writing in a reasonable timeframe, your standard is too complex.
It doesn't matter if the reference implementation is slow as molasses or requires vast quantities of memory, at least you have proven the standard is actually realistically implementable. On the other hand, if your reference implementation was easy to build and is really good, then that will foster code re-use and massively jump-start the availability of standardised implementations from multiple vendors. It might also show that you have a really good standard there.
If you don't do this, you get stuff like SVG - I don't think there is even one single 100% compliant SVG implementation anywhere, and there may never be.
There aren't any fully compliant CSS or HTML implementations either, to my knowledge.
The same goes for XHTML and HTML5. If you, as a standards organisation, are not in a position to directly provide, or sponsor the development of an open reference implementation, then personally, I think you should be restricting your standard to a smaller chunk of functionality that you are actually able to do this with.
There is no reason a composite standard, with a bunch of smaller, well defined components, each with reference implementations, can't be used to specify 'umbrella' standards.
Now, I am also aware that building a reference implementation tends to make the standard as written overly influenced by shortcomings in that implementation, but I really can't believe this would be worse than the debacle surrounding WWW standards we've had for the last 10+ years. Without a conformant reference implementation, HTML support in browsers is dictated by the way Internet Explorer and Netscape did things anyway.
I'm also aware that smaller standards tend to promote a rather piecemeal evolution of those standards, when what is often desired is an 'across the board' update of technology.
But this 'let's define monster standards that will not be fully implemented for years, if at all, and hope for the best' approach seems to be obviously bad, allowing larger vendors to first play a large role in authoring a 'standard' that is practically impossible to fully implement, and then to push their own hopelessly deficient versions of these 'standards' on the world and sit back and laugh because there is no way to 'do better' by producing a 100% compliant version.
Ajax-like techniques are possible without XMLHttpRequest and I don't believe Google Maps uses XMLHttpRequest anyway. If any organisation is responsible for the popularity of Ajax, it's Google, as it was when they started using it extensively that it really took off.
Please re-read the original comment. It was saying that you can use JavaScript without being backwards-incompatible. You seem to have confused this with avoiding JavaScript altogether. Every single point you make is good against an argument that JavaScript should be avoided, but completely irrelevant to somebody asking for it to degrade gracefully, which is the distinction BlueParrot was trying to explain to you.
The author apparently has no experience with rendering XHTML on mobile devices. First of all, since the screen is smaller, it's not just about restyling things in a minimalist theme. It's about prioritizing information and removing what is unnecessary, so the more important information becomes more accessible in limited display real estate.
For example, anyone who has accessed the Slashdot homepage on their mobile phone knows the pain of having to scroll down past the left and right columns before reaching the stories. You can simulate this experience by turning off page style and narrowing your browser window to 480 pixels wide. The story summaries are less accessible because they're further down a very long narrow page.
Another problem is the memory. Even if you style the unnecessary page elements to "no display", they're still downloaded and parsed by the mobile browser as part of the page. Mobile devices have limited memory, and I get "out of memory" errors on some sites. For reading long articles on mobile devices, it is better to break content into more pages than you would on a desktop display, both for presentation and memory footprint reasons.
For these two reasons, a site designer generally has to design a new layout for each type of device. The dream of "one page (and several style sheets) to rule them all" is a fairytale.
The current situation is awful.
Sweet.. So we agree and I owe you some kind of beer. Slashdot makes everybody a flamer :-)
There is a very strong business case for good degradation too... Last I checked, Google doesn't interpret your JavaScript. If you want good SEO, you had better make sure the content flows right in lynx (which is the best way to think about how Google sees the page).
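To get a feel for what a client that does not run scripts ends up with, here is a small Python sketch that pulls the readable text out of a page while ignoring script and style bodies. It is only an approximation for illustration, not a description of how lynx or Google actually process pages; the TextOnly class, the visible_text helper and the sample page are all invented for the example.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text nodes, skipping the contents of script and style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skipping = 0  # depth inside skipped elements

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.parts.append(data.strip())

def visible_text(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)

# Made-up page: the main content only appears if the script runs.
page = (
    '<h1>Stories</h1><div id="app"></div>'
    '<script>document.getElementById("app").textContent = "Loaded by script";</script>'
    '<noscript><a href="/stories">All stories</a></noscript>'
)
```

Here visible_text(page) contains the heading and the noscript fallback link, but nothing that the script would have written into the page.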
Sadly, screen readers are pretty much like Google too, but I really think we aren't feeding screen readers enough information for them to properly read a page. I really don't know the answer to screen readers. I've never played much with it, but in the Windows world, if you were doing a winforms app you can sprinkle your form with metadata to help screen readers. But again, even the winforms solution is a bit like an alt tag.
When I took a usability class, we watched some video I wish I could find of somebody using a screen reader. Talk about intense. Imagine reading a web page, or any document for that matter, while looking through a straw that is only one word wide. That is about what it is like. Now read it with the voice cranked to "hyper fast talk mode" and that is how the blind experience the web. Very interesting and eye opening.
Whatever the future holds (silverlight/flex), we need to make sure the standard has some good, juicy metadata to help out screen readers (and google, really).
I believe what you are referring to is the "Hidden iframe" technique. Google lists plenty of resources on using this technique.
That's one of them, yes. It really depends on what you want to do; for example you don't need anything other than typical mousedown event handlers for things like Google Maps, and you can use things like dynamically generated image URIs to send data back to the server asynchronously, which is compatible all the way back to Netscape 2. There are lots of options, the value in XMLHttpRequest is more convenience than functionality.
<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />
Doesn't really matter how "hard to guess" your string is if you're going to transmit it cleartext in the body of your HTML document, does it?
"But wait!" you say, "We can randomize the string every time the document is served, thus defeating anything but an embedded Javascript with access to the DOM." Perhaps so, but now you're talking about server-side behavior — something clearly beyond the purview of the HTML specification.
If you think about it clearly, there's only one place that it makes any sense to address hostile embedded content, and it is server-side, with the growing battery of techniques already in service. Insisting that the HTML spec and browsers should be addressing this issue is asinine.
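For reference, the per-response randomization mentioned above would be a few lines on the server. A sketch in Python, assuming the hypothetical restricton/restrictoff tags from the earlier comment; the wrap_untrusted helper and its allowed parameter are invented for illustration, and the filtering that the site still has to do is left out.

```python
import secrets

def wrap_untrusted(content, allowed="safe-html"):
    """Wrap third-party content in the hypothetical restrict tags,
    using a fresh random lock for every response."""
    lock = secrets.token_urlsafe(16)  # unguessable, URL-safe characters only
    return (
        '<restricton lock="%s" except="%s" />' % (lock, allowed)
        + content
        + '<restrictoff lock="%s" />' % lock
    )
```

Because the lock is generated after the third-party content has been stored, the content cannot contain a matching restrictoff ahead of time, which is the property the randomization is meant to provide.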
I remember when rusty and friends rolled out Dynamic Comments on Kuro5hin/Scoop. They did it with an iframe that chucked out a bunch of onload() crap that wrote into the parent document. Pretty slick for the time.
Way ahead of its time though... most JavaScript was either for homework assignments or popup ads. All of it was copy/paste hackjobs that the web author found on super-mega-awesome-javascript.com or something. The result was "most people" hated JavaScript. You could browse 99% of the interweb with it disabled and all you'd miss were popups. Kuro5hin was one of the first reasons to actually turn on JavaScript because dynamic threaded comments were 100% better than the non-dynamic ones.
Now that JavaScript is starting to come of age and real programmers are writing cool things in it (and really JavaScript is kind of a cool programming language once you get past super-mega-awesome-javascript.com and the differing implementations), almost anything that is useful on the internet uses JavaScript in some way. In a way, JavaScript has crossed the chasm from early adopters like Kuro5hin to mainstream adoption and that nice beefy 80% of the market.
What I find funny is only the tech people are the laggards of this bell curve. And all 10% of them seem to hang out on slashdot pining for the days of yore. What a world we live in when the supposed alpha geeks are the laggards of a technology bell curve!!
Err, yes it does. From the Google Maps API reference:
And that's just a recent refinement. Google Maps has used the XMLHttpRequest object for ages. Yes, it's possible to get a similar effect using hidden iframes and such, but doing it that way is really awkward. They'd have to be crazy to pass that amount of data back and forth that way when they've got XMLHttpRequest.
I did. I still do: most images add nothing to the content, they merely add dancing bears to the web content, pull more bandwidth, provide client tracking through third-party 1-pixel GIFs, and generally slow down my web performance. They also interfere extensively with text->speech synthesis for the visually impaired.
The Web is not for the developers. It's for the people who want and need the data, the clients who in the end actually pay the bills and view the pages. If it's a games site for people to play Flash games, great: otherwise, get out of the dancing bears business and let me look up what I need.
I thank the HTML 5 guys for their attempts, but I prefer XHTML v2
From TFA:
"XHTML V2 isn't aimed at average HTML authors"
XHTML is for intelligent human beings, you know, people who can actually understand what separation of concerns is.
"[HTML v5] propose features that might simplify the lives of average Web developers"
So HTML v5 is for people who don't understand separation of concerns.
Unfortunately that's the 99% of web kiddies out there.
"The standards will appeal to different audiences."
One standard for smart people who know programming and actually work with an engineering mindset, another for those who see the web as a big graffiti and work with an "anything goes" mindset. No thanks, I prefer ONE standard for smart people, XHTML v2, and just to kick out everyone who isn't qualified.
What the web is crying out for is a standard that supports a rich data hierarchy, a rich presentation hierarchy, and a databinding mechanism to connect these two (preferably without using CSS, but that's another debate).
That's exactly where the next-gen UI frameworks have gone (Flex from Adobe, XAML from Microsoft). These frameworks represent the wave of the future and that's where the web needs to go too.
Meanwhile, the web standards community spouts all this rhetoric of "separating presentation and semantics" in HTML/CSS, which is nonsense. Both HTML and CSS are precisely concerned with presentation. And they are not at all separate. You need to know and love both to coax good looking pages out of a browser. All this huffing and puffing, yet the best they can offer for application-specific data models is microformats!
As far as I can tell, both HTML 5 and XHTML 2 are icing on the cake, and missing the main course altogether.
The beauty of the web was that anyone could put up a web page.
All you "standards nazis" out there, please don't forget that. The web is for everyone, yes, even those who can't write HTML "properly".
Hopefully browsers will always render badly formed HTML, otherwise the web will be a poorer place for it.
Recently, I've had the privilege of working with people who were presiding over both ISO committees and W3C committees. What struck me was their tendency to create standards that were on a high academic level. At the same time, any pragmatic argument failed to have any influence on the standard.
Although this leads to standards that are a pleasure to those who like the philosophical aspect of representation of and interaction with information - and I'm certainly one of them - it also leads to standards that will never be used.
In the real world outside ISO and W3C, mundane arguments, like cost of implementation, degree of skill needed to work with those standards, ease of transition, etc, etc. *are* of importance and will influence the standard that will prevail in the end.
Although I can enjoy the academic approach to a new standard, I have to say that as owner of an IT company my hopes are on the pragmatic approach of HTML V5.
BTW: The job I did for those ISO guys (they didn't work full-time for ISO) was to map the ISO standard they had developed to a practical implementation in the organisation they worked for, after they had failed to do so themselves, so go figure.