HTML V5 and XHTML V2
An anonymous reader writes "While the intention of both HTML V5 and XHTML V2 is to improve on the existing versions, the approaches chosen by the developers to make those improvements are very different. With differing philosophies come distinct results. For the first time in many years, the direction of upcoming browser versions is uncertain. This article uncovers the bigger picture behind the details of these two standards."
Is Microsoft involved in this at all? If it is, then I am worried. Other than that, I can only say..."bring it on." It will be a matter of some kind of plug-in to make all viable browsers feel at home in whatever environment they find themselves in.
Nobody cares about new web standards anymore.
You have to hand it to the W3C, they keep supplying web designers with rope.
I've been trying to get them (and browser people) to include a security oriented tag to disable unwanted features.
Why such tags are needed:
Say you run a site (webmail, myspace (remember the worm?), bbs etc) that is displaying content from 3rd parties (adverts, spammers, attackers) to unknown browsers (with different parsing bugs/behaviour).
With such tags you can give hints to the browsers to disable unwanted stuff between the tags, so that even if your site's filtering is insufficient (doesn't account for a problem in a new tag, or the browser interprets things differently/incorrectly), a browser that supports the tag will know that stuff is disabled, and thus the exploit fails.
I'm suggesting something like:
<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />
browser ignores features except for java and safe-html.
unsafe content here, but rendered safely by browser
<restrictoff lock="wrong_string" />
more unsafe content here but still rendered safely by browser
<restrictoff lock="Random_hard_to_guess_string" />
all features re-enabled
safe-html = a subset of html that we can be confident popular browsers can render without being exploited (e.g. <em>, <p>).
It doesn't have to be exactly as I suggest - my main point is HTML needs more "stop/brake" tags, and not just "turn/go faster" tags.
Before anyone brings it up, YES we must still attempt to filter stuff out (use libraries etc), the proposed tags are to be a safety net. Defense in depth.
With this sort of tag a site can allow javascript etc for content directly produced by the site, whilst being more certain of disabling undesirable stuff on 3rd party content that's displayed together (webmail, comments, malware from exploited advert/partner sites).
That's a very good article - as always IBM give a well-written introduction to the subject. But exactly what is the state of implementation of these? As far as I can gather, no browser maker has started to implement support for either. Is that correct? It would be useful to have some idea of the time scales we can expect on these both. Anyone know more about the state of play?
Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
internet is so much more than HTML5/XHTML2
that Microsoft will not support it... after all, they never fully supported HTML in the first place :P
On a serious note, though: I urge every web developer to stop treating MSIE as a special case, since it does not follow standards. Instead, code to the W3C standards, and if IE doesn't display the page properly, add something that redirects users to an info site explaining why they can't view it and recommending that they download a decent browser.
Pure awesomeness
All the talk about web applications these days, but a W3C-endorsed user interface markup language (like XUL/XAML) is nowhere to be seen.
A next-gen "HTML" should support common application widgets like panels, toolbars, menus, tabs, etc. Without them, it's not worth the effort IMO.
I have to agree with Linus on this one.
His position on the matter is right on as always.
I'm sticking with XHTML1.0 strict. Perhaps I'll use XHTML1.1 with appropriate DTD if I ever need to support the canvas element, other than that... none of this stuff is what I want from a markup language.
All the browser vendors have already said they will support HTML 5 (yes, that includes MS) and all but MS have said they won't support XHTML 2 (MS hasn't made much of an effort to suggest they will support it either).
As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.
Why not just go with XHTML all the way? I always thought that the best way of "fixing" all the broken and horribly written HTML out there on the web would be to build a proxy that translates broken HTML into well-formed XHTML and then sends that to the browser, cleaning up the whole double rendering path in the browsers (unless I misunderstood something). XHTML really could be enough for everyone, and having two standards instead of one certainly isn't working in anyone's favor.
Am I the only developer who's sick of this HTML / CSS / JavaScript mess?
People and companies are trying to develop rich applications using a decade-old markup language that's improperly supported by different browsers (even Firefox doesn't fully support CSS yet), and the result is a very ugly mix right now. It's like squeezing a rectangular plasticine object through round, triangular and star-shaped holes at the same time.
The web needs a reboot.
We need a programming language that:
*works on the server and the client
*makes building UIs as easy as drag and drop
*does not forgive idiot HTML "programmers" who write bad code
*doesn't suffer from XSS
*can be extended easily
*can be "compiled" for faster execution
*is implemented the same way in all browsers (or, even better, doesn't require a browser and works on a range of platforms)
Most of the web is non well-formed, so it's variations of HTML 4 with non-standard components. An HTML 5, that remains a non-XML language, presents a reasonable way forward for "web sites." Without the need to be well-formed, the tools to create are easier and can be sloppy, particularly for moderately admined sites. Creating a new HTML 5 might succeed in migrating those sites. If you avoid most breaks with HTML 4, beyond the worst offenders, Browsers could target an HTML 5, and webmasters would only need to change 5%-10% of the content to keep up. That would mean a less degrading "legacy" mode than the HTML 4 renderers we have now.
So while the HTML 4 renderers floating around wouldn't be trashed, they could be ignored, left as is, and focus on an HTML 5 one. Migrating to XHTML is non-trivial for people with out-dated tools and lack of knowledge. You can't ignore those sites as a browser maker, but HTML 5 might give a reasonable path to modernizing the "non-professional" WWW.
XHTML has some great features, by being well-formed XML, you can use XML libraries for parsing the pages. This makes it much easier to "scrape" data off pages and handle inter-system communication, which HTML is not equipped for.
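To make the scraping point concrete, here is a minimal sketch in Python using only the standard library's XML parser. The page content and the `extract_links` helper are made up for illustration; real XHTML pages with a DOCTYPE and named entities may need more care than this.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page, used only for illustration.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Stories</title></head>
  <body>
    <ul>
      <li><a href="/story/1">HTML V5 and XHTML V2</a></li>
      <li><a href="/story/2">Tag soup considered harmful</a></li>
    </ul>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs from a well-formed XHTML document."""
    root = ET.fromstring(xhtml_text)
    ns = {"x": XHTML_NS}
    # Because the page is well-formed XML, a generic XML library can walk it.
    return [(a.get("href"), a.text) for a in root.iterfind(".//x:a", ns)]


links = extract_links(PAGE)
print(links)
```

No HTML-specific error recovery is involved here; the same code would work on any XML vocabulary, which is the whole appeal.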
It's interesting that HTML and XHTML look almost identical (for good reason: XHTML was a port of HTML to XML) but are technically very different, HTML being an SGML language and XHTML an XML language. Both languages have their uses. HTML is "easier" for people to hack together because if you do it wrong, the HTML renderer makes a best guess. XHTML is easier to use professionally, because if there is a problem, you can catch it as an invalid XML document. Professionals worry about cross-browser issues; amateurs worry about getting it out there.
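A rough sketch of that difference, using Python's standard library as a stand-in for a browser: the XML parser rejects the broken fragment outright, while the tolerant HTML tokenizer just reports what it sees. The fragment and the helper names are invented for the example, and `html.parser` is far simpler than a real browser's error recovery.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical tag-soup fragment: <em> is never closed.
BROKEN = "<p>Unclosed <em>emphasis</p>"


def is_well_formed(markup):
    """True if the markup parses as XML, False if the XML parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Tolerant parse: record start tags without complaining about errors."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


collector = TagCollector()
collector.feed(BROKEN)
collector.close()

well_formed = is_well_formed(BROKEN)
print(well_formed, collector.start_tags)
```

The XML route gives the author a hard failure to fix; the HTML route leaves it to each renderer to guess where `<em>` ends.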
XHTML "failed" to replace HTML because it satisfies the needs of professionals to have a standardized approach to minimize cross-browser issues, but lacks the simplicity needed for amateurs and lousy professionals.
Rev'ing both specs would be a forward move that might simplify browser writing in the long term while giving a migration path. XHTML needs a less confusing and forward looking path, and HTML needs to be Rev'd after being left for dead to drop the really problematic entries and give people a path forward.
It sure sounds like you're suggesting Flash/Flex/Apollo with something like ColdFusion/Java on the backend.
Seriously, at this point, having a single standard for web pages is going to be passe. All it will take is a good open source implementation for the browser, critical mass, and eventually, the big players will follow.
This is my sig.
If different browsers decide to adopt different standards, people (ie. web designers and developers) will fall back on whatever currently works for all browsers (ie. whatever we're using right now).
Browser vendors have publicly stated they will not implement XHTML 2. That 'standard' is stillborn. (X)HTML 5 is the way forward.
So I guess the question is which one Microsoft will ignore the most?
Both standards are being worked on by the W3C standards group.
According to the IBM paper html 5 is being done independently of the W3C. "In April 2007, the W3C voted on a proposal to adopt HTML V5 for review" is about as much as W3C has with html 5.
FalconShould there be a Law?
Here is what I would suggest: 1) a multi-column drop-down with sort capabilities, which is something available in desktop applications; 2) a built-in browser menu; 3) better scripted modal windows: I should have OK (alert), OK/Cancel (confirm), Yes/No, and Yes/No/Cancel message boxes at least, or a better way of specifying these.
Maybe some of this is improvements in JS/ECMA Script, but making more things 'built-in' to the browser, would make a more standard experience, assuming you could get everyone to upgrade to the browsers and people to develop to these standards.
After reading this article, it would be nice to have both these standards merged into one so I get xforms with HTML5 menu and toolbars.
Only 'flamers' flame!
Does slashdot hate my posts?
And what if I'm interested in creating elegant, accessible documents incorporating the new features? I guess I'm screwed by the idiots at W3C, then?
That's not a matter of allowing those poor, repressed amateur web designers to express themselves. It's their problem that they cannot comprehend something as easy as XML and it's a pity that Web authoring tools don't work like a good, old hammer - if you don't know how to use it, you hit your fingers and know better next time to be careful because it hurts. 1997 is over, tag soup is becoming a horror of the past - what's up with those people trying to keep it?
This is Slashdot. Common sense is futile. You will be modded down.
The worst thing about W3C standards is the lack of a reference implementation. If you can't produce a computer program that implements 100% of the specification you are writing in a reasonable timeframe, your standard is too complex.
It doesn't matter if the reference implementation is slow as molasses or requires vast quantities of memory; at least you have proven the standard is realistically implementable. On the other hand, if your reference implementation was easy to build and is really good, then that will foster code re-use and massively jump-start the availability of standardised implementations from multiple vendors. It might also show that you have a really good standard there.
If you don't do this, you get stuff like SVG - I don't think there is even one single 100% compliant SVG implementation anywhere, and there may never be.
There aren't any fully compliant CSS, or HTML implementations either, to my knowledge.
The same goes for XHTML and HTML5. If you, as a standards organisation, are not in a position to directly provide, or sponsor the development of an open reference implementation, then personally, I think you should be restricting your standard to a smaller chunk of functionality that you are actually able to do this with.
There is no reason a composite standard, with a bunch of smaller, well defined components, each with reference implementations, can't be used to specify 'umbrella' standards.
Now, I am also aware that building a reference implementation tends to make the standard as written overly influenced by shortcomings in that implementation, but I really can't believe this would be worse than the debacle surrounding WWW standards we've had for the last 10+ years. Without a conformant reference implementation, HTML support in browsers is dictated by the way Internet Explorer and Netscape did things anyway.
I'm also aware that smaller standards tend to promote a rather piecemeal evolution of those standards, when what is often desired is an 'across the board' update of technology.
But this 'lets define monster standards that will not be fully implemented for years, if at all, and hope for the best' approach seems to be obviously bad, allowing larger vendors to first play a large role in authoring a 'standard' that is practically impossible to fully implement, and then to push their own hopelessly deficient versions of these 'standards' on the world and sit back and laugh because there is no way to 'do better' by producing a 100% compliant version.
I gots ta ding a ding dang my dang a long ling long
Why bother with any kind of standard? IE will just display it fscked up anyway...
The author apparently has no experience with rendering XHTML on mobile devices. First of all, since the screen is smaller, it's not just about restyling things in a minimalist theme. It's about prioritizing information and removing what is unnecessary, so the more important information becomes more accessible in limited display real estate.
For example, anyone who has accessed the Slashdot homepage on a mobile phone knows the pain of having to scroll down past the left and right columns before reaching the stories. You can simulate this experience by turning off page style and narrowing your browser window to 480 pixels wide. The story summaries are less accessible because they're further down a very long, narrow page.
Another problem is memory. Even if you style the unnecessary page elements with "display: none", they're still downloaded and parsed by the mobile browser as part of the page. Mobile devices have limited memory, and I get "out of memory" errors on some sites. For reading long articles on mobile devices, it is better to break content into more pages than you would on a desktop display, both for presentation and memory footprint reasons.
For these two reasons, a site designer generally has to design a new layout for each type of device. The dream of "one page (and several style sheets) to rule them all" is a fairytale.
I once had a signature.
The current situation is awful.
There is no contest: the browser vendors have repeatedly made it very clear on the WHATWG and HTML5 mailing lists that they do not intend to further support XHTML. They are going down the HTML5 dead-end, and s0d the rest of us.
<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />
Doesn't really matter how "hard to guess" your string is if you're going to transmit it cleartext in the body of your HTML document, does it?
"But wait!" you say, "We can randomize the string every time the document is served, thus defeating anything but an embedded Javascript with access to the DOM." Perhaps so, but now you're talking about server-side behavior — something clearly beyond the purview of the HTML specification.
If you think about it clearly, there's only one place where it makes any sense to address hostile embedded content, and that is server-side, with the growing battery of techniques already in service. Insisting that the HTML spec and browsers should be addressing this issue is asinine.
SIERRA TANGO FOXTROT UNIFORM
*doesn't and never will exist.
HTML, Javascript and CSS have become the lay of the land, like it or not, because they've been in use for so many years now. Replacing them all in one fell swoop would require everyone to throw out all of their old code and start over from scratch, for little or no reason. What we have now actually works, but the edge cases and never deprecating old components is killing overall compatibility.
Instead, what needs to happen is, all browsers that cannot render a page simply need to not try and bail out, telling the developer that at this point, their webdesign sucks and is invalid. Of course, as long as Microsoft Browser of the Idiots exists, this will never happen either, as they pride themselves on the fact their browser is perpetually broken.
You may love the latest stuff shipping with Vista, but it's not on my computer and I'm not going to swipe a copy.
I don't even have Comic Sans, Arial, Verdana, Times New Roman, etc.
I do have fonts. Some of them look kind of nice. You probably don't have them.
if awkward to install in some systems.
I liked it, and didn't find it awkward to install on either of the two PCs I installed it on. Back then though I had Windows now I have a Mac. I also liked XMLSpy.
You don't need to pass a validating parser, much less walk the DOM on every document. Comments should be stored in the database with the correct encoding. What kind of moron stores user submitted content in a db without first converting it to their apps default encoding? RSS feeds are irrelevant, even with ATOM you should check the encoding in the validation step as you would for any other third party content.
Ads are a problem, market forces should lead the brokers to standards compliance; unfortunately one vendor managed to grab a monopoly on the browser market. So I give you ads but consider the other points bogus.
The W3C was well on the way toward being fully useless, pointless, and ignored. They'd built themselves a lovely ivory tower, locked themselves inside it, covered their eyes and ears, and started to enjoy LSD. It was heaven for people who liked politics and design-by-committee more than engineering and practicality.
We love our tag soup. It mostly works, unlike xhtml which only works in Gecko. (nope, not IE, unless you use the text/html MIME type and your "xhtml" just happens to be tolerated when parsed as html) Tag soup gets stuff done.
XML is nothing to be proud of. Though I'm no fan of LISP, even LISP-style notation would be better than XML. XML is gross inefficiency while not even being particularly readable. In any case it's not a significant improvement over the bastardized SGML that is the foundation of HTML.
I thank the HTML 5 guys for their attempts, but I prefer XHTML v2
From TFA:
XHTML V2 isn't aimed at average HTML authors
XHTML is for intelligent human beings, you know, people who can actually understand what separation of concerns is.
[HTML v5] propose features that might simplify the lives of average Web developers
So HTML v5 is for people who don't understand separation of concerns.
Unfortunately that's 99% of the web kiddies out there.
The standards will appeal to different audiences.
One standard for smart people who know programming and actually work with an engineering mindset, another for those who see the web as a big graffiti wall and work with an "anything goes" mindset. No thanks, I prefer ONE standard for smart people, XHTML v2, and just to kick out everyone who isn't qualified.
I often wish for an Open Source browser brave enough to say "screw the W3C, we're going to be IE compatible". I suppose it's OK to leave out the exploitable buffer overflows. I want the rest though.
...and so on, etc., ...
Recognize the popular ActiveX controls, providing Open Source substitutes when possible. Feed any remaining ActiveX crap into Wine, with appropriate sandboxing.
Do the VBscript stuff.
Do the DirectAnimation stuff.
Ignore MIME types; they get lost anyway when you save the files.
Being "right" just isn't worth the trouble. This isn't a fight worth fighting.
What the web is crying out for is a standard that supports a rich data hierarchy, a rich presentation hierarchy, and a databinding mechanism to connect these two (preferably without using CSS, but that's another debate).
That's exactly where the next-gen UI frameworks have gone (Flex from Adobe, XAML from Microsoft). These frameworks represent the wave of the future and that's where the web needs to go too.
Meanwhile, the web standards community spouts all this rhetoric of "separating presentation and semantics" in HTML/CSS, which is nonsense. Both HTML and CSS are precisely concerned with presentation. And they are not at all separate. You need to know and love both to coax good looking pages out of a browser. All this huffing and puffing, yet the best they can offer for application-specific data models is microformats!
As far as I can tell, both HTML 5 and XHTML 2 are icing on the cake, and missing the main course altogether.
Amen!
Web Browsers and DHTML/DOM/JS are meant for "e-brochures", yet people are trying to bend them into everything, and it gets uuuugly.
Java's probably the closest to what we really need; however, it needs to simplify its GUI APIs (most of its APIs are bureaucratic, in fact), de-link the GUI engine from specific languages, and/or allow some kind of scripting/dynamic-typed option, and go OSS.
Table-ized A.I.
The word is "clique", not "click".
The beauty of the web was that anyone could put up a web page.
All you "standards nazis" out there, please don't forget that. The web is for everyone, yes, even those who can't write HTML "properly".
Hopefully browsers will always render badly formed HTML, otherwise the web will be a poorer place for it.
Uhh, the direction of browsers is to XSLT transform whatever format they receive into one that is compatible with the format they render. It's not so much a quandary as a waiting game to see what formats they'll need to transform.
To imagine it any other way is silly.
So, like C++?
Recently, I've had the privilege of working with people who were presiding over both ISO and W3C committees. What struck me was their tendency to create standards at a high academic level. At the same time, pragmatic arguments failed to have any influence on the standard.
Although this leads to standards that are a pleasure to those who like the philosophical aspect of representation of and interaction with information - and I'm certainly one of them - it also leads to standards that will never be used.
In the real world outside ISO and W3C, mundane arguments, like cost of implementation, degree of skill needed to work with those standards, ease of transition, etc, etc. *are* of importance and will influence the standard that will prevail in the end.
Although I can enjoy the academic approach to a new standard, I have to say that as the owner of an IT company my hopes are on the pragmatic approach of HTML V5.
BTW: The job I did for those ISO guys (they didn't work full-time for ISO) was to map the ISO standard they had developed to a practical implementation in the organisation they worked for, after they had failed to do so themselves. So go figure.
HTML V5 rewritten as an XML dialect will probably not be an XHTML V5, as XHTML uses its own naming scheme and branching for development. Unfortunately, HTML has diverged in two opposite directions, and the HTML V5 one is simply better for everyday users/developers.
So when article author says:
...they provide us with a way to produce something that works vaguely like a modern desktop GUI that doesn't take 6 months to get working and even then have more bugs than Scotland in midge season!
The web applications produced in the last few years are a real testament to dogged determination in the face of insurmountable odds, but come on, we need a rethink regarding (interactive) web applications if they are going to really take off.
I used to have a better sig but it broke.
HTML5 and "XHTML5" are two serializations of the same "language". The HTML5 spec explains how to transform an HTML5 text into a DOM, while XHTML5 just uses a normal XML parser. Both result in a DOM, and the way that DOM is interpreted is consistent between the two serializations.
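As a loose illustration of "two serializations, one structure", the sketch below feeds an HTML-syntax fragment and its XML-syntax twin through two different standard-library parsers and compares the element sequence they report. Big caveat: Python's `html.parser` is a tolerant tokenizer, not an implementation of the HTML5 parsing algorithm, and the fragments are invented, so this only shows the idea, not the spec's actual behaviour.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The same hypothetical content in the two syntaxes.
HTML_SYNTAX = "<p>Hello <em>world</em><br>bye</p>"   # void <br>, HTML style
XML_SYNTAX = "<p>Hello <em>world</em><br/>bye</p>"   # self-closed, XML style


class ElementOrder(HTMLParser):
    """Record element names in document order from the HTML syntax."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)


html_side = ElementOrder()
html_side.feed(HTML_SYNTAX)
html_side.close()

# The XML syntax goes through an ordinary XML parser instead.
xml_names = [el.tag for el in ET.fromstring(XML_SYNTAX).iter()]

print(html_side.names, xml_names)
```

Both sides report the same elements in the same order, even though the bytes on the wire differ.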
As long as browsers support slop, developers will write slop. A browser that doesn't support slop, won't be used.
If these standards are not going to be completed until 2012, or later, that just means that the web will be even more entrenched in old standards, and people will be even more reluctant to change.
Very few people pay attention to the current standards. Why is anybody going to pay attention to the new "standards"?
...with Black Jack! And hookers! In fact, forget the standard!
All I want to know about upcoming XHTML versions is, can we finally put block-level elements inside paragraphs? We've been whining about the inability to do this, and basically not _using_ paragraphs as such because we can't, for over a decade. It's, as far as I'm concerned, the *one* problem with XHTML as it stands.
As far as HTML5, I thought XHTML was supposed to *be* HTML5, conceptually if not in name. Having received the benefits of well-formed markup, why would anyone ever want to go back to the old "Maybe this element is inside of that paragraph, or maybe it's after it, depending on where we decide the elided close tag belongs" way of doing things? I for one don't *EVER* want to deal with non-wellformed markup again. Make it go away.
Cut that out, or I will ship you to Norilsk in a box.
I haven't looked at it for quite a while, but at least for many years, the "CSE HTML Validator" wasn't actually a validator
Thanks for that. I wonder why I didn't hear this earlier. See, several years ago for classes in college we had to write, and validate, XHTML, and the professors had us use CSE's HTML Validator. There was a free 50-use version, but students could easily use it more than 50 times, so they arranged a deal with CSE for students to buy an unlimited version at a reduced cost.
Mmmm, difficult to take any side at the moment, but this is yet another format war. I guess it is urgent to make the two proposals converge; otherwise we might waste energy on duplicate standard maintenance and implementation.
Just a simple question: this post on Ars Technica describes a pretty cool example of a web page using HTML5. Can XHTML gurus tell us how this would be done using XHTML 1.1?
...and what happens when a user with an older or non-compliant browser views your site that doesn't properly handle this tag that you'd be relying on? You can NOT implement your security client-side and expect it to be anything more than a speed bump for those that want to circumvent it. HTML is not meant to handle security in any way, and it shouldn't be expected to, ever, for the obvious reasons. What's stopping you from doing the same with server-side code, and why on earth wouldn't you prefer that to client-side? Your tag is a horrible idea, but you might see it in some upcoming version of IE anyway.
Regexes are completely the wrong tool for handling HTML. Even what he said about entity encoding them can be dangerous, because removing 'bad' things doesn't always make it safer. And regexes can only handle tags that are nested no deeper than some level.
Which is, of course, why one uses the proper tool to handle the data.
Mind you, I got this information from the Perl regex book, so it's not like I hate regexes or Perl or anything. I mean, I did my first multi-threaded code by modifying a JAPH (yeah, *that* JAPH...). For that matter, Perl has a bunch of very nice parsers that can handle tag soup without mangling it. There's a nice recursive descent parser, not to mention one or three specifically for HTML (and XML, etc.).
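The nesting problem is easy to reproduce. The comment is about Perl, but here is a small sketch in Python: a naive non-greedy regex stops at the first closing tag, while an event-based parser that tracks depth gets the whole element. The markup and the `OuterDivText` class are invented for the example.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup with one level of nesting.
MARKUP = "<div>outer <div>inner</div> tail</div>"

# Naive regex: the non-greedy match ends at the FIRST </div>,
# so the nested element cuts the result short.
regex_result = re.search(r"<div>(.*?)</div>", MARKUP).group(1)


class OuterDivText(HTMLParser):
    """Collect all text inside <div> elements by tracking nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


parser = OuterDivText()
parser.feed(MARKUP)
parser.close()
parser_result = "".join(parser.chunks)

print(repr(regex_result))
print(repr(parser_result))
```

A greedy regex would fix this one input and break on two sibling divs instead; the depth counter is what a regex of this kind cannot express.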
Then again, if I look at who I'm replying to, I recognize that nickname as being the same as one of the Perl dev's nickname, so you probably already knew that. Looking at your homepage, you probably are *that* chromatic, so I'll just shut up now.
You lost any respect when you mentioned using Dreamweaver... Using Dreamweaver is like using those auto game creation tools to make games, (MMORPG Creater 10.5.21, Click a button, get a game!), what you end up with is a steaming pile of crap. If people want to be weekend web site developers, let them learn the proper way. Using a product to butcher the code for you (that you would then end up trying to learn from) is the worst way to go.
If you really want to learn, get a book. Study the basic precepts of the subject you want to learn. Then practice, practice, and practice some more.
You mean the roles attribute... right? My brain had a parse error trying to figure out what you meant :-(
...
Maybe my brain only works with XHTML
Let the HTML Wars begin!
Minti: What's that huge shuriken in your back?! Kin: It's the instrument of my victory.
What's the difference between HTML V5 and XHTML V2?
I'm not sure how it is that you're misunderstanding me, but I damn well do know how this stuff works. I've even written a web server (like everybody and their dog, right?) and plenty of code to parse HTML.
Now, to an extent, there is something dynamic: pages are being automatically generated. It doesn't matter when. The pages can be cached, or not. They can be generated and kept forever, served out identically to every visitor. It just doesn't matter, except for web server performance.
To attack, one supplies data that will wind up inside the page. (a forum post, an email, etc.) It is at this moment that the attacker has his one and only chance to guess the random secret. The page is generated either right then, or repeatedly in the future. The attacker can now see it, but so what? He lost, and can not fix his error. His next attempt will be on a fresh new page with a fresh new secret. Knowledge of previously generated pages is completely useless to him.
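A sketch of the server side of that argument, assuming the hypothetical `restricton`/`restrictoff` tags proposed in this thread (no browser implements them, and the function name is made up). Each generated page gets a fresh lock from Python's `secrets` module, so a closing tag the attacker baked into the submitted content cannot match.

```python
import secrets


def wrap_untrusted(content, allowed="safe-html"):
    """Wrap third-party content in the hypothetical restriction tags.

    A fresh random lock is drawn per call, i.e. per generated page, so a
    lock guessed when the content was submitted is useless afterwards.
    """
    lock = secrets.token_hex(16)  # 128 bits of randomness, 32 hex characters
    page = (
        f'<restricton lock="{lock}" except="{allowed}" />'
        f"{content}"
        f'<restrictoff lock="{lock}" />'
    )
    return page, lock


# The attacker has to commit to a guess inside the submitted content.
ATTACK = '<restrictoff lock="my_guess" /><script>evil()</script>'

page_a, lock_a = wrap_untrusted(ATTACK)
page_b, lock_b = wrap_untrusted(ATTACK)

print(lock_a != lock_b)
```

This is still only the proposed safety net; the content would be filtered first, as the original suggestion says.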