HTML V5 and XHTML V2

Re:Where is Microsoft? by morgan_greywolf · 2007-12-16 05:55 · Score: 3, Informative

Both standards are being worked on by the W3C standards group. Microsoft, along with all other major browser developers, is a member.

--
My blog

Re:Where is Microsoft? by Anonymous Coward · 2007-12-16 06:01 · Score: 0

Why would microsoft be involved? They don't give a shit about web standards! Maybe it would be a good thing if they got involved - not to be confused with "in charge".

Bet there still isn't a decent "Stop!" button by TheLink · 2007-12-16 06:02 · Score: 5, Interesting

You have to hand it to the W3C, they keep supplying web designers with rope.

I've been trying to get them (and browser people) to include a security oriented tag to disable unwanted features.

Why such tags are needed:

Say you run a site (webmail, myspace (remember the worm?), bbs etc) that is displaying content from 3rd parties (adverts, spammers, attackers) to unknown browsers (with different parsing bugs/behaviour).

With such tags you can give hints to the browsers to disable unwanted stuff between the tags, so that even if your site's filtering is insufficient (doesn't account for a problem in a new tag, or the browser interprets things differently/incorrectly), a browser that supports the tag will know that stuff is disabled, and thus the exploit fails.

I'm suggesting something like:

<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />
browser ignores features except for java and safe-html.
unsafe content here, but rendered safely by browser
<restrictoff lock="wrong_string" />
more unsafe content here but still rendered safely by browser
<restrictoff lock="Random_hard_to_guess_string" />
all features re-enabled

safe-html = a subset of html that we can be confident that popular browsers can render without being exploited (e.g. <em>, <p>).

It doesn't have to be exactly as I suggest - my main point is HTML needs more "stop/brake" tags, and not just "turn/go faster" tags.

Before anyone brings it up, YES we must still attempt to filter stuff out (use libraries etc), the proposed tags are to be a safety net. Defense in depth.

With this sort of tag a site can allow javascript etc for content directly produced by the site, whilst being more certain of disabling undesirable stuff on 3rd party content that's displayed together (webmail, comments, malware from exploited advert/partner sites).
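As a rough illustration of the server side of this proposal, here is a minimal Python sketch. The <restricton> and <restrictoff> elements are the hypothetical tags from the post above (no browser implements them), and the function name wrap_untrusted is made up for this example. It only shows how a site might generate a fresh lock per fragment; it does not replace filtering.

```python
import secrets

def wrap_untrusted(fragment: str, allowed: str = "safe-html") -> str:
    """Wrap third-party markup in the hypothetical restricton/restrictoff pair."""
    # A fresh random lock per fragment, so the content cannot know it in advance.
    lock = secrets.token_hex(16)
    # Paranoid check: never pick a lock that already appears in the content.
    while lock in fragment:
        lock = secrets.token_hex(16)
    return (
        f'<restricton lock="{lock}" except="{allowed}" />\n'
        f"{fragment}\n"
        f'<restrictoff lock="{lock}" />'
    )
```

A restrictoff injected by the third party carries a different lock string, so under the proposal a supporting browser would ignore it.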


Re:Bet there still isn't a decent "Stop!" button by LiquidCoooled · 2007-12-16 06:09 · Score: 3, Insightful

Why not just simplify your entire comment:

Content from a 3rd party runs in a more restrictive context than the primary site (this includes frames etc).
You are then not held at the whim of a web admin to ensure these tags are included.

Or you could just use the noscript addin right now and choose which sites you trust at your discretion.

--
liqbase :: faster than paper
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 06:14 · Score: 0

I like your idea for security tags, but I could see your method still trashing the page with broken HTML. A quick idea might be to shove something in a stylesheet, a script, or the . Whatever it would be, it would refer to the id of the block you want to set and would be out of the guts of your HTML. Dunno how my idea would work on a site like slashdot where you have 300 things you want to secure... yours would be more straightforward in that case. In the perfect world, you'd want to isolate out the untrusted data to another document that would get included.

$100 says we both are on the right track, but there is a good reason it wouldn't work.
Re:Bet there still isn't a decent "Stop!" button by wizardforce · 2007-12-16 06:22 · Score: 2, Interesting

good idea, although in the case of myspace, it wasn't a technical problem that prevented them from keeping pages "safe" [eg. preventing the execution of malicious code] it had to do with the fact that myspace, by default allows everything *on purpose* they could have built the system such that certain tags would/could be disabled [slashdot is an example] and as big as myspace is, resources are not a problem- apathy and the need to incorporate everything user generated into pages [to hell with security! we want to build our pages any which way we like!] is.

--
Sigs are too short to say anything truly profound so read the above post instead.
Re:Bet there still isn't a decent "Stop!" button by throup · 2007-12-16 06:29 · Score: 3, Insightful

Could you not get around that by injecting code like:

</restriction> 
Something evil!!!
<restriction lock="I don't really care here" except="everything"> 

Obviously I need to work on something more destructive than "Something evil!!!" before I attempt to conquer the planet...
Re:Bet there still isn't a decent "Stop!" button by TheLink · 2007-12-16 06:31 · Score: 2, Insightful

Can't do that. That's because often the website you visit is the one sending the 3rd party data.

Think webmail (yahoo, gmail etc), when you receive spam, your webmail provider is the one sending you the data.

Usually they will try to filter the content to make it safe. BUT as history shows it's not always 100%.

The W3C or browser maker might also make a new tag/feature that your filtering libraries aren't aware of (e.g. old sites with guestbooks that might not filter out the "latest and greatest stuff").

With my proposal, users can enable javascript+flash for stuff like youtube, and youtube can be more certain that the comments about the video will be treated as plain html by browsers that support the security tag. Stuff that slips through the filters would likely still be rendered inactive by those browsers.
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 06:32 · Score: 4, Insightful

even if your site's filtering is insufficient (doesn't account for a problem in a new tag

Why would your site let through new tags that it doesn't recognise? Use a whitelist.

the browser interprets things differently/incorrectly

This only usually occurs if you let through malformed HTML. Use tidy or similar to ensure you only emit valid HTML. Not to mention the fact that the whole problem is caused by lax parsing — something the W3C has been trying to get people to give up on with the parsing requirements for XML.

safe-html = a subset of html that we can be confident that popular browsers can render without being exploited (e.g. <em>, <p>).

You could define such a subset using the modularised XHTML 1.1 or your own DTD.

Before anyone brings it up, YES we must still attempt to filter stuff out (use libraries etc), the proposed tags are to be a safety net. Defense in depth.

Yes, but it won't be actually used that way. If browsers went to the trouble of actually implementing this extra layer of redundancy, all the people with lax security measures would simply use that as an alternative and all the people who take security seriously will use it, despite it not being necessary. I think the cumulative effect would be to make the web less secure.

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by BlueParrot · 2007-12-16 06:33 · Score: 2, Interesting

There is the object tag. It can be used as a client-side include. All it really needs is a "permissions" attribute or something like that: <object permissions="untrusted" codetype="text/html" codebase="foo.html"> </object>
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 06:43 · Score: 1

You could define such a subset using the modularised XHTML 1.1 or your own DTD.

Or monkeys could fly out of our asses :-)

The idea of modular XHTML is a nice one, but unless I'm missing something, this new XHTML modular thingy we are talking about would still need to be supported by the browser, right? In other words, it will not be supported and is a waste of time.

Modular XHTML is a nice idea in theory, but honestly... nobody will use a module unless it is implemented by Firefox and IE. Can you name any existing XHTML modules implemented by both browsers?

Er.. atom or rss?
Re:Bet there still isn't a decent "Stop!" button by TheLink · 2007-12-16 06:44 · Score: 2, Insightful

No because the closing tag has to have a lock string that matches the lock on the opening tag.

My attempts to change the world (albeit by a little bit) aren't going very well either - it's been more than 5 years since I first proposed the tags, but so far the W3C and Mozilla bunch have preferred to make other "more fun" stuff instead...

Maybe Microsoft has subverted the W3C too :).
Re:Bet there still isn't a decent "Stop!" button by Ambush+Commander · 2007-12-16 06:44 · Score: 2, Insightful

This is a novel technique (the unique, hard to guess string, which easily could be a hash of the document and a secret salt the website has) I have not seen before, but this merely punts the issue to the browsers. It cannot be solved there (as you mention); in fact, it does not even begin to solve it: think about the legacy browsers floating around the web. I don't even trust browser vendors to lock down all of this code: they also have their own security bugs.

There is also the minor point that your method is almost completely incompatible with DOM, but I'll overlook that for now. :-)
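The "hash of the document and a secret salt" idea could be sketched roughly as below. This is only an illustration: derive_lock and site_secret are hypothetical names, and HMAC-SHA256 is one reasonable choice of keyed hash, not something the thread specifies.

```python
import hashlib
import hmac

def derive_lock(document: str, site_secret: bytes) -> str:
    """Derive a lock string from the page content and a server-side secret."""
    # HMAC keeps the secret out of reach even if the document text is known.
    return hmac.new(site_secret, document.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same document and secret always give the same lock, so the server does not need to store it, while anyone without the secret cannot compute it from the page text alone.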
Re:Bet there still isn't a decent "Stop!" button by throup · 2007-12-16 06:49 · Score: 1

I see the error in my own logic: you are treating restriction as an empty element, so I can't inject a closing tag for it.

Now I have realised that, another (less critical) concern occurs to me: any user agent would have to treat your document as tag-soup instead of parsing a DOM-tree because that would be the only way to recognise the on and off states. Whether you see that as a problem or not depends on your attitude to the difference between HTML 4 and XHTML 1; an argument which is surely taking place elsewhere on this page so I won't go into it here :-).
Re:Bet there still isn't a decent "Stop!" button by nonpareility · 2007-12-16 06:49 · Score: 1

This sounds like what you're talking about, albeit only for script.
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 06:57 · Score: 1

The idea of modular XHTML is a nice one

It's not an idea, it's been a published Recommendation for over six years.

this new XHTML modular thingy we are talking about would still need to be supported by the browser, right?

No. If the server validates the untrusted data, what's the point in the browser doing it too? Validation is deterministic, you don't get double the security by doing it twice.

Can you name any existing XHTML modules implemented by both browsers?

All of them. XHTML 1.1 is XHTML 1.0 Strict broken up into modules.

Er.. atom or rss?

Those are not XHTML.

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by Anonymous Coward · 2007-12-16 07:02 · Score: 0

Yes that could work but it's a bit harder to do for people building websites - need a working link back to foo.html and something needs to handle the subsequent request for foo.html.
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 07:04 · Score: 1

While it is true that the server should validate the inbound (and really outbound) HTML, you have to admit it isn't easy. In perl land, there are some CPAN modules to help, but none of them ever feel right. I'm not sure what the answer is really. But I think the OP's idea was for a couple extra "hey Mr. Browser, yeah, we suck about validation... please dont trust this crap here.. we tried our best on the server, but really, you shouldn't trust it either".
Those are not XHTML.

That was what I thought.. just guessing. MathML or whatever? I'm not sure, but does IE do that?
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 07:18 · Score: 1

While it is true that the server should validate the inbound (and really outbound) HTML, you have to admit it isn't easy.

On the contrary, it's very easy. There's plenty of tools out there to do this for you.

In perl land, there are some CPAN modules to help, but none of them ever feel right.

What do you mean by "feel right"?

I think the OP's idea was for a couple extra "hey Mr. Browser, yeah, we suck about validation... please dont trust this crap here.. we tried our best on the server, but really, you shouldn't trust it either".

What do you think the browser is going to do that you can't?

Those are not XHTML.

That was what I thought.. just guessing. MathML or whatever? I'm not sure, but does IE do that?

That's not XHTML either. Once more, XHTML 1.1 is XHTML 1.0 broken up into modules. The OP wanted to be able to specify permitted features of XHTML in a fine-grained manner. XHTML 1.1 defines modules you can use, or you can define your own DTD. Stop thinking about non-XHTML features like Atom and MathML! This is about normal XHTML like <script> , <table> , etc.

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by mattwarden · 2007-12-16 07:44 · Score: 0

But, in order to support "web 2.0" apps, browsers would need to allow this entity to be scriptable in the DOM (otherwise, I don't see how you could support restricting parts of dynamic content). And if that is the case, then a script could simply un-restrict any part of the page at will.

Do you have a solution to handle this without causing issues with dynamic content?
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 07:55 · Score: 4, Insightful

On the contrary, it's very easy. There's plenty of tools out there to do this for you.

Cow Crap!

You want easy? SQL injections are easy to handle. Just use a parameterized query so you don't have to mix tainted data with your trusted SQL.
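For readers who have not seen one, a parameterized query looks roughly like this. The sketch uses Python's built-in sqlite3 module as a stand-in for the PHP/MySQL setup discussed here; the users table and the find_user helper are invented for the example.

```python
import sqlite3

def find_user(conn, name):
    """Look up a user by name without ever splicing the value into the SQL text."""
    # The ? placeholder sends the value separately from the statement.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

# Throwaway in-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

A classic injection string such as ' OR '1'='1 is treated as an odd-looking name that matches nothing, because it never becomes part of the SQL.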

Back in the stone age before php thought parameterized queries were more than enterprise fluffery, you were forced to mix your user data with your SQL. And oh were the results hilarious! It took three tries (and three fucking functions) for PHP/mysql to get their escape code right and I'm sure you can still inject SQL with "mysql_real_escape_string()" in some new unthought of way.

There is no "parameterized query" with HTML. You are *forced* to mix hostile user data with your trusted HTML. If it was that hard to sanitize an "easy" language like SQL, how hard is it to sanitize a very expressive language like HTML?

You are telling me all those CPAN modules handle the hundreds of ways you can inject HTML into the dozens of different browsers? How many ways can you make an angle bracket and have it interpreted as a legit browser tag? How many ways can you inject something to the end of a URL to close the double quote and inject your javascript? How many ways, including unicode, can you make a double quote? Dont forget, your implementation cannot strip out the Unicode like I've seen some filters do - I need the thing to handle every language! I would guess there are thousands of known ways to inject junk into your trusted HTML.

I promise you that even the best CPAN module is still exploitable in some way not considered by the author. And I'd be insane to roll my own, as I'm not as smart as she is.

Don't kid yourself into thinking filtering user generated content is easy. It is very, *very* hard.
Re:Bet there still isn't a decent "Stop!" button by BlueParrot · 2007-12-16 07:59 · Score: 1

What do you think the browser is going to do that you can't?

Your implication is basically that web-developers are more competent in terms of security than those who design the clients, and thus the client should just swallow the stuff without even bothering. In reality there are MANY people who make web pages who would probably trust the browser developers a lot more than they trust themselves not to make a mistake.

Also, you're not looking at this from the point of view of the user. I might want to tell my browser to trust John Smith not to put malicious stuff into his webpage, but that doesn't mean I trust him to be capable of finding all known and unknown exploits some of the third parties may have sent him. In this situation I would very much appreciate it if John Smith had marked the data on his webpage saying "this is from me" , "this is from somebody else".
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 08:19 · Score: 1

I think the problem is that it is still too easy to exploit. I'm sure if you thought hard, you could write some evil HTML to route around it and run your javascript. You'd just have to somehow get the big_key thing in your proposal.

The only real secure way is to isolate the untrusted bits into their own block.... like how you do multipart mime documents in email or something. You'd need a tag to reference the "external" untrusted bits and have the browser render them in a sandbox. Even in this case, you can exploit it by injecting your own stuff to trash where the boundary between the two documents is (like how you can exploit poorly implemented webforms that send email by injecting your own email headers :-).

I'm sure there are a wide range of technical reasons this would be hard to implement though. I'm already shooting holes in what I typed... even if you had the untrusted bits pulled down in a separate HTTP request there are problems like "it would be very slow" :-)
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 08:52 · Score: 1

It took three tries (and three fucking functions) for PHP/mysql to get their escape code right and I'm sure you can still inject SQL with "mysql_real_escape_string()" in some new unthought of way.

Escaping SQL isn't even close to the same problem. In that case, you virtually always want the user-submitted data to be treated as opaque data. The analogous situation with HTML would be escaping all the HTML and displaying it as raw code to the end user. The problem being talked about here is when you do actually want the HTML to be partially rendered. And really, PHP's approach to SQL has always been pathological.

How many ways can you make an angle bracket and have it interpreted as a legit browser tag? How many ways can you inject something to the end of a URL to close the double quote and inject your javascript? How many ways, including unicode, can you make a double quote? Dont forget, your implementation cannot strip out the Unicode like I've seen some filters do - I need the thing to handle every language! I would guess there are thousands of known ways to inject junk into your trusted HTML.

Remember, you are normalising this to a valid subset of HTML first. I think virtually all the attacks you describe only work when you are treating it as tag soup, with the exception of Netscape 4's well-known, long-fixed Unicode bugs. Can you give any examples of valid HTML that is misparsed by a maintained browser in an insecure way?

Don't kid yourself into thinking filtering user generated content is easy. It is very, *very* hard.

It's a hell of a lot simpler if you normalise to a valid subset of HTML.

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 08:59 · Score: 1

Your implication is basically that web-developers are more competent in terms of security than those who design the clients

Not at all. I expect the web developers in both cases to hand off the problem to third-party code. I just think that server-side code that has been maturing for a decade fills the role better than non-existent client-side code.

Also, you're not looking at this from the point of view of the user. I might want to tell my browser to trust John Smith not to put malicious stuff into his webpage, but that doesn't mean I trust him to be capable of finding all known and unknown exploits some of the third parties may have sent him. In this situation I would very much appreciate it if John Smith had marked the data on his webpage saying "this is from me" , "this is from somebody else".

If you don't trust John Smith's judgement, then that's not going to save you. For instance, JavaScript provided by him could be tricked by the untrusted content into doing what they want, and the attack would succeed because the JavaScript would be marked as coming from John Smith and therefore trusted.

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by Jeffrey+Baker · 2007-12-16 09:11 · Score: 1

It's not hard at all. Slurp it up into a DOM. At this point, it becomes an object and ceases to be a stupid string. You can then walk the tree removing nodes that are not allowed (example: in a forum post you can remove the script tag while ignoring bold and italics.)

I don't know why people are stupid about this. It's true that you probably can't do it with a regex. That's why $GOD gave us the DOM.
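A minimal sketch of the "slurp it into a DOM and walk the tree" approach, using Python's standard xml.etree.ElementTree. It assumes the input is already well-formed XHTML (malformed input raises a parse error instead of being guessed at), and the ALLOWED set and function names are made up for this example. A production filter would need a much more careful whitelist, including attribute handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist for a forum comment.
ALLOWED = {"div", "p", "b", "i", "em"}

def prune(elem):
    """Remove child elements that are not whitelisted, recursively."""
    prev = None
    for child in list(elem):
        if child.tag not in ALLOWED:
            # Keep the text that followed the removed node.
            if child.tail:
                if prev is None:
                    elem.text = (elem.text or "") + child.tail
                else:
                    prev.tail = (prev.tail or "") + child.tail
            elem.remove(child)
        else:
            child.attrib.clear()  # drop all attributes, e.g. onclick
            prune(child)
            prev = child

def sanitize(xhtml_fragment: str) -> str:
    # Wrap in a div so a fragment with several top-level nodes still parses.
    root = ET.fromstring(f"<div>{xhtml_fragment}</div>")
    prune(root)
    return ET.tostring(root, encoding="unicode")
```

Here a disallowed element is dropped together with its whole subtree, which matches "removing nodes that are not allowed"; escaping it instead would be a different design choice.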
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 09:17 · Score: 1

It's a hell of a lot simpler if you normalize to a valid subset of HTML. True.dat. But you gotta know how to normalize it down first. Not saying you are wrong, but why are there so many XSS issues if it is easy? Poor education? How do we educate good programmers to do the right thing? I mean that seriously... like is there a "here is how to let your users make their comment pretty and link to other websites and not get hosed" FAQ? I think I see your take though... it helps if you give the user a wysiwyg editor that spits you a known set of HTML. Anything outside that known set of HTML is evil. But maybe I'm still wrong. I'm a pretty smart guy, I think... at least open minded or something. I mean, at least I seem to know enough to worry about XSS issues but yet I dont find it easy at all. What am I missing here? I really don't want to get my users hosed :-) PS: I'm also a slightly special case because I extended HTML with a couple tags that are only useful to photographica users (<popup> and <slideshow>)... PPS: Slashdot doesn't even do HTML filtering "elegantly". How can I type in those two fake tags as a comment AND quote you without escaping the brackets myself? I dont think this is as easy of a problem to solve as you think it is :-)
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-16 09:19 · Score: 2, Insightful

bah! see? slashdot's filter system just fucked me over too and I swear I previewed to see if it kept all my paragraphs.

It ain't easy as you say bro... :-)
Re:Bet there still isn't a decent "Stop!" button by uzytkownik · 2007-12-16 09:26 · Score: 1

I would rather see to have different, hierarchical namespaces/'boxes'/jails.
I.e., none of the code from a given box can read/write anything outside its own namespace, and nested namespaces can also be forbidden (something like jailed chroots).
Sample code:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xr="..." xmlns:xi="http://www.w3.org/2001/XInclude">
<head>
<script type="text/javascript" xr:name="ad" src=".../ad.js" />
</head>
<body xml:lang="en">
<xi:include href=".../ad.xhtml" xr:name="ad" xr:forbidden="nested" />
</body>
</html>
Or:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xr="..." xmlns:xi="http://www.w3.org/2001/XInclude">
<head>
<script type="text/javascript" xr:name="ad" src=".../ad.js" />
</head>
<body xml:lang="en">
<xi:include href=".../ad.xhtml" xr:name="ad" xr:nested="false" />
</body>
</html>

--
I've probably left my head... somewhere. Please wait untill I find it.
Homepage: http://blog.piechotka.com.pl/
Re:Bet there still isn't a decent "Stop!" button by Bogtha · 2007-12-16 09:48 · Score: 2, Interesting

Not saying you are wrong, but why are there so many XSS issues if it is easy?

A combination of ignorance, apathy, and poor quality learning materials.

is there a "here is how to let your users make their comment pretty and link to other websites and not get hosed" FAQ?

Well the real answer to this is to point them to the sanitising features available for their particular platform/language/framework/etc. Generic advice is low-level by its very nature, for example XSS (Cross Site Scripting) Cheat Sheet or perhaps OWASP.

I'm a pretty smart guy, I think... at least open minded or something. I mean, at least I seem to know enough to worry about XSS issues but yet I dont find it easy at all. What am I missing here?

You're trying to do it yourself. Don't. Hand it off to a library.

Slashdot doesn't even do HTML filtering "elegantly". How can I type in those two fake tags as a comment AND quote you without escaping the brackets myself? I dont think this is as easy of a problem to solve as you think it is :-)

Slashdot is a mess all around, a lot of their problems are because their design strategy seems to be "accumulate features over time, never refactor, offer options instead of taking away obsolete features or being non-backwards-compatible". I mean, they have three different commenting systems, three different display systems and three different comment formats. That's hardly something to emulate. Having said that, they are probably one of the highest targets around for crapflooders, and they won that battle conclusively, which is clear evidence that it's not impossible to sanitise input. Slashcode is open-source, if there were a gap in its sanitation procedure, then Slashdot would quickly be overrun by trolls screwing up every page.

If you want to handle situations like this, then normalise the code, and escape every tag not on the whitelist. But the feature itself isn't really ideal because the user expectation of their comments being markup-but-not-in-some-cases is confusing.
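One way to picture "escape every tag not on the whitelist" is the following Python sketch built on the standard html.parser module. The WHITELIST contents and the names EscapingFilter and filter_html are assumptions for illustration; this is not how Slashcode does it.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; everything else is shown as literal text.
WHITELIST = {"b", "i", "em", "p"}

class EscapingFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in WHITELIST:
            self.out.append(f"<{tag}>")  # re-emit without any attributes
        else:
            self.out.append(escape(self.get_starttag_text()))

    def handle_endtag(self, tag):
        if tag in WHITELIST:
            self.out.append(f"</{tag}>")
        else:
            self.out.append(escape(f"</{tag}>"))

    def handle_data(self, data):
        self.out.append(escape(data))

def filter_html(text: str) -> str:
    f = EscapingFilter()
    f.feed(text)
    f.close()  # flush any buffered trailing text
    return "".join(f.out)
```

With this approach a made-up tag such as <popup> shows up on the page as visible text rather than being silently dropped, which is the behaviour the quoted complaint was asking for.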

--
Bogtha Bogtha Bogtha
Re:Bet there still isn't a decent "Stop!" button by naasking · 2007-12-16 10:40 · Score: 1

I don't get it. Safe html? All html is safe. Certain extensions to embed content are not safe, such as embed and script, but html itself is safe. So basically, you just want to be able to include third-party html, but only html, no other content. How is that hard to filter?

--
Higher Logics: where programming meets science.
Re:Bet there still isn't a decent "Stop!" button by Splab · 2007-12-16 11:05 · Score: 1

The worst part is PHP still doesn't have prepare/execute statements for binding values.

It is so much easier and you don't have to worry about php screwing up again.
Re:Bet there still isn't a decent "Stop!" button by curunir · 2007-12-16 11:49 · Score: 3, Insightful

Why go through all the hassle of "random hard to guess string" which, if implemented improperly could be guessed? Plus, as others have pointed out, HTML is not a dynamic language. Your random, hard to guess string could be observed and used by an attacker.

Wouldn't something like:

<sandbox src="restrictedContent.html" allow="html,css" deny="javascript,cookies"/>

...be a whole lot simpler? Just instruct the browser to make an additional request, but one in which it's expected to fully sandbox the content according to rules that you give it. This makes it much harder for application developers to screw up and a lot harder for malicious code to bypass the sandboxing mechanism.

--
"Don't blame me, I voted for Kodos!"
Re:Bet there still isn't a decent "Stop!" button by Ant+P. · 2007-12-16 13:08 · Score: 1

It'd be saner to just have a in the header for that sort of thing.

Even if you have something like that for HTML it doesn't fix tagsoup markup, which is a problem if you want to generate RSS feeds or something from the user input.
Re:Bet there still isn't a decent "Stop!" button by Ant+P. · 2007-12-16 13:14 · Score: 1

Whoops, Slashdot broke my post. First line should have had:
<meta name="noscript" value="space-separated dom tree IDs">
Re:Bet there still isn't a decent "Stop!" button by Chandon+Seldon · 2007-12-16 13:30 · Score: 1

How can you not do it with a regex (or two)? You just entity encode any tag or entity not on the whitelist.

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Bet there still isn't a decent "Stop!" button by chromatic · 2007-12-16 18:32 · Score: 2, Interesting

How can you not do it with a regex (or two)?

In the past eight years or so, I haven't seen a single regex which can parse HTML correctly and completely. The closest variant failed when it encountered CDATA sections.

--
how to invest, a novice's guide
Re:Bet there still isn't a decent "Stop!" button by Daniel+K.+Attling · 2007-12-16 19:36 · Score: 2, Interesting

Myspace is basically a hack job of static tables. The inclusion of css classes are a later graft upon the code to make it themeable (aka. blinding horror of red text on pink background). From what I've understood the reason why myspace has been so unwilling/unable to move forwards has been that doing so would break almost all themed user pages that depend upon said table structure.
Re:Bet there still isn't a decent "Stop!" button by vegiVamp · 2007-12-16 20:36 · Score: 1

I'm not sure about SGML, but XML in any case doesn't provide for attributes in closing tags.

--
What a depressingly stupid machine.
Re:Bet there still isn't a decent "Stop!" button by samjam · 2007-12-17 00:51 · Score: 1

It's only "hard" because many problems are defined by the same symptoms.

Some systems are engineered to render text in html, the text is badly quoted and so if the text is really html, the html shows.

Some systems are engineered to display a subset of html (to prevent "hostility") but they badly define their subset of html, so hostile html escapes.

The problem exists because somewhere between the customer and installer there was a distinct lack of definition and somebody "didn't care" and just combined "untrusted html" with "trusted html".

There are no tools that can solve these problems, the problems are defined by "untrusted output mixed with trusted output" but those who complain are the same ones who can't be bothered to define "trusted" "untrusted" and "hostile" in terms that any tool can ever process, and only define it in terms of user perceived harmful effects.

Sam

--
blog.sam.liddicott.com
Re:Bet there still isn't a decent "Stop!" button by Phleg · 2007-12-17 02:25 · Score: 1

XML doesn't allow for attributes on closing tags.

--
No comment.
Re:Bet there still isn't a decent "Stop!" button by fbjon · 2007-12-17 02:31 · Score: 1

In general, one should be wary of making regexes do things they're ill-suited for. Parsing complex languages seems quite obviously one of those things.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:Bet there still isn't a decent "Stop!" button by TheLink · 2007-12-17 03:02 · Score: 1

Sorry don't understand the tag-soup vs DOM tree bit.

People can say XYZ won't be a tag-soup, valid XYZ is well formed and all that, but in practice attackers will send you malformed stuff if that's what it takes to exploit things.

In the real world, browsers WILL encounter tag soup, believing otherwise is naive. My proposal is to help browsers skip the nasty bits in the soup, rather than completely relying on the servers to dish out soup that's safe to all supported browsers.

A suggested implementation is for the disabling bit to be at the parser level. For example after the parser hits a restrict tag, it stops recognizing javascript (and other nonallowed stuff), and so stops adding stuff as javascript to the DOM-tree. When it sees a valid restrictoff tag it starts recognizing stuff again that aren't disallowed by other prior (and thus overriding) restrict tags.
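The on/off logic described above can be modelled with a toy scanner. This Python sketch is only a model of the state machine, not a browser parser: it looks for the two hypothetical tags with a regular expression (which would not do for real HTML), and the names TAG and segments are invented here.

```python
import re

# Matches only the two hypothetical tags, written exactly as in the proposal.
TAG = re.compile(r'<(restricton|restrictoff)\s+lock="([^"]*)"[^>]*/>')

def segments(doc: str):
    """Split doc into (restricted, text) chunks according to the lock rules."""
    active = set()  # locks that have been switched on and not yet released
    out = []
    pos = 0
    for m in TAG.finditer(doc):
        text = doc[pos:m.start()]
        if text:
            out.append((bool(active), text))
        kind, lock = m.group(1), m.group(2)
        if kind == "restricton":
            active.add(lock)
        else:
            # A restrictoff with an unknown lock releases nothing.
            active.discard(lock)
        pos = m.end()
    tail = doc[pos:]
    if tail:
        out.append((bool(active), tail))
    return out
```

Content stays restricted while any lock is still open, so a restricton injected by an attacker can only add restrictions, never lift them.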

Note: if they break the parser and get control over your browser they don't even need to care about running javascript etc anymore :).

I really don't care what form the tag eventually takes, I would prefer to leave it to the HTML people. That's why more than 5 years ago I tried to bring it up to them and the browser people. Some said it should be <tag />, others said it should be something else, some have said I shouldn't be calling it a tag - technically it's not a tag, etc. I have resubmitted things to try to suit them. But almost all seemed to think that the filtering should only be done server side.

5 years have passed, and I still think I'm right that filtering shouldn't only be done server side :).

The HTML/browser people don't have to listen to me. Just means more money and jobs in the IT security line. Making money due to backwardness when things could be so much better. Oh well, it's still money eh?
Re:Bet there still isn't a decent "Stop!" button by Anonymous Coward · 2007-12-17 03:18 · Score: 0

Your filtering lib's slurping/parsing might be different from the browser's. In which case the browser could still get exploited.

If you have a way of telling the browser "enclosed stuff is suspect", the browser can be a bit more careful when slurping it - doesn't chuck the stuff "live and kicking" into its DOM.

Then whatever new stuff comes out, even if your lib isn't updated, the browser still knows "enclosed stuff is suspect" and is likely to still behave safely.
Re:Bet there still isn't a decent "Stop!" button by mcvos · 2007-12-17 03:22 · Score: 1

You are telling me all those CPAN modules handle the hundreds of ways you can inject HTML into the dozens of different browsers?

I know nothing about CPAN or what could possibly be the problem about HTML injection. (I mean, HTML comes from the server, right? So the server controls what it sends. At what point can anyone else inject anything you don't like?) However:

How many ways can you make an angle bracket and have it interpreted as a legit browser tag?

None. That is, sloppy html can contain all sorts of silly crap, but Tidied XHTML can not contain angle brackets except as part of a tag. > becomes &gt;, which is harmless.

How many ways can you inject something to the end of a URL to close the double quote and inject your javascript?

None, as long as you ensure that it is a real URL. Use URLencode on anything that's supposed to be a URL. Besides, does javascript outside the double quote do anything? In XHTML you can't even have any kind of content outside the double quotes but between the < and >.

How many ways, including unicode, can you make a double quote?

Who cares? Tidy, encode, and it's harmless.

Dont forget, your implementation cannot strip out the Unicode like I've seen some filters do - I need the thing to handle every language! I would guess there are thousands of known ways to inject junk into your trusted HTML.

Not if you handle it properly.

Don't kid yourself and thinking filtering user generated content is easy. It is very, *very* hard.

Then don't filter it, but simply turn it into proper, correct XHTML. That will destroy anything that tries to abuse stupid loopholes in crappy browsers.
Re:Bet there still isn't a decent "Stop!" button by nuzak · 2007-12-17 03:39 · Score: 1

Any user agent would have to treat your document as tag-soup instead of parsing a DOM-tree because that would be the only way to recognise the on and off states.

DOM trees are still ordered (otherwise flow layout wouldn't work), so this would work:

<secure state="on"/> blah blah blah <secure state="off"/>

Not that I'd use tags like that, or even think it's a good idea, and it would be an O(n) thing to search through sibling tags, but the DOM API is still quite well up to the task.

Personally I think the proper solution for script injection is to simply not do it, and control your output.

--
Done with slashdot, done with nerds, getting a life.
Re:Bet there still isn't a decent "Stop!" button by nuzak · 2007-12-17 03:45 · Score: 2, Interesting

> A suggested implementation is for the disabling bit to be at the parser level.

The natural tag for controlling the parsing would be a processing instruction.

<?secure on key:hkwh45kdfhgkjwh45?>
blah blah blah blah
<?secure off key:hkwh45kdfhgkjwh45?>

Good luck getting that into a standard, but heck, you don't really even need the cooperation of the W3C to do this.

--
Done with slashdot, done with nerds, getting a life.
Re:Bet there still isn't a decent "Stop!" button by togofspookware · 2007-12-17 04:03 · Score: 1

If you look closely at the code he proposed, you'll see that he wasn't suggesting that.

--
Duct tape, XML, democracy: Not doing the job? Use more.
Re:Bet there still isn't a decent "Stop!" button by vegiVamp · 2007-12-17 04:20 · Score: 1

Ah, I hadn't seen that, thx.

In that case, I still have a more, well... ideological problem with it: the system is using non-enclosing tags to enclose content. It effectively renders the entire idea of markup by tagged enclosure useless.

This will be a bugger to implement in any standard tree parser, because the opening and closing tags don't match and as such don't actually *mark* the content.

--
What a depressingly stupid machine.
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-17 06:24 · Score: 1

You dont understand the problem. HTML injections are from users like me posting busted HTML as a comment to slashdot. The comment injects evil bits of javascript into the output when the page gets displayed. Using XHTML and having the browser choke and die on the output is just another security loophole as far as i'm concerned. Being able to get the end browser to choke on XHTML errors is a DOS. Imagine how much trolls would like it if they could get firefox to not even display this page because their evil XHTML caused this page to no longer validate?
Re:Bet there still isn't a decent "Stop!" button by mcvos · 2007-12-17 06:43 · Score: 1

You dont understand the problem. HTML injections are from users like me posting busted HTML as a comment to slashdot.

You're right. I don't understand why this should be a problem. Tidy the tainted HTML, so you end up with clean, reliable XHTML.

The comment injects evil bits of javascript into the output when the page gets displayed.

So remove the bits of javascript. Script tags and event handlers are easy to find, and links can be cleaned up.

Using XHTML and having the browser choke and die on the output is just another security loophole as far as i'm concerned. Being able to get the end browser to choke on XHTML errors is a DOS. Imagine how much trolls would like it if they could get firefox to not even display this page because their evil XHTML caused this page to no longer validate?

Exactly. So clean it up! Do not trust user input. Use Tidy or JTidy. It's really not hard to find. Hell, there's even a web-based version! This is all extremely standard stuff. Use it!
Re:Bet there still isn't a decent "Stop!" button by coryking · 2007-12-17 07:39 · Score: 1

I'm not convinced.

I'm not an idiot and I do clean up user input - both on the way into the database and on the way out.

The problem is even these libraries will have exploits. It isn't as easy to parse html as some people make it out to seem. There are a lot of details to nail down right (angle brackets "http://www.google.com/ to become a clickable URL automatically when they type it in. You have to sanitize that URL somehow and make sure your URL code doesn't let evil crap like &quot; slip by and into a real quote, thus prematurely closing your final href attribute and "executing" the user's javascript inside your page.

This stuff is not as easy as some of you think. Try it. Write a secure way to automatically parse user input for URLs, handle line breaks, auto bold text, handle unicode AND generate safe, uninjectable HTML/XHTML on the way out to the browser.
Re:Bet there still isn't a decent "Stop!" button by mcvos · 2007-12-17 08:00 · Score: 1

The problem is even these libraries will have exploits.

Can you name a single exploit for Tidy?

It isn't as easy to parse html as some people make it out to seem.

Oh yes it is. It's rendering that's hard. For parsing it, there are tons of standard libraries. And using XHTML makes it especially easy.

There are a lot of details to nail down right (angle brackets "http://www.google.com/ to become a clickable URL automatically when they type it in. You have to sanitize that URL somehow and make sure your URL code doesn't let evil crap like &quot; slip by and into a real quote, thus prematurely closing your final href attribute and "executing" the user's javascript inside your page.

That's a simple matter of using something akin to URLencode. Any halfway decent webframework should know how to properly encode a URL. Something like
http://google.com/" onmouseover="execute_evil.js
should become:
http://google.com/%22+onmouseover%3D%22execute_evil.js
which is harmless. If this is not trivial in your webframework, you're using the wrong webframework.
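As a minimal Python sketch of the encoding step above: the comment names no particular library, so the standard library's `urllib.parse.quote` is an assumed stand-in for "URLencode". It encodes the space as `%20` rather than `+`, but the effect is the same: the double quote can no longer close the attribute.

```python
from urllib.parse import quote

# Hypothetical user-supplied "URL" that tries to break out of an href attribute.
user_url = 'http://google.com/" onmouseover="execute_evil.js'

# Keep ':' and '/' so the URL structure survives; quotes, spaces and '=' are encoded.
safe_url = quote(user_url, safe=":/")

print(safe_url)  # http://google.com/%22%20onmouseover%3D%22execute_evil.js
print('<a href="{}">link</a>'.format(safe_url))
```

Note that this only neutralises characters; it does not check the URL scheme, so a `javascript:` link would need a separate check.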
Re:Bet there still isn't a decent "Stop!" button by TheLink · 2007-12-17 08:42 · Score: 1

From your link you'll see that Ben Bucksch refers to my original proposal years earlier, which history shows is still relevant today.

The W3C and browser people still aren't interested.

I believe one of the browser people told me to implement it in a browser and then come back. Which is the equivalent of Ford Motor Corp telling me to make a car with brakes when I point out that their cars don't come with brakes built in - full of accelerator pedals, but no brakes. The W3C says stuff like: "When bad things happen all cars should throw a security exception", and that's about all the help you get from them - no requirement for brakes to be installed :).

Then when stuff happens, lots of people blame users for not driving safely or not maintaining their cars properly. They don't blame the car manufacturers and regulators for cars where to stop you need to make sure every single accelerator pedal is up, instead of just stepping on a brake pedal.
Re:Bet there still isn't a decent "Stop!" button by Ed+Avis · 2007-12-17 09:53 · Score: 1

The fact that the PHP developers are idiots and the MySQL quoting rules are complex does not mean it is impossible to safely quote HTML. There are three things you must do (in this order):

- change every & character to &amp;
- change every < character to &lt;
- change every > character to &gt;

That is guaranteed to give you safe text you can put inside a <p> or other element. If you want to paste user-generated input in other places, such as <div style="$user_text">, then you would have to worry about closing quotation marks and other crap. But this too is easy if you know what characters are legitimate:

$user_text =~ /^[A-Za-z0-9]{1,100}$/ or die 'bad characters in user text';

Straightforward and watertight - this also checks the length to avoid any possible buffer overrun in dodgy browsers. You could use one of Perl's Unicode character classes instead. There is no need to worry about 101 different ways to make a double quote - there are probably that many glyphs which _look_ like a " character in popular fonts, but only one of them is the " character. Write code with a whitelist of safe characters, rather than trying to catch 'bad' ones, and you'll be safe.
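The same two steps sketched in Python, as an illustration rather than a drop-in filter. The function names `escape_text` and `check_attribute` are made up for this example; the standard library's `html.escape(text, quote=False)` performs the same three replacements in the same order.

```python
import re


def escape_text(user_text: str) -> str:
    """Make text safe to place inside an element such as <p>."""
    # '&' must be replaced first, or the '&' in '&lt;' would be escaped again.
    return (user_text.replace("&", "&amp;")
                     .replace("<", "&lt;")
                     .replace(">", "&gt;"))


# Whitelist for text pasted into an attribute: 1 to 100 ASCII letters or digits.
ATTR_WHITELIST = re.compile(r"[A-Za-z0-9]{1,100}")


def check_attribute(user_text: str) -> str:
    """Reject anything outside the whitelist instead of trying to repair it."""
    if not ATTR_WHITELIST.fullmatch(user_text):
        raise ValueError("bad characters in user text")
    return user_text


print(escape_text("<b>Tom & Jerry</b>"))  # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```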

--
-- Ed Avis ed@membled.com
Re:Bet there still isn't a decent "Stop!" button by Chandon+Seldon · 2007-12-17 10:34 · Score: 1

Does the problem in this case actually require parsing HTML?
It seems to me that we don't care about the parse tree at all - we just want to be sure that the output of the filter doesn't contain any potentially dangerous tags.

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Bet there still isn't a decent "Stop!" button by chromatic · 2007-12-17 12:04 · Score: 1

It seems to me that we don't care about the parse tree at all - we just want to be sure that the output of the filter doesn't contain any potentially dangerous tags.

Recognizing potentially dangerous tags is more difficult than it seems, though. Our commenting system at work can't handle valid HTML in some cases, as it believes that newlines are not appropriate whitespace within tags. (The specification is clear that newlines are allowable whitespace within tags.) Furthermore, adding a CDATA or PCDATA section -- or in some cases, simple HTML comments -- can break almost every HTML-parsing regex I've ever seen.

There's no substitute for tokenizing and parsing with a stateful state machine.
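To make the newline case concrete, here is a small Python sketch. The naive regex is a made-up example of the kind of pattern described, not the actual filter from any real commenting system, and `AttrCollector` is a hypothetical name; the tokenizer is the standard library's `html.parser`.

```python
import re
from html.parser import HTMLParser

# Valid HTML: newlines are legal whitespace inside a tag.
comment = '<img\nsrc="x"\nonerror="alert(1)">'

# A naive single-line regex: '.' does not match '\n', so it finds nothing.
naive_hits = re.findall(r"<img.*?>", comment)


class AttrCollector(HTMLParser):
    """Record every (tag, attribute name) pair the tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            self.seen.append((tag, name))


collector = AttrCollector()
collector.feed(comment)
collector.close()

print(naive_hits)      # []
print(collector.seen)  # [('img', 'src'), ('img', 'onerror')]
```

The regex misses the tag entirely, while the tokenizer still reports the `onerror` handler that a filter would want to strip.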

--
how to invest, a novice's guide
Re:Bet there still isn't a decent "Stop!" button by Chandon+Seldon · 2007-12-17 12:21 · Score: 1

CDATA sections and comments seem blatantly unnecessary. Why does a website comment system want to be allowing HTML any more complicated than the list of allowed tags here on Slashdot?

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Bet there still isn't a decent "Stop!" button by chromatic · 2007-12-17 13:27 · Score: 1

Why does a website comment system want to be allowing HTML any more complicated than the list of allowed tags here on Slashdot?

I'm not suggesting that a comment system should allow them. A malicious user could submit invalid input containing comments or CDATA sections which would confuse every regex-based system I've seen. If the filter doesn't handle them properly, it could let malicious data through unmunged.

--
how to invest, a novice's guide
Re:Bet there still isn't a decent "Stop!" button by Chandon+Seldon · 2007-12-17 14:30 · Score: 1

A malicious user could submit invalid input containing comments or CDATA sections which would confuse every regex-based system I've seen.

Can you give an example? The filter that I'm visualizing simply changes every instance of "<", ">" or "&" to an entity reference unless the character is part of a whitelisted tag. Comments and CDATA sections both start with "<", so I'm not seeing how they would cause problems.
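One way to sketch the filter described here in Python, under stated assumptions: the whitelist is illustrative (not Slashdot's actual list), only attribute-free tags are allowed, and the approach is "escape everything, then restore exact whitelisted tags", which is a swapped-in way of getting the "unless part of a whitelisted tag" behaviour. `filter_comment` is a made-up name.

```python
import re
from html import escape

ALLOWED = ("em", "p", "b", "i")  # illustrative whitelist only

# After escaping, an allowed bare tag looks like "&lt;em&gt;" or "&lt;/em&gt;".
_ALLOWED_RE = re.compile(r"&lt;(/?)({})&gt;".format("|".join(ALLOWED)))


def filter_comment(text: str) -> str:
    escaped = escape(text, quote=False)         # encode every &, < and >
    return _ALLOWED_RE.sub(r"<\1\2>", escaped)  # restore attribute-free whitelisted tags


print(filter_comment("<em>hi</em> <script>x</script> & <!-- c -->"))
# <em>hi</em> &lt;script&gt;x&lt;/script&gt; &amp; &lt;!-- c --&gt;
```

Comments and CDATA openers are simply escaped along with everything else, so they end up displayed as text rather than interpreted.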

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Bet there still isn't a decent "Stop!" button by chromatic · 2007-12-17 14:47 · Score: 1

It's easy to write something in a CDATA section (and I think a comment, but I don't have time to test this now) that looks like a tag but isn't a tag. If you blindly encode the angle braces, you'll end up displaying them incorrectly -- which may be fine, but you may also encode away the ending tag of the CDATA section.

--
how to invest, a novice's guide
Re:Bet there still isn't a decent "Stop!" button by throup · 2007-12-17 19:32 · Score: 1

I know that browsers will encounter tag soup in reality, but I don't believe the standards should encourage it.

An alternative implementation would be to rely on a combination of server and client filtering, making use of a well-formed document. Imagine you start with a page template which is entirely well-formed (go with me here). Wherever we are going to insert content into this template, we can wrap the new content with:

<restricton lock="Random_hard_to_guess_string" except="java,safe-html">
Content goes here.
</restricton>

Now for the server-side filtering. Do what you like to filter the content, but make sure that it ends up as well-formed markup. This may involve using Tidy for instance. When you have done filtering, insert the content into the template as a child of the restriction element. As long as the content was well-formed when it was inserted, even if no other filtering had taken place, it will be safely wrapped in the restriction blanket.

Of course (as other people have said) this is only effective if the user-agent honours the restriction element. Until that can be relied upon (bacon wing, anyone?) then server-side filtering would still be required. As a just-in-case-safety-net however, I think there is potential in your idea but it may be difficult to agree on an implementation which would please everyone.

In any case, my comment was a concern with the proposed implementation and not a criticism of the overall idea. Good luck with your campaign.
Re:Bet there still isn't a decent "Stop!" button by 1110110001 · 2007-12-18 16:22 · Score: 1

PHP and mysql are two different things. Other DB extensions had parameterized queries for years.
Re:Bet there still isn't a decent "Stop!" button by 1110110001 · 2007-12-18 16:24 · Score: 1

The worst part is PHP still don't have prepare/execute statements for binding values.

Really? http://php.net/PDO-prepare
Re:Bet there still isn't a decent "Stop!" button by Splab · 2007-12-19 02:58 · Score: 1

Thanks for pointing that out, never come across that function before, about time PHP started doing things the right way. (Last time I programmed PHP was back in v. 5, so that is probably why I didn't see the addition, but I stand corrected none the less).
Re:Bet there still isn't a decent "Stop!" button by uhlume · 2007-12-19 16:46 · Score: 1

Oh yes it is. It's rendering that's hard.

That's an idiotic statement. Rendering is easy — there are easily dozens, if not hundreds or thousands, of graphic and typographic rendering libraries available to simplify the task of putting text and images on the screen, and they're largely interchangeable in their effect. The hard part is figuring out what should be rendered — which is only possible by parsing and interpreting the HTML. How this is done is the only significant area in which browser rendering engines (admittedly something of a misnomer) actually differ.

--
SIERRA TANGO FOXTROT UNIFORM

Re:i choose html v4.01 by Cyko_01 · 2007-12-16 06:04 · Score: 1

personally I like to use html 4.01 Transitional, professionally I use xhtml 1.0 strict whenever possible

Where are we now? by h4rm0ny · 2007-12-16 06:04 · Score: 1

That's a very good article - as always IBM give a well-written introduction to the subject. But exactly what is the state of implementation of these? As far as I can gather, no browser maker has started to implement support for either. Is that correct? It would be useful to have some idea of the time scales we can expect on these both. Anyone know more about the state of play?

--

Aide-toi, le Ciel t'aidera - Jeanne D'Arc.

Re:Where are we now? by larry+bagina · 2007-12-16 06:17 · Score: 1

safari has partial support for input type="range". They use it in their RSS reader.

--
Do you even lift?
These aren't the 'roids you're looking for.
Re:Where are we now? by gsnedders · 2007-12-16 06:23 · Score: 2, Informative

http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(HTML5) covers HTML 5. Nobody has started to try to implement XHTML 2 AFAIK (though definitely nobody major).

Web Applications? by Anonymous Coward · 2007-12-16 06:05 · Score: 0

All the talk about web applications these days, but a W3C-endorsed user interface markup language (like XUL/XAML) is nowhere to be seen.

A next-gen "HTML" should support common application widgets like panels, toolbars, menus, tabs etc etc. Without it, it's not worth the effort IMO.

Re:Web Applications? by gsnedders · 2007-12-16 06:21 · Score: 5, Informative

HTML 5 is aiming to support various things needed for web applications (in fact, the current draft is formed of two documents: Web Applications 1.0 and Web Forms 2.0). Also, see http://www.w3.org/2006/appformats/admin/charter.html.

Both garbage by Anonymous Coward · 2007-12-16 06:10 · Score: 1, Informative

I'm sticking with XHTML1.0 strict. Perhaps I'll use XHTML1.1 with appropriate DTD if I ever need to support the canvas element, other than that... none of this stuff is what I want from a markup language.

Re:Both garbage by BenoitRen · 2007-12-16 11:34 · Score: 1

Too bad IE doesn't support your prized XHTML 1.0 Strict, though. You're probably sending it as text/html too, which causes all browsers to interpret it as HTML, which means you're feeding them invalid markup.
Re:Both garbage by Anonymous Coward · 2007-12-16 15:46 · Score: 0

...Oh noes!!!

Browser vendors choice by gsnedders · 2007-12-16 06:18 · Score: 4, Insightful

All the browser vendors have already said they will support HTML 5 (yes, that includes MS) and all but MS have said they won't support XHTML 2 (MS hasn't made much of an effort to suggest they will support it either).

As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.

Re:Browser vendors choice by CastrTroy · 2007-12-16 07:26 · Score: 1

MS was part of the W3C and at one time said they would support CSS. We all know where that has gotten us.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Browser vendors choice by mattwarden · 2007-12-16 07:46 · Score: 1

Can you explain why identifying the markup version with a dtd would not allow them to support both?
Re:Browser vendors choice by DrYak · 2007-12-16 08:57 · Score: 2

and all but MS have said they won't support XHTML 2

Given their past effort in "supporting" previous standards, it's not hard for them to claim "XHTML2" support.
Just enable an additional DOCTYPE to be recognised, and throw the exact same broken "quirks-mode" parser as before.
Most of the new XHTMLv2 tags which differ from XHTMLv1's ones will fail to be recognized and displayed properly, but that won't be a big change to their traditional support of standards....
{/sarcasm}

More seriously :
As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.

Sorry ? WTF ?
Just read the DOCTYPE and, depending on whether it specifies XHTML2 or HTML5, switch to the corresponding parser.
The fact that both XHTML5 and XHTML 2 use similar but incompatible namespaces means that you can't mix BOTH in the SAME document in a meaningful way.
That doesn't stop you from using several different parsers depending on what dialect a document is using.

The only stuff that will be impossible to do is a "guess what dialect we're using as we go" type of auto-adaptive quirks mode.
But that shouldn't be necessary because since XHTML 1, all documents HAVE to be valid before being displayed.

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Re:Browser vendors choice by uzytkownik · 2007-12-16 09:11 · Score: 1

I guess that the problem starts when you want to embed the xhtml in another xml document (as you do with svg or mathml in xhtml). You have no doctype to check if it is xhtml2 or xhtml5.

--
I've probably left my head... somewhere. Please wait untill I find it.
Homepage: http://blog.piechotka.com.pl/
Re:Browser vendors choice by mysticgoat · 2007-12-16 09:12 · Score: 1

As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.

Please clarify, because I don't understand this.
Since XHTML will continue to require a specific declaration and doctype, similar to
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
will this not be enough for the client (browser) to distinguish any version of XHTML from anything else? Isn't that sufficient?
Re:Browser vendors choice by gsnedders · 2007-12-16 09:48 · Score: 1

Given their past effort in "supporting" previous standards, it's not hard for them to claim "XHTML2" support.
Just enable an additional DOCTYPE to be recognised, and throw the exact same broken "quirks-mode" parser as before.
Most of the new XHTMLv2 tags which differ from XHTMLv1's ones will fail to be recognized and displayed properly, but that won't be a big change to their traditional support of standards....
{/sarcasm}

I didn't mean to imply that MS had said they will support XHTML 2 -- they haven't made any comment regarding it whatsoever. Also, before you make further comments about their support of standards, bear in mind that before they stopped further development of IE there was a time when IE was known for its standards support, and anyone who knows anyone on the IE team knows they're doing their damnedest to get it back that way (though, inevitably, touching code that hasn't been touched in years is no easy feat).

Sorry ? WTF ?
Just read the DOCTYPE and, depending on whether it specifies XHTML2 or HTML5, switch to the corresponding parser.
The fact that both XHTML5 and XHTML 2 use similar but incompatible namespaces means that you can't mix BOTH in the SAME document in a meaningful way.
That doesn't stop you from using several different parsers depending on what dialect a document is using.

DTDs are dead and DOCTYPEs are not relevant except for some obscure things that are not related to this discussion. XHTML 5 doesn't even have one. They were also never intended for any kind of versioning. Also, a DOCTYPE is only required in a strictly-conforming XHTML 2 document, and the UA requirements for XHTML 2 documents cover those that aren't strictly-conforming (and therefore ones that don't have a DOCTYPE). If you don't have a DOCTYPE (which is also the case in fragments such as within Atom documents), you have nothing to branch on and nothing to distinguish the two.

But that shouldn't be necessary because since XHTML 1, all documents HAVE to be valid before being displayed.

Please look up what validity actually means: the only difference between validity and well-formedness is that a valid document meets requirements set out in a DTD as well as being well-formed (which means there is no difference if no DOCTYPE is specified, and a DOCTYPE is purely optional in XML). Also, there are both validating and non-validating XML parsers (section 5.1, XML 1.0) -- only the former actually check that a document is valid, and all major XHTML processors use non-validating XML parsers. An XML document (therefore including XHTML documents) only needs to be well-formed to be displayed in most cases.
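The distinction can be seen with any non-validating parser. As a small sketch, Python's `xml.etree.ElementTree` (built on expat, which does not validate against a DTD) accepts a document with no DOCTYPE at all and rejects only one that is not well-formed. The sample documents and the `is_well_formed` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# No DOCTYPE anywhere: there is nothing to validate against.
well_formed = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>ok</p></body></html>'
# The <p> element is never closed, so this is not well-formed XML.
not_well_formed = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>oops</body></html>'


def is_well_formed(doc: str) -> bool:
    """Return True if a non-validating XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(well_formed))      # True
print(is_well_formed(not_well_formed))  # False
```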
Re:Browser vendors choice by porneL · 2007-12-16 10:18 · Score: 1

Browsers are non-validating parsers and don't read DTD. XHTML must also work inside other XML languages that may require their own DTD.

In XML, by design, this distinction must be made by the namespace, and this is reflected in design of DOM, CSS, XPath, etc. If you change design fundamentals, inevitably you'll end up with lots of nasty surprises and ugly workarounds.
Re:Browser vendors choice by GrassIsRed · 2007-12-16 11:16 · Score: 1

Microsoft actually made the first webbrowser with decent CSS support (IE3). Back then Netscape with their huge market share had no intent to support CSS. Microsoft delivers when they have to. Too bad they can afford to lean back now with their desktop monopoly.
Re:Browser vendors choice by Bogtha · 2007-12-16 11:44 · Score: 1

Microsoft actually made the first webbrowser with decent CSS support (IE3).

No, the first browser with decent CSS support was Internet Explorer 5 on the Mac, which used a completely different rendering engine to Internet Explorer for Windows. Internet Explorer 3 was the first major browser with any CSS support, but it was terrible; for instance, 1em was treated as 1px. Even Netscape 4 had better support for CSS than Internet Explorer 3.

Back then Netscape with their huge market share had no intent to support CSS.

To be fair, that was because they were betting on JSSS instead, which they submitted to the W3C, who chose CSS instead. So it's not like they eschewed a standard stylesheet language, it's just the W3C preferred CSS instead so they had to scramble to catch up with Netscape 4 by transcoding CSS into JSSS.

--
Bogtha Bogtha Bogtha
Re:Browser vendors choice by Bogtha · 2007-12-16 11:46 · Score: 1

 <?xml version="1.0" encoding="UTF-8"?>

Nitpick: You can omit the XML prolog when you are using XML 1.0 and either the UTF-8 or UTF-16 character encoding.

--
Bogtha Bogtha Bogtha
Re:Browser vendors choice by mattwarden · 2007-12-16 15:21 · Score: 1

Sorry, I meant doctype declaration.
Re:Browser vendors choice by zsau · 2007-12-16 15:25 · Score: 1

It's HTML 5, not XHTML 5. HTML is SGML and can't be mixed with XML. So there's no problem there...

--
Look out!
Re:Browser vendors choice by uzytkownik · 2007-12-16 20:00 · Score: 1

To be honest I heard that they want to introduce XHTML 5 along with HTML 5 with exactly the same rules but xml formatting. I may be wrong.

--
I've probably left my head... somewhere. Please wait untill I find it.
Homepage: http://blog.piechotka.com.pl/
Re:Browser vendors choice by Anonymous Coward · 2007-12-16 21:44 · Score: 1, Informative

As it stands, with both XHTML 5 and XHTML 2 using the same namespace, it is only possible to support one of the two.

Not true. They have different namespaces, so a processor can know exactly which language it is dealing with.

From the HTMLv5 working draft (section 1.1.2 "Relationship to XHTML2"):

"XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor."

When using HTMLv5-as-XML, the namespace is "http://www.w3.org/1999/xhtml".
The XHTMLv2 namespace is "http://www.w3.org/2002/06/xhtml2/".

Heck, you could even mix (X)HTMLv5 and XHTMLv2 in the same document, though it might be a bit perverse. I can't think of a good use-case off the top of my head.
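A short Python sketch of branching on the namespace, assuming the two URIs quoted above from the working drafts. The `dialect` function and its return labels are made up for illustration. If both languages end up sharing one namespace, as a later reply says the editor's draft does, this check can no longer tell them apart.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"        # HTML 5 as XML, per the draft quoted above
XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"  # XHTML 2, per the draft quoted above


def dialect(doc: str) -> str:
    root = ET.fromstring(doc)
    # ElementTree spells a namespaced tag as "{namespace-uri}localname".
    if root.tag.startswith("{" + XHTML2_NS + "}"):
        return "xhtml2"
    if root.tag.startswith("{" + XHTML_NS + "}"):
        return "html5-as-xml"
    return "unknown"


print(dialect('<html xmlns="http://www.w3.org/2002/06/xhtml2/"/>'))  # xhtml2
print(dialect('<html xmlns="http://www.w3.org/1999/xhtml"/>'))       # html5-as-xml
```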
Re:Browser vendors choice by gsnedders · 2007-12-17 02:58 · Score: 1

The latest public WD of XHTML2 predates the WG being re-chartered: the current decision is to use the same namespace, as is reflected by the ED: .
Re:Browser vendors choice by gsnedders · 2007-12-17 03:00 · Score: 1

http://www.w3.org/MarkUp/2007/ED-xhtml2-20071024/conformance.html#strict, even.
Re:Browser vendors choice by mcvos · 2007-12-17 03:34 · Score: 1

DTDs are dead and DOCTYPEs are not relevant except for some obscure things that are not related to this discussion. XHTML 5 doesn't even have one.

Probably because XHTML 5 doesn't even exist. XHTML is working on version 2 at the moment. Who knows, they might one day get to version 5, but at the current speed, that's going to be a while. HTML 5, on the other hand, most definitely does have a DOCTYPE. Or should have, if it expects to be recognised by browsers. DOCTYPE is vital for validation and proper parsing.
Re:Browser vendors choice by Anonymous Coward · 2007-12-17 08:13 · Score: 0

I stand corrected. Thank you.

Hmmm. So, on the basis of namespaces, there's no way to tell (X)HTMLv5 apart from XHTMLv2. The XHTMLv2 doctype declaration is still there though, meaning that we can identify a strictly conforming XHTMLv2 document.
Re:Browser vendors choice by gsnedders · 2007-12-17 08:42 · Score: 1

XHTML 5 most certainly does exist: it's the XML serialisation of HTML 5 (which is defined in terms of a DOM, and provides two serialisations: one that uses an XML parser and one that uses an HTML parser).

HTML 5 (in both serialisations) most certainly doesn't have a DOCTYPE in the DTD sense. A quick look at the spec would confirm that. The closest it gets is <!DOCTYPE html>, which exists only in the text/html serialisation and is there purely to switch browsers into standards mode.

As for validation, all you need is a formal definition of a content model: it is absolutely irrelevant how that is specified, be it any schema (including DTDs) or Klingon (or in HTML 5's case, English).

DOCTYPEs have never been used for parsing whatsoever: in XML the content-models are all specified regardless of the presence of a DOCTYPE, and in text/html a DOCTYPE has only ever been used to switch between standards/quirks mode (it's not even read whatsoever).

Re:I bet my ass.. by coryking · 2007-12-16 06:20 · Score: 3, Insightful

I urge every web developer to stop treating MSIE as a special case, since it does not follow standards

No offense to you, but I love how every single person who smugly suggests this usually has a link to a website that looks like shit when viewed on any browser.

This also seems to be the case whenever somebody bitches about web designers changing fonts, using javascript, or doing something to make their page look nice. You visit the websites created by the "changing the font at all, even in the stylesheet, is evil" or the classic "why are you trying to use two columns? two columns are evil" religious zealots and all their pages look really dull and boring. Long streams of times new roman. I guess this is our future, eh?

Why not ditch HTML? by forgoil · 2007-12-16 06:21 · Score: 3, Interesting

Why not just go with XHTML all the way? I always thought that the best way of "fixing" all the broken and horribly written HTML out there on the web would be to build a proxy that could translate from broken HTML to nicely formed XHTML and then send that to the browser, cleaning up this whole double rendering path in the browsers (unless I misunderstood something) etc. XHTML really could be enough for everyone, and having two standards instead of one certainly isn't working in anyone's favor.

Re:Why not ditch HTML? by Anonymous Coward · 2007-12-16 06:46 · Score: 1, Informative

The position taken is that XML is too hard and parse errors are scary or something. With XML, you need to make sure that your document contents are correctly encoded, use the DOM instead of document.write() and should be serving it with the correct MIME type (when appropriate). For some reason, it's deemed more important that cowboy web designers are permitted to continue ripping off their clients with inaccessible and invalid markup rather than advancing the web for the benefit of competent developers.

XHTML 1.0 strict is where it's at and there'll be no change with the release of HTML5 or XHTML 2.0 recommendations.
Re:Why not ditch HTML? by GrouchoMarx · 2007-12-16 06:57 · Score: 5, Interesting

As a professional web developer and standards nazi, I'd agree with you if it weren't for one thing: User-supplied content.

For content generated by the site author or a CMS, I would agree. Sending out code that is not XHTML compliant is unprofessional. Even if you don't want to make the additional coding changes to your site to make it true XHTML rather than XHTML-as-HTML, all of the XHTML strictness rules make your code better, where "better" means easier to maintain, faster, less prone to browser "interpretation", etc. Even just for your own sake you should be writing XHTML-as-HTML at the very least. (True XHTML requires changes to the MIME type and to the way you reference stylesheets, and breaks some Javascript code like document.write(), which are properly left in the dust bin along with the font tag.)

But then along comes Web 2.0 and user-supplied content and all that jazz. If you allow someone to post a comment on a forum, like, say, Slashdot, and allow any HTML code whatsoever, you are guaranteed to have parse errors. Someone, somewhere, is going to (maliciously or not) forget a closing tag, make a typo, forget a quotation mark, overlap a b and an i tag, nest something improperly, forget a / in a self-closing tag like hr or br, etc. According to strict XHTML parsing rules, that is, XML parsing rules, the browser is then supposed to gag and refuse to show the page at all. I don't think Slashdot breaking every time an AC forgets to close his i tag is a good thing. :-)

While one could write a tidy program (and people have) that tries to clean up badly formatted code, they are no more perfect than the "guess what you mean" algorithms in the browser itself. It just moves the "guess what the user means" algorithm to the server instead of the browser. That's not much of an improvement.

Until we can get away with checking user-submitted content on submission and rejecting it then, and telling the user "No, you can't post on Slashdot or on the Dell forum unless you validate your code", browsers will still have to have logic to handle user-supplied vomit. (And user, in this case, includes a non-programmer site admin.)

The only alternative I see is nesting "don't expect this to be valid" tags in a page, so the browser knows that the page should validate except for the contents of some specific div. I cannot imagine that making the browser engine any cleaner, though, and would probably make it even nastier. Unless you just used iframes for that, but that has a whole host of other problems such as uneven browser support, inability to size dynamically, a second round-trip to the server, forcing the server/CMS to generate two partial pages according to god knows what logic...

As long as non-programmers are able to write markup, some level of malformed-markup acceptance is necessary. Nowhere near the vomit that IE encourages, to be sure, but "validate or die" just won't cut it for most sites.

--
--GrouchoMarx
Card-carrying member of the EFF, FSF, and ACLU. Are you?
Re:Why not ditch HTML? by falconwolf · 2007-12-16 07:07 · Score: 1

I don't think Slashdot breaking every time an AC forgets to close his i tag is a good thing. :-)

That's one reason I always try to preview before I post, no actually I preview so I can edit before posting. However I still let some mistakes slip by.

While one could write a tidy program (and people have) that tries to clean up badly formatted code, they are no more perfect than the "guess what you mean" algorithms in the browser itself. It just moves the "guess what the user means" algorithm to the server instead of the browser. That's not much of an improvement.

Though using one adds work, a validator helps here.
Falcon

--
Should there be a Law?
Re:Why not ditch HTML? by hey! · 2007-12-16 08:03 · Score: 3, Interesting

Well, according to TFA, because XHTML, while terrific for certain kinds of applications, doesn't solve the most pressing problems of most of the people working in HTML today. It can do, of course, in the same way any Turing equivalent language is "enough" for any programmer, but that's not the same thing as being handy.

At first blush, the aims of XHTML 2.0 and HTML 5 ought to be orthogonal. Judging from the article, I'd suspect it is not the aims that are incompatible, but the kinds of people who are behind each effort. You either think that engineering things in the most elegant way will get things off your plate more quickly (sooner or later), or you think that concentrating on the things that are on your plate will lead you to the best engineered solution (eventually).

I'm guessing that the XHTML people might look at the things the HTML 5 folks want to do and figure that they don't really belong in HTML, but possibly in a new, different standard that could be bolted into XHTML using XML mechanics like name spaces and attributes. Maybe the result would look a lot like CSS, which has for the most part proven to be a success. Since this is obviously the most modular, generic and extensible way of getting the stuff the HTML 5 people worry about done, this looks like the perfect solution to somebody who likes XHTML.

However, it would be clear to the HTML 5 people that saying this is the best way to do it doesn't mean anything will ever get done. It takes these things out of an established standard that is universally recognized as critical to support (HTML) and puts them in a newer weaker standard that nobody would feel any pressure to adopt anytime soon. A single vendor with sufficient clout (we name no names) could kill the whole thing by dragging its feet. Everybody would be obliged to continue doing things the old, non-standard way and optionally provide the new, standardized way for no benefit at all. Even if this stuff ideally belongs in a different standard, it might not ever get standardized unless it's in HTML first.

Personally, I think it'd be nice to have both sets of viewpoints on a single road map, instead of in two competing standards. But I'm not holding my breath.

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Re:Why not ditch HTML? by jonbryce · 2007-12-16 08:15 · Score: 2, Insightful

I think the problem is that (x)html is trying to be two very different things. It is trying to be a universal document format for presenting information. It is also trying to be a universal presentation manager for thin client applications. The technical requirements for these are very different, and it may well be that two different standards are appropriate.
Re:Why not ditch HTML? by irc.goatse.cx+troll · 2007-12-16 08:17 · Score: 1

You can only go so far to verify the content you serve is valid XML. The complexity of doing full DOM tree walking on every comment, embedded ad, RSS news feed, etc that will all be served into your page is just not worth it.

--
Pain lasts, kid. Its how you know you're alive. Sometimes I think this growing up thing is just pain management-TheMaxx
Re:Why not ditch HTML? by Mystra_x64 · 2007-12-16 09:32 · Score: 1

You just need to check validity server-side based on some schema (client-side would be good too, if possible). Anything not valid gets cut with htmlspecialchars() (if php) or something similar. Preview helps here, I guess.

--
Quick way to get 30% Funny 70% Troll: defend Opera browser on /.
Re:Why not ditch HTML? by Antique+Geekmeister · 2007-12-16 10:16 · Score: 1

Why don't you make a parser that will translate all the Word document formats into something sane while you're at it? Quite seriously, there's a lot of web material that violates every sane standard, setting up its own standards that it also violates at whim. It's more work than anyone sane, or any reasonably sane group, can hope to successfully tackle, especially since so much of the debris is aimed at features that no user actually wants but the web publisher seeks to enforce (such as enforced ad content and clickthrough tracking).

I appreciate your thought, but I'm afraid it won't work in the field.
Re:Why not ditch HTML? by Antique+Geekmeister · 2007-12-16 10:20 · Score: 1

You know, it might be useful for Slashdot to simply turn off the HTML capability and use Wiki style entries. The amount of bad HTML here is scary, perhaps a "run weblint on this" sanitizer would be better?
Re:Why not ditch HTML? by falconwolf · 2007-12-16 10:42 · Score: 1

You know, it might be useful for Slashdot to simply turn off the HTML capability and use Wiki style entries. The amount of bad HTML here is scary, perhaps a "run weblint on this" sanitizer would be better?

Actually I'd prefer to see more html. For instance being able to use a table is sometimes helpful. Or maybe add graphics. I could see too many people abusing graphics though and overloading storage. Maybe require a validator, say either the W3C's online validator or software validators like CSE's HTML Validator.
Falcon

--
Should there be a Law?
Re:Why not ditch HTML? by Antique+Geekmeister · 2007-12-16 10:48 · Score: 1

The W3C's validator is good, if awkward to install in some systems. I'd be happy with that. I'd be happier if the default for posting an article was flat text, though.
Re:Why not ditch HTML? by BotnetZombie · 2007-12-16 11:28 · Score: 1

While I agree with much of what you're saying, I still don't see any reason why a site like Slashdot couldn't validate xml/dtd correctness of the user input, and respond with a meaningful error description when that's the case. If the response says that the tag opened before some text is never closed, it wouldn't be that hard for your regular submitter to correct the error and resubmit.
Re:Why not ditch HTML? by Jesus_666 · 2007-12-16 11:30 · Score: 1

There's still the little issue where MSIE's support for XHTML goes just far enough to make it equivalent to HTML 4.01. Imagine if people wrote XHTML 2 like they write HTML 4 today. MSIE wouldn't choke, it would just run everything through the tag soup parser and be happy. Many users wouldn't make sure their code validates or even passes as well-formed, because IE doesn't complain. All other browsers would also use the tag soup parser on XHTML in order to cope with the new, broken web.

Let's face it, unless all major players agree to play by its rules, XHTML just isn't a viable HTML replacement. And Microsoft doesn't give a shit about XHTML. So either IE's market share is marginalized or Microsoft suddenly develops respect for the XHTML standard or XHTML just won't take off.

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Re:Why not ditch HTML? by BenoitRen · 2007-12-16 11:42 · Score: 1

No, XHTML will not fix the web. I wish people would get it through their heads once and for all. I'm sick of hearing this nonsense.

Well-formed XHTML is not necessarily valid! I can use, say, a soup element and get away with it. Being well-formed means that it follows the XML syntax rules.
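
The well-formed versus valid distinction can be sketched in a few lines of Python. This is only an illustration: KNOWN_ELEMENTS is a hypothetical, deliberately tiny stand-in for a real DTD or schema, and real validity also covers content models and attributes, which this ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, tiny stand-in for a real XHTML DTD/schema.
KNOWN_ELEMENTS = {"html", "head", "title", "body", "p", "em", "strong"}


def is_well_formed(markup):
    """True if the markup obeys XML syntax rules (tags closed and nested)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def unknown_elements(markup):
    """Element names not in KNOWN_ELEMENTS; assumes well-formed input."""
    root = ET.fromstring(markup)
    return sorted({el.tag for el in root.iter() if el.tag not in KNOWN_ELEMENTS})


doc = "<html><body><soup>minestrone</soup></body></html>"
print(is_well_formed(doc))    # the XML parser accepts the soup element
print(unknown_elements(doc))  # a vocabulary check does not
```

An XML parser only enforces the first check; the second needs a separate validation step.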

Valid XHTML is not necessarily semantically rich! I can use the font element under the transitional DTD, use a host of elements in the wrong way, use the i element, and so on.

Not to mention that XHTML2 is a steaming pile of crap that was thought up by people at the W3C who are so out of touch with what web developers want. A href attribute so all elements can be links? Huh?
Re:Why not ditch HTML? by Bogtha · 2007-12-16 12:04 · Score: 1

I haven't looked at it for quite a while, but at least for many years, the "CSE HTML Validator" wasn't actually a validator, but rather a linter that told you whether your document conformed to what the creator's personal opinions about good HTML were. Plenty of people pointed this out to him, so he stuck a little disclaimer in the FAQ and then complained when people kept pointing out that it wasn't a validator. Consider this Usenet thread for example, in which the creator attempts to defend the name, or this one where he defends himself against accusations of conning people.

Maybe he's finally incorporated a real validator these days, but given his attitude, I doubt it.

--
Bogtha Bogtha Bogtha
Re:Why not ditch HTML? by GrouchoMarx · 2007-12-16 12:29 · Score: 1

If the response says that the tag opened before some text is never closed, it wouldn't be that hard for your regular submitter to correct the error and resubmit.

Regular Slashdot submitter? Maybe. Regular LiveJournal or MySpace submitter? Not a chance.

--
--GrouchoMarx
Card-carrying member of the EFF, FSF, and ACLU. Are you?
Re:Why not ditch HTML? by Anonymous Coward · 2007-12-16 12:32 · Score: 1, Interesting

There are at least two really good options here:

a. Don't make us write HTML. If you want to let me write rich text, give me a rich text editor. I can buy stuff on the web (which consists of database queries/updates) without typing in SQL, so why can't I write stuff on the web without typing in HTML? (Cue the army of geeks to say "But that's completely different! It's OK to make one hard on the user, but not the other...")

a'. Use a different (non-HTML) text-based language, like Markdown or Restructured Text. I don't know if these can generate XHTML, but if not it wouldn't be that hard to add. A ton of blogs use processors like this already.

b. Have slashdot validate it. Obviously there are plenty of HTML parsers out there that can take tag soup, and make it into a DOM, and from there it's trivial to filter tags you don't want users to use, and output valid XHTML. Slashdot (and every other website) must do something like this already, so what's so hard about fixing it to generate XHTML? This is really no different from a CMS, except instead of getting data from an RDBMS record, you're getting it from a DOM tree.

I really don't see how your scare scenario could be a problem, or rather, any more of a problem in XHTML than HTML. No website in the world "allows someone to post a comment on a forum, like, say, Slashdot, and allow any HTML code whatsoever". This problem was solved already. All we need to do is tighten up the output a bit.
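
Option b can be sketched with Python's standard-library html.parser, which tolerates tag soup. This is a minimal sketch, not how Slashdot actually does it: the ALLOWED whitelist is hypothetical, all attributes are dropped for simplicity, and disallowed tags are removed while their text is kept.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED = {"b", "i", "em", "strong", "p", "br"}  # hypothetical whitelist
VOID = {"br"}                                     # elements with no content


class SoupToXhtml(HTMLParser):
    """Re-emit tag soup as a well-formed XHTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # stack of currently open allowed elements

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text still reaches handle_data
        if tag in VOID:
            self.out.append("<%s />" % tag)
        else:
            self.out.append("<%s>" % tag)
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or disallowed closing tag: ignore it
        # Close everything opened after `tag`, then `tag` itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(soup):
    parser = SoupToXhtml()
    parser.feed(soup)
    parser.close()
    while parser.open_tags:  # close whatever the user left open
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)


print(sanitize('<b>bold <i>both</b> <font color="red">red</font> & done'))
```

Overlapping tags are resolved by closing the inner element early, which is one guess among several a repair tool could make.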
Re:Why not ditch HTML? by gbjbaanb · 2007-12-16 12:39 · Score: 1

If you allow someone to post a comment on a forum, like, say, Slashdot, and allow any HTML code whatsoever, you are guaranteed to have parse errors

like that never happens in the current scheme of things today? :-)
Re:Why not ditch HTML? by Anonymous Coward · 2007-12-16 14:39 · Score: 0

The only alternative I see is nesting "don't expect this to be valid" tags in a page, so the browser knows that the page should validate except for the contents of some specific div.

and what happens when the user-supplied content closes your div for you?
Re:Why not ditch HTML? by l0b0 · 2007-12-16 19:44 · Score: 2, Interesting

You can include HTML inside XHTML, by changing the namespace for that content in the container element or using includes. The browser should then parse the contents as HTML, and you can get the best of both standards.

Another option is to make sure comments cannot be submitted until they contain valid XHTML. You could use a WYSIWYG editor, fall back to /. mode when JavaScript is disabled, and help the user along by auto-correcting (when using the WYSIWYG editor) or hinting (e.g., "You need to end the strong tag by adding '</strong>'") when validation fails.
Re:Why not ditch HTML? by random0xff · 2007-12-16 21:16 · Score: 1

And then the browser must sanitize before sending.
Re:Why not ditch HTML? by zarlino · 2007-12-16 21:26 · Score: 1

User supplied content can be easily fixed. First, the user-entered markup is NOT the website markup. It can be BBcode or it can be a subset of HTML. In any case it has to be validated, fixed and converted before being output.

People not doing this cannot be called web developers.

--
Check out my cross-platform apps
Re:Why not ditch HTML? by Anonymous Coward · 2007-12-17 00:14 · Score: 0

There's an easy solution to that problem.

1) Check user-generated content for well-formedness. (You're going to parse it to ensure nothing illicit is going on in there, anyway.)
2) If the content is well-formed, output it (after the regular additional checks).
3) If it is not well-formed, output it *as text* (i.e., with angle brackets converted to entities so they'll show up verbatim and all that) (again, after the regular additional checks).

Livejournal does that, for example - it'll happily allow you to use HTML in your posts, but if it's not well-formed, it'll just barf it out verbatim (that is, escaped).

The author will then notice (hopefully) and edit what he wrote. No big deal.
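
A minimal Python sketch of steps 1 to 3, assuming the comment is wrapped in a container element before parsing. The function name render_comment is made up for illustration, and the "regular additional checks" (tag whitelisting and so on) are left out. Note that an XML parser also rejects HTML-only entities such as &nbsp;, so those would trigger the escaped fallback too.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_comment(markup):
    """Return markup unchanged if well-formed, otherwise escaped as text."""
    try:
        # Wrapping in a container gives the fragment a single root element,
        # and also catches comments that try to close the container early.
        ET.fromstring("<div>%s</div>" % markup)
    except ET.ParseError:
        return escape(markup, quote=False)
    return markup


print(render_comment("<i>fine</i>"))
print(render_comment("<i>oops"))
```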
Re:Why not ditch HTML? by Anonymous Coward · 2007-12-17 00:21 · Score: 0

I prepend/append and nest missing elements so that it's valid; the user can then fix their mistake during preview. The closing elements are in the textarea; if the user can't figure out how to move them -- they're too stupid to be using anything but plain text in web forms.

While my code handles the simple cases, there's a heuristic to throw an error on garbage like this:
<p><em>Hello <strong>world</p></strong><span><div>!</span></em>
For quick validation before calling into more complex code, wrap the submission in a containing element and see if XML parser throws an error.
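
One way such a repair-or-reject heuristic could look, sketched in Python. This is a guess at the general approach, not the poster's actual code: the regex tokenizer is a simplification (it ignores comments, CDATA and '>' inside attribute values), and the names repair and MisnestedMarkup are invented here.

```python
import re

# Matches <tag ...>, </tag> and <tag ... />; a simplification, see above.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")


class MisnestedMarkup(ValueError):
    """Raised when tags overlap and no safe automatic fix is attempted."""


def repair(markup):
    """Append missing closing tags and prepend missing opening tags."""
    stack = []         # elements opened but not yet closed
    missing_open = []  # closing tags seen at top level with nothing open
    for match in TAG.finditer(markup):
        closing, name, self_closing = match.group(1), match.group(2).lower(), match.group(3)
        if self_closing:
            continue
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        elif not stack:
            missing_open.append(name)
        else:
            # Closing tag does not match the innermost open element.
            raise MisnestedMarkup("unexpected </%s> inside <%s>" % (name, stack[-1]))
    prefix = "".join("<%s>" % n for n in reversed(missing_open))
    suffix = "".join("</%s>" % n for n in reversed(stack))
    return prefix + markup + suffix


print(repair("<p><em>Hello"))
print(repair("world</strong>"))
```

The garbage example above raises MisnestedMarkup at the first </p>, since the innermost open element at that point is strong.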
Re:Why not ditch HTML? by mcvos · 2007-12-17 03:47 · Score: 2, Insightful

But then along comes Web 2.0 and user-supplied content and all that jazz. If you allow someone to post a comment on a forum, like, say, Slashdot, and allow any HTML code whatsoever, you are guaranteed to have parse errors. Someone, somewhere, is going to (maliciously or not) forget a closing tag, make a typo, forget a quotation mark, overlap a b and an i tag, nest something improperly, forget a / in a self-closing tag like hr or br, etc. According to strict XHTML parsing rules, that is, XML parsing rules, the browser is then supposed to gag and refuse to show the page at all. I don't think Slashdot breaking every time an AC forgets to close his i tag is a good thing. :-)

Use Tidy, and suddenly you've got perfectly fine XHTML again.

While one could write a tidy program (and people have) that tries to clean up badly formatted code, they are no more perfect than the "guess what you mean" algorithms in the browser itself. It just moves the "guess what the user means" algorithm to the server instead of the browser. That's not much of an improvement.

You don't have to write one yourself, because W3C provides a perfectly good one, and there already is a large number of open source clones of Tidy. Writing it yourself would be stupid and prone to error. The existing ones are as good as "guess what you mean" can get, and that is an improvement, because you're not trusting wacky, unreliable browsers to turn the crap on your site into something valid, you're doing it yourself. You are in control! That is always an improvement.

PS: My Slashdot comments are perfectly valid XHTML snippets. (Not valid XML, because they don't have a root element. I'm trusting the Slashdot server to handle that for me.)

reboot the web! by wwmedia · 2007-12-16 06:22 · Score: 4, Insightful

am I the only developer that's sick of this html / css / javascript mess??

people/companies are trying to develop rich applications using a decade-old markup language that's improperly supported by different browsers (even firefox doesn't fully support css yet) and is a very ugly mix right now, it's like squeezing a rectangular plasticine object through round, triangular and star-shaped holes at the same time

the web needs a reboot

we need a programming language that:
*works on the server and the client
*something that makes making UIs as easy as drag and drop
*something that does not forgive idiot html "programmers" who write bad code
*something that doesn't suffer from XSS
*something that can be extended easily
*something that can be "compiled" for faster execution
*something that's implemented the same way in all browsers (or even better doesn't require a browser and works on a range of platforms)

Re:reboot the web! by Anonymous Coward · 2007-12-16 06:47 · Score: 0

"we need a programming language that: ..."

No doubt. But which proprietary solution should be adopted in an open forum that is the web? Because you can't even get Microsoft to support XHTML or ODF, let alone some new web engine running on their products that isn't theirs. And if they do, it will have to be proprietary and require a license, restricting who can play in the game, before Microsoft will consider it.
Re:reboot the web! by Anonymous Coward · 2007-12-16 06:59 · Score: 2, Informative

There are a lot of people who think that web, Ajax and Flash applications are a very bad thing. Not just users, but also noted developers and usability experts.

More thoughts on why Ajax is bad for web applications: this is about how Ajax apps are often very fragile and usually don't work as expected.

Ephemeral Web-Based Applications: usability guru Jakob Nielsen writes this great article that goes into depth about how most web apps are complete failures when it comes to usability. Even something as basic as navigation quickly becomes unintuitive and difficult.

Why the .NET framework makes for bad web applications: this explains why .NET apps using some of the latest technology around is often a bad idea.

You're not on a fucking plane (and if you are, it doesn't matter)!: Ruby on Rails creator David Heinemeier Hansson talks about how we don't need web apps everywhere.

There are a lot of anti-web app articles here. Having done a lot of web apps for years now, I think a lot of them are spot on, although they are against web apps even when web apps probably are the best tool for the job:
Web apps: taking five years to get to where desktop apps were a decade earlier?

A JavaScript tip built on years of experience: try to avoid JavaScript.

Why is Web page layout still such a problem?

Web 2.0: A serious case of diarRIA.

AJAX: the "ricer" of the software development world?

Keep the Web in the browser, please.

The wasteful nature of pointless JavaScript effects.

An example of the sorry state of JavaScript today.

The Web is inherently an inadequate application development platform.

Where is the developer productivity increase with JavaScript-based Web applications?

A great Web developer is a waste of a really great application developer.
Re:reboot the web! by cyfer2000 · 2007-12-16 07:04 · Score: 1

I've just got a message from God this morning, he is working on reinventing the whole universe to plug those black holes now, and the request of a better web will be saved in a bugzilla database and be fulfilled several billion years later.

--
There is a spark in every single flame bait point.
Re:reboot the web! by fulldecent · 2007-12-16 07:13 · Score: 1

You can't WYSIWYG-author semantic content.

--
-- I was raised on the command line, bitch
Re:reboot the web! by corychristison · 2007-12-16 07:34 · Score: 1

I think I found what you want.
Re:reboot the web! by coryking · 2007-12-16 07:34 · Score: 1

Semantic is awesome for "make the navigation column have the colour of Mt Everest on a cool summer day". Semantic is awesome for "make all the links in my header have an icon in front of them". Semantic is great for "Make my pull quotes use comic sans and set them in a box with a drop shadow and a reflection under them". But you still need to address basic presentation!!! I still need to make the three column grid in a straightforward way!! Where is my grid tag? Where is my "flow content between these columns"? You know how sweet it would be if done right?

I'm a tech writer by training. I know all about semantic markup and its niceness. But I'm also a usability person and I know how getting the *presentation* of the content right is just as important as the content itself. In a very real way, the presentation *is* the content just as much as the content is the presentation! Either one, poorly done, ruins the message.

You could have the world's best Linux documentation, written so well grandma can recompile her kernel, but if it is presented poorly (as in an "info" document), you might as well just write "Fuck you asshole, docs are for pussies, read the source code" because nobody will understand your docs. If it is all done as a single column and it has some grainy goat face on the top left of the page, grandma will not even bother. She'll use BitchX and ask the IRC nerds instead. If the designer didn't use a readable screen font, or left it to the browser default of Times New Roman, she will not be able to read it. She doesn't know how to change the browser font yet either--how can she without reading the documentation?

Telling us we are evil for hacking around a shitty semantic markup language is a surefire way to get ignored. Fight presentational markup like the Catholic Church fights sex and you'll lose your following. Right now it is hard to give grandma a good looking website that makes an "I can trust this page" impression without resorting to hacked HTML. We need good layout so grandma trusts our Linux documentation!

Ever hear of XAML? XAML is the sound of the future if the W3C doesn't deliver...
Re:reboot the web! by pinkstuff · 2007-12-16 08:26 · Score: 1

No, you are not the only one. Check out Java FX if you haven't already.
Re:reboot the web! by MyDixieWrecked · 2007-12-16 08:30 · Score: 3, Interesting

I agree with you about some things you're saying...

You need to realize that the markup language shouldn't be used for layout. Your comment about "making UIs as easy as drag and drop" can be done with a website development environment like Dreamweaver. You need a base language for that.

Personally, I think that XHTML/CSS is going the right way. It can be extended easily, and it's simple enough that basic sites can be created by new users relatively quickly; complex layouts still require some experience (yeah, it's got a learning curve, but that's what Dreamweaver is for).

The whole point of XHTML/CSS is that it's not designed to be implemented the same way in all browsers. It's designed so that you can take the same "content" and render it for different devices/media (ie: home PC, cellphone, paper, ebook) simply by either supporting a different subset of the styling or different stylesheets altogether.

Have you ever tried to look at a table-based layout on a mobile device? Have you ever tried to look at a table-based layout on a laptop with a tiny screen or a tiny window (think one monitor, web browser, terminal, and code editor on the same 15" laptop screen)? Table-based layouts are hell in those scenarios. Properly coded XHTML/CSS pages are a godsend, especially when you can disable styles and still get a general feel for what the content on the page is.

I'm not sure if I 100% agree with this XHTMLv2 thing, but I think XHTMLv1 is doing great. I just really wish someone would make something that was pretty much exactly what CSS is, but make it a little more robust. Not with more types of styles, but with ways of positioning or sizing an element based on its parent element, better support for multiple classes, variables (for globally changing colors), and ways of adjusting colors relative to other colors. I'd love to be able to say "on hover, make the background 20% darker or 20% more red". I'd love to be able to change my color in one place instead of having to change the link color, the background color of my header and the underline of my h elements each time I want to tweak a color.
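
Until the stylesheet language itself offers this, the wish can be approximated by generating the CSS. The following Python sketch is only an illustration of the idea: darken, PALETTE and the selectors are made-up names, and "20% darker" is interpreted here as scaling each RGB channel by 0.8.

```python
from string import Template


def darken(hex_color, fraction):
    """Scale each channel of '#rrggbb' toward black by `fraction` (0..1)."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    scale = 1.0 - fraction
    return "#%02x%02x%02x" % tuple(round(c * scale) for c in (r, g, b))


# One place to change the colour; everything else derives from it.
PALETTE = {"accent": "#336699"}
PALETTE["accent_hover"] = darken(PALETTE["accent"], 0.2)

STYLESHEET = Template("""\
a { color: $accent; }
h1 { background: $accent; }
h2 { border-bottom: 1px solid $accent; }
a:hover { background: $accent_hover; }
""")

css = STYLESHEET.substitute(PALETTE)
print(css)
```

Changing the single "accent" entry updates the link colour, the header background, the h2 underline and the hover shade together.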

I'd also love it if you could separate form validation from the page. Doing validation with JS works, but it's not optimal. Having a validation language would be pretty awesome. Especially if you could implement it server-side. If the client could grab the validation code and validate the form before sending and handle errors (by displaying errors and highlighting fields) and then the server could also run that same code and handle errors (security... it would be easy to modify or disable anything on the clientside...) that would be great. All you'd really need is a handful of cookie-cutter directives (validate the length, format/regex) plus some built-in types like phone numbers and emails.
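
A rough Python sketch of what such declarative rules might look like. Everything here is hypothetical (the RULES layout, the directive names, the deliberately loose email pattern); the point is only that the rules are plain data, so the same table could be serialized and interpreted by the client as well as re-checked on the server.

```python
import re

# Rules as plain data: easy to serialize (e.g. as JSON) and share.
RULES = {
    "username": {"min_length": 3, "max_length": 20, "pattern": r"[A-Za-z0-9_]+"},
    "email": {"max_length": 254, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def validate(form, rules=RULES):
    """Return {field: error message} for every field that breaks a rule."""
    errors = {}
    for field, rule in rules.items():
        value = form.get(field, "")
        if len(value) < rule.get("min_length", 0):
            errors[field] = "too short"
        elif len(value) > rule.get("max_length", float("inf")):
            errors[field] = "too long"
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors[field] = "bad format"
    return errors


print(validate({"username": "ab", "email": "x@example.com"}))
```

The server still has to run the check itself, since anything executed client-side can be modified or disabled.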

I also think that it's about time for JS to get an upgrade. Merge Prototype.js into javascript. Add better support for AJAX and make it easier to create rich, interactive sites.

If we're not careful, Flash is going to become more and more prominent in casual websites. The only advantage the current standards have is that they're free and don't require a commercial solution to produce.

XSS is a side effect of trusting the client too much, and a side effect that won't be solved by anything you've suggested.

And why does something need to be "compiled" to be faster? What needs to be faster? Rendering? Javascript? Or are you talking about server-side? Why don't we start writing all our websites in C? Let's just regress back to treating our desktop machines as thinclients. We'll access websites like applications over X11. It'll be great. ;)

--

...spike
Ewwwwww, coconut...
Re:reboot the web! by Blakey+Rat · 2007-12-16 08:34 · Score: 1

I'm not THAT upset with it. Javascript + DOM is a good tool, but I feel the real problem is that the designers of these technologies don't listen to previous solutions to the problems encountered on the web.

Why did it take until CSS 3.0 to get easy-to-use columns? The New York Times has been using columns for 150+ years; why did the CSS implementers feel they should just dump all that publishing experience in the toilet and do things their own way?

Likewise, CSS which is supposed to free us from table-based layouts is really terrible at reproducing some effects which are trivial with tables. For example, centering content vertically on the page. (It can be done with CSS, but it's a hack.) If you're going to sell CSS as a replacement to table-based layouts, you need to first make sure that CSS is capable of doing all the things table-based layouts can do easily. (Columns, another great example; awkward in CSS, almost trivial with tables.)

Javascript + DOM has "getElementById", "getElementsByTagName", "getElementsByName"... but for some headache-inducing reason it doesn't have "getElementsByClassName". Why not? WHY NOT!? GAH!
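
The workaround everyone ends up writing is a recursive walk over the tree. Here is a sketch of it using Python's xml.dom.minidom rather than a browser DOM, so the function name and the sample markup are illustrative only:

```python
from xml.dom import minidom


def get_elements_by_class_name(node, class_name):
    """Depth-first list of descendant elements whose class list has class_name."""
    matches = []
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            # getAttribute returns "" when the attribute is absent.
            if class_name in child.getAttribute("class").split():
                matches.append(child)
            matches.extend(get_elements_by_class_name(child, class_name))
    return matches


doc = minidom.parseString(
    '<div><p class="note big">a</p><p>b</p><span class="note">c</span></div>'
)
found = get_elements_by_class_name(doc, "note")
print([el.tagName for el in found])
```

Splitting the class attribute on whitespace matters: a plain substring test would wrongly match "note" inside "footnote".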

Why doesn't the spec define one of the fundamental differences between Mozilla and IE in the DOM: should non-displaying text in the original HTML document appear as text nodes in the DOM? IE says no; Mozilla says yes; web developers say make up your damned mind, I keep having to write workarounds for this crap! (Personally I like IE's implementation better. If it doesn't display on screen, it doesn't need to be in the DOM.)
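
The difference is easy to reproduce outside a browser. Python's xml.dom.minidom keeps whitespace-only text nodes, much like the Mozilla behaviour described above, so the usual workaround is a helper that skips them. The helper name element_children is made up for this sketch:

```python
from xml.dom import minidom

doc = minidom.parseString("<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>")
ul = doc.documentElement


def element_children(node):
    """Child elements only, skipping text nodes (including whitespace)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


print(len(ul.childNodes))         # counts the whitespace text nodes too
print(len(element_children(ul)))  # counts only the li elements
```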

In short, I think there's far too much theory and not enough practice in these technologies. What we need is *practical* development. Which is why I'm all behind HTML 5, BTW, it focuses on the practical realities of the web and not some pie-in-the-sky idea you'll never get anybody to follow. Do you seriously think a webpage like, say, this: http://www2.jcpenney.com/jcp/ProductList.aspx?deptid=25439&pcatid=25864&catid=27010&cattyp=DEP&dep=Housewares&pcat=COOKWARE&cat=Stainless+Steel&refpagename=WindowSolutionHOM%252Easpx&refdeptid=25439&refcatid=25864&cmAMS_T=H9&cmAMS_C=C5&CmCatId=25439|25864 will ever meet the XHTML ideals?

--
Comment of the year
Re:reboot the web! by Hortensia+Patel · 2007-12-16 08:54 · Score: 1

If you're going to shill for Silverlight (which you clearly are, given your Scoble-soundbite title and your previous post here), at least be honest about it. This reads like one of those "evaluation guides" that sales put out for lazy journalists: "An XYZ app should be judged on features A, B, and C; by coincidence, our new XYZalizer product does A, B and C..."

I fully sympathize with your desire for a better way, but not at the cost of throwing away the Web and replacing it with the $VENDOR Network, which is what Silverlight (and others) are trying to do. Microsoft will not support an open Web. Not yet. They can't; they're institutionally addicted to monopoly rents, and monopoly rents require platform lock-in, whether that platform be Win32 or Silverlight or anything else.

I think Mark Pilgrim said it best, and far more entertainingly: "Seriously? Seriously? Do I really have to explain why this is a bad idea? Again? To a bunch of technological virgins who haven't been fucked yet?"

(Incidentally, we did a quick evaluation of Silverlight a few months back, and once you stripped away the layers of PR I really couldn't see what the excitement was about. All the demo apps were some variation of "Oh look, it's another video player. Just what the world was waiting for." Completely uninspiring. And we're a .NET shop; gawd only knows what everyone else thought.)
Re:reboot the web! by Z34107 · 2007-12-16 09:18 · Score: 1

.NET / Silverlight?

ducks

--
DATABASE WOW WOW
Re:reboot the web! by Anonymous Coward · 2007-12-16 09:31 · Score: 0

we need a programming language that:
*works on the server and the client ...
*something that does not forgive idiot html "programmers" who write bad code ...
*something that can be extended easily
*something that can be "compiled" for faster execution

Ha! I like how tactfully you put it.

- a fellow Lisp programmer
Re:reboot the web! by Anonymous Coward · 2007-12-16 09:47 · Score: 0

*cough* Java *ahem*
Re:reboot the web! by Anonymous Coward · 2007-12-16 10:20 · Score: 0

Sounds a lot like silverlight. Do you like Microsoft?
Re:reboot the web! by etnu · 2007-12-16 10:35 · Score: 1

No, we don't.

*works on the server and the client

Server side needs are different from client side needs, and so far the best approach has consistently been to use different languages in the different environments. The client needs to run in a secure sandbox with limited access to system resources, but the server needs high performance. These two concepts are at odds with one another.

*something that makes making UIs as easy as drag and drop

No, we don't need this. We need real programmers who actually know what they're doing to take designs made by designers and let them work. Clearly you didn't live through the travesty of Visual Basic and other RAD drag and drop environments if you're making this statement.

*something that does not forgive idiot html "programmers" who write bad code

Fair enough, but now this is at odds with your previous statement. You want to make it easy to write UIs but then you want to eliminate "idiots"? Have you ever written software before?

*something that doesnt suffer from XSS

Any environment that allows dynamic interpreting suffers from this, but this is a browser flaw, not a flaw in any of the languages used.

*something that can be extended easily

All programming languages can be extended easily.

*something that can be "compiled" for faster execution

Compiling has little to do with execution speed, except where the compiler is coupled with an optimizer. Of course, unless you force yourself to use strongly typed languages with all sorts of limitations, you can't take advantage of much of this anyway, and once you do that you just slow down development yet again.

*something thats implemented same way in all browsers (or even better doesnt require a browsers and works on range of platforms)

Nothing will ever be implemented the same way on all platforms. There isn't a single programming environment in existence that runs on multiple platforms that is implemented the same way on all of them. C? Nope. C++? Nope. Java? No. PHP? No. Perl? No. Python? No.

HTML, CSS, and Javascript do not require browsers at all, and work on a range of platforms. They're also 3 completely distinct technologies which can be used completely independently of one another:

- HTML is a markup language for describing the structure of a hypertext document.
- CSS is a declarative language for describing styling rules for a user interface. There are dozens of environments that use it that aren't using HTML.
- Javascript (aka JScript, ECMAScript) is a general purpose programming language. It is implemented in hundreds upon hundreds of environments ranging from embedded systems to servers.

HTML is in dire need of being upgraded. Javascript could definitely use a real standard library. CSS is fine as it is; we just need more implementations using the latest version once it's formally standardized.

All of these issues are incredibly minor considering how flawed other popular development tools are. I'd take Javascript's flaws over C++ or Java's flaws any day of the week, and comparing CSS to other commonly used style declaration systems isn't even a close competition.
Re:reboot the web! by Anonymous Coward · 2007-12-16 12:10 · Score: 0

am I the only developer thats sick of this html / css / javascript mess??
No, you're not the only one...
Re:reboot the web! by Zphbeeblbrox · 2007-12-16 13:14 · Score: 1

Amen...

I don't see that much needing fixing actually. Incremental improvements are a great way to go. Trying to fix all the problems all at once only results in a lot of wasted effort.

Seems to me the W3C could learn something from some of the Agile Development Methodologies gaining popularity these days.

--
If you see spelling or grammatical errors don't blame me. I tried to preview but IE here at work borked the CSS
Re:reboot the web! by Anonymous Coward · 2007-12-16 15:16 · Score: 0

Microsoft Silverlight satisfies all of those requirements.
http://www.microsoft.com/silverlight/
Re:reboot the web! by etnu · 2007-12-16 16:00 · Score: 1

There are a lot of people who think that a lot of things are bad. That doesn't make them right. Most people who complain about web development are whiners who ignore all of the flaws of desktop applications, or worse, people who just don't have a clue what they're talking about to begin with. Most of the articles you posted fall into the latter half.
Re:reboot the web! by etnu · 2007-12-16 16:02 · Score: 1

Yeah, because Java worked so well on the Web before.
Re:reboot the web! by etnu · 2007-12-16 16:16 · Score: 1

"I just really wish someone would make something that was pretty much exactly what CSS is, but make it a little more robust. Not with more types of styles, but with ways of positioning or sizing an element based on its parent element, better support for multiple classes, variables (for globally changing colors), and ways of adjusting colors relative to other colors. I'd love to be able to say "on hover, make the background 20% darker or 20% more red". I'd love to be able to change my color in one place instead of having to change the link color, the background color of my header and the underline of my h elements each time I want to tweak a color."

This is a great idea for a new CSS feature, but it doesn't require a new language. I'm pretty sure there's already a proposal for something like this anyway. What you probably want is the ability to do something like this:

@define theme-block {
  background-color: #ccf;
  color: #fff;
}
@define theme-window {
  border: solid 1px #000;
  font-family: sans-serif;
}
.block-container {
  base-styles: theme-block, theme-window;
  font-weight: bold;
}

I've implemented this sort of functionality as a preprocessor, and I agree that it's useful, but it wouldn't be hard to have something like this included in a new version of CSS, if it isn't somewhere in CSS3 already and I'm just not aware of it.
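A preprocessor of this kind can be very small. The following is a rough sketch, not the preprocessor mentioned above. It assumes the `@define` and `base-styles` syntax from the example and handles only flat, un-nested blocks. `expand_base_styles` is a hypothetical name:

```python
import re

# Both patterns assume flat blocks with no nested braces.
DEFINE_RE = re.compile(r"@define\s+([\w-]+)\s*\{([^}]*)\}")
BASE_RE = re.compile(r"base-styles\s*:\s*([^;]+);")


def expand_base_styles(source):
    """Inline @define blocks wherever a rule lists them in base-styles."""
    # Collect the named declaration blocks.
    themes = {name: body.strip() for name, body in DEFINE_RE.findall(source)}
    # Drop the @define blocks themselves; plain CSS has no such at-rule.
    stripped = DEFINE_RE.sub("", source)

    def inline(match):
        names = [n.strip() for n in match.group(1).split(",")]
        # An unknown name raises KeyError instead of being silently dropped.
        return " ".join(themes[n] for n in names)

    return BASE_RE.sub(inline, stripped).strip()


SOURCE = """
@define theme-block { background-color: #ccf; color: #fff; }
@define theme-window { border: solid 1px #000; font-family: sans-serif; }
.block-container { base-styles: theme-block, theme-window; font-weight: bold; }
"""
print(expand_base_styles(SOURCE))
```

The output is ordinary CSS, so browsers need no changes. That is the main argument for doing this as a build step until a spec catches up.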

"I'd also love if you could separate form validation from the page. doing validation with JS works, but it's not optimal. Having a validation language would be pretty awesome. Especially if you could implement it server-side. If the client could grab the validation code and validate the form before sending and handle errors (by displaying errors and highlighting fields) and then the server could also run that same code and handle errors (security... it would be easy to modify or disable anything on the clientside...) that would be great. All you'd really need is just a handful of cookiecutter directives (validate the length, format/regex, and also have some built-in types like phonenumbers and emails), that would be great, too."

WML has had this for years, and it wouldn't be a problem to add it to HTML. Something which I've found highly effective here is to define an attribute in the html such as this: <input type="text" validate="regex" filter="[0-9]"/>, and then use a javascript library to examine the various fields and automatically do the form validation.
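The same attributes can drive a server-side check, so the rules live in one place. A minimal sketch in Python, assuming each input also carries a `name` attribute and that `filter` must match the whole submitted value (neither is stated above). `ValidationRules` and `validate` are hypothetical names:

```python
import re
from html.parser import HTMLParser


class ValidationRules(HTMLParser):
    """Collect {field name: regex} from inputs marked validate="regex"."""

    def __init__(self):
        super().__init__()
        self.rules = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        attributes = dict(attrs)
        if tag == "input" and attributes.get("validate") == "regex":
            self.rules[attributes["name"]] = attributes["filter"]


def validate(form_html, submitted):
    """Return the names of fields whose submitted value fails its filter."""
    parser = ValidationRules()
    parser.feed(form_html)
    return [
        name for name, pattern in parser.rules.items()
        if re.fullmatch(pattern, submitted.get(name, "")) is None
    ]


# Hypothetical form; the five-digit filter is only an illustration.
FORM = '<form><input type="text" name="zip" validate="regex" filter="[0-9]{5}"/></form>'
print(validate(FORM, {"zip": "90210"}))
print(validate(FORM, {"zip": "9021x"}))
```

The client-side library and the server then read the same markup, so the two checks can't drift apart.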
Re:reboot the web! by jole · 2007-12-16 20:27 · Score: 1

I think that (just released) IT Mill Toolkit 5 is an answer to most of your questions:

*works on the server and the client
=> Java both on server and client.

*something that makes making UIs as easy as drag and drop
=> Not there yet. UIs are programmed, but layouts can be composed in a WYSIWYG manner.

*something that does not forgive idiot html "programmers" who write bad code
=> As it is based on Java, it has static typing and compile-time checks.

*something that doesnt suffer from XSS
=> As most of the UI programming can be done on the server side, security is top notch (compared to client-side solutions)

*something that can be extended easily
=> Check

*something that can be "compiled" for faster execution
=> Check

*something thats implemented same way in all browsers (or even better doesnt require a browsers and works on range of platforms)
=> Relies on Google Web Toolkit compiler technology to abstract out browser differences. Google has done a wonderful job with GWT 1.4, and while the runtime is not as fast as Flash, it is very browser agnostic. Furthermore, as most of the programming can be done on the server, all UI widgets can be tested with a wide variety of browsers.

--
Vaadin - the best open source framework for building web applications in Java - no plug
Re:reboot the web! by Anonymous Coward · 2007-12-16 21:27 · Score: 0

ouch.
Re:reboot the web! by Anonymous Coward · 2007-12-16 21:56 · Score: 0

Pssst! I have what you need: text/plain!
Re:reboot the web! by aztracker1 · 2007-12-16 22:23 · Score: 1

Hmm.. apparently you aren't familiar with Flex, or Silverlight.. while both commercial, both are widely available, and work as you suggest... as to client/server workings, you can use any tech on the server-side, but some options are easier than others... Personally I like XHTML + CSS + JS for client-side, and generally asp.net server-side, though I have been looking into ruby/rails and monorail more.. I am familiar with PHP and don't much care for it. I've also used swing and cfm in the past, both are okay.

--
Michael J. Ryan - tracker1.info
Re:reboot the web! by jesterpilot · 2007-12-17 01:25 · Score: 1

If we're not careful, Flash is going to become more and more prominent in casual websites. The only advantage that the current standards have is that they're free and don't require a commercial solution to produce.

Another very important feature of the standards is that they put the viewer of the webpage in the driver's seat. With a nice xhtml&css-page, you can control fonts and pics, turn css on and off, and block popups. But that's not in the interest of the advertisers. Flash makes it almost impossible to avoid advertising, and this is the real reason why it's becoming more popular. It is also the biggest problem any standard will have to overcome: how to keep the user in the driver's seat and keep the advertisers at bay.

--
Trust me, I work for the government.
Re:reboot the web! by MyDixieWrecked · 2007-12-17 03:01 · Score: 1

This is a great idea for a new CSS feature, but it doesn't require a new language. I'm pretty sure there's already a proposal for something like this anyway. What you probably want is the ability to do something like this:

Back when CSS2 first came out, I read a proposal that someone had regarding something similar to this. The issue was that CSS was designed to be as simple as possible. I agree it should be part of the CSS spec... although, I think the guys behind CSS want to keep as much logic out of it as possible; that's the only reason I'd suggest that it be separate.

Your example is exactly what I'm talking about, though. Although I'd even love something more than just blocks... where you can define a value... be it "50%" or "500px" or "#666" or "1px solid white" and use it anywhere that that value is allowed.
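Named values like that are also easy to prototype as a build step. A minimal sketch, assuming a made-up `@var name: value;` declaration and `$name` reference syntax (both hypothetical, not from any CSS spec or proposal). `substitute_values` is a hypothetical name:

```python
import re

# Hypothetical declaration syntax: "@var name: value;"
VAR_DEF_RE = re.compile(r"@var\s+([\w-]+)\s*:\s*([^;]+);")


def substitute_values(source):
    """Replace $name references with declared values, drop the declarations."""
    values = {name: value.strip() for name, value in VAR_DEF_RE.findall(source)}
    body = VAR_DEF_RE.sub("", source)
    # Unknown names are left untouched instead of being guessed at.
    return re.sub(
        r"\$([\w-]+)",
        lambda m: values.get(m.group(1), m.group(0)),
        body,
    ).strip()


SOURCE = """
@var accent: #666;
@var rule: 1px solid white;
a { color: $accent; }
h1 { border-bottom: $rule; background: $accent; }
"""
print(substitute_values(SOURCE))
```

Changing `accent` in one place then updates every rule that refers to it, which is the "change my color in one place" case.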
WML has had this for years, and it wouldn't be a problem to add it to HTML. Something which I've found highly effective here is to define an attribute in the html such as this: <input type="text" validate="regex" filter="[0-9]"/>, and then use a javascript library to examine the various fields and automatically do the form validation.

For this, what you recommend is a great workaround for today, but if you were to separate the validation logic out into a separate file (via a link tag) or separate section of the document (which brings me to another thing which I'll get to in a second), you could make it easy to include the validation code on the server side without any effort. No need to go through and parse the document with server-side-JS and you could keep the validation coders out of your document structure.

So, with that, you could have great separation between design (CSS), coding (the JS, any server-side languages, and validation), and document structure (XML/XHTML/HTML).

Getting back to my earlier mention of the information stored in the header, I really wish the style tag wasn't its own tag. Same with script. I almost feel that it would be best to have a separate tag; let's call it "module" that you can use for these non-document sections; and it is only allowed to go in the header.

so you could have something like ... and even have the option to do something like and in the same way, you could also supply the validation code, style information (using CSS or whatever), or any other extra information you'd want to include.

"module" is a terrible name for it, but you get the idea.

--

...spike
Ewwwwww, coconut...
Re:reboot the web! by Anonymous Coward · 2007-12-17 08:50 · Score: 0

I agree on the CSS stuff but it's still preferable to tables.

> but for some headache-inducing reason it doesn't have "getElementsByClassName". Why not? WHY NOT!? GAH!

Works for me :P

It's only 3-4 lines of script to do this anyway.
Re:reboot the web! by Blakey+Rat · 2007-12-17 10:06 · Score: 1

Wow, well at least Firefox 3 will have it. Now only a decades-long wait until all the other browsers do, too. Hah. Thanks for the link.

--
Comment of the year
Re:reboot the web! by Anonymous Coward · 2007-12-17 20:43 · Score: 0

Java?
Re:reboot the web! by Rysc · 2007-12-18 13:40 · Score: 1

What disturbs me about HTML5 is the regression in syntax nazism. What I mean is that the primary feature of XHTML, for me, is that it REQUIRES matching/closing tags and it REQUIRES all lower case elements and attributes. This HTML4/5 "You can ignore the closing p tag, it's ok" idea is *not cool*. HTML 5 plus all of the XHTML syntax strictness would be a win in my book.

Without that strictness I'm going to have to give it a pass. Bring on XHTML! It may be half way useless but at least it's clean.

--
I want my Cowboyneal
Re:reboot the web! by Blakey+Rat · 2007-12-18 14:32 · Score: 1

What I mean is that the primary feature of XHTML, for me, is that it REQUIRES matching/closing tags and it REQUIRES all lower case elements and attributes. This HTML4/5 "You can ignore the closing p tag, it's ok" idea is *not cool*.

Why not? Markup languages should be easy for *people* to read; if a computer has to do a teeny bit more work to read it, well, that only takes the computer a millionth of a second longer. As long as it's not so slaughtered that the browser can't come up with a reasonable DOM representation, I'm all for HTML5.

Bring on XHTML! It may be half way useless but at least it's clean.

You have really, really strange priorities.

--
Comment of the year
Re:reboot the web! by Anonymous Coward · 2007-12-19 05:22 · Score: 0

Do you mean Java?

Different directions -- Need Both by alexhmit01 · 2007-12-16 06:26 · Score: 5, Insightful

Most of the web is not well-formed, so it's variations of HTML 4 with non-standard components. An HTML 5, that remains a non-XML language, presents a reasonable way forward for "web sites." Without the need to be well-formed, the tools to create pages are easier and can be sloppy, particularly for moderately administered sites. Creating a new HTML 5 might succeed in migrating those sites. If you avoid most breaks with HTML 4, beyond the worst offenders, browsers could target an HTML 5, and webmasters would only need to change 5%-10% of the content to keep up. That would mean a less degrading "legacy" mode than the HTML 4 renderers we have now.

So while the HTML 4 renderers floating around wouldn't be trashed, they could be left as is while browser makers focus on an HTML 5 one. Migrating to XHTML is non-trivial for people with outdated tools and a lack of knowledge. You can't ignore those sites as a browser maker, but HTML 5 might give a reasonable path to modernizing the "non-professional" WWW.

XHTML has some great features: because it is well-formed XML, you can use XML libraries for parsing the pages. This makes it much easier to "scrape" data off pages and handle inter-system communication, which HTML is not equipped for.
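To make the scraping point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The page content and `extract_links` are hypothetical. A real page with a DOCTYPE and named entities such as `&nbsp;` would need extra handling that is left out here:

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so queries have to name it.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical well-formed XHTML page.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Prices</title></head>
  <body>
    <ul>
      <li><a href="/pans/1">Saute pan</a></li>
      <li><a href="/pans/2">Stock pot</a></li>
    </ul>
  </body>
</html>"""


def extract_links(xhtml):
    """Return (href, text) for every <a> in a well-formed XHTML document."""
    root = ET.fromstring(xhtml)
    return [(a.get("href"), a.text) for a in root.iterfind(".//x:a", XHTML_NS)]


print(extract_links(PAGE))
```

No HTML-specific tag-soup recovery is involved; any generic XML toolkit can do the same job.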

It's interesting that HTML and XHTML look almost identical (for good reason: XHTML was a port of HTML to XML) but are technically very different, HTML being an SGML language and XHTML an XML language. Both languages have their uses. HTML is "easier" for people to hack together because if you do it wrong, the HTML renderer makes a best guess. XHTML is easier to use professionally because if there is a problem, you can catch it as an invalid XML document. Professionals worry about cross-browser issues; amateurs worry about getting it out there.
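The difference is easy to see by feeding the same sloppy markup to both kinds of parser. A minimal sketch with Python's standard library. Note that `html.parser` is only a tokenizer, not a browser-style tree builder, so it stands in loosely for "the renderer makes a best guess". `TagLogger` and `is_well_formed` are hypothetical names:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Two paragraphs with the closing </p> tags left out.
SLOPPY = "<p>first paragraph<p>second paragraph"


class TagLogger(HTMLParser):
    """Record start tags; the tokenizer does not insist on matching end tags."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def is_well_formed(markup):
    """True if markup parses as XML, False if the XML parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


logger = TagLogger()
logger.feed(SLOPPY)
logger.close()
print(logger.tags)              # the HTML tokenizer carries on regardless
print(is_well_formed(SLOPPY))   # the XML parser refuses the same input
print(is_well_formed("<p>first paragraph</p>"))
```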

XHTML "failed" to replace HTML because it satisfies the needs of professionals to have a standardized approach to minimize cross-browser issues, but lacks the simplicity needed for amateurs and lousy professionals.

Rev'ing both specs would be a forward move that might simplify browser writing in the long term while giving a migration path. XHTML needs a less confusing and forward looking path, and HTML needs to be Rev'd after being left for dead to drop the really problematic entries and give people a path forward.

Re:Different directions -- Need Both by Bogtha · 2007-12-16 06:50 · Score: 1

HTML 5, that remains a non-XML language

HTML 5 has two serialisations, a quasi-HTML serialisation and an XML serialisation.

XHTML "failed" to replace HTML because it satisfies the needs of professionals to have a standardized approach to minimize cross-browser issues, but lacks the simplicity needed for amateurs and lousy professionals.

XHTML failed to replace HTML because a browser with a dominating market share doesn't support it and using it in a backwards-compatible way confers very few advantages over HTML and none whatsoever for typical developers.

--
Bogtha Bogtha Bogtha
Re:Different directions -- Need Both by bcrowell · 2007-12-16 07:00 · Score: 1

XHTML failed to replace HTML because a browser with a dominating market share doesn't support it [...]
Right.

[...] and using it in a backwards-compatible way confers very few advantages over HTML and none whatsoever for typical developers.
Wrong -- or at least it depends on what you mean by "typical." Technologies like SVG and MathML are XML-based, so there is a big advantage to having xhtml support in browsers: it lets you use inline SVG and MathML according to the w3c standards. Because MS doesn't support xhtml, SVG and MathML have basically been killed as practical browser technologies. Instead of SVG, we get Flash, which is proprietary (full of patent-encumbered and license-encumbered parts, controlled only by Adobe). Instead of MathML, we get horrible-looking bitmap renderings of mathematics on web pages.

--
Find free books.
Re:Different directions -- Need Both by Bogtha · 2007-12-16 07:07 · Score: 1

Technologies like SVG and MathML are XML-based, so there is a big advantage to having xhtml support in browsers

Yes, but the advantage is only there if you give up on Internet Explorer compatibility or put in a lot of extra work by coding an additional Internet Explorer version without SVG and MathML, i.e. the version you are supposedly skipping by using XHTML.

Because MS doesn't support xhtml, SVG and MathML have basically been killed as practical browser technologies.

Yes, so you can't really count them as advantages XHTML brings, can you?

--
Bogtha Bogtha Bogtha
Re:Different directions -- Need Both by uzytkownik · 2007-12-16 09:35 · Score: 1

Because MS doesn't support xhtml, SVG and MathML have basically been killed as practical browser technologies.
Yes, so you can't really count them as advantages XHTML brings, can you?
Using this logic, CSS 3 or CSS 2 brings nothing, since IE does not support it. Using this logic, any non-MS technology, or at least any without a plugin, has no advantages. Lack of support is a disadvantage of the program, or excludes the technology from practical use, but it is not a disadvantage of the technology itself.

--
I've probably left my head... somewhere. Please wait until I find it.
Homepage: http://blog.piechotka.com.pl/
Re:Different directions -- Need Both by Bogtha · 2007-12-16 09:52 · Score: 1

Lack of support [...] excludes the technology from practical use

This is exactly what I am arguing.

--
Bogtha Bogtha Bogtha

Re:I bet my ass.. by pkadd · 2007-12-16 06:28 · Score: 0

read the faq, it's explained there. plus i am a web developer, not a designer, that's a huge difference.

q: What is the definition of someone who codes php and html?
a: someone who hangs with programmers

--
Pure awesomenes

Do you work for Adobe? by Anonymous Coward · 2007-12-16 06:28 · Score: 0

It sure sounds like you're suggesting Flash/Flex/Apollo with something like ColdFusion/Java on the backend.

I'm going to make my own browser standard... by tjstork · 2007-12-16 06:31 · Score: 1

Seriously, at this point, having a single standard for web pages is going to be passe. All it will take is a good open source implementation for the browser, critical mass, and eventually, the big players will follow.

--
This is my sig.

No, the direction is not uncertain by Anonymous Coward · 2007-12-16 06:34 · Score: 0

If different browsers decide to adopt different standards, people (ie. web designers and developers) will fall back on whatever currently works for all browsers (ie. whatever we're using right now).

Re:I bet my ass.. by stubear · 2007-12-16 06:35 · Score: 1

I think you really need to change your domain name to vomit.com. It would be far more fitting given your web design skills.

Direction not uncertain by Anonymous Coward · 2007-12-16 06:38 · Score: 0

Browser vendors have publicly stated they will not implement XHTML 2. That 'standard' is stillborn. (X)HTML 5 is the way forward.

So I guess... by Snarkhunter · 2007-12-16 06:39 · Score: 0, Troll

So I guess the question is which one Microsoft will ignore the most?

Re:I bet my ass.. by coryking · 2007-12-16 06:39 · Score: 1, Troll

So if you are a developer and not a designer, why do you think you are qualified to even talk about how we should design our HTML or what browsers we should target?

Stop being so damn myopic and think like a designer or even *gasp* a business person. You are telling people that they should sign up to your religion, even if it means cutting off 70% of their market share. If you stood even an inch outside your "web developer" clique and looked at the big picture, you'd see how insane you sound.

But yeah yeah yeah. IHBT and I'll HAND, thanks.

Re:I bet my ass.. by Bogtha · 2007-12-16 06:40 · Score: 1, Insightful

the classic "why are you trying to use two columns? two columns are evil" religious zealots

I don't think I've ever seen anybody say this. Example?

all their pages look really dull and boring.

In actual fact, their pages don't look boring at all. Your default browser setup looks boring.

Remember, a web design doesn't look like anything until it is realised with the combination of hardware, browser defaults and personal settings. If you think a site that uses your preferences looks boring, then your preferences are to blame.

--
Bogtha Bogtha Bogtha

Article sucks by nonpareility · 2007-12-16 06:43 · Score: 1

XHTML V2 and related modules are officially supported by the W3C, and the related modules are becoming key ingredients for other XML specifications that the W3C maintains. Unfortunately, official W3C approval is no guarantee of support by major Web browsers.

It wouldn't be the first time browser vendors were ahead of official recommendations.

Official W3C approval is pretty much dependent on support by major Web browsers. The W3C process says there should be two interoperable implementations of each feature before a proposed standard becomes a recommendation.

The FAQ doesn't even try to give a serious answer about the expected date of approval

Really?

Current browsers support both HTML V4 and XHTML V1.

Internet Explorer doesn't support XHTML V1.

Similarly, future browsers might support both HTML V5 and XHTML V2.

Don't count on it. XHTML2 is pretty much dead.

are html 5 and xhtml 2 worked on by W3C? by falconwolf · 2007-12-16 06:43 · Score: 4, Informative

Both standards are being worked on by the W3C standards group.

According to the IBM paper html 5 is being done independently of the W3C. "In April 2007, the W3C voted on a proposal to adopt HTML V5 for review" is about as much as W3C has with html 5.

Falcon

--
Should there be a Law?

Re:are html 5 and xhtml 2 worked on by W3C? by gsnedders · 2007-12-16 07:09 · Score: 1

There is an HTML WG at the W3C chartered to create a new version of HTML. A basis for review means, in W3C language, a starting document that will then be reviewed and changed as needed.
Re:are html 5 and xhtml 2 worked on by W3C? by falconwolf · 2007-12-16 07:20 · Score: 1

There is an HTML WG at the W3C chartered [w3.org] to create a new version of HTML. A basis for review means, in W3C language, a starting document that will then be reviewed and changed as needed.

However html 5 was started outside of the W3C by an independent group.
Falcon

--
Should there be a Law?
Re:are html 5 and xhtml 2 worked on by W3C? by VP · 2007-12-16 16:07 · Score: 3, Informative

Both standards are being worked on by the W3C standards group.

According to the IBM paper html 5 is being done independently of the W3C. "In April 2007, the W3C voted on a proposal to adopt HTML V5 for review" is about as much as W3C has with html 5.

Falcon

Wrong. The W3C restructured the original HTML working group. Here is Tim Berners-Lee's initial message about the refocusing of the efforts for evolving HTML, and here are the details for the two new working groups - the HTML working group and the XHTML2 working group.
Re:are html 5 and xhtml 2 worked on by W3C? by falconwolf · 2007-12-17 04:28 · Score: 1

Wrong. The W3C restructured the original HTML working group. Here is [mit.edu] Tim Berners-Lee's initial message about the refocusing of the efforts for evolving HTML, and here are the details [w3.org] for the two new working groups - the HTML working group and the XHTML2 working group.

The IBM document is wrong then. Here's what it says:

" Some prominent HTML specialists outside the W3C--browser vendors, Web developers, authors, and other stakeholders--disagreed with the direction of XHTML V2. In 2004, they started an independent work group to propose an alternative direction for the next version of HTML. Under the flag of WHATWG (Web Hypertext Application Technology Working Group), the group put together proposals for HTML V5 and Web Forms V2."

It goes on to say the W3C later, in April 2007, voted to adopt html 5.
Falcon

--
Should there be a Law?

Re:I bet my ass.. by BlueParrot · 2007-12-16 06:54 · Score: 1

This also seems to be the case whenever somebody bitches about web designers changing fonts, using javascript, or doing something to make their page look nice.

Strawman. Nobody minds a page which uses these things properly (i.e. gracefully falls back when not supported, doesn't rely on them for navigation, etc.). Problem is, some people get it VERY wrong. There's a Swedish news site I like because they have good journalists, but their web developers deserve to be shot. They actually implemented a "marquee-like" news scroller using javascript and DHTML; the thing is slow as hell and for some reason interferes with scrolling the page (I'm guessing the engine chokes when it tries to render an element which is moving vertically across the edge of the window while scrolling with "smooth"-scroll enabled). So why don't I just disable javascript for that page? Well, the site's navigation relies on it...

Btw, if you actually need javascript to make your page "look nice" then I'd claim that you are actually doing something wrong. If you did it right it would look nice even when printed.

Re:Where is Microsoft? by Anonymous Coward · 2007-12-16 06:54 · Score: 2, Insightful

you ever use anything with ajax? i.e. u like google maps? u can thank MS for bringing that out of javascript...

ms ain't the devil for development, sometimes they drive new features and functionality that would take forever to incorporate otherwise. do they always do it in the best of ways, no, but they do bring out good things from time to time...

beta vs vhs.... by josepha48 · 2007-12-16 06:56 · Score: 1

... all over again. It seems to me that at some point one will become more popular than the other. The question is which one. Then the other will go away. So far though I do not see anything being really improved upon. IMHO there should be certain built-ins to the browser to make it worth it.

Here is what I would suggest: 1) a multi-column drop down with sort capabilities, which is something that is available in desktop applications; 2) a built-in browser menu; 3) better scripting modal windows: I should have OK (alert), OK/Cancel (confirm), Yes/No, and Yes/No/Cancel message boxes at least, or a better way of specifying these.

Maybe some of this is improvements in JS/ECMA Script, but making more things 'built-in' to the browser, would make a more standard experience, assuming you could get everyone to upgrade to the browsers and people to develop to these standards.

After reading this article, it would be nice to have both these standards merged into one so I get xforms with HTML5 menu and toolbars.

--

Only 'flamers' flame!
Does slashdot hate my posts?

Re:beta vs vhs.... by Chief+Camel+Breeder · 2007-12-17 00:15 · Score: 1

"Then the other will go away." You may well be right, but it's not as inevitable as with the tapes. We can easily have browser that support both languages. If video recorders taking both VHS and Betamax had been available at no extra cost then we'd still have Betamax.

Re:I bet my ass.. by zaunuz · 2007-12-16 06:59 · Score: 1

Congratulations, you win +1 troll

--
this is probably the most boring sig in the world

Re:I bet my ass.. by coryking · 2007-12-16 07:01 · Score: 1

I don't think I've ever seen anybody say this. Example? Exhibit A. Kinda Exhibit B.

Okay.. so I overstated myself a bit, sue me; this is slashdot after all, right? You know what I'm saying though, there are a lot of people who at least in this little corner of our interweb seem to think that unless we design explicitly for an 80 column lynx terminal we are going to hell - I'm not talking degrade nicely for a lynx terminal, I'm talking "designed for lynx(tm)" and forgoing anything more advanced.

I bet you can dig up people in 1997 that were bitching on slashdot about people using images.. I mean IMAGES for christ sake! I'm sure those people are to this day reading slashdot with images disabled using their parents' 9600 baud modem and a trumpet winsock slip connection.

By the way... this comment is borderline what I'm talking about and so is this one or this one.

Despite appearing like one, I'm not an insane pixel perfect guy either. I just think our standards really suck right now and don't actually meet the needs of either designers or developers. "Pixel Perfect" only exists because it is hard to make fluid layouts with our current bag of tricks. I enjoy bashing W3C zealots almost as much as bashing religious zealots. Both are too trapped in their dogma to see the real world.

Hopefully CmdrTaco will require you to use javascript or flash to post comments so we can finally weed those people out :-)

Re:I bet my ass.. by bcrowell · 2007-12-16 07:03 · Score: 3, Informative

Anyone thinking of clicking on the parent's link (to vumit.com) should realize that it's a goatsex-style shocker page.

--
Find free books.

What a mess... by Enleth · 2007-12-16 07:07 · Score: 1

If you're more interested in XHTML V1.1 than HTML V4, looking for an elegant approach to create documents accessible from multiple devices, you are likely to appreciate the advantages of XHTML V2. If you only use XHTML V1 because of its XML compliance but you prefer the new features in HTML V5, you might appreciate XHTML V5 (HTML V5 rewritten as an XML dialect).

And what if I'm interested in creating elegant, accessible documents incorporating the new features? I guess I'm screwed by the idiots at W3C, then?
That's not a matter of allowing those poor, repressed amateur web designers to express themselves. It's their problem that they cannot comprehend something as easy as XML and it's a pity that Web authoring tools don't work like a good, old hammer - if you don't know how to use it, you hit your fingers and know better next time to be careful because it hurts. 1997 is over, tag soup is becoming a horror of the past - what's up with those people trying to keep it?

--
This is Slashdot. Common sense is futile. You will be modded down.

Re:What a mess... by porneL · 2007-12-16 11:12 · Score: 1

The editor of HTML5 conducted a large-scale (Google-scale) study of pages and concluded that 93% of pages on the net contain syntax errors.
You can't tell browser vendors to stop displaying 93% of the pages on the web. Creating a new XML-only language won't make them go away either. In fact, W3C already tried that and it failed.
So these "idiots" have already done what you're fuming about, learned that it will not work, and already found a new way of solving this problem. HTML5 defines, in gory detail, how to parse tag soup, so every browser can read it the same way.
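
To make the "defined error recovery" idea concrete, here is a minimal sketch in Python. It is not the HTML5 parsing algorithm, which is far larger; the recovery rules below (a new p or li closes an open one, stray end tags are ignored, anything left open is closed at end of input) and the names SoupRepairer and repair are assumptions made up for illustration. Attributes are dropped and void elements such as br are not handled. The point is only that once the recovery rules are written down, every implementation produces the same tree from the same broken input.

```python
from html.parser import HTMLParser

# Elements whose open instance is implicitly closed by a new one of the same
# name. This tiny table is an illustrative assumption, not the HTML5 rule set.
IMPLIED_END = {"p", "li"}


class SoupRepairer(HTMLParser):
    """Turns sloppy markup into balanced markup using fixed recovery rules."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open element names
        self.out = []        # pieces of repaired output

    def handle_starttag(self, tag, attrs):
        # Rule 1: a new <p> or <li> closes a still-open one of the same kind.
        # (Attributes are ignored in this sketch.)
        if tag in IMPLIED_END and tag in self.open_tags:
            self._close_through(tag)
        self.open_tags.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # Rule 2: an end tag with no matching open element is ignored.
        if tag in self.open_tags:
            self._close_through(tag)

    def handle_data(self, data):
        self.out.append(data)

    def _close_through(self, tag):
        # Pop and close everything up to and including `tag`.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def repair(self, markup):
        self.feed(markup)
        self.close()
        # Rule 3: close whatever the author left open at end of input.
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def repair(markup):
    """Convenience wrapper: repair one string of tag soup."""
    return SoupRepairer().repair(markup)
```

Calling repair on markup with unclosed paragraphs, an unclosed b and a stray end tag gives balanced output every time, which is the property the spec is after: identical recovery everywhere instead of each browser guessing differently.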

Re:I bet my ass.. by coryking · 2007-12-16 07:13 · Score: 1

Can you make meebo look nice or be as functional without javascript? What about gmail? What about a web application that wants to fill the browser window in a fluid way (i.e. 100% height). Try doing that without javascript.

Can you make a comment system that is easy to use without javascript? Sure, but you can make a much more enjoyable user friendly one once you limit your scope to javascript only.

Your example of site navigation with javascript? It is poor design not because it is using javascript. It is poor design because it isn't very user friendly. The OP troll dude has a really shitty navigation system, but I don't see any javascript. Testing the web page on users would have probably fixed both people's designs.

Maybe the best way to fix usability bugs is *using* javascript! Do you think Digg would have been at all as popular if it didn't use javascript for its voting buttons? In that case, javascript significantly enhanced the user experience. On slashdot, javascript comments significantly improve the ability to track threaded comments. What about all those neat overlays netflix uses? Those also improved the user experience significantly. Can you do that without javascript?

Shitty javascript site navigation is just a sign that the web designers and developers didn't know usability from a hole in the ground. Used by professionals who know what they are doing, javascript is a powerful tool to improve the user experience of any website.

In other words, javascript is a powerful tool. Use it wisely.

No standard without reference implementation by ikekrull · 2007-12-16 07:21 · Score: 4, Insightful

The worst thing about W3C standards is the lack of a reference implementation. If you can't produce a computer program that implements 100% of the specification you are writing in a reasonable timeframe, your standard is too complex.

It doesn't matter if the reference implementation is slow-as-molasses or requires vast quantities of memory, at least you have proven the standard is actually realistically implementable. On the other hand if your reference implementation was easy to build and is really good, then that will foster code re-use and massively jump-start the availability of standardised implementations from multiple vendors. It might also show that you have a really good standard there.

If you don't do this, you get stuff like SVG - I don't think there is even one single 100% compliant SVG implementation anywhere, and there may never be.

There aren't any fully compliant CSS, or HTML implementations either, to my knowledge.

The same goes for XHTML and HTML5. If you, as a standards organisation, are not in a position to directly provide, or sponsor the development of an open reference implementation, then personally, I think you should be restricting your standard to a smaller chunk of functionality that you are actually able to do this with.

There is no reason a composite standard, with a bunch of smaller, well defined components, each with reference implementations, can't be used to specify 'umbrella' standards.

Now, I am also aware that building a reference implementation tends to make the standard as written overly influenced by shortcomings in that implementation, but I really can't believe this would be worse than the debacle surrounding WWW standards we've had for the last 10+ years. Without a conformant reference implementation, HTML support in browsers is dictated by the way Internet Explorer and Netscape did things anyway.

I'm also aware that smaller standards tend to promote a rather piecemeal evolution of those standards, when what is often desired is an 'across the board' update of technology.

But this 'lets define monster standards that will not be fully implemented for years, if at all, and hope for the best' approach seems to be obviously bad, allowing larger vendors to first play a large role in authoring a 'standard' that is practically impossible to fully implement, and then to push their own hopelessly deficient versions of these 'standards' on the world and sit back and laugh because there is no way to 'do better' by producing a 100% compliant version.

--
I gots ta ding a ding dang my dang a long ling long

Re:No standard without reference implementation by Bryan+Ischo · 2007-12-16 07:54 · Score: 1

I agree with you completely. I think that *every* standard should come with a reference implementation. I can't even comprehend why standards bodies don't do this. It is the single most effective way to ensure that your standard is adopted. And it proves, as you said, that the standard is reasonably implementable - the code will demonstrate how easily implemented the standard is. Certainly the standards body would modify the standard where it's egregiously difficult to implement instead of sinking lots of time into writing a difficult implementation, which is better for everyone as well. By providing a reference implementation, the standards body would be forced to "eat their own dog food", which would definitely encourage polishing up the rough parts before finalizing the standard. Once again, good for everybody.
Re:No standard without reference implementation by Bogtha · 2007-12-16 09:06 · Score: 2, Informative

The worst thing about W3C standards is the lack of a reference implementation.

For a few years now, the W3C publication process has included an additional final step. It is not possible for a specification to reach final Recommendation stage unless it has two complete interoperable implementations.

--
Bogtha Bogtha Bogtha
Re:No standard without reference implementation by Anonymous Coward · 2007-12-16 11:45 · Score: 1, Insightful

Amaya
Re:No standard without reference implementation by ikekrull · 2007-12-16 17:57 · Score: 1

Amaya doesn't offer 100% compliance for HTML, CSS, XHTML or anything at all really.

It's a good effort, but after 5+ years of development, they don't support major functionality like frames, the CSS is pretty inadequate, and the point of my post is that an 'Amaya' should be ready when the standard is published, not over half a decade afterwards.

--
I gots ta ding a ding dang my dang a long ling long
Re:No standard without reference implementation by Bazouel · 2007-12-16 21:30 · Score: 1

EXACTLY. They should provide a library or parsing grammar that browsers can use to obtain the DOM. A library would be better because it can also handle things like the boxing model, etc. Having a C++, Java and C# reference implementation is really a base requirement.

I have to disagree with something you said though. If you cannot produce a base implementation which uses reasonable computing power (cpu and memory), then maybe the grammar is just too complex. XSD is a good example of that problem.

--
Intelligence shared is intelligence squared.

Why Bother by Anonymous Coward · 2007-12-16 07:24 · Score: 0

Why bother with any kind of standard? IE will just display it fscked up anyway...

Re:Where is Microsoft? by Planesdragon · 2007-12-16 07:25 · Score: 3, Insightful

Is Microsoft involved in this at all? If it is, then I am worried. If Microsoft isn't involved at all, then it will fail. That's what "monopoly" means.

Re:Where is Microsoft? by Bogtha · 2007-12-16 07:27 · Score: 2, Insightful

Ever use anything with ajax? I.e., do you like google maps? You can thank MS for bringing that out of javascript...

Ajax-like techniques are possible without XMLHttpRequest and I don't believe Google Maps uses XMLHttpRequest anyway. If any organisation is responsible for the popularity of Ajax, it's Google, as it was when they started using it extensively that it really took off.

--
Bogtha Bogtha Bogtha

Re:I bet my ass.. by Bogtha · 2007-12-16 07:32 · Score: 2, Insightful

Please re-read the original comment. It was saying that you can use JavaScript without being backwards-incompatible. You seem to have confused this with avoiding JavaScript altogether. Every single point you make is good against an argument that JavaScript should be avoided, but completely irrelevant to somebody asking for it to degrade gracefully, which is the distinction BlueParrot was trying to explain to you.

--
Bogtha Bogtha Bogtha

Support for multiple devices... by pikine · 2007-12-16 07:36 · Score: 4, Interesting

From the conclusion of TFA:

If you're more interested in XHTML V1.1 than HTML V4, looking for an elegant approach to create documents accessible from multiple devices, you are likely to appreciate the advantages of XHTML V2.

The author apparently has no experience with rendering XHTML on mobile devices. First of all, since the screen is smaller, it's not just about restyling things in a minimalist theme. It's about prioritizing information and removing what is unnecessary, so that the more important information becomes more accessible in limited display real estate.

For example, anyone who has accessed the Slashdot homepage on their mobile phone knows the pain of having to scroll down past the left and right columns before reaching the stories. You can simulate this experience by turning off page style and narrowing your browser window to 480 pixels wide. The story summaries are less accessible because they're further down a very long narrow page.

Another problem is the memory. Even if you style the unnecessary page elements to "no display", they're still downloaded and parsed by the mobile browser as part of the page. Mobile devices have limited memory, and I get "out of memory" error on some sites. For reading long articles on mobile devices, it is better to break content into more pages than you would on a desktop display, both for presentation and memory footprint reasons.

For these two reasons, a site designer generally has to design a new layout for each type of device. The dream of "one page (and several style sheets) to rule them all" is a fairytale.

--
I once had a signature.

Re:Support for multiple devices... by Abcd1234 · 2007-12-17 09:21 · Score: 1

Unless, of course, you're not an idiot, and have your source documents in XHTML, and then transform them for a mobile device using XSLT, in essence stripping them down server-side, while maintaining a single document source. And this is only possible with XHTML, as it's properly formed XML.
Re:Support for multiple devices... by pikine · 2007-12-17 10:24 · Score: 1

I knew some witty person would mention XSLT at some point, so I've already prepared a question for you. Is there an elegant method to XSLT a large page into several smaller pages? Suppose you would have two XSL stylesheets that want to do the page breaking differently. The "desktop" version breaks the content up to fewer and larger pages than the "mobile" version.

Surely, if your content is structured into nodes, you could adjust the granularity of page breaks by node depth. But for most content out there, you just have a sequence of paragraphs. I doubt if XSLT can break in the middle of a very long paragraph where it could be desired for mobile display.

--
I once had a signature.
Re:Support for multiple devices... by Abcd1234 · 2007-12-18 03:15 · Score: 1

Good christ, how long are your paragraphs that you feel compelled to break them into pieces?? I'm sorry, but it sounds to me like you're inventing an unreasonable requirement in order to invalidate a position you disagree with. I see absolutely *no* reason why you can't live with paragraph-level granularity, if you have a need to break the content up. And that's only aided by XHTML, as long as you make sure to place your paragraphs in divs or p tags.
Re:Support for multiple devices... by pikine · 2007-12-18 03:52 · Score: 1

Just try to view what you just wrote on a 128x160 screen and you'll see what I mean. A paragraph doesn't have to be very long to make browsing experience horrible.

Can't you carry on a discussion without cussing from your rotten mouth?

--
I once had a signature.
Re:Support for multiple devices... by 1110110001 · 2007-12-19 01:10 · Score: 1

Couldn't you include such information in the original document? I.e. a div for pages - the desktop browser wouldn't care and you'd have a node that you can select for the transformation to smaller devices.

If you need different page breaks, just make your divs the smallest possible break and use a couple of these for bigger pages or use different classes to mark the breaks for each platform.

Depending on what you do or want to do on the server side, this could still be a solution and you'd have one document to rule them all.
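
A rough sketch of that "smallest possible break" idea, in Python rather than XSLT. The class name "break", the function paginate and the blocks_per_page parameter are all hypothetical, and the source is assumed to be well-formed XML without a namespace declaration (a real XHTML document would need namespace-qualified tag names). One source document goes in; each device asks for a different number of break-divs per page.

```python
import xml.etree.ElementTree as ET


def paginate(xhtml, blocks_per_page):
    """Group the document's smallest-break divs into pages of a given size."""
    root = ET.fromstring(xhtml)
    # Each <div class="break"> is assumed to mark the smallest unit
    # that a page is allowed to end on.
    blocks = [el for el in root.iter("div") if el.get("class") == "break"]
    pages = []
    for start in range(0, len(blocks), blocks_per_page):
        # Wrap each group of blocks in a fresh page container.
        page = ET.Element("div", {"class": "page"})
        page.extend(blocks[start:start + blocks_per_page])
        pages.append(ET.tostring(page, encoding="unicode"))
    return pages


# One source, two renderings: few large pages for desktop, many small for mobile.
SOURCE = "<body>" + "".join(
    f'<div class="break"><p>{name}</p></div>' for name in "ABCD"
) + "</body>"

desktop_pages = paginate(SOURCE, blocks_per_page=4)
mobile_pages = paginate(SOURCE, blocks_per_page=1)
```

Note that this only moves page boundaries between existing nodes. It cannot split inside a long paragraph, which is the case pikine raised earlier in the thread.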
Re:Support for multiple devices... by pikine · 2007-12-19 16:23 · Score: 1

Yes, but I thought XHTML was designed to separate content and presentation. We have established that XHTML by itself isn't enough, and then we just found out that XHTML plus XSLT still requires presentation information to be encoded in the original document. As I said, it's a fairytale.

--
I once had a signature.

The current situation is awful. by Animats · 2007-12-16 07:46 · Score: 4, Insightful

The current situation is awful.

Major tools, like Dreamweaver, generate broken HTML/XHTML. Try creating a page in Dreamweaver in XHTML or HTML 4.01 Strict. It won't validate in Dreamweaver's own validator, let alone the W3C validator. The number of valid web pages out there is quite low. I'm not talking about subtle errors. There are major sites on the web which lack even proper HTML/HEAD/BODY tags.
The "div/float/clear" approach to layout was a terrible mistake. It's less powerful than tables, because it isn't a true 2D layout system. Absolute positioning made things even worse. And it got to be a religious issue. This dumb but heavily promoted article was largely responsible for the problem.
CSS layout is incompatible with WYSIWYG tools. The fundamental problem with CSS is that it's all about defining named things and then using them. That's a programmer's concept. It's antithetical to graphic design. Click and drag layout and CSS do not play well together. Attempts to bash the two together usually result in many CSS definitions with arbitrary names. Tables mapped well to WYSIWYG tools. CSS didn't. (Does anybody use Amaya? That was the W3C's attempt at a WYSIWYG editor for XHTML 1.0.)
The Linux/open source community gave up on web design tools. There used to be Netscape Composer and Nvu, but they're dead.

Re:The current situation is awful. by The+Master+Control+P · 2007-12-16 08:28 · Score: 1

The sad thing about broken web code is that it's browsers that enable it.

If people know they can be lazy and write crap code that the browser will somehow manage to render anyway, they will since it's easier than writing correct code.
Re:The current situation is awful. by shutdown+-p+now · 2007-12-16 08:34 · Score: 5, Insightful

Drag'n'drop is simply not a working approach to design proper UI (i.e. the one that automatically scales and reflows to any DPI / window size / whatever).
As for "defining named things" - the concept of HTML is all about semantic markup. That's why using tables for layout is frowned upon, not because they are bad as such.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 08:36 · Score: 3, Insightful

HTML isn't supposed to be WYSIWYG. If you want traditional graphic design, make a PDF.

HTML is supposed to be a document format that can be flexibly rendered. Pretty much the opposite of WYSIWYG actually.
Re:The current situation is awful. by Ma8thew · 2007-12-16 08:38 · Score: 1

CSS is not 'less powerful than tables'. If you take the time to learn about it, you'll find out how much cleaner and more efficient using CSS can be. To address your other point, the web is ill suited to WYSIWYG. The number of different platforms, window sizes and browsers prohibits this. Elastic layouts, elegantly resizable text, these are only possible with CSS and hand coding. Never mind the fact that accessibility is made almost impossible without separation of content and design. Taking your own website as an example, I would estimate that the amount of code in your home page could be more than halved without the clunky table based layout and countless font tags. It would also make maintenance, and spotting the numerous empty and unclosed tags, a lot easier.
Re:The current situation is awful. by coryking · 2007-12-16 09:01 · Score: 2, Interesting

WYSIWYG is impossible if you are using templates. You gotta visualize how the chunks come together!
If you want traditional graphic design, make a PDF. PDF is for printing, dummy :-)

I've got a better idea anyway... How about a way to take our centuries of knowledge about "traditional graphic design" and apply it to a web-based medium? Do we have to chuck out everything we know about good design just because of the silly constraints of HTML/CSS? How about we improve or replace HTML/CSS with something that incorporates all we know about "traditional graphic design", all we know about good semantic markup, all we know about good programming, all we know about accessibility and all we know about usability and create something better?

"Use a PDF, jackass" is an open invitation to fuck all ya'll and use Silverlight or Flex. Who knows... maybe Adobe and Microsoft understand us better then "the experts"?
Re:The current situation is awful. by zmotula · 2007-12-16 09:10 · Score: 1

This is going to sound like a troll, but for me the situation is awful precisely and only because of Internet Explorer. I am a web designer, and my job would be infinitesimally easier and more fun if I could write for sane browsers only. I know how to work around most of the bugs now, but usually that means sticking to basic, dumb solutions (or testing like a madman). I do not need major tools, I am perfectly fine with Vim and Unix toolbox. I am happy with the div-float-clear approach as implemented by decent browsers. I do not need WYSIWYG tools, I feel much more comfortable and productive writing it myself. All I need is decent implementation of the current web standards in all major browsers.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 09:10 · Score: 2, Insightful

What's wrong with a PDF? It's got exactly what you seem to want -- total control over your layout. It also supports hyperlinks. Safari certainly renders PDFs inline as if they were somewhat retarded web pages. I'm not sure why you think it's just for printing.

HTML has its purpose. It's time to stop trying to pervert it to yours. Either invent a fixed document format for the web or use one of the ones that's already widely supported (i.e. PDF). But guess what? There's a REASON people hate web links that go to PDFs. It's because the web itself was wisely intended NOT to be WYSIWYG because I don't want to have my monitor set the same way as yours is.
Re:The current situation is awful. by Anonymous Coward · 2007-12-16 09:16 · Score: 0

Infinitesimal does not mean what you think it means.
Re:The current situation is awful. by zmotula · 2007-12-16 09:17 · Score: 2, Funny

1/infinitesimally, then. Sorry :)
Re:The current situation is awful. by Anonymous Coward · 2007-12-16 09:27 · Score: 0

I think the reason that we have such shoddy sites on the web is because people are relying on these WYSIWYG tools to design their pages. The web would be a better place if HTML and XHTML had to be STRICT for it to even be displayed in a browser. It would be nice if all browsers fully supported HTML/XHTML/CSS for this to work (I'm looking at you IE!). I've found that my sites load faster, look better (beautiful source code, too) and are built faster when I hand code.

div/float/clear is kind of awkward. I'd be interested to find a better standard here.

Between using CSS and tables, now that I know and use CSS I prefer it hands down over tables for design. I despise spacer gifs.

As for linux web design tools, I've been really happy with Quanta Plus. It hasn't yet provided a single piece of code that didn't validate. Of course, I'm not using it for WYSIWYG functionality.

I really think that professional web design should be separated into design and programming. Designers create a mockup of the site using whatever graphics program makes them happy. They hand the mockup to the programmer and she makes it happen following STRICT markup. Let's face it, most of us don't have equal analytical and design traits to do both sides of the job well.
Re:The current situation is awful. by gsnedders · 2007-12-16 09:56 · Score: 1

Tim Berners-Lee originally intended HTML to be created through a WYSIWYG editor. WorldWideWeb (the first web browser) has a WYSIWYG editor built in.
Re:The current situation is awful. by grumbel · 2007-12-16 10:00 · Score: 2, Interesting

### Pretty much the opposite of WYSIWYG actually.

That might be the theory, but it simply is not true in reality. HTML is pretty much a WYSIWYG format with additional support for different font sizes and page width. The second you add a tag you are tied to a specific display DPI, the second you add a navigation bar, you no longer have a document that can adjust to different output devices easily. I mean just look at the web today, nobody is using HTML for writing documents. If people want to write a book, they use TeX, if people want to do something else, they stuff their content into a DB and render it to HTML when the user requests it. If the user wants to have a printable version, they rerender the DB content. Ever seen a manual being downloadable as 'single page HTML' vs. 'multipage HTML'? This is only needed because HTML isn't flexible enough to handle both styles of viewing with a single document. A flexible format would allow you to render the document in multiple different ways, but HTML doesn't allow that. You have to change the HTML code to change the rendering result in a significant way.

Not all of this is of course to blame on HTML itself; the browser takes its share of the blame too, for not offering additional ways to render the HTML. But HTML by design is really closely tied to its output device and I doubt that will ever change.
Re:The current situation is awful. by iluvcapra · 2007-12-16 10:05 · Score: 1

WYSIWYG is impossible if you are using templates. You gotta visualize how the chunks come together!
But you have to specify "how all the chunks come together" for a variable set of page widths and glyph sizes.

How about a way to take our centuries of knowledge about "traditional graphic design" and apply it to a web-based medium
They did. Where exactly do you think the idea of floating and clearing came from? Open up a copy of your favorite magazine, and the text wraps around floated images. The big impedance mismatch between a web page and a piece of paper is the fact that a webpage can be displayed in an arbitrarily sized viewport with arbitrarily-sized and replaceable fonts. A grid layout just invites everybody to write "this website is optimized for 1024x768 monitors, expand the window to full screen for the complete experience!"

The formatting model in HTML and XHTML trades strict definition of every element position (which you can still do with tables) for better presentation on a wide range of viewing systems, from a Vista box with QXGA to a Treo. HTML and XHTML tags aren't supposed to position elements; they give semantic hints that the browser uses to position the objects itself. Under the hypertext model, the browser always has final say about the positioning of elements, and this is particularly important on mobile devices and in the unusual though important case of the user providing his own stylesheets.

--
Don't blame me, I voted for Baltar.
Re:The current situation is awful. by Antique+Geekmeister · 2007-12-16 10:25 · Score: 1

Oh, we didn't give up on them. We splintered wildly, and there are dozens, even hundreds of mostly crappy tools over at sourceforge.net. None of them has gathered that much of a following, but I personally favor Amaya, which produces robust, legible, and clean code, and is incredibly useful for debugging the worst of the debris out of Visual Studio-burdened content.
Re:The current situation is awful. by BlueStraggler · 2007-12-16 11:14 · Score: 1

Correct code works as specified on less than 1% of the browser base, and therefore is a topic that is only meaningfully discussed by academics, utopians, and others who don't live in the real world.

Broken code works on 99% of the browser base, and therefore is what the rest of us actually use. The fact that there is no specification for what particular witches' brew of broken code will get the job done is a point of annoyance for web developers, but it does not change the fact that following the specifications will break your web site.
Re:The current situation is awful. by coryking · 2007-12-16 11:40 · Score: 2, Insightful

You know why you are wrong? The idea of a purely semantically described document with all the formatting elsewhere works only for non-interactive documents that get printed, like a book. Period.

Semantic markup languages like HTML break down because the web isn't for print. Semantic markup is the holy grail in the print world because it works so well for linear documents. The web is an interactive, non-linear medium that doesn't get printed.

The web is a two-way, interactive, non-linear medium that is evolving to almost real-time interaction between the client and server. Books, which are written in semantic languages like LaTeX, don't have client-server interaction. Books don't have forms. Books don't have real-time data. Books are none of these things. Books only have headings, tables of contents, footnotes, indexes and other easy to describe things. These are all very easy things to handle in semantic markup languages. In fact, you are insane *not* to use semantic markup for a 300 page book, because without it changing the layout is difficult.

You *cannot resize a book with a mouse*. You *cannot order an ipod* from a book. You *cannot post a comment shared across the globe* in a book. You *don't print the book in different sizes* (for example, you couldn't take Programming Perl and use the same content for a pocket sized book). You *don't have a programming language running inside the book*. Books don't have programmers designing significant chunks of their architecture.

The web is more than a book. The web has some things that are book-like that make sense for semantic content (all H1 should be this font) but lots that don't make sense (make the page 100% high so there are no scroll bars and inlay a second grid for scrollable content... think gmail). You think it makes sense to have a language that is only semantic for creating web applications? How could it even begin to describe google maps?

Even more damning is that a book, which is described semantically, HAS A FIXED OUTPUT DEVICE LIKE A PDF FILE!!! Book authors can "cheat" with their semantic markup and layout because they already know what the target output device is!! They know what inks they can use, what fonts they can use, what the margins are, what the DPI of the printer is, and what the page dimensions are! They all output pixel perfect books using a semantic markup language! We HTML authors know NONE OF THIS and yet you expect us to design our web pages with the same semantic markup abstraction as a book author!?

Can't you see the irony of recommending I use PDF when the main way to generate a PDF is with software using a semantic language!

Can't you see we can achieve the same goal of "making it easy to change the layout" in ways besides a stylesheet? Ever heard of a template language like the one used by Ruby on Rails or Template::Toolkit? Isn't it easier and cheaper to swap out "big layout" bits like columns by swapping out a template than it is a stylesheet? You think all it takes to target a mobile phone is just swapping out the stylesheet? No sir! I have a template system that changes *the entire fucking document* to suit mobile phones and their limitations! Isn't that the better way when you consider how different the two devices are?

So stop treating the web like a damn book! The web is not a book and semantic markup breaks down as an abstraction with modern development. This is very obvious to anybody who has done real web application development. Either help invent a better language to abstract what the web is or get left in the dust while you preach to a shrinking congregation.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 12:16 · Score: 1

That's quite a rant. I think you've quite definitely made the point that the web is not and should not be WYSIWYG, because you CAN resize it with a mouse. The web is not a book. So quit trying to lay things out yourself (ie with tables). That's my device's job.

What exactly do you think is the critical difference between a "template" and a style sheet? Yes, a good web site can work just fine on a mobile device with a simple change of style sheet. Actually, a GOOD web site will work reasonably well on a mobile device WITHOUT a change of style sheet.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 12:19 · Score: 1

Got a reference? Did he say that?

There's a difference between a graphical web editor and a WYSIWYG one. I find it very hard to believe that anyone involved with designing the web intended it to be WYSIWYG. If they had, we'd be laying things out with absolute coordinates. I do find it completely possible that they thought graphical design software is a good idea. It is.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 12:26 · Score: 1

Yes, the current web doesn't USE HTML very well. It seems that a bunch of traditional print layout designers decided they should be web designers.

Nevertheless, it's possible. You can have a navigation bar that resizes itself. It's not infinitely resizable, but it's quite adequate for dealing with a wide range of page widths. Take a look at Slashdot's, for example. The vertical one works just fine for anything from about 300 pixels on up.

When you add a tag you're tied to a specific DPI? I don't get that. I can't actually think of any tags where you have to specify the dpi.

The best ebooks are in HTML format. They size themselves nicely to fit any size screen you might want to read them on and they work well when you start fiddling with the font size. No, HTML isn't an appropriate format for traditional print documents because that's exactly the opposite of what it's supposed to do.
Re:The current situation is awful. by coryking · 2007-12-16 12:31 · Score: 1

> So quit trying to lay things out yourself (ie with tables)

Maybe I want to run a business that makes money? How can I be successful if my site doesn't try to lay things out and my competition does? Should slashdot just dump its database into a CSV file and let you render it?

> Yes, a good web site can work just fine on a mobile device with a simple change of style sheet.

Okay hotshot.

Should I require the user to enter the exact same password on both a mobile phone with a numeric only keypad and a full sized browser with a real keyboard? Keep in mind, my password is 8 characters of random line noise and it takes a good minute to key it in on my RaZR. Ever tried entering that into your tiny mobile phone? How do I fix this obviously glaring usability bug with nothing more than a stylesheet AND keep my user secure?

Extra credit: How can I present this entire thread of 150+ comments on slashdot by just changing the stylesheet? Keep in mind, it has to be just as usable as what I'm doing right now AND you cannot change the javascript or images. Stylesheet only bud.
Re:The current situation is awful. by gsnedders · 2007-12-16 12:36 · Score: 1

From his own site:

there were all the software parts to make a wysiwyg (what you see is what you get - in other words direct manipulation of text on screen as on the printed - or browsed page) word processor. I just had to add hypertext, (by subclassing the Text object)
Re:The current situation is awful. by The+Master+Control+P · 2007-12-16 12:43 · Score: 1

Konqueror rendered it perfectly. I love Konqueror.

But my point is, we got into the situation you correctly describe (broken shit everywhere that mostly-works when fed through certain rendering engines and nothing complying with the standard) because browsers enabled it by making correctness optional, though in fairness (as the original poster said) it's also in part because the w3c never put out a reference implementation to compare things to.

I mean, do Photoshop or the GIMP doggedly try to open a broken jpeg? Does Word desperately attempt to render corrupted .docs as they were meant to be seen? Do movie players spend half an hour trying to figure out how to play a bad file? No, they say "this file is broken" and stop processing it. Why shouldn't web browsers say "this page is broken" and stop rendering at that point (perhaps after putting in anything from the address tag)?
Re:The current situation is awful. by ceoyoyo · 2007-12-16 12:57 · Score: 1

Okay, last message. You're purposely misunderstanding.

I said layout. No, don't dump the database. Use HTML and CSS as it was designed. Specify divs, label content, and specify hints about how it SHOULD be laid out. Then let my browser do what it's supposed to do and figure out how best to make that fit my screen, with the guidance of your hints. If you insist on using tables and sizing your page to 1024x768 or 800x600 or whatever you happen to choose I'm less likely to buy from you because your page is annoying. Take a look at Amazon. It sizes itself to fit your browser (within limits). Tables are used for some elements, but not to force a strict WYSIWYG layout. I guess Amazon doesn't make any money though.

Well, we're talking about layout and HTML, but since you bring it up, how is it you keep your user secure on their mobile? Two passwords, one of which is alphabetical only? That doesn't sound all that secure to me....

As for Slashdot, it works great on my iPod Touch, no style sheet changes required. The page sizes itself to the screen and handles font size changes admirably. The only irritating part is the ads, which occupy a fixed width, but letting them fall off the right side of the screen works fine and has the added benefit that I don't have to see them. When viewing slashdot on my notebook I almost always bump the font size up a couple of sizes. Works just fine, no problems.

I notice your web page handles width and font size changes gracefully as well. Definitely not WYSIWYG.
Re:The current situation is awful. by ceoyoyo · 2007-12-16 13:05 · Score: 1

Here's the full quote:
I wrote the program using a NeXT computer. This had the advantage that there were some great tools available -it was a great computing environment in general. In fact, I could do in a couple of months what would take more like a year on other platforms, because on the NeXT, a lot of it was done for me already. There was an application builder to make all the menus as quickly as you could dream them up. there were all the software parts to make a wysiwyg (what you see is what you get - in other words direct manipulation of text on screen as on the printed - or browsed page) word processor. I just had to add hypertext, (by subclassing the Text object)

So from both the sentence before (talking about a RAD kind of graphical GUI builder system, I presume) and the parenthetical phrase where he defines WYSIWYG as "direct manipulation of text on screen as on the printed - or browsed page" it sounds to me like he's using WYSIWYG in a very loose way to describe a GUI word processor where what you see is approximately what you get. So, for example, as compared to the usual TeX editor, when you make something italic it shows up as italic right away. NOT that your document will look precisely the same every time it's rendered. If he intended strict WYSIWYG then he'd have made sure the web supported (and required) only absolute coordinates.
Re:The current situation is awful. by coryking · 2007-12-16 13:20 · Score: 1

> I notice your web page handles width and font size changes gracefully as well. Definitely not WYSIWYG.

Thank you for the compliment. It was very hard to get it to look like that without tables and render in IE fucking 6 (thank god that will be gone). There are still some rough spots that don't size well, but they are low priority since they are seldom used and only exposed to people logged in.

Tables as a TABLE tag suck - I'm not arguing. I'm not sure what I am arguing about anymore besides that our tools, right now, suck. I think we limit ourselves if we think that the HTML/CSS model is the best way to do things and I think we need to be more creative. The web isn't a book, it shouldn't be pixel perfect, but good layout is essential. XAML feels very right for some reason, like Microsoft listened to both designers and programmers when it designed the language. It lets you really define how your grid should work and how things on it should move around based on changes in the rendering output. I like XAML because it lets you participate in the rendering process. For example, XAML lets you tell the rendering engine "hey, I really want 400 'pixels'" and the rendering engine can make a callback into either your XAML code or even your C#/VB.net code-behind and tell you "sorry pal, you ain't gonna get that, tell me the absolute minimum you need instead and we can work with that". In XAML, pixels are just an abstraction too... a "pixel" might not be a physical pixel, depending on the DPI, but you can still force the rendering engine to snap to a real pixel to keep it from blurring across two or more pixels.
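The "I want 400, tell me your minimum" negotiation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not XAML's or WPF's actual API: the `Element` class, the `negotiate_widths` function and the proportional-shrink rule are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    desired: float  # width the element asks for ("I really want 400")
    minimum: float  # smallest width it can live with


def negotiate_widths(elements, available):
    """Hypothetical negotiation: grant desired widths if they fit,
    otherwise shrink every element toward its stated minimum."""
    desired_total = sum(e.desired for e in elements)
    if desired_total <= available:
        return {e.name: e.desired for e in elements}

    minimum_total = sum(e.minimum for e in elements)
    if minimum_total > available:
        raise ValueError("not enough room even at minimum widths")

    # Share the space above the minimums in proportion to how much
    # extra each element asked for.
    slack = available - minimum_total
    wanted = desired_total - minimum_total
    return {
        e.name: e.minimum + slack * (e.desired - e.minimum) / wanted
        for e in elements
    }


layout = negotiate_widths(
    [Element("sidebar", 400, 200), Element("content", 800, 400)],
    available=900,
)
print(layout)
```

The point of the sketch is that the content states both a wish and a floor, and the renderer decides between them, which is a middle ground between fixed pixel layout and leaving everything to the client.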

My point is, it is possible to have our presentational cake and eat our semantic icing too. HTML & CSS as it is right now just doesn't work because it favors semantics over layout.
> Two passwords, one of which is alphabetical only?

This is how paypal does it and I agree it has... issues. But I can also see people weakening their normal password so it is easy for them to enter into their mobile. You can see my point though, there is more to targeting a mobile phone than just a stylesheet switch. I think it is a red herring that a lot of people toss out when they try to convince us to go 100% semantic. It forgets there are very real differences in the two devices that go way beyond what a simple stylesheet can address. Do you post comments this long on slashdot from an iPhone? Is that something that can be fixed with a stylesheet only or does it require you to rethink how people interact with the entire site when they are on a mobile?
Re:The current situation is awful. by coryking · 2007-12-16 13:32 · Score: 1

> The only irritating part is the ads, which occupy a fixed width, but letting them fall off the right side of the screen works fine and has the added benefit that I don't have to see them.

I almost forgot. That is another challenge for me going forward when I start doing a mobile version of my platform. Google has a different ad format for mobile content (and guess what, it is actually server side and requires either php or asp... and I'm all mod_perl :-). People scan your page differently on a mobile phone too, so I'll have to re-adjust my ads so you see and click on them. I still haven't figured out who will buy ads that are site targeted for mobile devices though - people aren't gonna buy $1,000 nikons on their iPhone :-)

Am I evil? No. Advertising is an important consideration when designing a layout. You'd be amazed how big of a difference it makes when you optimize your ad placement. There is, of course, a fine line between optimal ad layout and obnoxious layout :-) I try to stay on the non obnoxious line because I like my visitors to return :-)
Re:The current situation is awful. by iluvcapra · 2007-12-16 13:38 · Score: 1

> You know why you are wrong? The idea of a pure semantically described document with all the formatting elsewhere works only for non interactive documents that get printed, like a book. Period. Semantic markup languages like HTML break down because the web isn't for print. Semantic markup is the holy grail in the print world because it works so well for linear documents.

I admit being deeply confused about the body of the post, even given the setup, since I don't think it really addresses the main point. If you wanna have absolute control over how all elements are positioned in the body box or viewport, you're necessarily taking control of this away from the client, and just IMO this is probably a mistake, since the client knows a lot of things the web publisher doesn't, like view limitations, availability of fonts, the mode of rendition of those fonts (like aliasing, which can have a profound effect on layout), and output device, be it screen, paper, screen reader, or some other unpredicted interface.

I'm not necessarily saying that you should use (h1) instead of a (div) with styling, or a (strong) instead of a (span) with styling, but that HTML flavors in general are built around the assumption that the enclosed content is organized "semantically" into blocks of text (just like every other markup language since the beginning of desktop publishing, like nroff, TeX, RTF, or OO, you name it). A block of text is a much more flexible specification of content than any absolute method; it maps to a common mode of human interaction and thus can be made to work on a variety of output devices and regimes. It's also a natural fit for the need to position runs of text around other positioned objects, when the final position of the imposing object is not known at the time the HTML is generated... This is starting to sound a lot like a strict versus dynamic typing argument, isn't it? With rendering == execution and HTML authoring == compiling.

I'm also not saying you can't use grids or tables either, everyone does, but even on the trickiest sites the grids are just a framing device for the stuff to be read. The news story, or slashdot post contained within the glue of the table is always organized semantically, into paragraphs, lists, quotes, etc.

--
Don't blame me, I voted for Baltar.
Re:The current situation is awful. by coryking · 2007-12-16 13:57 · Score: 2, Insightful

> but even on the trickiest sites the grids are just a framing device for the stuff to be read

And even then, those are letters of a common alphabet delivered over light that travels inside glass. What is your point? You saying layout isn't important or something?

Layout is just as important to understanding content as the content itself. If you went into a $100USD per dish restaurant dressed in a tuxedo with your hot chick date and the menu is all in comic sans, what do you think about the quality of the food you are about to be served? Those guys who march around downtown areas might have really good compelling content, but nobody reads it because it is always done in permanent marker and twenty different colors. You know, the time cube guy might be right, but his site design makes him look like a joke. People argue that Kerry lost the 2004 election because they did a poor job with the presentation of their logo.

The thing that upsets me about these debates is people think that the colour scheme used, the fonts used, the line spacing, the margins, the proportion between elements, or any other fundamental unit of design is just pretty window dressing around content. Those people also tell you looks don't matter and first impressions aren't important. They are wrong. Very, very wrong. Layout matters, even more on the internet than in print. We need powerful tools in our language to help us express layout. Dismissing layout as a trivial afterthought is a great way to ensure our future is nothing but flash apps.
Re:The current situation is awful. by coryking · 2007-12-16 14:05 · Score: 1

er... but yeah. I think your thoughts about strict and dynamic are interesting. The browser knows a lot of stuff that *html* doesn't know. Javascript can know it though. Maybe we need to formalize the way a page is rendered (at least at a high level) and let our semantically marked up content participate more in the layout. The rendering engine can tell our semantic bits something about itself and we can both negotiate to make sure the final page is rendered and the meaning of our content is preserved. XAML seems to follow this idea a bit, but I haven't played with it enough to really figure it all out.

I better pull out of this now before I make no sense at all. This kind of stuff is always a good exciting debate. Thanks for not being religious :-)
Re:The current situation is awful. by Anonymous Coward · 2007-12-16 15:03 · Score: 0

CSS is not 'less powerful than tables'

Except for the case when you want to actually display tabular information.

In which case table layout still sucks (lol, Y/N column 500 pixels wide) but it sucks much less than trying to stack up divs to make a table.
Re:The current situation is awful. by Xogede · 2007-12-16 15:29 · Score: 0

Yeah, divs totally suck for layout. I mean, in the time it takes to write a div, the previous one has already rendered!
Also, divs make the code too easy to read for your competitors. We can't allow that, can we?

Call me when you'll have to take over the development of a huge site that's written with 7 levels of nested tables and spacer gifs every 3 lines. That's what I've been forced to go through.
Re:The current situation is awful. by grumbel · 2007-12-16 16:00 · Score: 1

### Yes, the current web doesn't USE HTML very well.

I think the problem is that you can't use it well. Neither HTML nor the web browsers are really prepared to render a page differently. Sure, they might be able to do some small adjustments, but something as simple as hiding a navigation bar is already pretty much impossible for the average user.

### It's not infinitely resizable,

Yep, that's part of the point. You can make stuff that works, but it will only work for a specific range of font sizes, since after that the webpage layout will just fall apart. Try setting a really big font and almost all pages with a little bit of layout will completely fall apart.

### When you add a tag you're tied to a specific DPI?

That should have been 'img tag'. Images are a huge part of today's web, but they are the very thing that makes webpages extremely DPI dependent. The blame here might however not be so much HTML itself, but the browser; even Mozilla can't properly smooth scale images, so all images displayed at a non-native resolution look like garbage. So instead of doing "1em" everybody gives exact pixel measurements to make the pages look good, instead of having them scalable.
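To put rough numbers on the DPI dependence: a minimal arithmetic sketch, assuming the browser maps one image pixel to one device pixel and that an em equals the font size in points (1 pt = 1/72 inch). The function names are made up for the illustration.

```python
def physical_width_inches(pixels, dpi):
    """Width on screen of something sized in device pixels."""
    return pixels / dpi


def em_to_pixels(ems, font_size_pt, dpi):
    """Device pixels covered by a length in ems (1 pt = 1/72 inch)."""
    return ems * font_size_pt * dpi / 72


# A 300-pixel-wide image halves in physical size when the DPI doubles.
print(physical_width_inches(300, 100), physical_width_inches(300, 200))

# A box sized as 1em at 12pt keeps its physical size, so it needs
# twice as many pixels on the denser display.
print(em_to_pixels(1, 12, 96), em_to_pixels(1, 12, 192))
```

Under those assumptions the pixel-sized image shrinks relative to the em-sized text around it as the display gets denser, which is the layout drift being described.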

### The best ebooks are in HTML format.

When it comes to books CHM is *far* superior to plain HTML, since CHM adds everything that HTML lacks: searchable index, TOC and all that stuff. The joy of a CHM is that it's not just a plain page, but a whole book. With HTML you can only represent a single page; if you want to have a second page, you have to include all the navigation into the document itself, which I consider horribly messy and which is the reason why basically everybody generates their HTML automatically instead of writing the content directly in HTML.

Frames were kind of a great idea and allowed you to keep navigation and content separate, but the implementation was sadly too problematic to be usable.
Re:The current situation is awful. by BZ · 2007-12-16 17:22 · Score: 1

> There are major sites on the web which lack even proper HTML/HEAD/BODY tags.

All three of those tags are optional in an HTML 4.01 document (see the DTD for HTML 4.01).
Re:The current situation is awful. by Animats · 2007-12-16 17:42 · Score: 1

> There are major sites on the web which lack even proper HTML/HEAD/BODY tags.
> All three of those tags are optional in an HTML 4.01 document (see the DTD for HTML 4.01).

From the HTML 4.01 spec, section 7.4.2:
"Every HTML document must have a TITLE element in the HEAD section."
So HEAD is "optional", TITLE is mandatory, and TITLE can only appear in HEAD. Right. Somebody should submit a DR on this, I suppose.
Re:The current situation is awful. by BZ · 2007-12-16 17:58 · Score: 1

> "Every HTML document must have a TITLE element in the HEAD section."

That's correct. The <head> and </head> _tags_ are optional. The HEAD _section_ is not. Where it starts and stops is determined by the <head> and </head> tags if present, and otherwise by the other tags in the document. For example, parsing:

Foo
Text

gives the same exact results as parsing

Title
Text

per the HTML 4.01 spec, since <title> is only allowed inside HEAD and <p> is not allowed inside HEAD but _is_ allowed inside BODY, and both <head> and <body> are optional.

Welcome to the world of SGML and DTDs!
Re:The current situation is awful. by BZ · 2007-12-16 18:01 · Score: 1

Uh... Those examples should have been:

<title>Title</title>
<p>Text</p>

and

<html>
<head><title>Title</title></head>
<body><p>Text</p></body>
</html>

Sadly, the "Plain Old Text" mode seems to be broken....
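
The equivalence of those two documents can be illustrated with a short Python sketch. It is a deliberately simplified model, not a real SGML or HTML 4.01 parser: `HEAD_ONLY` is a partial, hand-picked list of head-only elements, and `infer_sections` is a hypothetical helper that only looks at start tags.

```python
from html.parser import HTMLParser

# Simplified subset of elements that HTML 4.01 allows only inside HEAD.
HEAD_ONLY = {"title", "base", "meta", "link", "style"}


class StartTagCollector(HTMLParser):
    """Collects start-tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def infer_sections(markup):
    """Assign each element to HEAD or BODY, whether or not the
    <html>, <head> and <body> tags are actually written out."""
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()

    sections = {"head": [], "body": []}
    current = "head"
    for tag in collector.tags:
        if tag in ("html", "head"):
            continue  # explicit tags add no information here
        if tag == "body":
            current = "body"
            continue
        if current == "head" and tag not in HEAD_ONLY:
            current = "body"  # first non-head element implicitly ends HEAD
        sections[current].append(tag)
    return sections


short = infer_sections("<title>Title</title>\n<p>Text</p>")
full = infer_sections(
    "<html>\n<head><title>Title</title></head>\n<body><p>Text</p></body>\n</html>"
)
print(short == full)
```

Both inputs end up with the title in the head section and the paragraph in the body section, which is the sense in which the tags are optional while the sections are not.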
Re:The current situation is awful. by ceoyoyo · 2007-12-16 18:49 · Score: 1

Sure, anything is going to break if you push it too far. But resizable from 300 pixels wide to as big as you want is pretty good. Being able to vary the font size by a factor of even two is MUCH better than nothing. Slashdot and lots of other pages do both of these, no problem. Too many pages don't vary their width at all, and any change in font size makes them unreadable.

Yes, images are the big problem. There's no excuse for modern browsers not to resize pictures gracefully. Safari seems to do it pretty well. Opera's zoom feature was great, where you could increase or decrease the size of everything, text and images. Even with image sizes specified in pixels, most web pages can resize gracefully if they put the images in floating divs.

CHM... can I read that on my iPod? My notebook? It sounds great, but it also sounds like it's built with the same philosophy as HTML, that it's not WYSIWYG but has flexible layout. The extra features sound nice, but aren't really related to layout. Specified pages in e-books really irritate me. Give me a chapter length page any day.
Re:The current situation is awful. by bar-agent · 2007-12-16 20:16 · Score: 3, Insightful

Drag'n'drop is simply not a working approach to design proper UI (i.e. the one that automatically scales and reflows to any DPI / window size / whatever).

Drag'n'drop works fine if it is manipulating a proper UI API. OS X's Interface Builder, with its springs and struts system, comes to mind.

--
i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]
Re:The current situation is awful. by Jesus_666 · 2007-12-16 21:17 · Score: 1

> What's wrong with a PDF? It's got exactly what you seem to want -- total control over your layout. It also supports hyperlinks. Safari certainly renders PDFs inline as if they were somewhat retarded web pages. I'm not sure why you think it's just for printing.

So, how do I do Flash and AJAX in PDF?

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Re:The current situation is awful. by Ma8thew · 2007-12-16 22:00 · Score: 1

That's the entire point of tables. To display tabular data. In that usage they are content, not design.
Re:The current situation is awful. by dodobh · 2007-12-16 22:26 · Score: 1

Do we have to chuck out everything we know about good design just because of the silly constraints of HTML/CSS?

But you do have to chuck out the fundamental assumption of a layout controlled by the designer. That is not a silly constraint, that is a feature.

Once you remove that, and everything else which depends on it, then apply what is left.

--
I can throw myself at the ground, and miss.
Re:The current situation is awful. by Anonymous Coward · 2007-12-17 03:15 · Score: 0

> CSS is not 'less powerful than tables'. If you take the time to learn about it, you'll find out how much cleaner and more efficient using CSS can be.

You mean DIV vs TABLE...not CSS vs TABLE. You can use CSS to power tables too. No, it's not always cleaner or more efficient. It really depends on the situation and context. I can create a 3 col layout with 1 table controlled by CSS that is 100x more elegant and cleaner than using DIVs, god-awful floats and wrappers with ugly hacks for each browser and JS manipulations just to make a simple 3 col container.

The problem is that users nest TABLE upon TABLE upon TABLE...upon TABLE. But the same problem arises when users nest DIV upon DIV upon DIV with ridiculous CSS & JS hacks because they are too afraid to use a TABLE in any scenario even if it means less code. This most likely stems from the fact that they read some tag nazi's blog forbidding them ever to use a TABLE.

If using a TABLE is cleaner, more elegant, more efficient and uses far less code to produce what you are trying to do, use it. If a DIV is more efficient, needs less code, etc for what you are trying to do, use that. In the end you have to decide what the best tool is to get the job done and not worry about "Oh, I can't use a table because it's only supposed to be for tabular data." Yes, I am fully aware it's a W3C recommendation and am fully aware of the "tables take more time to render" theories, but in a world where no browsers fully support their recommendations, where designers also nest DIVs upon DIVs, where CSS is different from browser to browser and ugly hacks are needed...you have to think like a developer and "keep it simple stupid."

Accessibility can happen regardless of whether or not you use DIVs or TABLEs.
Re:The current situation is awful. by Anonymous Coward · 2007-12-17 04:30 · Score: 0

The Linux/open source community gave up on web design tools. There used to be Netscape Composer and Nvu, but they're dead.

http://www.kompozer.net/
Re:The current situation is awful. by mcvos · 2007-12-17 05:03 · Score: 1

The "div/float/clear" approach to layout was a terrible mistake. It's less powerful than tables, because it isn't a true 2D layout system. Absolute positioning made things even worse. And it got to be a religious issue. This dumb but heavily promoted article was largely responsible for the problem.

Excuse me, but what's so bad about that article? I hadn't seen it before, but everything it says is true. Using those complex, overwrought tables full of spacer.gifs is incredibly stupid. I'm afraid the oldest sites my employer hosts are built like that, and they're a pain to maintain. Modern designs that organise all the content in divs and use CSS to position them make everything a lot easier.

CSS layout is incompatible with WYSIWYG tools

So? The web isn't a WYSIWYG medium anyway. The web is about content, and about presenting that content in a friendly manner. Tables make content subservient to layout, throw semantics out the window, and make your content inaccessible to anyone who does not use your graphical representation to access the content.

Clear, semantic HTML + CSS is much more powerful than table layout.
Re:The current situation is awful. by ceoyoyo · 2007-12-17 05:44 · Score: 1

Try searching Google for "PDF embedded flash." Flash has its own equivalent of AJAX.

Of course, if you're going to have Flash you can just forget the PDF and use pure Flash. It has WYSIWYG editors and you can make it move! Plus your viewers will hate you even more!
Re:The current situation is awful. by Jesus_666 · 2007-12-17 07:45 · Score: 1

It's not like I want to... But 90% of all hobbyist web developers do. Unless the medium is easy to work with and supports the latest in interactive bling bling, they're not going to use it, whether it's called PDF or XHTML.

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Re:The current situation is awful. by BlueStraggler · 2007-12-17 08:22 · Score: 1

Why shouldn't web browsers say "this page is broken" and stop rendering at that point?
For starters, it's undefined what "broken" HTML is. The standard includes forward- and backward-compatibility conditions that require invalid/illegal/unrecognized tags and attributes to be ignored. It's not required to declare which standard you are coding to, so there is an implied <!DOCTYPE DOGS_BREAKFAST>. Missing closing tags can be inferred and are syntactic sugar in many cases, anyway. For instance, I'm leaving the </p> tag off the end of my paragraphs here (although slashcode might catch that). Even with grossly misformatted documents, closing open elements at the end of a block or document produces something readable most of the time, and it is easy for the parser to do, so why not do it? (Bearing in mind that Netscape 4's parser could not handle unclosed tables properly, and it did indeed refuse to render such pages; this contributed to the general impression that it was a piece of crap.)
Furthermore, a general-purpose browser has to have parsing algorithms to handle all of the various HTML specs, so you have code in place to handle all kinds of constructions that are invalid in one case and valid in another. Which means it's not that much more difficult to bend the rules from time to time, since you're already allowing construction X in other situations. Which in turn makes it possible to throw together a frankenstein parser that handles the maximum number of constructions, regardless of formal validity. This is called "quirks mode", and it is the Duct Tape that binds the web together. What you are really asking for is that browsers disable quirks mode. That's like asking a super-hero to disable his most potent superpower.
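As a rough illustration of how cheap "close whatever is still open at the end" is for a parser, here is a small Python sketch built on the standard library's `html.parser`. It is not how any browser actually does it: `OpenElementTracker`, `implied_end_tags` and the short `VOID` list are assumptions for the example, and it ignores the real rules under which, say, a new `<p>` closes the previous one.

```python
from html.parser import HTMLParser

# Elements that never take an end tag (partial list for the sketch).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class OpenElementTracker(HTMLParser):
    """Keeps a stack of elements that have been opened but not closed."""

    def __init__(self):
        super().__init__()
        self.open_elements = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_elements:
            return  # stray end tag with no matching start: ignore it
        # Pop everything opened after the matching start tag as well.
        while self.open_elements.pop() != tag:
            pass


def implied_end_tags(markup):
    """End tags a forgiving parser would append at end of document."""
    tracker = OpenElementTracker()
    tracker.feed(markup)
    tracker.close()
    return [f"</{tag}>" for tag in reversed(tracker.open_elements)]


print(implied_end_tags("<table><tr><td>cell"))
```

The whole recovery step is a stack and a loop, which is why a parser that already tracks open elements has little reason to give up on an unclosed table.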
Re:The current situation is awful. by ceoyoyo · 2007-12-17 09:45 · Score: 1

So if the next versions of HTML/XML/PDF/whatever didn't support the latest bling 90% of hobbyist web developers wouldn't use it? Can we start a letter writing campaign to the W3C?
Re:The current situation is awful. by Blakey+Rat · 2007-12-17 10:44 · Score: 1

They did. Where exactly do you think the idea of floating and clearing came from? Open up a copy of your favorite magazine, and the text wraps around floated images.

You know what else the magazine has? Columns. Why didn't CSS have it until version 3.0? A version that's not implemented anywhere yet.

--
Comment of the year
Re:The current situation is awful. by nine-times · 2007-12-17 15:03 · Score: 1
I actually think that most of your issues point to the same truth: WYSIWYG tools are not appropriate for generating web pages.
- Major tools, like Dreamweaver, generate broken HTML/XHTML: The people who have tried to make WYSIWYG web-design programs have failed to make them work properly.
- CSS layout is incompatible with WYSIWYG tools: WYSIWYG can't deal appropriately with the separation between content and presentation, even though that separation is fundamental to good web design.
- The Linux/open source community gave up on web design tools: most of the people who know enough to write WYSIWYG HTML editors also know enough to know that they won't work.
The only thing I might agree with is this one:
- The "div/float/clear" approach to layout was a terrible mistake: Yes, CSS positioning is currently a bit of a mess. It's a bit too hard to make simple layouts, and the situation isn't helped by poor CSS implementation in certain browsers... (I'm looking at you, Microsoft). However, using tables for layout is often a bad situation too. Because tables were designed to deal with tabular data, doing complex positioning usually requires nesting tables within tables within tables-- on and on. It makes code that's hard to maintain and prone to breakage. There's a good reason why good web developers avoid using tables for things other than... well, tables. But yeah, CSS needs some improvement.
Re:The current situation is awful. by 1110110001 · 2007-12-19 01:18 · Score: 1

The fundamental problem with CSS is that it's all about defining named things and then using them. That's a programmer's concept.

Now that's funny, because even MS Word has the same concept. You could select one heading after the other and change the font to Comic Sans, like some do. But you get annoyed as you add another heading, because it has the default style again. That's when you learn what these style classes are for.
Re:The current situation is awful. by rp · 2007-12-19 03:49 · Score: 1

Can you explain your second point? That "dumb" article provides lots of good arguments, but I don't understand yours. What is it that is wrong with CSS? What is a "true" 2D layout system?

Your third point I don't understand, either. The names are there to identify categories. Categories are essential, and names are just a way to identify them across all pages of a website. Maybe CSS editors aren't smart enough about dealing with classes, but I can't see how the fundamental mechanism of named classes in CSS is inadequate.
Re:The current situation is awful. by guabah · 2007-12-19 07:04 · Score: 1

That thing you say about the table-less approach makes you look like you don't care about accessibility that much, do you?

Re:I bet my ass.. by coryking · 2007-12-16 08:04 · Score: 2, Insightful

Sweet.. So we agree and I owe you some kind of beer. Slashdot makes everybody a flamer :-)

There is a very strong business case for good degradation too... Last I checked, Google doesn't interpret your javascript. You want good SEO, you better make sure the content flows right in lynx (which is the best way to think about how google sees the page).

Sadly, screen readers are pretty much like google too, but I really think we aren't feeding screen readers enough information for them to properly read a page. I really don't know the answer to screen readers. I've never played much with it, but in the Windows world, if you were doing a winforms app you can sprinkle your form with metadata to help screen readers. But again, even the winforms solution is a bit like an alt tag.

When I took a usability class, we watched some video I wish I could find of somebody using a screen reader. Talk about intense. Imagine reading a web page, or any document for that matter, while looking through a straw that is only one word wide. That is about what it is like. Now read it with the voice cranked to "hyper fast talk mode" and that is how the blind experience the web. Very interesting and eye opening.

Whatever the future holds (silverlight/flex), we need to make sure the standard has some good, juicy metadata to help out screen readers (and google, really).

Where was I now?

Err What? by Anonymous Coward · 2007-12-16 08:20 · Score: 0

There is no contest, the browser vendors have made it repeatedly clear on the WHATWG and HTML5 mailing lists that they do not intend to further support XHTML. They are going down the HTML5 dead-end, and s0d the rest of us.

Re:Where is Microsoft? by ShadowLeo · 2007-12-16 08:26 · Score: 2, Insightful

I believe what you are referring to is the "Hidden iframe" technique. Google lists plenty of resources on using this technique.

Re:Where is Microsoft? by Bogtha · 2007-12-16 08:33 · Score: 1

now hear this folks! it wasn't ford, dimler and benz that should be praised for the automobile. it's the people that use them.

One of my points was that XMLHttpRequest was never the only option for Ajax-like effects. It just happens to be the most convenient. Your analogy simply doesn't work.

stop being such a google shill.

If you had ever read my previous comments concerning Google you would know that I am in no way a shill for them; in fact I think the quality of their client-side code sucks and have said so many times.

--
Bogtha Bogtha Bogtha

Re:Where is Microsoft? by Bogtha · 2007-12-16 08:37 · Score: 3, Informative

That's one of them, yes. It really depends on what you want to do; for example you don't need anything other than typical mousedown event handlers for things like Google Maps, and you can use things like dynamically generated image URIs to send data back to the server asynchronously, which is compatible all the way back to Netscape 2. There are lots of options, the value in XMLHttpRequest is more convenience than functionality.
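
For illustration only (this is not code from Google Maps or from the comment above): a minimal sketch of the "dynamically generated image URI" trick. The /log.gif endpoint and both function names are made up for the example; the only browser feature assumed is that setting src on an Image object triggers a GET request.

```javascript
// Build a URL whose query string carries the data. Requesting it as an
// image hands the data to the server without XMLHttpRequest.
function buildBeaconUrl(endpoint, data) {
  const query = Object.keys(data)
    .map((key) => encodeURIComponent(key) + '=' + encodeURIComponent(data[key]))
    .join('&');
  // Append with '&' if the endpoint already has a query string.
  return endpoint + (endpoint.includes('?') ? '&' : '?') + query;
}

// Hypothetical helper: fire the request by pointing an off-document image
// at the generated URL. Outside a browser (no Image constructor) it only
// returns the URL it would have requested.
function sendViaImage(endpoint, data) {
  const url = buildBeaconUrl(endpoint, data);
  if (typeof Image !== 'undefined') {
    const img = new Image(); // never attached to the document
    img.src = url;           // assigning src starts the GET request
  }
  return url;
}
```

Called as sendViaImage('/log.gif', { lat: 51.5, note: 'a b' }), this requests /log.gif?lat=51.5&note=a%20b. It is one-way: the page learns little beyond whether the image loaded, which is part of why XMLHttpRequest is the more convenient option.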

--
Bogtha Bogtha Bogtha

This is silly. by uhlume · 2007-12-16 08:38 · Score: 2, Insightful

<restricton lock="Random_hard_to_guess_string" except="java,safe-html" />

Doesn't really matter how "hard to guess" your string is if you're going to transmit it cleartext in the body of your HTML document, does it?

"But wait!" you say, "We can randomize the string every time the document is served, thus defeating anything but an embedded Javascript with access to the DOM." Perhaps so, but now you're talking about server-side behavior — something clearly beyond the purview of the HTML specification.

If you think about it clearly, there's only one place that it makes any sense to address hostile embedded content, and it is server-side, with the growing battery of techniques already in service. Insisting that the HTML spec and browsers should be addressing this issue is asinine.
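
As a concrete instance of one such server-side technique (the comment does not name a specific one, so this choice is an assumption): escaping third-party text before it is written into the page, so the browser shows markup instead of interpreting it. The function name is illustrative, and this only covers text placed in element content or quoted attribute values.

```javascript
// Characters that can start or end markup, mapped to their HTML entities.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace every markup-significant character in untrusted text so the
// result is displayed literally rather than parsed as tags or attributes.
function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

This is the blunt end of the spectrum: it allows no markup at all. Sites that want to permit a "safe" subset of tags need a real sanitizer, which is where the parsing-difference problems raised elsewhere in the thread come in.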

--
SIERRA TANGO FOXTROT UNIFORM

Re:This is silly. by r00t · 2007-12-16 11:33 · Score: 1

You create the random string when you create the page. That is the only chance the attacker has.

It is presumed that the page is created from a chunk of untrusted data embedded within trusted data. For example, a web mail or forum. The attacker is not given the chance to go back and edit his evil code. If forum comments can be edited, well, the web page will get generated again and that gives a new key value.

Typically one does indeed have fancy stuff on the server. Slashdot certainly does. Slashdot is a giant perl script.

Really this is no different from the other method, except less error-prone and it reduces server load. Rather than having the server try to parse some potentially hostile tag soup, the server just wraps the mess in the new tags.
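
A rough sketch of that wrapping step, assuming a Node.js server. The restricton/restrictoff tags are the hypothetical ones proposed earlier in this thread and no browser implements them; wrapUntrusted is a made-up name. The only real API used is Node's crypto.randomBytes.

```javascript
const crypto = require('crypto');

// Wrap untrusted markup in the (hypothetical) restriction tags, using a
// fresh random lock each time the page is generated.
function wrapUntrusted(untrustedHtml) {
  let lock;
  do {
    // 16 random bytes rendered as 32 hex characters.
    lock = crypto.randomBytes(16).toString('hex');
    // Paranoia: redraw in the vanishingly unlikely case that the
    // untrusted content already contains the lock string.
  } while (untrustedHtml.includes(lock));

  return (
    `<restricton lock="${lock}" except="safe-html" />\n` +
    untrustedHtml + '\n' +
    `<restrictoff lock="${lock}" />`
  );
}
```

Because the lock is chosen after the attacker's content is already fixed, the content cannot contain a matching restrictoff tag. Whether browsers should honor such tags at all is the actual disagreement here.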
Re:This is silly. by uhlume · 2007-12-19 16:31 · Score: 1

I'm sorry to state this so bluntly, but your comment only demonstrates that you have no idea what you're talking about. Your suggestion would require a dynamic language — something which HTML is not, and is not likely to become.

--
SIERRA TANGO FOXTROT UNIFORM

Re:Where is Microsoft? by coryking · 2007-12-16 08:40 · Score: 2, Interesting

I remember when rusty and friends rolled out Dynamic Comments on Kuro5hin/Scoop. They did it with an iframe that chucked out a bunch of onload() crap that wrote into the parent document. Pretty slick for the time.

Way ahead of its time though... most javascript was either for homework assignments or popup ads. All of it was copy/paste hackjobs that the web author found on super-mega-awesome-javascript.com or something. The result was "most people" hated javascript. You could browse 99% of the interweb with it disabled and all you'd miss were popups. Kuro5hin was one of the first reasons to actually turn on javascript because dynamic threaded comments were 100% better than the non-dynamic ones.

Now that javascript is starting to come of age and real programmers are writing cool things on it (and really javascript is a kinda cool programming language once you get past super-mega-awesome-javascript.com and the differing implementations), almost anything that is useful on the internet uses javascript in some way. In a way, javascript has crossed the chasm from early adopters like kuro5hin to mainstream adoption and that nice beefy 80% of the market.

What I find funny is only the tech people are the laggards of this bell curve. And all 10% of them seem to hang out on slashdot pining for the days of yore. What a world we live in when the supposed alpha geeks are the laggards of a technology bell curve!!

Re:Where is Microsoft? by Selanit · 2007-12-16 08:58 · Score: 3, Informative

I don't believe Google Maps uses XMLHttpRequest anyway.

Err, yes it does. From the Google Maps API reference:

The Google Maps API is now integrated with the Google AJAX API loader, which creates a common namespace for loading and using multiple Google AJAX APIs. This framework allows you to use the optional google.maps.* namespace for all classes, methods and properties you currently use in the Google Maps API, replacing the normal G prefix with this namespace. Don't worry: the existing G namespace will continue to be supported.

And that's just a recent refinement. Google Maps has used the XMLHttpRequest object for ages. Yes, it's possible to get a similar effect using hidden iframes and such, but doing it that way is really awkward. They'd have to be crazy to pass that amount of data back and forth that way when they've got XMLHttpRequest.

Re:Where is Microsoft? by Lord+Ender · 2007-12-16 09:11 · Score: 1, Troll

are possible without XMLHttpRequest and I don't believe Google Maps uses XMLHttpRequest anyway

http://www.google.com/intl/en_us/mapfiles/95/maps2/main.js:

function NE(){try{if(typeof ActiveXObject!="undefined"){return new ActiveXObject("Microsoft.XMLHTTP")}else if(window.XMLHttpRequest){return new XMLHttpRequest}}catch(a){}return null}

you FAIL!

--
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.

Re:Where is Microsoft? by Bogtha · 2007-12-16 09:16 · Score: 1

I've just checked by loading up Firebug to monitor and clicking around on Google Maps for a while, and it didn't use XMLHttpRequest at all. The basic functionality is done by dynamically loading and positioning images. I'm sure there are parts of the API available through XMLHttpRequest, but the major functionality and what it is famous for is not done with XMLHttpRequest as far as I can see.

--
Bogtha Bogtha Bogtha

we need a programming language that: by Anonymous Coward · 2007-12-16 09:51 · Score: 0

*doesn't and never will exist.

HTML, Javascript and CSS have become the lay of the land, like it or not, because they've been in use for so many years now. Replacing them all in one fell swoop would require everyone to throw out all of their old code and start over from scratch, for little or no reason. What we have now actually works, but the edge cases and never deprecating old components is killing overall compatibility.

Instead, what needs to happen is, all browsers that cannot render a page simply need to not try and bail out, telling the developer that at this point, their webdesign sucks and is invalid. Of course, as long as Microsoft Browser of the Idiots exists, this will never happen either, as they pride themselves on the fact their browser is perpetually broken.

fonts are platform-specific and copyrighted by r00t · 2007-12-16 09:53 · Score: 0, Troll

You may love the latest stuff shipping with Vista, but it's not on my computer and I'm not going to swipe a copy.

I don't even have Comic Sans, Arial, Verdana, Times New Roman, etc.

I do have fonts. Some of them look kind of nice. You probably don't have them.

Re:fonts are platform-specific and copyrighted by coryking · 2007-12-16 10:11 · Score: 1

So good. Enjoy how my HTML degrades. Don't think you will stop me giving you fonts that really make my page look good. Don't be surprised if my page looks like crap in your font either. But at least you can read it, right?

Re:I bet my ass.. by Antique+Geekmeister · 2007-12-16 10:30 · Score: 2, Insightful

I did. I still do: most images add nothing to the content, they merely add dancing bears to the web content, pull more bandwidth, provide client tracking through the tracking of third-party 1-pixel GIF's, and generally slow down my web performance. They also interfere extensively with text->speech synthesis for the visually impaired.

The Web is not for the developers. It's for the people who want and need the data, the clients who in the end actually pay the bills and view the pages. If it's a games site for people to play Flash games, great: otherwise, get out of the dancing bears business and let me look up what I need.

The W3C's validator is good, by falconwolf · 2007-12-16 10:55 · Score: 1

if awkward to install in some systems.

I liked it, and didn't find it awkward to install on either of the two PCs I installed it on. Back then though I had Windows now I have a Mac. I also liked XMLSpy.

Falcon

--
Should there be a Law?

Are you serious? by Anonymous Coward · 2007-12-16 10:57 · Score: 0

You don't need to pass a validating parser, much less walk the DOM on every document. Comments should be stored in the database with the correct encoding. What kind of moron stores user submitted content in a db without first converting it to their app's default encoding? RSS feeds are irrelevant, even with ATOM you should check the encoding in the validation step as you would for any other third party content.

Ads are a problem, market forces should lead the brokers to standards compliance; unfortunately one vendor managed to grab a monopoly on the browser market. So I give you ads but consider the other points bogus.

you're serious, aren't you? by r00t · 2007-12-16 10:58 · Score: 1

The W3C was well on the way toward being fully useless, pointless, and ignored. They'd built themselves a lovely ivory tower, locked themselves inside it, covered their eyes and ears, and started to enjoy LSD. It was heaven for people who liked politics and design-by-committee more than engineering and practicality.

We love our tag soup. It mostly works, unlike xhtml which only works in Gecko. (nope, not IE, unless you use the text/html MIME type and your "xhtml" just happens to be tolerated when parsed as html) Tag soup gets stuff done.

XML is nothing to be proud of. Though I'm no fan of LISP, even LISP-style notation would be better than XML. XML is gross inefficiency while not even being particularly readable. In any case it's not a significant improvement over the bastardized SGML that is the foundation of HTML.

Re:you're serious, aren't you? by Anonymous Coward · 2007-12-17 01:08 · Score: 0

> and your "xhtml" just happens to be tolerated when parsed as html

XHTML is back-compatible with HTML4, that's why the "text/html" MIME type works.

The post below yours makes a point about syntax errors in real-world markup, but these wouldn't exist if browsers displayed a parse error.

You say that XML is nothing to be proud of and I wouldn't disagree, what I would say is that tag soup is an embarrassment and promoting it is ultimately short sighted. What happens when we're faced with a paradigm shift in computing, do you really want to be bug-matching legacy tag-soup parsers?

HTML5 should be an incremental update to 4 and the XML serialization additional modules for XHTML1.1

I prefer XHTML 2, thanks by wikinerd · 2007-12-16 11:01 · Score: 4, Insightful

I thank the HTML 5 guys for their attempts, but I prefer XHTML v2

From TFA:

XHTML V2 isn't aimed at average HTML authors

XHTML is for intelligent human beings, you know, people who can actually understand what separation of concerns is.

[HTML v5] propose features that might simplify the lives of average Web developers

So HTML v5 is for people who don't understand separation of concerns.

Unfortunately that's the 99% of web kiddies out there.

The standards will appeal to different audiences.

One standard for smart people who know programming and actually work with an engineering mindset, another for those who see the web as a big graffiti and work with an "anything goes" mindset. No thanks, I prefer ONE standard for smart people, XHTML v2, and just to kick out everyone who isn't qualified.

Re:I prefer XHTML 2, thanks by Dracos · 2007-12-16 12:12 · Score: 4, Insightful

Agreed, this article is HTML5 apologist rhetoric. I thought it was rather well-balanced until the author got to HTML5, where his preference is subtly revealed.

XHTML2's universal src attribute is mentioned (confusingly called a tag), but the universal href attribute, which allows any element to be transformed into a link, is not. Nor is the role attribute mentioned, which allows a tag to be assigned a semantic meaning (like menu or header) without expanding the tag set.

TFA even admits in a roundabout way that HTML5 exists because the majority of so called "web developers" are ignorant of the current standards and incapable of effectively using them. If you need to be "clever" to use XHTML2, then perhaps no one will have to reach for the eye-bleach every time they wander into places like MySpace (where page skins are based on an exploit where browsers interpret <style> tags outside the document head, which is illegal).

I tell people "Writing web pages is easy. Writing them well is hard." This is proven by the amount of junk documents on the web that don't validate as anything but pretty, even if beauty is in the eye of the beholder.

The author wisely avoided any discussion of the silly new tags (some of which are presentational, not semantic) HTML5 includes. He does mention XHTML5, which is "optional"... why should we take that step backwards?

The anti-XML-compliance people like to complain that XML is too verbose. If they don't like it, they can use something else, like RTF. Cars have gotten verbose too over the years. Those people can put their money where their mouths are by buying an antique that doesn't have a radio, GPS, seat belts, padded dashboards, windows, crumple zones, suspension, electric engine starters, or any number of improvements that could be argued to be bloat.

XHTML2 is the way we should go.
Re:I prefer XHTML 2, thanks by maxume · 2007-12-16 12:22 · Score: 1

If there was no business on the web, your attitude might work. Seeing as business pays for most of it (I think somebody is buying all those ads...), don't be surprised when one of their concerns ends up being important. What concern? The one where people trying to use their website don't see some weird error message just because there was a '>' in the wrong spot.

Yeah, lots of business pages show users stupid things all the time, but they aren't doing that by choice, they are doing that because they are incompetent. Competent businesses want html5, not xhtml.

--
Nerd rage is the funniest rage.
Re:I prefer XHTML 2, thanks by Anonymous Coward · 2007-12-16 12:46 · Score: 0

One standard for smart people who know programming and actually work with an engineering mindset, another for those who see the web as a big graffiti and work with an "anything goes" mindset. No thanks, I prefer ONE standard for smart people, XHTML v2, and just to kick out everyone who isn't qualified.

The problem is your sites look like ass.
Re:I prefer XHTML 2, thanks by gbjbaanb · 2007-12-16 12:46 · Score: 1

Wrong, businesses want websites that work and look pretty (to summarise). If there's a misplaced '>' in their page, they'll get the web designer to fix it.

Competent businesses don't care about technology for technology's sake.
Re:I prefer XHTML 2, thanks by csnydermvpsoft · 2007-12-16 12:47 · Score: 1

The one where people trying to use their website don't see some weird error message just because there was a '>' in the wrong spot.

With a non-strict language, that misplaced bracket will behave differently in different browsers. In a strict language, that would never be deployed by any competent admin, who would be running everything through a validator before deployment. For the same reason that I prefer strongly-typed programming languages, I prefer strict semantic languages - bugs are discovered much sooner and handled more uniformly.
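
To make the "strict" half of that concrete, here is a toy sketch of stop-at-the-first-error checking. It is not a real validator and not how any browser or the W3C validator works: it only tracks tag nesting and ignores comments, CDATA, doctypes, and attribute values that contain '>'. The function name is made up.

```javascript
// Return a description of the first nesting error in the markup,
// or null if every opened tag is closed in the right order.
function firstNestingError(markup) {
  // Captures: optional leading '/', tag name, optional trailing '/'.
  const tagPattern = /<(\/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*?(\/?)>/g;
  const openTags = [];
  let match;
  while ((match = tagPattern.exec(markup)) !== null) {
    const isClosing = match[1] === '/';
    const name = match[2];
    const isSelfClosing = match[3] === '/';
    if (isClosing) {
      const expected = openTags.pop();
      if (expected === undefined) return `unexpected </${name}>`;
      if (expected !== name) return `expected </${expected}> but found </${name}>`;
    } else if (!isSelfClosing) {
      openTags.push(name);
    }
  }
  if (openTags.length > 0) {
    return `<${openTags[openTags.length - 1]}> was never closed`;
  }
  return null;
}
```

Run against '<div><p>hi</div>' it reports that </p> was expected, where a tag-soup parser would have to guess. Something along these lines in a pre-deployment step is what catches the problem before a visitor ever sees it.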
Re:I prefer XHTML 2, thanks by maxume · 2007-12-16 13:35 · Score: 1

So who out there in the business world is whole heartedly devoted to xhtml?

The problems with using xhtml in IE muddy the waters, but there doesn't seem to be a whole lot of traction for the strict error handling that is supposed to come with xhtml.

--
Nerd rage is the funniest rage.
Re:I prefer XHTML 2, thanks by maxume · 2007-12-16 13:58 · Score: 2, Interesting

One of the main goals of html5 is to formalize error handling. It accounts for many edge cases that html4 didn't specify, mostly by looking at what browsers do to handle html4. There are already parsers available for several different languages. There are enough existing broken pages out there that this might work pretty well.

The problem is that the worry present in dealing with the strict language often results in no benefits over just using something non-strict. Given that IE (at least 6, I don't know about 7) doesn't properly handle xhtml, there is really no way of saying whether the current situation between html4 and xhtml has anything to do with preferences on the deployment side, as it doesn't work to deploy xhtml. My guess is that people with money on the line would prefer to show a (probably somewhat) broken page rather than an error message, but this is just a guess.

There isn't anything stopping anyone from validating html4, it just has a relaxed idea of what to do if an error ends up in some output. Hopefully we can agree that there is room to differ.

--
Nerd rage is the funniest rage.
Re:I prefer XHTML 2, thanks by Cracell · 2007-12-16 18:32 · Score: 1

I'm a Ruby on Rails Web Developer and have been working with XHTML and HTML for a few years. When I first read about XHTML and understood it, I was like "awesome" they have all of the concepts down correctly, this is the solution to the web problem. But it's not.

First off. The current web works. Almost all sites that you want to be on render decently well in IE, Safari, Opera, and Firefox. It's usable and being used.

Now there's room for improvement, heck I've cursed while trying to use CSS over and over. Looks perfect in one browser, hideous in another. And then it'll just not behave as my references say it will. gah. But it does work eventually just with frustration.

Anyhow back to XHTML vs HTML.

XHTML is the proper thoughts and theory, but it just doesn't make sense in practice in many ways. Whereas HTML works great in practice. This is why the browsers aren't supporting XHTML 2.

Lastly there's no reason to make pages break if their formatting is a little off. That's dumb, the Internet is so much about freedom of information including the freedom to post it. Of course if they don't know what they are doing they should lean towards stuff like Google pages. But I've made my dumb mistakes and I'm glad my pages don't just break because I forgot to close my last div tag.

--
Signatures are so 90s
Re:I prefer XHTML 2, thanks by Stu+Charlton · 2007-12-17 07:32 · Score: 1

No thanks, I prefer ONE standard for smart people, XHTML v2, and just to kick out everyone who isn't qualified.

You realize that by saying this you are dooming XHTML v2. A technology that excludes the masses will be eventually trampled under.

--
-Stu

cynical, but true by r00t · 2007-12-16 11:10 · Score: 0, Troll

I often wish for an Open Source browser brave enough to say "screw the W3C, we're going to be IE compatible". I suppose it's OK to leave out the exploitable buffer overflows. I want the rest though.

Recognize the popular ActiveX controls, providing Open Source substitutes when possible. Feed any remaining ActiveX crap into Wine, with appropriate sandboxing.

Do the VBscript stuff.

Do the DirectAnimation stuff.

Ignore MIME types; they get lost anyway when you save the files. ...and so on, etc., ...

Being "right" just isn't worth the trouble. This isn't a fight worth fighting.

Re:why is this tagged internet by Hai-Etlik · 2007-12-16 11:31 · Score: 1

If they were implying that the web is the Internet, I'd agree with you. However, the web is PART of the Internet, and an important part, so the tag seems quite fair. The point of categories like this is to group related articles, they have to cover broader concepts than the articles themselves.

Where's the databinding? by srijon · 2007-12-16 11:59 · Score: 2, Insightful

What the web is crying out for is a standard that supports a rich data hierarchy, a rich presentation hierarchy, and a databinding mechanism to connect these two (preferably without using CSS, but that's another debate).

That's exactly where the next-gen UI frameworks have gone (Flex from Adobe, XAML from Microsoft). These frameworks represent the wave of the future and that's where the web needs to go too.

Meanwhile, the web standards community spouts all this rhetoric of "separating presentation and semantics" in HTML/CSS, which is nonsense. Both HTML and CSS are precisely concerned with presentation. And they are not at all separate. You need to know and love both to coax good looking pages out of a browser. All this huffing and puffing, yet the best they can offer for application-specific data models is microformats!

As far as I can tell, both HTML 5 and XHTML 2 are icing on the cake, and missing the main course altogether.

Amen! (Re:reboot the web!) by Tablizer · 2007-12-16 12:20 · Score: 1

Amen!

Web Browsers and DHTML/DOM/JS are meant for "e-brochures", yet people are trying to bend them into everything, and it gets uuuugly.

Java's probably the closest to what we really need; however, it needs to simplify its GUI APIs (most of its APIs are bureaucratic, in fact), de-link the GUI engine from specific languages, and/or allow some kind of scripting/dynamic-typed option, and go OSS.

--
Table-ized A.I.

Re:I bet my ass.. by Tablizer · 2007-12-16 12:28 · Score: 1

Anyone thinking of clicking on the parent's link (to vumit.com) should realize that it's a goatsex-style shocker page.

Vumitting is what I did when I first saw goatse.

--
Table-ized A.I.

Pedantic tip of the day by Anonymous Coward · 2007-12-16 12:35 · Score: 0

The word is "clique", not "click".

Re:Where is Microsoft? by kestasjk · 2007-12-16 13:33 · Score: 1

The monopoly thing doesn't really work when talking about web browsers any more. Firefox has shown that a better browser will gain market share quickly whether or not it comes bundled.

That's why it's so absurd that Opera has just launched an anti-competition lawsuit against MS, saying MS is blocking competitors out, while Firefox is still steadily steamrolling ahead.

The other good thing about standards is that you want Microsoft to be involved, whether you think they're evil or not; you want them to be part of creating the standard, to agree on the standard, and to stick to the standard.

--
// MD_Update(&m,buf,j);

Re:I bet my ass.. by Firehed · 2007-12-16 13:43 · Score: 1

Too late *cry*

oh ffs, the captcha is "jerking". Thanks for adding insult to injury, Slashdot.

--
How are sites slashdotted when nobody reads TFAs?

Missing the point? by cavebison · 2007-12-16 15:31 · Score: 2, Insightful

The beauty of the web was that anyone could put up a web page.

All you "standards nazis" out there, please don't forget that. The web is for everyone, yes, even those who can't write HTML "properly".

Hopefully browsers will always render badly formed HTML, otherwise the web will be a poorer place for it.

Re:Missing the point? by jabernathy · 2007-12-16 22:17 · Score: 2, Insightful

Good standards and better tools would solve a lot of the problems. An interactive HTML validator should be smart enough to translate the code into something conforming to standards while asking the user "what did you mean by this incorrect syntax?"
Re:Missing the point? by cavebison · 2007-12-16 23:33 · Score: 1

True... I think we're yet to see a good web page designer that doesn't require the user to be a good web page designer. You can create a blog at blogger etc, but making your own site as you want it is still difficult.

Then again, as a dev, I don't want to be put out of work.. I knew there was a conspiracy in there somewhere.
Re:Missing the point? by musicmaster · 2007-12-17 00:14 · Score: 1

I totally agree.

When I compare DIVs to tables, the tables are much more flexible, especially when you need a solution that works on different browsers and window widths. Instead of building on the strengths of tables to make something better, the standards gurus developed, with divs, something that misses many of the strengths of tables. Now we have two half solutions instead of one whole.
Re:Missing the point? by cavebison · 2007-12-17 01:14 · Score: 1

Exactly. I feel that tables should have been more focussed on instead of less so. The colspan/rowspan requirements are difficult to manage for average users, and when building apps to generate them. Simple things, like a table attribute meaning "inherit width from previous table". That alone would be very useful, minimising the need to calculate colspans when breaking up tables of data with headings etc. You're guaranteed the tables will render to the same width (assuming it's variable). Tables could be made so much more powerful, less restrictive, easier to use. I'm not ashamed to be a table fan!
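
For anyone who hasn't hit this, a small sketch of the colspan bookkeeping being described, i.e. what a generating app has to do today. The data shape and function name are made up for the example, and cell text is assumed to be already HTML-escaped.

```javascript
// Render one table in which each section heading spans all data columns.
// sections: [{ heading: string, rows: string[][] }]
function renderSectionedTable(sections) {
  // The widest data row decides how many columns every heading must span.
  const columnCount = Math.max(
    0,
    ...sections.map((s) => Math.max(0, ...s.rows.map((row) => row.length)))
  );
  const lines = ['<table>'];
  for (const section of sections) {
    lines.push(`  <tr><th colspan="${columnCount}">${section.heading}</th></tr>`);
    for (const row of section.rows) {
      const cells = row.map((cell) => `<td>${cell}</td>`).join('');
      lines.push(`  <tr>${cells}</tr>`);
    }
  }
  lines.push('</table>');
  return lines.join('\n');
}
```

Note that the span has to be computed over every section before the first heading can be written. That is the calculation an "inherit width from previous table" attribute would make unnecessary.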

Whatever shall we do? by Anonymous Coward · 2007-12-16 16:37 · Score: 0

Uhh, the direction of browsers is to XSLT transform whatever format they receive into one that is compatible with the format they render. It's not so much a quandary, but a waiting game to see what formats they'll need to transform.

To imagine it any other way is silly.

Ok. by Almahtar · 2007-12-16 17:08 · Score: 1

So, like C++?

Re:Where is Microsoft? by Anonymous Coward · 2007-12-16 19:07 · Score: 1, Informative

for example you don't need anything other than typical mousedown event handlers for things like Google Maps

I'm sorry, but that just sounds awful. If you've used MapQuest before Google Maps, then went back after using GMaps, you'd see what I mean. There is a lot of stuff going on under the hood related to the mouse events, data loading, caching, etc that HAVE to be there in order for that experience to be as good as it is.

Sure, those techniques have been there since Netscape 2... and without XMLHttpRequest, we'd be back to the 1998 status quo.

Re:I bet my ass.. by ThirdPrize · 2007-12-16 21:40 · Score: 1

While you are at it you could redirect them to a Linux site and tell them to reformat their machine and install Ubuntu on it as well. That was sarcasm if you hadn't guessed.

--
I have excellent Karma and I am not afraid to Troll it.

Pragmatic vs Academic by Dikeman · 2007-12-16 21:48 · Score: 2, Interesting

Recently, I've had the privilege to work with people who were presiding over both ISO committees and W3C committees. What struck me was their tendency to create standards that were on a high academic level. At the same time any pragmatic argument failed to be of any influence on the standard.
Although this leads to standards that are a pleasure to those who like the philosophical aspect of representation of and interaction with information - and I'm certainly one of them - it also leads to standards that will never be used.

In the real world outside ISO and W3C, mundane arguments, like cost of implementation, degree of skill needed to work with those standards, ease of transition, etc, etc. *are* of importance and will influence the standard that will prevail in the end.

Although I can enjoy the academic approach to a new standard, I have to say that as owner of an IT company my hopes are on the pragmatic approach of HTML V5.

BTW: The job I did for those ISO guys (they didn't work fulltime for ISO) was to map the ISO standard they had developed to a practical implementation in the organisation they worked for, after they had failed to do so themselves, so go figure.

bias in everything by igjav · 2007-12-16 22:11 · Score: 1

HTML V5 rewritten as an XML dialect will probably not be an XHTML V5, as XHTML uses its own naming scheme and branching for development. Unfortunately, HTML has diverged in two opposite directions, and the HTML V5 one is simply better for everyday users/developers.

So when article author says:

(...) If you only use XHTML V1 because of its XML compliance but you prefer the new features in HTML V5, you might appreciate XHTML V5 (HTML V5 rewritten as an XML dialect). (...)

... he is biased (unfortunately) not toward the correct interpretation but toward the current misunderstanding.

Don't care as long as... by squoozer · 2007-12-16 23:03 · Score: 1

...they provide us with a way to produce something that works vaguely like a modern desktop GUI that doesn't take 6 months to get working and even then have more bugs than Scotland in midge season!

The web applications produced in the last few years are a real testament to dogged determination in the face of insurmountable odds, but come on, we need a rethink regarding (interactive) web applications if they are going to really take off.

--
I used to have a better sig but it broke.

Re:I bet my ass.. by EsbenMoseHansen · 2007-12-17 00:23 · Score: 1

By the way... this comment is borderline what I'm talking about and so is this one or this one.

You sound like someone who thinks they have the slightest clue to exactly how the end result will be rendered. This is probably the most common fallacy in web design, and probably the reason why most web sites are designed so atrociously. A common symptom is a web page that only takes up the middle bit of the page, or font sizes specified in pixel sizes. Those web designers fail to take into account

screen DPI
viewer eye sight
viewer distance to monitor
browser differences
font rendering engine differences
user preferences

The best thing to do is to trust that most browsers are not set up by users. Instead, they are set up by some distribution or vendor, which probably went to some pains to choose default serif, sans-serif, italic and monospace fonts that are sensible, well rendered and available. In some cases they might even have made sure that the font size is reasonable, though this aspect is almost impossible without user interaction (see e.g. viewer eye sight and distance to monitor). So leave the main text at the browser default, and scale the rest around that. Then apply all the other good design principles you have and like (such as no more than 2 menus, avoid tabs in tabs, never more than 10 or so items in one page and so on and on), and the page will turn out alright. Then make a specific stylesheet for IE users, making the adjustments I believe are necessary there.

Now, most web designers do not do that. Which is why I tend to use font zoom a lot, and have this bit in my user stylesheet:

p, ol, ul, td, body, th, div { font-family: sans-serif !important; } em { font-style: italic !important; }

It might not be perfect, but it makes many pages a lot more readable --- and better-looking, too.

--
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.

HTML5 vs. XHTML5 by Nurgled · 2007-12-17 00:32 · Score: 1

HTML5 and "XHTML5" are two serializations of the same "language". The HTML5 spec explains how to transform an HTML5 text into a DOM, while XHTML5 just uses a normal XML parser. Both result in a DOM, and the way that DOM is interpreted is consistent between the two serializations.
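
A rough illustration of the "two serializations, one document" idea, using only Python's standard library. This is a sketch under assumptions: `html.parser` is a lenient tokenizer, not the HTML5 parsing algorithm, and the helper names (`TagCollector`, `tags_from_html`, `tags_from_xhtml`) and sample markup are made up for the example. It only compares the order in which elements appear, not a full DOM.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records element names in document order while reading HTML syntax."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_from_html(text):
    # Lenient parse: unclosed <p> and bare <br> are accepted.
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


def tags_from_xhtml(text):
    # Strict parse: the input must be well-formed XML.
    root = ET.fromstring(text)
    # ElementTree prefixes each tag with "{namespace}"; strip that prefix.
    return [el.tag.split("}")[-1] for el in root.iter()]


# Hypothetical sample documents: same content, two syntaxes.
html_text = "<html><head><title>Hi</title></head><body><p>One<br><p>Two</body></html>"
xhtml_text = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Hi</title></head>'
    "<body><p>One<br/></p><p>Two</p></body></html>"
)

print(tags_from_html(html_text))
print(tags_from_xhtml(xhtml_text))
```

Both calls list the same elements in the same order, even though only the second input is well-formed XML.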

Re:HTML5 vs. XHTML5 by uzytkownik · 2007-12-17 00:57 · Score: 1

That's what I thought. And since XHTML 5 can be embedded in an XML document like any other XML format, we have a problem ;) as far as I understand.

--
I've probably left my head... somewhere. Please wait untill I find it.
Homepage: http://blog.piechotka.com.pl/

Very few people would use such a language by walterbyrd · 2007-12-17 01:07 · Score: 1

As long as browsers support slop, developers will write slop. A browser that doesn't support slop, won't be used.

When? 2012? 2022? And how many people care? by walterbyrd · 2007-12-17 01:19 · Score: 1

If these standards are not going to be completed until 2012, or later, that just means that the web will be even more entrenched in old standards, and people will be even more reluctant to change.

Very few people pay attention to the current standards. Why is anybody going to pay attention to the new "standards"?

My own standard (jerkwads) by Anonymous Coward · 2007-12-17 01:44 · Score: 0

...with Black Jack! And hookers! In fact, forget the standard!

Can we put block-level elements in paragraphs? by jonadab · 2007-12-17 02:23 · Score: 1

All I want to know about upcoming XHTML versions is, can we finally put block-level elements inside paragraphs? We've been whining about the inability to do this, and basically not _using_ paragraphs as such because we can't, for over a decade. It's, as far as I'm concerned, the *one* problem with XHTML as it stands.

As far as HTML5, I thought XHTML was supposed to *be* HTML5, conceptually if not in name. Having received the benefits of well-formed markup, why would anyone ever want to go back to the old "Maybe this element is inside of that paragraph, or maybe it's after it, depending on where we decide the elided close tag belongs" way of doing things? I for one don't *EVER* want to deal with non-wellformed markup again. Make it go away.

--
Cut that out, or I will ship you to Norilsk in a box.

Re:Where is Microsoft? by nuzak · 2007-12-17 03:33 · Score: 1

> What a world we live in when the supposed alpha geeks are the laggards of a technology bell curve!!

Not really. Grumbling old hidebound greybeards that scoff at every fancy-dancy technology are just displaying a variation of the insecure know-it-all and above-it-all-ism that pervades geek culture. Really, it's not just limited to geeks -- these days, everyone thinks themselves an expert on everything after watching a few episodes of 24 and Dr Phil -- it's just that geeks have the knowledge in the technology area for the constant scoffing to sound plausible.

--
Done with slashdot, done with nerds, getting a life.

HTML Validator by falconwolf · 2007-12-17 04:13 · Score: 1

I haven't looked at it for quite a while, but at least for many years, the "CSE HTML Validator" wasn't actually a validator

Thanks for that. I wonder why I didn't hear this earlier. See, several years ago for classes in college we had to write, and validate, XHTML, and the professors had us use CSE's HTML Validator. They arranged a deal with CSE for students to buy an unlimited version at a reduced cost; there was a free 50-use version, but students could easily use it more than 50 times.

Falcon

--
Should there be a Law?

Early HTML 5 implementation by neutrino38 · 2007-12-17 05:18 · Score: 1

Mmmm, difficult to take any side at the moment, but this is yet another format war. I guess that it is urgent to make the two proposals converge; otherwise we might waste energy on duplicate standard maintenance and implementation.

Just a simple question: this post on Ars Technica describes a pretty cool example of a web page using HTML5. Can XHTML gurus tell us how this would be done using XHTML 1.1?

Re:Where is Microsoft? by Selanit · 2007-12-17 05:35 · Score: 1

Okay, fine. The primary JavaScript library for the Google Maps API, in which the basic functionality of the thing is created, can be found here:

http://www.google.com/intl/en_us/mapfiles/95/maps2/main.js

Load that source code somewhere - in your browser is fine, or you can open it in a text editor. Search for the string "XMLHttpRequest". On line 228 you will find a function called NE(), which creates a new XMLHttpRequest object and returns it. (In IE, it uses Microsoft.XMLHTTP, but the two are functionally identical.)

The NE() function is called repeatedly by other functions, which are themselves called by yet other functions. XMLHttpRequest is essential to the functionality of Google Maps.

And as for Firebug, I'm sure you looked at the panel showing network activity. If you look at the requests originating from JavaScript, you'll see that they return JavaScript code which does not have surrounding HTML. Here's an example from a recent Google Maps session. If this were being accessed via an IFRAME element in the requesting document, the JavaScript data would have to be wrapped in HTML in order to be parsed by the browser so that the JavaScript could be accessed by the parent frame. But in this case you can see that it does not have the accompanying HTML. It's plain JavaScript. In the absence of that HTML, the web browser loading it into an IFRAME would treat it as plain text, because it is. However, information which has been retrieved using an XMLHttpRequest does not need any surrounding HTML, because it is being handled directly by the JavaScript. It can be easily executed using a call to eval(), preferably after doing some security checks to make sure that the code is coming from a trusted source.

I hope you have found this educational.

Re:Where is Microsoft? by Per+Wigren · 2007-12-17 05:37 · Score: 1

It's a problem when it comes to web standards though. It will still be a good while before you'll be able to use CSS3 and even CSS2 on public webpages without resorting to adding IE-specific hacks, let alone things like SVG and Canvas...

--
My other account has a 3-digit UID.

client-side security isn't security by decavolt · 2007-12-17 06:24 · Score: 1

...and what happens when a user views your site with an older or non-compliant browser that doesn't properly handle the tag you'd be relying on? You can NOT implement your security client-side and expect it to be anything more than a speed bump for those that want to circumvent it. HTML is not meant to handle security in any way, and it shouldn't be expected to, ever, for the obvious reasons. What's stopping you from doing the same with server-side code, and why on earth wouldn't you prefer that to client-side? Your tag is a horrible idea, but you might see it in some upcoming version of IE anyway.

It only works to a certain depth... by Anonymous Coward · 2007-12-17 08:08 · Score: 0

Regexes are completely the wrong tool for handling HTML. Even what he said about entity encoding them can be dangerous, because removing 'bad' things doesn't always make it safer. And regexes can only handle tags that are nested no deeper than some level.

Which is, of course, why one uses the proper tool to handle the data.

Mind you, I got this information from the Perl regex book, so it's not like I hate regexes or Perl or anything. I mean, I did my first multi-threaded code by modifying a JAPH (yeah, *that* JAPH...). For that matter, Perl has a bunch of very nice parsers that can handle tag soup without mangling it. There's a nice recursive descent parser, not to mention one or three specifically for HTML (and XML, etc.).

Then again, if I look at who I'm replying to, I recognize that nickname as being the same as one of the Perl dev's nickname, so you probably already knew that. Looking at your homepage, you probably are *that* chromatic, so I'll just shut up now.

Re:It only works to a certain depth... by chromatic · 2007-12-17 12:01 · Score: 1

Regexes are completely the wrong tool for handling HTML.

That was my point as well; I just wasn't as clear about it as I could have been. I've never seen an unbreakable HTML-parsing regular expression.
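
To make the "removing bad things doesn't always make it safer" point concrete, here is a small Python sketch. The filter, the pattern and the payload are invented for illustration and are not taken from any real site's code; the only claim is that a single-pass regex removal can itself assemble the thing it was meant to remove, while escaping everything leaves no markup at all.

```python
import re
from html import escape

# Hypothetical naive filter: delete anything that looks like a script element.
SCRIPT_RE = re.compile(r"<script.*?>.*?</script>", re.IGNORECASE | re.DOTALL)


def naive_strip(text):
    """Single-pass regex removal of the kind being warned against."""
    return SCRIPT_RE.sub("", text)


# Crafted input: each inner <script></script> pair sits inside a split tag name.
payload = "<scr<script></script>ipt>alert(1)</scr<script></script>ipt>"

filtered = naive_strip(payload)  # deleting the inner pairs joins "<scr" + "ipt>"
escaped = escape(payload)        # every "<" and ">" becomes an entity instead

print(filtered)
print(escaped)
```

Here `filtered` comes out as `<script>alert(1)</script>`, so the filter has manufactured a working script element, whereas `escaped` contains no angle brackets for a browser to interpret. Escaping only suits sites that want no user markup at all; allowing a safe subset still needs a real parser.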

--
how to invest, a novice's guide

Dreamweaver? by Nodlehs · 2007-12-17 08:14 · Score: 1

You lost any respect when you mentioned using Dreamweaver... Using Dreamweaver is like using those auto game creation tools to make games, (MMORPG Creater 10.5.21, Click a button, get a game!), what you end up with is a steaming pile of crap. If people want to be weekend web site developers, let them learn the proper way. Using a product to butcher the code for you (that you would then end up trying to learn from) is the worst way to go.

If you really want to learn, get a book. Study the basic precepts of the subject you want to learn. Then practice, practice, and practice some more.

Re:Dreamweaver? by Anonymous Coward · 2007-12-17 14:08 · Score: 0

the post had nothing to do with dreamweaver but a fleeting mention.

Rolse? by Anonymous Coward · 2007-12-17 08:26 · Score: 0

You mean the roles attribute... right? My brain had a parse error trying to figure out what you meant :-(

Maybe my brain only works with XHTML ...

Another War! by Soiden · 2007-12-17 09:14 · Score: 1

Let the HTML Wars begin!

--
Minti: What's that huge shuriken in your back?! Kin: It's the instrument of my victory.

Re:I bet my ass.. by Raenex · 2007-12-17 13:44 · Score: 1

Hopefully CmdrTaco will require you to use javascript or flash to post comments so we can finally weed those people out :-) I do almost all my web browsing without Javascript or Flash and use my own colors and fonts. Oddly enough, it actually works for the vast majority of sites I visit, including Slashdot. The thing is most web pages don't need dynamic behavior, and don't need to be over-specified.

difference by softdevs · 2007-12-17 17:56 · Score: 1

What's the difference between HTML V5 and XHTML V2??? web development

Nope, not at all. by r00t · 2007-12-19 19:33 · Score: 1

I'm not sure how it is that you're misunderstanding me, but I damn well do know how this stuff works. I've even written a web server (like everybody and their dog, right?) and plenty of code to parse HTML.

Now, to an extent, there is something dynamic: pages are being automatically generated. It doesn't matter when. The pages can be cached, or not. They can be generated and kept forever, served out identically to every visitor. It just doesn't matter, except for web server performance.

To attack, one supplies data that will wind up inside the page. (a forum post, an email, etc.) It is at this moment that the attacker has his one and only chance to guess the random secret. The page is generated either right then, or repeatedly in the future. The attacker can now see it, but so what? He lost, and can not fix his error. His next attempt will be on a fresh new page with a fresh new secret. Knowledge of previously generated pages is completely useless to him.

Re:Nope, not at all. by uhlume · 2007-12-19 20:12 · Score: 1

It is at this moment that the attacker has his one and only chance to guess the random secret.

I think you're missing the point again: the attacker doesn't need to guess anything; if the "hard to guess string" is contained in a static HTML tag, as the GP specifies, the attacker need only view the source of the page. The only way to potentially avoid this issue is to generate the string dynamically each time the page is loaded. Even this could theoretically be circumvented by client-side scripting, unless DOM access to the tag is crippled in some way by the browser. Regardless, load-time generation of the random string would require a dynamic language, which HTML is not, and probably should not become.

The GP's suggestion could undoubtedly be implemented with some success by means of server-side parsing via Perl/PHP/ASP/et al — I'm fairly certain I've seen similar schemes in practice — but as described, at least, it just doesn't make any sense as an addition to the HTML specification.

--
SIERRA TANGO FOXTROT UNIFORM

Re:Nope, not at all. by r00t · 2007-12-20 06:06 · Score: 1

Right, one can view the source. That's too late for the attacker though, because the attacker needs to get a successful guess embedded into that very page. He needed to make his guess before the page was generated.

Client-side scripting is blocked of course; that was the whole point of the new tag.

Remember how the proposed tag works: it disables everything not explicitly allowed. Just prior to the potentially hostile data, the web site places the opening tag. The closing tag is only accepted if it contains the secret. The secret may become public knowledge after the page is generated and served for the first time. At that point the potentially hostile data is cast in stone. The attacker can't go back and fix his error. ("Oh, now I know the answer, let me go back and fix my error..." is not possible)
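
A toy model of that browser-side rule, as a Python sketch. The `restricton`/`restrictoff` tags are only the proposal from this thread; no browser implements them, and the class name, the `classify` helper and the sample page are assumptions made up for the example.

```python
from html.parser import HTMLParser


class RestrictAwareParser(HTMLParser):
    """Toy model of a browser honouring the proposed (hypothetical) tags."""

    def __init__(self):
        super().__init__()
        self.lock = None    # secret of the currently open restriction, if any
        self.segments = []  # (text, restricted?) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "restricton" and self.lock is None:
            # Opening tag: remember the secret; nested openers are ignored.
            self.lock = attrs.get("lock")
        elif tag == "restrictoff" and attrs.get("lock") == self.lock:
            # Only a closing tag carrying the matching secret lifts the restriction.
            self.lock = None

    def handle_data(self, data):
        if data.strip():
            self.segments.append((data.strip(), self.lock is not None))


def classify(page):
    parser = RestrictAwareParser()
    parser.feed(page)
    parser.close()
    return parser.segments


# Hypothetical page: the hostile content tries to close the block with a wrong guess.
page = (
    "before"
    '<restricton lock="s3cret" />'
    "hostile one"
    '<restrictoff lock="guess" />'
    "hostile two"
    '<restrictoff lock="s3cret" />'
    "after"
)

for text, restricted in classify(page):
    print(text, "restricted" if restricted else "unrestricted")
```

The wrong guess is simply ignored, so "hostile two" is still treated as restricted; only the tag carrying the real secret re-enables features for "after".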
Re:Nope, not at all. by r00t · 2007-12-20 06:12 · Score: 1

Maybe the source of confusion has to do with where the potentially hostile content is coming from.

It's supplied once. The server writes out a foo.html file containing it. This file is never generated prior to the potentially hostile content.

The server is NOT making the page with an INSERT-CONTENT-HERE thing and dynamically slurping in fresh new hostile content each time the page is served. The server is NOT causing the client to slurp in fresh new hostile content from an untrusted web site whenever the page is rendered.

Re:Nope, not at all. by uhlume · 2007-12-20 10:36 · Score: 1

You keep talking about the server. What does the server have to do with an HTML tag, or vice versa? Web servers don't speak HTML, they speak HTTP. Again, you seem to be discussing some sort of server-side dynamic language, which HTML definitively is not.

--
SIERRA TANGO FOXTROT UNIFORM

Re:Nope, not at all. by r00t · 2007-12-21 16:00 · Score: 1

Example: slashdot.org is a web server that creates HTML whenever you post a comment and/or read a comment. (pages may be cached until a new comment is added)

You can put HTML into your comment. Slashdot.org runs a giant perl script which embeds your code into a web page, along with other stuff (ads, logout link, etc.) from many other sources. After you see the result, you can't go modify your comment. You can add a new comment, but then the old web page is destroyed (deleted from the slashdot.org server) and a new web page is generated.

Slashdot tries to filter this. Suppose that the filter is broken, but that browsers support this new tag and slashdot uses it.

You put evil JavaScript code into your comment. Slashdot nests that into the new tag when the new page is generated. You view the page, discovering the secret. You post a new comment, intending to abuse your knowledge of the secret. Slashdot generates a fresh new page to contain your new comment, with fresh new secrets. Now you can learn the new secrets by viewing the page, but again it does you no good. Every time you try to embed evil JavaScript, the old secrets (which you now know) will be replaced by the giant perl script that generates web pages.
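
The server side of that scenario could look roughly like this in Python. It is a sketch, not Slashdot's actual code: `render_comment` is a hypothetical helper, the tags are the proposal from this thread, and the retry loop for a secret that already appears in the submitted text is an extra precaution added here, not part of the original proposal.

```python
import secrets


def render_comment(untrusted_html):
    """Wrap untrusted content in the proposed (hypothetical) tags with a fresh secret."""
    lock = secrets.token_hex(16)
    # Extra precaution (an assumption): never reuse a string the content already contains.
    while lock in untrusted_html:
        lock = secrets.token_hex(16)
    page = (
        f'<restricton lock="{lock}" except="safe-html" />'
        f"{untrusted_html}"
        f'<restrictoff lock="{lock}" />'
    )
    return lock, page


# First post: the attacker views the page and learns its secret.
old_lock, first_page = render_comment("hello")

# Second post: the attacker embeds the old secret, but the page is regenerated.
attack = f'<restrictoff lock="{old_lock}" /><script>evil()</script>'
new_lock, second_page = render_comment(attack)

print(old_lock != new_lock)
print(second_page)
```

Since the secret is chosen after the hostile text is already fixed, the stale `restrictoff` in the second page carries the wrong lock and a supporting browser would ignore it.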

lynxcache mirror by DoctorEternal · 2007-12-20 02:00 · Score: 1

lynxcache mirror: http://lynxcache.com/HTML_V5_and_XHTML_V2.html

344 comments