Varnish Author Suggests SPDY Should Be Viewed As a Prototype
An anonymous reader writes "The author of Varnish, Poul-Henning Kamp, has written an interesting critique of SPDY and the other draft protocols trying to become HTTP 2.0. He suggests none of the candidates make the cut. Quoting: 'Overall, I find the design approach taken in SPDY deeply flawed. For instance identifying the standardized HTTP headers, by a 4-byte length and textual name, and then applying a deflate compressor to save bandwidth is totally at odds with the job of HTTP routers which need to quickly extract the Host: header in order to route the traffic, preferably without committing extensive resources to each request. ... It is still unclear for me if or how SPDY can be used on TCP port 80 or if it will need a WKS allocation of its own, which would open a ton of issues with firewalling, filtering and proxying during deployment. (This is one of the things which makes it hard to avoid the feeling that SPDY really wants to do away with all the "middle-men") With my security-analyst hat on, I see a lot of DoS potential in the SPDY protocol, many ways in which the client can make the server expend resources, and foresee a lot of complexity in implementing the server side to mitigate and deflect malicious traffic.'"
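To make the quoted objection about the Host: header concrete, here is a minimal sketch in Python. It is not the real SPDY wire format: `encode_compressed` is a simplified stand-in that only borrows the two properties the quote mentions (a 4-byte length before each textual field, then deflate over the whole block), and all function names are made up for illustration. The point is only that the plain-text block can be scanned line by line, while the compressed block has to be inflated before any field can be found.

```python
import struct
import zlib


def encode_plain(headers):
    """HTTP/1.1-style text header block."""
    lines = [f"{name}: {value}" for name, value in headers]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def encode_compressed(headers):
    """Simplified SPDY-like block (an assumption, not the real format):
    4-byte big-endian length before each name and value, then deflate."""
    raw = b""
    for name, value in headers:
        n = name.lower().encode("ascii")
        v = value.encode("ascii")
        raw += struct.pack("!I", len(n)) + n + struct.pack("!I", len(v)) + v
    return zlib.compress(raw)


def host_from_plain(block):
    """Scan the text block line by line; no per-request state is needed."""
    for line in block.split(b"\r\n"):
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii")
    return None


def host_from_compressed(block):
    """The whole block must be inflated before any field can be located."""
    raw = zlib.decompress(block)
    fields = []
    pos = 0
    while pos < len(raw):
        (length,) = struct.unpack_from("!I", raw, pos)
        pos += 4
        fields.append(raw[pos:pos + length])
        pos += length
    # Fields alternate name, value, name, value, ...
    for name, value in zip(fields[0::2], fields[1::2]):
        if name == b"host":
            return value.decode("ascii")
    return None


headers = [("Host", "example.com"), ("Accept", "*/*")]
print(host_from_plain(encode_plain(headers)))
print(host_from_compressed(encode_compressed(headers)))
```

Both calls find the same host; the difference the quote cares about is how much work and memory the second path costs a router handling many connections.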
The Internet Printing Protocol is a weird mash-up of HTTP and a proprietary binary format. God knows what they were smoking when they dreamt it up.
And most SSL implementations (not including newer TLS) can only handle one certificate and usually one host (not counting multi-host/wildcard certs), which pretty much negates his host comment.
Parsing an HTTP session with multi-part MIME attachments using chunked encoding is murderous. Now true, many people don't have to worry about this, but the fact is the protocol leaks like a sieve. For instance, you can't send a header after you've entered the body of the HTTP session. You can't mix chunked-length encoded elements with fixed content-length elements with HTTP 1.1. Once you've sent your headers and encoding, you're screwed. The web has a solution - AJAX, but then you need JavaScript.
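For readers who have not had to do it, here is a rough Python sketch of decoding just the chunked transfer coding, with the multipart MIME layer left out entirely. The function name `decode_chunked` is made up, and the sketch assumes the whole body is already in memory, which a real parser reading from a socket cannot assume. Chunk extensions are dropped, and any trailer fields after the final zero-length chunk are ignored rather than parsed.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body that is fully buffered in memory."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<hex size>[;extensions]\r\n".
        eol = data.index(b"\r\n", pos)
        size_field = data[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            # Last chunk; trailer fields (if any) are ignored in this sketch.
            break
        body += data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        pos += size + 2
    return bytes(body)


print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```

Even this reduced version has to track a position, a hex length, and a terminator per chunk before the multipart boundaries inside the body can be looked at.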
I'd be all for something new. I'd suggest basing it on XML, with a header section and header-element to get the transfer started, then accepting any kind of structured data including additional header elements. With this, you can still use HTTP headers for backwards compatibility, but once recognized as "HTTP 2.0" the structured XML can be used to set additional headers, etc. With the right rules, you can send chunks of files or headers in any arbitrary order and have them reconstructed.
Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
For instance identifying the standardized HTTP headers, by a 4-byte length and textual name, and then applying a deflate compressor to save bandwidth is totally at odds with the job of HTTP routers which need to quickly extract the Host: header in order to route the traffic, preferably without committing extensive resources to each request. ...
It seems to me that routing based on header is doing entirely the wrong thing. In any case, according to wikipedia:
TLS encryption is nearly ubiquitous in SPDY implementations
Which rather makes routing on content infeasible. (OK, you can route behind the SSL endpoint, but this doesn't seem to be what he's talking about.)
So, because you would have to design new security tools and think a different way in order to make it secure, does that make it flawed? Does this mean we are no longer free to innovate unless it fits into some mold? That is just stupid. If someone comes up with a new way of doing things, put on your REAL security hat and come up with a way to secure it, don't just spread FUD about how it is BAD!!
"My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
Much of what the web has become no longer fits the "fetch a document" model that HTTP (and Gopher before it) was designed for. This is why we have hacks like cookie managed sessions. We are effectively treating the document as a fat UDP datagram. The replacement ... and I do mean replacement, for HTTP, should integrate the session management with it, among other things. The replacement needs to hold the TCP connection (or better, the SCTP session) in place as a matter of course, integrated into the design, instead of patched around as HTTP does now. With SCTP, each stream can manage its own start and end, with a simpler encryption startup based on encrypted session management on stream 0. Then you can have multiple streams for a variety of services, from nailed-up streams for continuous audio/video to streams used on the fly for document fetch. No chunking is needed since it's all done in SCTP.
now we need to go OSS in diesel cars
According to the link, IE on Windows XP does not support TLS+SNI -- including IE 8.
Until this is fixed or a sufficient number of people migrate to a newer OS, TLS+SNI is still not viable for most websites.
Then it cannot replace HTTP and should be withdrawn, or it's been wrongfully sorted in under "HTTP/2.0 Proposals"
I have to agree with the FA. Even though the guy isn't always clear enough about issues with HTTP 1.1, it seems like SPDY is more of a fix/speedup for HTTP 1.1 than a true, forward-looking web protocol.
Of course though, industry favors incremental changes.
If someone proposed HTTP today, it wouldn't pass muster by these experts either. And I doubt that any of these new protocols really would make much of a difference anyway. The infrastructure has been built around HTTP, everybody knows how to compress it and everybody knows how to deal with the kind of multiple connections that it requires. If anything additional is really needed, it could be expressed as hints to the server and the intermediate infrastructure without starting from scratch.
As a static data format it's just about passable, but as a low overhead network protocol??
Wtf have you been smoking??
This is one of the things which makes it hard to avoid the feeling that SPDY really wants to do away with all the "middle-men"
Half the human race is middle-men, and they don't take kindly to being eliminated.
So what should operators of small web sites do in the 21 months between now and when Microsoft agrees to let XP die? Now that IPv4 addresses have become scarce, and home ISPs still haven't been pushing IPv6, shared hosting companies are able to charge double for the dedicated IP addresses needed to run SSL 3.0. Besides, Android 2.x devices are still being sold, and their SSL stack doesn't support SNI either.
Is this an author from the planet Varn? Or does he claim to have invented a yellowish coating?
Contribute to civilization: ari.aynrand.org/donate
JSON can also be validated against a schema, where the schema consists of a JavaScript file implementing isValid(parsed_object).
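A minimal sketch of that schema-as-code idea, written in Python rather than JavaScript. The required fields (`host`, `length`) are invented for illustration; the comment does not say what a real message would contain.

```python
import json


def is_valid(parsed_object) -> bool:
    """Hypothetical schema-as-code: the message must be an object with a
    non-empty string 'host' and a non-negative integer 'length'."""
    if not isinstance(parsed_object, dict):
        return False
    host = parsed_object.get("host")
    length = parsed_object.get("length")
    return (
        isinstance(host, str)
        and host != ""
        and isinstance(length, int)
        and not isinstance(length, bool)  # bool is a subclass of int in Python
        and length >= 0
    )


print(is_valid(json.loads('{"host": "example.com", "length": 3}')))
print(is_valid(json.loads('{"host": 1}')))
```

The validator is ordinary code, so it can express any rule, which is exactly what later replies in the thread object to.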
Wouldn't it be better to have the browser support zip/tarball paths?
Now
would look thru the zip file.
I suppose there could be some security issues here, but it seems like it would be easier than chunking protocols if not much faster.
Further ...
Now we've got cached apps as well.
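The example the commenter had in mind did not survive in the post, so the following is only a guess at the general idea: a single archive is fetched once, and individual resources are then looked up by their path inside it. This Python sketch uses the standard `zipfile` module; the function names and file names are hypothetical.

```python
import io
import zipfile


def build_bundle(files):
    """Pack a {path: bytes} mapping into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buf.getvalue()


def fetch_from_bundle(bundle, path):
    """Return the bytes stored at `path` in the archive, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        try:
            return archive.read(path)
        except KeyError:
            return None


bundle = build_bundle({
    "index.html": b"<h1>hi</h1>",
    "img/logo.png": b"\x89PNG",
})
print(fetch_from_bundle(bundle, "index.html"))
print(fetch_from_bundle(bundle, "missing.css"))
```

One trade-off worth noting: a zip file keeps its directory at the end of the archive, so a client generally needs the whole file before it can look anything up, whereas chunked or multiplexed delivery can start rendering earlier.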
I'd suggest base it on XML with a header section and header-element to get the transfer started then accept any kind of structured data including additional header elements.
Haven't we learned enough already from industrial pain to stay away from XML? JSON, BSON, YAML, compact RELAX NG, ASN.1, extended Backus-Naur Form. Any one of them, or something inspired by any (or all) of them, that is compact, unambiguous (there should be only one canonical form to encode a type), not necessarily readable, possibly binary, but efficiently easy to dump into an equally compact readable form. Compact and easy to parse/encode, with the lowest overhead possible. That's what one should look for.
But XML, no, no, no, for Christ's sake, no. XML was cool when we didn't know any better and we wanted to express everything as a document... oh, and the more verbose and readable, the better!!(10+1). We really didn't think it through that much back then. Let's not commit the same folly again, please.
HTTP 2.0 is not going to happen for a long time. SPDY is here now and both Firefox and Chrome support it.
But, the only companies using it are Google and Twitter. I'd like to see web hosting companies offer it as a service
SPDY is encrypted by design. There is no option for middle-men, and frankly, that is the way I like it myself, as I would assume most people do. I don't like when devices mess with my traffic.
As for most of the other complaints - given that Google is running SPDY just fine on all of its servers, and they're basically one of the largest (if not the largest) hosts on the internet, I think they are all strawmen. If it is working for Google then it will work for others.
My experience using SPDY, as a user, is nothing short of spectacular. The performance gains on Google properties with SPDY are incredible and very noticeable.
Just use firefox or chrome in XP, problem solved
Higuita
The problem all of these HTTP 2.0 proposals are trying to work around is the fact that each resource fetched by the web browser is handled via a separate connection. By combining these elements into a single (compressed) stream you can save a TON of overhead. This is why sites that use nothing but data: URI images load so much faster, even than sites using the fastest CDNs. These 'solutions' are just workarounds to the crap that is HTTP 1.1.
Of course, the problem with data: URIs is that they can't be cached if the page's content is dynamic. However, the fact that you don't have to open a hundred additional HTTP connections just to load the cached content (have to check if something changed!) more than makes up for the lack of caching.
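For anyone unfamiliar with the technique, a data: URI simply embeds the resource bytes, base64-encoded, in the page itself, so no separate request is made for it. A small Python sketch follows; the helper names are made up for illustration.

```python
import base64


def make_data_uri(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image_tag(content: bytes, mime_type: str) -> str:
    """Build an <img> tag whose source is embedded rather than fetched."""
    return f'<img src="{make_data_uri(content, mime_type)}">'


print(make_data_uri(b"hi", "text/plain"))  # data:text/plain;base64,aGk=
```

Base64 expands the payload by roughly a third, and the embedded bytes travel with every copy of the page, which is the caching cost the comment mentions.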
The real solution here is to just ditch HTTP and replace it with something like SCTP which can keep the connection open to the server and maintain the session in a secure fashion (negating the need for session-tracking cookies, hurray!). Having said that, such a change to the web would completely break the popular, N-servers-behind-a-load-balancer architecture. It would also negate the need for CDNs (for the most part)... Which is probably why many of the big-name vendors are proposing solutions that maintain the status quo.
-Riskable
"Those who choose proprietary software will pay for their decision!"
How hard can it be to write a JSON schema validator?
There is no reason why the same data in an XML schema can't be represented as a JSON structure and why you couldn't use that data to validate another arbitrary JSON data structure. In fact it should be easier since we'll be missing the useless and often abused distinction between attributes and tags that XML has.
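As a rough answer to "how hard can it be", here is a toy validator in Python where the schema is itself a JSON structure. The schema vocabulary (`type`, `required`, `properties`, `items`) is a small invented dialect for this sketch, not an implementation of any published schema standard, and it covers only a handful of types.

```python
import json

# Maps the toy schema's type names to Python types (an assumption of this sketch).
TYPE_NAMES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(value, schema) -> bool:
    """Check `value` against a schema expressed as plain JSON data."""
    kind = schema["type"]
    # bool is a subclass of int in Python, so reject it unless asked for.
    if isinstance(value, bool) and kind != "boolean":
        return False
    if not isinstance(value, TYPE_NAMES[kind]):
        return False
    if kind == "object":
        if any(name not in value for name in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        return all(validate(value[k], props[k]) for k in value if k in props)
    if kind == "array" and "items" in schema:
        return all(validate(item, schema["items"]) for item in value)
    return True


SCHEMA = json.loads("""
{"type": "object",
 "required": ["host"],
 "properties": {
   "host": {"type": "string"},
   "ports": {"type": "array", "items": {"type": "number"}}}}
""")

print(validate(json.loads('{"host": "example.com", "ports": [80, 443]}'), SCHEMA))
print(validate(json.loads('{"ports": [80]}'), SCHEMA))
```

Because the schema is data rather than code, it can be shipped, diffed, and read without running anything, at the cost of only expressing the rules the dialect knows about.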
In reality, almost no one uses XML schemas any more (even HTML has gone the HTML5 way from XHTML 1) because DTDs and their ilk are such a verbose PITA, to the point where alternative validation schemes sprung up to try and simplify it. But the main reason seems to be that no one really reads them, and that they often stopped representing the actual data sent over the wire because no one bothered maintaining them after the dedicated developer who carefully crafted the schema for the first release moved on.
But really any format that can express structured data is endorsed by me. I do not have a problem with JSON, in fact it is my 2nd favorite. My first favorite is Python's style, which is very, very close to JSON. But JSON has the advantage that web people already know it.
Please don't get bogged down with XML, I wrote XML into my post because despite what you all think, it's not that bad to parse, provided that you use a stream-reader style rather than SAX or DOM. The other reason why I wrote XML is because it does not pre-suppose any kind of scripting engine, so people would not be tempted to use code which would require a JavaScript interpreter, which would end up being a really bad idea.
XML, JSON, Python can all express structured data. They are all equally valid and anything expressed in one can be converted between them all.
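Assuming "stream-reader style" means a pull parser (the caller asks for the next event instead of registering SAX callbacks or building a full DOM), here is a Python sketch using `xml.etree.ElementTree.XMLPullParser`. The `<message>` and `<header>` element names are invented to match the proposal loosely; nothing in the thread defines an actual format.

```python
import xml.etree.ElementTree as ET


def collect_headers(chunks):
    """Feed XML text as it arrives and pick out <header> elements,
    wherever in the stream they appear."""
    parser = ET.XMLPullParser(events=("end",))
    headers = {}
    for chunk in chunks:
        parser.feed(chunk)
        # Pull whatever events are complete so far; partial elements wait.
        for _event, elem in parser.read_events():
            if elem.tag == "header":
                headers[elem.get("name")] = elem.text or ""
    parser.close()
    return headers


# Hypothetical stream, split at arbitrary points as a network might deliver it.
chunks = [
    '<message><header name="Host">exam',
    'ple.com</header><body>hi</body>',
    '<header name="X-Late">1</header></message>',
]
print(collect_headers(chunks))
```

Note that the second header arrives after the body, which is the "headers in any order" property the proposal asks for.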
Everyone should use a schema; but hardly anybody does. You only need one case of "just skip validation, it'll work but we haven't updated the schema yet". Then the schema slips further and further out of maintenance...
I'm not saying this is how it should be. I'm just saying that a lot of times, that's how it is. Also, you have to validate in the code on some level anyway. You'd be foolish to rely on the schema as your single point of failure. Either you put the schema on a pedestal and make it capable of validating everything, and allow coders to not validate in code, or you lean towards validation everywhere and let the schema slip. Since people are often writing code that doesn't have anything to do with XML, you don't want to throw input validation out as a best practice.
So... the schema is a nice redundant check, or first pass; but it really just isn't that important.
half of the purpose of a schema -- being a human-readable documentation of the data format
That purpose can be achieved with English.
Schema validation is a very clear example of a situation where it's not good to have a Turing-complete language.
If you specifically don't want something Turing complete when processing XML, then why do XML fans use XSLT despite its being Turing complete?
the user will use a browser which is supported. Imagine google only working with sni.
What is the sound of millions of users abandoning Google? "Bing."
For those who did read the article and didn't understand what the debate was about, here is a good slide show from Google about the advantages of SPDY, which also explains the "HTTP routers" issues raised in the article: http://www.slideshare.net/bjarlestam/spdy-11723049
but then you have to write the schema AND document it. This can (and does) lead to documentation being out of sync with the code.
It's not much different from a C# implementation of a mobile application for Windows Phone 7 falling out of sync with the Objective-C implementation of the same application for iOS. Or what am I missing?
Within a year, most end-users (in the US) will have access to IPv6 from their ISP. Within two years, most end-users will have replaced their non-IPv6 CPEs with ones which support IPv6.
So in other words, IPv6 from the backbone to a home PC's 802.11g radio will be deployed around the time the last mainstream non-SNI PC operating system is scheduled to die anyway.
Me, I'd probably drop support for XP, and let the end-user click through a cert warning if that's what they're inclined to do.
So how would you explain to the users that a blog, forum, or wiki is supposed to raise a serious certificate error after the user is logged in, and that HTTPS with such a serious error is safer for the user than an HTTP connection that can be Firesheeped?
How much more per month are we talking about for a dedicated IP, anyway?
The difference between $5 per month name-based shared hosting, which may put a thousand or more domains on one IPv4 address, and a VPS. You mention a $5 to $7 per month VPS plan; which provider do you recommend?
Seems cheap to me, especially compared to what joe already spent to get a valid SSL cert.
Personal use SSL certificates have been free of charge from StartCom for some time now.
As far as Android...a number of websites are pushing their users to use simple apps instead of the Android browser.
This is true of major web sites like Facebook, eBay, Amazon, and the like, where only one company hosts a particular web application code base. But a lot of smaller web sites run open source web applications, customized with plug-in modules, on top of interchangeable LAMP servers. Is there a standard WordPress app, a standard phpBB 3 app, or a standard MediaWiki app? (I'd Google it, but I'm composing this post offline.) Or must each web site operator duplicate effort in developing an Android app from scratch in parallel with the web site and then walking the users through turning on Unknown sources?
After April 2014, Microsoft will refuse to supply security updates for Windows XP. Security holes will be discovered, documented, and exploited to the point where connecting an XP machine to the Internet is an invitation to get owned. "If you run XP, you will be hacked." How is that not enough to discourage people from continuing to use XP?
SPDY defenders sound as bad as the Apple fans. SPDY has many flaws and only focuses on the needs of Google. As TFA mentioned, there are limitations to its one-size-fits-all approach.
HTTP keeps it simple, which is why it has worked well for so long. It actually has sessions... with the lame cookie header. Sure, it seems slow, but we've had it for decades with slow computers and slow bandwidth amplifying its flaws. Yet today, when we have plenty of speed, we are all upset and must change it because it is slowing us down? I've always been a harsh critic of HTTP, but people, keep some perspective and apply a little wisdom.
HTTP is easily extended for certain features:
>File stream that fixes what kept pipelining from adoption. Mandatory deflate and UTF-8 support would remove the need for the extra encoding headers (but those could be optional before being phased out.) Going UTF-8 for protocol headers would not be a small change and would break compatibility (phase it in really slowly.)
>Standardized Session ID header; instead of using cookie which makes session management a manual chore. This would help with sessions.
>Server push-- HTTP push existed outside of Internet Exploder since the 90s.
>Keep-alive
Mandatory encryption is wasteful of battery power. If Google wants to help with encryption, they should be trying to fix this cert signing scam we all depend upon today.
Some features would be better placed into another protocol such as video streaming; we shouldn't be hacking up port 80 simply because of the number of network admins who are assholes. I frankly do not see why we should have a stateless protocol on top of TCP when a new UDP based protocol could be used for that. As for maintaining connections channels, we probably could do with something less than sockets and more than TCP (SOAP) - for really complex stuff websockets can be used.
Ahh... so google properties have converted to this ..
I wondered why my browser turns to crap and hangs on google so often.
That's assuming the sites work at all -- so far, Google Groups has gone completely dark for me... nothing comes up but an input line asking for groups... but nothing will come up... all JavaScript enabled, and nothing blocked, yet it doesn't work anymore...
You might look at your assumptions about how well it works...
Lemme guess your browser -- 'Chrome'?
Something that needs to be done anyway is to allow servers to detect two things during a crunch and just drop the data into the void: duplicate email data (spam, mass mailings) and video streams. ;-)
Sure, the grand idea of interconnectedness and everybody streaming "Lost" to every television in America (and then sharing it on Facebook with each other) is a pretty Unicorn Fantasy, but pretty much anything involving video is unnecessary in the sense of the web being a useful tool. Just make it 'unreliable' and you have an immediate fix for bandwidth problems. Somewhere in there could be a detector of cats and dogs if you want to maintain the illusion that teleconferencing is useful.
The server needs to be able to say, "Nobody needs to see that."
Is crap like this (to watch a stupid badger video on forbes.com, you have to load [i.e. trial & error with NoScript] JavaScript from around 25 external websites - pure idiocy). Better implement a feedback mechanism inside browsers that lets web developers who do these things know their website is a slow pile of turd and forget about the useless SPDY that totally misses the point.
"I love my job, but I hate talking to people like you" (Freddie Mercury)