HTTP 2.0 Will Be a Binary Protocol
earlzdotnet writes "A working copy of the HTTP 2.0 spec has been released. Unlike previous versions of the HTTP protocol, this version will be a binary format, for better or worse. However, this protocol is also completely optional: 'This document is an alternative to, but does not obsolete the HTTP/1.1 message format or protocol. HTTP's existing semantics remain unchanged.'"
HTTP is the world's most popular protocol and it's bloated and slow to parse.
It's nice to have a link to the draft. But couldn't we have just a little more "what's new" than it's binary? This is slashdot... Filled with highly techincal people. At least a rundown of the proposed changes would be very helpful in a discussion. The fact that they're proposing a binary protocol doesn't really matter to anyone besides anyone who wants to telnet to a port and read the protocol directly.
Can't wait to use the new 046102 047111 005113 tag!
I deny that I have not avoided attaining the opposite of that which I do not want.
HTTP already does that with pipelining. One connection, many files, optional compression on the body. The header is tiny.
Support my political activism on Patreon.
The big change is allowing multiplexing in one stream. It's a lot like how Flash multiplexes streams.
Makes it harder to troubleshoot by using telnet to send basic HTTP commands
Since we're using a tool in the first place, it's just as easy to use a tool that understands the binary format. Back before open source toolchains had really caught on as a concept, human readable formats were a big plus, because proprietary tools could be hard to come by. Not really a concern these days, as long as the binary format is unencumbered.
Socialism: a lie told by totalitarians and believed by fools.
It's nice to have a link to the draft. But couldn't we have just a little more "what's new" than it's binary? This is slashdot... Filled with highly techincal people. At least a rundown of the proposed changes would be very helpful in a discussion. The fact that they're proposing a binary protocol doesn't really matter to anyone besides anyone who wants to telnet to a port and read the protocol directly.
from quick glance, multiple transfers and communications channels("streams" in the drafts lingo) can be put through the single connection, cutting tcp connection negotiations.
world was created 5 seconds before this post as it is.
Draft 0 was SPDY. It's usually the way that standards evolve from one proposal; cut and shut standards don't often work out.
This is FAR from a done deal. The binary/ASCII question is being hotly debated.
the joys of debugging x.400 and reading a x.409 dump *NOT*
Why is it that news Internet standards seem to want to go down the route that the OSI did -
HTTP/2.0 defines stream multiplexing, framing, stream control, prioritizing - pretty much replicating TCP. What is the point of putting TCP-like protocol on top of TCP ?
I know! We'll call this mythical virtual machine something catchy, like "Flash".
John
...unless you're on an embedded platform for which you don't have a compiler, and maybe busybox might build in this fancy new binary HTTP client tool in a few decades, but it'll be another few decades after that before manufacturers enable it and ship it.
Seems it's going binary to have EVERYTHING be a stream, with frame based communications, different types of frames denoting different types of data and your "web app" responsible for detecting and handling these different frames. Now I get that there's a lot of demand for something more than Web Socket, and I know that non-Adobe video streaming such as HLS are pathetic, but this strikes me as terrible.
Really, why recraft HTTP instead of recrafting browsers? Why not get Web Socket nailed down? Is it really that hard for Google and Apple to compete with Adobe that instead of creating their own Streaming Media Services they need HTTP2.0 to force every server to be a streaming media server?
Adobe's been sending live streams from user devices to internet services and binary based data communication via RTMP for several years, but HTML5 has yet to deliver on the bandied about "Device API" or whatever it's called this week even though HTML5 pundits have been bashing on Flash for years.
So if Adobe is really that bad and Flash sucks that much, why are we re-inventing HTTP to do what Flash has been doing for years?
Why can't these players like Apple and Google do this with their web browsers, or is it because none of these players really wants to work together because no one really trusts each other?
At the end of the day, we all know it's all just one big clusterfuck of companies trying to get in on the market Adobe has with video and the only way to make this happen in a cross-platform way is to make it the new HTTP standard. So instead of a simple text based protocol, we will now be saddled with streaming services that really aren't suited to the relatively static text content that comprises the vast majority of web content.
But who knows, maybe I'm totally wrong and we really do need every web page delivered over a binary stream in a format not too different from what we see with video.
Think of it like a car, but instead of wheels, frame and engine, it uses binary.
With TCP, you need a separate port number for each service. For example, a Doom server runs on port 666, and an RDP server runs on port 3389. With Web Sockets, you can put all services behind one port and give each a separate path. For example, ws://example.com/doom lets a client open a Doom session, and ws://example.com/rdp lets a client open an RDP session. The advantage of using a path instead of a port number is that it can host a far larger number of services. For example, two users of a server could run game servers on ws://example.com/~WaffleMonster/doom and ws://example.com/~tepples/doom.
Let's say that you use a test client to send commands to your custom server interface and there's a bug. Now you have to spend extra time to discover if the bug is in the test client or in your custom server.
You have it backwards. Before open source caught on, binary formats were all the rage. They were proprietary and they were very prone to corruption. Once a document became corrupted, you were at the mercy of the developers to come up with a tool that could reverse the damage. When open source caught on, it pushed hard for using XML as the format for storing and transmitting information. Data corruption is much easier to spot with clear text and can even be easily fixed compared to binary data. In this respect, HTTP 2.0 is a complete step backwards.
The payloads are already often gzipped - but it's one connection per one file most of the time. If you need images, CSS, Javascript, more tiny images, then those are all separate HTTP connections and on some servers, they are serially requested over one connection. Combine it into one connection, and you can parallelize the download process into one connection, prioritize HTML over resources, and only send cookies once instead of once per file.
Why does it have to be Binary?
That's so twentieth century.
We should be using a Hexadecimal format!
"A person is smart. People are dumb, panicky dangerous animals and you know it." - K
What part of the term "HyperTEXT" did the working group fail to understand?
Really, outside of snooping/privacy discoveries, this is the SINGLE WORST piece of news I have seen in 15 + years on Slashdot. We could see this coming with the push for DRM in the HTTP spec. Here we watch the other shoe drop.
"They" don't like this web. They are entrenched media and information companies, and the elite financial and political powers that rely on framing/controlling public information - while collecting rents for doing so.
You think that your issues with security and control of your own platform are bad now? Wait til your browser is rejected by sites you need to do business with - because you won't parse HTML/HTTP 2.0. Linking this with crap like Intel TPM/TXT and you are well into Orwell Telescreen territory.
"Optional"? Five years from now, we'll all see if you can get your driver's license, pay your phone bill or shop at Amazon without this one...
"Flyin' in just a sweet place,
Never been known to fail..."
The rationale for http-2.0 is available in the http-bis charter. Quoting the spec:...
As part of the HTTP/2.0 work, the following issues are explicitly called out for consideration:
It is expected that HTTP/2.0 will:
Why won't they focus on what really matters? HTTP is still missing sane authentication techniques. Almost all websites require users to log in, so it should be a priority that a log in mechanism be supported in HTTP as opposed to being the responsibility of every web developer individually (don't you dare mention HTTP Basic Authentication). Relying on cookies alone to provide authentication is a joke. Not all websites afford to buy certificates or the performance penalty of HTTPS and security over HTTP is practically non-existent.
The HTTP protocol is very primitive and it has not evolved on par with the evolution of cryptography and security. A lot better privacy, confidentiality, authentication can be obtained with very little cost. Because most websites allow you to log in, the server and the client share a common piece of information that a third party does not, namely the client's password (or some digest of that). Because of this shared information, it should be far easier to provide security over HTTP (at least for existing users). I find it laughable that we still rely on a piece of fixed string sent as plain-text (the session cookie) to identify ourselves. Far better mechanisms exist (besides the expensive HTTPS) to guarantee some security, and it should not be the responsibility of the web developer to implement these mechanisms.
At the very least, HTTP 2.0 should support password-authenticated key agreement and client authentication using certificates.
While signing up can still be problematic in terms of security, logging in need not be. There's a huge difference between being exposed once and being exposed every time. If there was no Eve to monitor your registration on the website, then there should be no Eve that can harm you any longer. You have already managed to share some secret information with the server, there is no reason to expose yourself every time you log in by sending your password in plain text and then sending a session in plaintext every time you make a request. That is, a session which can be used by anyone.
While it may be acceptable for HTTP1.1 to not address such security concerns, I would find it disturbing for HTTP2.0 not to address them either.
To get an intuitive idea on how easy it can be to have "safer proofs" that you are indeed logged in, consider the following scenario: you have registered to the website without an attacker interfering; you would like to log-in, so you perform an EKE with the server and you now have a shared secret; every time you send a request to that website, you send some cookie _authenticated = sha2(shared_key || incremental_request_number). Obviously, even if Eve sees that cookie, it cannot authenticate itself in the next request, because it does not know the shared_key and thus cannot compute the "next hash".
This is just an example proof-of-concept technique to get an idea on how much better we can do. Many other cheap techniques can be employed. This one has the overhead of only one hash computation.
Given that HTTP is a tremendously used protocol, it does make sense to make it as space-efficient as possible, being responsible for a large portion of the bandwidth. I do believe it matters on a large scale. However, given the arguments above, this should not be their priority.
Embrace...Extend...Extinguish.
I come here for the love
Unix, when it was new, was radical in that everything was in ordinary ascii text files. Everyone else "knew" that you had to work in binary, have binary config databases, binary file systems with dozens of record type and so on. With each binary format you had to provide a binary editor and/or debugger. If something broke, you needed a high priest of that particular religion just to debug it, much less fix it.
Note how many Unixes you see for each machine running GCOS or PRIMOS. Of all the machines of the day that still exist, note that most z/OS files are simple EBCDIC. Over time, that square wheel quietly went away.
When the PC came along, application designers once again started doing everything in binary, plus the occasional DOS text file. If something broke, you needed to go back to the vendor, because programs didn't come with binary editors. Or you could get a high priest of a particular order to take it apart in a debugger.
And, just to add injury to insult, a 64-bit binary floating point zero is four times the length of an ascii zero followed by a space or newline. Spreadsheet files in binary were ~ 4 times larger that the ones in DOS text my company used (;-)) Turns out spreadsheet files are mostly empty, or mostly contain zeros.
Over time, lots of config files and data files became ascii or UTF-8, and a huge number fo data files became html or xml text files. And that square wheel went away.
Let a hypertext file be a sequence of bytes, separated by newline characters. Let the text be a sequence of bytes, optionally using multiple bytes per character, as in UTF-8.
Verily I say unto you, let it be simple, lest it be stupid. Round wheels are better than square, or ones that are just flat on one side
--dave
davecb@spamcop.net
Exactly. It is useful to be able to demonstrate that a given request/response occurs with minimal interference. Otherwise there is always questions as to whether FireFox or Curl is sending a request 'differently' somehow; being able to show that a given behavior is reproducible with a request issued over least-common-denominator telnet is inarguable.
Additionally, telnet is nearly ubiquitous while protocol analyzers are much harder to find, plus are often forbidden on desktops in large corporate environments as a security issue either due to their sniffing capability or for innate vulnerabilities.