HTTP Intermediary Layer From Google Could Dramatically Speed Up the Web
grmoc writes "As part of the 'Let's make the web faster' initiative, we (a few engineers -- including me! -- at Google, and hopefully people all across the community soon!) are experimenting with alternative protocols to help reduce the latency of Web pages. One of these experiments is SPDY (pronounced 'SPeeDY'), an application-layer protocol (essentially a shim between HTTP and the bits on the wire) for transporting content over the web, designed specifically for minimal latency. In addition to a rough specification for the protocol, we have hacked SPDY into the Google Chrome browser (because it's what we're familiar with) and a simple server testbed. Using these hacked up bits, we compared the performance of many of the top 25 and top 300 websites over both HTTP and SPDY, and have observed those pages load, on average, about twice as fast using SPDY. That's not bad! We hope to engage the open source community to contribute ideas, feedback, code (we've open sourced the protocol, etc!), and test results."
Now we can see Uncle Goatse twice as fast.
In the future, the content will be loaded before you click! Unfortunately, it's not like that today, so I didn't make the first post...
Remove Flash, Java applets, and ads.
20X faster!
From the link
We downloaded 25 of the "top 100" websites over simulated home network connections, with 1% packet loss. We ran the downloads 10 times for each site, and calculated the average page load time for each site, and across all sites. The results show a speedup over HTTP of 27% - 60% in page load time over plain TCP (without SSL), and 39% - 55% over SSL.
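For anyone who wants to check figures like these against their own measurements, here is a minimal sketch of the arithmetic the quote describes: average the runs per site, then compare protocols. It assumes "speedup" means the percentage reduction in average page load time; the site names and timings are made up for illustration and are not Google's data.

```python
from statistics import mean

def speedup_percent(http_seconds, spdy_seconds):
    """Percentage reduction in load time relative to HTTP (assumed definition)."""
    return 100.0 * (http_seconds - spdy_seconds) / http_seconds

# Hypothetical load times in seconds, 3 runs per site (the quote used 10 runs).
http_runs = {"site-a": [3.1, 2.9, 3.0], "site-b": [1.9, 2.1, 2.0]}
spdy_runs = {"site-a": [1.6, 1.4, 1.5], "site-b": [1.2, 1.3, 1.1]}

# Average per site, then across all sites, as described in the quote.
http_avg = {site: mean(times) for site, times in http_runs.items()}
spdy_avg = {site: mean(times) for site, times in spdy_runs.items()}

for site in http_avg:
    print(site, round(speedup_percent(http_avg[site], spdy_avg[site]), 1))

overall = speedup_percent(mean(http_avg.values()), mean(spdy_avg.values()))
print("all sites", round(overall, 1))
```

Note that the result depends on which sites go into the average, which is exactly the selection problem raised in the reply below.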
1. Look at top 100 websites.
2. Choose the 25 which give you good numbers and ignore the rest.
3. PROFIT!
And all other "add this piece of Javascript to your Web page and make it more awesomer!"
Yes, yes, they're useful. And you can't fathom a future without them. But in the meantime I'm watching my status bar say, "completed 4 of 5 items", then change to "completed 11 of 27 items", to "completed 18 of 57 items", to "completed... oh screw this, you're downloading the whole Internet, just sit back, relax and watch the blinkenlights".
Remember when a 768kbps DSL line was whizzo fast? Because all it had to download was some simple HTML, maybe some gifs?
I want my old Internet back. And a pony.
Potato chips are a by-yourself food.
The problem isn't pushing the bits across the wire. Major sites that load slowly today (like Slashdot) typically do so because they have advertising code that blocks page display until the ad loads. The ad servers are the bottleneck. Look at the lower left of the Mozilla window and watch the "Waiting for ..." messages.
Even if you're blocking ad images, there's still the delay while successive "document.write" operations take place.
Then there are the sites that load massive amounts of canned CSS and Javascript. (Remember how CSS was supposed to make web pages shorter and faster to load? NOT.)
Then there are the sites that load a skeletal page which then makes multiple requests for XML for the actual content.
Loading the base page just isn't the problem.
No. Akamai gives boxes to ISPs that cache Akamai's customers' content closer to the ISP's customers. Akamai then uses logic it has built into DNS to redirect requests to the appliance closest to the requester.
So which ports are you planning to use for it?
AOL actually does something similar to this with their TopSpeed technology, and it does work very, very well. It has introduced features like multiplexed persistent connections to the intermediary layer, sending down just object deltas since last visit (for if-modified-since requests), and applying gzip compression to uncompressed objects on the wire. It's one of the best technologies they've introduced. And, in full disclosure, I was proud to be a part of the team that made it all possible. It's too bad all of this is specific to the AOL software, so I'm glad a name like Google is trying to open up these kinds of features to the general internet.
Here's an onion to hang on your belt, grandpa.
Now, on a more serious note, isn't Gopher a faster protocol than HTTP? Could we just use it to transport HTML, pictures, etc.?
What ? Me, worry ?
They need to start by practicing what they preach...
http://code.google.com/speed/articles/caching.html
http://code.google.com/speed/articles/prefetching.html
http://code.google.com/speed/articles/optimizing-html.html
They turn on caching for everything but then spit out junk like
http://v9.lscache4.c.youtube.com/generate_204?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&fexp=903900%2C903206&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1258081200&key=yt1&signature=8214C5787766320D138B1764BF009CF62A596FF9.D86886CFF40DB7F847246D653E9D3AA5B1D18610&factor=1.25&id=ccbfe79256f2b5b6
Most cache programs just straight up ignore this because of the '?' in there. It looks like a query, even though it points at static data.
Then never mind the load balancing bits they put in there with 'v9.lscache4.c.'. So even IF you get your cache to keep the data you may end up with a totally different server and the same piece of data just served from another server. There have been a few hacks to 'rewrite' the headers and the names to make it stick. But those are just hacks and while they work they seem fragile.
The real issue is at the HTTP layer and how servers are pointed at from inside the 'code'. So instead of some sort of indirection that would make it simple for the client to say 'these 20 servers have the same bit of data' they must assume that the data is different from every server.
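The kind of 'rewrite' hack mentioned above can be sketched roughly like this: fold every mirror hostname into one canonical name and keep only the query parameters that seem to identify the object, so a local cache sees one key instead of many. Everything here is an assumption for illustration: the hostname pattern is guessed from the single URL above, `CANONICAL_HOST` is a made-up placeholder, and treating `id` and `itag` as the identifying parameters is a guess, not documented behavior.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: mirrors look like v<N>.lscache<N>.c.youtube.com (guessed from one URL).
MIRROR_HOST = re.compile(r"^v\d+\.lscache\d+\.c\.youtube\.com$")
# Hypothetical placeholder name used only as a cache key, never fetched.
CANONICAL_HOST = "lscache.invalid"
# Assumption: these parameters identify the object; the rest vary per request.
IDENTITY_PARAMS = {"id", "itag"}

def cache_key(url):
    """Return a normalized URL to use as a cache key for mirrored content."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if MIRROR_HOST.match(host):
        host = CANONICAL_HOST
    # Keep only the identifying parameters, sorted so order does not matter.
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in IDENTITY_PARAMS)
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))

a = cache_key("http://v9.lscache4.c.youtube.com/x?itag=34&expire=1&id=abc")
b = cache_key("http://v2.lscache7.c.youtube.com/x?id=abc&itag=34&expire=2")
print(a == b)
```

This is as fragile as it sounds: if the hostname scheme or the parameter meanings change, the cache either misses or, worse, serves the wrong object.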
Compression and JavaScript speedups are all well and good, but there is a different, more fundamental problem: reloading data that has already been retrieved, when local network access is almost always faster than going back out to the internet. In a single-user environment this is not too big of a deal. But in a 10+ user environment it is a MUCH bigger deal.
Even the page that talks about optimization has issues
http://code.google.com/speed/articles/
There are 12 CR/LFs right at the top of the page that are not rendered anywhere. They should look at themselves first.
While we're at it, let's also make processing web pages faster.
We have a semantic language (HTML) and a language that describes how to present that (CSS), right? This is good, let's keep it that way.
But things aren't as good as they could be. On the semantic side, we have many elements in the language that don't really convey any semantic information, and a lot of semantics there isn't an element for. On the presentation side, well, suffice it to say that there are a _lot_ of things that cannot be done, and others that can be done, but only with ugly kludges. Meanwhile, processing and rendering HTML and CSS takes a lot of resources.
Here is my proposal:
- For the semantics, let's introduce an extensible language. Imagine it as a sort of programming language, where the standard library has elements for common things like paragraphs, hyperlinks, headings, etc. and there are additional libraries which add more specialized elements, e.g. there could be a library for web fora (or blogs, if you prefer), a library for screenshot galleries, etc.
- For the presentation, let's introduce something that actually supports the features of the presentation medium. For example, for presentation on desktop operating systems, you would have support for things like buttons and checkboxes, fonts, drawing primitives, and events like keypresses and mouse clicks. Again, this should be a modular system, where you can, for example, have a library to implement the look of your website, which you can then re-use in all your pages.
- Introduce a standard for the distribution of the various modules, to facilitate re-use (no having to download a huge library on every page load).
- It could be beneficial to define both a textual, human readable form and a binary form that can be efficiently parsed by computers. Combined with a mapping between the two, you can have the best of both worlds: efficient processing by machine, and readable by humans.
- There needn't actually be separate languages for semantics, presentation and scripting; it can all be done in a single language, thus simplifying things.
I'd be working on this if my job didn't take so much time and energy, but, as it is, I'm just throwing these ideas out here.
Please correct me if I got my facts wrong.
How about we don't use HTTP/HTML for things they were not designed or ever intended to do? You know, that "right tool for the right job" thing.
There is no "-1 offended" or "-1 you don't agree with me" mod options for a reason.
It's not all rosy, as the short documentation page explains. While they are trying to maximize throughput and minimize latency, they are hurting other areas. Two obvious downsides I see are:
1. The server would now have to hold the connection to the client open throughout the client's session, and also keep the associated resources in memory. While this may not be a problem for Google and its seemingly limitless processing power, Joe Webmaster will see his web server's load average increase significantly. HTTP servers usually give you control over this with the HTTP keep-alive time and max connections/children settings. If the server is now required to keep the connections open, it would spell more hardware for many/most websites;
2. Requiring compression seems silly to me. This would increase the processing power required on the web server (see above), and also on the client - think underpowered portable devices. It needs to stay optional - if the client and server both play and prefer compression, then they should do it; if not, then let them be; also keeping in mind that all images, video and other multimedia are already compressed - so adding compression to these items would increase the server/client load _and_ increase payload.
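The point about already-compressed media is easy to check. Here is a minimal sketch using Python's zlib (DEFLATE, the same algorithm gzip uses). Pseudo-random bytes stand in for a JPEG or video payload, which is an assumption: real media files are not perfectly random, but they are similarly hard to shrink further.

```python
import random
import zlib

def compressed_size(payload: bytes) -> int:
    """Size in bytes of the payload after zlib (DEFLATE) compression."""
    return len(zlib.compress(payload))

# Repetitive markup, standing in for an uncompressed HTML page.
text = b"<p>Hello, web!</p>\n" * 500

# Pseudo-random bytes, standing in for already-compressed media (assumption).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

print("markup:", len(text), "->", compressed_size(text))
print("noise: ", len(noise), "->", compressed_size(noise))
```

The markup shrinks dramatically, while the noise comes out no smaller than it went in, so the CPU time spent on it buys nothing.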
oldermanwholikestofondleyou.cx
To follow the goatse.cx standard, I believe it should be http://oldermanwholikestofondleyour.co.ck
It's only $250 to register a .co.ck address!
The good news is that SPDY seems to build on the SMUX ( http://www.w3.org/TR/WD-mux ) and MUX protocols that were designed as part of the HTTP-NG effort, so at least we're not reinventing the wheel. Now we have to decide what color to paint it.
Next up: immediate support in Firefox, WebKit, and Apache -- and deafening silence from IE and IIS.
Gopher is not installed by default, kiddie...
Gopher is installed by default on most builds of Firefox. Try this in your address bar: gopher://gopher.floodgap.com/1/world
Paid Q&A/Research
Someone already invented this.
It's called Opera browser
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
>>>Gopher predates HTTP by a fair number of years.
Not correct. Gopher and HTTP were both released in summer 1991, so virtually the same birthdate. However gopher was available on the IBM PC that same year while HTTP was still confined to Unix systems, so that's why people misremember gopher as being first. (HTTP came to IBM PC, Macs, and Amigas in 1993.)
If they really wanted a faster web, they would have minimized the protocol name. Taking out vowels isn't enough.
The protocol should be renamed to just 's'.
That's 3 fewer bytes per request.
I can haz goolge internship?
Most of the features of Fasterfox are found in about:config. There is no sense in installing an add-on that will slow the browser down when the browser already has pipelining and prefetching (albeit disabled).