Google Shares Insights On Accelerating Web Sites
miller60 writes "The average web page takes 4.9 seconds to load and includes 320 KB of content, according to Google executive Urs Holzle. In his keynote at the O'Reilly Velocity conference on web performance, Holzle said that competition from Chrome has made Internet Explorer and Firefox faster. He also cited the potential for refinements to TCP, DNS, and SSL/TLS to make the web a much faster place, and cited compressing headers as a powerful performance booster. Holzle also noted that Google's ranking algorithm now includes a penalty for sites that load too slowly."
If only Slashdot loaded faster I could have had my first post!
Now if only every website didn't include 300kb of Javascript libraries that I had to download.
That's what noscript is for. With noscript, your browser doesn't even download the .js files.
How many times will their crawler check a slowly loading website before they penalize it?
"Human kind cannot bear very much reality" ~T.S. Eliot
I find my browsing goes faster if I just yell at my housemate to stop downloading torrents that are *ahem* 'Barely Legal'.
I saw my browser waiting on google-analytics.com quite often before I started using No-Script.
Why do sites put up with an ad server or analytics service that slows the site down by a large amount?
If I have nothing to hide, don't search me
This really begs the question of what it tries to load. If it simply loads the html, then JavaScript laden sites and Flash sites will have the edge over simple information sites that serve dynamic content. However, if they load all referenced content, then the reverse may be true.
I would like it if the latter were true. What could be better than every Flash site being seen as a large bundle of data that simply displays "This site requires Flash"? When I surf the web, I surf for content, not pretty pictures. In my opinion, if a site can't simultaneously be surfed in Lynx, read in Braille, and parsed with a spider, then it really isn't a web site.
There are 10 commandments: 01)Thou shalt love the Lord Thy God 10)Thou shalt love thy neighbour as thyself.Matt22:34-40
Google's ranking algorithm now includes a penalty for sites that load too slowly.
I'm not sure how I feel about this. My initial response was a happy one, but the more I think about it, the more it seems to be unnecessarily discriminating against those who are too far away from the bleeding edge. Do we really live in a world where 'Speed=Good' so completely that we need to penalize those who don't run fast enough? And where are we drawing the line between 'fast' and 'slow'?
There's no inherent reason that Java should be slow. I run a discussion site (linked in my sig) that's running off an all-Java codebase, and while it has occasional load issues, we can render the content for the front page of the site in 20 ms or less (it's at the bottom of the page if you are curious). Java has a proper application model, so with smart use of singletons you can effectively keep the entire working set of a forum site in memory. Our performance is much poorer if you start browsing through archives, but that makes up a tiny percentage of our page views.
Not to discount how important it is to make your website as fast as possible but...
I doubt anyone with a decent internet connection is complaining about these 320k pages. Even on a cell phone it's not a big deal. As technology moves forward and speed improves even more these size related complaints will get less and less important.
Think about it - who complains about a 340k file on their hard drive anymore? I'm sure in the mid '80s lots of geeks rightfully griped about it.
That would make Google search results bad, right? When I search I want the site with the best information, not the one that loads fastest.
"He also cited the potential for refinements to TCP, DNS, and SSL/TLS to make the web a much faster place"
The core Internet protocol and infrastructure was and remains a conduit of innovation /because/ it is agnostic to HTTP and all other protocols. Optimizing for one small subset of its protocols and for a single kind of contemporary usage would discourage all kinds of innovation using protocols we've not conceived yet, and would be the single largest setback the modern Internet has seen.
There are 1.1... kinds of people.
Java really only has problems with startup time (which a web spider will never see) and the delay when a servlet or JSP is hit for the first time. While doing web development, we see that startup and first load most of the time, giving an appearance of slowness, but it is much better on a production server with regular traffic.
Most real-world page load delay today seems to be associated with advertising. Merely loading the initial content usually isn't too bad, although "content-management systems" can make it much worse, as overloaded databases struggle to "customize" the content. "Web 2.0" wasn't a win; pulling in all those big CSS and JavaScript libraries doesn't help load times.
We do some measurement in this area, as SiteTruth reads through sites trying to find a street address on each site rated. We never read more than 21 pages from a site, and for most sites, we can find a street address within 45 seconds, following links likely to lead to contact information. Only a few percent of sites go over 45 seconds for all those pages. Excessively slow sites tried recently include "directserv.org" (a link farm full of ads), "www.w3.org" (embarrassing), and "religioustolerance.org" (an underfunded nonprofit). We're not loading images, ads, Javascript, or CSS; that's pure page load delay. It's not that much of a problem, and we're seeing less of it than we did two years ago.
This is just a pointless degradation of the search results. Google needs to stop screwing around with the accuracy of their results like this or they too will find themselves displaced by a competitor that focuses on relevancy of the content and not irrelevant outside factors.
For the unbearably slow, there's always Google Cache.
For Flash heavy sites, will the time it takes for the Flash to load be taken into account? Or how about sites slowed down by all the external ads?
Jumpstart the tartan drive.
Holzle said that competition from Chrome has made Internet Explorer and Firefox faster.
Bull. Back when IE and Firefox's last major releases came out, Chrome was a tiny drop in the bucket market-share-wise. January was the first time it passed Safari in marketshare. I think it's more accurate to say that competition in general has led to companies improving their browsers. I'd bet we could also attribute the performance improvements to better standards compliance by websites, since there are now so many mainstream browsers.
I'd say that Firefox vs IE competition (and Firefox vs Safari on the mac) have inspired the improvements...
Please help metamoderate.
Where are they measuring *from*?
I've moved a site from Linode New Jersey to Linode London, UK because the target audience are in London ( http://www.lfgss.com/ ).
However in Google Webmaster Tools the page load time increased, suggesting that the measurements are being calculated from US datacentres, even though for the target audience the speed increased and page load time decreased.
I would like to see Google use the geographic target preference and to have the nearest datacentre to the target be the one that performs the measurement... or better still to have both a local and remote datacentre perform every measurement and then find a weighted time between them that might reflect real-world usage.
Otherwise if I'm being sent the message that I am being penalised for not hosting close to a Google datacentre from where the measurements are calculated, then I will end up moving there in spite of the fact that this isn't the right thing for my users.
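A rough sketch of the "weighted time" idea in Python. The function name, the 0.8 audience share and the sample timings are all made up for illustration; nothing here reflects how Google actually takes its measurements.

```python
def weighted_load_time(local_s, remote_s, local_share):
    """Blend two page-load measurements by where the audience actually is.

    local_s     -- load time (seconds) measured near the target audience
    remote_s    -- load time (seconds) measured from a distant datacentre
    local_share -- fraction of visitors near the local datacentre (0.0 to 1.0)
    """
    if not 0.0 <= local_share <= 1.0:
        raise ValueError("local_share must be between 0 and 1")
    return local_share * local_s + (1.0 - local_share) * remote_s


# Hypothetical numbers: a London-hosted site measured from London and from the US.
london_s, us_s = 0.9, 2.4
blended = weighted_load_time(london_s, us_s, local_share=0.8)
print(f"blended load time: {blended:.2f}s")
```

With a weighting like this, moving the server closer to the real audience lowers the blended figure even if the remote measurement gets worse.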
I'm guessing you don't like Flickr much..
Noscript doesn't turn off Javascript. Most browsers already have an option for that. What Noscript does is to make the control of Javascript (and Flash) much more fine grained and convenient.
Some typical cases:
1. Scripts on poor web sites just serve to detract from the content. Those you simply never turn on.
2. Scripts on good web sites improve access to content. Those sites you enable permanently the first time you visit (press the Noscript button in the lower right corner, and select "enable permanently") and forget about it.
3. Some web sites contain a mix of the two. Here you can either explicitly enable a specific object (by clicking on a placeholder, like with flashblock), or temporarily enable scripts for that site.
Basically, Noscript makes more, not less, of the web accessible. The good web sites you use normally will not be affected (as they all will be allowed to run scripts). But following links from social web sites like /. becomes a much more pleasant experience.
Of course, most of the noise scripts distracting from content are ads, so AdBlock gives you much of the same benefit. But I don't want to hide ads, as that is how the sites pay their bills.
Yeah, but that 320 KB is most likely divided over at least 30 HTML/CSS/JS/JPG/PNG/SWF files... And headers include lots of information, including cookies being sent back and forth, so the average headers are closer to 1000 bytes (around 500 each way) per request now. By my count that is around 30 KB, or an overhead of about 10%, so this does leave some room for improvement... But if you account for the fact that those 320 KB will most likely be transmitted with gzip compression, the bytes over the wire are closer to 100 KB (roughly), which brings the header overhead on the wire to a whopping 30%. Those headers can be compressed with standard gzip to bring them back down to around 8 KB, but if you took advantage of compression with a predefined dictionary optimized for HTTP headers you could shave off a lot more, and get well under the 30% overhead we currently waste.
I hope Google will kickstart this initiative by adding it to Google Chrome and contributing code to HTTP servers, this way Chrome will be even faster than competition and other browsers will have to keep up... I love it when competition works!
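To make the header arithmetic concrete, here is a minimal Python sketch. The first part just replays the numbers from the comment above. The second part compresses one request header with and without a preset dictionary using zlib's `zdict` parameter. The sample header and the dictionary contents are invented for illustration, and this is not a claim about how any browser or server actually does it.

```python
import zlib

# Figures from the comment: ~30 requests per page, ~1000 header bytes per
# request (both directions), 320 KB of content, ~100 KB of it after gzip.
REQUESTS = 30
HEADER_BYTES_PER_REQUEST = 1000
header_total = REQUESTS * HEADER_BYTES_PER_REQUEST


def overhead(header_bytes, body_bytes):
    """Header bytes as a fraction of body bytes."""
    return header_bytes / body_bytes


print(f"headers vs. uncompressed body: {overhead(header_total, 320_000):.0%}")
print(f"headers vs. gzipped body:      {overhead(header_total, 100_000):.0%}")

# Hypothetical request header (not captured from a real browser).
SAMPLE_HEADER = (
    b"GET /images/logo.png HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    b"Accept: text/html,application/xhtml+xml\r\n"
    b"Accept-Encoding: gzip,deflate\r\n"
    b"Accept-Language: en-us,en;q=0.5\r\n"
    b"Connection: keep-alive\r\n\r\n"
)

# Hypothetical preset dictionary: strings both ends agree on in advance.
PRESET_DICT = (
    b"GET HTTP/1.1\r\nHost: www.\r\nUser-Agent: Mozilla/5.0 \r\n"
    b"Accept: text/html,application/xhtml+xml\r\n"
    b"Accept-Encoding: gzip,deflate\r\n"
    b"Accept-Language: en-us,en;q=0.5\r\n"
    b"Connection: keep-alive\r\n\r\n"
)


def deflate(data, zdict=None):
    """Compress with zlib, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(data, zdict=None):
    """Decompress; the receiver must supply the same dictionary."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(data) + decomp.flush()


plain = deflate(SAMPLE_HEADER)
with_dict = deflate(SAMPLE_HEADER, PRESET_DICT)
print(f"raw header:       {len(SAMPLE_HEADER)} bytes")
print(f"zlib, no dict:    {len(plain)} bytes")
print(f"zlib, with dict:  {len(with_dict)} bytes")
```

A single short header has little internal repetition, so plain deflate barely helps. The dictionary supplies the repetition up front, which is why it matters most for small per-request payloads.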
Also knock it up a notch with View -> Page Style -> No Style.
This works really well for sites that put stories into tiny columns or use unreadable fonts.
Right, because how could Java possibly hope to compete with the blazing speeds of PHP and Ruby?
sic transit gloria mundi
I'll have to double-check my sites to be sure, but I think I'd be throwing a huge optimisation at any of my pages that got near 320KB, never mind averaging that large. That's just crazy-huge for a page given the amount of actual useful content that most pages have. If only people put in useful stuff instead of filling sites with pointless cruft.
JavaScript, HTTP and TCP have been bleeding for years now..
...JavaScript etc. is slow because of its design
it often makes my computers run like 10-year-old crap fighting ECC errors and I/O errors behind the scenes...this is not web 2.0, this is web 0.2
...HTTP is designed to download a web page, not to solve a cryptic JavaScript-initiated download puzzle
i don't really think anyone should save it - it should be replaced for good ... but links keep getting faster, and if that was enough for the last few years it will suffice until the burial of HTTP
so...to speed things up the headers should be altered and compressed? outcome: a smaller bandwidth footprint
I think Lynx is the wrong example to use, as it does not have Javascript.
Run Internet Explorer for a few days. Go to the sites you currently visit and see how unresponsive they are:
Enjoy the flash adverts sucking up your CPU.
Enjoy the diet and belly fat banners
Enjoy the accordion and menu animations
Enjoy the Google Analytics loading
Enjoy the updating banner ads
Enjoy loading Prototype/jQuery/Google Ads again and again for every site you go to
Enjoy vibrant media in-text adverts.
Enjoy some of the sneaky popunders and fake virus warnings
I recommend the no-Javascript experience to anyone.
Load up with RequestPolicy, NoScript and Adblock Plus and you're sorted for hiding the crap you don't want to see. The ad blocker is a way to block the stuff you accidentally let through with RequestPolicy.
As to the content producers, say Escapist Magazine, screw them. They obviously forgot about me when accepting that cheque from selling out. You don't block or try to counter adblockers. It's my computer, my bandwidth.
Slashdot needs Geekcode | Can anyone recommend any good SCIFI? My tastes: Foundation, Startide Rising, CITY, Ringworld,
If you're not going to be clicking adverts, I am sure it costs nobody money. It just costs them bandwidth. The adworld is mostly CPC/PPC.
Content websites seem to think that if I do not block an advert, I will actually click it. That is ridiculous!
My principle is that advertising is like a bribe, they paid to put it in my face. That is a product I have no interest in. I will learn about products when I have a need for them.
They are attempting to add it to Chromium and get it into HTTP servers.
http://dev.chromium.org/spdy
There is a link to the code somewhere there. If not, just search the chromium repo as that is where it lives.
It is open and you can play with it too. There is also an Apache mod for it, I believe, though I haven't searched for that lately.
More numbers and studies are available there.
disclosure: I'm one of the SPDY authors
I was thinking about this, but it might just be a penalty if your page loads really slow, say more than 10 seconds.
Google is still full of semi-dead links, when you click one, it takes forever to fetch the page - and half the time it never loads and the other half it doesn't load fully (so you get to see the ads, but not the content).
Pack all design (not content) pictures in one big picture, and use cropping to use the parts in the right places. Saves you separate requests and hence HTTP headers and establishing separate TCP connections.
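As an illustration of the cropping part, here is a small Python sketch that prints CSS rules for images packed left to right in one sheet. The function name, file name, class names and pixel sizes are all invented; it assumes every image sits on the top edge of the sheet.

```python
def sprite_css(sheet_url, images):
    """Return CSS rules for images packed left to right in one sprite sheet.

    images -- list of (css_class, width_px, height_px) in packing order
    """
    rules = []
    x = 0  # left edge of the current image inside the sheet
    for css_class, width, height in images:
        # A negative x offset shifts the sheet left so only this image shows.
        rules.append(
            f".{css_class} {{ background: url({sheet_url}) {-x}px 0 no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
        x += width
    return "\n".join(rules)


# Hypothetical design images: one request for the sheet instead of three.
css = sprite_css("sprites.png", [("logo", 120, 40), ("rss", 16, 16), ("arrow", 8, 16)])
print(css)
```

Each element gets the same background image, so the browser fetches it once and the per-image HTTP headers and connections go away.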
Also shorten all your links. A server-side script can handle the encoding and decoding. (But beware that this stops Google from matching keywords against the URLs.)
Much can also be done subjectively. Like never having elements with unknown heights hold up the rendering of elements below them. Always specify the sizes of external elements like pictures, objects/embeds, etc.
Also, if you want to go further, re-order the elements on your site in such a way, that they can be rendered in the order in which people start looking at them. Start at the center, then work to the upper left, render the menu and title, and then grow from there. Anything below the fold should be loaded and rendered last.
Also, by adding <script> tags in the <body> at the right place, you can time your JS execution to the right moments in rendering, and so optimize load.
But of course the best strategy still is: Don’t make your freakin’ website so freakin’ big! Of course this has gotten bigger. I remember that our starting page law was to stay below 100kB back in 2004. I find it horrible that some sites take 500kB or even 1MB to load, even when all the standard site elements (design pic, CSS, JS) have already loaded.
Any sufficiently advanced intelligence is indistinguishable from stupidity.
It doesn't give them money Dave, if I do not click an advert (click) and do not buy the product referenced in the advert (impression)...
They get nothing.
Are you a content producer by any chance?
Adverts pay for hosting.
Adverts ONLY pay for hosting if me, the surfer:
Otherwise they get nothing. I should know, I have £80 in my adsense account and nobody clicked my adverts and I had 80,000 impressions.
It makes NO difference if you show me an advert or not, I WILL NOT buy it or follow it. I immediately mistrust it. They had to pay to get it in my face. I would rather wait for word of mouth or a review.
Does that make sense now?
Showing me adverts is a lost cause, so it makes more sense for me to block them. Content producers have not lost anything by me not clicking their advert. They would have got nothing from me seeing their advert to begin with. It affects nobody but myself.
You could say they wasted bandwidth on me but that's bollocks. They lose bandwidth on everyone who visits the site but does not click on it.
Everyone wins.
Any request that takes longer than 2 seconds will be uncomfortable to the user.
The opening comment here sez pages are closer to 5 seconds now, which means the web is a lose.
Not necessarily. Enough content should be rendered nearly immediately that the user is not discomforted by the wait, even though parts of the page are still loading. Now, granted that whether or not the page is responsive at that time is another question... if part of the page has rendered but it wasn’t the part I need, and the page won’t scroll for ~5 seconds as it loads, then yes, that is inconvenient.
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
All final tests, from the developers on out, should be done on two generation old computers, and those *have* to be connected to the outside via non-commercially rated broadband. Then we'll stop seeing pages with multiple megabyte pictures, and megabytes of friggin' flash videos*, and bloated java apps.
mark
* Flash videos on *corporate* web pages while you're trying to use *their* search for *their* available jobs!
I write all my PHP on one line and use one-letter variable / method names so it parses faster. You can't really do that with Java with its long names and type declarations.
I can only hope you were being sarcastic. Variable names are the least parser-intensive operation. Though for non-lexically-scoped variables (at least in Perl, which I thought PHP was at least loosely based on), the variable names are hash lookups, thus long variable names have a minute incremental cost - especially in tight loops.
.class files (even in version 6) are pretty startup-intensive the first time. And .jsp files are doubly so, because they are compiled into Java source, then compiled into .class, then finally loaded. It'll only win over a PHP compile if it's
That aside, this isn't a rational comparison, given that php is a scripted language and java is a compiled language. So your 50 character java variable name is a 4 byte integer symbol reference at load time and execution time.
That being said, java
But high-performance pages are likely raw servlets and thus pre-loaded before the server starts accepting connections on port 8080. Thus in a clustered environment with rolling updates, you never see the startup slowness. The only remaining startup slowness would be pre-JIT code (running in raw interpreted mode for the first 100 executions or so). But by run 1,000 you're likely running bare-metal assembly, depending on the nature of the servlet that is. Granted, this doesn't compensate for overly abstracted code (many of the MVC frameworks) or inefficient cluster/database code.
-Michael