Average Web Page Size Triples Since 2003
Andy King writes "Within the last five years, the size of the average web page has more than tripled, and the number of external objects has nearly doubled. While broadband users have experienced somewhat faster response times, narrowband users have been left behind." The article breaks down a number of changes besides just page size, including image types and video duration.
Around 1/2 a megabyte. Yup. That big.
(Front Page?)
There are shills on slashdot. Apparently, I'm one of them.
Eat my shorts slashdot !!
While I feel for the people on dial-up or other narrowband-style connections, there isn't much anyone can do for them. Times change. The majority of internet users in the States are on broadband (70% or more according to Web Site Optimization.com). In my opinion it would be infeasible to maintain two sites, one for narrowband users and one for high-speed users. Those people in rural areas still have the ability to get high-speed internet, such as satellite, direct line-of-sight towers, cellular or even DSL.
... let's note how they've grown in screen size, too! I mean, back in the day, it used to be good enough to have a monitor that could display 640x480. Now, if you're using a 14" CRT, you're totally out of luck when viewing the intarwebs!
Ahem... honestly, I agree that "narrowband users have been left behind," but so have those with smaller monitors, older operating systems, and the like. Sometimes upgrading the hardware/software is just a necessity at some point. If you can't, chances are there's a library nearby that has some newer hardware that might work.
Would it be better if we went back to having a high content/low content index page so the user could pick which one they wanted? Maybe... but I don't think it's necessary, and it usually involves a lot more work.
Proudly supporting the Libertarian Party.
How many web pages had embedded video as a matter of course in 2003?
It seems to me that embedded video alone could account for at least half of this increase.
____
~ |rip/\/\aster /\/\onkey
Who would have thought it: now that most people have always-on, fast broadband internet, web designers are less concerned about page size.
I would like to see more stripped-down, text-only pages (like the BBC has) on web sites, but otherwise I'm perfectly happy with this and don't see any need to handicap web developers just because some Luddites out in the sticks somewhere haven't got a faster connection yet.
I used to save fanfics for a forum I go to, and would save the pages manually before I discovered easier ways (de-FFNet-izer)... It was also around this same time period I began using noscript and I was amazed at the difference in file sizes with and without it enabled. We're talking about a difference of a hundred to three hundred kilobytes per page!
IMHO it's all the useless JavaScript that's choking the tubes these days! Not to mention the privacy and security concerns...
--bornagainpenguin
Have a Virgin Mobile USA smartphone? Give VMRoms.com a try!
It's not the size, it's what you do with it that counts.
Task Mangler
Request your free CD of my piano music.
Well, when was the last time you spoke with someone on dial-up? In fact, I believe that in many places you can buy broadband cheaper than dial-up. And there aren't too many people who have no choice but dial-up anymore. It's like complaining about technical progress anyway...
The size of my wang. It's up to 4 gigadongs of Miley Cyrus-pleasing goodness.
The U.S. is big, and there's a lot of it where the local phone connection is as good as it gets.
Low bandwidth, flexible pages using CSS are also good for people on mobile units w/ small screens.
William
Sphinx of black quartz, judge my vow.
We may want to consider terabit ethernet.
If people can get past, can they get future? Best way to confuse a stoner
Everything still runs pretty fast, certainly much faster than those few occasions when I need graphics or https: and run Firefox. The difference is noticable on all machines, and greatest (~2x) on the slower ones.
Sometimes formatting gets messed up, but the main content is still in text and still very readable.
The article states the usage of PNGs is up. Definitely a plus, since that format was, from what I've read, designed specifically for web use. Other than that, my only impressions or thoughts would be that the use of YouTube embedding likely accounts for a large portion of the external object growth, now that everyone and their mum can do it.
Whatever next? Software expands to fill the hardware available....?
In my opinion, most of the content is flashy crap. Sometimes it isn't, and is a highly useful feature of a website, and some sites even revolve around it [Refer: youtube.com], but that doesn't mean we all have to jump for the bandwidth-heavy options. None of my websites' pages use any shiny bits at all; they have the bare minimum.
I am on a broadband plan, which can be sometimes slow but pages do load at a decent speed. And most of the time, you're only going to a website to retrieve one piece of information, having things gleaming at me never got me to go deeper into the site.
KISS. Keep It Simple, Stupid.
Yep, tell me about it. When I'm stuck somewhere away from the PC, I catch up on sites from my Nokia N95. 500kB web pages are getting much more normal now, which is costly, and slow for people on phones.
I know Slashdot has a "Palm" edition, which is very low bandwidth, but it only gives you the stories, and top 5 comments. No posting, no nothing.
Surely the great web-wizards at Slashdot can make something that checks for a "Nokia" or "Symbian" user agent, and handles appropriately?
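A minimal sketch of the kind of check being asked for, assuming the server can read the User-Agent request header. The function names and the template file names are hypothetical, and substring matching on "Nokia" or "Symbian" is only a rough heuristic, not how Slashdot actually does it.

```python
def is_mobile_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent string mentions Nokia or Symbian."""
    tokens = ("nokia", "symbian")  # the two names suggested in the comment
    ua = (user_agent or "").lower()
    return any(token in ua for token in tokens)


def choose_template(user_agent: str) -> str:
    """Pick a page template; both file names are made up for illustration."""
    return "mobile.html" if is_mobile_user_agent(user_agent) else "full.html"


if __name__ == "__main__":
    phone = "Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaN95/11.0.026)"
    desktop = "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"
    print(choose_template(phone))
    print(choose_template(desktop))
```

User-agent sniffing tends to miss new devices, so a visible link to the low-bandwidth version is still worth keeping alongside it.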
Get your own free personal location tracker
"noticable"? Not a cable?
According to Moore's law the size should be up between 5 and 6 times, so relatively the pages are shrinking.
Pål
Back in the days of 28.8K and 56K modem dominance, I used to do everything I could to trim down my site. Before uploading my html, I would remove every character that didn't *have* to be there for it to render properly. Meaning I had a single line html page, heh. Also, my images were trimmed down to be as lossy as possible without losing a noticeable amount of quality. I was not satisfied if my page with html, images, css, and js had a combined size greater than 300K... which was still a nice long wait for a modem user. I haven't designed a site in years now... a lot of the care in site design seems to have been lost is all I can say.
Huge pages are fine when I'm in civilized countries with real networks. When I'm traveling in less developed areas and have to carry my own connectivity, it would be very helpful (far less frustrating) to have a thin (text only) option on every web page. I hate to say it, but there ought to be a law.
Invenio via vel creo
The narrowband people should upgrade... or keep surfing the old web on archive.org.
I would be curious to know how many web sites actively use the gzip response to compress content. While this does put an extra load on both client and server, it does help save bandwidth. For static pages these could even be cached in compressed form on the server, to help reduce processor load.
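A rough way to see what gzip buys, sketched with Python's standard gzip module. The sample HTML is made up and highly repetitive, so real pages will compress less well; the function name is hypothetical.

```python
import gzip
from typing import Tuple


def gzip_savings(body: bytes, level: int = 6) -> Tuple[int, int]:
    """Return (original size, gzip-compressed size) in bytes."""
    compressed = gzip.compress(body, compresslevel=level)
    return len(body), len(compressed)


# Made-up static page; repetition makes it compress unusually well.
html = (
    "<html><body>"
    + "<p>Average web page size triples since 2003.</p>" * 200
    + "</body></html>"
).encode("utf-8")

original, compressed = gzip_savings(html)
print(f"{original} bytes -> {compressed} bytes")

# For a static page the compressed bytes can be produced once and reused,
# which is the caching idea mentioned above.
cached = gzip.compress(html)
assert gzip.decompress(cached) == html
```

A server would only send the compressed copy to clients whose request advertises gzip support in the Accept-Encoding header.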
;)
As for many pages there is a lot of junk in there that could be stripped out or put into separate documents. This includes CSS or Javascript that is being reused by multiple pages, since this would be downloaded once.
I am sure that there are other methods out there to save bandwidth, including forcing Flash developers, and their managers, to use a modem, smartphone or poor quality DSL to access their web site - let them feel the pain.
BTW I can't read the page at the moment, since it is not responding.
Jumpstart the tartan drive.
NoScript is your friend. It avoids a lot of bloat (flash/javascript ads?), and adds some security.
The opposite of broadband is baseband in computerspeak. I've lamented the misuse of narrowband in this context for years, and now even the geek sites are getting it wrong. Ever heard of 100 base T?
Seriously - have you ever stumbled on a long-running blog that is 1 page long? Every article the author ever wrote is stacked one after another, complete with more than a hundred images. It can take minutes to load the entire page.
I don't know if the blog software is to blame, the clueless blogger, or if it was intentional in order to have the most pointers from Google. If I end up at one I immediately back out -- I don't need to hear the opinion of anyone that maintains a site like that.
The multi-megabyte one page blogs are a scourge on the internet.
I'm on broadband (only 2Mbps, but that's fast enough for most downloads and should be plenty quick enough for most browsing) and I've noticed larger download sizes as well. In 95%+ of cases I've not noticed any particular use for the extra bloat other than "we couldn't be bothered doing it properly" or "well, people have broadband".
Excluding places like YouTube where it revolves around big content, and ignoring bloggers who don't have the sense to link to external pages for their videos and so embed a dozen videos on a page, what is the point of all the bloat? Do any of the sites need even a fraction of what they add? A few tens of KB or more for an Ajax/Lightbox/other JS library that you use one minor function from? A huge and badly optimised image? Background images that take up the whole page and aren't properly sliced to remove the bits that aren't necessary or increase the parts that can be repeated?
On the plus side then at least my sites should seem a bit faster when anyone does visit them as I don't cram pages full of crap!
"the average web page has more than tripled"
On the other end, servers and link speeds have not kept up with the demand resulting in more slashDDOS KO's.
One more reason why courier-style websites still exist.
Ugh, I hate it when people describe dial-up as "narrowband" in an attempt to sound more technical. The term "broadband" is used to describe the signal encoding, not bandwidth. Therefore the converse of "broadband" is "baseband," not narrowband. The opposite of narrowband is "wideband", and refers to something else. Um, k? Glad we have that all cleared up.
Entrepreneur : (noun), French for "unemployed"
Yet everyone cheers when video games have to be on dvds and computers require small fusion reactors to run the video cards. *rolls eyes* I'd say a 300% increase is fairly decent considering how many webpages now utilize embedded media (movies/audio players), whereas I don't recall that being the case as much 5 years ago. It could be worse, it could still be *all* done in flash.
That'd be a not iCable. Also seriously dude, I'm so leet I use lynx... Whatever Web sites you visit must look like crap in a console window.
Just yesterday I was searching around to see if there was any extension or add-on which would automatically load the "print this page" version instead of the full bloat version, as I am stuck on dialup here and man, most of the web is a pain now, and it gets worse all the time. Even with images turned off. I checked accessibility sites, etc, thinking maybe something developed for the blind. It is pretty dismal, those places emphasize screen readers and audio conversions. Closest I found was some greasemonkey scripts that have to be tailored to individual websites. Google has a low res search function, but it still isn't the same deal. It makes no sense to have to go to the full bloat version, wait for it to finish downloading, then hunt around for the print version, that's backwards for what you need in trying to help speed things up. If there was an HTML attribute added to the page so right off the bat you could be redirected to the print-only simple version, it would be acceptable. Slashdot is not too bad using the low res version, not bad at all really. BBC is pretty good too, but they are in the minority.
I agree with you on the Flash, it is by far one of the main culprits out there for bloat-age, and it is a catch 22 to avoid it. You can use Flashblock, but that means leaving javascripting turned on, which leaves you open to all sorts of other nasty page slowing "features" (and potential security issues). And if the website owners are worried about losing ad revenue, nothing stopping them from putting text only simple ads on the low res version pages.
I've found using Google's wap proxy site www.google.com/gwt/n to be a nice fix for use on cellphones and narrowband connections. The only problem being it renders the page as a narrow vertical column when using larger screen resolutions.
I hate all sigs, even this one.
Sounds about right to me. I still spend a lot of my time on a dual-1.25 GHz G4 with OS X 10.3.9 and Safari 1.3 and surfing is often painful on this machine. On a whim I saved the front page of ebay.com, looked at the source, and downloaded every referenced .js file I saw. (I think there were about 10.) It wound up being a total of ONE-THIRD of a megabyte of code. So all that code has to be executed, on top of all the HTML, CSS, and images. No wonder it takes forever and makes the browser unresponsive. Yes, I also have Firefox, but it's painful to use for other reasons. (Yes, I'm one of those people. Not religiously, and I won't argue with you about it, but I've got my preferences.) I do use it to "balance the load"--to open up sites that I know are heavy and that I won't spend a lot of time at.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
People think 'broadband' means 'fast'. Actually broadband can ~= faster. Broadband just means that a particular signaling path has a broader range of frequencies (more bandwidth) than some other signaling path. 768Kbps ADSL is broadband compared with a 56Kbps modem, but is not broadband compared against a fiber optic connection.
In a more technical sense in telecommunications, though, broadband is divided into channels, whereas baseband just has one signal over the maximum of the bandwidth of the medium. So while cable is a broadband technology and 100-base-TX is a baseband technology, 100-base-TX is, of course, much, much faster than cable.
The opposite of 'narrowband' is 'wideband', which doesn't mean the same thing as 'broadband' despite the fact that 'wide' and 'broad' are synonyms.
Confused yet?
My blog
When we had to worry about optimizing for CPU cycles, memory usage and/or application size (lines of code), we'd program in assembler for the inner loops at the very least
Games are written to make use of every bit of GPU and CPU they possibly can -- if you have a 2 year old machine, it's unrealistic to expect to be able to play the latest FPS out there without needing to upgrade your processor. And with Vista, it's the same for buying a new OS. Even linux and *BSD don't run in nearly the same footprint as they used to. (I remember running a picoBSD box in 1999 that fit on a 1.2MB floppy disk
Build it, and they will come^Hplain.
As an example pop over to http://www.dilbert.com/ and see how a PHB sub-committee has ruined a rather fantastic site by slapping a flash front-end on the toon for the day. The Sunday toon requires you to sense that there is an arrow on the last pane and know there should be more.
Absolutely ridiculous; thankfully, however, this link is available http://dilbert.com/fast for those that still want their fix.
He means oligopoly. In the case of buyers, an oligopsony.
Advertising on the web has tripled over the last five years? It's most definitely what's clogging the pipes...er, tubes.
What?
Clearly, no one would mind if web page design and implementation had improved. However, as far as I can tell, things are now very pretty and mostly dysfunctional. The new http://www.tomshardware.com/index.html is a perfect example of a once great site that has been rendered almost completely useless thanks to a corporate 'redesign'.
Can anyone please explain why it is that upper management must produce so much evil crap?
do not always indicate the big picture. Most of the page metrics seem to be based on arithmetic mean, but the 'average' is easy to skew. I would think that a more relevant metric vs time would be median. TFA did mention one instance of this statistic: "In 1997, 90% of videos were under 45 seconds in length (Acharya & Smith 1998). In 2005, the median video was about 120 seconds long (Li et al. 2005). By 2007, the median video was 192.6 seconds in duration (Gill et al. 2007). The median bit rate of web videos grew from 200Kbps in 2005 to 328Kbps on YouTube in 2007. So by late 2007, the median video weighed in at over 63MB in file size. On YouTube, the average video size is 10MB, with over 65,000 new videos added every day." But does the 'realtime' video length account for compression vs resolution in current video file formats? I would like to know if the actual download time vs. median file size ratio has significantly increased.
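A quick sketch of both points, assuming the quoted "Kbps" figures are kilobits per second and using decimal units. The page sizes in the second part are invented purely to show how one outlier drags the mean; they are not from the article.

```python
from statistics import mean, median


def video_size_megabytes(duration_s: float, bitrate_kbps: float) -> float:
    """Size of a video in decimal megabytes from duration and bit rate."""
    kilobits = duration_s * bitrate_kbps
    return kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes


# Figures quoted from TFA: 192.6 s median duration, 328 Kbps median bit rate.
megabits = 192.6 * 328 / 1000
megabytes = video_size_megabytes(192.6, 328)
print(f"{megabits:.1f} megabits, {megabytes:.1f} megabytes")

# Hypothetical page sizes in KB: one huge page skews the mean, not the median.
page_sizes_kb = [50, 60, 70, 80, 5000]
print(mean(page_sizes_kb), median(page_sizes_kb))
```

Under those assumptions the product of the two medians comes to roughly 63 megabits, or a little under 8 megabytes, so the quoted 63MB looks like it may be a bits-versus-bytes slip. Multiplying two medians is also not guaranteed to give the median file size.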
...use http://loband.org/
My brother in law helped design it for third world countries with slow dial up, so that they can interact with the web at large, but I found another good niche for it.
For half a year I worked on a luxury yacht, doing mostly blue water sailing to the high latitudes. Currently the only reasonably priced satellite connections are the Inmarsat B Fleet range, with a maximum speed equivalent to Dual ISDN.
Trouble is, that costs about $20 a minute, so instead the crew would be forced to use a packet switching service with a maximum speed of 33.6kbps. Surfing the web at that speed is simply unbearable, but if you put the URL into loband you can actually surf most sites at a comfortable speed.
We used it mostly for navigating the image heavy NOAA and Canadian Ice Service home pages until we found the chart we wanted then copy and pasting the link into wget. That way if the yacht suddenly rolled and the connection was severed it wouldn't have been for nothing.
Once over 80 degrees latitude, Inmarsat B drops below the horizon, and you have to use Iridium for data connections at 9.6kbps! Try surfing the normal web with that.
loband is an absolute life saver.
I design the graphics and code for the sites I build. See what response you get when you have a fairly basic looking page which does use properly compressed graphics which loads REAL fast.
Where's all the cr@p like Java / Flash / embedded video!!! Sometimes you despair.
Not everyone uses high-speed internet, and I find it insulting to design large pages that take a while to download, and eat up a viewer's bandwidth allowance (as well as your own server's allowance if you use off-site hosting).
Take Nobody's Word For It.
The original article fails to adequately factor in the overhead inherent in the very large number of generated pages on the web such as those served up by ASP.NET web sites. In ASP.NET, significant overhead occurs because the system stuffs large amounts of (superficially encrypted) form data in hidden fields (known as "viewstate") between subsequent round trips to the server. This can quickly become very large due to overuse of viewstate by inexperienced developers. Failure to reduce round trips can make the problem worse.
From MSDN:
"The __VIEWSTATE hidden form field adds extra size to the Web page that the client must download. For some view state-heavy pages, this can be tens of kilobytes of data, which can require several extra seconds (or minutes!) for modem users to download."
I can only assume that similar overhead occurs in other similar systems (JavaServer Pages, Cold Fusion).
It's not clear to me whether the article takes these types of pages into account. Either way, the author appears to have missed the boat. If such pages were not included, then the problem as described poorly represents the real world. Many banks and e-commerce sites use the proprietary systems mentioned above to build their online applications, and so the average narrowband user will indeed encounter at least some of these types of pages. And if these types of pages were in fact included in the data, then the author of the article completely missed the reason for much larger page sizes in those particular cases, for reasons that would not apply to standard static .HTM pages.
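One way to check how much of a saved page is view state, sketched with Python's standard html.parser. The class and function names are hypothetical, and the sample page is fabricated with a 4000-character value just to exercise the code.

```python
from html.parser import HTMLParser


class ViewStateMeter(HTMLParser):
    """Adds up the size of every hidden input named __VIEWSTATE."""

    def __init__(self):
        super().__init__()
        self.viewstate_bytes = 0

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if attributes.get("name") == "__VIEWSTATE":
            value = attributes.get("value") or ""
            self.viewstate_bytes += len(value.encode("utf-8"))


def viewstate_share(html: str) -> float:
    """Fraction of the page's bytes taken up by view state values."""
    meter = ViewStateMeter()
    meter.feed(html)
    total = len(html.encode("utf-8"))
    return meter.viewstate_bytes / total if total else 0.0


# Fabricated page: a form whose only content is a 4000-byte view state field.
sample = (
    '<form><input type="hidden" name="__VIEWSTATE" value="'
    + "A" * 4000
    + '" /></form>'
)
print(f"{viewstate_share(sample):.0%} of the page is view state")
```

Running this over a saved copy of a real form page would show whether the "tens of kilobytes" in the MSDN quote applies to it.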
It could be argued that narrowband users have better options than ever before. Mobile devices (phones) are narrow band. Consequently, the most traffic'ed websites support a mobile version, low bandwidth alternative, of their sites. Examples: YouTube, Facebook, MySpace, Google, Amazon, etc.
Half the time I prefer to hit the mobile version of a site, anyway, as it provides a better meat-to-bone ratio.
peace|dewde
dewde.com
I mean, I can look at file sizes on my server and see that if I add all the graphics and html and css up into a single number then my front page averages 100-120K (which is mostly taken up by my webcomic, and the css file is only counted once for the entire site so even though it's much larger than I'd prefer -- nearly 20K -- it's much better than it could be) but that doesn't factor in banner ads, or any of the extra files Drupal includes that I can't immediately keep track of, or any other factors that I may not have thought of.
So is there a site where you can toss in your URL and it reads your page size? Sort of like those sites that will validate your xhtml and css for you?
Eviscerati.Org: All Hail the Eviscerati
I think parent is complaining about the corruption of the internet by greedy middlemen. I too have an abundance of ignorance, and don't believe it is correct not to have fiber to the home. POTS is so NINETEENTH CENTURY. I would be very surprised if the money I have paid for broadband has not been enough to pay for a personal fiber all the way to the local switch, and I DO have wire hooks on my 28' ladder. I could have done it myself, what's so fucking hard here?
The cost of that cleanup, of course, will be borne by taxpayers, not industry.
So after a quick google I found a site called websiteoptimization.com that let me plug in my url and it measured my site size. It was rather depressing.
Total size of the home page: 384617 bytes
6550 bytes html
311512 bytes images
2696 bytes css images
(314208 bytes total images)
36366 bytes javascript
27493 bytes css files
I... honestly didn't think it was going to be that bad. It would take about a minute and a half for someone using a 56K modem to view my front page...
about 130000 bytes would be removed if I dropped the banner advertising. 6000 bytes would go away if I dropped google analytics.
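The numbers above can be turned into rough download times. This is a back-of-the-envelope sketch: the link speeds are nominal, and the 33,600 bps figure is just an assumed real-world modem throughput, not something measured.

```python
def download_seconds(size_bytes: int, link_bps: float) -> float:
    """Seconds to move size_bytes over a link of link_bps bits per second."""
    return size_bytes * 8 / link_bps


PAGE_BYTES = 384617        # total home page size reported above
BANNER_BYTES = 130000      # approximate banner advertising
ANALYTICS_BYTES = 6000     # approximate analytics script

for bps in (56_000, 33_600):  # nominal 56K, and an assumed effective rate
    full = download_seconds(PAGE_BYTES, bps)
    trimmed = download_seconds(PAGE_BYTES - BANNER_BYTES - ANALYTICS_BYTES, bps)
    print(f"{bps} bps: {full:.0f} s full page, {trimmed:.0f} s without ads/analytics")
```

At the nominal 56,000 bps the page takes a bit under a minute; the minute-and-a-half estimate matches an effective rate nearer 33,600 bps, which ignores latency and per-request overhead.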
Not sure how to trim down the javascript and css. I use a lot of Drupal modules and pretty much every Drupal module has its own css file, and trying to do away with them entirely is an invitation to really screwing up my site formatting. And I'm not a programmer so I have no idea what all that javascript is for anyway -- I can't take it out because I don't know where it is and I don't know what it's doing.
But... sheesh. When my site was static, on my most days my total page size was under 80K, and that *included* the webcomic. This is kind of depressing...
Eviscerati.Org: All Hail the Eviscerati
Having seen the pretty pictures with the lines going up, I had a read of one of the cited articles, "A user-focused evaluation of web prefetching algorithms". I was dubious of the number 233% quoted in the article. Whilst the 1995 numbers come from a source that's referenced in other articles, the 2003 numbers come from a single month of proxy logs for a web-server and a news-server at the University of Valencia with 300 users and 132 users respectively. I didn't think that was a useful sample for drawing conclusions about the 'web' in general and wondered what other people felt.
And if one uses an XSLT stylesheet, one needn't even include the header/sidebar/footer on each page- just let the XSLT wrap it around the content div.
Really, given how even IE6 supports it, XSLT is almost criminally underused...
I don't disagree that POTS is oldskool, and that fibre is the future, but how long did it take to get POTS to everyone's house? 50 years? more? What makes you think that you can just wake up and BAM, you have fibre.
You talk about your ladder having wire hooks, but you are ignoring the fact that most modern communities don't have ugly telephone poles, so you have to rip up kilometers of pavement, and then repave when you are done.
Moving everyone to fibre from copper will probably take just as long as moving people from mail to twisted copper pair.
Copyright 2010. All rights reserved. This comment may not be copied in any way including, but not limited to caching.
I think you mean baseband. O, what ever happened to networking essentials?
Narrowband is the opposite of wideband, meaning a signal that spans many frequencies.
The opposite of broadband, in this case, is baseband.
The longer time it takes the bigger the reason to start five years ago.
http://www.nationalpriorities.org/costofwar_home
Maybe those $4700 / household would be enough to give many of you people in USA fiber to your homes?
Obesity strikes everywhere! Adult websites just can't seem to fit their women in the webpages like they used to.
Zooooo-ooom!!
I started learning HTML in 1996 and I miss some of the old days of web design. When you had to keep in mind that people were using 56K (or less!) baud modems, you had to do more with less code or they wouldn't come back. Tighter code doesn't always make a prettier page, but it does make a better coder. Now people slap up all the obnoxious crap they want because they expect the user to have DSL/cable hookups. It hasn't been an improvement.
I finally had to hook my mom up to broadband--it wasn't just for speeding up the agonizingly slow file downloads. She had a hard time just surfing, much less shopping--the pages were taking too long to render. And forget trying to watch a video, even on "low bandwidth."
Back in the day, there was the 5K Web page award. (The prize for completing the winning page under 5K in size was $50 + bragging rights). It was interesting to see what people could do with so little code and I'm sorry that there doesn't seem to be any of the winning pages still up--the site stands frozen in time back in 2000.
If you want to see what the internet looked like before the rest of the world figured out what "The Internet" even was, check out the Internet Archive Wayback Machine that has a cache of "85 billion web pages archived from 1996." Their section on the Web Pioneers is a good place to start.
And yes, Slashdot is there too! :-)
If you've never been modded as "flamebait" or "troll," you've never tried to argue a minority viewpoint here!
How stupid is it that my google search returned me to the site where I read the original article... AND I DIDN'T NOTICE?
:P
Honestly. I looked at the article, which was on websiteoptimization.com, and then I found the site analyzer via google, which was on websiteoptimization.com, and all I could think of at the time was "huh. They used the same css template."
I read the article... I didn't pay attention to the URL.
Eviscerati.Org: All Hail the Eviscerati
More interesting than the headline is that the word "bloat" disappeared from the lexicon. We really need those Web 2.0 dissolves.
and all the idiot html'ers I've met over the years who swear up and down about CSS and still put 80 nested tables in their pages along with the rest of their crap.
Is that with the advent of the WYSIWYG, every Charlie dipstick that can figure out how to use one thinks he's/she's a web developer. It doesn't surprise me that page size has doubled. The average WYSIWYG writes crappy code, and if you don't know how to write it yourself the page stays bloated.
It has however, benefited my pocket since many of the businesses who have had a site built by these morons come looking for someone to "make their sites work better." It does still amaze me that even in this day and age your average business still doesn't check the credentials or abilities of the people that they hire as programmers.
-Goran
Carpe Scrotum - The only way to deal with your competition.
Can you get the same result of speeding up webpage loading by prioritizing which items need to be downloaded? For example, you load the main HTML webpage first, then switch to the images/objects on the originating server, followed by third-party images and stuff.
Given that suppressing object/image loading was a staple feature of Netscape (combined with an instant button that loads them), I'm really surprised that they left this to dry.
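The ordering asked about above (main HTML first, then same-server objects, then third-party ones) can be sketched as a simple sort. This only illustrates the ordering; it is not how any particular browser schedules requests, and the function name and URLs are made up.

```python
from urllib.parse import urlparse


def download_order(page_url, resource_urls):
    """Sort URLs: the page itself, then same-host objects, then third-party."""
    page_host = urlparse(page_url).hostname

    def priority(url):
        if url == page_url:
            return 0
        return 1 if urlparse(url).hostname == page_host else 2

    # sorted() is stable, so objects keep their original order within a tier.
    return sorted(resource_urls, key=priority)


resources = [
    "http://ads.example.net/banner.gif",
    "http://example.org/logo.png",
    "http://example.org/index.html",
    "http://cdn.example.com/lib.js",
    "http://example.org/style.css",
]
for url in download_order("http://example.org/index.html", resources):
    print(url)
```

Whether this speeds up perceived loading would depend on how many connections the browser opens at once, which the sketch does not model.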
"Narrowband" is a word made up by people who don't know what "broadband" means. A dialup connection can be broadband.
What has the Acid Test to do with CSS standards? The Acid Test, while valuable and interesting, is not the specification and has no authority whatsoever concerning CSS (and other techniques tested).
Media types exist at least since CSS level 2. I have no solid knowledge of earlier versions, but still, CSS level 2 has been around since 1998.
As I took the GP, the first suggestion was to replace the image with a background image which can easily be specified by media-specific CSS. I don't see any issue there.
Turn off Javascript and disable Flash and everything works quite well even at 56K.
Have gnu, will travel.
(a) HTML IMG tag already supports percentages
(b) HTML/CSS already supports context-based CSS
To see an example, just do a "View Source" of this current /. web page. Do a search for the text "media=", and you will find in the head section that different stylesheets are being applied based on the use context of the page. ;-)
This is what happens when amateurs take over.
---- Booth was a patriot ----
No, not the size. I mean literally! With the exception of my resume and my marital status on my bio, I can't be bothered updating it.
Conclusion: I'm doing my bit to keep the web clean!
These posts express my own personal views, not those of my employer
It's scary being a Flash and Flex developer on Slashdot. You guys are unnaturally rabid.
This is all true...same phenomenon as Vista in terms of applications.
brain evolution
I think this is a good thing. As bandwidth becomes ever cheaper, there is less reason to worry about conserving it, and we can use it for more interesting things. Yes it's sad that some people can't get anything better than a slow modem, but that doesn't mean we should be writing pages like it's 1999.
That's fine, since my bandwidth has easily "more than tripled" since 2003.
I live in New Zealand, which has one of the slowest internet speeds overall in the entire OECD.
And boy, does this impact me. I still find myself stopping pages from loading all the time due to not being able to wait.
If each mistake being made is a new one, then progress is being made.