Are Long URLs Wasting Bandwidth?
Ryan McAdams writes "Popular websites, such as Facebook, are wasting as much as 75MBit/sec of bandwidth due to excessively long URLs. According to a recent article over at O3 Magazine, they took a typical Facebook home page, looked at the traffic statistics from compete.com, and figured out the bandwidth savings if Facebook switched from using URL paths which, in some cases, run over 150 characters in length, to shorter ones. It looks at the impact on service providers, with the wasted bandwidth used by the subsequent GET requests for these excessively long URLs. Facebook is just one example; many other sites have similar problems, as well as CMS products such as Word Press. It's an interesting approach to web optimization for high traffic sites."
compression to shorten the URL's?
Wordpress by default allows you to configure URL writing. The default is set to something like: http://www.mysite.com/?p=1.
For SEO purposes it's always handy to switch to the more popular example: http://www.mysite.com/2009/03/my-title-of-my-post.html.
Suggesting that we cut URL's that help Google rank our pages higher is preposterous.
Are forums (fora?) like these wasting bandwidth as well by allowing nerds, like myself, to banter about minutiae (not implying this topic)? Discuss amongst yourselves.
Absolute power corrupts absolutely. indymedia
The PHPulse framework is a great example of a better way to do it. It uses one variable sent for all pages, which it then sends to a database (rather than an XML page) where it stores the metadata of how all the pages interrelate. As such, it doesn't need to parse strings, it is easier to build SEO-optimized pages, and it can make pages load 10 times faster than other MVC frameworks.
This is my sig. There are many like it but this one is mine.
The short Facebook URLs waste bandwidth too ;)
Tsunami -- You can't bring a good wave down!
By default Wordpress produces short urls.
Good is never enough, when you dream of being great!
Of all things that could be optimized, urls shouldn't have a high priority (unless you want people to enter them manually).
I'm pretty sure their HTML, CSS, and javascript could be optimized way more than just their urls.
But rather than simple sites, people often want them to be filled with crap (which nobody but themselves cares about).
ps, that doesn't mean you shouldn't try to create "nice" urls instead of incomprehensible urls that contain things like article.pl?sid=09/03/27/2017250
It's an irrelevantly small portion of the traffic. At the scale of Facebook it could save some traffic, but not enough to make an impact on the bottom line that is worth the effort!
150 chars long url = 150 bytes VS 50KILObytes + Images of rest of the pageview....
I'm pulling that 50 kilobytes for the full page text out of my head, but a pageview often runs at over 100kb.
So it's totally irrelevant if they can shave a whopping 150 bytes off the 100kb.
Pulsed Media Seedboxes
Those FacebookApps that spread through innocent people's notifications waste more.
slashwhat?
Twitter clients (including the default web interface) auto-tinyURL every URL put into it. Clicking on the link involves not one but *2* HTTP GETs and one extra roundtrip.
How long before tinyurl (and bit.ly, ti.ny, wht.evr...) are cached across the internet, just like DNS?
This is ridiculous. If I have a billion dollars, I'm not going to worry about saving 50 cents on a cup of coffee. The bandwidth used by these urls is probably completely insignificant.
---Technology will liberate us if it doesn't enslave us first.
How many times are the original pages called? Is this really the resource hog?
What about compressing images, trimming them to their ultimate resolution?
How about banishing the refresh tags that cause pages to refresh while otherwise inactive? Drudgereport.com is but one example where the page refreshes unless you browse away from it...
If you really want to cut down on bandwidth usage, eliminate political commenting and there will never be a need for Internet 2!
Ken
75 whole freaking megabits? WOWSERS!!!!
They must be doing gigabits for images, then. Complaining about the URLs is complaining about the 2 watts your wall-wart uses when idle, all the while using a 2kW air conditioner.
Oh, you're not stuck, you're just unable to let go of the onion rings.
This is a stupid exercise. Oh my gosh, there's an extra few characters wasted. They're talking about 150 characters, which would be 150 bytes, or (gasp) 0.150KB.
10 times the bandwidth could be saved by removing a 1.5KB image from the destination page, or doing a little added compression to the rest of the images. The same can be said for sending out the page itself gzipped.
We did this exercise at my old work. We had relatively small pages. 10 pictures per page, roughly 300x300, a logo, and a very few layout images. We saved a fortune in bandwidth by compressing the pictures just a very little bit more. Not a lot. Just enough to make a difference.
Consider taking 100,000,000 hits in a day. Bringing a 15KB image to 14KB would be .... wait for it .... 100GB per day saved in transfers.
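A quick sanity check of that arithmetic, as a minimal Python sketch. The hit count and image sizes come from the comment above; decimal units (1 KB = 1000 bytes), one image per hit, and one 150-byte URL per hit are assumptions made only for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions: decimal units, one image and one long URL per hit.
HITS_PER_DAY = 100_000_000
KB = 1000
GB = 1000 ** 3


def daily_savings_bytes(bytes_saved_per_hit: int, hits: int = HITS_PER_DAY) -> int:
    """Total bytes saved per day if every hit shrinks by the given amount."""
    return bytes_saved_per_hit * hits


image_saving = daily_savings_bytes(15 * KB - 14 * KB)  # 15KB image trimmed to 14KB
url_saving = daily_savings_bytes(150)                  # a 150-character URL removed entirely

print(image_saving / GB)  # 100.0
print(url_saving / GB)    # 15.0
```

Under those assumptions, even deleting the URL altogether saves less than trimming a single image by 1KB.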
The same can be said for conserving the size of the page itself. Badly written pages (and oh are there a lot of them out there) not only take up more bandwidth because they have a lot of crap code in them, but they also tend to take longer to render.
I took one huge badly written page, stripped out the crap content (like, do you need a font tag on every word?), cleaned up the table structure (this was pre-CSS), and the page loaded much faster. That wasn't just the bandwidth savings, that was a lot of overhead on the browser where it didn't have to parse all the extra crap in it.
I know they're talking about the inbound bandwidth (relative to the server), which is usually less than 10% of the traffic. Most of the bandwidth is wasted in the outbound bandwidth. That's all anyone really cares about. Server farms only look at outbound bandwidth, because that's always the higher number, and the driving factor of their 95th percentile. Home users all care about their download bandwidth, because that's what sucks up the most for them. Well, unless they're running P2P software. I know I was a rare (but not unique) exception, where I was frequently sending original graphics in huge formats, and ISO's to and from work.
Serious? Seriousness is well above my pay grade.
But how much is that as a proportion of their total bandwidth usage? If they were worried about bandwidth, I'm sure they could just compress the images a little more and save much more.
No, those guys wanting to ban black cars are saner people than writers of this article ...
The black car thing at least is somewhat significant! For example, see when mythbusters tested white vs. black car.
Pulsed Media Seedboxes
BS
Seriously. Long URL's as wasters of bandwidth? There's a flash animation ad running at the moment (unless you're an ad-blocking anti-capitalist), and I would expect it uses as much bandwidth when I move my mouse past it as a hundred long URL's.
I'm not apologizing for bandwidth hogs... back in the dialup days (which are still in effect in many situations), I was a proud "member" of the Bandwidth Conservation Society, dutifully reducing my .jpgs instead of just changing the Height/Width tags. My "Wallpaper Heaven" website (RIP) pushed small tiling backgrounds over massive multi-megabyte images. But even then, I don't think a 150-character URL would have appeared on their threat radar.
It's a drop in the bucket. There are plenty of things wrong with 150-character URLs, but bandwidth usage isn't one of them.
Stressed? Me? Of course not. Stress is what a rubber band feels before it breaks, silly.
If you take and type a full page (no carriage returns) into notepad and save it, you end up with 5kb per printed page at the default font/print settings. When was the last time that a web page designer cared about 5kb? If 150 bytes (yes, 150 char's) is a concern, trim back on the dancing babies and mp3 backgrounds before you get rid of the ugly url's.
Besides, if not for those incredibly long and in need of shortening URL's, how else would we be able to feed rick astley's music video youtube link into tinyurl and expect people to click it, expecting it to be a real URL?
Is it sad that I am more likely to recognize you and your posts by your sig than your name or UID?
umpteen ad links, crapromedia, and god knows what else. Seriously, the URL size is like complaining while you urinate into the sea then saying it's affecting the tide level.
What's the percentage savings? Is it enough to care or is it just another fun fact?
Simplifying / nanoizing / consolidating javascript and reducing the number of sockets required to load a page would probably be more bang for the buck. Is it worth worrying about?
Second, define 'waste'. Most rational people would argue that facebook is itself a waste of bandwidth, and that getting rid of it would leave more bandwidth for what people really want, which is p0rn, unless the rumors in the previous article are true, which is that facebook is really about such amateur barely legal material.
But even if we assume that Facebook is wasted, the percentage of bandwidth used is probably not excessive given its entertainment value. I mean, it would be like getting rid of the department of homeland security. Sure it would lower the taxes we pay by 2%, but don't we already have enough unemployed executives complaining about how hard it is to live on a 1 million dollar severance package?
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
The reason many sites have long URLs like that is so they can be explicit and do better in Google search rankings. As long as Google values the actual URL for search rankings, URLs will remain long.
75 MBit/s? What's that in Libraries of Congress per decifortnight?
It's true I tell you, feller at work's next door neighbour read it in the paper.
it goes in cycles... you get better hardware, then you saturate it with software. Then you get better software and you saturate it with hardware.
Currently, we can apply said metaphor to internet connections. We started with jpegs. We had low baud modems. We then moved on to moving pictures we needed to download. They upped it to cable. Now we are at the point where fiber to your house is going to be needed in most situations.
Think how we've moved from dumb terminals to workstations and now we are using more dumb terminals (ie - VM's) and it will just keep cycling.
At this point we may as well start harping on engineers about TCP/IP packet overhead if we're concerning ourselves with this water under the bridge...
For an even more egregious example of web design / CMS fail, take a look at the HTML on this page.
$ wc wtf.html
12480 9590 166629 wtf.html
I'm not puzzled by the fact that it took 166 kilobytes of HTML to write 50 kilobytes of text. That's actually not too bad. What takes it from bloated into WTF-land is the fact that that page is 12,480 lines long. Moreover...
$ vi wtf.html
Attention Globe and Mail web designers: When your idiot print newspaper editor tells you to make liberal use of whitespace, this is not what he had in mind!
Typical half-assed slack-alism. HEY! If I take a really small number and multiply it by a REALLY HUGE number, I get a REALLY BIG NUMBER! The end is nigh! Panic and chaos!!!
Coding with assembly is like playing with Legos. Coding an application in assembly is like building a car with Legos.
Yea that's it... URLs are wasting bandwidth... never mind the massive amounts of useless garbage on the Internet no it's definitely long URLs.
You can't take the sky from me.
If they were using that space for descriptive purposes (like long file names) there might be an arguable tradeoff, but most URLs are full of illegible encodings that mean nothing to anyone except the people managing the service. This is all fine, but why not encode all the info and send it in one fat lump? Most users, perhaps with the exception of some nerds who hang out here at /., don't navigate by editing the URL directly. They press the big shiny buttons like normal primates.
In order to maximize the web experience for all customers, effective immediately all websites with URLs in excess of 16 characters will be bandwidth throttled.
Sincerely,
Comcast
Are YOU using the TOOL, or is the TOOL using YOU? Think about it!
For laughs, use Yahoo's YSlow on the article. The site makes a stupid amount of requests for CSS and JS, doesn't GZIP, doesn't use ETAGS, etc. They ignore almost every other bandwidth saving technique, but at least their URLs are short!
www.ishitoutanobama.com
Has anyone here even looked at what the real motivation behind this study is? It's to create this idea that web hosts are, surprisingly, wasting the valuable bandwidth provided by your friendly ISPs. Do a few stories like this over a few years, and suddenly, having Comcast charge Google for the right to appear on Comcast somehow seems fair. The bottom line is, as a consumer, it's my bandwidth and I can do whatever I want with it. If I want to go to a web site that has 20,000 character URLs, then that's where I'm headed.
This is my sig.
honestly, is this really an issue when people are streaming entire movies?
I hope this is obvious to most people here, but reading some comments, I'm not sure, so...
The issue is that a typical Facebook page has 150 links on it. If you can shorten *each* of those URLs in the HTML by 100 characters, that's almost 15KB you knocked off the size of that one page. Not huge, but add that up over a visit, and for each visit, and it really does add up.
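The per-page figure above can be checked with a short Python sketch. The 150 links and 100 characters per link are the numbers from the comment; the assumption that URLs are plain ASCII (one byte per character) and that the page is sent uncompressed is mine.

```python
# Per-page HTML savings from shortening every link on the page.
# Assumptions: ASCII URLs (1 character = 1 byte), no HTTP compression.
LINKS_PER_PAGE = 150
CHARS_SAVED_PER_LINK = 100


def page_savings_bytes(links: int, chars_saved_per_link: int) -> int:
    """Bytes removed from one page's HTML when each link gets shorter."""
    return links * chars_saved_per_link


saved = page_savings_bytes(LINKS_PER_PAGE, CHARS_SAVED_PER_LINK)
print(saved)                   # 15000
print(round(saved / 1024, 1))  # 14.6 (KiB), i.e. "almost 15KB"
```

This counts the outbound HTML, which is a different and larger number than the inbound GET requests most of the thread is arguing about. With compression turned on the real saving would be smaller, since repeated URL prefixes compress well.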
I've been paying very close attention to URL length on all of my sites for years, for just this reason.
Just use a smaller font for the URL!
...
I am wondering if this is more about exploiting the fact that such long and exacting URLs might serve as a form of security through obscurity...
Previously: "Linux... Toward the Sunrise..." Now: "Linux... Toward the-- No, now, part of Every Sunrise"
75 MBit/s? What's that in Libraries of Congress per decifortnight?
Depends. Is it an African or European Library of Congress?
ebay has "upgraded" their local site http://my.ebay.com.au/ and "my ebay" is now a 1M byte download. That's ONE MILLION BYTES to show about 7K of text and about 20 x 2Kb thumbnails.
The best bit is that the htm file itself is over 1/2 Mbytes. Then there's two 150K+ js files and a 150k+ css file.
Web "designers" should be forced to develop on a 128M P3 machine with VGA screen and dial up modem
-- Butlerian Jihad NOW!
Are Long URLs Wasting Bandwidth?
No. But this article is.
Isn't Facebook itself the huge waste of bandwidth as opposed to just the verbose URLs it generates?
Mind the gap...
I used to be the sysadmin for a public high school. The school's website was 100% static pages, and the Webmaster/Web design teacher was thoroughly incompetent. She pretty much read "Web Design for Dummies" and used Macromedia Suite MX to design the worst and slowest possible Java or Flash crap. Poor layout too--it looks like MySpace was rewritten by teenagers.
The school website URL was kind of long to begin with: http://www.school.county.k12.fl.us/
Here's where it got fun. First, she could not comprehend the concept of relative paths, so every single link was an absolute path.
For the school calendar, I wanted to use http://www.school.county.k12.fl.us/calendar. She could not have any of that, and insisted on http://www.school.county.k12.fl.us/DailyUpdates/calendar/calendar/calendar.htm. Her argument was that users should not memorize addresses of things they go to frequently--they should go to the main page and link through.
My personal favorite URL of hers? http://www.school.county.k12.fl.us/StudentParentInfo/PhoneList/PhoneList08-09.htm.
Every attempt I made to organize the webspace was met with her hysterically screaming and making it a mess again. She also insisted on uploading .psd files along with their resultant .jpg files--her "new and improved" website started at 900 MB and grew to 40 GB in three years.
eBay url for an electronic item for sale:
http://cgi.ebay.com.au/SONY-5-1-Home-Theatre-Amplifier-Receiver-STRDE497_W0QQitemZ180339459830QQcmdZViewItemQQptZAU_HOME_CINEMA_SYSTEMS?hash=item180339459830&_trksid=p3286.c0.m14&_trkparms=66%3A2%7C65%3A1%7C39%3A1%7C240%3A1318
NZ's Trade Me url for a piece of electronic equipment:
http://www.trademe.co.nz/Electronics-photography/Home-audio/Headphones/auction-210021888.htm
There is no excuse for the difference in length, and eBay is not only confusing search engines and us, but is also making their pages slower to load.
Note that the ebay.com.au and Trademe sites have about the same number of listings. The url is a symptom, and a cursory analysis of the rest of the page and the site will see plenty of other examples of poor design for page loading speed.
Even worse, it's like complaining about one person's wall-wart in an entire city of homes using air-conditioners.
I'm out of my mind right now, but feel free to leave a message.....
I bet a lot more bandwidth was "wasted" here talking about it. Facebook could "save" even more bandwidth, if they shut down all their servers and returned nothing to every request. This is a silly discussion.
Stasis is death. Embrace change.
Outside of meaningless SEO strings, a lot of the data in the URL would have to be transmitted otherwise, in cookie or POST data
from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
If you want to save bandwidth charges, there are much better ways to do it which will drop bandwidth by a lot more than the length of a URL. Move to AJAX calls, replacing only the parts of the DOM which need to be updated, instead of refreshing the entire page.
Yes! 10 to 400 times more so than long URLs! The IQ of your average Slashdot poster seems to be slipping lately.
Many sites today have insanely large cookies, which are sent with every request. Some sites' cookies are so large that the HTTP request won't even fit in one TCP packet....
Clearly these sites ought to be able to use much smaller session key cookies of some sort (obviously properly cryptographically signed). All this other junk they jam in their cookies they can / should just store on their server.
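One way to picture the small signed session cookie suggested above is the following Python sketch. It is only an illustration under assumptions: the names `make_cookie` and `verify_cookie`, the `id.signature` layout, and the choice of HMAC-SHA256 are hypothetical, not any particular site's scheme.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from config.
SECRET_KEY = secrets.token_bytes(32)


def make_cookie(session_id: str, key: bytes = SECRET_KEY) -> str:
    """Return 'session_id.signature'; all other session state stays on the server."""
    sig = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{sig}"


def verify_cookie(cookie: str, key: bytes = SECRET_KEY):
    """Return the session id if the signature checks out, otherwise None."""
    session_id, sep, sig = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return session_id
    return None


session_id = secrets.token_urlsafe(16)  # random opaque key into server-side storage
cookie = make_cookie(session_id)
print(len(cookie), "bytes sent with each request")
```

The cookie stays well under a hundred bytes regardless of how much state the server keeps behind the session id.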
The querystrings get passed to the CGI script, but that's done completely server-side: the bandwidth isn't wasted because the querystring never goes through the pipes. So you end up saving a little bit bandwidth-wise
In any case, this is a case of grossly premature optimization. Very, very few URLs even come close to the bandwidth of a favicon.ico file, which is itself considered a pittance. There are far more effective ways to cut down on bandwidth than these near-trivial aspects.
After all these years of using Slashdot, I cannot access the site with just /. in my browser.
Here's the math:
Slashdot.org /.
-
___________________
10 wasted characters.
Web-bloat is as prevalent as any software-bloat. Chief cause: laziness.
The sub-market is at fault for offering me short-URI services, so that I can Twitter more and SEO be damned.
The search engines are at fault for inventing SEO and making the content of URIs matter.
The merchants are at fault for making my shopping cart expire, after I go away for five damn minutes.
The advertisers are at fault for making me pay extra for bandwidth, because I use a plugin to hide their content.
The webmasters are at fault for needing revenue, and using interstitial pages as though they were a damn magazine.
The browsers are at fault for demanding a page hit, instead of trusting the cache from 10 seconds ago.
The programmers are at fault for including bells-and-whistles AJAX and kitchen-sink CMS overhead, irrelevant to my interests.
The designers are at fault for using tables instead of CSS.
The coders are at fault for embedding scripts, styles, and too many indents for readability.
The W3C is at fault for pushing HTML, XML, and XSL tags to be longer, not shorter.
The technology is at fault for not setting a threshold on how long a URI can be.
The mouse is at fault for letting me click a link instead of typing everything in.
The gopher protocol is at fault for letting http win.
The internet is at fault for being so old it can't evolve past 7-bit ASCII.
Okay...I have other pointless articles to go read.
...except they aren't using mod_gzip/deflate. At first I thought you browsed the web RMS style and maybe wc* didn't support compression** and you were just getting what you deserved***, but then I checked in firefox and lo and behold:
Response Headers - http://www.theglobeandmail.com/blogs/wgtgameblog0301/
Date: Fri, 27 Mar 2009 23:39:54 GMT
Server: Apache
P3P: policyref="http://www.theglobeandmail.com/w3c/p3p.xml", CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa CONi OUR NOR IND PHY ONL UNI COM NAV INT DEM STA PRE"
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
200 OK
No compression!
If they had been using compression, it would have made all that whitespace fairly negligible.
Probably a result of how their template system stitches everything together. Still, that is pretty bad. There is no excuse to run a webserver and not turn on compression. It is the single biggest way to boost page-load and decrease bandwidth.
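To see why compression would make the padding nearly free, here is a minimal Python sketch using the standard `gzip` module. The amount of whitespace (2000 lines of 40 spaces) is a made-up stand-in, not a measurement of the actual page.

```python
import gzip

# Hypothetical padding: 2000 lines consisting of 40 spaces each.
padding = (b" " * 40 + b"\n") * 2000

compressed = gzip.compress(padding)

print(len(padding), "bytes of whitespace before compression")
print(len(compressed), "bytes after gzip")

# Round trip, to confirm nothing is lost.
assert gzip.decompress(compressed) == padding
```

Highly repetitive input like this shrinks by orders of magnitude, which is roughly what mod_gzip/deflate would do to it on the wire.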
* wget 4 lyfe!
** compression is probably evil and Anti-Freedom(tm) somehow, kinda like images are evil or fads like "graphical user interfaces" are evil. In other words, anything that makes things easier or faster for a user is basically evil and Anti-Freedom****.
*** braindead comment spamming bots are the only thing not using compression (except RMS, probably)
**** I'll leave it to you, dear reader, to deduce if I'm serious. Hint: no hint.
if we need to free up some bandwidth how about cutting out all the stupid ass ads everywhere, who clicks on those anyway?
...the first 1831 lines (!) of the page are blank...Attention Globe and Mail web designers: When your idiot print newspaper editor tells you to make liberal use of whitespace, this is not what he had in mind!
Believe it or not, someone had it in mind. This is most likely a really, really stupid attempt at security by obscurity.
PHB:My kid was showing me something on our website, and then he just clicked some buttons and the entire source code was available for him to look at. You need to do something about that. ::whispering to WebGuy #2:: Just add a bunch of empty lines. When the boss looks at it, he won't think to scroll down much before he gives up.
WebGuy:You mean the html code? Well, that actually does need to get transferred. You see, the browser does the display transformation on the client's computer...
PHB:The source code is our intellectual property!
WebGuy:Fine. We'll handle it.
PHB:Ah, I see that when I try to look at the source it now shows up blank! Good work!
The problem isn't bandwidth, it is that long URLs are a pain from a usability standpoint. They cause problems in any context where they are spelled out in plain text (instead of being hidden as a link). For example, they often get broken in two when sent in plain text email. When posting a URL into a simple forum that only accepts text (no markup), a long URL can blow-out the width of the page.
Where does this problem come from? It comes from SEO. Website operators realized that Google and other search engines were taking URLs into account, so CMSs and websites switched from using simple URLs (like a numeric document ID) to stuffing whole article titles into the URL to try to boost search rankings. One of the results of this is that when someone finds a typo in an article title and fixes it, the CMS either creates a duplicate page with a slightly different URL, or the URL with the typo ends up giving a 404 error and breaks any links that point to it.
What I don't understand is why search engines bother to look at anything beyond the domain name when determining how to rank search results. How often do you see anything useful in the URL that isn't also in the <title> tag or in a <h1> tag? If search engines would stop using URLs as a factor in ranking pages, people would use URLs that were efficient and useful instead of filling them with junk. The whole thing reminds me of <meta> keyword tags -- to the extent that users don't often look at URLs while search engines do, website operators have an opportunity to manipulate the search engines by stuffing them with junk.
Have you tried compiling the whitespace =)
The HTTP-Referer isn't designed for ?ref=somesource
Your stat software wants to know if more people click to your page through the logo ?ref=mylogo or through a link in the story ?ref=story. The Referer can't give you that info.
The HTTP-Referer also is no good for aggregation. It only give you a URL. If you didn't append something like ?campaign=longurl, it would be almost impossible to track things like ad-campaigns.
HTTP-Referers *are* good for dealing with myspace image leeches. If you haven't, I suggest you read through your log files right now--I bet you'll find 20% of your traffic is myspace idiots leeching your images. Redirect those guys to something more... tasteful, and enjoy the bandwidth savings.
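A minimal Python sketch of the leech check, assuming a hypothetical allow-list of your own hostnames. The function name `is_hotlink` and the policy of letting requests with no Referer through are illustrative choices, not a standard.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts allowed to embed our images.
ALLOWED_HOSTS = {"www.mysite.com", "mysite.com"}


def is_hotlink(referer):
    """True if an image request was referred by a page on someone else's site."""
    if not referer:
        # Many clients and proxies send no Referer at all; treat as a direct hit.
        return False
    host = urlparse(referer).hostname
    return host is not None and host not in ALLOWED_HOSTS


print(is_hotlink("http://www.myspace.com/someprofile"))  # True
print(is_hotlink("http://www.mysite.com/2009/03/post"))  # False
print(is_hotlink(None))                                  # False
```

A server would call something like this for image requests and answer hotlinked ones with a redirect or a small placeholder instead of the full file.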
The better question is, how much bandwidth would be saved if we got rid of facebook altogether ?
I banned this POS site on my network, and suddenly real sites are much more responsive & everyone is more productive!
Wow. Judging by the patterns that I see in the "empty" lines, it looks like their CMS tool has a bug in it that is causing some sections to overwrite (or in this case, append instead).
I'd bet that every time they change their template, they are adding another set of "empty" lines here and there, rather than replacing them.
-David
The core answer to "are long URL's wasting bandwidth" is no, they are not. The extra bandwidth used is much less than the general background noise of the Internet. There are so many other things that waste more bandwidth than long URL's that it's not worth spending any time worrying about them. Think server headers ("x-powered-by: asp.net" is really annoying), spam, extra email headers, long email threads where the entire original thread is copied each time, un-optimized graphics on web pages, un-optimized javascript/css/html, un-followed prefetching and DNS prefetching, viruses spreading, port scans, etc. The list goes on and on. Watch a raw firewall log on an active connection for about 5 minutes to see what I mean about "background noise". Long URL's can only go to about 2,000 characters (~2KB) before you start running into compatibility issues (IE), and most "long URL's" are much shorter than that even. This just isn't worth worrying about.
What do you think inband http compression does to those blank lines?
Or banning black painted cars.
...so no.
American Idol might yield a better "idiots off the street" to bandwidth ratio, though.
As a web developer who was using a P3 with 256 MiB of RAM at work up until recently I think I'm going to have to disagree.
How about an example of the software I tend to be running at the same time:
Clearly all of this eats more than 256 MiB of memory, on a CPU that's a few generations out of date, let's just say there's a lot of swapping...
Then there's the matter of the VGA screen, last time I checked the stats less than 4% of the users were using a resolution of 800x600 or less, why would anyone care to develop for machines running at 640x480 with 4 bit color? And how do you expect anyone to get useful work done on a machine like that?
As for the javascript, sometimes it is, unfortunately, necessary to send fairly large amounts of javascript to the clients but if done correctly then the client should cache the scripts after loading them the first time.
That said, a 0.5 MiB html document is pretty huge, at least for what is essentially a start page, I'm sure they could trim that down if they really wanted to.
/Mikael
Greylisting is to SMTP as NAT is to IPv4
Instead of sending them to the client (waste step 1) which in turn has to send them back to the server (waste step 2), what should be done is keep them on the server. Generate an MD5 hash of the URL string. Use the MD5 hash as an index in a Berkeley DB file, storing the time last created, and the whole URL (maybe compressed). Make a replacement URL with the MD5 hash. When a request comes back with the MD5 hash, look up the URL to use for it, much like having your own tiny URL tool. A reaper process can run in the background, gradually stepping through the indexes in some order, and deleting entries considered to be too old (whatever is appropriate for the site).
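The scheme described above could look roughly like this in Python. It is a sketch under assumptions: a plain in-memory dict stands in for the Berkeley DB file, the URL is stored uncompressed, and the class and method names (`UrlStore`, `shorten`, `resolve`, `reap`) are made up for illustration.

```python
import hashlib
import time


class UrlStore:
    """Server-side table mapping an MD5 hash to the full URL it replaces."""

    def __init__(self, max_age_seconds: float = 3600.0):
        self.max_age = max_age_seconds
        self._entries = {}  # hash -> (time created, full URL); dict stands in for Berkeley DB

    def shorten(self, long_url: str, now: float = None) -> str:
        """Store the full URL and return the hash to put in the replacement URL."""
        key = hashlib.md5(long_url.encode("utf-8")).hexdigest()
        self._entries[key] = (time.time() if now is None else now, long_url)
        return key

    def resolve(self, key: str):
        """Look up the full URL for an incoming hash; None if unknown or reaped."""
        entry = self._entries.get(key)
        return entry[1] if entry else None

    def reap(self, now: float = None) -> int:
        """Delete entries older than max_age; returns how many were removed."""
        now = time.time() if now is None else now
        stale = [k for k, (created, _) in self._entries.items()
                 if now - created > self.max_age]
        for k in stale:
            del self._entries[k]
        return len(stale)


store = UrlStore(max_age_seconds=60)
key = store.shorten("http://www.mysite.com/?p=1&ref=story&campaign=longurl", now=0)
print(key, "->", store.resolve(key))
```

Note that an MD5 hex digest is 32 characters, so this only shortens URLs whose variable part is longer than that. Unknown hashes simply resolve to nothing, which is the "shield" the comment mentions.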
This is actually a good idea for security, too. By refusing to accept the full URLs with all those variable fields that people could modify, you'll have more of a shield against hackers trying to tweak with the system.
now we need to go OSS in diesel cars
Just take a look at some of those review sites people post links to right here on slashdot. They contain so many ads from different addresses they make the client look up, plus a review of TWO items tends to be split over 10 pages, with each page being a single paragraph or ONE PICTURE. Slashdot! Save the bandwidth! Don't link to those sites, or if someone does, don't follow! Think of the URL requests we as slashdot could save if we didn't all follow a link to a site that has 100+ URL requests per page for 10 pages!
This is the best feature of TinyURL, and there's no need to get goatsed or tubgirled again.
Slashdot won't even let me type (really: paste) it in. So I have to enter it via tinyurl. The preview link is here [http://preview.tinyurl.com/dapfpm]. That URL was found traversing through the proxy when accessing a Youtube video. Later I repeated the same video (which was rather lame) and didn't get anything as long. I think they dynamically insert all the ads and stuff in the URLs.
now we need to go OSS in diesel cars
Without context, this is meaningless. 75 Mb/sec as a percentage of what? "As much as" -- translating to "less than, probably significantly less than". In context, this is probably not news. Which is usually why context is left out -- because in context, there would be nothing significant to report.
"Up to 1000 people had pianos dropped on their heads". Over what time period and geographical area? If worldwide since 1698, it's a curiosity. If in my town since last Tuesday, it's cause for concern. But without context, I have no idea whether to get excited about the Piano Ban of 2009.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
Let me get this one straight.
He is saying that long web names are wasting bandwidth.
Now, with SPAM, VIRUS, MalWare, SpyWare, Bots, Trojans, Rootkits, and any other bandwidth-wasting junk out there, these people have the gall to complain about long URL's wasting bandwidth.
Go Figure!!!
Maybe we should consider not using a bunch of ridiculous and frivolous web-based application crap like ajax in order to serve information and images.
I mean, if we're going to nitpick, isn't the entirety of web 2.0 sort of retarded? We complain about web standards and yet no one is willing to write things in a simple and expressive enough manner that old or low end hardware such as cell phones are relevant to the mainstream web. Maybe URL's aren't really the big problem here.
Hell, if we are translating this to energy use, imagine how much electricity would be saved if everyone stopped using inefficient and expensive languages like Ruby... cities worth.
Why don't we stop serving video in flash and go back to simplistic plugin players or *gasp* ... Java! If we want to cut bandwidth, look to the past and narrowband.
If we want to cut energy or bandwidth use, it's quite simple. The modern web is full of ridiculous waste.
hugeurl.com is there for a reason.
I don't know which one, but I'm sure there is one.
What is next? XML is too wasteful?
I bet you are right. Hundred bucks says the HTML for the textbox where you change your templates has a couple newlines in it. Something like:
[textarea] ... some extra newlines.... .... some extra newlines...
{$OLD_TEMPLATE}
[/textarea]
Sorry I couldn't make that more clear, but slashdot would have eaten the HTML (more so than it already has)
do long filenames waste disk sectors?
(Answer: not as much as long files do).
I am anarch of all I survey.
i jst svd u 9 byts of bndwdth, thts why
Like:
- naming all your variables and functions really short
- persistent use of classes and css to avoid common attributes
- grouping/normalizing css properties
- using js to generate repetitive code
- not using any carriage returns
- removing every js function and css declaration you don't use
- removing all comments
- properly compressing images
- proper caching declarations
- doing all browser optimizations on the server
And you know, URLs usually contain important information. I've never seen someone put garbage in a URL. And if it's information that needs to be passed, it doesn't matter where it is placed. It will take up the same bandwidth.
It was said that replacing "Microsoft Office" with "MSOF" would take certain Office releases from 10 floppy diskettes down to 9, saving tremendous manufacturing cost.
Ya, and this is a design decision I think. Each link contains a hash to a cached node of some sort, for direct access to each resource.
FYI: Director of Engineering discusses Facebook's architecture (1hr)
Maybe. But I'm sure we can waste more.
This guy's the limit!
1
(Slashdot is killing the Earth's bandwidth!)
It only matters whether the greater of the two directions is reduced: if you do 8gig out and 1gig in, you get charged for 8gig out, even if you reduce the 1G inbound to .925G. (Also, it's typically billed at the 95th percentile, but that shouldn't matter in this case.)
It would cost more to do the refactoring than they ever could hope to recoup, even if shorter URLs also decrease outbound traffic.
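To put rough numbers on that billing argument, here is a minimal Python sketch. It assumes the provider bills on the larger of the two directions' 95th-percentile samples (nearest-rank method); the function names and the flat sample values are made up for illustration, reusing the 8 / 1 / .925 figures from above.

```python
def percentile_95(samples):
    """95th percentile of the samples, nearest-rank method."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]

def billable(inbound, outbound):
    """Assumed billing rule: pay for the busier direction only."""
    return max(percentile_95(inbound), percentile_95(outbound))

# Hypothetical samples, same unit throughout (say Gbit/s).
outbound = [8.0] * 100
inbound_before = [1.0] * 100
inbound_after = [0.925] * 100  # inbound trimmed by shorter URLs

bill_before = billable(inbound_before, outbound)
bill_after = billable(inbound_after, outbound)
print(bill_before, bill_after)
```

Both bills come out the same, because the inbound direction never becomes the larger of the two.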
I compiled whitespace from the Haskell source (could have apt-get install whitespace'ed it, but this was more fun =] ). Then fed it that page's source ... no output. Anyone want to reverse engineer some probable CMS-noise?
.f00Dave
Why did I read that as "panic and cheese!!!"?
"It does not do to leave a live dragon out of your calculations, if you live near him." - Tolkien
Sure, they are wasting bandwidth. In fact, so are vowels. Let's remove all the vowels from text, that'll save even more!
But how much is being wasted on CAPTCHA challenges ?
Wanna fight ? Bend over, stick your head up your ass, and fight for air.
It's not the size of the URL that matters but what you do with it.
At least that's what my GF says.
penny wise, pound foolish.
We could save a lot more bandwidth by not watching so many crappy videos on youtube.
I say, for one thing, that THOUSAND'S, perhaps million's of byte's of storage could be saved if we all stopped using these idiotic and gramatically incorrect apostrophe's when describing URLs (instead of url's)
.
- aqk
F U
This is silly. The URLs, even "long" ones, are minuscule compared to the pictures, streaming video, music, javascript etc. on these pages. To worry about them is like worrying about the lint on a suit of clothes making them too hot. This is just absurd.
..if URLs are such a huge waste, imagine the bandwidth used by the rest of the data in the HTTP protocol alone, content excluded.
Since web applications are getting so advanced, isn't it time we moved beyond the "interactive document" paradigm and developed something a little more efficient for the internet applications of tomorrow?
DISCLAIMER: I myself have no idea where to begin on such a task, but it does seem that we've just been patching on to a technology that isn't meant for what we do with it; HTTP/HTML has its place for certain, but we could and should do better, don't you think? Is anyone working on an alternative? With service providers constantly screaming about bandwidth, there must be a way to tighten the belt, so to speak...
CAn'T CompreHend SARcaSm?
Dude, that's just an embedded Whitespace script. You need to upgrade your browser!
The URL of your page has nothing to do with the Google pagerank. See: http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html
Or, a more likely explanation, is server side script blocks.
To fthr sav our bdwdth, ^A txt shld be cmprsd into acrnyms!
Don't be quite so dismissive. We have a high traffic site which features a real-time data widget. This polls for a small data file very frequently. Eliminating the majority of the server response headers, changing to use a shorter domain name and page, and ensuring we polled on a different domain (so that cookies weren't sent) reduced our bandwidth usage by just over a third. That's not to be sniffed at.
Hi Mikael,
thanks for the considered response. Let me drag you back to the real world.
First, I usually don't run anyone's app/website fullscreen. 2ndly, even if I do have 1/2G RAM, you can bet your bottom dollar there's half a dozen other apps fighting for it. Slashdot, ebay, twitter, aardvark. cyclingnews.com are all sideshows predominantly providing c. 10K text. Work?
Now, my home machine has only 388M of RAM and 1500 x 1024 pixels. My work machine has more RAM (1/2G), but much less screen (1240 x 1024). Yes, that's my brand new corporate PC. My mother-in-law (and mother) have less than both of these. My kids have subnotebooks with VGA and sub VGA screens. Half my extended family are still on dialup (or dialup paced wifi). Let's not get started on the early adopterkindens with their wunderwebphones!
Now, I have a number of supplier/client sites whose first useful hyperlink is more than 1000 pixels away from top left. Painful. Pretty, pretty graphics above. But not user friendly.
The bottom line is - for most of us, www browsing is only one of the things we are doing (I currently have gimp, xsls, open project, 5 browser panes, email, a stats package, two dos windows, one running perl, RDP and word open. Yes, I'm mildly http://www.randsinrepose.com/ ADD).
And, as you said, the my.ebay start page is surreal.
-- Butlerian Jihad NOW!
I/dont/think/that/urls/are/too/long/I/think/that/they/should/be/as/long/as/they/need/to/be/after/all/it's/only/bandwidth/and/bandwith/will/always/increase/until/it/doesn't/anymore..
We're talking about a leaking faucet next to the font of data that Facebook delivers.
It's interesting that they can quantify the amount of data they're receiving in GET requests but I'd suspect that using shorter URLs would still transmit the same number of packets over the internet AND would incur an extra database hit or decompression to translate the *slightly* shorter URL into something meaningful.
Are we really worrying about byte level optimizations here?
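A back-of-the-envelope Python sketch of the packet-count point. The 1460-byte TCP payload assumes a 1500-byte Ethernet MTU with plain 20-byte IP and TCP headers; the 500 bytes for the other request headers is a made-up figure, and the 150- and 20-character paths are just illustrative.

```python
MSS = 1460  # assumed TCP payload per packet (1500 MTU - 20 IP - 20 TCP)

def packets_needed(request_bytes, mss=MSS):
    """Number of TCP segments needed to carry the request."""
    return -(-request_bytes // mss)  # ceiling division

OTHER_HEADER_BYTES = 500  # hypothetical: Host, User-Agent, Cookie, etc.

long_request = OTHER_HEADER_BYTES + 150   # 150-character URL path
short_request = OTHER_HEADER_BYTES + 20   # 20-character URL path

print(packets_needed(long_request), packets_needed(short_request))
```

Under these assumptions both requests fit in a single segment, so the packet count does not change, although the byte count on the wire still shrinks.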
My God! It's full of eval()'s.
Just use go://target instead of long http://...
http://www.socialdns.net
It is an open alternative that creates a novel namespace using the go:// scheme.
Names are open and free and every user can manage his web TLD: go://blog.pedro
Let's say a Facebook page has 100 links on it.
Using GET syntax, each of the 100 links must contain all the metadata about the user's current state, etc.
Using POST syntax, the links each contain only the page information. The metadata WILL be added to the call, the same as the GET syntax, but not to each <a> tag on the page.
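Taking that premise at face value, the difference in page size is simple multiplication. A small Python sketch; the 30-byte path and 120-byte state string are invented numbers, not measurements of any real Facebook page.

```python
def link_bytes(num_links, path_len, state_len, state_in_every_link):
    """Bytes spent on link targets in one page."""
    per_link = path_len + (state_len if state_in_every_link else 0)
    return num_links * per_link

NUM_LINKS = 100   # from the example above
PATH_LEN = 30     # hypothetical bytes of page information per link
STATE_LEN = 120   # hypothetical bytes of user-state metadata

with_state = link_bytes(NUM_LINKS, PATH_LEN, STATE_LEN, True)
without_state = link_bytes(NUM_LINKS, PATH_LEN, STATE_LEN, False)
saved_per_page = with_state - without_state
print(with_state, without_state, saved_per_page)
```

The saving is on the outbound page, which is the direction a web server is usually billed for; the metadata still travels once with whichever request the user actually makes.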
Peter predicted that you would "deliberately forget" creation 2000 years ago...
One of the issues with shortening URLs is that the search engines look at the URL and the words present in the URL to determine how to rank the URL. This works against the desire to shorten. For example having: "baseball/redsox/beckett" is important to get a higher ranking.
Given that my personal website is 'giantpachinkomachineofdoom.com' and the website for my upcoming project is 'tangibleimagination.com', I think it would be a terrible conflict of interest for me to comment on this issue.
Friend: "The NIC is misconfigured..." Me: "No prob, I'll just telnet in and fix it." *Silence*
want to complain about a small amount of usage.. how about what will be used when we start using 128bits instead of 32 bits for IP addresses in IPv6
I am part of a team redesigning my company's website. I used to be a web developer, but am now an infrastructure and system (read "general" system, as in systematic, not specifically computer systems) junkie. Anyhow, I did a source view of the generated HTML and saw the VIEWSTATE "gibberish" for maintaining state in ASP.NET. I copied the value and pasted it into Notepad and saved it... it was 48k... just the VIEWSTATE value.
There is a workaround that stores it in a session on the server, which alleviates this. However, if I had not discovered this, it would have made it to production. How many instances of things like this exist out there in the open?
A typical web server will have a symmetric network connection and will be using almost entirely the outbound bandwidth.
If you are sending 90Mb/s and receiving 5Mb/s on your heavily loaded web server, reducing that 5Mb/s to 3 or 2 is utterly useless.
For example, the numerous images slashdot uses for the design are all on the .slashdot.org domain, for which my browser has 7 cookies stored (some of them from Google Analytics, one of the biggest culprits). This means that for each request (even HEAD) to these images and other static content, all those cookies are sent to the slashdot servers, wasting precious outgoing ("upload") bandwidth.
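A rough way to size that overhead, sketched in Python. The cookie names and values and the count of 30 static requests per page are invented for illustration; only the "name=value; name=value" layout of the Cookie request header is standard.

```python
def cookie_header_bytes(cookies):
    """Size in bytes of the 'Cookie: ...' request header line, CRLF included."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return len(("Cookie: " + pairs + "\r\n").encode("ascii"))

# Seven hypothetical cookies, standing in for the seven mentioned above.
cookies = {f"cookie{i}": "x" * 40 for i in range(7)}

STATIC_REQUESTS_PER_PAGE = 30  # assumed number of image/CSS/JS requests

per_request = cookie_header_bytes(cookies)
wasted_per_page = per_request * STATIC_REQUESTS_PER_PAGE
print(per_request, wasted_per_page)
```

Serving the static files from a separate cookie-free domain would drop that header from every one of those requests.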
"I love my job, but I hate talking to people like you" (Freddie Mercury)
HTML tidy immediately chops 60kB off of it. So that's 60kB that wasn't doing anything at all.
$15/Mbps/mo * 75 == $1,125/mo. Yawn. Get a life people. Have you seen the other waste and spend involved in a large site?
There are a lot of other reasons for wasted bandwidth:
1) The idea that instead of a link URL we usually need a JavaScript onclick event.
2) The idea that a really good layout requires nesting 15 levels of components.
3) The idea that we first need to test if Flash is installed.
4) The idea that even just figuring out the start date of a movie nowadays requires loading and skipping a Flash video.
5) The idea that things which would be perfectly fine without AJAX now require extra requests after loading the website.
6) The idea that every website needs special menu bars on each end of the page instead of a unified, simple menu with one more hierarchy level.
Not to nitpick but I'd like to correct some parts of your post...
VGA resolution means 640x480 with 4 bit color (16 colors) or 320x200 with 8 bit color ( 256 colors), most cellphones sold in the last few years are capable of graphics better than that and I sincerely doubt you know a lot of people running computers with "sub VGA screens".
/Mikael
Greylisting is to SMTP as NAT is to IPv4
http://www.getahighstreetmarketleadertoreflecttheirpositioninnaturalsearch.com/