Is RSS Doomed by Popularity?
Ketchup_blade writes "As RSS becomes better known to mainstream users and the press, the bandwidth issue reported by many sites (Eweek, CNet, InternetNews) related to feeds is becoming a reality. Stats from sites like Boing Boing are showing a real concern regarding feed bandwidth usage. Possible solutions to this problem are emerging slowly, like RSScache (feed caching proxy) and KnowNow (event-driven syndication). RSScache seems to offer a realistic solution to the problem, but can this be enough to help RSS as it reaches an even bigger user base in the upcoming year?"
Remember all the hype about "push" technology back in the mid-nineties? Nobody was interested, but RSS feeds are being used in much the same way now. I'm thinking there are two significant differences: 1) with RSS, the user feels like they're in control of what's going on; with push, users felt like they were at the mercy of whatever money-grabbing corporations wanted to throw at them, and 2) a hell of a lot of people now have an always-on Internet connection with plenty of bandwidth to spare. When you've got a 33.6kbps dialup connection, you use the Internet differently than when you've got DSL or cable.
How much bandwidth does Slashdot's RSS feed use?
It looks like the RSS feed on my home page has a small handful of subscribers. Neat.
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
And institute jackboot banning policies if you access them more than x times per y hours.
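For what it's worth, the "x times per y hours" policy could be sketched as a sliding-window counter per client. This is only an illustration: the FetchLimiter class, its parameters, and the idea of keying on a client string (an IP address, say) are assumptions of mine, not something any site mentioned here is known to run.

```python
from collections import defaultdict, deque


class FetchLimiter:
    """Allow at most max_requests fetches per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of allowed fetches

    def allow(self, client, now):
        """Return True if this fetch is allowed; now is a time in seconds."""
        hits = self._hits[client]
        # Forget fetches that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: refuse (or ban) this client
        hits.append(now)
        return True
```

A server would call allow() with something like time.time() on every feed request and answer refused clients with an error status instead of the feed.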
What you're seeing right now are teething troubles. Nothing more, nothing less. The bandwidth and consumption experienced right now will be laughed off a couple of years from now as minuscule.
Take the BBC News website for example. On September 11th 2001 its traffic was way beyond anything it had experienced to that point. Within a year or so, it was comfortably serving more requests and seeing more traffic every day. Proof, if it were needed, that capacity isn't the issue when it comes to Internet growth, and won't be for the foreseeable future.
RSS is in its infancy. Just because people didn't anticipate it being adopted as fast as it has been doesn't make it "doomed". By that rationale, the Internet itself, DVDs, digital photography, etc. are all "doomed" too.
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
Instead of downloading the entire RSS feed every time, why not have aggregators tell the server the timestamp of the last time the feed was downloaded, or the timestamp of the last item in the feed the aggregator knows about? The server can then dynamically generate the RSS with only the new content for that client. This increases processing load while reducing bandwidth, but processing time is what most servers have lots of, not to mention it's far cheaper to increase than bandwidth.
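A rough sketch of the server-side filtering step, assuming the aggregator has somehow passed along the timestamp of the newest item it already has. The items_since function, the dictionary layout, and the sample stories are all made up for illustration; a real feed generator would pull items from its own database.

```python
from datetime import datetime, timezone


def items_since(items, last_seen):
    """Return only the items published after last_seen, newest first."""
    newer = [item for item in items if item["published"] > last_seen]
    return sorted(newer, key=lambda item: item["published"], reverse=True)


# Hypothetical feed contents, purely for demonstration.
FEED = [
    {"title": "Story A", "published": datetime(2004, 12, 1, 9, 0, tzinfo=timezone.utc)},
    {"title": "Story B", "published": datetime(2004, 12, 1, 12, 0, tzinfo=timezone.utc)},
    {"title": "Story C", "published": datetime(2004, 12, 1, 15, 0, tzinfo=timezone.utc)},
]
```

A client that last saw Story B would only be sent Story C, and a client that is fully up to date would get an empty list, which costs next to nothing to send.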
"Is RSS Doomed by Popularity?"
:)
"Is Instant Messaging Doomed by Popularity?"
"Is E-Mail Doomed by Popularity?"
"Is Usenet Doomed by Popularity?"
"Is The Internet Doomed by Popularity?"
"Is Linux Doomed by Popularity?"
"Is Apple Doomed by Popularity?"
"Is Netcraft Doomed by Popularity?"
"Is Sex with Geeks Doomed by Popularity?"
There are several ways to mitigate the bandwidth issues. First, all aggregators should support gzip compression and the HTTP Last-Modified and ETag headers. That'll take care of a lot of the problems. The other solution is to get people to use server-based aggregators, like Bloglines, which only fetch a feed once per iteration, regardless of how many subscribers there are. As a bonus, there are several things that server-based aggregators can do that desktop-based aggregators can't do, like provide personalized recommendations. I like this solution, but of course I'm biased since I'm the founder of Bloglines. :)
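On the aggregator side, supporting gzip plus those two headers might look roughly like the sketch below, using only Python's standard library. The function names (build_request, decode_body, fetch_feed) are my own, and the aggregator is assumed to have stored the ETag and Last-Modified values from its previous fetch.

```python
import gzip
import urllib.error
import urllib.request


def build_request(url, etag=None, last_modified=None):
    """Build a GET that asks for gzip and sends any cached validators."""
    request = urllib.request.Request(url)
    request.add_header("Accept-Encoding", "gzip")
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def decode_body(raw, content_encoding):
    """Undo gzip compression if the server applied it."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if nothing changed."""
    try:
        with urllib.request.urlopen(build_request(url, etag, last_modified)) as response:
            body = decode_body(response.read(), response.headers.get("Content-Encoding"))
            return body, response.headers.get("ETag"), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep using the cached copy
            return None, etag, last_modified
        raise
```

The aggregator would save the returned ETag and Last-Modified values and pass them back in on the next poll, so an unchanged feed costs only a round trip of headers.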
To some extent, this could be blamed on the feed itself. Ideally, it works like this:
When you request the feed, the server first sends the normal HTTP response headers. If it is properly configured, it returns a 304 (Not Modified) when you already have the most recent version. However, many feeds are generated in PHP[1], where this behavior is off by default, so you end up with the standard 200 (OK) response and the full feed every time. This single-handedly wastes a metric tonne of bandwidth.
Even if you're hammering a feed, you'll only be wasting a few hundred bytes at most every half hour, rather than the whole 50K or whatever size it is.
See here for a more detailed explanation.
[1] This is not a PHP-specific issue; a lot of dynamic content, and even static content, fails to do this properly. But this is what it's there for, after all.
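The server-side half of the 304 exchange described above could be sketched as follows, independent of any particular framework. The respond function is hypothetical, it derives the ETag from a hash of the feed body (one common choice, not the only one), and it assumes the client sends back exactly one ETag in If-None-Match, which real servers cannot rely on.

```python
import hashlib


def respond(feed_bytes, request_headers):
    """Return (status, headers, body) for a feed request."""
    # Assumption: the ETag is simply a hash of the current feed contents.
    etag = '"' + hashlib.sha1(feed_bytes).hexdigest() + '"'
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        # The client already has this exact version: send headers only.
        return 304, headers, b""
    headers["Content-Type"] = "application/rss+xml"
    return 200, headers, feed_bytes
```

The first request gets the full feed along with its ETag; every later request that echoes that ETag back gets an empty 304 until the feed actually changes.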
"An infinite number of monkeys typing into GNU emacs would never make a good program."