Is RSS Doomed by Popularity?
Ketchup_blade writes "As RSS becomes better known to mainstream users and the press, the feed bandwidth issue reported by many sites (Eweek, CNet, InternetNews) is becoming a reality. Stats from sites like Boing Boing show real cause for concern about feed bandwidth usage. Possible solutions to this problem are emerging slowly, like RSScache (feed caching proxy) and KnowNow (event-driven syndication). RSScache seems to offer a realistic solution to the problem, but can this be enough to help RSS as it reaches an even bigger user base in the upcoming year?"
Remember all the hype about "push" technology back in the mid-nineties? Nobody was interested, but RSS feeds are being used in much the same way now. I'm thinking there are two significant differences: 1) with RSS, the user feels like they're in control of what's going on; with push, users felt like they were at the mercy of whatever money-grabbing corporations wanted to throw at them, and 2) a hell of a lot of people now have an always-on Internet connection with plenty of bandwidth to spare. When you've got a 33.6kbps dialup connection, you use the Internet differently than when you've got DSL or cable.
How much bandwidth does Slashdot's RSS feed use?
It looks like the RSS feed on my home page has a small handful of subscribers. Neat.
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
And institute jackboot banning policies if you access them more than x times per y hours.
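For what it's worth, an "x times per y hours" rule is only a few lines of code. Here is a rough sketch of a sliding-window limiter keyed by client IP. The class name, the parameters, and the idea of passing the clock in as `now` are all my own assumptions for illustration, not how any particular site does it.

```python
from collections import defaultdict, deque


class FeedRateLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Hypothetical in-memory store: client IP -> timestamps of allowed hits.
        self._hits = defaultdict(deque)

    def allow(self, client_ip, now):
        """Return True if this request is within the limit, False if it should be refused."""
        hits = self._hits[client_ip]
        # Forget hits that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A real deployment would have to make an exception for proxies that legitimately carry many users behind one IP, which this sketch does not attempt.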
One thing that would help immensely is if RSS readers/aggregators would actually cache the RSS feed and not download a new copy if they already have the most current one. I could go through my server logs and point out the most egregious problem aggregators if anyone's interested.
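To make the "don't download a new copy" part concrete, here is a minimal sketch of a reader doing an HTTP conditional GET with Python's standard library. The function names and the shape of the `cache` dict are made up for illustration. It assumes the server sends ETag and/or Last-Modified headers and answers 304 Not Modified when nothing has changed.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server skip the body if the feed is unchanged."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_feed(url, cache):
    """Return the feed bytes, reusing cache[url] when the server answers 304."""
    entry = cache.get(url, {})
    req = build_conditional_request(url, entry.get("etag"), entry.get("last_modified"))
    try:
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
            # Remember the validators so the next poll can be conditional.
            cache[url] = {
                "etag": resp.headers.get("ETag"),
                "last_modified": resp.headers.get("Last-Modified"),
                "body": body,
            }
            return body
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; the cached copy is current.
        if err.code == 304 and "body" in entry:
            return entry["body"]
        raise
```

With this in place, an unchanged feed costs a few hundred bytes of headers per poll instead of the whole file.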
How am I supposed to fit a pithy, relevant quote into 120 characters?
rsstorrent -- distributed RSS, echoing BitTorrent?
What you're seeing right now are teething troubles. Nothing more, nothing less. The bandwidth and consumption experienced right now will be laughed off a couple of years from now as minuscule.
Take the BBC News website for example. On September 11th 2001 its traffic was way beyond anything it had experienced to that point. Within a year or so, it was comfortably serving more requests and seeing more traffic every day. Proof, if any were needed, that capacity isn't the issue when it comes to Internet growth, and won't be for the foreseeable future.
RSS is in its infancy. Just because people didn't anticipate it being adopted as fast as it has been doesn't make it "doomed". By that rationale, the Internet itself, DVDs, digital photography, etc. are all "doomed" too.
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
Instead of downloading the entire RSS feed every time, why not have aggregators tell the server the timestamp of the last time they downloaded the feed, or the timestamp of the last item they know about, so the server can dynamically generate the RSS with only the new content for that client? This increases processing load while reducing bandwidth, but processing time is what most servers have lots of, not to mention that it's far cheaper to increase than bandwidth.
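A rough sketch of the server side of that idea, assuming the aggregator sends its last-seen time as an HTTP date (for example in an If-Modified-Since header) and that each item carries a timezone-aware `published` datetime. The function names and the item dict layout are hypothetical.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def parse_since(header_value):
    """Parse an HTTP date string; return None if it is missing or malformed."""
    if not header_value:
        return None
    try:
        parsed = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return None
    # Assumption: treat dates without an explicit zone as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def items_since(items, since):
    """Return only the items published strictly after `since` (None means send everything)."""
    if since is None:
        return list(items)
    return [item for item in items if item["published"] > since]
```

The server would then render just the returned items as RSS. One catch is that per-client responses are harder for proxies to cache, so the bandwidth saved on each response is traded against losing shared caching.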
RSS feeds are meant as a way to strip all the nonsense from a site and offer easy syndication, right? Basically, present the relevant news from a full-fledged webpage in a smaller file size? If such is the case, this isn't an RSS issue, really. I see it more as a bandwidth issue. I mean, people are going to get their news one way or the other: either with a bunch of images and lots of markup via HTML, or with just the bare minimum of text and markup via RSS. I would prefer RSS over HTML any day of the week! But perhaps RSS makes syndication TOO simple. Thus everyone does it, and that eats additional bandwidth that normally would be reserved for those browsing the HTML a site offers.
And you could implement bans on people who request the RSS feed more than X times per hour as someone suggested (Doesn't /. do this?), but I don't think that gets around the bandwidth issue. I mean, those who want the news will either go with RSS or simply hit the site. Again, RSS is the preferred alternative to HTML.
So here's my suggestion.. go to nothing but RSS and no HTML!
What is your penile percentile?
"Is RSS Doomed by Popularity?"
:)
"Is Instant Messaging Doomed by Popularity?"
"Is E-Mail Doomed by Popularity?"
"Is Usenet Doomed by Popularity?"
"Is The Internet Doomed by Popularity?"
"Is Linux Doomed by Popularity?"
"Is Apple Doomed by Popularity?"
"Is Netcraft Doomed by Popularity?"
"Is Sex with Geeks Doomed by Popularity?"
There are several ways to mitigate the bandwidth issues. First, all aggregators should support gzip compression and the HTTP Last-Modified and ETag headers. That'll take care of a lot of the problems. The other solution is to get people to use server-based aggregators, like Bloglines, which only fetch a feed once per iteration, regardless of how many subscribers there are. As a bonus, there are several things that server-based aggregators can do that desktop-based aggregators can't do, like provide personalized recommendations. I like this solution, but of course I'm biased since I'm the founder of Bloglines. :)
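For anyone wondering what the first suggestion looks like on the publishing side, here is a minimal sketch of a feed server honoring ETag and gzip. The `respond` function and its plain-dict headers are assumptions made for illustration, not any real server's API. Hashing the feed body to produce the ETag is just one simple choice.

```python
import gzip
import hashlib


def respond(feed_bytes, request_headers):
    """Return (status, headers, body) for a feed request, honoring ETag and gzip."""
    # Derive a validator from the content itself so it changes when the feed does.
    etag = '"' + hashlib.sha1(feed_bytes).hexdigest() + '"'
    if request_headers.get("If-None-Match") == etag:
        # The client already has this version: send headers only.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Type": "application/rss+xml"}
    body = feed_bytes
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        body = gzip.compress(feed_bytes)
        headers["Content-Encoding"] = "gzip"
    return 200, headers, body
```

RSS is repetitive XML, so it tends to compress well, and the 304 path sends no body at all.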
Slashdot user GaryM posted a related question elsewhere about 20 months ago. At that time, in that forum, commenters dismissed his proposed solution, the use of NNTP, on the grounds that NNTP is deficient, but others continue to see NNTP as a possible solution nevertheless.
A lawyer & digital forensics examiner. Also an expert on open source software (OSS).
Every complaint about this that I've investigated has turned out to be either a broken RSS reader or an IP that's proxying a ton of traffic (which we usually do make an exception for).
Oh, and if you want to read sectional stories in RSS, then:
Slashdot's RSS traffic, like Boing Boing's, is huge, and blocking broken readers has saved us a ton of bandwidth, which of course means money. We were one of the first sites to do this but (as this story suggests) you'll see a lot more sites doing it in the future. I think our policy is fair.
Create a new top-level newsgroup hierarchy (like alt, comp, talk, etc.) named "rss" and use an extra header to identify the originating RSS feed URL. That header could be used by the RSS/NNTP reader to select which article bodies to download and to verify each RSS entry, so that fake posts can be identified.
Of course we blocked your IP when you hammered our server. And we'll do it again. Duh. We monitor abuse on the whole site, not just RSS.
This still baffles me. BitTorrent works great for distributing media like ISOs. Folks, it can distribute "little" stuff, too.
A content creator (say Slashdot) has webpages and it has an RSS feed. They create a torrent for each page. They sign the RSS file and each torrent (and its content) with a private key. They post their public key on their homepage.
Now, you can cache the RSS file on other sites that support you yet the users can still be confident that it really came from you. Inside the RSS file, users can try to get the webpage (and all its images, etc.) through the torrent first. When the page loads locally in your browser, it could still go out and get ads if you are an ad sponsored site.
If you are a popular site and have a "fan base", you should have no problem implementing something along these lines. If you are a site that has these problems, you are probably popular and have a fan base. Given the right software and the buy-in from users, the problem solves itself.
Well, RSS was simple, and everything you're talking about (caching, push-based updates, etc.) is an application-level issue. Even though that stuff is defined in HTTP 1.1, it took years for HTTP 1.1 to come out.
If the web started with HTTP 1.1, it would never have gone anywhere because it's too complicated. There are parts of 1.0 that probably aren't implemented very well.
If you want to improve things, adopt an RSS reader project and add those features.
"Slashdot's RSS traffic, like Boing Boing's, is huge, and blocking broken readers has saved us a ton of bandwidth, which of course means money."
So's using correct HTML, and CSS.