Slashdot Mirror


Is RSS Doomed by Popularity?

Ketchup_blade writes "As RSS becomes better known to mainstream users and the press, the bandwidth issue related to feeds that many sites (Eweek, CNet, InternetNews) have reported is becoming a reality. Stats from sites like Boing Boing show real cause for concern about feed bandwidth usage. Possible solutions to this problem are emerging slowly, like RSScache (feed caching proxy) and KnowNow (event-driven syndication). RSScache seems to offer a realistic solution to the problem, but can this be enough to help RSS as it reaches an even bigger user base in the upcoming year?"

6 of 351 comments

  1. Solutions by markfletcher · · Score: 5, Informative

    There are several ways to mitigate the bandwidth issues. First, all aggregators should support gzip compression and the HTTP Last-Modified and ETag headers. That'll take care of a lot of the problems. The other solution is to get people to use server-based aggregators, like Bloglines, which only fetch a feed once per iteration, regardless of how many subscribers there are. As a bonus, there are several things that server-based aggregators can do that desktop-based aggregators can't, like provide personalized recommendations. I like this solution, but of course I'm biased since I'm the founder of Bloglines. :)
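    To illustrate the "fetch once per iteration" idea, here is a minimal sketch of a server-side poller. It is not how Bloglines is actually built; the class name SharedFeedPoller and the injected fetch callable are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class SharedFeedPoller:
    """Fetch each subscribed feed once per polling pass, however many users follow it."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # Hypothetical fetcher supplied by the caller: feed URL -> feed body.
        self._fetch = fetch
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, user: str, feed_url: str) -> None:
        self._subscribers[feed_url].add(user)

    def poll_once(self) -> Dict[str, List[str]]:
        """Return {user: [feed bodies]} after one pass over the unique feeds."""
        inbox: Dict[str, List[str]] = defaultdict(list)
        for feed_url, users in self._subscribers.items():
            body = self._fetch(feed_url)  # one upstream request per feed
            for user in sorted(users):
                inbox[user].append(body)  # fan the same copy out to every subscriber
        return dict(inbox)
```

    With two users on the same feed, the publisher sees one request per pass instead of two; the saving grows with the subscriber count.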

  2. Re:RSS readers don't cache! by maskedbishounen · · Score: 5, Informative

    To some extent, this could be blamed on the feed itself. Ideally, it works like this:

    When you request the feed, the server first sends back the normal HTTP headers. If properly configured, it will return a 304 if you already have the most recent version -- however, as many feeds are generated in PHP[1], this behavior is off by default, and you'll end up with your standard 200, or go-ahead, code and the full feed body. This single-handedly wastes a metric tonne of bandwidth.

    With a proper 304, even if you're hammering a feed, you'll only be wasting a few hundred bytes at most every half hour, rather than the whole 50K or whatever size it is.

    See here for a more detailed explanation.

    [1] This is not a PHP specific issue; a lot of dynamic content, and even static content, fails to do this properly. But this is what it's there for, after all.
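    For dynamically generated feeds, the 304 logic has to be done by hand. A rough sketch of the server side, assuming the feed body is already in memory: the function names make_etag and respond are invented for this example, header names are assumed to arrive in canonical case, and weak validators (W/"...") are not handled.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive an ETag from the feed bytes (one common choice, not the only one)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(request_headers: Dict[str, str], body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a feed request."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    # If-None-Match may carry several comma-separated tags, or "*".
    client_tags = [t.strip() for t in request_headers.get("If-None-Match", "").split(",")]
    if etag in client_tags or "*" in client_tags:
        return 304, headers, b""  # client copy is current: send headers only
    return 200, headers, body  # no match: send the whole feed
```

    The 304 branch returns an empty payload, which is where the saving of "a few hundred bytes" versus the whole feed comes from.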

    --
    "An infinite number of monkeys typing into GNU emacs would never make a good program."
  3. Re:RSS readers don't cache! by IO+ERROR · · Score: 4, Informative
    For instance, the GPL blog software WordPress doesn't do ANY caching.

    Technically true but misleading. WordPress allows user agents to cache the RSS/Atom feeds, and will only serve a newer copy if a post has been made to the blog since the time the user agent says it last downloaded the feed. Otherwise it sends a 304. This is in 1.3-alpha5. I dunno what 1.2.1 does.
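    The date-based check described above can be sketched roughly as follows. This is not WordPress's code; feed_status and last_modified_header are hypothetical names, and last_post is assumed to be a timezone-aware datetime for the newest post.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def last_modified_header(last_post: datetime) -> str:
    """Format the newest post time as an HTTP date for the Last-Modified header."""
    return format_datetime(last_post.astimezone(timezone.utc), usegmt=True)


def feed_status(if_modified_since: Optional[str], last_post: datetime) -> int:
    """Return 304 if nothing was posted after the client's timestamp, else 200."""
    if not if_modified_since:
        return 200  # unconditional request: send the feed
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: fall back to a full response
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    return 304 if last_post.replace(microsecond=0) <= client_time else 200
```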

    Not to mention, a lot of these RSS readers are big sites like Bloglines, NewsGator, etc. who should be respecting bandwidth limits, but really have no incentive to do so.

    Not coincidentally, these are the egregious worst offenders I mentioned. Bloglines grabs my RSS2 and Atom feeds hourly, and doesn't cache or even pretend to. Firefox Live Bookmarks appears to cache feeds, but your aggregator plugins might not. I can't (yet) tell the difference from the server logs between Firefox and the various aggregator plugins.

    The best ones are the syndication sites that only grab my feeds after being pinged. Too bad I can't ping everybody. That could solve the problem if there was some way to do that.

    --
    How am I supposed to fit a pithy, relevant quote into 120 characters?
  4. Re:They just need to follow ./'s lead by interiot · · Score: 4, Informative

    You know what happens then? The same thing they do when you hamper your RSS feed in any other way: they scrape your HTML and create their own feeds. Slashdot doesn't monitor their front page as closely as they do their RSS page, so you can get away with quite a bit of abuse, at least for a while. They've blacklisted my IP occasionally when I got overzealous, though.

  5. Re:They just need to follow ./'s lead by Electroly · · Score: 5, Informative

    HTTP 1.1 already supports this. A conditional HTTP request can be made which basically asks the server whether the file has been updated. The server can then respond with a 304 Not Modified and avoid sending the entire RSS file again. Unfortunately, poorly written RSS aggregators don't implement this, and it is those aggregators that are the real problem here. They typically are the ones with the default 5-minute update time, too.
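    On the aggregator side, a conditional request mostly comes down to remembering the validators from the last 200 and sending them back. A minimal sketch with the network layer left out; CachedFeed, conditional_headers and apply_response are names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedFeed:
    body: bytes
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(cached: Optional[CachedFeed]) -> Dict[str, str]:
    """Build the request headers that make the next fetch conditional."""
    headers: Dict[str, str] = {}
    if cached and cached.etag:
        headers["If-None-Match"] = cached.etag
    if cached and cached.last_modified:
        headers["If-Modified-Since"] = cached.last_modified
    return headers


def apply_response(cached: Optional[CachedFeed], status: int,
                   headers: Dict[str, str], body: bytes) -> CachedFeed:
    """Update the cache from a response: keep the old copy on 304, replace it on 200."""
    if status == 304 and cached is not None:
        return cached  # nothing new: reuse the stored feed
    if status == 200:
        return CachedFeed(body, headers.get("ETag"), headers.get("Last-Modified"))
    raise ValueError(f"unexpected HTTP status {status}")
```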

  6. Slashdot's RSS blocking policy by jamie · · Score: 4, Informative
    Slashdot blocks your IP from accessing RSS if you access our site more than fifty times in one hour. I think that's reasonable, don't you? Especially since our FAQ tells you to request a feed only twice an hour.
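    A per-IP limit of this kind can be sketched with a sliding window. This is only an illustration, not Slashdot's actual code: HourlyLimiter is a made-up name, and the sketch simply refuses requests while an IP is over the limit rather than modeling how long a block lasts.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class HourlyLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests: int = 50, window_seconds: float = 3600.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the logic can be exercised without waiting an hour.
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: refuse this request
        hits.append(now)
        return True
```

    A reader that follows the FAQ and polls twice an hour never comes near the limit; one polling every minute is refused before the hour is out.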

    Every complaint about this that I've investigated has turned out to be either a broken RSS reader or an IP that's proxying a ton of traffic (which we usually do make an exception for).

    Oh, and if you want to read sectional stories in RSS, then:

    • create a user if you haven't already,
    • edit your homepage to include sectional stories you like (and exclude those you don't),
    • then reload the homepage and copy that "rss" link at the very bottom of the page. It will be customized to your exact specs!

    Slashdot's RSS traffic, like Boing Boing's, is huge, and blocking broken readers has saved us a ton of bandwidth, which of course means money. We were one of the first sites to do this but (as this story suggests) you'll see a lot more sites doing it in the future. I think our policy is fair.