Slashdot Mirror


Is RSS Doomed by Popularity?

Ketchup_blade writes "As RSS becomes better known to mainstream users and the press, the bandwidth issue reported by many sites (Eweek, CNet, InternetNews) related to feeds is becoming a reality. Stats from sites like Boing Boing show a real concern regarding feed bandwidth usage. Possible solutions to this problem are emerging slowly, like RSScache (feed caching proxy) and KnowNow (event-driven syndication). RSScache seems to offer a realistic solution to the problem, but can this be enough to help RSS as it reaches an even bigger user base in the upcoming year?"

20 of 351 comments

  1. rsstorrent will solve it all by RangerWest · · Score: 4, Interesting

    rsstorrent -- distributed RSS, echoing BitTorrent?

  2. Limit download to new content by zoips · · Score: 5, Interesting

    Instead of downloading the entire RSS feed every time, why not have aggregators indicate to the server the timestamp of the last time the RSS feed was downloaded, or the timestamp of the last item in the feed the aggregator knows about, and then the server can dynamically generate the RSS with only new content for that client. Increases processing load while reducing bandwidth, but processing time is what most servers have lots of, not to mention it's far cheaper to increase than bandwidth.
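A minimal sketch of the "only send what is new" idea above, assuming the aggregator passes the timestamp of the newest item it already holds. The `Item` type and the `items_since` function are hypothetical names used for illustration, not part of any RSS library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Item:
    title: str
    published: datetime


def items_since(items, last_seen):
    """Return only the items published after the client's last-seen timestamp.

    If the client sends no timestamp (last_seen is None), return everything.
    """
    if last_seen is None:
        return list(items)
    return [item for item in items if item.published > last_seen]


feed = [
    Item("Story A", datetime(2004, 12, 1, 9, 0, tzinfo=timezone.utc)),
    Item("Story B", datetime(2004, 12, 1, 12, 0, tzinfo=timezone.utc)),
    Item("Story C", datetime(2004, 12, 1, 15, 0, tzinfo=timezone.utc)),
]

# A client that last fetched at 10:00 only needs stories B and C.
new_items = items_since(feed, datetime(2004, 12, 1, 10, 0, tzinfo=timezone.utc))
print([item.title for item in new_items])
```

The trade-off is the one the comment names: the server does a little filtering work per request in exchange for sending fewer bytes.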

  3. Not a problem with RSS.. just humans. by dustinbarbour · · Score: 4, Interesting

    RSS feeds are meant as a way to strip all the nonsense from a site and offer easy syndication, right? Basically, present the relevant news from a full-fledged webpage in a smaller file size? If such is the case, this isn't an RSS issue, really. I see it more as a bandwidth issue. I mean, people are going to get their news one way or the other.. either with a bunch of images and lots of markup via HTML or with just the bare minimum of text and markup via RSS. I would prefer RSS over HTML any day of the week! But perhaps RSS makes syndication TOO simple. Thus everyone does it and that eats additional bandwidth that normally would be reserved for those browsing the HTML a site offers.

    And you could implement bans on people who request the RSS feed more than X times per hour as someone suggested (Doesn't /. do this?), but I don't think that gets around the bandwidth issue. I mean, those who want the news will either go with RSS or simply hit the site. Again, RSS is the preferred alternative to HTML.

    So here's my suggestion.. go to nothing but RSS and no HTML!
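For reference, the "more than X times per hour" ban mentioned above could look roughly like this. It is only a sketch: `HourlyLimiter` is a made-up name, the window is a simple sliding one, and nothing here reflects how Slashdot actually does it.

```python
from collections import defaultdict, deque


class HourlyLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds=3600):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client, now):
        hits = self._hits[client]
        # Drop requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the server would refuse this request
        hits.append(now)
        return True


limiter = HourlyLimiter(max_requests=2)
# Times are in seconds; the third request arrives while two are still in the window.
results = [limiter.allow("10.0.0.1", t) for t in (0, 600, 1200, 3700)]
print(results)
```

As the comment says, this caps abuse from individual clients but does not reduce the total demand for the content.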

  4. Re:Usenet? by Anonymous Coward · · Score: 1, Interesting

    Exactly. The Web has long needed a newsfeed-style protocol that defines a path of caches that distribute data. It need not be quite as rigid as the Usenet setup. For example, the initial fetch of data by user request could establish the cache path, rather than having it explicitly administered. But the core idea of pushing a lot of data a lot of people want closer to their node makes sense. The "ball of endpoints" view of the 'Net unfortunately does not encompass this sort of distribution network.

    Current web caches tend to be local to the user (their ISP) or to the source (Akamai hosting, etc). It's the intermediate caches that are missing.

  5. RSS + Bittorrent -- works for Podcasts... by Spoing · · Score: 2, Interesting
    Or, is in the works now on Dave Slusher's Evil Genius Chronicles Podcast. [Podcasts = RSS subscription feeds for time-shifted radio blogging.]

    The Podcasters need it too. I'm subscribed to a couple dozen feeds and have well over 4GB of files in my cache right now.

    The biggest problem with Bittorrent and podcasts is that RSS aggregators need to be Bittorrent-aware. Unfortunately, few are.

    --
    A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
  6. what's wrong with the old subscription model? by Trepidity · · Score: 2, Interesting

    When I want updates from sites, I subscribe to an email feed, and stick it in its own mailbox. I agree that some standardized format and display would be nice, but you can send XML over email too, so what's needed is a reader that I can point to an IMAP mailbox full of XML mails.

    An alternate approach would be to do the same thing with a news server. Why keep refreshing a feed for updates instead of letting it notify you when it has updates?

  7. Re:Duh... Simple solution by moojj · · Score: 2, Interesting

    I think the biggest reason people are offering RSS feeds is because it's a standard XML file on the webserver. No need to make additional scripts, no need to set up additional services -- just upload the XML file. When you start complicating the "Really Simple Syndication" model you start making it less simple. In my opinion the easiest way to limit bandwidth is to supply the XML file on servers that support gzip compression and the "ETag" header. This way RSS readers download a compressed XML file, and only when it has been modified. Larger sites could go one step further and ban polling by RSS clients that don't support the ETag check before re-downloading the XML file. Then, there's always the obvious solution: cut down the number of items inside the XML file, thus lowering the amount of bandwidth per hit.
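A rough sketch of the gzip-plus-ETag behaviour described above, written as a plain function rather than a real web server so it can be run anywhere. `make_etag` and `serve_feed` are illustrative names, and the content-hash ETag is an assumption (real servers often derive it from file metadata). It also glosses over the detail that a compressed and an uncompressed response would ideally carry different ETags.

```python
import gzip
import hashlib


def make_etag(body: bytes) -> str:
    # An ETag derived from the content; any stable fingerprint would do.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def serve_feed(feed_xml: bytes, if_none_match=None, accepts_gzip=False):
    """Return (status, headers, body) for one feed request."""
    etag = make_etag(feed_xml)
    if if_none_match == etag:
        # The client already has this version: send headers only.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Type": "application/rss+xml"}
    body = feed_xml
    if accepts_gzip:
        body = gzip.compress(feed_xml)
        headers["Content-Encoding"] = "gzip"
    return 200, headers, body


feed_xml = (b"<rss><channel>"
            + b"<item><title>Story</title></item>" * 50
            + b"</channel></rss>")

# First poll: full (compressed) download.
status, headers, body = serve_feed(feed_xml, accepts_gzip=True)
# Second poll: the client echoes the ETag and gets an empty 304 back.
status2, _, body2 = serve_feed(feed_xml, if_none_match=headers["ETag"],
                               accepts_gzip=True)
print(status, len(feed_xml), len(body), status2, len(body2))
```

An unchanged feed then costs only a few hundred bytes of headers per poll, which is where most of the saving comes from.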

  8. Re:Push by ikewillis · · Score: 2, Interesting

    http://beacon.sf.net/ tries to do this using UDP and filesystem monitoring. It waits for the RSS document to change then sends a UDP datagram to notify everyone that a new version is available. It's better than everyone polling the server via HTTP anyway.
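To illustrate the general shape of a UDP "feed changed" notification, here is a small loopback demo. This is not Beacon's actual wire format or API: the `UPDATED <url>` message and the `notify_subscribers` function are invented for the example.

```python
import socket


def notify_subscribers(subscribers, feed_url):
    """Send one small datagram to each (host, port) saying the feed changed."""
    message = ("UPDATED " + feed_url).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for address in subscribers:
            sock.sendto(message, address)


# Demonstration on the loopback interface: one listener, one notification.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.settimeout(2.0)

notify_subscribers([listener.getsockname()], "http://example.com/index.rss")
data, _ = listener.recvfrom(1024)
listener.close()
print(data.decode("utf-8"))
```

UDP gives no delivery guarantee, so a real client would presumably still poll occasionally as a fallback.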

  9. Solution: RSS over Usenet news by NZheretic · · Score: 4, Interesting
    One solution would be to use an existing infrastructure that was built for flood filling content - the Usenet news server network.

    Create a new first level domain ( like alt, comp, talk etc ) named "rss" and use an extra header to identify the originating rss feed URL. The latter header could be used by the RSS/NNTP reader to select which article bodies to download and to verify each RSS entry to identify fake posts.
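As a sketch of the "extra header" idea above, the snippet below wraps one RSS item as a news-style article using Python's standard `email` package. The header name `X-RSS-Feed-URL`, the newsgroup name, and the `rss_item_to_article` helper are all hypothetical; the comment does not specify any of them.

```python
from email.message import EmailMessage


def rss_item_to_article(feed_url, newsgroup, title, link, summary):
    """Wrap one RSS item as a Usenet-style article (headers plus body)."""
    article = EmailMessage()
    article["Newsgroups"] = newsgroup
    article["Subject"] = title
    article["From"] = "feed-gateway@example.invalid"   # placeholder poster
    article["X-RSS-Feed-URL"] = feed_url               # hypothetical header name
    article.set_content(summary + "\n\n" + link + "\n")
    return article


article = rss_item_to_article(
    feed_url="http://example.com/index.rss",
    newsgroup="rss.example.com",
    title="Is RSS Doomed by Popularity?",
    link="http://example.com/story/1",
    summary="Feed polling is eating bandwidth.",
)

# A reader could decide from the header alone whether to fetch the body.
subscribed = {"http://example.com/index.rss"}
wanted = article["X-RSS-Feed-URL"] in subscribed
print(article.as_string())
```

Verifying that a post really came from the feed owner would need something more, such as a signature, which this sketch leaves out.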

  10. Swarming (Like BitTorrent) is the answer by MS_leases_my_soul · · Score: 4, Interesting

    This still baffles me. BitTorrent works great for distributing media like ISOs. Folks, it can distribute "little" stuff, too.

    A content creator (say Slashdot) has webpages and it has an RSS feed. They create a torrent for each page. They sign the RSS file and each torrent (and its content) with a private key. They post their public key on their homepage.

    Now, you can cache the RSS file on other sites that support you yet the users can still be confident that it really came from you. Inside the RSS file, users can try to get the webpage (and all its images, etc.) through the torrent first. When the page loads locally in your browser, it could still go out and get ads if you are an ad sponsored site.

    If you are a popular site and have a "fan base", you should have no problem implementing something along these lines. If you are a site that has these problems, you are probably popular and have a fan base. Given the right software and the buy-in from users, the problem solves itself.

  11. Re:Push by rlanctot · · Score: 2, Interesting

    My suggestion is to revamp RSS to use a P2P format of publishing, so you spread out the load.

  12. You're talking application-level by mveloso · · Score: 4, Interesting

    Well, RSS was simple, and everything you're talking about (caching, push-based update, etc.) is an application-level issue. Even though that stuff is defined in HTTP 1.1, it took years for HTTP 1.1 to come out.

    If the web started with HTTP 1.1, it would never have gone anywhere because it's too complicated. There are parts of 1.0 that probably aren't implemented very well.

    If you want to improve things, adopt an RSS reader project and add those features.

  13. Re:Push by jasonwea · · Score: 2, Interesting

    This seems far better than the UDP notification idea. Port forwarding for an RSS feed? No thanks.

    There is almost always a DNS cache at the ISP, so the polling interval can be completely controlled by the TTL of the record. This uses the existing distributed caching of DNS, whereas a large percentage of users are not behind HTTP caches.

    I see two potential problems with this idea:

    1. A lot of people are stuck behind HTTP proxies with limited or no DNS. This isn't too bad as they could fall back to the current system.

    2. Access to the DNS server zone file. Unless you are running your own server, this might be a difficult thing to do as a lot of hosts do not allow direct access to the zone file and would probably frown on lots of changes to the file. If you have a static IP address you could host your own DNS server to get around this however. For someone with bandwidth problems from RSS feeds, this is unlikely to be an issue.

  14. Jabber and/or BitTorrent ! by Anonymous Coward · · Score: 1, Interesting

    Well, I've been thinking about this since RSS first came on my radar a few years ago, and it seems to me something like Jabber might be part of the solution.

    I.e., instead of polling slashdot.org every hour, you maintain a persistent connection to your local Jabber server (at your ISP perhaps), which registers with slashdot's Jabber server. When a new story is published, slashdot's server notifies all the registered servers with the new story, which then distribute the content to each local news reader.

    It would look and feel just like RSS readers do today.

    And you might think "that's too complicated".. well, RSS today works over an HTTP server that you have to install and maintain, and you don't worry about the details of HTTP, why not a Jabber server?

    I wish folks would think of this stuff BEFORE they start using RSS, but what can ya do.

    Also, BitTorrent could be involved somehow to reduce bandwidth even more. For instance in the example above, Slashdot wouldn't have to distribute to EVERY listener, just enough for them to start downloading from each other.

    All it takes is for a couple of the big RSS reader authors to add this and it will happen.. you just gotta have the guts to try. Next version of Mac OS will have Jabber libraries built-in I believe.. there's your chance!!!!
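The server-to-server fan-out described in this comment can be modelled in a few lines. This is a toy in-memory model, not Jabber/XMPP code: `Publisher` and `LocalServer` are invented stand-ins for the origin site's server and the ISP-side servers.

```python
class LocalServer:
    """Stands in for an ISP-side server that relays stories to its readers."""

    def __init__(self, name):
        self.name = name
        self.readers = {}          # reader name -> list of stories delivered

    def add_reader(self, reader):
        self.readers[reader] = []

    def deliver(self, story):
        for inbox in self.readers.values():
            inbox.append(story)


class Publisher:
    """Stands in for the origin site's server."""

    def __init__(self):
        self.registered = []
        self.messages_sent = 0

    def register(self, server):
        self.registered.append(server)

    def publish(self, story):
        # One message per registered server, not one per reader.
        for server in self.registered:
            server.deliver(story)
            self.messages_sent += 1


origin = Publisher()
isp_a, isp_b = LocalServer("isp-a"), LocalServer("isp-b")
for reader in ("alice", "bob", "carol"):
    isp_a.add_reader(reader)
isp_b.add_reader("dave")
origin.register(isp_a)
origin.register(isp_b)

origin.publish("New story")
print(origin.messages_sent, isp_a.readers, isp_b.readers)
```

Four readers receive the story while the origin sends only two messages, which is the bandwidth argument the comment is making.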

    1. Re:Jabber and/or BitTorrent ! by hildjj · · Score: 2, Interesting
      And here's a first cut at an Internet draft to make it happen. Very small amounts of code, if you have a pubsub service already.

      http://xmpp.org/drafts/draft-saintandre-atompub-notify-01.html

  15. Re:Slashdot's RSS blocking policy by jamie · · Score: 2, Interesting
    The limit was bumped up a couple months ago, I don't remember exactly when. (And if abuse gets worse, of course, we'll take it back down... but hopefully in 2004 we're no longer on the bleeding edge and client application authors will get more friendly...)

    If you'd like me to check it out, I will. I've set up a Firefox live bookmark for myself and I'll check the logs for my own accesses and see what happens. If you do the same and get banned, go ahead and email me directly -- as soon as possible so our logs don't roll over -- and I'll take a look.

  16. Re:Slashdot's RSS blocking policy by bill_mcgonigle · · Score: 2, Interesting

    Sorry, I goofed, that feature I described is subscriber-only.

    That's OK, I'm a subscriber... still don't see how the custom RSS works. From my RSS reader how does Slashdot know I'm a subscriber? Special URL?

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  17. Re:Push by Jahf · · Score: 2, Interesting

    The problem with many of these mechanisms is that (as you mention) smaller sites may not have the facilities to do it.

    On the other hand it seems like everyone and their dog can do P2P.

    A P2P-ish RSS system that:

    * Attempts to make each client capable (but not always used) of functioning as a caching server for the feed

    * Has a top-level owner of a feed who has sole rights to update the feed. Perhaps passing public/private keys with the feed to ensure no tampering. Anyone who wanted to subscribe to the feed would need to connect to the top-level one time to get the keys before using RSS-P2P caches.

    * Hopefully has some intelligence to determine the closest feasible cache (perhaps based on # of hops and # of retries) so that we are peering out bandwidth usage as best as possible

    * Use a standard port and open protocol such that a large organization can route any RSS-P2P requests through a main RSS-P2P cache at the router (further enhancing the ability to minimize traffic ... and also giving a polite way for an organization to shut it off ... just like HTTP)

    * Possibly can push a "refresh notification" packet to any clients that have connected to the cache ... if a client fails to pull a refresh after X # of notification packets, assume it went away ... push a "norefresh notification" every X (minutes|hours|etc) to make sure that the client knows the cache is still viable ... if the client doesn't get a (norefresh|refresh) notification after X number of (minutes|hours|etc) then assume that the server has gone down and find a new one

    * Probably obvious but the RSS-P2P cache would be able to select which caches it wanted to host (though I can see use for a mode where it is told to proxy and cache any RSS-P2P request it receives)

    * Since there are existing RSS (not RSS-P2P) setups out there, we could possibly enhance them by allowing the RSS-P2P cache to speak and send RSS over existing mechanisms (HTTP). Further, any RSS-P2P cache that has this mode enabled could, if willing, send a notification to the top-level RSS-P2P server (which would always be maintained by the authoritative feed owner) and be added to a round-robin DNS for the normal RSS feeds so that it helps share the load for normal RSS as well. Only people willing to be "supercaches" would do this, but it allows larger sites to help spread the load.
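The "refresh / norefresh notification" bullet in the list above amounts to a heartbeat with a timeout. A minimal client-side sketch, assuming time is passed in explicitly as seconds; `CacheWatcher` and its method names are hypothetical, since the proposal defines no concrete protocol.

```python
class CacheWatcher:
    """Client-side view of one cache, driven by the notifications it sends."""

    def __init__(self, timeout_seconds, now):
        self.timeout_seconds = timeout_seconds
        self.last_heard = now
        self.needs_pull = False

    def on_notification(self, kind, now):
        # kind is "refresh" (feed changed) or "norefresh" (alive, no change).
        if kind not in ("refresh", "norefresh"):
            raise ValueError("unknown notification: " + kind)
        self.last_heard = now
        if kind == "refresh":
            self.needs_pull = True

    def cache_is_alive(self, now):
        # No notification of either kind within the timeout: assume it is gone.
        return now - self.last_heard <= self.timeout_seconds


watcher = CacheWatcher(timeout_seconds=900, now=0)
watcher.on_notification("norefresh", now=600)
alive_at_1000 = watcher.cache_is_alive(now=1000)   # last heard 400 s ago
watcher.on_notification("refresh", now=1200)
alive_at_2500 = watcher.cache_is_alive(now=2500)   # silent for 1300 s
print(alive_at_1000, watcher.needs_pull, alive_at_2500)
```

When `cache_is_alive` turns false, the client would go looking for another cache, as the bullet suggests.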

    Or I could be way off base. Been known to happen.

    --
    It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.
  18. Actually, this is a more general xml problem by evil_one666 · · Score: 2, Interesting

    XML munches up bandwidth like a lardy butter lover. Yes, yes, RSS feeds are handy, but they don't actually do anything that couldn't be achieved with a much leaner binary format. It's 2004, we don't have byte compatibility issues any more.

    See Roedy Green's (one-time comp.java.lang FAQ maintainer) excellent essay on why XML causes these problems.

  19. Miski: client2server2server2client by Philip+Dorrell · · Score: 2, Interesting

    In 2000 I tried to invent a spam-proof Usenet. The result of my efforts was Miski. The idea of Miski was that users would have addresses on servers representing what are effectively RSS channels, and other users would subscribe to these channels through their servers. There would be a DNS extension for the naming of servers. Channels would have names like username@example.com/"Java Programming". The system would be spam-proof because your server would only send you what you had subscribed to. It would be "push", because as soon as you posted something to a channel, your server would pass the message on to the servers of those who had subscribed to your channel. Only the notifications would be push: ordinary HTTP would be used to retrieve the actual content.

    Miski also had the important concept of "reposting", whereby if you saw something you liked, you could press a single button in your client to repost the notification, so that any subscribers to you could know about the item being reposted, if they had not already heard about it from somewhere else. The presumption was that the client (or the reader's server) would trim out duplicates, so that people posting would have no inhibitions about reposting stuff that maybe many of their subscribers already knew about.
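The duplicate trimming that makes reposting harmless can be sketched as a set of already-seen items. Miski was never implemented, so this is purely illustrative: the `Inbox` class and the use of the item URL as the identity key are assumptions.

```python
class Inbox:
    """Collects notifications, keeping only the first copy of each item."""

    def __init__(self):
        self._seen = set()
        self.items = []            # (item_url, who it arrived via)

    def receive(self, item_url, via):
        if item_url in self._seen:
            return False           # duplicate repost: trimmed
        self._seen.add(item_url)
        self.items.append((item_url, via))
        return True


inbox = Inbox()
inbox.receive("http://example.com/idea", via="author")
inbox.receive("http://example.com/idea", via="friend-1")   # repost, dropped
inbox.receive("http://example.com/other", via="friend-2")
print(inbox.items)
```

Because the reader never sees the second copy, posters have no reason to hold back on reposting.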

    Miski was more than just an attempt to create scalable-push RSS, or a spam-proof equivalent of Usenet: it was a vision of the "global brain". Using posting and reposting, notification of a new "interesting" idea could spread very quickly from the inventor of the idea to almost anyone in the world likely to be interested in that idea, even if the inventor was not well known. We would all be like neurons in the brain, with signals passing from one person to the next as fast as possible. It was an attempt to solve the dual problems of "How can I tell the world what I have to say when I have to compete against the efforts of all those other people trying to tell the world stuff?" and "How can I find out new stuff that's really interesting to me from among all this junk that I am getting from all these people trying to tell stuff to the world?".

    I asked the question "How fast is the Internet?" Although packets can travel from one computer to another in seconds, or even less, information can still take days, weeks, months or even years to travel from the person who created it to another person who is interested in it. One way to measure this is to consider how often you find a document on the web which is interesting, but which you did not know about, and which has nevertheless been available for months or years, and which would have been interesting to you even when it was originally posted on the web.

    Sadly Miski was never implemented, and I reduced my ambitions to write Womcat Bookmarks, which attempted to be a less dynamic version of Miski, but has ended up being just another RSS reader.

    --
    Music: a super-stimulus for the perception of musicality. Musicality: a perceived aspect of speech.