Slashdot Mirror


RSS & BT Together?

AntiPasto writes "According to this Yahoo! News article, RSS and BitTorrent could be set to join in a best-of-both-worlds content management system for the net. Possible?" Update: 03/17 21:39 GMT by T : Thanks to Steve Gillmor, here's the original story on eWeek to replace the now-dead Yahoo! link.

24 of 161 comments

  1. RSS polling intervals by tcopeland · · Score: 4, Interesting

    "Now, should an aggregator be polling every 30 minutes? The convention early
    on was no more than once an hour. But newer aggregators either never heard of
    the convention or chose to ignore it. Some aggregators let the users scan
    whenever they want. Please don't do that. Once an hour is enough. Otherwise
    bandwidth bills won't scale."


    Hm. That's interesting. The RubyForge RSS feeds get polled every
    half hour by a couple folks, i.e.:
    [tom@rubyforge httpd]$ tail -10000 access_log | grep "16/Dec" | grep export |
    grep 66.68 | wc -l
    19
    [tom@rubyforge httpd]$
    Hasn't caused problems yet, but maybe that's because RubyForge only gets about
    30K-40K hits per day, and the feeds get just a fraction of that.
    1. Re:RSS polling intervals by scrytch · · Score: 4, Insightful

      Of course it hasn't caused any problems. It's a couple folks every half hour. Try a few thousand folks every minute (imagine it's a metaserver for some online game, or a blog during a major news event).

      Still, I'm not seeing anything beyond the "duh" factor here. All that needs to happen is for browsers to handle torrent links. Not some souped up napster app, a browser, so that I can type in a torrent link and get any web page (or other mime doc) for the browser to handle. Change the RSS to use the new URL scheme, and there you go. You could also do it as a proxy, but you run into worse cache coherency issues than with direct support of the protocol; who's to say who has the correct mapping of the content url to the torrent url?

      Good luck, mind you, on getting anything but blogs, download sites, and perhaps hobby news sites to jump on board. This issue has been beaten to death in the IETF and many other circles, and it all boils down to content control -- the NY Times simply doesn't want its content mirrored like that.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    2. Re:RSS polling intervals by costas · · Score: 4, Informative

      The real problem isn't the polling intervals; it's that most RSS readers/spiders do not respect HTTP 304 (Not Modified). RSS is ideal for ETag/If-Modified-Since behavior, but no, most spiders are still too lazy to implement this.

      My newsbot (in my .sig) creates dynamic RSS feeds, customized for each agent; I thought that was a great feature to give users, but it's getting overused by some spiders hitting the site every 15-20 minutes, w/o listening for 304s...
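      A minimal sketch of what a spider that does listen for 304s might look like, using only the Python standard library. The function names and the idea of storing the validators between polls are assumptions for illustration, not anything the newsbot itself is known to do.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the server answers 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the cached copy and its validators
            return None, etag, last_modified
        raise
```

      The caller keeps the returned ETag and Last-Modified values and passes them back in on the next poll.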

    3. Re:RSS polling intervals by bongoras · · Score: 4, Informative

      1, BT lets you throttle your upload now. 2, if you do it, your download is also throttled. 3, if you want to modify btdownload.py so that it lies about how much it's uploading in an effort to get faster downloads, have fun. It won't help you because BT itself doesn't trust what the client says, it still sends only as fast as it's getting.

    4. Re:RSS polling intervals by welsh+git · · Score: 4, Insightful

      > A well behaved program won't do GETs on every RSS page, but will do HEADS,
      > compare them to what it already has, and decide from there
      > to get or not get the new page.

      An even better behaved program will issue a GET with the "If-Modified-Since:" header, so the server returns "304 Not Modified" if the file hasn't changed, or the actual file if it has, thus doing in one operation what a HEAD plus follow-up GET would take two to do.
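      The server side of that exchange could be sketched roughly as follows. The `respond` function and its return shape are hypothetical, and real servers handle more cases (ETags, ranges, and so on); this only shows the 304-versus-200 decision.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(if_modified_since, last_modified, body):
    """Return (status, headers, payload) for a GET on a feed.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: timezone-aware UTC datetime of the feed's last change.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= client_time:
                return 304, headers, b""
        except (TypeError, ValueError):
            pass  # unparseable or zone-less date: fall back to a full response
    return 200, headers, body


# Hypothetical feed that last changed on 16 Dec 2003 at 10:00 GMT.
feed_changed = datetime(2003, 12, 16, 10, 0, 0, tzinfo=timezone.utc)
status, _, payload = respond("Tue, 16 Dec 2003 10:00:00 GMT", feed_changed, b"<rss/>")
print(status, len(payload))
```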

      --
      Sig out of date
  2. Neat idea. by grub · · Score: 5, Interesting


    This could be carried further into a whole indymedia via BT. It would be even harder for governments and industry to silence dissident voices.

    --
    Trolling is a art,
    1. Re:Neat idea. by STrinity · · Score: 4, Funny

      This could be carried further into a whole indymedia via BT. It would be even harder for governments and industry to silence dissident voices.

      A couple weeks back, Indymedia had an article saying that the Protocols of Zion were created by the Illuminati to throw blame on the Jews while they take over the world.

      There's a fine line between being a dissident and wearing a tin-foil hat, and many of the guys at Indymedia are squarely on the wrong side.

      --
      Les Miserables Volume 1 now up with my reading of
  3. I highly doubt it. by junkymailbox · · Score: 3, Insightful

    The article's idea is simply to make the web (or at least the RSS feeds) distributed and then query the distributed servers, so the refresh interval can drop below 30 minutes. But the distributed servers need to be updated too. It may be cheaper and more efficient to simply run more servers.

  4. I'd rather see BitTorrent improved in more... by clifgriffin · · Score: 4, Insightful

    ...practical ways. It's a nice program, and I've used it on occasion, but it does have its share of bugs.

    And setting up a server isn't quite easy.

    It really could be a lot better with some work.

    1. Re:I'd rather see BitTorrent improved in more... by PierceLabs · · Score: 3, Insightful

      There are too many steps involved. What's needed is the ability to put content into a deploy directory and have it just get torrented and distributed.

      The other problem being the relative difficulty of actually finding those 'random' websites that contain links to the things you'd actually want to download.

  5. Ummm... by leifm · · Score: 4, Funny

    I'll believe it when I see it. This idea has been circulating the last few days through the blog world, the same people who think they're going to crush traditional media with the sheer power of their blogs. I say whatever.

    --

    "Windows Me offers tremendous reliability and stability improvements..." -- Paul Thurott
  6. Re:To me, by BillFarber · · Score: 3, Funny

    mmm, I love a fruit salad!

  7. BitTorrent is no-go for small files.. by dk.r*nger · · Score: 5, Interesting

    BitTorrent doesn't scale for very small downloads (less than a few MB, I'd say), due to the tracker.

    The tracker keeps, well, uhm, track of the available pieces of the file, and every client reports in every time it has got, or failed to get, a piece. So, using BitTorrent to distribute RSS feeds won't work, because the tracker traffic will take up as much bandwidth, if not more, as an HTTP request that results in a "Not changed since your version" response.

    Apart from that, well, yes, BitTorrent is great to distribute large files :)

  8. Re:If he wants to save bandwidth. by djh101010 · · Score: 3, Informative

    A base Akamai contract starts at $2,000 a month for a 1Mb/second bandwidth allowance. Not sure if many/any Open Source projects have a budget for such.

    Akamai is great for offloading bandwidth and speeding up customers' page load times, but you're paying for the bandwidth one way or another.

  9. Konspire2b by Dooferlad · · Score: 5, Informative

    Konspire2b looks like a better option than BitTorrent for distributing news. You could have a channel mapping to an RSS feed and just wait for the news to come to you. No polling intervals and low bandwidth requirements for the operator. With BitTorrent you still have to poll for updates and this removes that requirement.

  10. why not nntp for syndication? by ph00dz · · Score: 5, Interesting

    I always thought that syndicators should take advantage of the current distributed architecture of the newsgroups to syndicate their content... but hey, maybe that's just me. The only real problem is one of authentication -- since you're downloading content from a publicly accessible source one would have to come up with some clever way of making sure you're grabbing content from the author you choose.

  11. IRC by Bluelive · · Score: 4, Interesting

    Using RSS polling seems to me just a way to fake a subscribe/push technology. Why not just use a push technology like IRC? A channel per tracker: just join a channel to get the updates when they are sent. You'd probably still want to use RSS for events that you'd miss while offline for longer periods.

  12. fidonet by mabu · · Score: 4, Interesting

    A good analogy would be comparing the setup to Fidonet and their "echo" messageboards. It's a very efficient method to distribute news.

    The key to usefulness however, is enabling technology to prioritize and authenticate the RSS feeds in some way.

    1. Re:fidonet by MS_leases_my_soul · · Score: 3, Interesting

      As a former FidoNet node SysOp, I have had a similar idea for a couple of years. I have messed around with the code but never been happy with it to a point of putting it on SourceForge.

      The idea goes like this:

      If you want to host an RSS feed, you run a program that is basically a peer cache. People hit your IP and "subscribe" to the feed. You give them a list of other subscribers' IPs and the public key for the feed. The client then hits these peers and checks to see who has faster bandwidth. If the peer is faster than you, you ask to become a leaf under it. It will either accept you as a leaf or pass you on to any leaves it thinks are still faster than you.

      When you have an update to your RSS, you sign it with a digital signature to prove the authenticity of the RSS file. The fastest peers actually poll the RSS publisher. Whenever they get a new RSS file, they push it to the leaves under them. The RSS file continues to flow downstream until every node has the RSS feed.

      Files under a certain size are just automatically grabbed by the top nodes whenever they become aware of them. Leaf nodes ask their parent node for the file, so again, the small files flow down the tree.

      For larger files, everyone uses BT pretty much as it exists today.

      Using a system like this, you could even go beyond digital signatures and include public key encryption so that you had to have the public key for the feed to even be able to read the messages. The feed owner could choose who would be allowed to have the private key, thus controlling who could post while at the same time keeping the traffic unreadable to anyone sniffing the wire.

      Integrate this into an encrypted peer-to-peer app like WASTE and you might have something worth using. So who wants to start developing code?
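      To get the ball rolling, here is a rough sketch of just the tree placement and downstream push described above. The `Peer` class, its `bandwidth` units, and the method names are all made up for illustration; signing, encryption, and real networking are left out entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Peer:
    name: str
    bandwidth: int  # hypothetical units, e.g. kbit/s
    leaves: List["Peer"] = field(default_factory=list)
    latest: Optional[str] = None  # most recent feed item received

    def place(self, newcomer: "Peer") -> "Peer":
        """Attach newcomer somewhere below this peer; return the chosen parent."""
        faster = [leaf for leaf in self.leaves if leaf.bandwidth > newcomer.bandwidth]
        if faster:
            # Pass the newcomer on to the fastest leaf that still beats it.
            return max(faster, key=lambda p: p.bandwidth).place(newcomer)
        self.leaves.append(newcomer)
        return self

    def push(self, item: str) -> int:
        """Store the item, forward it downstream; return how many peers got it."""
        self.latest = item
        return 1 + sum(leaf.push(item) for leaf in self.leaves)


# The publisher's own peer cache is the root of the tree.
root = Peer("publisher-cache", 1000)
for subscriber in (Peer("a", 800), Peer("b", 300), Peer("c", 500)):
    parent = root.place(subscriber)
    print(subscriber.name, "is a leaf under", parent.name)
print("peers reached:", root.push("new RSS file"))
```

      Slow subscribers end up deep in the tree, so only the fastest peers ever talk to the publisher directly.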

  13. Content management system ? by mybecq · · Score: 4, Interesting

    Can somebody explain how RSS and BitTorrent equal a content management system ?

    Sounds more like a (possibly improved) content delivery system.

    Too bad the article didn't indicate anything about content management.

  14. WebTorrent by seldolivaw · · Score: 3, Insightful

    I blogged about the possibilities of using BitTorrent to deliver web content back in April, but I didn't consider RSS. The idea worked out between myself and some friends was a network of transparent proxies as a way of dealing with Slashdot-style "flash crowds". When you request content, your proxy requests the content for you, and simultaneously broadcasts the request to nearby machines. If any of those machines have already downloaded the content (some form of timestamp and hash is necessary to ensure it's the correct and authentic version of that URL) then they will send that content to you, allowing servers already under or expecting heavy load to push out a new HTTP status message "use torrent", supplying a (much smaller) torrent file. This allows web servers to scale much better under flash crowd conditions.
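    The timestamp-and-hash check on a copy received from a nearby machine might look something like the sketch below. It assumes the origin server publishes a SHA-256 digest and a "not older than" time for the URL (for example alongside the torrent file); those details and the function names are assumptions, not part of the original WebTorrent notes.

```python
import hashlib


def is_authentic(content: bytes, expected_sha256: str) -> bool:
    """True if the content hashes to the digest published by the origin."""
    return hashlib.sha256(content).hexdigest() == expected_sha256


def accept_copy(content: bytes, fetched_at: int, expected_sha256: str, not_before: int) -> bool:
    """Accept a peer's copy only if it is fresh enough and matches the hash.

    fetched_at and not_before are hypothetical Unix timestamps.
    """
    return fetched_at >= not_before and is_authentic(content, expected_sha256)


page = b"<html>front page</html>"
digest = hashlib.sha256(page).hexdigest()
print(accept_copy(page, fetched_at=100, expected_sha256=digest, not_before=50))
```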

    The drawback of the WebTorrent idea is that you need some way to group all the images, text and stylesheets together, otherwise you have to make an inefficient P2P request for each one. RSS is a great way of doing that.

    There aren't many details online at the moment of the work we did on the WebTorrent idea; it was mainly an e-mail thread -- get in touch if you'd like details. The project page is available, but I stopped updating it so it doesn't have all the work that was eventually done.

  15. Whoah. </keanu> by CrystalFalcon · · Score: 4, Informative

    This is the first time I've heard FidoNet mentioned in... must be almost a decade. It's like the huge amateur network (which for a brief period outnumbered the Internet in raw node count, mind you) never existed.

    Anyway, FidoNet was not without its share of problems. The killing bullet, I'd say today, was the social factor - there were overly conservative forces clinging to backwards compatibility at the cost of anything. Anything had to work with the most basic piece of software; this effectively shot progress and evolution dead.

    Not that there weren't attempts. There were. They just weren't successful.

    Anyway, setting up echoes would have the same problems as FidoNet echoes. The number one problem was typical for Slashdot: DUPES!

    Echoes were set up so that one node relayed a message in an echomail forum to its other connected nodes for a particular echo, effectively creating a star topology, different for each forum. However, since each sysop just wanted the echo linked, he would just hook up to somewhere, and forget about it. Then, others would hook up from him, and all of a sudden somebody had hooked up to two different valid uplinks.

    The result? The star topology all of a sudden had a loop in it. Messages would keep circling (since FidoNet used dedicated dialup lines, latency between nodes was typically in the hours range) and dupe filters were created.

    All of those filters and filter-enabling tags were optional, of course. After all, you couldn't mandate an operational node to change its behavior, you could just ask nicely.
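    For anyone who never ran a node, the loop problem and the dupe filter can be sketched in a few lines. This is a toy model, not how any actual FidoNet tosser worked: the `Node` class, the message IDs, and the instant delivery are all simplifications for illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.links = []      # nodes this one exchanges the echo with
        self.seen = set()    # message IDs already processed (the dupe filter)
        self.delivered = []  # messages shown to local users

    def link(self, other):
        """Link two nodes for this echo, in both directions."""
        self.links.append(other)
        other.links.append(self)

    def receive(self, msg_id, body, sender=None):
        if msg_id in self.seen:
            return  # dupe: drop it instead of relaying it around the loop again
        self.seen.add(msg_id)
        self.delivered.append(body)
        for peer in self.links:
            if peer is not sender:
                peer.receive(msg_id, body, sender=self)


# Three sysops who accidentally formed a loop: a-b, b-c, c-a.
a, b, c = Node("a"), Node("b"), Node("c")
a.link(b)
b.link(c)
c.link(a)
a.receive("msg-1", "hello echo")
print([len(n.delivered) for n in (a, b, c)])
```

    Take out the `seen` check and the message circles the loop forever, which is roughly what happened whenever a node in the loop didn't run a filter.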

    Political play to no ends. :-/

    Anyway, there were many other funny effects with EchoMail. Crosslinking was another - when one echo got linked to another at a node, so that all messages in echo X would enter echo Y at that node and vice versa. The most exotic of these was when a religious echo got crosslinked with a fantasy humor one -- through crosslinked physical directories at a node (the FAT pointers for the different directories hosting the two echoes pointed to the same location on the disk). Anyway, much hilarious discussion ensued, and not many understood much what people were trying to say in the crosslinked echo. :-)

    / former sysop and NEC in FidoNet

  16. Use conditional GET, not HEAD by NonaMyous · · Score: 5, Informative

    An even better behaved program will use conditional GET instead of HEAD. For more info, see "HTTP Conditional Get for RSS Hackers":

    The people who invented HTTP came up with something even better. HTTP allows you to say to a server in a single query: "If this document has changed since I last looked at it, give me the new version. If it hasn't just tell me it hasn't changed and give me nothing." This mechanism is called "Conditional GET", and it would reduce 90% of those significant 24,000 byte queries into really trivial 200 byte queries.
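
    To put rough numbers on that, here is a back-of-the-envelope calculation using the 24,000 and 200 byte figures from the quote. The count of 1,000 polls is an arbitrary assumption chosen to keep the arithmetic in whole numbers.

```python
def bytes_transferred(polls, unchanged_polls, full_bytes=24_000, not_modified_bytes=200):
    """Total bytes served when unchanged_polls of the polls get a short 304 reply."""
    changed_polls = polls - unchanged_polls
    return changed_polls * full_bytes + unchanged_polls * not_modified_bytes


plain = bytes_transferred(1000, 0)          # every poll downloads the full feed
conditional = bytes_transferred(1000, 900)  # 90% of polls find nothing new
print(plain, conditional, f"{1 - conditional / plain:.1%} saved")
```

    With these figures, conditional GET serves about 2.6 MB instead of 24 MB for the same 1,000 polls.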
  17. Re:RSS + BT = USENET + NNTP by penguin7of9 · · Score: 3, Insightful

    Does your usenet reader serve news articles to other users?

    Yes: the way people traditionally read USENET news is by becoming a USENET node, downloading articles to the directory hierarchy of the local machine, and then redistributing them to neighboring sites. Reading news by connecting to centralized news servers via a network client happened many years later.

    No, you need a costly usenet servers architecture.

    There is nothing intrinsically "costly" about it: it's something a PDP-11 used to handle and that regularly ran over dial-up.

    Not only machines, but also huuuge bandwidth. Today's usenet servers that want to serve large portion of world hierarchies can only get it via dedicated satellite usenet-only feeds.

    Just like a BT solution, you only redistribute those articles that you yourself are interested in.

    The reason why we got a USENET infrastructure with a small number of backbone sites (compared to the readership) that carried everything is simply because a bunch of sites took on that role and carry everything. There is nothing in the protocol or design of USENET that requires it.

    RSS+BT on the other hand is poor server and rich clients that exchange articles between themselves via p2p network only supervised by a BT tracker.

    And you believe that BT and the BT tracker scales up to many billions of files on millions of nodes by sheer magic? BT would probably need a lot of work to scale up. And at least USENET doesn't need any supervision by anything--it's completely asynchronous and unsupervised.

    Note that I did not claim that USENET would work any better than RSS+BT--I have no idea whether it would--simply that people are basically reinventing USENET when they combine RSS and BT.

    I actually suspect that there are intrinsic properties of large peer-to-peer news networks that people don't like because that's why USENET became more and more centralized over the years.

    What morron modded parent as insightful?

    That's what I would ask about your posting. In fact, I would ask what moron wrote it.