Faster Feeds Using FeedTree Peer-To-Peer
dsandler writes "Researchers at Rice University have just released version
0.7 of FeedTree, a peer-to-peer
system for distributing Web feeds faster. Instead of polling feeds
independently, FeedTree users cooperate to share news updates
using multicast in Pastry, a scalable p2p
overlay network. FeedTree reduces the update delay for existing RSS and Atom
feeds to a few minutes without putting extra stress on the webserver (anyone
who's ever been temporarily banned by Slashdot's RSS feed knows this is a real
concern). Feed publishers can also choose to push digitally signed updates
for immediate, tamper-proof delivery to subscribers. The client software (download) runs on Linux, OS
X, and Windows, and works with any desktop feed reader."
I get teh free movies now right?
With BitTorrent et al. firmly established, why do we need another P2P system?
Ignorance is curable, stupid is forever.
I for one welcome our new p2p overlords.
It looks like they just re-invented the netnews protocol, which works in a very similar way.
What's the best OS X feed reader to use with FeedTree? I don't care for the way Safari handles RSS.
s/t - has anyone run this on FreeBSD? Perhaps it works with the Linux compat modules loaded? I'd like to try this out tonight, since I have 3 sites on my FreeBSD box that have feeds that are constantly being hit...this sounds like a solution for the long term.
fak3r.com
MMMmmmmmm, Pastry.
The simple truth is that interstellar distances will not fit into the human imagination
- Douglas Adams
too many
Since when did the internet support multicast? UDP and TCP work, but IGMP definitely doesn't. I guess accuracy is something I shouldn't expect from Slashdot, though.
I remember seeing something like this in my logs over a year ago. I would see lines like this in my access log:
66.177.198.139 - Anonymous [04/Apr/2005:03:04:17 -0500] "GET /rdf10_xml HTTP/1.1" 200 5322 "" "Shrook/76p (Distributed; +http://www.fondantfancies.com/shrook/distfaq.php) "
I haven't seen a hit from this in a while, perhaps that effort didn't gain much traction. Who knows if this one will... I never saw Shrook mentioned on Slashdot.
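For anyone who wants to check their own logs for this kind of client, here is a rough Python sketch. It assumes the log is in the usual Apache "combined" format; the sample line is reassembled from the fragments quoted above, and the function names are just illustrative.

```python
import re

# Regex for the Apache/NCSA "combined" log format (assumed here).
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Sample entry, reassembled from the log fragments quoted in this comment.
SAMPLE = (
    '66.177.198.139 - Anonymous [04/Apr/2005:03:04:17 -0500] '
    '"GET /rdf10_xml HTTP/1.1" 200 5322 "" '
    '"Shrook/76p (Distributed; +http://www.fondantfancies.com/shrook/distfaq.php) "'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def is_distributed_reader(entry):
    """Heuristic only: the user agent above advertises itself as 'Distributed'."""
    return "Distributed" in entry["agent"]
```

Running `parse_line` over each line of the access log and filtering with `is_distributed_reader` would show how often such a client still turns up.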
I wonder: If GMail were to incorporate an RSS reader (the way Thunderbird does), it could potentially update many, many users with a single hit of each RSS site.
I'm leaning towards using RSS as a way to do announcements rather than maintain a mailing list. Rather than tell me you want me to send you updates (and deal with being potentially a spammer, deal with your unsubscribe, your email address change, etc.), just poll my site every so often (days, for the lists I'm talking about; hours, for Slashdot) and let it show up in your mail queue.
The idea isn't quite ready for prime time; too few people use RSS. But GMail could make that happen in one fell swoop. Well, two fell swoops: you'd need some sort of browser extension to make the little orange "RSS feed" button notify GMail.
I wonder if just having GMail (and hotmail, aol, and yahoo) handle that would solve the problem to the point where we no longer needed a P2P RSS distribution system.
Alternatively, if ISPs were to cache the RSS feeds the way some do with certain web pages, that might also take a lot of the load off. People will still impolitely set their RSS readers to check the feed every 10 seconds, but at least it never gets out onto the backbone if it's cached at the ISP.
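To put rough numbers on the caching idea, here is a toy Python simulation. Everything in it is made up for illustration (the `Origin` and `SharedCache` classes, the 5-minute TTL, the 100 readers); the only real mechanism it leans on is HTTP validation with ETag / If-None-Match, which lets the origin answer "304 Not Modified" instead of resending the feed.

```python
class Origin:
    """Stand-in for a feed server that supports ETag validation."""

    def __init__(self, body):
        self.body = body
        self.etag = 1
        self.full_responses = 0  # times the whole feed body was sent

    def get(self, if_none_match=None):
        if if_none_match == self.etag:
            return 304, self.etag, None  # nothing new, no body sent
        self.full_responses += 1
        return 200, self.etag, self.body


class SharedCache:
    """Hypothetical ISP-side cache: at most one upstream request per TTL."""

    def __init__(self, origin, ttl):
        self.origin = origin
        self.ttl = ttl
        self.body = None
        self.etag = None
        self.fetched_at = None
        self.upstream_requests = 0

    def get(self, now):
        stale = self.fetched_at is None or now - self.fetched_at >= self.ttl
        if stale:
            self.upstream_requests += 1
            status, etag, body = self.origin.get(self.etag)
            if status == 200:
                self.body, self.etag = body, etag
            self.fetched_at = now
        return self.body


def simulate(readers=100, poll_s=10, duration_s=600, ttl_s=300):
    """Every reader polls the cache every poll_s seconds for duration_s seconds."""
    origin = Origin("<rss>v1</rss>")
    cache = SharedCache(origin, ttl_s)
    reader_requests = 0
    for now in range(0, duration_s, poll_s):
        for _ in range(readers):
            cache.get(now)
            reader_requests += 1
    return reader_requests, cache.upstream_requests, origin.full_responses
```

With the default numbers, the readers make 6000 requests, but only 2 of them reach the origin and only 1 of those transfers the feed body. A real cache would also have to honor the feed's Cache-Control headers, which this sketch ignores.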
Try saying that headline 5 times fast!
You never expect irony, do you?
Want to be a professional wrestler? Visit www.iyfwrestling.com
@iyfwrestling
As a Rice Computer Science student I would like to point out that Pastry actually originated at Rice, under Dan Sandler. The first framework was in Java. You can see from his web page that he's responsible for FeedTree, too.
Microsoft Research became interested in the product and ported it to C#, effectively turning it into the form it is now. Many classes at Rice have now "backported" it, I guess you could say, and it's used for many of our classes that involve distributed networks, such as the current COMP 410 class, which previously turned out a distributed file and process system codenamed Voltron.
Here's a link to the paper co-authored by Sandler and others at Rice.
I can't find any mention of the license terms on the Web site.
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
An excellent project, it deserves to become dominant in internet
RSS news distribution.
It's nice to be able to browse the source code.
What can we do to encourage adoption of this, before some wretched
proprietary format tries to muscle in?
Haven't used Google Reader yet, have you?
They'll likely integrate this with GMail at some point. But that's just my opinion.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
I personally use Bloglines - a web based news reader. This lets me check and read my subscriptions from home and work without having to read posts twice. Google Reader is a similar application but has tagging and merges all your feeds into one.
Is because RSS doesn't pay. There's no way to monetize the RSS feed, which can often be a large burden on a server in terms of CPU (if dynamic) and bandwidth.
Micropayments would solve this. Pay 0.001 for every reload automatically and you wouldn't need a solution like this. Fix that and solve thousands of small problems at once.
I don't understand how this is better than setting a feed reader's poll time to 10 seconds. Do they think end users care about bandwidth?
Scribe [technical paper, PDF warning!] is a framework that does very similar things. Is this an application developed on top of that? Scribe works by building a multicast tree of the participants too.
One interesting thing to note is that as a participant in scribe, you'll have to pass on notifications of feeds even if you're not interested in them, because you're a part of the tree and pretty much the only path to the guys below you. How does FeedTree deal with cheating/lying nodes that refuse to pass on messages? Also, to be a part of the overlay, you need to keep sending keep-alive messages. Not a big deal, I know, but I always thought Scribe was impractical for general use, but would work great for a restricted audience (like a large geographically distributed company) that can be "trusted".
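To make the concern concrete, here is a toy Python sketch of a multicast tree. It is not Scribe's or FeedTree's actual code; the `Node` class, the names, and the tree shape are all invented for illustration. The relay is not subscribed but sits on the only path to two subscribers, so if it refuses to forward, everyone below it is cut off.

```python
class Node:
    """One participant in a hypothetical multicast tree."""

    def __init__(self, name, subscribed=True, forwards=True):
        self.name = name
        self.subscribed = subscribed  # wants the feed itself
        self.forwards = forwards      # passes updates on to its children
        self.children = []
        self.inbox = []

    def add(self, child):
        self.children.append(child)
        return child


def publish(root, update):
    """Push an update down the tree; return the sorted names of subscribers reached."""
    delivered = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.subscribed:
            node.inbox.append(update)
            delivered.append(node.name)
        if node.forwards:
            stack.extend(node.children)
    return sorted(delivered)


def build_tree(relay_forwards=True):
    """source -> relay (not subscribed) -> carol, dave; source -> bob."""
    root = Node("source", subscribed=False)
    relay = root.add(Node("relay", subscribed=False, forwards=relay_forwards))
    relay.add(Node("carol"))
    relay.add(Node("dave"))
    root.add(Node("bob"))
    return root
```

`publish(build_tree(True), "item")` reaches bob, carol, and dave, while `publish(build_tree(False), "item")` reaches only bob. Any real system needs some way to detect and route around the silent relay.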
There are 11 types of people. Those who understand binary, those who don't and those who are sick of this lame joke.
I'm afraid I don't understand what problem this is solving. It's like a solution that's still looking for a problem to solve. As an end user, why should I care? I'm not trolling; I just don't get it.
What is humor if not pain tempered by time?
In addition to the website, there is a technical paper that describes the whole architecture: how it works on top of Scribe, how it can work in different models of adoption, how security is handled, and all the other technical details. If you're interested in the gory details of how it all works, you should go here (pdf) to see the paper.
I've been thinking for quite some time of utilizing this type of P2P distributed caching proxy concept with many different protocols. RSS is just one possibility amongst many that could utilize the basic technology here. Some others might include distributed file systems, distributed caching http proxies, or even a Google competitor that uses a distributed P2P implementation of the database and utilizes everyone's everyday web activity to augment the spidering (i.e. every time anyone who is part of the P2P search network hits a site, a side effect is that they update the search index with the latest data from that site).
I'm not sure, though, that this is truly beneficial in terms of reducing the burden that RSS places on the Internet in general. Yes, it reduces the burden on the originating web site, but I believe it increases the total number of packets that must flow across some internet connection somewhere. So it appears to be a mechanism for shifting the cost from the server to a larger total cost spread across the clients, not a mechanism for helping the internet as a whole. In fact, I would be positive that this is not beneficial overall, except that it may reduce peak traffic on critical network backbones. But that would only be true if the overlaid network topology is either geographically optimized or based on something that has an accidental relationship to geography.
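A back-of-envelope way to compare the two is to count application-level messages, sketched in Python below. All the numbers are invented, and the count deliberately ignores overlay keep-alives and the fact that one overlay hop can cross many IP links, which is exactly where the geography question comes in.

```python
def polling_requests(subscribers, poll_interval_s, duration_s):
    """Requests that hit the origin when every subscriber polls on its own."""
    return subscribers * (duration_s // poll_interval_s)


def tree_messages(subscribers, relays, updates):
    """Overlay messages when each update is pushed once down a tree.

    Assumes every subscriber and every non-subscribed relay has exactly one
    parent, so each update crosses (subscribers + relays) overlay edges.
    """
    return updates * (subscribers + relays)


DAY_S = 24 * 60 * 60

# Made-up scenario: 1000 subscribers polling every 30 minutes for a day,
# versus 24 pushed updates through a tree with 200 extra relay nodes.
polled = polling_requests(subscribers=1000, poll_interval_s=1800, duration_s=DAY_S)
pushed = tree_messages(subscribers=1000, relays=200, updates=24)
```

Under these assumptions, `polled` is 48000 requests, all landing on the origin, and `pushed` is 28800 messages spread across the peers. Whether the push side comes out ahead in real packets depends on the maintenance traffic and the path lengths this toy count leaves out.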
The client software (download) runs on Linux, OS X, and Windows, and works with any desktop feed reader.
New game in town: never use the word Java. BTW, it doesn't run on Linux and Windows. Except if you install Java of course.
Million Dollar Screenshot
No, it is more like that they are reinventing BBS: http://en.wikipedia.org/wiki/Bbs
it doesn't run on Linux and Windows. Except if you install Java of course.
A lot of people who are interested in peer-to-peer networking have installed a Java platform, even if only to run the Azureus client.
No. Anything already made available to the public on the Web is subject to the copynorms and copyright exemptions of the Web. For example, under 17 USC 512 (enacted as a rider to the DMCA), those who operate automated caches on a computer network are not liable in any United States court for damages that result from copyright infringements performed through such caches. To learn more, read about the OCILLA at Wikipedia.
The problem is, the web is a mostly one-way architecture and certainly doesn't work in any sort of a p2p fashion.
As Tim Berners-Lee originally conceived the World Wide Web, each computer would run both a client that dials out on port 80 and a server that listens on port 80. In practice, only the pervasiveness of dial-up Internet connections and duopolistic residential broadband ISPs' terms of service have necessarily interfered with this vision.
I've heard about this trend before, but it is still very disturbing to see something like this, where an application that screams out for a universal client that can run on any platform is funded by Microsoft, who dictates that the language be changed to C#, leaving the original Java version to languish. Although it's nice to see that the original is still available and has an open source license, it's disappointing that MS couldn't simply fund it as it was. As well as being a waste of money to do a port where none was needed, it certainly lends credence to the arguments of Microsoft bashers.
Signatures are a waste of bandwi (buffering...)
I know it's a crazy suggestion, but instead of having hundreds of people polling a single RSS feed, why not have the server which hosts the RSS feed actually PUSH the updates out to the people who are interested?
We already have a nice and simple protocol (XMPP) which could be used for this, although admittedly PubSub isn't as final as it could be.
Karma: It's all a bunch of tree-huggin' hippy crap!
Hey... wait a second... I did a little more digging and it does look like there is at least a Java version of the code base. The Linux version seems to run on pure Java and the library contains the pastry.jar file. Even though there's a src/net tree, where much of the code seems to reside, I'm not seeing ANY C# code.
:(
So, it seems I should've dug deeper before making my previous comment. Sorry about that folks
from the site:
If everybody used this, then there'd be no need for mirrordot and the slashdot effect would be a thing of the past and more people could afford to host pr0n on their personal websites
I'm not entirely sure what happened at Rice w.r.t. Pastry. What I was told by Dr. Wong is that Rice had a Pastry version, MS adopted it and converted it to C#, then allowed us to use it freely. All of this was part of an elective class called COMP 410 that students take. Basically, a team of 10-20 people act like a software company, self-organize, meet with a "client" (a professor acting like one), and build a huge system.
:]
And yes, we use entirely Microsoft software. But I think it's a good thing. When I took it, Microsoft gave us copies of Visual Studio 2003, SQL Server, and funding for some tablet PCs to use as part of the project. I thought it was a *superb* experience to work with so much real-world technology.
Yes, I suppose one could say that MS is stifling open-source competition... but seriously, we were building an application that used a distributed cluster of SQL Server databases, transactionally changed by Enterprise Services features with Event Queueing; all of this also used a distributed file and processing system based on Pastry (C#). Getting all that to work together with open-source in a single semester would be quite a challenge. What database would we even use? MySQL is definitely not capable of that. And Oracle isn't free.
So, in this case at least, I think Microsoft's support of us has been positive for students. We are not just a Microsoft shop -- there is even a research group at Rice called the Programming Languages Team, which focuses almost exclusively on Java for research projects. I'm currently involved in improving the open-source, student-oriented Java IDE called Dr Java, which is under the purview of PLT.
*Pant*
Well, I'm sorry this turned into a rant. I guess my point is this: Microsoft has not caused Rice to give up open-source software or anything like that. In reality, their funding has exposed us to more software and more systems than we would have seen otherwise. I think that is a Good Thing.
It is neither a good thing for students to be exposed solely to OSS, nor solely closed-source industry software. A university should educate well-rounded people, and much like liberal-arts universities require students to take many subjects, Rice exposes CS students to different technologies and environments in its computer science program. Otherwise, how can I ever decide which is best for a task?
[Note: I am heavily, personally in favor of Microsoft software and have accepted an internship with them in the C# Compiler group next summer. But this doesn't mean I dislike Java or OSS; I don't see why there has to be a conflict at all. Use whatever tool suits you best.
But that's just me. I'm going to do *my* best to make C#.NET the best language it can be. If you like Java, fine! We can learn from one another.]
Java is, and always has been, a proprietary technology completely specified by Sun. Sun owns the specs and decides what language features to add. Period.
The
So, there is nothing at all "closed" or "proprietary" about C# or
So, let's say you were right and Microsoft did somehow convert Pastry to C# from Java. How is this closed or proprietary at all? If anything, it's *more* open.
Sun, the company, itself owns all aspects of Java. No one owns C# or
Add encryption and IP packet morphing to these P2P feeds, and censorship would be very hard to enforce.
Microsoft is in complete control over the future of the C# language and the .Net libraries and runtime. Just because they do the standards dance doesn't mean they've given up control. Do you honestly think that C# or .Net can change in a way Microsoft doesn't approve of?
The ECMA even allows the standard to be patent-encumbered as long as Microsoft provides "reasonable and non-discriminatory" licensing fees. That makes me feel completely safe.
Microsoft's policy on making changes is to solicit customer feedback and then work internally to come up with the design. Compare this to the JCP. I'm sure that, in practice, Sun has more clout than the other participants, but their control isn't total. Design-by-committee may or may not be stupid, but it is definitely more open.
Have you seen the Java 6 website? No, the license isn't the friendliest, but it's the production JVM. Rotor is just the research implementation.
Don't get me wrong...I think C# and .Net are open enough to allow implementations like Mono. I, personally, don't believe Microsoft will sue. I just don't understand how you can say C# and .Net are "much more" free than Java.
Aside from the production implementation and the related patents, right?
oh, you mean the "lockup your browser once past six tabs all running flash ads" plugin?
cute, real cute and oh so cooool..not.
Naw, let's just stick to "flash sucks so hard everyone should run three plugins to STOP it"
I don't even bother installing that thing anymore, I gave it a shot for years, it fails it. 99% of the people who use it are abusive to web surfers, because 99% of the uses are for ads. C'mon, you know I'm right on this. And the ads aren't even very good and they hog resources, they can take even a decent midrange machine with half a gig of RAM down, you have little control over the phone-home spyware aspect of Flash, and... it just sucks, man, it sucks. It's like the blink tag on crack, steroids and Red Bull, it's just wrong. Some things, like curb feelers and neon fender lights, are just a dumb idea.
Um, no.
Sorry, but that's just ignorant.
Quote: The original comes from the Mono Project FAQ entry on patents. Please, stop the FUD.
More details at How I Invented a Decentralised Scaleable Push-Based Micronews System in 2000.
If nothing else, my documented but unimplemented invention might be good prior art, should it be needed.
Music: a super-stimulus for the perception of musicality. Musicality: a perceived aspect of speech.
I'll show you my awesome web based feed aggregation tool, if you show me yours.
I hold very few opinions. I hold information based on observation and fact. If you wish to disagree, please use facts.
Distributed peer-to-peer web 2.0 rss news updates? You young whippersnappers and your fancy-schmancy names!
In my day we simply called it gossip!
One of the lessons of history is that nothing is often a good thing to do and always a clever thing to say. - Will Durant
So, I work on Pastry. There are two branches of Pastry: MSPastry (developed by Microsoft Research) and FreePastry (developed initially by Rice, open source, now developed primarily at The Max Planck Institute for Software Systems (where I work)). They were started at roughly the same time, while Prof. Peter Druschel (formerly of Rice, now at MPI-SWS) was on sabbatical at MSR.
Microsoft didn't co-opt anything, and in fact allowed and encouraged the open source Java version initially. These days I understand the MSPastry isn't actively developed, but FreePastry lives on. FeedTree uses FreePastry, as does ePOST, and a variety of other projects.
The question is whether FeedTree has done its reinvention well or badly, and whether it can scale adequately as its user base changes, which will happen if it's actually useful. It probably won't have the same scaling stress as Usenet had - the load has increased by four to six orders of magnitude in the last 15 years. I started reading Netnews when there were a few dozen machines on it, maybe up to a hundred, with 1-200KB/day of traffic, and the most common network connections the first few years were 1200-baud dialup modems, and I stopped running my own news server when the technical newsgroups carried about 5MB/day. I lost count of how much traffic Usenet had when a full feed passed 45 Mbits/sec (some time in the late 90s, one of my ISPs described the traffic level as "about a T1 if you don't get the binary pr0n groups, or a couple more if you do".) The sizes of servers changed, the way people accessed them changed, the business model changed radically, and people like Henry Spencer did amazing work to get Usenet server software to run efficiently as it grew, partly so he could keep using his PDP-11s, somewhat the way people run Linux or BSD to keep older PCs running usefully today.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks