Running The Numbers: Why Gnutella Can't Scale
jordan (one of the founding developers of Napster) writes: "As the rumour mill churns over Napster's future, many folks see Gnutella as the next best hope for the music-loving file-sharing community. Problem is, Gnutella can't scale. [Note: if that URL doesn't work, try this mirror.] Almost all research on Gnutella up till now has been based on observations of the system in the wild, but this paper discusses the technical merits of that statement through a detailed mathematical analysis of the Gnutella architecture." The kind of numbers that you may not like to read if you figure networks expand to accommodate traffic at a never-ending pace. Update: 02/15 12:24 AM by T: Jordan also points to this mirror for your reading pleasure.
I am nearing completion of a network that satisfies a, b, c, e.
I haven't started on d and f, but they could be added.
This project is called The ALPINE Network.
It scales linearly and provides a query mechanism that rivals the performance of a centralized directory. (The bandwidth is higher than for a centralized query, but at least you have direct control over how much bandwidth you use and how.)
At any rate, I could use development assistance a great deal. Let me know if anyone is interested.
Regards...
Get a bunch of investors to run a big fat network pipe into a country with a name like "Ljubilvaniastanistan" where the rutabaga is the national currency and the yak is the national delicacy. Then watch Hillary Whats-Her-Name from the RIAA swallow her own tongue when she learns that her vast legion of lawyers are powerless to do anything about it. Of course, the Bush administration would probably order immediate airstrikes on the grounds of "protecting the wealth-creation security of national corporate interests", but that would be a public-relations nightmare, particularly if we put the new Napster right in the middle of a bustling village full of non-coms.
In all seriousness, I don't condone mass piracy, but the RIAA has been screwing people for decades and I have to admit that I enjoy watching them squirm. What could the RIAA conceivably do if Napster were located offshore, preferably in a country not bound by the terms of the Berne Convention?
We're going down, in a spiral to the ground
Of course we've discussed this twice already here and here.
Someone you trust is one of us.
Reality Master 101 writes: Not everything is practical just because there is a need for it.
:) However, science says it's possible to build a Gnutella-like network that will scale. Therefore, we have NeoGnutella, which will be built if there is enough demand, or OpenNap. OpenNap, as we mentioned before, is easier to use and similar to the Napster that most of us know and love, while NeoGnutella will have the benefit of never being able to be shut down. What will win? I personally think that both will survive, because the market is large enough to be divided between two players (again, a simplified example), but that OpenNap will probably grab most of the Napster fallout due to its similarity to its commercial cousin. However, if OpenNap servers are attacked legally and thus often shut down, we will switch to NeoGnutella, because finding one "node" that we can persistently connect to is a lot easier than re-finding OpenNap servers, even if OpenNap seems to scale better than any distributed net solution, and even if OpenNap is more familiar. Therefore, the long-term outlook for Gnutella depends on whether it will be adapted to scale and whether OpenNap will be attacked, as well as other issues not addressed in this rant. We all have different wants. OpenNap, Gnutella, Freenet, FTP/HTTP "warez" sites, IRC "warez" channels, Napster, (formerly) Scour, and other services have evolved to meet this need. Since Napster was the most appealing to most users (and because of media hype), it became one of the biggest file sharing programs out there. Now that Napster has a rocky future, another method will become the biggest.
:)
Warning: Rant Ahead!
Partially true. In your example, you said that if the price of gasoline went up, teleportation or fusion-powered cars wouldn't be developed. I agree. However, if the price of gasoline went to $20/gallon tomorrow (an outrageous rate, but it's just an example), then we'd either see a changeover to natural gas/electric or some other alternative-energy vehicle, or cars would be developed that got 400 miles/gallon.
So why would gas/electric cars be implemented and not fusion or teleportation? Well, first we have a demand for transportation. The demand for transportation is rather high, at least in the developed world, and especially in the US, since all of us seem to want to live in the woods and commute to the city. Therefore, if the demand is high, we *will* find something to fulfill the need, as long as the cost of fulfilling the demand is not so great that we have to sacrifice other, equally important demands. We don't commute to work via helicopters because the time, money, and energy we would have to exert to be able to use them isn't worth the extra few minutes we'd shave from our commute time. We don't commute to work with buses because we prefer living in areas with lower population densities (i.e. suburbs), which make buses impractical, and we don't like the inconvenience of having to conform to the bus's schedule and having to interact with other members of our community. We are looking for something that fulfills our need to get from point A to point B with the lowest opportunity cost to us. This is the economics/social side of the scale. On the other side of the scale are the harsh laws of science and technology, which dictate what has been done, what is possible, and what is impossible, and what the costs for doing each are. Say we have a possible solution set such as this { car (gasoline), car (electric), walking, teleportation, car (fusion) }. Science tells us that teleportation looks impossible. Therefore, we eliminate it. Technology tells us that fusion-powered cars haven't been done yet, and considering everything that we know about "hot" fusion, it's doubtful we could ever fit a fusion reactor in a vehicle the size of a car. We are now left with gasoline-powered cars, electric cars, and walking, in this simplified example.
Walking is too much of an inconvenience to us. Science doesn't have a problem with it, but human nature, the time it would take, and the distance that would have to be traveled make it impossible. On the economic/psychological/social side, walking isn't happening. So what will it be, electric or gasoline? The technology that's in place makes gasoline-powered vehicles cheaper than electric ones, and gasoline, even at the high prices it has reached lately, is still an economical means of transport. Plus, we have human nature: gasoline is tried and true, electric isn't. Electric also has some problems with travelling long distances, and the infrastructure doesn't support electric right now. Therefore gas is the best solution to our problem. In the future, if electric becomes more ideal than gasoline (enough to override our habit of sticking with what we know), we will switch.
So, we learn this. Each problem/solution pair depends on economics, human nature (psychological/social), science, and technology.
Let's apply this to Napster, OpenNap, Gnutella, and the rest of the field. Napster was nice and easy, a lot of us became accustomed to using it, and the technology (on our end) was cheap. However, Napster is either dead or moving towards a fee-based service. All of a sudden, from the economics viewpoint, Napster is less ideal. OpenNap is similar to Napster, with the additional hassle of finding a server, but since Napster is having trouble, OpenNap seems a lot more attractive. However, OpenNap, from the social viewpoint, is insecure: it has a central server, so it can be attacked. Therefore, what do we have left? Gnutella is free of cost, and cannot be shut down through elimination of a central server. It is harder to use, and technology says it won't scale in the current format. Plus, it eats up bandwidth like a hog.
The above was a rant, and presented simplified examples. I didn't mention gyro-driven cars, monorails, carts hauled by penguins, or bicycles, among other things, because I was trying to keep the examples simple (and carts hauled by penguins aren't really practical). I didn't mention stuff like how critical user mass applies to file sharing systems because it didn't pertain to the topic of the comment. So please, don't flame me with a comment about how widget-driven cars are the ideal solution, or that file sharing also depends on bandwidth. Nitpicking just wastes both of our time. On the other hand, valid comments are appreciated.
Freenet is also very well architected, unlike bogus Gnutella. It's designed to scale up, so that popular stuff gets cached all over the place. Like, more people downloading means that your connections go FASTER. This is cool.
Cui peccare licet peccat minus. -- Ovid, Amores.
The OpenNap servers are *very* good. I don't think I've used a Napster server for several months now. Grab gnapster and get this and you are good to go.
Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
Anyone who understands how Gnutella works (unfortunately, too few people) knows that Gnutella is horribly broken, will never work, and is basically unfixable.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no".
However, the REAL question is whether you can have a peer-to-peer network with decentralized servers, i.e., with clients that automatically establish a hierarchy among themselves, so that certain clients become more "server-like". The only way to make a Gnutella work is by making it hierarchical, but the hierarchy needs to be automatic for it to have the same general "virtual network" aspect as Gnutella.
Is it possible? I don't know. You would probably have to have automatic bandwidth measurements, depth probes, all kinds of things to make it work. I simply don't know if it would be possible to automate something like that.
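For anyone who wants to see why flat flooding gets ugly, here is a rough back-of-the-envelope sketch in Python. It assumes an idealized network where every peer keeps the same number of connections and no query ever reaches the same peer twice, so it gives an upper bound on reach rather than a measurement. The function names, the 100-byte query size, and the degree/TTL values are illustrative assumptions of mine, not figures from the paper.

```python
def flood_reach(degree: int, ttl: int) -> int:
    """Upper bound on peers reached by one flooded query.

    Idealized model (an assumption): every peer has `degree` connections
    and the overlay is a tree, so no peer sees the query twice.
    """
    reached = 0
    frontier = degree  # peers exactly one hop away from the sender
    for _ in range(ttl):
        reached += frontier
        # each peer forwards to all neighbours except the one it heard from
        frontier *= degree - 1
    return reached


def flood_traffic_bytes(degree: int, ttl: int, query_bytes: int) -> int:
    """Total bytes put on the wire by one query in the same idealized model.

    Each reached peer receives exactly one copy of the query.
    """
    return flood_reach(degree, ttl) * query_bytes


if __name__ == "__main__":
    QUERY_BYTES = 100  # hypothetical query size, headers included
    for degree, ttl in [(4, 5), (4, 7), (8, 7)]:
        peers = flood_reach(degree, ttl)
        total = flood_traffic_bytes(degree, ttl, QUERY_BYTES)
        print(f"degree={degree} ttl={ttl}: {peers} peers, {total} bytes per query")
```

The point of the sketch is only the shape of the growth: reach (and traffic) grows geometrically with the TTL, and every peer in that horizon pays for every query, which is what a hierarchy would be trying to avoid.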
--
Sometimes it's best to just let stupid people be stupid.
Simple: with all that media frenzy going on (the Napster trial even got front-page coverage in my local newspaper), it's a big-scale advertisement for MP3. Yes, Napster has a userbase of 60 million, so using the argument that it's only specific individuals that are doing it is wrong, but if that story made it into my local newspaper (and we could see a mention of Gnutella too), guess how many people who didn't know about it or Napster will be curious to try different services out.
Now there will be media coverage (other than on the internet) mentioning other alternatives like IRC, Gnutella, search engines, etc. This is really a stupid move... not counting the many people who are going to be pissed off at the RIAA and stop buying CDs.
The RIAA should have worked closely with Napster to bring about a decent business model instead of bashing them; they might have actually profited from that. They've shown how many "copyright material" files were leeched every second (around 10,000), but did they show EVIDENCE that their sales decreased DUE to Napster? No. They didn't have to, but if they had, things wouldn't be this way. You bet that after Napster shuts down, their sales will decrease. I, for a start, will not buy any more CDs.
I hope a company picks up big artists for digital distribution and does something like Stephen King did, a buck a download: the money would go STRAIGHT to them, and the record label would stop its own piracy (i.e. ripping many artists off and taking the public for complete morons).
For now Gnutella will do for most people, and if people SHARE, maths or not, it will work, not as nicely as Napster did, but there will be a bunchload of alternatives if Gnutella isn't doing the job.
--- Metamoderating abusive downgraders since my 300th post.
I am currently working on a fully decentralized searching network. You can read more about it here.
The key aspects of this network will be:
- No forwarding. Forwarding is currently eating Gnutella alive. A UDP-based multiplexed transport protocol is used to maintain hundreds of thousands of direct connections to all the peers you want to communicate with. You can also tailor your peering groups precisely to what you desire, as far as quality, reliability, etc.
- Low Communication Overhead. All queries that are broadcast are performed with minimal overhead within UDP packets. A typical Napster-breadth query (10,000 peers) would take a few minutes on a modem, and seconds on a DSL line.
- Adaptive Configuration. Peers that have better or more responsive content will gravitate towards the top of your query list, thus, over time you will have a large collection of high quality peers which will greatly increase the chance of you finding what you need.
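To give a feel for the "minutes on a modem, seconds on DSL" estimate in the list above, here is a minimal arithmetic sketch in Python. Everything numeric in it is an assumption chosen for illustration: a 60-byte packet (28 bytes of IPv4 and UDP headers plus a hypothetical 32-byte query payload), a 33.6 kbit/s modem uplink, and a 384 kbit/s DSL uplink. It ignores replies, retransmissions, and link-layer overhead.

```python
IP_UDP_HEADER_BYTES = 28     # 20-byte IPv4 header + 8-byte UDP header
ASSUMED_PAYLOAD_BYTES = 32   # hypothetical compact query payload


def broadcast_seconds(peers: int, packet_bytes: int, uplink_bits_per_sec: float) -> float:
    """Time to send one query packet directly to each peer, uplink-bound.

    No forwarding is modelled: the sender pays for every packet itself.
    """
    total_bits = peers * packet_bytes * 8
    return total_bits / uplink_bits_per_sec


if __name__ == "__main__":
    packet = IP_UDP_HEADER_BYTES + ASSUMED_PAYLOAD_BYTES
    peers = 10_000  # the "Napster-breadth" figure from the list above
    # Both link rates are illustrative assumptions.
    for name, rate in [("modem (33.6 kbit/s)", 33_600), ("DSL (384 kbit/s)", 384_000)]:
        secs = broadcast_seconds(peers, packet, rate)
        print(f"{name}: {secs:.1f} s to query {peers} peers")
```

The cost is linear in the number of peers you choose to query, which is the trade being made here: the sender controls (and pays for) its own query breadth instead of imposing it on everyone else.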
There are a number of other features, but too many to detail here.
Also, this is under heavy development, and not operational. I am going solo on this at the moment, and so progress is slow. However, once completed, it *should* be a scalable alternative to completely decentralized searching / location.
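As a rough illustration of the adaptive configuration idea described above, here is a small Python sketch of a query list that floats responsive peers to the top. This is not the actual ALPINE code; the class names, the smoothed hit-rate score, and the peer addresses are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    address: str
    queries_sent: int = 0
    hits: int = 0

    def score(self) -> float:
        # Smoothed hit rate (an assumed scoring rule): a brand-new peer
        # starts at 0.5 instead of being stuck at the bottom forever.
        return (self.hits + 1) / (self.queries_sent + 2)


class QueryList:
    """Keeps per-peer statistics and orders peers by observed usefulness."""

    def __init__(self) -> None:
        self._peers: dict[str, PeerStats] = {}

    def record(self, address: str, hit: bool) -> None:
        """Record the outcome of one query sent to `address`."""
        peer = self._peers.setdefault(address, PeerStats(address))
        peer.queries_sent += 1
        if hit:
            peer.hits += 1

    def ordered(self) -> list[str]:
        """Peer addresses, best score first."""
        ranked = sorted(self._peers.values(), key=lambda p: p.score(), reverse=True)
        return [p.address for p in ranked]


if __name__ == "__main__":
    peers = QueryList()
    # Hypothetical addresses and outcomes, purely for illustration.
    for address, hit in [("peer-a", True), ("peer-a", True),
                         ("peer-b", False),
                         ("peer-c", True), ("peer-c", False)]:
        peers.record(address, hit)
    print(peers.ordered())
```

A real implementation would presumably also weigh latency and reliability, as the list above suggests, but the ordering step would look broadly similar.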
But if Napster gets squeezed, you can bet your last dollar that it will be made to. Or something like Freenet or Audiogalaxy will take over.
But if the price of gasoline goes up, you can bet your last dollar that teleportation will be made practical. Or that cars that use fusion will be developed.
Not everything is practical just because there is a need for it.
--
Sometimes it's best to just let stupid people be stupid.
So, Jordan, you provide a nice demonstration of a flaw. It is considered polite in many circles, when destroying someone's hard work, to make a peace offering in the form of some assistance.
:-)
Can we expect therefore to see an equally interesting and thorough discussion of how Napster/Gnutella can grow, evolve and perhaps merge, to provide the "ideal compromise" where we will not need 100Gb networks, but where:
a) The destruction of any significant %age of the network is transparently ignored or healed.
b) The network will not segment as GnutellaNet can.
c) Bandwidth requirements are low[er].
d) Anonymity of participants is maintained where required.
e) The law can't shut it down so easily.
f) Data can be secured, encrypted and/or signed (etc.) for specific users
And MY personal wish:
g) The end result is so globally accepted for file exchange and storage that FTP dies a death, and we all live without buffer-overflow exploits for the rest of our lives.
Note that Napster and Gnutella were very one-sided in their freedom with files. There was no facility available to ensure that the law was honoured where desired.
--
Enjoy Y2K? Roll-on Year 2037!