Hypernets -- Good (G)news for Gnutella
Red Roo writes: "This online article addresses the recent
criticism of Gnutella network scalability by pointing out that it is a Cayley tree. As a viable candidate for massively scalable P2P bandwidth, all trees are dead! But by going to higher dimensional virtual networks (aka "hypernets") e.g., hypercubes or hypertori, near linear scalability can be achieved for P2P populations on the order of several million peers each with only 20 open connections. This concept seems to have been entirely overlooked by critics and developers alike."
Now that I don't have the time limit of First Post gloating, I can give you the details of Samizdat...
Samizdat is a censor resistant agile network protocol named for the clandestine publication of banned literature in the Soviet Union.
Some americans have asked me if this is just about stealing music from big companies and giving it to your friends.
While this is possible, the real purpose of Samizdat is to share files and messages in a secure, anonymous network.
The right to Free Speech is a part of the Universal Declaration of Human Rights and is also in the constitution of the USA and the Human Rights acts of Canada, the UK, Australia and New Zealand.
Recently, the governments of the free world have tried to remove that right by claiming hacking is terrorism and information is a weapon.
I say "freedom is the freedom to say 2 + 2 = 4" that the following articles from http://www.un.org/Overview/rights.html the declaration states laws like the USA PATRIOT , UK RIP and acts in other countries which allow the secret service or police to read your email are illegal under international law.
Articles 12, 19 & 27 of the universal declaration of human rights http://www.un.org/Overview/rights.html deal with the information rights of everyone.
If anyone tells you filesharing is illegal, then that law is a violation of an internation treaty your country has signed to join the UN.
- Kaos games and encryption systems developer
As far as I can tell, what this is saying is that Gnutella is scaleable because it doesn't use a tree, (with each node connected to only a few 'nearby' others) but rather as a more complex graph, with each node having connections to many nodes which aren't really nearby. In a true tree, there's really only one route from one node to another. In contrast, a hypercube has many many paths from one node to another.
In reality, I'm pretty sure no actual large-scale networks are like this, for obvious reasons, but I surmise they tend to be more treelike than gnutella is, where each client tends to try to make as many connections to other clients as it can. Therefore, it should be pretty scaleable; since if each new client is making connections to a bunch of other clinets that might not otherwise have a short distance between them, there aren't really goingto be any bottlenecks.
It should be noted that in it's evolution since the Nullsoft Gnutella 0.56 server/client, modern Gnutella servents have reduced traffic and improved network scalability by means of
- routing pushes instead of broadcasting them
- caching pings/pongs, and even queryhits
- use of UltraPeer/leaf relationship, which increases the speed at which traffic is routed
There are other ideas that Gnutella developers like those at Limewire have been kicking around, which are similar to ideas that publishing networks like Freenet and MojoNation have, such as data specialization (ie. queries are directed to those likely to have the data, not broadcast to the entire world).
I'm glad whenever mathematicians or people with specialities like traffic analysis examine existing p2p systems, or give their ideas on p2p systems - they might come up with some good ideas or give a good critique that clarifies elements of a p2p network. This paper is certainly less arrogant than ones with names like "Why Gnutella Can't Scale. No Really". A hypernet is an interesting idea, although I can think of a number of reasons why current p2p sharing networks would not implement them. Namely, because authoritarian networks like Napster were shut down by trade associations like the MPAA/RIAA, while more anarchic networks like Gnutella are more immune from such actions - we must consider not only the survival of the scaling network due to technical constraints like Dr. Gunther does, but also it's survival due to legal constraints orchestrated by large corporations. Then there's the question of how many peers the network is designed for - scalability is just one factor in the reasons why I would use a particular p2p client. Luckily, we will have competition between p2p networks like FastTrack, Gnutella, Freenet and MNet (Mojonation), and perhaps different ones will be used for different purposes, just like Usenet, distributed.net and so forth.
Even though you might have been trying to be a little facetious, I think your point is very valid.
I can grab files, with no small amount of effort, from an online file sharing service, and maybe get 2 GB a day. If I network, in person, with people who share similar tastes in music, I can get a lot more bang for the buck. As soon as we see larger portable technologies (It's already happening), trading media will be just like trading games in user groups was back in the 80s and early 90s. We all just bring our 300GB portable disks to meetings, link them all up, and take them home to copy onto our TB home systems.
The only thing that would prevent this from happening is a very rapid growth of broadband, to something like reliable 10Mbit levels. I don't see that happening before hard disk space grows to the sizes I quoted above.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
I don't have a /. account yet, but I do have written a (prototype) P2P agent that uses a 64-dimensional binary space to arrange itself. This allows for non-centralized address resolution in an statistically well distributed network.
Every node takes a random 64-bit number as address (collisions are possible but unlikely with 64 bits) and once seeded searches to position itself within its closest peers. The distance to another node is simply the number of different bits in the address (plus a higher weighting of high bits in case of a tie).
Upon this infrastructure, a group mechanism is implemented, where any member of a group stores all directory information for that group, and a list of known hosts for the identified content, as well as the peers for the next group.
Groups are hierarchically arranged, like a directory. Membership in one group mandates membership in all "higher" groups, up to the "root" group. Therefore, it is possible to navigate through the whole system like through a file directory tree.
Source code is available for the Macintosh (think like "Hotline" without servers). It still has a minor memory leak, limiting stability to a couple of hours, and several other drawbacks that prevent it from becoming full featured, like not being able to reach behind NAT, or limited protection against malevolent nodes.
Ultimately, I stopped development because an IP-to-Content relation can be established and therefore the network is attackable-by-content. If anybody wants to pick it up and push the work ahead, the source (PowerPlant Net classes & UI) is up for grabs. Contact me at "komet163@gmx.net".
Thought this might interest someone in the context of this multidimensional network discussion...
This shouldn't be too hard (at least it doesn't look too hard sitting here on a Sunday morning, half way through my first cup of liquid brains). The key is to note that we can't (and therefore needn't bother trying to) enforce the topology at all times. Instead, we just want to bias the network towards the desired form. For example:
This should at least be functional; no doubt there are a number of clever hacks that could be made...-- MarkusQ
While I admit I have no clue what the article is talking about (hyper what?), I was under the impression we could already do better than linear scalability with a simple hierarchical P2P network. Just have a dynamic cluster of loosely connected "servers" to which clients maintain a persistent connection to only one. With a little intelligent routing of queries between servers (i.e. don't just spit the querey out to all the other servers you know) you get O(1) scalability at the client side and something (probably significantly) better than O(n) on the server side.