Gnutella Not Scaling?

Re:Yeah no shit. by Sanity · 2000-09-22 00:16 · Score: 4

Actually with Freenet it seems to be a log(N) problem. Much better.

--

Scalable vs. Distributed by Limecron · 2000-09-22 00:18 · Score: 4

The problem is quite obvious and has been around as long as peer-to-peer and server-based networks have both existed. Peer-to-peer networks work wonderfully when they're small. Server-based networks are much more efficient and are therefore nearly always used for large networks. Can Gnutella still work? Yes, but it will have to be divided into smaller networks... For example, you have separate networks for:

- Pop MP3s
- Rock MP3s
- Country MP3s
- Rap MP3s
- Jazz MP3s
- Movies
- Warez..err..Shareware

Of course, each network should have a critical mass and then divide in half when it reaches that point. Wow, maybe I should get programming...

Part of a solution by PureFiction · 2000-09-22 00:35 · Score: 5

There is a way to start resolving this problem, and it is currently in development.

The gPulp project is currently working on all of these issues. Check proposals and ideas at: http://gnutellang.wego.com/go/wego.pages.page?groupId=133015&view=page&folderId=136401&pageId=177268&JServSessionId=3fe61b505308701b.415222.969643886549

There is also a server oriented gnutella application which aims to start resolving some of these issues in the near term. Features such as:

1) Provide a server for broadband / dedicated network users to provide content with a true server oriented gnutella node. This will be similar to a modified apache for singular installations, or a federated distributed server architecture for routing and caching fun.

2) Remove broadcast push requests (in all future clients)

3) Proxy and cache support for slow users. This will allow beefy servers to take over some of the load which dialup / slower clients experience. This will be somewhat à la Freenet, as popular data will propagate through caches in various nodes. Also, this can provide a level of anonymity which is not currently present.

4) Adaptive servers which configure their network connections for optimal efficiency. Not too busy, not too slow, and with the widest distance topologically from their peers (if linked) and fuzzy / reactive propagation algorithms so that TTLs and routes can be dynamically modified as load increases or other factors require.

There is nothing fundamentally flawed with the gnutella architecture, and it is far from a 'dead horse'. However, there are significant inefficiencies and complications which are causing problems right now. Rest assured these will be fixed.

Yes, it doesn't scale; we know that. by Animats · 2000-09-22 00:37 · Score: 4

I've pointed out several times that the Gnutella protocol doesn't scale well. It's not impossible to fix this, but it needs a major rethink.

The basic problem is that small sites either take a lot of search hits to which they will answer "no find", or their index has to be mirrored elsewhere, which introduces centralization. There's an economy of scale to searching.

So automatic, distributed, redundant, partial centralization is necessary. This is hard. It also has to be reasonably secure against hacking; look at the problems IRC has. It probably needs a reputation service, so people who spam the indexing system lose.

On the other hand, music interest, being a popularity thing, follows a power law; the music most likely to be searched for will be found easily. A simple hack on Gnutella so that it queries servers slowly, in order, starting at the one with the best response time, stopping with the first find, will keep the thing from collapsing until somebody cracks the hard problems. It's not necessary to crack the general distributed search-engine problem to fix this.
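The "query slowly, in order, stop at the first find" hack could look roughly like the sketch below. This is only an illustration under stated assumptions: the peer tuples, `make_peer`, and the in-memory search functions are all made up for the example and have nothing to do with the real Gnutella wire protocol.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical peer record: (name, measured response time in seconds, search function).
Peer = Tuple[str, float, Callable[[str], Optional[str]]]


def sequential_search(peers: Iterable[Peer], query: str) -> Tuple[Optional[str], int]:
    """Ask peers one at a time, fastest first, and stop at the first hit.

    Returns (result or None, number of peers actually queried).
    """
    asked = 0
    for _name, _rtt, search in sorted(peers, key=lambda p: p[1]):
        asked += 1
        hit = search(query)
        if hit is not None:
            return hit, asked
    return None, asked


def make_peer(name: str, rtt: float, files: set) -> Peer:
    """Build a toy peer that 'finds' a file only if it is in its local set."""
    return (name, rtt, lambda q: f"{name}:{q}" if q in files else None)


peers = [
    make_peer("slow", 0.90, {"popular.mp3", "rare.mp3"}),
    make_peer("fast", 0.05, {"popular.mp3"}),
    make_peer("mid", 0.30, {"popular.mp3"}),
]

print(sequential_search(peers, "popular.mp3"))  # popular file: one peer asked
print(sequential_search(peers, "rare.mp3"))     # rare file: every peer asked
```

With a power-law popularity curve, most searches would end after the first peer or two, and only the rare ones would walk the whole list.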

Math... by Hobbex · 2000-09-22 00:38 · Score: 5

I can't understand why this is news to anyone. Those of us who spend time thinking about these things said it right away when Gnutella was released, and we had discussed and rejected the broadcast model for routing several times before that (see the Freenet development list archives if you don't believe me).

The Math behind it is simple:

- Every user adds Cu amount of capacity to the network (on average).
- Every user also adds Tu amount of traffic (also on average). However, because of the broadcast nature that traffic is sent to all users, so with N users, each user generates Tu*N amount of traffic.

This means that the total capacity of the network is:
C = Cu*N
(Capacity per user times the number of users). The total traffic on the other hand is:
T = Tu * N * N = Tu * N^2.

For the network to work C needs to be greater than T, but Cu*N > Tu*N^2 only holds while N < Cu/Tu. Once N grows past that point, T > C. You simply cannot win using a broadcast model.
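To put numbers on it, here is a small sketch of the two formulas. The values of Cu and Tu are invented purely for illustration (they are not measurements of any real network), and the function names are mine.

```python
def total_capacity(cu: float, n: int) -> float:
    """C = Cu * N: capacity grows linearly with the number of users."""
    return cu * n


def total_traffic(tu: float, n: int) -> float:
    """T = Tu * N^2: every user's traffic is broadcast to all N users."""
    return tu * n * n


def break_even_users(cu: float, tu: float) -> float:
    """C = T exactly when N = Cu / Tu; beyond that, traffic exceeds capacity."""
    return cu / tu


# Made-up per-user figures in arbitrary units.
cu, tu = 1000.0, 1.0

for n in (10, 100, 1000, 10000):
    c, t = total_capacity(cu, n), total_traffic(tu, n)
    status = "ok" if c > t else "saturated"
    print(f"N={n:>6}  C={c:>12.0f}  T={t:>12.0f}  {status}")
```

Whatever values you pick for Cu and Tu, the break-even point only moves; the quadratic term always catches up with the linear one.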

On the Freenet-dev list we have a standing rule that two words are indecent and offensive: "centralize" and "broadcast". We think we can pull it off without them, but it makes everything 1000% more difficult, which is the simple answer to why Freenet is developing more slowly than the one hundred million Napster and Gnutella variants out there. That, and the fact that you are not helping us...

Re:Math... by PureFiction · 2000-09-22 00:50 · Score: 4

You forget a few vital points.

1) Every bit of information is NOT sent to every other client. Many requests are dropped, ignored, or simply do not reach their destination when the TTL expires.

2) The nature of the clients ensures that slow connections have fewer peers, propagate fewer requests, and receive fewer requests than faster ones.

These two attributes greatly reduce the theoretical maximums encountered when doing math.

The real world implementation does not even remotely follow the absolute mathematical predictions.
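As a rough illustration of point 1, the sketch below floods a query through a toy graph and counts how many nodes actually see it before the TTL runs out. The graph, the node names, and the `flood_reach` helper are assumptions made up for this example; real clients also drop and ignore requests, which this does not model.

```python
from collections import deque
from typing import Dict, List, Set


def flood_reach(graph: Dict[str, List[str]], origin: str, ttl: int) -> Set[str]:
    """Return the nodes that receive a query flooded from `origin`.

    Each hop decrements the TTL; a node holding a query with TTL 0
    does not forward it. Already-seen queries are not forwarded again.
    """
    seen = {origin}
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    seen.discard(origin)
    return seen


# Toy network: five nodes in a chain, a - b - c - d - e.
chain = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

print(sorted(flood_reach(chain, "a", 2)))   # only part of the chain is reached
print(sorted(flood_reach(chain, "a", 10)))  # a large TTL reaches everyone
```

So the per-query cost depends on the TTL and the topology, not directly on N, which is why the measured traffic falls short of the Tu*N^2 worst case.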

Re:Freenet isn't searchable by Sanity · 2000-09-22 00:41 · Score: 5

It is true that fuzzy searching has not yet been implemented - although searching by song title, artist, and album are possible using "subspaces", a mechanism present in our recent 0.3 release. I recently posted a proposal for this to the Freenet mailing-list and I think some guys are working on it.

The underlying Freenet architecture should actually be quite a good fuzzy-searching system, it is just that we have not got around to enabling that functionality yet as we have been concentrating on getting the underlying architecture right.

--

Gnutella may not scale, but it is still useful by (void*) · 2000-09-22 00:50 · Score: 4

Yes, it does not scale. Anyone who has done basic CS101 will tell you that. But this does not mean it is not useful. It just means that it was not cut out to span the entire net. I can see Gnutella working within a college's residential dormitory, for example. Or within an office building. Maybe not to the entire internet, but certainly for small networks, this might still be useful.

So I don't think Gnutella is going down in flames. Since it is open source, we may take that as a lesson learnt and perhaps rip out the offending non-scalable part and build a better file sharing device that actually works this time.

demonstration by TheTick21 · 2000-09-22 00:07 · Score: 4

I've always thought of gnutella as more of a demonstration than a finished product. While it may not be the best implementation it shows that distributed file sharing can work well with no central server...it's an important step...this version of gnutella may have reached its limit...but there will be more...just some thoughts

My Home: Apartment6

Death of Gnutella a little premature. by Derek+Pomery · 2000-09-22 00:07 · Score: 5

In the article they point out that the load could be cut in half by fixing some bad code.
They further mention that proposals for a redesigned version have already been made.
link from article
Not only that, it says support and resources for this project are being sought out - it's active, it's open source, what more do we want?
Given the interest in Gnutella, I don't see any problem finding people to fix known bugs.
Rather than seeing this as the death of Gnutella, I saw it more as a positive article pointing out known bugs that are being fixed, and announcing the planning of a new and even more powerful version.

--
-- perl -e'print pack"H*","6e656d6f406d38792e6f7267"' /. ate my old sig. Bastards.

Re:Yeah no shit. by Kaa · 2000-09-22 00:08 · Score: 5

This is the problem with ALL distributed architectures. It's an N^2 problem.

Only if you insist on reaching all the nodes all the time. If you can afford to reach only a subset of the nodes for any given request, then the problem becomes one of proper clustering.

Note that Napster also implements a kind of clustering: you see the files of people in your "cluster", not of all Napster users on Earth.

Kaa

--

Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.

check out Mojo Nation by burris · 2000-09-22 01:37 · Score: 5

check out Mojo Nation which is an open source distributed filesystem that is attempting to address many of the issues that plague systems like Gnutella.

It uses centralized content tracking servers, but anyone can run one by just clicking a switch in their client. The content trackers store XML metadata describing the file, so you can search on different fields in different file type categories (easily definable).

The files themselves are broken into small redundant pieces and spread over the network. You only need half of the available pieces to reconstruct the original file. This way the system is resistant to servers disappearing. It also means you distribute your load over many hosts and clients with slower connections can still provide block services.

The coolest thing is that Mojo Nation has a built in digital cash called "Mojo" and a microcredit system that effectively turns it into a barter system for disk space, bandwidth, and CPU. Whenever you upload, download, search, or otherwise consume another system's resources, you must compensate them with Mojo. The Mojo represents the disk space, CPU, and bandwidth you are using. You can get Mojo by contributing your resources to the network through the client software (it's automagic). This way nobody can consume more resources than they are contributing to the system. Each person that uses it helps to make it stronger. Of course, being a real digital cash system, nothing stops people from sending Mojo to each other in e-mail and settling the transaction with something like PayPal.

It's really cool, check it out.

Burris

Optimization... by pb · 2000-09-22 00:10 · Score: 4

Some of these problems could be easily solved.

I think there needs to be a way to tell what the network load on an individual node is, and attempt to negotiate connections with machines of similar connection speeds or ping times up to a maximum load cut-off.

Of course, there will still be people with hacked clients that report a bandwidth of 0 and a load of 10, but suspiciously have low pings. Those leeches should be killed, or at least swamped with connections...

Also, it would be nice if the network could re-organize over time, as in, promote people in your segment who give you back successful searches, and cut off branches that don't yield search results. Then everyone who wants free books would eventually find each other, and be separate from everyone who wants free porn (the other 99%, it seems)
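One way the "promote peers that answer, cut off branches that don't" idea might be tracked is sketched below. Everything here is hypothetical: the `PeerScores` class, the hit-rate metric, and the peer names are assumptions for illustration, not anything an existing client does.

```python
from typing import Dict, List


class PeerScores:
    """Track, per peer, how often a forwarded search came back with a result."""

    def __init__(self) -> None:
        self.hits: Dict[str, int] = {}
        self.queries: Dict[str, int] = {}

    def record(self, peer: str, found: bool) -> None:
        """Note one search sent via `peer` and whether it produced a result."""
        self.queries[peer] = self.queries.get(peer, 0) + 1
        if found:
            self.hits[peer] = self.hits.get(peer, 0) + 1

    def hit_rate(self, peer: str) -> float:
        """Fraction of searches via `peer` that succeeded (0.0 if never asked)."""
        asked = self.queries.get(peer, 0)
        return self.hits.get(peer, 0) / asked if asked else 0.0

    def keep(self, max_peers: int) -> List[str]:
        """Peers worth staying connected to, best hit rate first."""
        ranked = sorted(self.queries, key=self.hit_rate, reverse=True)
        return ranked[:max_peers]


scores = PeerScores()
for found in (True, True, True, False):
    scores.record("books-peer", found)
for found in (False, False, False):
    scores.record("other-peer", found)
for found in (True, False):
    scores.record("mixed-peer", found)

print(scores.keep(2))  # the unproductive branch is the one that gets dropped
```

Run over time, peers with similar interests would tend to end up connected to each other, which is the self-organizing segmentation described above.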
---
pb Reply or e-mail; don't vaguely moderate.

