Slashdot Mirror


Next Generation of Gnutella

ResearchBuzz writes "Wired is reporting the announcement of gPulp (general Purpose Location Protocol), an open source technology for search engines. From the article: "It is based on the Gnutella structure, an open source application originally created by Nullsoft....Using the basic protocols originally developed for Gnutella, gPulp will search for information across a network in real time.""

8 of 70 comments

  1. How long before... by Vladinator · · Score: 3

    ... search engines are also able to use this? gPulp/Gnutella clients alone will suck bandwidth, but what happens when Yahoo/Lycos/AskJeeves, etc. see this as cutting into their market share and decide to do this searching for their users? One plus might be that they could cache such hits and allow searching that didn't suck bandwidth, but given the impermanent nature of such connections, I bet they'd do a scan EVERY time someone looked for "Eminem*.mp3" at ftpsearch.lycos.com or some such.


    Fawking Trolls!

    --

    "Going to war without France is like going deer hunting without your accordion." - Jed Babbin

  2. The Gnutella protocol has big problems by WolfWithoutAClause · · Score: 3

    The problem with the GNUTELLA protocol is that it is quite inefficient, and it collapses completely under heavy load.

    The GNUTELLA protocol sends one message per search through the entire network up to the horizon.

    Eventually when enough people are in the network the individual links collapse under the load and the network falls apart.

    Nothing can completely solve this problem, but graceful degradation can certainly be designed for, and the way that the GNUTELLA protocol uses bandwidth can be very much improved, allowing for many more users, but at reduced horizon sizes.

    The current protocol wastes bandwidth in at least one BIG way: it sends many short messages rather than one big one containing the requests.

    The reason this is a waste is that each short message carries a fixed-size overhead at the TCP level. This means the useful percentage of bandwidth is significantly smaller than it might be if large messages were sent.
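
    A rough sketch of the overhead argument above. The 40-byte figure is the usual IPv4 + TCP header size without options; the 50-byte query size is a made-up value for illustration, and whether each message really travels in its own segment depends on the client and the TCP stack.

```python
# Illustrative figures only; QUERY_BYTES is a hypothetical message size.
TCP_IP_HEADER_BYTES = 40   # 20-byte IPv4 header + 20-byte TCP header, no options
QUERY_BYTES = 50           # assumed size of one search message


def useful_fraction(messages_per_segment: int) -> float:
    """Fraction of bytes on the wire that are search data, not TCP/IP header."""
    payload = messages_per_segment * QUERY_BYTES
    return payload / (payload + TCP_IP_HEADER_BYTES)


# Compare one message per segment with batches of 5 and 20.
for n in (1, 5, 20):
    print(n, round(useful_fraction(n), 3))
```

    Under these assumptions a lone message spends a little over half of its bytes on payload, while a batch of 20 spends well over 90%.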

    Therefore it pays to hold each search request for maybe 1 second before passing all the queued requests on to the neighbours. Searching will be maybe 10 seconds slower due to the artificial delay (compared to 2 minutes; the reduction in bandwidth also offsets this), but possibly 50-100% more users can be catered for.

    Secondly, the behaviour at collapse can be much improved. To implement the above behaviour, each client should keep a list of requests/replies waiting to be forwarded to neighbours.

    If a request is held for too long without a chance to pass it on, it should be thrown away (giving preference to low-hop messages). That means the horizon self-tunes, avoiding collapse and giving better search performance at small network sizes.
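
    The hold-and-drop idea above could be sketched roughly as follows. This is not code from any Gnutella client; the class name, the 5-second age limit and the batch size are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    payload: bytes
    hops: int          # hops the message has travelled so far
    queued_at: float   # time it entered the queue, in seconds


class HoldQueue:
    """Hypothetical per-neighbour queue: batch messages, drop stale ones."""

    def __init__(self, max_age=5.0, max_batch=20):
        self.max_age = max_age      # assumed limit before a message is dropped
        self.max_batch = max_batch  # assumed number of messages per batch
        self.items = []

    def add(self, payload, hops, now):
        self.items.append(Pending(payload, hops, now))

    def flush(self, now):
        """Return one batch to forward; meant to be called about once a second."""
        # Throw away anything that has waited too long.
        fresh = [m for m in self.items if now - m.queued_at <= self.max_age]
        # Low-hop messages go first, so distant traffic is what ages out.
        fresh.sort(key=lambda m: (m.hops, m.queued_at))
        batch, self.items = fresh[:self.max_batch], fresh[self.max_batch:]
        return [m.payload for m in batch]
```

    When the link is busy, high-hop messages are the ones left waiting and eventually discarded, which is how the horizon would shrink under load.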

    Otherwise death of the GNUTELLA net is predicted...

    --

    -WolfWithoutAClause

    "Gravity is only a theory, not a fact!"
  3. gPulp location by Frijoles · · Score: 3

    From the article, you can find gPulp at http://gnutellang.wego.com/.

    --
    -Frijoles-
    1. Re:gPulp location by warmcat · · Score: 3

      There are some very interesting proposals for the next version listed on the gPulp site.

      When I first saw it was ''talk before code'' my heart sank, but some of the proposals are actually very good.

      Also, I noticed at the Gnotella page that the author is pointing out that Gnutella has been going downhill - no wonder, I find it much harder to get anything useful from it than with Napster.

  4. P2P Network Searching by zpengo · · Score: 3
    Pretty soon every C: drive will have to have a robots.txt file.

    --


    Got Rhinos?
  5. Questions by Luminous · · Score: 3
    When I read this story on Wired this morning, I was first very excited. I was thinking this had to be the groundbreaking p2p app. But when I saw it was based on the Gnutella structure, I recalled a previous discussion on Slashdot where it was said Gnutella had maxed.

    I am assuming when they say based on Gnutella, they have 'fixed' some of the problems. Or, maybe it really isn't searching the entire network, but key segments of the network. And maybe it just ignores 56k connections altogether.

    In your honest opinions, what is the viability of gPulp? Is this a bandwagon that deserves support, or should we (okay, you real programmers) continue to develop a more robust and 'intuitive' P2P system? I do believe a well-built, user-friendly P2P app will take the internet by storm. We've only scratched the surface. What I am afraid of is that, instead of looking down the road and seeing what the requirements and capacity of the P2P app will be, the development community will continue to add to and tweak the current flawed or underpowered systems.

    --
    This is not the way to build a lasting empire.
  6. However by Mr_Silver · · Score: 3
    Let's hope that it doesn't suffer from the same problems as Gnutella.

    The technology of the Gnutella system is limited by the version that the majority use. In other words, if version 2.0 is out and has some cool new features, it'll be useless if the majority of people are using 1.0 because they won't recognise the new stuff.

    For example, say my Gnutella client can do regular expressions. Throw a regular expression at another (non-RE-supporting) client and it'll either think it's a normal search string (in which case you'll probably get nothing back) or it'll throw it away (so you'll also get nothing back). You can't win.
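
    A small illustration of that mismatch. Both search functions are hypothetical stand-ins, and the assumption that an older client does a plain substring match is only for the sake of the example.

```python
import re

FILES = ["eminem_stan.mp3", "readme.txt"]
QUERY = r"eminem.*\.mp3"   # a regular expression sent as the search string


def old_client_search(query, files):
    # Assumed behaviour: treat the query as a literal, case-insensitive substring.
    return [f for f in files if query.lower() in f.lower()]


def new_client_search(query, files):
    # Assumed behaviour: treat the query as a regular expression.
    return [f for f in files if re.search(query, f, re.IGNORECASE)]


print(old_client_search(QUERY, FILES))
print(new_client_search(QUERY, FILES))
```

    The older client looks for the literal text "eminem.*\.mp3" in each file name and finds nothing, even though it holds a file the newer client would match.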

    You can't make people update, and if you made v2 not backwards compatible then you'd fragment the network. Napster may be peer to peer, but if they release a new client with new features then the people who download it get those features immediately (unless the features require someone else with a newer client, of course).

    My other concern is the speed of searches. Again, the network is limited by the majority. With speed it's going to be all those people on modems. Have they done any testing on a very large scale? Does anyone know how much faster the network would be without all the people on modems?

    In the end, either this will work and be groovy or it'll flop or it'll work for a while, overload and die.

    One final thought: you have to wait at least 20 seconds to get a decent number of results back. Are people really prepared to sit there for that long (probably a *lot* longer) in front of their browser for the results? Most research indicates that people who have to wait more than 5 seconds for a webpage to download get bored and go elsewhere.

    --
    Avantslash - View Slashdot cleanly on your mobile phone.
  7. A bit exaggerated by grahamsz · · Score: 5

    Certainly in the context of using it on a LAN to locate things this could be a very powerful tool. However, how many sysadmins would like to have to secure every workstation from hackers instead of just every key server... fun stuff.

    However, on the internet it's doomed to failure. I followed the work on GnutellaNG for a little while and it seemed at that point to be attempting to reduce the bandwidth requirements of Gnutella by slimming down the protocol, whilst simultaneously increasing the functions and hence the bandwidth requirements.

    Ultimately any P2P system is limited by the outbound bandwidth that each user has. At the moment, with about 3000 hosts on Gnutella, you are using about 1.5-2kbytes/s for each connection you have open (most people have 2 - 4), and that doesn't include bandwidth left to upload or download.

    Curiously, though, this would be the optimal way of doing things (not Gnutella in particular, but P2P) if it weren't for the fact that we have so little bandwidth at the end user.

    Even cable modem users typically have only 128kbit upstream, which will only take gnutella to about the 10,000 user mark before it starts to fall over again. The same has to be true of any raw peer to peer system.
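
    A back-of-the-envelope check of those figures. It assumes, which the comment does not state, that per-connection search traffic grows linearly with the number of hosts, and that the whole 128kbit upstream is spent on search traffic; the function name is made up for this sketch.

```python
UPSTREAM_KBYTES = 128 / 8   # 128 kbit/s upstream is 16 kbyte/s
BASE_HOSTS = 3000           # network size at which the traffic was observed


def max_hosts(kbytes_per_conn, connections):
    """Network size at which search traffic alone fills the upstream link.

    Assumes traffic per connection scales linearly with host count.
    """
    kbytes_per_host = kbytes_per_conn * connections / BASE_HOSTS
    return UPSTREAM_KBYTES / kbytes_per_host


# Lightest and heaviest cases quoted above: 1.5-2 kbytes/s, 2-4 connections.
print(round(max_hosts(1.5, 2)))
print(round(max_hosts(2.0, 4)))
```

    Under these assumptions the ceiling comes out somewhere between roughly 6,000 and 16,000 hosts, which brackets the 10,000 figure.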

    No amount of optimisation removes the bandwidth cost of every search having to be executed on every host.

    Freenet on the other hand is a lot smarter than that and does actually move information about in a streamlined manner. Unfortunately I fear that freenet would fall over and die right now if it were holding the terabytes of files that gnutella does - so it appears not to be the best solution either.

    We need more bandwidth at end users and less at big corporations, except that would count as empowering the people and be morally repulsive to most politicians.