P2P Web searches


Posted by ryuzaki0 on Sunday September 12, 2004 @11:19AM from the one-level-up-please dept.

prostoalex writes "Researchers at UCLA are looking for easier ways to implement Web searches by using peer-to-peer techniques to decrease the workload. 'Queries need to be passed along only a few links rather than flooded throughout the network, which keeps search-related traffic low,' reports Technology Research News."

Too many people trying to use p2p by benna · 2004-09-12 11:23 · Score: 1, Interesting

I'm sick of all this hype about p2p. It's a good technology, but it's not like we have to use it for everything. The old ways of doing things still work.

--
"It is not how things are in the world that is mystical, but that it exists." -Ludwig Wittgenstein
1. Re:Too many people trying to use p2p by Anonymous Coward · 2004-09-12 11:40 · Score: 2, Interesting
  
  No, they won't. You need tons of server hardware to cope with the bandwidth of anything even remotely popular. Thus free services tend to be spoiled with ads and whatnot.
  
  The magic of p2p is that you can build the same out of 'thin air'. There are no expensive server rooms and gigabit lines but just a bunch of nodes that are slightly more complicated than simple clients. You use it, you provide it. Fair game and you get exactly the kind of service you want without strings attached. At least theoretically - the reality still seems to be something different.
2. Re:Too many people trying to use p2p by farble1670 · 2004-09-12 15:55 · Score: 2, Interesting
  
  ?
  
  the point is that new technologies are adopted when they improve on an existing method. we already have super-fast, super robust, complete search technologies that are not p2p ... so what problem are they trying to solve? an academic exercise? well, that's okay ... but let's call it what it is.
  
  google is already so fast that i would not notice if it were any faster. the best a p2p search technology could achieve would be equivalent speed, with the added cost of consuming my bandwidth.
I foresee.. by Gentlewhisper · 2004-09-12 11:24 · Score: 5, Interesting

Maybe in future Google will implement a small server in our "Gmail notifier" application, and each time we search for something on google, it will cache some of the results, and should anyone close by ask for it, just forward the old results to them.

Save the server load on the main google server!

**Plus maybe some smart guy will figure out how to trade mp3s over the GoOgLe-P2p network! :D

--
Online backup with Mozy, sounds like Ozzie, but more!
1. Re:I foresee.. by mrogers · 2004-09-12 16:42 · Score: 3, Interesting
  
  Actually, I'd rather have my next door neighbour know what I was searching for (and vice versa) than have any single person know what *everyone* was searching for. Power corrupts.
UCLA discovers ultrapeers! by Magila · 2004-09-12 11:30 · Score: 5, Interesting

From a quick read of the article it sounds like what they've done is implement a slightly more sophisticated/less deterministic version of the ultrapeer/hub system already in use by Gnutella/G2. Basically, queries are routed such that they are guaranteed to reach a "highly-connected node", which is the equivalent of an ultrapeer/hub node. The main difference is that the folks at UCLA have come up with a novel method of picking ultrapeers, but the end result isn't much different.
1. Re:UCLA discovers ultrapeers! by shadowmatter · 2004-09-12 15:22 · Score: 4, Interesting
  
  Not quite... Note: I'm about to karma whore here.
  
  About a year ago, right before starting my senior year at UCLA, I was offered an opportunity to work on this P2P project. At the time it was called "Gnucla," and was being developed by the UCLA EE department's Complex Networks Group. I turned it down, because I had already committed to working on a p2p system in the CS department. But since in all honesty their research was more novel than ours (and my friend was in their group), I subscribed to their mailing list and kept informed on what they were doing.
  
  What they've done isn't find a novel way of picking ultrapeers. Let's review what motivated ultrapeers -- in the beginning, there was Gnutella. Gnutella was a power-law based network. What this meant is that there was no real "topology" to it, unlike peer to peer networks that were emerging and based on Distributed Hash Tables (such as Chord, Pastry, Kademlia [on which Coral is based]). It had nice properties: a low diameter, and very resilient to attacks common on p2p networks. (Loads of peers dropping simultaneously could not partition the network, unlike, say, in Pastry -- unless they are high degree nodes.) But the big problem was that to search the network, you had to flood it. And that generated so much traffic that the network eventually tore itself apart under its own load.
  
  So someone thought that maybe if only a few, select, high-capacity nodes participated in the power-law network, it wouldn't tear itself apart because they could handle the load. These would become the ultrapeers. The nodes that couldn't handle the demands of a flooding, power-law network would connect to ultrapeers and let the ultrapeers take note of their shared files, and handle search requests for them. Thus, when a peer searches, no peer connected to an ultrapeer ever sees the search unless they have the file being searched for, because the searching happens at a level above them. Between low-capacity nodes and ultrapeers, it's much like a client-server model. Between ultrapeers, it's still a power-law network.
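
A rough sketch of the two-tier arrangement described above, assuming a toy in-memory model. The names (`Ultrapeer`, `register_leaf`, `search`, `link`) and the `ttl` default are made up for illustration and are not taken from Gnutella or any real client. Leaves hand their file lists to an ultrapeer, and queries are flooded only among ultrapeers.

```python
from collections import defaultdict


class Ultrapeer:
    """Hypothetical ultrapeer: indexes its leaves' files and floods
    queries to neighboring ultrapeers only (never to leaves)."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []             # other ultrapeers
        self.index = defaultdict(set)   # filename -> names of leaves sharing it

    def register_leaf(self, leaf_name, filenames):
        # A low-capacity node uploads its shared-file list once.
        for filename in filenames:
            self.index[filename].add(leaf_name)

    def search(self, filename, ttl=3, seen=None):
        # Flood among ultrapeers with a hop limit; 'seen' stops loops.
        seen = set() if seen is None else seen
        if self.name in seen:
            return set()
        seen.add(self.name)
        hits = set(self.index.get(filename, ()))
        if ttl > 0:
            for peer in self.neighbors:
                hits |= peer.search(filename, ttl - 1, seen)
        return hits


def link(a, b):
    """Connect two ultrapeers in both directions."""
    a.neighbors.append(b)
    b.neighbors.append(a)


# Three ultrapeers in a chain: a - b - c
a, b, c = Ultrapeer("a"), Ultrapeer("b"), Ultrapeer("c")
link(a, b)
link(b, c)
a.register_leaf("leaf1", ["song.mp3"])
c.register_leaf("leaf3", ["song.mp3", "doc.txt"])
print(a.search("song.mp3"))
```

The leaves never see a query; only the ultrapeers that index them do, which is the client-server flavor mentioned above.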
  
  But the ultrapeer network has problems of its own, so this group sought a way to search a power-law based network, such as Gnutella, without flooding. They exploited the fact that, in a power-law network, select nodes have very high degree connectivity. If you take a random walk on a power-law based network (meaning, starting from your own PC, randomly jump to a node connected to you, randomly jump to a node connected to that node, etc...) you'll end up at or passing through a node with very high connectivity. Thus, these high-degree nodes are a natural rendezvous point for clients wishing to share files and clients wishing to download files. Perhaps, in this sense, they are "ultrapeers," but we haven't separated the network into two different architectures like before. The network is still entirely power-law based, and retains all its wonderful properties.
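
To make the random-walk idea concrete, here is a minimal sketch. It assumes a preferential-attachment model (each new node links to an existing node with probability proportional to its degree) as a stand-in for a power-law network; that model, and all the function names, are assumptions for illustration and not the group's actual code.

```python
import random


def preferential_attachment_graph(n, rng):
    """Grow a graph of n nodes where each new node links to one existing
    node chosen with probability proportional to that node's degree."""
    adj = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # every node appears once per edge end (degree-weighted)
    for new in range(2, n):
        target = rng.choice(endpoints)
        adj[new] = {target}
        adj[target].add(new)
        endpoints += [new, target]
    return adj


def random_walk(adj, start, steps, rng):
    """Hop to a uniformly random neighbor at each step; return the path."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(sorted(adj[path[-1]])))
    return path


def best_degree_seen(adj, path):
    """Highest degree among the nodes the walk passed through."""
    return max(len(adj[node]) for node in path)


rng = random.Random(2004)
graph = preferential_attachment_graph(2000, rng)
walk = random_walk(graph, start=1999, steps=20, rng=rng)
print("start degree:", len(graph[1999]))
print("best degree seen on walk:", best_degree_seen(graph, walk))
print("max degree in graph:", max(len(nbrs) for nbrs in graph.values()))
```

A walk picks its next hop through an edge, and most edges lead to well-connected nodes, so short walks tend to pass through hubs. The exact numbers depend on the seed and on how closely a real network follows the model.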
  
  But that's not the entire story, just the gist of it. There are other neat tricks to it... Trust me, this is really good stuff we're talking about here. They recently won Best Paper Award at the 2004 IEEE International Conference on Peer-to-Peer Computing. (See paper here.)
  
  "Brunet," as they call it, is designed to be a framework for any peer-to-peer application that could exploit the percolation search outlined above. Google-like searching is just one possible approach (and perhaps a little unrealistic...). Right now I can tell you that they have a chat program in the works, and it is working well. The framework should be released when it's ready.
  
  Please don't flood me with questions -- remember, I'm not actually in their research group :)
  
  - sm
This was already tried... by shodson · 2004-09-12 11:40 · Score: 4, Interesting

Infrasearch was working on this, until Sun paid $8M for the company, then had them work on something else, then Gene Kan committed suicide. Be careful what you work on.
There already is distributed crawling by Anonymous Coward · 2004-09-12 11:53 · Score: 3, Interesting

It's called grub.
An alternative idea for complete indexing.... by i_want_you_to_throw_ · 2004-09-12 12:01 · Score: 4, Interesting

Feel free to shoot full of holes as needed....

Every website has DNS servers so what if that same company that ran the DNS servers indexed the pages of the sites that it hosted? Daily?

Wouldn't that then provide a complete index of the web?

Start a search and somehow get the results back through that distributed method. Haven't figured that out yet...... but if you can...
PROFIT!!!!!
1. Re:An alternative idea for complete indexing.... by timealterer · 2004-09-12 19:09 · Score: 2, Interesting
  
  So, under this theory... everybody indexes their own content? Implying, everybody would provide legitimate "indexes" and not simply provide whatever is most likely to bring in search engine visitors? "Look, here's my index! My site has a MILLION pages of free porn warez!!" Indexing needs to be done by a third party, that's just the way it is.
  
  --
  - Allen Pike
  Altering time, one time at a time.
Re:If it's P2P... by A1kmm · 2004-09-12 12:11 · Score: 3, Interesting

From the article...
> In this last step, all of the initially queried
> nodes percolate the query throughout the network
> so that the query is guaranteed to reach a core
> sub-network of highly-connected nodes. "Since a
> copy of the query is in one of the nodes in the
> core network, and since the content list of a
> node is cached at one of these high-degree
> nodes, one is guaranteed to find the content as
> long as at least one node in the network has
> it," said Roychowdhury.
So in other words, the "major sharers", i.e. nodes which are "high degree", i.e. have a lot of connections, form the "core network", and collectively host the entire index. However, this is starting to lose the advantages of being a peer-to-peer network. Obviously, you can't have it both ways.

--
X-Has-Sig: yes
another senseless Slashdot story title by Bert690 · 2004-09-12 12:55 · Score: 3, Interesting

This is some pretty cool research, but this really has pretty much nothing to do with the web.
It's an article describing a new p2p query routing method. Nothing more. There are already a lot of such algorithms out there. This one seems to exhibit some nice completeness properties that hold in idealized scale-free networks. But I'm not convinced such a theoretical property would hold in the real world. While p2p networks tend to be roughly scale-free, the "roughly" and "tend to be" qualifiers are what make such theoretical properties unlikely to hold in practice.
Nice to see they plan to release some software based on the technique though.
Ants p2p Implements A Distributed search engine by microbrewer · 2004-09-12 13:03 · Score: 4, Interesting

A peer to peer program, Ants P2P, has just implemented a Distributed Search Engine. Ants P2P is based on Ant Routing Algorithms, so it needed a solution for finding files on its network, and it found one that works. The network also has an HTTP tunneling feature, and its developer, Roberto Rossi, is creating a search solution based on similar methods to search Web pages published on the network.

Ants P2P is designed to protect the identity of its users by using a series of middle-men nodes to transfer files from the source to destination. As additional security, transfers are Point to Point secured and EndPoint to EndPoint secured.

1. Distributed search Engine - Each node performs periodic random queries over the network and keeps an indexed table of the results it gets. When you do a query you will get files with or without sources. If you get files simply indexed (without a source), you can schedule the download. As soon as Ants finds a valid source, it will begin the download. This will also solve the problem of unprocessed queries. This way you will get almost all the files in the network that match your query with a single search.
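
A minimal sketch of that index-and-schedule behavior, under heavy assumptions: the `Node` class, its method names, and the idea that a source is dropped when it goes offline are all hypothetical, and none of this reflects the real Ants P2P code or its anonymous routing.

```python
class Node:
    """Hypothetical node that probes peers and keeps a local result table."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.table = {}         # filename -> known sources (may be empty)
        self.scheduled = set()  # files wanted but currently without a source

    def answer(self, term):
        # Report own matching files plus matching entries already indexed.
        results = {f: {self.name} for f in self.files if term in f}
        for f, sources in self.table.items():
            if term in f:
                results.setdefault(f, set()).update(sources)
        return results

    def probe(self, peer, term):
        # One "periodic random query": merge the peer's answers locally.
        for f, sources in peer.answer(term).items():
            self.table.setdefault(f, set()).update(sources)

    def mark_offline(self, source):
        # The file stays indexed, but this source is no longer valid.
        for sources in self.table.values():
            sources.discard(source)

    def request(self, filename):
        # Return a source to download from, or schedule the file for later.
        sources = self.table.get(filename)
        if sources:
            return sorted(sources)[0]
        self.scheduled.add(filename)
        return None

    def retry_scheduled(self):
        # Start any scheduled download that now has a valid source.
        started = {}
        for f in sorted(self.scheduled):
            sources = self.table.get(f)
            if sources:
                started[f] = sorted(sources)[0]
        self.scheduled -= set(started)
        return started


a = Node("a", ["ubuntu.iso"])
b = Node("b")
b.probe(a, "ubuntu")            # b indexes ubuntu.iso with source "a"
b.mark_offline("a")             # entry remains, now without a source
print(b.request("ubuntu.iso"))  # no source yet, so it gets scheduled
b.probe(a, "ubuntu")            # a later probe rediscovers the source
print(b.retry_scheduled())
```

Because answers include what a node has already indexed, results can spread through the network over repeated probes without every query having to reach every node.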

http://sourceforge.net/projects/antsp2p/
P2P is a cheap excuse for a system.. by Turn-X+Alphonse · 2004-09-12 14:03 · Score: 3, Interesting

I'm so sick of companies wanting to push off their crap onto us. If I want something from them they should offer it me on terms I find acceptable.

In this case a couple of text links which may interest me (Google reference : check).

I don't want to have to share my bandwidth with 50 other people so they can do the same. If you want to run a service, website or game server you should pay for it. Don't start passing off the bandwidth bill onto us users.

Either get used to the heat (price) or get out of the kitchen (market).

--
I like muppets.