"Random Walkers" may speed P2P networks

← Back to Stories (view on slashdot.org)

"Random Walkers" may speed P2P networks

Posted by ryuzaki0 on Saturday July 6, 2002 @04:32AM from the sorta-like-a-mall-walker-but-different dept.

sean23007 writes "New Scientist posts an article about an innovative new method of controlling P2P traffic to maximize speed over a very large network. The idea, thought up by researchers at Princeton, Berkeley, AT&T, and Cisco, involves sending random "walkers" around the network, looking for a particular file, which would theoretically yield much better search speed than such other networks as Gnutella. They claim this could result in a network very capable of facilitating a massive distributed supercomputer."

7 of 129 comments (clear)

Min score:

Reason:

Sort:

Building this capability on top of existing networ by Space+Coyote · 2002-07-06 04:36 · Score: 3, Interesting

It shouldn't be too difficult to build a system like this that could operate over something like gnutella. You could monitor the search queries and record the most popular requests, then you could use the random walker technique to search out the best sources for these requested files. Anybody else running the same software could enter in a search query and have it look first in the list of popular items before doing the standard crawl-type search over the gnutella network.

--
___
Cogito cogito, ergo cogito sum.
Humans are the best robots by popeydotcom · 2002-07-06 04:54 · Score: 3, Interesting

Why let some algorithm trawl around your network when you can let some human dweeb do it for you. The problem with robots - or computers in general, is that they can be fooled into thinking something is something else.

Personally I use Filemirrors. It shows what people are really downloading.
They're on the right path by Jeremiah+Blatz · 2002-07-06 05:02 · Score: 3, Interesting

Ohh, bad pun, sorry.

But really, I think the way to think about P2P downloading is to use non-interactive search agents. These guys are on the right path, the random walkers appear to be a bit brighter than an exponentially widening search.

The problem with current p2p networks is that the database is constantly churning. It's not like the web, where data is relatively stable. Two identical searches performed within minutes of each other will return different results. The problem, of course, is that polling the network with these huge searches inflicts a massive bandwidth cost.

A better system would be to have a search agent. You would enter a set of search criteria (with a richer syntax than the existing model), and every 30 seconds or so the software would fire off an agent. The agent does a random walk, then downloads anything that matches. You then go on about your business. When the agents have collected a sufficient number of results. it alerts you.

This lacks the immediate satisfaction of searching and getting results right now, but when you're looking for obscure stuff, you don't get that satisfaction anyway.

Also note that the autonomous downloading assumes you're looking for something reasonably small, like an mp3, and not Girls Gone Wild.

Of course, it also has the advantage that if gives an incentive for hosts to remain connected longer, adding to the stability of the entire system.
non-real time searching? by jeffehobbs · 2002-07-06 05:07 · Score: 4, Interesting

It occured to me a while ago that the ability of a peer-to-peer's network to provide a particular file would go up dramatically if less emphasis was put on "real-time" searching and instead in a sort-of subscription model -- the user would request a file, and the request could propagate overnight from client to client while the network itself decided the best/most reliable way to provide that particular file to that particular user... beyond privacy concerns and TTL spoofing on requests (which are probably both non-trivial problems with this idea) I'd certainly be willing to wait a couple days to get a particular file or set of files as long as I didn't have to "babysit" the client and as long as I could be assured that the file would be full and complete when it finally arrived.

~jeff
1. Re:non-real time searching? by MasterVidBoi · 2002-07-06 06:29 · Score: 2, Interesting
  
  You might like eDonkey 2000.
  
  While I don't know how searches work exactly, they do continue to add new results long after I send them. As for downloading files, you give it a file size and an mp4 hash, and it'll go running around the network trying to find that file. It seems to do a good job of downloading from multiple sources at once.
  
  I run a third party client, mldonkey, as a daemon. A couple times a day I telnet into the daemon and check on the status of my existing transfers. It's not fast, but it will eventually get the file...
  
  As for searching, I barely use that anymore, since there are a couple indexing web sites out there (namely http://www.sharereactor.com), that have a large index of reviews of particular files (such as video quality), and will provide you with the size/hash of that file. When you go to ed2k, you'll get exactly that file. It seeems to work pretty well.
The paper is online by jdbarillari · 2002-07-06 05:12 · Score: 3, Interesting

Qin Lv, one of the researchers, was a TA in one of my classes last semester. She has a website, and the paper is posted there.
Even better yet by PureFiction · 2002-07-06 06:00 · Score: 3, Interesting

Use a non random walk (i.e. tuned to a specific domain or set of preferences) and let it walk the network with a lightweight UDP protocol:

cubicmetercrystal.com/alpine/

The conservative use of bandwidth with an intelligent search mechanism can provide scalable decentralized search for peer networks of any size.