Slashdot Mirror


"Random Walkers" may speed P2P networks

sean23007 writes "New Scientist posts an article about an innovative new method of controlling P2P traffic to maximize speed over a very large network. The idea, thought up by researchers at Princeton, Berkeley, AT&T, and Cisco, involves sending random "walkers" around the network, looking for a particular file, which would theoretically yield much better search speed than such other networks as Gnutella. They claim this could result in a network very capable of facilitating a massive distributed supercomputer."

16 of 129 comments (clear)

  1. Building this capability on top of existing networ by Space+Coyote · · Score: 3, Interesting

    It shouldn't be too difficult to build a system like this that could operate over something like gnutella. You could monitor the search queries and record the most popular requests, then you could use the random walker technique to search out the best sources for these requested files. Anybody else running the same software could enter in a search query and have it look first in the list of popular items before doing the standard crawl-type search over the gnutella network.

    --
    ___
    Cogito cogito, ergo cogito sum.
  2. Two important caveats from the article: by StandardCell · · Score: 5, Insightful

    "...such file networks have particular difficulty locating uncommon documents, and that walkers are especially bad at it. This is because there are relatively few walkers, which might have to visit every node on the network to find the file.

    "Langley also says the different connection speeds and processing power of the individual computers in the network may complicate the model: "Random walkers are going to end up at the well connected nodes."


    My biggest pleasure from P2P is finding obscure or rare music. I could care less how many copies of N*Sync or Britney Spears there are. Give me rare studio cuts or bootleg recordings of the Grateful Dead any day. This strategy won't help searches for those types of items, but it sure will help the sheeple get their music.

  3. Random Walkers and other algorithems by Juhaa · · Score: 5, Insightful

    If your on a marginally slow network (say upto a T1 (24 lines at 64 kbps)) and if you enable full node peering support on a Gnutella like network, you know why Random Walkers is needed. Cause the networks have become so huge, most of the bandwidth is taken by message searches. Even if you are not a node peer you would still be saturated by these.

    The Random Walker idea isn't new, it was tried a few times before as well on the orginal Gnutella network when they first came out. But, I don't think Random Walkers would be the answer to this network, the bigger the network gets, even Random Walkers would saturate the bandwidth and all you'd have is these bots finding the optimal path to files.

    Presently the network works much like how old routhers worked. Search packets are broadcasted to peering nodes which then broadcast the same packets off to their own neighbors. But the problem is that their own neigbors are sometimes the ones that sent the orignial packets. If this was to be done efficently (and how routers implement it), they need to create buffers to hold searches, once a search is recived a search would not be propagated to the other peers, it would be held for a given time period (say a few milliseconds), the node would then wait for the same search to appear from other neighoring nodes, it if does then it wouldnt send this search off down that node. This would cut down a lot of the wasted bandwidth and I am not sure why Gnutella ppl didn't model p2p after these routing methods.

    Walkers in an intellient routing setup would be the most optimial way of doing p2p, hopefully this would loose up some of on the congestion on the current p2p networks and let people with less bandwidth access to the shared files with the least sacrifice on bandwitdh for searches.

    1. Re:Random Walkers and other algorithems by Sanity · · Score: 3, Informative
      Walkers in an intellient routing setup would be the most optimial way of doing p2p, hopefully this would loose up some of on the congestion on the current p2p networks and let people with less bandwidth access to the shared files with the least sacrifice on bandwitdh for searches.
      Freenet does something quite like this. Requests are routed based on local heuristic knowledge of the network, information gravitates towards the demand for that information through a form of dynamic mirroring.

      this paper (PDF) gives a great overview.

  4. like the saying goes... by r00tarded · · Score: 3, Funny

    to know the path you must walk the path

  5. The research paper by sitcoman · · Score: 4, Informative

    The paper that was presented, entitled Search and Replication in Unstructured Peer-to-Peer Networks, can be found here (top of the page)

    --

    -=20
    me doesn't live for do [DEPRECATED]

  6. Humans are the best robots by popeydotcom · · Score: 3, Interesting

    Why let some algorithm trawl around your network when you can let some human dweeb do it for you. The problem with robots - or computers in general, is that they can be fooled into thinking something is something else.

    Personally I use Filemirrors. It shows what people are really downloading.

  7. More Q&A by sam_handelman · · Score: 5, Funny

    Q: Can we use it to predict the weather?
    A: Yes.

    Q: Well, that's not bad. Can it model the behavior of biological molecules?
    A: You betcha.

    Q: Still, that's rather pedestrian. Can it find large prime numbers?
    A: Numbers so big we can't even say them.

    Q: Hmm.. almost there. Can it evaluate complex object-relational predicates to get me EXACTLY the porn I want?
    A: Er..... Yes.

    Q: Excellent! Now we're talking. From the output, can we say conclusively that all of the porn which I want has been found?
    A: Please go away.

    Q: What about porn that I don't want - gay porn, scatalogical stuff. Can you guarantee I won't get that?
    A: Fine, yes. It predicts what you want from your past behavior and is always right. Happy?

    Q: Isn't that an invasion of my privacy?
    A: Arghhh!

    --
    The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
  8. They're on the right path by Jeremiah+Blatz · · Score: 3, Interesting

    Ohh, bad pun, sorry.

    But really, I think the way to think about P2P downloading is to use non-interactive search agents. These guys are on the right path, the random walkers appear to be a bit brighter than an exponentially widening search.

    The problem with current p2p networks is that the database is constantly churning. It's not like the web, where data is relatively stable. Two identical searches performed within minutes of each other will return different results. The problem, of course, is that polling the network with these huge searches inflicts a massive bandwidth cost.

    A better system would be to have a search agent. You would enter a set of search criteria (with a richer syntax than the existing model), and every 30 seconds or so the software would fire off an agent. The agent does a random walk, then downloads anything that matches. You then go on about your business. When the agents have collected a sufficient number of results. it alerts you.

    This lacks the immediate satisfaction of searching and getting results right now, but when you're looking for obscure stuff, you don't get that satisfaction anyway.

    Also note that the autonomous downloading assumes you're looking for something reasonably small, like an mp3, and not Girls Gone Wild.

    Of course, it also has the advantage that if gives an incentive for hosts to remain connected longer, adding to the stability of the entire system.

  9. You are a genius by HanzoSan · · Score: 3



    buffering searches would definately solve the problem.

    Why dont we buffer the searches and cache the route somehow, on certain computers so that when anyone else searches it goes directly down the preset buffered path?

    Ok since you have the solution, go implement it, code it!

    --
    If you use Linux, please help development of Autopac
  10. non-real time searching? by jeffehobbs · · Score: 4, Interesting


    It occured to me a while ago that the ability of a peer-to-peer's network to provide a particular file would go up dramatically if less emphasis was put on "real-time" searching and instead in a sort-of subscription model -- the user would request a file, and the request could propagate overnight from client to client while the network itself decided the best/most reliable way to provide that particular file to that particular user... beyond privacy concerns and TTL spoofing on requests (which are probably both non-trivial problems with this idea) I'd certainly be willing to wait a couple days to get a particular file or set of files as long as I didn't have to "babysit" the client and as long as I could be assured that the file would be full and complete when it finally arrived.

    ~jeff

  11. The paper is online by jdbarillari · · Score: 3, Interesting

    Qin Lv, one of the researchers, was a TA in one of my classes last semester. She has a website, and the paper is posted there.

  12. Even better yet by PureFiction · · Score: 3, Interesting

    Use a non random walk (i.e. tuned to a specific domain or set of preferences) and let it walk the network with a lightweight UDP protocol:

    cubicmetercrystal.com/alpine/

    The conservative use of bandwidth with an intelligent search mechanism can provide scalable decentralized search for peer networks of any size.

  13. Just like The MS Office Tool! by dbretton · · Score: 3, Funny


    I can picture it now, playing Return to Castle Wolfenstein, with Gnutella running in the background...
    Suddenly, my framerate grinds to a halt.
    "F%#!!@ Gnutella findfast"

  14. Another, i think somewhat better approach. by jon_c · · Score: 4, Informative

    Is a project called Circle. It uses the idea of a distributed hash table.

    The Author has a overview here.

    My brief synopsis:

    The network is orginized in a circle, like a circular link list, each node knows about it's neightbors it's neighbors neighbors and maybe a little more, bassicly every node knows about maybe 6 or so other nodes. Each node keeps a section of the hash table, say 0x0500-0x1000 or so.

    Each search item is hashed and then sent left or right depending on if it's less or greater then range you are storing, so 0x3000 would be sent to the rightmost node because it's larger. That node would repeat the process, therefore making the search a lot like a binary search.

    -Jon

    --
    this is my sig.
  15. anonymous databasing? by dh003i · · Score: 3, Insightful

    Here's an idea...

    Rather than dynamically searching for the file you want each and every time you type in the name, why not each user create a data-base on the files all the other users on the network are using?

    Once you get on the network, it does a search and accesses other people's database, so your current file database is updated.

    This would require much less bandwidth than searching dynamically every time you typed in a search term (though it would require a little bit of hard drive space), and it would allow search results to be produced very quickly, as you'd essentially be searching a file on your hard drive.