Slashdot Mirror


Napster and Gnutella Measurements

belswick writes "UW has posted a paper titled "Measuring and Analyzing the Characteristics of Napster and Gnutella Hosts" at Washington in PDF form. Interesting reading for those who implement P2P software, with actual measurements, tools, and topologies. You 3l33t H4x0rz are ACM members, R1gh4?" You can get a cache of the PDF and view it online as well.

54 of 113 comments

  1. For those of you who despise PDFs for simple text by AKAImBatman · · Score: 4, Informative

    ...here's the HTML Version courtesy of Google.

  2. You think that's bad... by jargoone · · Score: 2, Funny

    My company's intranet has all of our insurance claim forms on the web. They had recently redone the site, and I needed a claim form. I tried to do it from work, but seeing how I have an Ultra 5 and a client with a very restrictive software policy, I couldn't view the Word document posted. I figured it was the form itself, so I waited until I got home. When I did, I opened up the document, and was astonished to find that the document contained one line: a link to the insurer's website.

    I would like to get my hands on those developers... luckily they're 300 miles away.

  3. Way-y-y Out of Date by MattRog · · Score: 5, Insightful

    The numbers are all from early 2001. Napster has since been killed and reborn with an entirely different model, and gnutella (KaZaA et al) have exploded. What's the point of this report given the ancient data?

    Given those changes wouldn't it be more valuable to see if their hypotheses and conclusions hold up with the new data?

    --

    Thanks,
    --
    Matt
    1. Re:Way-y-y Out of Date by molarmass192 · · Score: 5, Informative

      Not to nitpick but Kazaa isn't based on Gnutella, it's based on FastTrack. They're both P2P but FastTrack is a closed system while Gnutella is an open one.

      --

      Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws-Plato
  4. is it me or... by boogy+nightmare · · Score: 4, Funny

    Do you alter the phrase from 'peer to peer' to 'idiots to me' in the privacy of your own head :)

    --
    Kingdom of Loathing (www.kingdomofloathing.com) Addicted is me
  5. mutella by jargoone · · Score: 3, Insightful

    Anyone else a big mutella fan? I always run it in a screen session, with the web interface enabled. I love that I can use the same session, from a terminal at home, ssh'ed in, over the web interface on my LAN, or through an https tunnel. Great piece of software, highly recommended.

    1. Re:mutella by BRSloth · · Score: 1

      Yeah, I was one. Not anymore: since I learned about giFT, I've stopped using it.

      giFT runs as a small server that clients connect to, so I can control it graphically at home (using giFToxic) or remotely (using ssh and giFTcurs).

      Also, giFT turns all that research into garbage, since it can connect to several servers of several different types. It currently comes with OpenFT (giFT's original protocol) and Gnutella by default, but you can also find a FastTrack network plugin for it. There is also an OpenNap network plugin in the works (and I just can't wait to get my hands on it).

  6. Re:Ugh. by metallicagoaltender · · Score: 1

    The elite hackers comment was part of the original submission, as shown by the fact that it's italicized. Granted, Hemos could have edited that part out, but don't blame Slashdot for the lameness of some of its readers.

  7. Shame they didn't consider Freenet by Sanity · · Score: 5, Informative

    Freenet's next-gen routing algorithm does detailed analysis of node performance and incorporates this into its routing decisions. In effect, Freenet already implements their proposal, neatly integrating it into the Freenet routing algorithm.

    1. Re:Shame they didn't consider Freenet by skajake · · Score: 2, Interesting
      I have tried Freenet numerous times over the years, but every time it has proven to be dog slow. If they have implemented said algorithms, why is the performance still so bad?

      --

      ~ Maintainer of the Skajake Projects

    2. Re:Shame they didn't consider Freenet by sjwt · · Score: 1

      Yes,
      it was written this year,
      but the reason to ignore Freenet was
      probably based on the fact that the
      data they collected was from 2001.

      --
      You have 5 Moderator Points!
      Which Helpless Linux zealot/MS basher do you want to mod down today?
    3. Re:Shame they didn't consider Freenet by sjwt · · Score: 3, Funny

      from http://freenetproject.org/index.php?page=download

      "Download Freenet
      Important note for first time users
      When you first start Freenet your node will know very little about the network - it could take up to several minutes or longer to open a website. Keep trying, because the more you use Freenet, the faster it will get. "

      --
      You have 5 Moderator Points!
      Which Helpless Linux zealot/MS basher do you want to mod down today?
    4. Re:Shame they didn't consider Freenet by OverlordQ · · Score: 2, Insightful

      The Freenet network has been HORRIBLE lately. You used to be able to download videos at quite a good speed; now it's nearly impossible to fetch 5k text documents.

      --
      Your hair look like poop, Bob! - Wanker.
    5. Re:Shame they didn't consider Freenet by Anonymous Coward · · Score: 1, Insightful

      Unfortunately, even with NGR, Freenet suffers from the fact that you can never be sure what is actually there and up-to-date.

      You can start downloading a splitfile, it'll successfully start...and then half-way through (or even 90% through), decide that it can't find the rest of the blocks required. Retrying may help, or it may not. All the blocks might not even exist on the network any longer. Then again, Freenet's purpose is quite different compared to other P2P systems.

      If Freenet sites were kept up-to-date a bit better (I think a lot of people gave up on it pre-NGR), it might remain more usable. Unfortunately, its current state isn't that hot.

    6. Re:Shame they didn't consider Freenet by hankaholic · · Score: 1

      Have you tried integrating a virgin node recently?

      I have, and it's not gone well.

      --
      Somebody get that guy an ambulance!
    7. Re:Shame they didn't consider Freenet by GigsVT · · Score: 1

      If they have implemented said algorithms, why is the performance still so bad?

      It's written in Java. As much as Java zealots deny it, the fact remains that Java apps are all really bloated and slow.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  8. Temporary Mirror by arvindn · · Score: 4, Informative

    http://theory.cs.iitm.ernet.in/msj.pdf
    http://theory.cs.iitm.ernet.in/msj.txt

  9. academia. by rebelcool · · Score: 4, Informative

    paper publishing takes a long time. gathering, analyzing data and making sure you're coming to proper conclusions takes lots and lots of research and double checking. And then there's peer review, which takes months as the paper gets submitted to academic peers who read and analyze and comment on it - when they've got the time.

    In any case, the data points themselves aren't as relevant as the topology and structure of growth. doesn't matter if the data is from 2001, there's plenty to be learned from it.

    --

    -

  10. outdated by milamber.net · · Score: 5, Insightful

    This is a very old (in internet terms) report. Its results are taken from when gnutella and napster were two "popular" p2p architectures (the report refers to data gathering in May of 2001). Since then napster has died and been reborn in a new form, and gnutella has adopted a 2-tier topology such as kazaa has. Companies such as clip2, who are long dead and buried, are referenced, and bandwidth usage is stated from 2001, making the results useless and the references impossible to find.

    I assume it's an old report resubmitted by somebody who doesn't know better; otherwise, research like this is worse than useless because it provides completely inaccurate results.

    1. Re:outdated by gl4ss · · Score: 1

      it's not totally useless.

      in fact, it's a pretty useful comparison of the _techniques_ used (wouldn't matter if it's from year 1023 or from year 4034).

      you just shouldn't treat it as your usual "hottest gfx/cpu/mem/hd/usb-device" comparison.

      .

      --
      world was created 5 seconds before this post as it is.
  11. Does anyone care? by AKAImBatman · · Score: 2, Funny

    27 comments and not one actually on topic. Does anyone care about these statistics? Or know what to do with them? For that matter, has anyone successfully read the paper without their eyes glazing over? I'm sure it's a fascinating paper to someone, but I can't get past the first two pages without loosing all concentration. And I'm weird enough that I usually like this stuff!

    1. Re:Does anyone care? by spitefulcrow · · Score: 1

      Uh. This paper has been published for months already and was submitted to at least one SIG conference. Statistics on it are actually quite interesting in that they show popularity of objects to be non-Zipf.

      --
      Sorry, my karma just ran over your dogma.
  12. some useful stats, but outdated by Adam+Fisk · · Score: 5, Informative

    This study is based on extremely old data and is not particularly relevant for today's Gnutella. The Gnutella crawl data is from 2001, a time when Gnutella was a vastly different network with a completely different searching architecture. Gnutella at the time was a very young protocol. Since then, the search architecture has moved beyond the flooding model, now using a combination of distributed indexing and "dynamic querying." These techniques are specified in detail here.
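For readers who have not seen the old flooding model written down, here is a minimal Python sketch of a TTL-limited flood. It is an illustration only, not code from any Gnutella client: the `flood_query` function, the dict-of-lists overlay, and the four-peer ring are all assumptions made for this example, and it ignores details such as not echoing a query back to the peer it came from.

```python
from collections import deque


def flood_query(graph, origin, ttl):
    """Simulate a TTL-limited flood over a hypothetical overlay.

    graph maps each peer to a list of its neighbours.
    Returns (set of peers reached, number of messages sent).
    """
    seen = {origin}
    frontier = deque([(origin, ttl)])
    messages = 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL exhausted, this peer does not forward
        for neighbour in graph[peer]:
            messages += 1  # duplicates still cost a message
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return seen, messages


# Hypothetical four-peer ring: a - b - c - d - a
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
reached, sent = flood_query(ring, "a", ttl=2)
```

Even on this toy ring the message count grows faster than the number of peers reached, which is the cost that distributed indexing and dynamic querying were meant to cut down.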

    The data on average number of shared files and uptime is interesting, but there's really not a lot in here that is actually useful for peer to peer development. There's a lot of active, very useful research being done elsewhere. The folks at Stanford have done a great deal of work in this area, much of it very applicable. Their work is here.

    --

    Adam Fisk

  13. Encrypted P2P ... by bigwavejas · · Score: 4, Informative
    Slashdot ran an article a while back regarding "a secure, distributed mesh-like networking protocol and platform called Waste." See the article at:
    http://slashdot.org/articles/03/05/29/0140241.shtml?tid=126&tid=93

    I've been using the software to send files securely to trusted friends, I wonder if this isn't the direction sharing mp3s will go in the future, in order to avoid the RIAA.

    In any case... Nullsoft has since banned using the software, but it's still available under the GPL at sites like:
    http://grazzy.mjoelkbar.net/waste/mirror/

    Snarf on!
    F the RIAA

    --
    "Simplify, simplify, simplify!" Thoreau
    1. Re:Encrypted P2P ... by bigwavejas · · Score: 1
      granted the software isn't exactly perfect and has flaws; however, it serves its purpose (for me at least), encrypted chat and sending of files... Can you recommend an alternative?

      thx, Bigwavejas

      --
      "Simplify, simplify, simplify!" Thoreau
    2. Re:Encrypted P2P ... by EvilTwinSkippy · · Score: 1

      Well at least it was available until you slashdotted it you insensitive clod!

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
    3. Re:Encrypted P2P ... by lobsterGun · · Score: 1

      I may be making this up, but I think that the WASTE located on sourceforge is a from-the-ground-up rewrite of the original.

    4. Re:Encrypted P2P ... by hughk · · Score: 1
      It works fine and is up on sourceforge. The problem is that it is for small nets only, i.e. <50 people, and the performance is not the best.

      What it is really good for is as a mini-groupware application. You can be in a hostile environment (the internet) and your shared files and messages are relatively secure.

      --
      See my journal, I write things there
  14. current size of the gnutella network by smd4985 · · Score: 3, Informative

    can be found here.

    --
    smd4985
  15. Re:Ugh. by metallicagoaltender · · Score: 1

    Obviously Slashdot isn't the government, but the average person who's going to whine about an 'elite hackers' comment from a reader and blame it on Slashdot usually is dumb enough to cry censorship when the situation is turned around.

  16. Re:For those of you who despise PDFs for simple te by AKnightCowboy · · Score: 2, Funny
    ...here's the HTML Version courtesy of Google.

    And here's the text summary from the researcher:

    Stop using P2P clients you fscking pirates, you're wasting all my pr0n bandwidth at the university.

  17. BaH! by vDave420 · · Score: 5, Informative

    As a major developer of one of the world's leading Gnutella clients, I can say this data is old, untimely, and really not "new news" to anyone involved in Gnutella.

    Much of this data is based upon estimates & reported crawler (ha!) data.

    Want some real, hardcore data about Gnutella (or at least the BearShare portion of it)?

    I invented a revolutionary distributed stats system that is in place in the latest versions of BearShare. No more guessing about p2p network information, like transfer bandwidth, etc. Try checking out some of my results.

    This data is collected from the network, in a brand new, distributed, 'polled-not-crawled' scheme with remarkably fast turnaround times on data (new data points every 5 mins, on average).

    Much, if not all, of the information in the above report is actively being summarized for Gnutella (again, the BearShare portion at least), and some early (non-automated) graphs of the results can be found in the above links.
    Expect (some of) this data (like node count, shared files/bytes, etc) to be available on our website (in real time) soon.

    Kinda interesting...
    In any case, the story's data is not novel any more, and certainly not timely. =)

    I like my data collections much better.

    -dave-

    --
    The pig browse. With Google. Sigh is to the chicken. Chicken is fool. Giggle. The DailyWTF giggle.
    1. Re:BaH! by GigsVT · · Score: 2, Insightful

      You admit to being responsible for installing spyware on thousands of people's computers?

      I hope they catch you some day. You are no better than any other virus writer.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  18. Re:Napster didn't return... the uncatchy name did by adzoox · · Score: 1
    Grammar corrections:

    The name Napster didn't seem like a catchy name to begin with. It also got associated with "song theft" or "Geek/Techie rights fighters"

    The Napster of today isn't NEARLY as fast due to the fact it's no longer a peer to peer system. It can't even begin to compete (bandwidth wise) with Gnutella. On a 56k modem I can get a song off of LimeWire (which partially uses the Gnutella network) in about 3 to 3.5 minutes. On Napster, it takes about 8 minutes. Neither is a problem with broadband.

    That said, I could also get a file that was cross platform compatible off the "old napster" - the new Napster can't be used effectively with Macs because of the DRM format chosen.

    And yes, I'm aware WMP9 came out for X the other day, it still doesn't have the playlist management iTunes or even Audion has.

    --
    Yell & scream & rant & rave... it's no use... you need a shaaaave ~ Bugs Bunny
  19. ARGGGHHH!!!! by AKAImBatman · · Score: 1

    without loosing all concentration

    LOSING, LOSING, LOSING!

    * AKAImBatman beats himself over the head with a wet mackerel.

    I think Safari needs a grammar checker...

    1. Re:ARGGGHHH!!!! by Dave+Clifton · · Score: 1

      I see 'loosing' so often it looks like a typo if someone writes 'losing'!

  20. Reminds me of a quote... by EvilTwinSkippy · · Score: 3, Interesting
    I have a friend who is a history major. He always says that History isn't history until everyone who was there has died.

    I'm starting to get the sense that Science really should stick with the timeless concept. 2 years is a blink of an eye when preparing a paper on particle physics, or mathematics. 2 years is at least 8 lifetimes on the internet. By the time you write about it, it's obsolete.

    --
    "Learning is not compulsory... neither is survival."
    --Dr.W.Edwards Deming
  21. Re:Please mod parent up by AKAImBatman · · Score: 1

    What else should we do for the next hour, until the next article arrives?

    Work?

  22. How to know when not to RTFA by Kethinov · · Score: 1

    When you see leetspeak in the summary, you know to keep your distance from the actual article.

    --
    You're right, I wouldn't steal a car. But if it were possible, I sure as hell would download one!
    1. Re:How to know when not to RTFA by JUSTONEMORELATTE · · Score: 1

      When you see leetspeak in the summary, you know to keep your distance from the actual article.

      But go ahead and post a comment, 'cause this is slashdot after all.

      --

    2. Re:How to know when not to RTFA by Kethinov · · Score: 1

      Of course! How else is one supposed to pass the long and boring hours at work?

      --
      You're right, I wouldn't steal a car. But if it were possible, I sure as hell would download one!
  23. Reminds me of the weather prediction system ... by JSkills · · Score: 1
    I once heard of a weather prediction system, based on a set of mainframes working in parallel, that once it had sufficient data, could predict the weather for the next day to 99.9% accuracy.

    The problem was that the process ran for 5 days, so if it started on Sunday, you could know what Monday's weather would be by the following Friday.

    This study (and I do understand it takes time to pull together this kind of comprehensive usage data in an organized format) falls along the same lines. It would have been far more relevant had it been published when the original Napster was actually (1) in its prime or at least (2) still up.

  24. Re:Please mod parent up by Dot.Com.CEO · · Score: 1
    This is the first useful comment in the entire thread.

    Well, given that you've posted something like ten messages in this thread, I find your comment mildly comic.

    --
    Mother is the best bet and don't let Satan draw you too fast.
  25. Re:Please mod parent up by AKAImBatman · · Score: 1

    Heh. No argument here. :-) What can I say, it's Monday, development db is down and I'm bored.

  26. Re:Ugh. by metallicagoaltender · · Score: 1

    Sadly, your humility is below average... ;-)

    Now after all of this chest thumping (which is even cuter as an AC) are you actually going to boycott Slashdot, or did you just want to join the masses of people that will complain about Slashdot without actually doing anything about it?

  27. Conclusions by TuringTest · · Score: 3, Informative

    Thanks to the structure of scientific papers, you don't have to actually RTFA in order to know what it is all about:

    5 Conclusions

    In this paper, we presented a measurement study performed over the population of peers that choose to participate in the Gnutella and Napster peer-to-peer file sharing systems. Our measurements captured the bottleneck bandwidth, latency, availability, and file sharing patterns of these peers. Several lessons emerged from the results of our measurements. First, there is a significant amount of heterogeneity in both Gnutella and Napster; bandwidth, latency, availability, and the degree of sharing vary between three and five orders of magnitude across the peers in the system. This implies that any similar peer-to-peer system must be very deliberate and careful about delegating responsibilities across peers. Second, even though these systems were designed with a symmetry of responsibilities in mind, there is clear evidence of client-like or server-like behavior in a significant fraction of systems' populations. Third, peers tend to deliberately misreport information if there is an incentive to do so. Because effective delegation of responsibility depends on accurate information, this implies that future systems must either have built-in incentives for peers to tell the truth or systems must be able to directly measure and verify reported information.
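The last point, measuring and verifying what peers report, can be sketched in a few lines of Python. This is a toy illustration and not the paper's method: the `check_report` function, the 50% tolerance, and the sample figures are all assumptions made for this example.

```python
def check_report(reported_kbps, measured_kbps, tolerance=0.5):
    """Compare a peer's self-reported bandwidth with a direct measurement.

    Returns (bandwidth to act on, whether the report looks wrong).
    The measured value is always the one trusted; the tolerance band
    (hypothetical, 50% by default) only decides whether to flag the peer.
    """
    lower = measured_kbps * (1 - tolerance)
    upper = measured_kbps * (1 + tolerance)
    misreported = not (lower <= reported_kbps <= upper)
    return measured_kbps, misreported


# A peer claiming modem speed while measuring like broadband gets flagged.
trusted, flagged = check_report(reported_kbps=56, measured_kbps=1000)
```

The design choice here is simply that a self-reported number never feeds a delegation decision on its own; it is only compared against something the system measured itself.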

    --
    Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
  28. newer data from the same authors by Anonymous Coward · · Score: 1, Informative
    Looks like these guys also have some data about FastTrack-based systems that is much more recent:

    Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload at http://www.cs.washington.edu/homes/gribble/papers/p118-gummadi.pdf

    and

    An Analysis of Internet Content Delivery Systems at http://www.cs.washington.edu/homes/gribble/papers/p2p_osdi.pdf

  29. Y'all are missing the point by Jerf · · Score: 5, Insightful

    Y'all are missing the point, thanks once again to the many-headed beast that is the word "P2P".

    In this case, the academics are strictly concerned with P2P as a network organization, with little regard to what apps are built on top of it. This has nothing to do with "Napster" or "Gnutella" as "file sharing systems". Instead, Napster and Gnutella are being studied by the academics because they are the only things you can get hard numbers for, because few-to-none of the academic P2P systems have been implemented on such a wide scale. They do not perfectly implement what the academics are studying but they are close enough to provide some data about how other systems might behave in the real world.

    Academic P2P systems tend to concentrate on "pure" P2P, where there are no servers and ideally no "supernodes" (though they'll settle for dynamic organization that emerges from the protocol itself with no human intervention). This is a much different and much harder problem than "Let's share music!".

    The closest to a wide-scale academic P2P system that has been actually deployed that I know of is Freenet; for ideological reasons (pure P2P, no servers) it shoots for the same goals that the academics shoot for for other reasons (mostly that pure P2P systems are hard enough to be interesting, whereas Napster's organization could be created by one teenager without much difficulty; no disrespect to Fanning but it's basically another variant of client-server). Note how much trouble it has had scaling up, just as Gnutella has had trouble; "pure P2P" is friggen' hard in the real world.

    This is "old news" as a couple others have noted because of the peer review process, but to the academics this is valuable to have such peer-reviewed hard data, because you can model and simulate your network to your heart's content, but until you see it in the real world on a large scale you can't be sure it works. Without this kind of hard data they're adrift in a sea of pure theory.

    This paper isn't for "you", so the fact that "you" don't understand what it's for or that "you" think this is useless is rather uninteresting. This paper is for academic P2P practitioners; if you don't know about academic P2P theory, you can ignore this safely. Academic P2P and what "you" think of as P2P are quite different.

    (The "you" here is the "average Slashdot poster" to this article. Apply it to yourself (or not) as appropriate.)

    Note that in this paper "academic" is not used as a pejorative; it's just that as I said, there's such a huge disconnect between academic and non-academic P2P goals that they hardly deserve to be lumped under the same name.

  30. ACM considered harmful. by voodoo1man · · Score: 2, Flamebait
    You 3l33t H4x0rz are ACM members, R1gh4?
    WTF is this supposed to mean? Are you endorsing the ACM? You do realize the ACM is basically one big (and with the web and open document publishing standards, totally unnecessary, like many other commercial "scientific" journals) scam of an organization?
    --

    In the great CONS chain of life, you can either be the CAR or be in the CDR.

  31. 5 - Conclusions by 2TecTom · · Score: 1

    In this paper, we presented a measurement study performed over the population of peers that choose to participate in the Gnutella and Napster peer-to-peer file sharing systems. Our measurements captured the bottleneck bandwidth, latency, availability, and file sharing patterns of these peers. Several lessons emerged from the results of our measurements. First, there is a significant amount of heterogeneity in both Gnutella and Napster; bandwidth, latency, availability, and the degree of sharing vary between three and five orders of magnitude across the peers in the system. This implies that any similar peer-to-peer system must be very deliberate and careful about delegating responsibilities across peers. Second, even though these systems were designed with a symmetry of responsibilities in mind, there is clear evidence of client-like or server-like behavior in a significant fraction of systems' populations. Third, peers tend to deliberately misreport information if there is an incentive to do so. Because effective delegation of responsibility depends on accurate information, this implies that future systems must either have built-in incentives for peers to tell the truth or systems must be able to directly measure and verify reported information.

    --
    Words to men, as air to birds.
  32. PLAINLY OUTDATED -- Gnutella has ultrapeers by Anonymous Coward · · Score: 1, Insightful

    This thing is plainly very outdated for Gnutella. Their conclusion recommendations include things like having various levels of responsibility for nodes.

    If it were current, they'd at least have mentioned ULTRAPEERS or LEAF nodes! Gnutella currently DOES have nodes which 'volunteer' to carry more load.
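As a rough picture of what "volunteering" means in a two-tier network, here is a small Python sketch of a node choosing its own role. It is not taken from any Gnutella client or specification: the `Peer` fields, the `pick_role` function, and the bandwidth and uptime thresholds are all hypothetical values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    bandwidth_kbps: float
    uptime_hours: float
    firewalled: bool


def pick_role(peer, min_bandwidth_kbps=256, min_uptime_hours=2):
    """Decide whether a peer should carry extra load.

    The thresholds are made-up illustration values. A peer becomes an
    ultrapeer only if it can accept incoming connections and clears
    both the bandwidth and the uptime bar; otherwise it stays a leaf.
    """
    capable = (
        not peer.firewalled
        and peer.bandwidth_kbps >= min_bandwidth_kbps
        and peer.uptime_hours >= min_uptime_hours
    )
    return "ultrapeer" if capable else "leaf"


modem_user = Peer("modem", bandwidth_kbps=56, uptime_hours=0.5, firewalled=False)
campus_box = Peer("campus", bandwidth_kbps=10000, uptime_hours=48, firewalled=False)
roles = [pick_role(modem_user), pick_role(campus_box)]
```

This is the same idea as the paper's recommendation about delegating responsibility according to measured capability, just applied by each node to itself.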

    In conclusion---it's really not worth reading anymore, because the designs they studied are dead and replaced already.

    -Terr

  33. Re:HEY! by hansiboy · · Score: 1

    Humm... What's even more funny is the fact that if I'd had mod points and seen that one, you'd have gotten a "redundant" from me :)

  34. Oops! by Night+Goat · · Score: 1

    I first read that headline as "Napster and Genitalia Measurements". I guess that doesn't make much sense at all.