Running The Numbers: Why Gnutella Can't Scale
jordan (one of the founding developers of Napster), writes: "As the rumour mill churns over Napster's future, many folks see Gnutella as the next best hope for the music loving file sharing community. Problem is, Gnutella can't scale. [Note: if that URL doesn't work, try this mirror.] Almost all research on Gnutella up till now has been based on observations of the system in the wild, but this paper discusses the technical merits of that statement through a detailed mathematical analysis of the Gnutella architecture." The kind of numbers that you may not like to read if you figure networks expand to accommodate traffic at a never-ending pace. Update: 02/15 12:24 AM by T : Jordan also points to this mirror for your reading pleasure.
Then there weren't any votes at all. If you think that machines are infallible, you're an idiot.
There's a difference between an unbiased machine doing an imperfect, but mostly accurate job, and a biased human trying to divine a voter's state of mind from a scratch on a card.
--
Sometimes it's best to just let stupid people be stupid.
er, his words were "In my view, 'The Plant' has been quite successful." I'd say "successful" and "failed" are two words that are quite at odds with each other. He also refutes the claim that he stopped publishing the story due to financial causes.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no". I did my Ph.D. research on it. It works. Gnutella is broken, but don't draw a conclusion that a server-less environment can't scale. Read before you post this crap.
then the RIAA could easily make the case that you were storing illegal content on your machine
No, reread the parent. You cannot know where the information is actually stored.
A freenet node is basically a caching router, and AFAIK even the RIAA hasn't yet been able to repeal the common carrier status, so you should be ok.
Cui peccare licet peccat minus. -- Ovid, Amores.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no".
I hate to be a stickler, but the relevant question would be: Can a peer-to-peer network scale by solely relying on search packets to be propagated? No... However, in your next paragraph you open the window to the solution.
However, the REAL question is whether you can have a peer-to-peer network with decentralized servers, i.e., with clients that automatically establish a hierarchy among all the clients, and certain clients become more "server like".
Why are only certain clients limited to being more "server like"? Every client could index some of the content of other machines. The question is: How can clients efficiently catalog the contents of other machines? Catalog segmentation (where peers only index files matching specified criteria) comes to mind, as well as a number of other methods.
The current scenario does nothing like this. As soon as I connect to GNUTella, I'm flooded with tons of requests for files that I don't have, and all my client can say is Yes or No?
This is WHY GNUTella sucks! If clients could answer back, "I don't have it, but this guy does," we'd have a lot fewer search packets and a much more scalable network.
"Communism is like having one [local] phone company " - Lenny Bruce
Why do you say he's 'interested in putting down Gnutella'?
Is a cold blast of reality a putdown??
First off Napster is to be praised for its ability to find some rare or bootleg tunes. BIGTIME props to Napster for that.
Bottom line though is you people seem to forget what it was like in the good ole days for us to pioneer this CRAZE that swept the net. I feel like I should be talking to (Grandkids here saying this) "When I was your age we had to search the web for FTP servers and download them the old fashioned way."
"I recall having access to a T1 at work when only the elite few had that and was running an MP3 site boasting 1 gig of tunes on a SCSI HD that was in a STATE OF THE ART P150 Dell server( I now have close to 20 gigs of MP3's)"
Sure, Napster is/was great, although Gnutella will continue to be trouble... We will all make it.
BTW if anyone wants to contact me, I will happily work with you to upload my collection if you wanna open a site somewhere.
The argument of college bandwidth, although many will hate me for saying it, is legit. I work for a company that installs network management software, especially at universities, and the ones that have blocked Napster have seen a substantial drop in traffic. I do not know what the answer is; I can say I know several gamerz that HATE Napster etc. for the amount of bandwidth they lose on campus. Poor guys probably have a ping of 27 instead of 21.
Razzious Domini
I could be a GREAT KARMA WHORE if I could just shed the few morals I have left.
You just can't connect to their servers.
Instead of learning a new program, you really only have to get Napigator at www.napigator.com, which does run on Win 9X and NT. It fires up Napster program and you will be able to connect to any of the OpenNap servers, as well as a whole slew of others.
I agree with the original poster that the OpenNap ones are the best. They even have more volume than Napster's servers.
The ivory tower has never had to reach so h
I am nearing completion of a network that satisfies a, b, c, e.
I haven't started on d and f, but they could be added.
This project is called The ALPINE Network
It scales linearly, and provides a query mechanism that rivals the performance of a centralized directory. (Although the bandwidth is more than a centralized query, but at least you have direct control over how much bandwidth you use and how).
At any rate, I could use development assistance a great deal. Let me know if anyone is interested.
Regards...
No, I actually thought they were splitting up individual files at this point. Maybe I should read my own links :)
1. How do you identify all the peers?
That's discussed on the site I mentioned, but essentially you each pick an ID to associate a given peer with. It's that simple.
2. Let's say 10% of those 10K people are doing searches. That saturates a 56K modem, assuming you can really get your packets down to 56 bytes
It would only saturate your link if all 10,000 searched at once. If they all searched within a 3 minute time period, or no more than 70 in one second, your link will not saturate. And the packet is 56 bytes for an 8 character query. For a 16 character query, it would be 64 bytes. etc.
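A quick sanity check of the arithmetic in this exchange, as a rough sketch only. The 48-byte fixed overhead is inferred from the two sizes quoted above (56 bytes for 8 characters, 64 for 16), the 56,000 bit/s figure is an idealized modem rate, and the function names are made up for illustration.

```python
# Assumption: packet size = fixed overhead + one byte per query character.
# 48 bytes of overhead reproduces both sizes quoted in the post.
OVERHEAD_BYTES = 48


def packet_size_bytes(query_chars):
    """Size of one query packet for a query of the given length."""
    return OVERHEAD_BYTES + query_chars


def max_queries_per_second(link_bits_per_sec, packet_bytes):
    """How many packets of this size fill the link completely."""
    return link_bits_per_sec / (packet_bytes * 8)


def average_rate(num_queries, window_seconds):
    """Average arrival rate if the queries are spread over the window."""
    return num_queries / window_seconds


if __name__ == "__main__":
    size = packet_size_bytes(8)
    print("packet size:", size, "bytes")
    print("ideal 56k ceiling:", max_queries_per_second(56_000, size), "queries/s")
    print("10,000 queries over 3 minutes:", round(average_rate(10_000, 180), 1), "queries/s")
    print("70 queries/s needs", 70 * size * 8, "bit/s")
```

Under these assumptions, 10,000 queries spread over three minutes average about 56 per second, below both the poster's 70-per-second figure and the idealized ceiling of 125 per second. The 70-per-second figure corresponds to roughly 31 kbit/s of usable downstream, which may be where it comes from.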
What happens when you try to have 100K people? One million? How about the 10 million+ of Napster? Your scheme would not scale.
That depends on how good of a peer you are. If you don't respond much, and have a very slow link, then you will be at the bottom of those 100,000 hosts' query lists, and will get queried infrequently. I cover this on the site, but this is not a problem. The only thing that is limiting your use of the network is how much memory you have (you would need a hundred meg or so for a million connections) and your bandwidth.
Ok, but now OpenNap basically just utilizes the Napster paradigm and therefore puts into place Index servers.
If the RIAA succeeds in suing Napster and blocking their service, which seems very likely at this point, it is not at all far-fetched that they will easily be able to obtain court orders against anyone else running the same type of service.
So your OpenNap is not a replacement service because every index server is liable for a court ordered shutdown.
That and the index server requires bandwidth, bandwidth costs money and how many people are going to donate full T3 lines to this? Thus the service is capped in terms of the number of connected users based on bandwidth available.
Once Napster is dead, there will be nothing else to replace it at the same scale unless it is operated with the blessing of the RIAA.
Cui peccare licet peccat minus. -- Ovid, Amores.
No. The implication is that it's a series. The goal is to figure out what the progression is, and then come up with the next in the series.
Yes there is. He looks at aggregate traffic numbers, rather than per-client or per-search numbers. Saying that a search creates 6 GBytes of traffic sounds scary and un-scalable (Table labeled Bandwidth Generated in Bytes (S=83, R=94)). Holy cow, that's a lot of data. Now, table "Reachable Users" reveals that that 6GB of data is searching 7.6 million clients. If we do the math, we find that our traffic level is a little over 800 bytes / client searched (including responses). Is 800 bytes of traffic for a search unreasonable? I don't think so.
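The division is easy to reproduce. This is a sketch using only the two figures quoted in the comment; whether "6 GBytes" means decimal or binary gigabytes is not stated, so both readings are shown, and the function name is illustrative.

```python
def bytes_per_client(total_bytes, clients_searched):
    """Aggregate search traffic divided by the number of clients reached."""
    return total_bytes / clients_searched


REACHABLE_USERS = 7.6e6  # "7.6 million clients" from the comment

# Assumption: the comment does not say which kind of gigabyte it means.
decimal_reading = bytes_per_client(6 * 10**9, REACHABLE_USERS)
binary_reading = bytes_per_client(6 * 2**30, REACHABLE_USERS)

if __name__ == "__main__":
    print(f"decimal GB: {decimal_reading:.0f} bytes per client")
    print(f"binary GiB: {binary_reading:.0f} bytes per client")
```

Either way the answer lands near 800 bytes per client searched, so the commenter's point stands under both readings.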
All the author really does is take an example of a mathematical formula which grows exponentially and show how quickly he gets "scary" numbers. No effort is made to show whether or not the efficiency of Gnutella breaks down as the network increases in size. No effort is made to show how much work is done per search or per result. He just makes assumptions about the gnutella network which results in exponential growth in the number of users, and then shows how the aggregate traffic also grows exponentially. Duh. What did you expect? By this logic, nothing scales.
Don't get me wrong, I don't think Gnutella scales either. But you don't need to wave around all the FUDdy math that this guy does to prove it. The argument why it doesn't scale is simple:
The reason it doesn't scale is that every search request (optimally) gets delivered to every client. We don't even have to look at how those searches get delivered. We'll completely ignore the amount of traffic in the backbone, and only count the traffic that has to exist on the last hop to each client. Let's assume that the requests are 100 bytes apiece, or about 1000 bits once we have all the overhead of UDP/IP/ethernet/PPP/ATM/whatever on top. If each search is 1000 bits, and the average client has a 56K modem, the whole thing falls apart when the search rate is 56 searches / second. If we assume 1 million users, each one can only perform a search about once every 5 hours on average before the modem links are 100% full.
The problem here is the broadcast of every search to every client. Any distributed search network needs to either assume very high bandwidth connections for all the clients (because they are all servers to the whole network) or have some hierarchy of caches / servers. The amount of bandwidth being used at each client increases as more clients connect. If the number of users goes up by 1000%, the traffic on my local link goes up 1000%. This is why it doesn't scale. It has nothing to do with how many GB of traffic the network as a whole has to handle. It's the simple fact that the traffic at every client increases as more clients connect. This is the problem that has to be corrected, and Jordan's paper never even mentions this fact, relying instead on big scary numbers. His claim at the end that gnutella generates 2.4GBps of traffic for 1 million users is the ultimate FUD. How much traffic does Napster generate when it has 1 million people connected? He probably doesn't know because their servers go down first.
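The last-hop argument above reduces to two lines of arithmetic. This is a minimal sketch using the comment's own round numbers (1000 bits per search, a 56,000 bit/s modem, one million users); the function names are invented for illustration.

```python
def saturating_search_rate(link_bits_per_sec, bits_per_search):
    """Network-wide searches per second that fill one client's last hop,
    given that every search is delivered to every client."""
    return link_bits_per_sec / bits_per_search


def seconds_between_searches(users, link_bits_per_sec, bits_per_search):
    """Average gap each user must leave between their own searches so the
    total search rate stays at the saturation point."""
    return users / saturating_search_rate(link_bits_per_sec, bits_per_search)


if __name__ == "__main__":
    rate = saturating_search_rate(56_000, 1_000)
    print("saturation at", rate, "searches/s network-wide")
    for users in (100_000, 1_000_000, 10_000_000):
        hours = seconds_between_searches(users, 56_000, 1_000) / 3600
        print(f"{users:>10,} users: one search every {hours:.1f} hours each")
```

The per-user search budget shrinks in direct proportion to the number of users, which is the commenter's point: the load on each client's link grows with network size regardless of what happens in the backbone.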
Of course, you'd have to work out how to prevent hostile clients and servers from corrupting your indexes, but I'm sure that's a much more easily solved problem than working out how to prevent some skript kiddie from flooding Napster's servers off the net with a DDOS.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Woah, there, Space Cowboy. First, the travelling salesman problem is solvable even by a Turing-machine analogue (including von Neumann machines), but it happens to be NP-Complete, so the solution complexity grows stupidly fast in relation to the size of the problem (in this case, the number of cities to visit).
Pattern recognition seems to be built into the human brain as much as (and perhaps more than) binary addition is built into modern processors. The mathematical ability of humans is based more on a mapping of pattern recognition and association than on anything else.
So, your questions answered, a self-organizing network becomes a problem when you want it to do something, like minimize network traffic (or at least collisions) or be trustable.
Ushers will eat latecomers.
IP is just rude.
Is there any torture so subl
Freenet's biggest problem is the lack of a simple search function. If Joe Blow can't do a search on "Metallica MP3" and have the entire works of Metallica returned then he ain't gonna use it.
Care about freedom?
I'd rather be lucky than good.
Get a bunch of investors to run a big fat network pipe into a country with a name like "Ljubilvaniastanistan" where the rutabaga is the national currency and the yak is the national delicacy. Then watch Hillary Whats-Her-Name from the RIAA swallow her own tongue when she learns that her vast legion of lawyers are powerless to do anything about it. Of course, the Bush administration would probably order immediate airstrikes on the grounds of "protecting the wealth-creation security of national corporate interests", but that would be a public-relations nightmare, particularly if we put the new Napster right in the middle of a bustling village full of non-coms.
In all seriousness, I don't condone mass piracy, but the RIAA has been screwing people for decades and I have to admit that I enjoy watching them squirm. What could the RIAA conceivably do if Napster were located offshore, preferably in a country not bound by the terms of the Berne convention?
We're going down, in a spiral to the ground
It occurs to me how similar P2P networking is to the way routers work. I can't help but wonder if routing protocols might be a good place to look for ideas for p2p (Split Horizon jumps to mind).
It's nota my planet, monkey-boy - Dr Lizardo.
If you have a server on the internet which you want people to connect to, it's got to be advertised somehow.
Won't be hard to locate them.
If the RIAA can get Congress to pass a law which places a substantial fine on those convicted of running internet services for the purpose of piracy... Which isn't that unlikely. The DMCA places something like a $1 million fine on creating tools to subvert copy protection.
Who will be able to afford to risk running said service? Bill Gates, maybe Larry Ellison. Doubt that'll happen.
The top level would be like the root servers in DNS.... However, unlike DNS, this system would need a way to update the "root hints" instead of a static set. I don't see why this couldn't be possible. However there could be a problem when a client is off for a loooong time and when it goes back online all of the root servers have changed and it wouldn't be able to contact the network.
I dunno....eh, screw it, this stuff is making my head hurt.
All that I have ever found on Gnutella is porn... Not saying that it is a bad thing.... Just that it hasn't been that useful for finding what you are actually looking for. --LordRashmi
Isn't this the sort of thinking that went into the creation of the internet in the first place? The idea of a decentralized network, etc. etc.
Maybe Gnutella needs to take the meta-internet approach. A "new" internet on top of the current internet?
(I dunno. I ask because I'm curious. How is Gnutella in general different than the internet in particular?)
And nor does the average user, and therein lies the rub. As long as the interests behind RIAA are smarter than the vast majority of users (they are) you can be quite sure that RIAA will stop rampant piracy. You can say "But but there's always ", normally it's ftp, usenet, etc. The simple fact of the matter is that most users don't have the time, the energy, or the intelligence to figure them out. The only reason that piracy has been as popular as it has been is because Napster lowered the bar sufficiently low; it brought fast and easy piracy within the reach of a few keystrokes and mouse clicks.
You know, I have to take some exception to being called "sheep" because I buy CDs. Do you honestly, deep down, feel that because the current RIAA-supported distribution model doesn't compensate artists fairly, you are striking an idealistic blow for artists by using a model which, by and large, provides no compensation at all?
Reality check: if you download music copied from a CD sold by an RIAA-affiliated label, you are not "boycotting RIAA-sanctioned music". Boycotting means you are willing to go without a product on principle. That's not what you're doing. What you're doing is, at the least, legally considered stealing the music (presuming you don't own the CD already or buy it later)--and I'd have to say it's philosophically pretty dubious. If you didn't "just want the music," you wouldn't be getting it for free.
If you want to boycott the RIAA, you have to support artists who make their work available through "non-RIAA-sanctioned methods." But trading their music for free through Napster is not support.
It's easy to defend Napster for what it might become. I think digital music distribution is coming, soon, and I suspect it will live without the RIAA. But it will require a viable business model for the artists, not the record companies, that allows an average, "second-tier" artist to get equal or better compensation than they would from a record company and provides a reasonable level of promotional support for concerts, merchandising, radio airplay, and the like. Napster does not provide this model. A future model might be free as in speech, but currently Napster is unequivocally free as in beer, and we're not doing ourselves or anyone else any good by pretending otherwise.
The problem with Napster is that it has a single point of failure. The problem with Gnutella is it doesn't have an index. What you want is an index of all files with no single point of failure.
An index is a root node, which points at branches, which points at leaves. So make 10 copies of the root, 10 of each branch, 10 of each leaf, and put each on a different transient machine. (If you think 10 roots is too few, have every user keep their own copy of the root. It's not big.)
Then here's your protocol: Ping the roots one at a time, choose the first that responds. The chosen root pings the duplicates of the correct branch one at a time, chooses one. The chosen branch pings the duplicates of the correct leaf one at a time, it chooses one. The leaf sends the results back to the user.
Updating the structure is the same, with the addition that nodes occasionally try to sync with their duplicates. You end up with duplicates never quite in sync, but so what.
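The lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Replica` class, the `alive` flag standing in for a network ping, and the choice to key branches by first letter and leaves by the first two letters are all invented here, since the comment does not say how the index is split.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a root, branch, or leaf living on some transient machine."""
    name: str
    alive: bool = True                        # stand-in for "answers a ping"
    table: dict = field(default_factory=dict)


def first_responder(replicas):
    """Ping the duplicates one at a time and take the first that answers."""
    for replica in replicas:
        if replica.alive:
            return replica
    raise LookupError("no replica responded")


def lookup(roots, filename):
    """Walk root -> branch -> leaf, failing over to a duplicate at each level."""
    root = first_responder(roots)
    branch = first_responder(root.table[filename[0]])     # hypothetical split
    leaf = first_responder(branch.table[filename[:2]])    # hypothetical split
    return leaf.table.get(filename, [])


# Tiny example: one dead root and one dead leaf, each with a live duplicate.
leaves = [
    Replica("leaf-1", alive=False, table={"metallica.mp3": ["peer-42"]}),
    Replica("leaf-2", table={"metallica.mp3": ["peer-42"]}),
]
branches = [Replica("branch-1", table={"me": leaves})]
roots = [
    Replica("root-1", alive=False, table={"m": branches}),
    Replica("root-2", table={"m": branches}),
]

if __name__ == "__main__":
    print(lookup(roots, "metallica.mp3"))
```

The sketch leaves out the hard parts the comment glosses over with "so what": how duplicates sync, and what a stale replica returns in the meantime.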
No, FidoNet requires a 'delivery' or 'bulk transfer' protocol.
The protocol used by ALPINE is for messaging. The types of broadcasts are very small packets. Usually 50-60 bytes. This makes a huge difference.
MyopicProwls
My homepage
wake me up when this thread is over.
--
Well, it wasn't just "someone." The guy was (is) one of the original Napster guys. Napster may actually go down for good soon (though I kinda doubt it), and when it does, there will be a *lot* of people looking for a good place to get mp3's. Jordan Ritter is merely warning us of the dangers of 10 million people on Gnutella.
If the nodes relay requests (as was pointed out above) then it's improbable that this hypothetical attack would succeed.
King also made $500,000 in profit. On a free product. There's something there.
--
OliverWillis.Com
An Operative with an Agenda
-jon
Remember Amalek.
AHHHH YEAH!!! Hey Razz, those were the days... If I recall correctly I got that collection started with my weird al stuff... Then you went hog wild and got as many as you could... Oh BTW, that FTP server was on MY machine even though the files were on Brain. Those were the days.... Even though you brought me over to the darkside of I.E. / Outlook from good ole netscape...
uhh, why don't you just share and browse files? you can search directories, you know.
Why does everyone automatically point to Gnutella as a replacement for Napster? There are so many better file-sharing programs out there right now (for Windows at least, but I'm sure someone would love to port them ;-)
For instance, I stopped using Napster when they released Beta 8. Beta 8 is supposed to be an improvement, but Beta 8 crashed 10 times more often than my previous version. Instead I now use
WinMX
With WinMX you can share anything at all.. no restrictions. Also, it runs off of all of the rogue servers out there now (the Napster ones too) so it's not like you're missing anything from not using Napster. Check it out at WinMX.com
You're right. When will people wake up and realize that PURE-distributed searches will NEVER work? The most efficient search (i.e. "the perfect search") is a hybrid that uses discovery, like the Library of Congress system, or the yellow pages. It's much more efficient this way.
What amazes me the most is that purists actually got people to believe for this long that a pure system like Gnutella would work. Bullshit!
-- Betting on the survival of the media industry is a serious risk. I advise investing elsewhere.
Sorry, I was in a hurry, so the answer was a bit terse :-)
> Why isn't the travelling salesman problem solvable?
Travelling salesman is solvable. It is just "hard".
In the generally useful case (i.e., when searching for a 95% optimal solution, not the optimal one), computers are much better than humans at solving travelling-salesman-like problems. Hey, would you design VLSI chips without a computer? Mind you, those designs are not 'perfect'. Note that the very best results in a limited amount of time (even for the salesman problem) come from a human interacting with a computer.
> Why is pattern recognition such a difficult problem when humans do it so easily?
It is difficult, but not impossible in a practical way (i.e., OCR software sorta works, the Newton sorta works). Sure, getting a computer to grok all the alphabets drawn in one of the Hofstadter books ("Metamagical Themas", I think) is impossible.
But frankly, we don't need this for Gnutella. Gnutella is, as you point out, fundamentally broken. Even if I share 3 files, all the queries come to my disk. This is a terrible waste of bandwidth.
The original sin of gnutella is that I am the only one that knows what files I have. This means that if I search for an obscure file that only one person has, the only way to find it is to ask everybody. No need to write the equation to know it doesn't scale. It will never scale. Period.
> Don't underestimate the difficulty of the problem of a self-organizing network. It is definitely a non-trivial problem.
I did not say it was trivial. Far from that. Finding the optimal network would be computationally hard. But almost any non-optimal network would be better than the distributed denial of service that is gnutella.
<RANT>
There are, IMNSHO, two issues to solve.
First, there is a need for clients and servers. A server would just be someone that answers queries. Servers would need to know what files are available on which client. Should each server know every other? Should servers be organised like current gnutella clients? How much space does a file reference cost? (At least 32 bytes for the name, 4 bytes for the IP, 2 bytes for the time stamp? Of course IP location would be a bad idea for dynamic-IP clients, so a random 128-bit MD5 hash could be generated by clients to identify them, and be re-generated every few days.) How much space could a server use for its database? (Say 1 GB for big servers. You can store about 30 million uncompressed file references in that. And about the same number compressed and indexed.) Of course having a few centralized 1 GB servers would be a bad idea. It was just to make the point that there is not _that_ much data to query in the whole network.
Hell, imagine a network of servers working in the current way (i.e., a network of neighbouring servers). Imagine the key-space being divided (for instance in 26: a group of servers for each letter of the alphabet). Imagine that each client connects to 26 servers, one for each part of the key space (the client could always ask a server for additional servers for a part of the key space). This should support a large amount of queries (and if you ask for a file that begins with a given letter, you just have to ask one server). At launch time, you diff with each server to update their view of your host. The servers exchange this information slowly in the background. Every server knows every other in its key space, and they slowly synchronize their databases (i.e., if a client is connected to them, they update the info on other servers; an rsync-like protocol should be able to do the trick).
The second point is how, in a system that enables server election, you can maintain trust (i.e., what would prevent some big evil company from bringing the system down by getting its own machines elected as servers and polluting the system). Because a hack in a self-organizing network could have far-reaching Gibson-like consequences...
</RANT>
Oops, sorry to be so long. And probably not comprehensible. Doesn't matter anyway...
Cheers,
--fred
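fred's back-of-the-envelope numbers and the 26-way split can be sketched as follows. Assumptions are labeled in the comments: the record size uses the 4-byte IP variant he lists first, names that do not start with a letter are dumped into partition 0 (he does not say what happens to them), and all function names are hypothetical.

```python
import string
from collections import defaultdict

# Per-reference cost from the post: 32-byte name + 4-byte IP + 2-byte timestamp.
REF_BYTES = 32 + 4 + 2


def refs_per_server(db_bytes):
    """How many uncompressed file references fit in a database of this size."""
    return db_bytes // REF_BYTES


def partition_for(filename):
    """Map a file name to one of 26 key-space partitions by its first letter.

    Assumption: names starting with a digit or symbol go to partition 0.
    """
    first = filename[0].lower()
    if first in string.ascii_lowercase:
        return ord(first) - ord("a")
    return 0


def group_by_partition(filenames):
    """Split a client's share list into the per-partition updates it would
    send to each of the 26 server groups at launch time."""
    groups = defaultdict(list)
    for name in filenames:
        groups[partition_for(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    print("references in 1 GB:", refs_per_server(10**9))
    print(group_by_partition(["Metallica - One.mp3", "madonna.mp3", "abba.mp3"]))
```

At 38 bytes per reference a decimal gigabyte holds roughly 26 million entries, in the same ballpark as the "about 30 million" in the post. Splitting on the first letter is simple but uneven, since some letters are far more common than others; a hash of the name would spread load better at the cost of the "ask one server per letter" shortcut.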
You've apparently never used a news client since 1992 or so. These days all of the collating and uudecoding is done behind the scenes. Just select a file in Pan and press "D". In fact the Usenet is a great way to distribute Fansubbed Anime without overloading any particular server.
Now the problems (that haven't been mentioned yet): data on the usenet has a short lifetime, frequently less than 24 hours. If you don't keep on top of it, it is easy to miss things (like the fourth episode of a series). Second, you can't search out a particular song on the Usenet, you have to more or less take what is available. If you are looking for a particular song, the Usenet may not be for you (although you can certainly request it).
Down that path lies madness. On the other hand, the road to hell is paved with melting snowballs.
I read the internet for the articles.
After having tried to use Usenet to get various binary files, I can say this is less feasible than Gnutella is. Any file I've ever seen that has been split into multi-parts has almost invariably had at least one part missing, thereby destroying the whole attempt. I have had exactly FOUR multi-part files come through successfully and that is after five years and trying several news servers as well as multiple servers at the same time.
Improvise, adapt, and overcome.
Notice the automatic part concerning the servers... if the network lost a couple main servers others would automatically take their place and fill the new ones with data. It would create a type of neural network, you could say.
:)
Granted, designing a system like this could easily make a graduate's thesis, but hey, it's just a challenge. Challenges are meant to be overcome.
----
Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
I really disagree. First of all, Napster unquestionably provides a distribution model that provides a reasonable level of promotional support for artists. It's really great how many new artists I've discovered because of MP3s. Not necessarily just because of Napster, but if a friend says (as often happens) "hey have you heard this new stuff from Boo Williams?" and I say "Boo who?" (no pun intended -- go download his music it's great) then all of a sudden Boo has an opportunity to have his music heard by someone who never would have heard it otherwise. Super!
Thankfully, MOST of the artists that I listen to have come down on the PRO NAPSTER side. This includes Ben Folds Five, Green Day, Limp Bizkit, The Offspring, Chuck D, and others. Unfortunately, some of my favorite artists have come down on the other side. These include the most vocal three: Metallica, Dr. Dre, and Eminem. That sucks.
MORALLY I get over the problem. Is it morally wrong for me to want free music? I don't think so. Is it morally wrong for an artist to produce work that I listen to for free, never buying his CD, never going to his concert, never buying his T-Shirts? Perhaps. Perhaps not. But certainly it is no worse than middlemen becoming so ridiculously rich by screwing me with $18 CDs. CDs should be between four and six dollars; about half should go to the artist (about what they get now, or a bit more).
Trust me, I sleep just fine at nite having spent the whole day listening to MP3s. And I do own CDs -- oh God do I own CDs. I counted once, and I probably gave the RIAA $4,000+ in my (short) lifetime. That's a lot. A whole lot. I figure they are still $3,500 ahead after all the free downloading I've done.
I also, by the way, have 'discovered' artists via MP3s and Napster, and subsequently bought their CDs and gone to their concerts (e.g. Cypress Hill and Lavay Smith -- don't laugh) so those artists are definitely ahead because otherwise they wouldn't have seen any of my money at all.
MyopicProwls
My homepage
Yes, of course. Gnutella makes only the most limited attempt to ensure privacy. Case in point: a while back (in fact, it still may be running) a server returned fake matches to requests for kiddie porn, and published the IP addresses that had been caught trying to download the "files" on a webpage. I don't have to tell you how indignant some people got, and for the funniest reasons.
The opinions stated herein do not necessarily represent those of anybody at all. Deal with it.
there is an obvious troll in the parent post. Your mission is to find it.
--
I've read the article and it's obvious that gnutella cannot work in the same fashion that napster does. This is a blow to those who've become dependent on the mind-numbing ease that napster affords, but what about changing the rules slightly? What about using private gnutella networks (clubs)? Password-protected servers that hold a set number of users. I've been planning on setting up a system for this ever since gnutella .30.
- S
I even share the equations and methodologies I used, and try to poke holes in my own conclusions.
Further, I'm not a competitor. I haven't worked for Napster in 3 months. Before Napster my background was in poking holes in things anyway. All I did was finish a personal project I started a long time ago.
You actually sound more like FUD than anything. :-)
--jordan
You've still got to query each peer, and a linear search like that isn't acceptable when you've got a lot of people hitting a big database.
Re-read the Freenet protocol. I think their key-affinity scheme makes it more tolerable for millions of users, and the only reason it isn't searchable (yet!) is because they have the extra requirement of plausible deniability -- you're not allowed to know what's on your machine.
So if you accept his numbers on the bandwidth, number of users, etc., on the Gnutella network, then you must logically also accept his conclusion that your search will send 18 gigs across the network!
"If we couldn't laugh at things that didn't make sense,
My bicyles
I will reply to this in HOPES of not being flamed for OFF TOPIC, but my manlyhood is at stake. Yes it was YOUR machine, that's true.
HOWEVER you left Netscape of your OWN doing. I BEGGED YOU to come home and you chose the way of evil.
Razzious Domini
I could be a GREAT KARMA WHORE if I could just shed the few morals I have left.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no".
Not so fast. Right now, the biggest problem with decentralized networks is that they all have some form of routing/forwarding. If you got rid of routing/forwarding, then they could scale.
For instance, let's say you have a napster style peer group, 10,000 peers. What if, to query these peers, you sent a small UDP packet to each of them directly? No routing, no forwarding. How long would this take?
Modem: 2.5 minutes
DSL: 13 seconds
I would say that this is an acceptable period of time. And the bandwidth used was all your own, nobody else's, except for the 56 bytes each peer received for that single packet they got from you.
I am working on such a network. It's called The ALPINE Network and has all the features mentioned.
So, if you get rid of the forwarding/routing you can have a decentralized network that scales linearly.
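As a rough check on those two figures, here is a minimal sketch of the arithmetic. The 56-byte packet and the 10,000 peers come from the post; the uplink speeds (28.8 kbit/s for the modem, 384 kbit/s for DSL) are assumptions picked to land in the same ballpark as the quoted times, and any per-packet header overhead beyond the 56 bytes is ignored.

```python
# Back-of-the-envelope timing for sending one query packet to every peer.
# PACKET_BYTES and PEERS come from the post; the uplink speeds are assumptions.

PACKET_BYTES = 56
PEERS = 10_000


def broadcast_seconds(peers: int, packet_bytes: int, uplink_bits_per_sec: float) -> float:
    """Time to push one packet to each peer over a single uplink."""
    total_bits = peers * packet_bytes * 8
    return total_bits / uplink_bits_per_sec


if __name__ == "__main__":
    links = [("modem, 28.8 kbit/s up (assumed)", 28_800),
             ("DSL, 384 kbit/s up (assumed)", 384_000)]
    for name, bps in links:
        secs = broadcast_seconds(PEERS, PACKET_BYTES, bps)
        print(f"{name}: {secs:.0f} s")
```

The cost grows linearly with the number of peers, which is the point the post is making: doubling the peer group doubles the time, no more.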
You know, I see a lot of people talk about how slow and awful Gnutella is, but I really haven't noticed anything wrong with it. In fact, it seems to run a little bit faster than Napster. And I run a 56k modem! Not once has Gnutella run slow for me, and many of my song searches have yielded more results than Napster. Well, maybe I'm an idiot, but I think I'll take the same position on Gnutella's crappiness as the American people took on a possible electoral college fuck-up: I'll believe it when I see it.
----------------------------------
Yes, but what I want to know is: say I download an illegal movie, can I get prosecuted if they have my IP, and is this likely to happen?
Is there any way people can trace you through Gnutella? Say I download a bootleg video, can I get caught?
Absolutely. I have NEVER got a bad rip on Audiogalaxy. There's also one thing you forgot to mention. Audiogalaxy picks who you download from. You can sort of control it by picking a specific bitrate/filesize of the MP3, but you are pretty much guaranteed a download for all but the most obscure songs. I have always got my file.
More importantly, you CAN prove that you DID NOT KNOW that illegal content was being stored on your machine, and that it was beyond any reasonable effort for you to go snooping around trying to figure out what actually was there.
The math in the reference reminds me of the work IBM did (in Haifa, Israel) back in the early days of SNA. They were concerned with peer-to-peer networking of "hundreds" of devices without the ministrations of a front end processor.
Scaling was achieved by adaptive routing tables that could limit packet fanout for broadcast traffic to prevent storming.
**Vanuatu or bust**
I can see it used for just randomly looking for interesting things, though (like fansubbed anime).
--
No, because you control exactly how often or how much response you provide.
If you are getting swamped, you will respond to fewer and fewer queries, and then your quality in the eyes of those peers will drop; thus, you will receive fewer and fewer queries.
This is actually a balanced type of configuration, which handles load in an efficient manner.
Also note that over a DSL line, you could receive in excess of 10,000 queries a second.
We only use it internally. We get high speeds over the campus network, and there is a limited number of users. We set the TTL low. I haven't noticed any slowdowns because of it, and it's so fast that even if there were a lot of queries flying around, they'd finish quickly. I've been d/ling movies at 700 k/sec that someone else spent 3 days trying to get at 2.3k/sec. That's why it's good.
"...paying $15 for a cd with 10 songs on it is the biggest and longest running scam going on right now."
;-)
Small potatoes. The biggest, longest-running scam going on right now is paying $9.50 to see a 90-minute movie. CD prices come in somewhere after social security, insurance fraud, and patenting results of publicly funded research.
Or did this guy just prove mathematically that Napster has efficient control over everything that flows through it?
Not that that hasn't been mooted already...
--Blair
So you get an even bigger problem.
Will code a sig generator for food
These big scary numbers actually look very, very close to what a network analyst would normally predict for Gnutella. The Gnutella network will slow down as the number of active nodes increases, simply because networks have limited resources: the physical networks stay the same, and the software running on them can conceivably bring them down. Caching data is a good solution for Gnutella, but note that it only helps if you use a client that does caching, and note that Internet users generally don't like sharing their own resources (I mean their bandwidth) with the neighbours.
You can't handle the truth.
99.44% "You are the product of a mutational union of ~640Mbytes of genetic information."
Damn. Looks like the 640 *bytes barrier is back.
Actually, there is a peered filesharing system that has some interesting design features which should help it scale better than Gnutella. It also addresses a lot of the security concerns brought up by Napster and Gnutella. ProjectELF is designed to be a completely untraceable and anonymous peered file sharing system. It is just at the beginning stages, but it already works just fine. Most of the remaining issues seem to be those of interface design and efficiency of coding. It is definitely worth checking out. Another cool thing is the author's use of a system that I have long advocated, namely selling everything at prices so low that piracy is more difficult than legitimate trade. He prices his product at a dollar. I guess he hopes to make it up in volume. That was always my theory. Sell ten million copies at a buck instead of a half million at $20.
What they could prove is that you transmitted copyrighted material, regardless of whether or not you actually stored it on your machine. And that's the problem. You can't get in trouble for downloading or storing files; it's the uploading that gets ya.
Amber Yuan 2k A.D
"and dear god does this website suck now." -- CmdrTaco
Freenet could provide a whole new architecture for static web pages. Sites like Suck.com and OMM could insert pages to the network w/out paying for the bandwidth that their popularity demands. Which means hooray for authors.
The latency, while it makes the system less usable, is on balance harmless; it is always possible to increase the timeout value. The main focus of the paper was on the impact of Gnutella's bandwidth. And those "small" search packets really do add up, once you consider how many times a particular search is propagated.
The opinions stated herein do not necessarily represent those of anybody at all. Deal with it.
A simple query with an 8 character query string would be 56 bytes. The string above might be as much as 160 bytes.
I did some monitoring of the Gnutella network for a few months, and the size of an average query string is about 8-16 bytes at most. Many, many queries were even smaller.
I am tired of incomplete downloads. Wouldn't it be nice if we could filter out incomplete files from the query?
If the users are chained together through ids one hop at a time, then you would have to route and re-route a query for their ids before you even do anything!
No, you missed a major point; there is no routing and no forwarding.
This is what makes it so simple, and linear. You directly communicate with all the peers you want to query. Everyone directly communicates with each other. The only thing this implies is a transport service which can support a large number of concurrent connections efficiently. This is what DTCP does.
I wholeheartedly agree that the mp3 trading that occurs so openly and frequently on the internet will not stop. There is no organization that is knowledgeable enough, powerful enough, or quick enough to stop it. It was summed up perfectly when I was reading Lars Ulrich's editorial in Newsweek, where he stated that it was music now but the movie industry would probably be next, as I downloaded Gladiator.
But I digress, so here's what bugs me...
Information deserves to be free
Last I checked, mp3 files were not a cure for cancer, a report on contaminated food, or an expose on government corruption. These songs are the hard work of artists who spend years on their work. They go to record companies because they realize it's the only feasible route to success. The record companies invest a lot of money. Just because you don't agree with how they produce something, doesn't give you the right to take it. Period.
I'm not saying everyone should stop downloading mp3 files or that we should come up with a way to compensate the record companies. However, please don't try to shamelessly camouflage something which is flagrantly, and undeniably, illegal with the mask of liberating imprisoned artists.
The big problem with Usenet is the availability of files. If I wanted to download Metallica's Master of Puppets, I would first have to see if it is there. If not, I would have to request it, and wait some time for it to appear on the newsgroups (with all parts present). If someone uploads a 128k/s MP3 and I wanted a 192k/s one, then I have to request and wait again. Napster/Gnutella avoid this problem by allowing me to search all of the music that is available, and getting the files I want right now!
Doh!
What you ask looks, to me, a lot like the Freenet Project, with two little differences: Freenet was not designed to be peer to peer, so it's built for "always available" information, and the presence of a given file on the system does not depend on the supplier being online, but on how often people actually downloaded it (unused content will be the first to disappear).
UDP is generally disdained because of the fact that almost all firewalls and proxies are configured to ignore it.
Pan
I said no... but I missed and it came out yes.
(If the people running the RIAA and MPAA had been clueful, they would have been pursuing this strategy against anonymous file sharing from the very beginning. If 99 out of 100 requests for insert-top-forty-song-here on Napster return William Shatner singing "Lucy in the Sky with Diamonds", then most people would rather pay for the CD than sift through all the false results. But I digress.)
--
send all spam to theotherwhitemeat@ropine.com
Articles posted have an age associated with them until they are expired and no longer exist on your provider's server. It is possible that you made an attempt to grab files that had an older age and were already expired.
Anyhow, when it comes to binaries, music and multimedia, you want a news service that has high retention (which equals lots of storage for binaries groups).
-Pat
Why not use IRC? I mean, it's there, it's reasonably reliable, and it allows both centralized and P2P communication.
The "client" would be a bot. It would join a channel (say, "#bjork" or "#trancegoa") and to make a request, it would simply utter something on that channel in some protocolish language (eg "SEARCH 'Bachelorette'"), and other bots would respond in a P2P fashion (ie /notice the_asking_bot IVEGOT Bachelorette.mp3).
This would deliver us from the scaling curse as it is described by Jordan's paper. This would also lead to a Usenet-like classification of available files among channels (if you like david bowie, you would /join #davidbowie), thereby bringing people with common interests together. Technically, IRC networks are the best example of a semi-centralized-yet-free network I can think of.
Think of this: Napster was made as a sharing system, where people could chat. We have a chatting system. Why not allow people to share files on top of it?
> I'm speaking practically here. I'm going to visit 10,000 cities. Please give me the absolute guaranteed best route (in my lifetime, if you please).
You are wrong. Either you are speaking theoretically, in which case the salesman problem is trivially solvable, or you are talking practically, and you don't give a shit about the *best* route. A good one will be sufficient, and there are very good heuristics for that.
Cheers,
--fred
Introduce a law that requires all US internet carriers to block any incoming or outgoing traffic from that country.
What I don't understand is why the country that happens to be the most powerful also has to be the most retarded.
And the crap part, is that it's JAVA BASED.
*vomit*
You build something that uses a distributed algorithm to build a spanning tree. The nodes near the top of the spanning tree become the servers. You build the algorithm so that parents in your spanning tree will naturally have more bandwidth than you do.
I've been thinking about this for a long while.
Building the spanning tree isn't hard. Every node just selects one and only one parent node. They tell the parent that they're a child of that parent. You prevent cycles by having a node refuse to be a parent unless it also has a parent. If it loses its connection to its parent, it tells all its children that it is no longer a parent. One node 'seeds' the network as a root by declaring that it can be a parent without having a parent, and by not looking for one. Eventually it can delegate roothood to a child that has proven high bandwidth. It cannot cease being a root without doing this delegation.
You can have connections to nodes that are neither parents nor children, but search requests should not be propagated to those nodes unless you have no parent. Eventually a search request will make it onto the spanning tree and be efficiently distributed.
You can eventually elect servers who are near the top of the spanning tree. Nodes should, in general, elect parents that have more bandwidth than they do. This means that nodes near the top of the spanning tree should have the most bandwidth.
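The parent-selection rule described above can be sketched as a toy, single-process model. This is only an illustration under assumptions: `Node`, `adopt`, `orphan` and `pick_parent` are hypothetical names, nothing here touches a network, and root delegation is left out.

```python
# Toy model of the parent-selection rule from the post. All names are
# hypothetical; a real system would exchange messages between machines.

class Node:
    def __init__(self, name, bandwidth, is_root=False):
        self.name = name
        self.bandwidth = bandwidth
        self.is_root = is_root   # the seed may be a parent without having one
        self.parent = None
        self.children = []

    def can_be_parent(self):
        # A node accepts children only if it is attached itself (or is the root).
        return self.is_root or self.parent is not None

    def adopt(self, child):
        """Child asks this node to become its one and only parent."""
        if child.is_root or child.parent is not None or not self.can_be_parent():
            return False
        child.parent = self
        self.children.append(child)
        return True

    def orphan(self):
        """Drop the link to our parent and tell the whole subtree to detach."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None
        for child in list(self.children):
            child.orphan()

    def pick_parent(self, candidates):
        """Attach to the highest-bandwidth candidate that beats our own bandwidth."""
        eligible = [c for c in candidates
                    if c.can_be_parent() and c.bandwidth > self.bandwidth]
        if not eligible:
            return False
        return max(eligible, key=lambda c: c.bandwidth).adopt(self)


if __name__ == "__main__":
    root = Node("seed", bandwidth=1000, is_root=True)
    a, b = Node("a", 500), Node("b", 56)
    print(b.adopt(a))           # b is not attached yet, so it must refuse
    print(root.adopt(a))        # the seed may accept children
    print(b.pick_parent([a]))   # b attaches under the faster node a
    a.orphan()                  # a loses its parent; b is detached with it
    print(b.parent)
```

Because a detached node can never hold children, a node can only ever attach to something that already has a path to the root, which is what rules out cycles in this simplified model.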
Need a Python, C++, Unix, Linux develop
For god's sake, I could not help it, but the first time I saw how Gnutella works, without being a network specialist or a mathematician, it reminded me of those chain letters that promise you will become rich if you send the letter to other people and one buck to the person that sent it to you.
It does not work, for your info, although you will find 2 or 3 guys that will swear they got the money.
We are clever people, and most of us know a little math. Please check your math and recognize that something growing exponentially is doomed to fail if the supporting infrastructure (bandwidth) does not grow at the same rate.
IANAL but write like a drunk one.
First off: when I'm talking about "Napster" I'm mostly talking about Gnapster and the OpenNap network, as I've only tried that.
OK, so I've been using Napster for some time and wasn't quite satisfied. Then I tried Gnutella in the incarnation of the LimeWire client and I think it rocks.
The Napster network may have its benefits, but it's like AOL: just about everyone can use it. There are often these guys who can't figure out how to change their shared directory to exclude their precious files, but instead read in some l33t magazine that they can set their max. uploads limit to 0. That is, I may find the file I want, but will be unable to download it.
By contrast, I found it much harder to get into the Gnutella network, and others might have too. Therefore, the user base of Gnutella is more experienced and educated about the importance of sharing in a peer2peer network. LimeWire did its part in this.
Gnutella might not scale as well as Napster does, but as noted before, it doesn't have to. In Napster one has access to around 5 terabytes, as opposed to Gnutella, where I found an average of 60 TB available. There is no need for even more files. On the other hand (remember "No one will ever need more than 640KB of RAM"?) it will probably increase on its own, as technology evolves.
It will not increase in size, because more size is needed, but because more is possible.
I already had an intuitive grasp of what he was talking about, and his numbers seemed ballpark correct to me. I too thought the result set bandwidth numbers looked a little fishy, but the others seemed fine.
I've been thinking about this for months.
Need a Python, C++, Unix, Linux develop
I really want to build this with my StreamModule system, but nobody is helping me with it, and I don't have the time to hack it out, especially since I'm so ridiculously methodical when it comes to code.
Well, a human can fit 640Mbytes in a small area of a cell.
Will code a sig generator for food
Napster is dead. Past tense. Finished. Gone. Even if the court decision is reversed, the record companies have them in agreements, their nads are on the table.
If this was FUD, why wasn't it released while it was still relevant?
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Precisely. HavenCo may prove useful in a real-world scenario after all (if they ever get up and running, that is).
Of course it's slow and eats up bandwidth. It turns every client into a server. With dialup users running functions that should be server-based, like searching, it slows everything. When Napster dies and I have free time, I have an idea (based on an idea from existing network infrastructure) for a new, mostly decentralized file sharing network.
BTW: also front page on both Minneapolis and Saint Paul papers yesterday.
You're essentially describing Freenet, or rather, a major component of what Freenet does. Data makes its way between several intermediate nodes before reaching its ultimate destination. Internode traffic is encrypted, as is locally stored data, making identification of large nodes tricky. Of course, if you run a large node yourself you can determine what nodes are critical routers for your section of the network.
Of course, it is yet to be seen whether this scheme will scale either.
Ok, this time I did a bit more thorough check of the numbers. I agree with the first half, the traffic generated by the request half of the message. What I'm not as convinced of is the response side of the equation.
I don't know what the typical percentage of Gnutella users sharing files is, so I'll accept your figure of 30%. But 40% of those sharing files having a match? Even with your reduced number here I think it's high. If 40% of people sharing files had a match that would mean with default settings you'd get: (N=4, T=5) 484*(0.3*0.4) = 58 people finding a match. And with the numbers you use later of 10 matches a person you'd get 580 matching entries. I've never received anything near that high. But if I did, I certainly would have no motivation to increase T or N.
What happens if it's only 10% of those sharing that have a match? With the default settings you'd still get 14 people matching, or about 140 matching entries. That's still a *lot* of responses, more than I've ever received.
If all your default numbers are used, your nightmare scenario would yield 0.3*0.4*7,686,400*10 "found" responses to your query. That's 9 million 223 thousand 680 "grateful dead live" songs (though not unique) shared among 900 thousand deadheads who are all simultaneously online. Whoa.
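The figures in the last three paragraphs can be reproduced with a few lines. One assumption to flag loudly: the post never states how 484 and 7,686,400 were obtained. The reach formula below (each node forwards to n-1 new peers per hop, no overlap) is my reconstruction; it yields 484 for N=4, T=5 and 7,686,400 for N=8, T=8, and that second N and T pair is an inference from the number, not something the post says. The function names are hypothetical.

```python
# Reproduces the response-count arithmetic from the post.
# The reach formula and the N=8, T=8 setting are assumptions (see lead-in).

def reachable_users(n: int, t: int) -> int:
    """Peers reached with n connections per node and a TTL of t hops,
    assuming every forwarded query reaches only new peers."""
    return sum(n * (n - 1) ** (hop - 1) for hop in range(1, t + 1))


def matching_responses(reach, sharing=0.3, match=0.4, hits_per_peer=10):
    """Return (peers with a match, total matching entries)."""
    peers = reach * sharing * match
    return peers, peers * hits_per_peer


if __name__ == "__main__":
    default_reach = reachable_users(4, 5)
    for match in (0.4, 0.1):
        peers, entries = matching_responses(default_reach, match=match)
        print(f"reach={default_reach} match={match:.0%}: "
              f"{peers:.0f} peers, {entries:.0f} entries")
    big_reach = reachable_users(8, 8)
    peers, entries = matching_responses(big_reach)
    print(f"reach={big_reach}: {entries:,.0f} entries")
```

The sketch only restates the post's multiplication; whether a 40% match rate is realistic is exactly what the poster is questioning.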
I'm not an expert in human psychology by any means, but let me suggest this. With most tools, people don't feel any need to "tweak" them unless they're not working right. With 480 songs returned, I don't think many people would feel a need to tweak their settings. If someone was having a hard time finding something they might then change their settings -- but if they were having a hard time finding it they wouldn't get so many responses returned.
The only way I can imagine those monstrous amounts of data resulting from queries is if it happens by maliciousness or mistake.
Am I missing something?
If I spent half as much time reading that as they did coming up with that, I'd probably know for certain that Gnutella can't scale. But, since my attention span is barely long enough to post this, I'll take their word for it ;)
Sheesh... talk about overkill
With Napster, the bandwidth usage from the query is negligible. A single packet (your query) goes out to a single destination (the Napster index server). A small handful of packets (your listing of places your desired song is located) comes back. A few K total, then you get your 4 MB transfer.
With Gnutella, the bandwidth usage from the query is significant. Your query goes to several peers, which then forward it to other peers, etc... and each server with the song requested sends you back a packet. Looking at the numbers in the analysis shows that your query will quickly generate more bandwidth usage than the actual transfer (which you'll still have to do to get your song). The bandwidth hit is distributed, true, but it still adds up, and grows exponentially with each extra hop rather than linearly.
Gnutella's success depends upon a significant portion of its users also being servers (i.e. making files available for download) -- being a provider as well as a consumer. There's a server-side hit, too... with Napster, a provider of files sends a few packets to the Napster index server advertising its wares. Aside from the bandwidth usage of the actual transfers the provider is serving, very little impact. With Gnutella, every query within your range will hit your server. Bandwidth usage from queries will quickly outstrip bandwidth usage from transfers, and this will tend to discourage people from being providers.
Please, don't get me wrong here. I think that peer-to-peer will be the future, but there are problems to be solved. Gnutella, as it stands now, will not scale well... the math in the paper in question is good, and matches real-world observations. The challenge is managing the queries, routing the queries intelligently, and keeping the bandwidth usage down "below the radar" of backbone providers and system administrators.
I don't know what can be done about the bandwidth usage of the transfer itself, but keeping the query traffic down will help in keeping administrators and providers no more filesharing-hostile than they already are. Now is the time to be treating these people well, instead of antagonizing them further. You don't want to bite the hand that feeds you your bandwidth :)
This problem has been solved before, by the way. Think "routing tables".
Gnutella doesn't scale... Let's move on. Do you have a replacement in mind? Maybe Gnutella isn't "as cool" as Napster, but without a replacement...
Let's say 10% of those 10K people are doing searches. That saturates a 56K modem, assuming you can really get your packets down to 56 bytes.
That would saturate your connection only if they each did a search every 8 seconds on average (since a 56 kbit/s link can only carry 125 56-byte packets per second). That doesn't seem likely, since it takes even the DSL users 13 seconds to spam everyone with requests.
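For anyone who wants to check the 125 packets and the 8 seconds, here is a minimal sketch of that arithmetic. All inputs come from the two posts above (56 kbit/s link, 56-byte packets, 10% of 10,000 peers searching); the function names are hypothetical and protocol overhead is ignored.

```python
# Checks the "125 packets per second" and "every 8 seconds" figures.

LINK_BITS_PER_SEC = 56_000
PACKET_BYTES = 56


def packets_per_second(link_bps: int, packet_bytes: int) -> float:
    """How many packets of this size fit through the link each second."""
    return link_bps / (packet_bytes * 8)


def saturating_interval(searchers: int, link_bps: int, packet_bytes: int) -> float:
    """Seconds between searches per user at which incoming queries fill the link."""
    return searchers / packets_per_second(link_bps, packet_bytes)


if __name__ == "__main__":
    searchers = int(10_000 * 0.10)
    print(packets_per_second(LINK_BITS_PER_SEC, PACKET_BYTES))
    print(saturating_interval(searchers, LINK_BITS_PER_SEC, PACKET_BYTES))
```

If users search less often than that interval, the modem keeps up; if they search more often, it cannot.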
How do you fit "The Orb - A Huge Evergrowing Pulsating Brain That Rules From the Center Of the Ultraworld, Live in Dusseldorf 1994" into 56 bytes?
Trees can't go dancing
So do them a big favor
Pretend dancing stinks!
It should be obvious that with a network of N nodes, to send a request to all of them will take N! time. (Yes, I know, we don't send to all of them, but even sending to a large portion is difficult.)
The new goal of gnutella developers should be to figure out some way of reducing the amount of poking around on the network you have to do.
Basically, sending a request to so many people is like knocking on every door in New York City to find your apartment. Use a telephone book, man!
"Any connection between your reality and mine is purely coincidental." -Slashdot
--
But if the price of gasoline goes up, you can bet your last dollar that teleportation will be made practical. Or that cars that use fusion will be developed.
Not everything is practical just because there is a need for it.
Great straw-man rebuttal! How about if you try a more rational analogy? Going from gas combustion engines to teleportation or fusion power is a tad bigger leap than going from Napster to a similar service! And Napster ceasing to exist versus gas prices climbing higher is not analogous either...
A better analogy would be:
"If we run out of petroleum-based fuel, a similar or better form of energy will come to the forefront."
And that's ABSOLUTELY TRUE, reasonably proven through a huge mound of empirical evidence.
"And like that
Of course we've discussed this twice already here and here.
Someone you trust is one of us.
Have the math background? The math is very straightforward, it's high-school level stuff, so yeah, I have the math skills to handle it. I read the paper and skimmed over the math, and now I'm going back and reading it and doing the math.
No one said that distributed P2P needs to be a Napster clone. Free software authors frequently make the mistake of just copying some retarded existing commercial software (witness the influence of Windows on Gnome and KDE). We need to try a lot of totally different ideas too. Here is one example:
Your system plays the role of file server by offering a list of available files, and plays the role of search server for you by collecting the lists of available files from different people. The key here is that only you search your own system's database, so only you get tagged for the cost of collecting the databases of too many different systems. Clearly, your system needs to figure out automatically which nodes it should track by remembering where you actually find stuff, but this should not present any real problem. You would also introduce a little randomization by tracking random nodes for a limited period of time.
This might work just as well as Napster for people who always DL the same type of music (like Tech for me). Clearly, you would not be able to show off to your friends by DLing any song they request, but that is not really that important.
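The local-index idea above could look roughly like the sketch below. Everything in it is an assumption made for illustration: `LocalIndex` and its methods are hypothetical names, file lists are plain strings, and "remembering where you actually find stuff" is approximated by counting which peers' lists matched your searches. Fetching lists over the network and the random short-term tracking are left out.

```python
# Hypothetical sketch of a purely local search index built from other
# peers' file lists. Only the owner ever queries it.
from collections import defaultdict


class LocalIndex:
    def __init__(self):
        self.listings = {}             # peer -> set of file names it offers
        self.hits = defaultdict(int)   # peer -> number of searches it matched

    def add_listing(self, peer, files):
        """Store (or refresh) the file list collected from one peer."""
        self.listings[peer] = set(files)

    def search(self, term):
        """Case-insensitive substring search over every tracked listing."""
        term = term.lower()
        results = []
        for peer, files in self.listings.items():
            matches = sorted(f for f in files if term in f.lower())
            if matches:
                self.hits[peer] += 1
                results.extend((peer, f) for f in matches)
        return results

    def peers_worth_tracking(self, keep):
        """The peers whose lists have matched our searches most often."""
        ranked = sorted(self.listings, key=lambda p: self.hits[p], reverse=True)
        return ranked[:keep]


if __name__ == "__main__":
    index = LocalIndex()
    index.add_listing("peer-a", ["Tech Mix 01.mp3", "Ambient.mp3"])
    index.add_listing("peer-b", ["Country Hits.mp3"])
    print(index.search("tech"))
    print(index.peers_worth_tracking(keep=1))
```

Search traffic never leaves your machine in this design; the cost moves to collecting and refreshing the lists, which is the trade-off the post describes.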
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
You don't have to boycott CDs...just new CDs....
Support your local half price book store!
"Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
Reality Master 101 writes: Not everything is practical just because there is a need for it.
:) However, science says it's possible to build a Gnutella-like network that will scale. Therefore, we have NeoGnutella, which will be built if there is enough demand, or OpenNap. OpenNap, as we mentioned before, is easier to use and similar to the Napster that most of us know and love, while NeoGnutella will have the benefit of never being able to be shut down. What will win? I personally think that both will survive, because there is a large enough market to be divided up by 2 players (again, simplified example), but that OpenNap will probably grab most of the Napster fallout due to its similarity to its commercial cousin. However, if OpenNap servers are attacked legally and thus often shut down, we will switch to NeoGnutella, because finding one "node" that we can persistently connect to is a lot easier than refinding OpenNap servers, even if OpenNap seems to scale better than any distributed net solution, and even if OpenNap is more familiar. Therefore, the long-term outlook for Gnutella depends upon whether it will be adapted to scale and whether OpenNap will be attacked, as well as other issues not addressed in this rant. We all have different wants. OpenNap, Gnutella, Freenet, FTP/HTTP "warez" sites, IRC "warez" channels, Napster, (formerly) Scour, and other services have evolved to meet this need. Since Napster was the most appealing to most users (and because of media hype), it became one of the biggest file sharing programs out there. Now, since Napster has a rocky future, another method will become the biggest.
:)
Warning: Rant Ahead!
Partially true. In your example, you said that if price of gasoline went up, teleportation or fusion-powered cars wouldn't be developed. I agree. However, if the price of gasoline went to $20/gallon tomarrow (an outrageous rate, but its just an example), then we'd either see a changeover to natural gas/electric or some other alternative energy source vehicle, or cars would be developed that got 400 miles/gallon.
So why would gas/electictric cars be implimented and not fusion or teleportation? Well, first we have a demand for transportation. The demand for transportation is rather high, at least in the developed world, and especially in the US, since all of us seem to want to live in the woods and commute to the city. Therefore, if the demand is high, we *will* find something to fulfill the need, as long as the cost of fulfilling the demand is not so great that we have to sacrific other, equally important demands. We don't commute to work via helicopters because the time, money, and energy we would have to exert to be able to use them isn't worth the extra few minutes we'd shave from our commute time. We don't commute to work with buses because we prefer living in areas with lower population densities (e.i. suburbs) which make buses impractical and we don't like the inconvience of having to conform to the bus's schedule and having to interact with other members of our community. We are looking for something that fulfills our need to get from point A to point B with the lowest oppertunity cost to us. This is the economics/social side of the scale. On the other side of the scale is the harsh laws of science and technology, which dictate what has been done, what is possible, and what is impossible, and what the costs for doing each are. Say we have a possible solution set such as this { car (gasoline), car (electric), walking, teleportation, car (fusion) }. Science tells us the teleportation looks impossible. Therefore, we eliminate it. Technology tells us that fusion powered cars haven't been done yet, and considering everything that we know about "hot" fusion, its doubtful we could ever fit a fusion reactor in a vehicle the size of a car. We are now left with gasoline-powered cars, electric-cars, and walking, in this simplified example. 
Walking is too much of an inconvience to us, science doesn't have a problem with it, but human nature, and the time it would take, plus distance that would have to be traveled, make it impossible. On the economic/psycology/social side, walking isn't happening. So what will it be, electric or gasoline? The technology that's in place makes gasoline-powered vehicle cheaper then electric, and gasoline, even at the high prices that it is lately, is still an economical means of transport. Plus, we have human nature, gasoline is tried and true, electric isn't. Electric also has some problems with travelling long distance, and infrastructure doesn't support electric right now. Therefore gas is the best solution to our problem. In the future, if electric becomes more ideal then gasoline (enough to override our habit of sticking with what we know), we will switch.
So, we learn this. Each problem/solution pair depends on economics, human nature (psychological/social), science, and technology.
Let's apply this to Napster, OpenNap, Gnutella, and the rest of the field. Napster was nice and easy, a lot of us became accustomed to using it, and the technology (on our end) was cheap. However, Napster is either dead or moving towards a fee-based service. All of a sudden, from the economics viewpoint, Napster is less ideal. OpenNap is similar to Napster; there is the additional hassle of finding a server, but since Napster is having trouble, OpenNap seems a lot more attractive. However, OpenNap, from the social viewpoint, is insecure: it has a central server, so it can be attacked. Therefore, what do we have left? Gnutella is free of cost, and cannot be shut down through elimination of a central server. It is harder to use, and technology says it won't scale in the current format. Plus, it eats up bandwidth like a hog.
The above was a rant, and presented simplified examples. I didn't mention gyro-driven cars, monorails, carts hauled by penguins, or bicycles, among other things, because I was trying to keep the examples simple (and carts hauled by penguins aren't really practical). I didn't mention stuff like how critical user mass applies to file sharing systems because it didn't pertain to the topic of the comment. So please, don't flame me with a comment about how widget-driven cars are the ideal solution, or that file sharing also depends on bandwidth. Nitpicking just wastes both of our time. On the other hand, valid comments are appreciated.
Ok, you can argue that it's actually a distributed server, rather than a p2p network. Well good for you.
--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
all this file sharing stuff should just be outlawed so i can have ample bandwidth to do legal fun stuff like ... play red alert 2 ... all day.
>No, they stated that it was too late to
>establish clear guidelines followed by
>implementing the guidelines to count votes. They
>gave no opinion on what the guidelines should be.
That's exactly what I said. Sorry if I didn't
make myself clear.
>It is simply my opinion that any guideline that
>requires a subjective opinion on the part of a
>counter is a bogus guideline
If a panel can't agree on a vote, then yes it
should be discarded.
Say you have a ballot that requires you to place
an 'X' in a box. Some guy comes along and
instead circles the box. If you showed that
ballot to 100 people, all 100 would agree on
who he meant to vote for. His choice is clearly
circled. The machine however, is designed to
only recognize an 'X' in a box, so his vote
would not be counted.
You said above that in this situation, the vote
should be counted. Well, if there were any
undervotes where clear intent could have been
discerned, they weren't.
Ever take a test on a scantron? They are green
sheets for taking multiple choice tests. To
choose an answer, you pencil in the corresponding
box. These sheets are never 100% accurate.
The teacher will always give you an answer key
to catch any mistakes that the machine makes.
What guideline does any teacher use?
Clear intent. There aren't any established
guidelines.
> And every vote that had unequivocal evidence
> was counted
An undervote occurs when the machine does not
record a vote on the ballot. There were 40,000+
undervotes in Florida that _WERE_NOT_COUNTED_.
That was what the lawsuit was about. Don't
you think that out of those 40,000 undervotes,
clear intent could have been found on at least
a few???
> Frankly, I find it astounding that anyone would
> argue that ambiguous votes should count
No one advocates that. The point is, out of
all of those 40,000+ undervotes, there were
probably a few of them where there was clear
intent on who the person wanted to vote for.
Are you just being a troll?
Anonymous posts are filtered.
Posted by polar_bear:
It probably isn't complete FUD, but if it's FUD then the people propagating it are likely the record companies -- not Napster.
Looks like they know they're being /.'ed. Check out top of the main frame of their homepage.
I rarely bought CD's before I found MP3.com. Now I regularly buy CD's from there. You can find first rate music produced by artists that are as good as anything you hear in the commercial mainstream and the artists have full control of their music. They also usually have one or more MP3 downloads to preview.
The best way to ruin your hobby is to try to make a living at it. Waiting on the paperless office since 1997
4th post and it's already /.'ed...
I have a shotgun, a shovel and 30 acres behind the barn.
1q2w3e4r5t6y7u8i9o0pqawsedrftgthyjukilo;p'azsxdcf
"The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no"."
That question bugged me so much that I would like to answer it for you: the answer is YES! I figured out a solution. After reading the paper yesterday, I spent my time in class scrawling and pondering over it, and I have a very simple, elegant solution. I can't believe it! So, I am going to perform some experiments first before I make a fool of myself, but I certainly think it can be done. If I told you how, you'd smack yourself in the forehead and say, "of course!"
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
What about China? After all, they already do lots of dealing in copyrighted music.
..it can't scale. Anyone taking a computer science class should've realized that...
On my campus, we've been using Limewire to make a private Gnutella network. We use it to trade files with each other. That way we're not all trying to get the same files from the internet. It's much faster. People at other colleges should try it.
Search is similar to Napster's; I find that searching by genre is a lot more useful. If you search Metallica, it gives you a list of the most popular metallica songs, then up top it does the old "People who like Metallica also like this....", and you can search Heavy Metal, Rock, or whatever ... I find it more useful than Napster search...
How is it? I was planning to download their "satellite" client and try it out later.
Usenet was designed as a discussion medium, not a means to distribute large files.
It works great at the job it was designed for (even better than slashdot, IMHO).
At the job of distributing large files... Each ISP keeps a copy, regardless if anyone wants it or even asked for it. That is not a good utilisation of resources.
Bill, has a newsreader which only presents new articles.
I think this possibility of attack has been mentioned before (well of course it has, you linked to the story ;). The counter arguments raised, that a) it hasn't happened yet, and b) are the RIAA/MPAA really that smart?, still apply to Gnutella.
The other problem is that the RIAA could possibly get into sticky legal areas themselves by clogging people's bandwidth with garbage.
Yeah...i'd written out something longer, but it all either involved a lot of overhead with searches going up and down the trees, or having something like a root server.
I suppose there could be a round-robin DNS setup (root.p2p.org?) and if you are on a LOT your IP would go in there. If the software couldn't connect to you, then it'd just go for another.
Bugger it - can't find anything out there, people kick you off while downloading, things are mislabeled, etc...
Actually, there is one point that you did not mention when describing your proposal. Your system could be used to link the various clone Napster servers. They would only pass on requests which they could not fill with their own users. This way people could still use the various Napster clients, but get more files offered. Nice
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
Hmm, now that you've been shut down, suddenly it's OK to publish a laundry-list of flaws about the competition? Shady.
---- Just another spud server.
Also, this means that the population P DOES have an effect on the number of reachable users, because as P increases the number of redundant connections will decrease. Don't have the math to prove it, but I think that's the way it works.
Also, is there analysis of why gnutella can't scale in terms of P? I can see why it won't scale in terms of number of users I can reach, but why not in total users, IF users are content to let themselves be limited to a small fraction of the network (this should be enforced by the clients. I know people can write their own, but they shouldn't write them to allow huge TTLs).
Also, what of the reflectors?
and I'm very curious to see how the bunch over at musiccity.com, in league with napigator, attempt to revive the napster architecture of a central server while still avoiding the easily prosecutable "central server".
as Gnutella attempts to get quicker (it's slow as hell now), I'll wish them well, and be looking to the napigator project for hope.
There's nothing Intelligent about Intelligent Design.
Freenet is also very well architected, unlike bogus Gnutella. It's designed to scale up, so that popular stuff gets cached all over the place. Like, more people downloading means that your connections go FASTER. This is cool.
Cui peccare licet peccat minus. -- Ovid, Amores.
Is it possible?
Yes!
By using an internal microcredit/payment system (called mojo) and localized reputations Mojo Nation aims to do exactly that. Better connected brokers (peers) will naturally become more "server like" due to having a better uptime, lower latency and a lower mojo cost overall for other brokers (peers) to use.
The resources in the system are allocated dynamically. No strict hierarchy needs to be defined; it will establish itself appropriately for each individual peer as it is needed.
PS a new version (0.950) was released today.
However, the REAL question is whether you can have a peer-to-peer network with decentralized servers, i.e., with clients that automatically establish a hierarchy among all the clients, and certain clients become more "server like". The only way to make a Gnutella work is by making it hierarchical, but the hierarchy needs to be automatic for it to have the same general "virtual network" aspect of Gnutella.
------
Not a typewriter
Accidentally hit enter too soon. Here's my real response to this:
Every time I see one of these discussions pop up, I see someone re-invent Freenet. Freenet has a lot of other optimizations to make "decentralized servers" work even better than what most people describe, but the basic principles are there. You did it oh so well, too.
------
Not a typewriter
I'm doing the reverse. Make DNS act like a P2P system. This is being done using Freenet as a base. FNS (Freenet Name Service) aims to be a complete DNS server implementation that gets its zone files off Freenet instead of local memory. This makes the tree hierarchy of DNS destroy itself and rebuild itself into a P2P system.
See http://sourceforge.net/docman/display_doc.php?docid=2162&group_id=15579 for (outdated) documentation on it.
OK, I'm done plugging my project now.
------
Not a typewriter
Should be freenetproject.org.
------
Not a typewriter
One word: HavenCo
John Susek
The OpenNap servers are *very* good. I don't think I've used a Napster server for several months now. Grab gnapster and get this and you are good to go.
Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
It seems to me that the principal bad assumption of gnutella was that forwarding search requests costs less than forwarding file lists. The second problem is the network topology, though that can be fixed relatively easily, and some of the newer client/servers seem to be tackling that problem.
If you switch to a more napster-like model where each user submits a file list, then freeloaders don't consume as much bandwidth. You develop a database over time as you stock up on file lists. The downside is that you can't just join and search (though maybe asking nearest neighbors to search could be part of the protocol). Since users might update only a few times per day or less, the overall bandwidth use isn't that high.
For the topology problem, I would suggest more of a ring-chain topology, with some redundancy (backup connections in case a link breaks, and multiple rings that are sparsely interconnected).
This is fun stuff to think about. Similar problems are present (self-organized networks) in "bottom-up" nanotechnology. Maybe I should ask for a DARPA or NSF grant for nanotech research and spend my time and money working on a p2p network...
Was that not the line in Jurassic Park about an enzyme prohibiting male offspring? Well, Gnutella may not scale well today, but a legion of MP3 loving programmers WILL find a way to share music. The proverbial cat is out of the bag and millions of consumers have tasted blood. The RIAA cannot put this Genie back in the bottle and only significantly lower prices for music with added goodies will bring buyers back. And it better be online, reliable and good.
I know others have said it previously, but Gnutella really can't scale. It's a great idea, but until it's either implemented better or farms out search queries to a central server, it's not going to be a practical alternative to Napster. I'm on a 56k dialup at home, and just being logged in to the network brings my machine to a grinding halt. Search queries consume all of my bandwidth, making it impossible for me to download anything. The idea behind Gnutella was ingenious, but until we start to see affordable residential T-3 service, I think I'll have to pass. Peer-to-peer file sharing is wonderful in theory, but at this point, it's just not practical.
Dear god, I can't believe I'm even going to
respond to this. Here goes:
There were over 40,000 (yes, forty-thousand)
undervotes in Florida. That means that 40,000
people that voted in Florida either went to
the polls, but did not vote for president or did vote for president but their votes were not
counted. Statistically, it is safe to assume
that the vast majority of those votes intended
to have a mark for president.
Then there was this big supreme court decision.
You might have heard of it. They stated that
the votes should be counted, that the FSP should
establish guidelines for doing so, but that it
was TOO LATE to count the votes now. Therefore
40,000 votes in Florida would not be counted.
No matter which side you come down on (and I
think there are valid arguments on both sides),
there are a lot of people in Florida that were
disenfranchised.
Anonymous posts are filtered.
They reported that CD stores around college campuses had GROWING sales.. But the sales weren't growing quite as fast as they were elsewhere.
This could be from any number of causes.
1. People at a college might have more straightjacketed finances and can't afford to increase their CD spending as fast as the general public.
2. People at a college might tend to order online more often, thus satisfying their music consumption through non-local stores.
3. People at a college may be joining CD clubs or may be purchasing CD's at home where they have convenient access to a large collection and bringing them to college instead of purchasing them near college.
4. A statistical anomaly. A decline in sales isn't actually happening.
5. A million other possible reasons.. Colleges are drugging their students so they purchase textbooks instead of CD's.
The conclusion: While such a correlation may exist: college cd purchases aren't increasing as fast as the average in the nation, that could have been generated by any NUMBER of possible causes.
If you want statistics I'll believe: Take universities whose student populations have similar demographics and that do and don't have (say) napster, and ask them how many CD's they purchased in the last year. Or use some other technique that isn't susceptible to flaws #1-5 above and give me numbers that don't have obvious artifacts.
Anyone who understands how Gnutella works (unfortunately, too few people) knows that Gnutella is horribly broken, will never work, and is basically unfixable.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no".
However, the REAL question is whether you can have a peer-to-peer network with decentralized servers, i.e., with clients that automatically establish a hierarchy among all the clients, and certain clients become more "server like". The only way to make a Gnutella work is by making it hierarchical, but the hierarchy needs to be automatic for it to have the same general "virtual network" aspect of Gnutella.
Is it possible? I don't know. You would probably have to have automatic bandwidth measurements, depth probes, all kinds of things to make it work. I simply don't know if it would be possible to automate something like that.
--
Sometimes it's best to just let stupid people be stupid.
Er; you did see who the author of the article was, right? Not exactly one of the record companies' favorite people... Napster co-founder Jordan Ritter.
You're saying they paid him off, or did you just not bother to read the header?
No relation to Happy Monkey
Please forgive me if I've completely missed your point, but my reading of your article leaves me with the impression that you are trying to adjust the parameters so that by scaling up you mean that tens of millions should see and have the opportunity to reply. I'm between sessions at P2P so my reading was quick. I did not have anything to do with the formulation of the gnutella protocols, but it is clear to me that the intention is for each client to have a limited neighborhood of visibility within a network of arbitrarily large size.
I awaited with bated breath to see what in this vision would lead to a meltdown and I'm still waiting. Just to repeat, we set and maintain a reasonable TTL so your search is over an immense number of sources but not every client in the network. Either I don't get it, you don't get it, or you don't want to get it. Please explain if you care to.
Their server can't scale either judging by the rampant slashdot effect. To get on topic, it's nice to see a well thought out and calculated paper demonstrating why Gnutella will eventually collapse. I've been telling my friends this for a long time and their response has been, it's fine on DSL/Cable/your broadband of choice. I'm glad to see I'm not the only one who thought this through.
cheese logs keep my wang warm at night.
IIRC, and I am not sure that I do, but isn't there some bug in the windows TCP/IP stack that you can't have too many "open" udp "connections" at once?
All of the communication is done through a single UDP socket. DTCP is a multiplexing transport protocol which operates over a single UDP connection.
You are correct about the number of open UDP sockets though. On any UNIX or NT variant the limit is usually 1024 to 2048 per process, and 64k per IP address (the PORT value in UDP or TCP is only 2 bytes)
This is why native UDP or TCP cannot support the required number of connections to perform direct queries to each peer in a large network.
Gnutella is neat, but for a reliable MP3-only service, check out Audiogalaxy.
At first I was put off by the web interface, but:
1) It remembers everything you request in a queue and will get it when available. (A must for dial-up users)
2) Auto-resume using temp files.
3) A small app in your system tray/console only sends/receives when you have it running.
The greatest advantage is that ZDnet/CNet/MSNBC and others DON'T mention audiogalaxy in their "quest for Napster clone" articles, so the quality of users, and therefore the music, is excellent.
Unfortunately, it is a centralized system, but so far, it seems the mainstream media/RIAA have ignored it.
For some interesting stats and commentary on how most Gnutella users simply leech off of the 1% of all users who contribute 50% of the files, check out
http://www.firstmonday.dk/issues/issue5_10/adar/index.html
gnutella doesn't scale, because napster still works fine. and until it doesn't why try to fix something that isn't broken.
I had thought of IDA as a secret sharing scheme like Shamir's. Thanks for bringing this to my attention!
I found the original paper:
Michael O. Rabin: Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance
Basically, it means you can break a file of length L into N chunks each of length L/M, such that only M chunks are needed to reconstruct the file. It's exactly the right thing for these circumstances.
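To make the L, N, M description above concrete, here is a minimal sketch of a dispersal scheme in that spirit. It is an illustration under stated assumptions, not necessarily Rabin's exact construction: the prime field GF(257), the Vandermonde-style encoding, and the names `split`, `join` and `solve_mod` are all choices made for this example.

```python
P = 257  # assumption: smallest prime above 255, so every byte is a field element

def split(data: bytes, n: int, m: int) -> list[list[int]]:
    """Disperse data into n shares of about len(data)/m symbols each."""
    padded = data + bytes(-len(data) % m)          # zero-pad to a multiple of m
    shares = [[] for _ in range(n)]
    for start in range(0, len(padded), m):
        block = padded[start:start + m]
        for i in range(n):
            x = i + 1                              # distinct evaluation point per share
            shares[i].append(sum(b * pow(x, j, P) for j, b in enumerate(block)) % P)
    return shares

def solve_mod(matrix: list[list[int]], rhs: list[int]) -> list[int]:
    """Solve a square linear system modulo P by Gauss-Jordan elimination."""
    m = len(rhs)
    rows = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] % P)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)        # modular inverse via Fermat
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P for a, b in zip(rows[r], rows[col])]
    return [row[m] for row in rows]

def join(pieces: dict[int, list[int]], m: int, length: int) -> bytes:
    """Rebuild the data from any m shares, given as {share index: share}."""
    idx = sorted(pieces)[:m]
    matrix = [[pow(i + 1, j, P) for j in range(m)] for i in idx]
    out = bytearray()
    for k in range(len(pieces[idx[0]])):
        out.extend(solve_mod(matrix, [pieces[i][k] for i in idx]))
    return bytes(out[:length])

data = b"gnutella cannot scale"
shares = split(data, n=5, m=3)
print(join({0: shares[0], 2: shares[2], 4: shares[4]}, 3, len(data)) == data)
```

Each share holds roughly L/M symbols, so the total stored is about N/M times the file size. Share symbols can take the value 256, which is why they are kept as integers rather than bytes in this sketch.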
--
Xenu loves you!
Gnutella scales upward poorly and is only mediocre at scaling downward. Freenet is horrible at scaling downward (where it is at right now, unfortunately), but I have yet to see a better system for scaling upward.
The best thing we can do for Freenet right now is advocate its usage to friends, then insert/request content. Especially requests.
------
Not a typewriter
Go ahead. Strike me down all you wish. I have more karma than you could possibly ever imagine.
Actually, it's going to be hydrogen. For some very cool technology that's actually happening - cheap, clean, powerful hydrogen cars - check out http://www.rmi.org/sitepages/pid18.asp
Freenet is a great idea, and with a client like Espra it will kick ass for sure !
"Naughty, naughty, naughty, you filthy old soomka !"
What is clear is that kazaa really works - I don't understand why it doesn't get more attention.
If nothing else the ability to download the same file from multiple other people at the same time - meaning you don't trash THEIR bandwidth but you get to use all of YOUR bandwidth - is amazingly useful.
Of course this thing also doesn't appear to be truly P2P - there is a server in there somewhere. Can't tell if that is related to the searching or merely to their attempts to make some $$$.
R.
First of all, the word you're looking for is 'subjectivity'.
Which is synonymous with subjectiveness.
Is this voter's intent clear? I think so. Would this vote have been counted, the way things played out?
As a matter of fact, yes. And they should. In fact, there is legal precedent for counting ballots when a voter uses a pencil/pen.
requires that the counting procedure 'determine the intent of the voter'. [...] So now tell me again how you know for sure that every legal vote was counted.
If someone circled a particular hole, then the voter's intent is clearly defined, and no subjectivity [I'll use this word just for you] is required. But that's not what happened. What happened was that Gore wanted to go beyond that, and to count votes with scratches on the card. That requires subjectiveness on the part of the counter. Put it this way, if two counters can reasonably reach different conclusions, then it's not a vote.
Bottom line, all votes with clear intent were counted.
--
Sometimes it's best to just let stupid people be stupid.
Heh. Grandpa indeed.
:)
MP3 first came out in 1996.
But it almost seems like forever doesn't it? To me it's encouraging that this stuff is so new because it means that in 4 years I'll be a "grandfather of the internet" too.
This is a truly fun time to be alive.
Sure, it's not right now. However, given that Gnutella is not under active development, it may not ever be.
As this paper points out, Gnutella eats up bandwidth because it keeps a few open connections. If it were to instead keep references to other nodes, not constantly open connections, it would only consume bandwidth when it's actually transferring files.
That's exactly what Freenet does, actually. Freenet also adds in a few more optimizations that make it even better. For instance, it keeps those references, but it also routes to those references based on how "close" (through a cryptographic hash of the key name) a given node is to the data being requested or inserted, whereas Gnutella is totally random. This increases Freenet's efficiency even more.
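The "closeness" idea can be sketched roughly as follows. This is a simplified illustration, not Freenet's actual routing code: the use of SHA-1, the integer-distance metric, and the names `key_hash` and `closest_reference` are assumptions made for the example.

```python
import hashlib

def key_hash(name: str) -> int:
    """Hash a key name to a large integer (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def closest_reference(requested: str, references: dict[int, str]) -> str:
    """Pick the known node whose associated key hash is nearest the request.

    references maps a previously seen key hash to the address of the node
    that served it; both the table layout and the metric are hypothetical.
    """
    target = key_hash(requested)
    best = min(references, key=lambda known: abs(known - target))
    return references[best]

# Hypothetical reference table built from earlier successful requests.
refs = {
    key_hash("a.mp3"): "node-a",
    key_hash("b.mp3"): "node-b",
}
print(closest_reference("a.mp3", refs))
```

The point of the sketch is only that a request goes to one chosen neighbor rather than being broadcast to all of them.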
------
Not a typewriter
Adam,
This is a really good point, running a gnutella network within a college. I'll look into it. Of course, what happens if people leak the info to outsiders? Is it possible to use limewire to restrict access? I suppose that, if not, it is a feature that could be easily added to a gnutella client.
Thanks for the idea,
Jason
I understand that there are basically three reasons for Freenet:
- Abolition of censorship
- Archival of documents based on their perceived "usefulness"
- Elimination of standard bottlenecks in most peer-only networks (I hate the term peer-to-peer, but won't digress into the rant behind that statement)
So, do we really care that Gnutella lasts any longer than the time that it takes to get Freenet everywhere? hahaha
--
Flat out - I buy CD's. I don't use Napster. Why? Though the RIAA is a political machine, I don't feel they truly represent the artists. If by buying a CD I am still supporting the artists, then that's what I'll do. If the artists want to distribute their music for free, then I'll take it. I am a musician. I'd love to get my music out and be heard. I understand what it takes. I've had the pleasure of being able to know professionals - in the jazz field. I very badly want to see a "viable business model for the artists". I'd love to see the record companies fade into a niche of cheesey pop with real music by real musicians coming from their own realm - not powered by the almighty marketing buck.
You've got to look at this from a wider angle - we're not the sheep. We've got reasons why we do what we do. We don't just run amok without thought. Look at our posts, you have a rationale as to why you continue to buy CD's. So do I, and it focuses around a "higher moral" kinda thing (hell, it's late, I'm tired, and rhetoric is failing me here). But what about some kid down the street who just bought the latest Backstreet Boys release... did they have some moralistic, cognitive rationale as to why they bought it, or do they not care?
Take exception to being called a sheep. I wasn't referring to you, you are part of the exception. Stay that way. It's where we need to be.
Hi! This is the Sig, blatantly attached to the end of this comment.
I love AudioGalaxy. It's a lot easier to use (in terms of searching), downloads auto-resume, and downloads automatically come from the fastest available connection nearby. You also tend to get far fewer truncated files because it will, by default, download the most popular version of the MP3 (in terms of size and bitrate), but you can also custom-select which version to download yourself.
It's definitely worth a try (and blocked by far fewer firewalls and ISPs than Napster!).
The trouble with news is one lost article screws your download. But that's what error correction is for! A simple Hamming code allows you to, say, break the file into 26 data shares and add 5 error-correcting shares such that the file can be reconstructed after one share is lost; you can do better with more sophisticated error correction schemes.
I haven't seen any P2P proposals which make use of error correction technology, and it does seem like it might be useful.
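As a rough illustration of the simplest erasure case, here is a sketch that uses a single XOR parity share instead of a Hamming code. That is a deliberately swapped-in, simpler technique: it rebuilds exactly one lost share when you know which one is missing. The function names and the choice of 26 data shares in the usage lines are assumptions for the example.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_shares(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal data shares plus one XOR parity share."""
    size = -(-len(data) // k)                      # ceiling division
    padded = data.ljust(size * k, b"\0")           # zero-pad the last share
    shares = [padded[i * size:(i + 1) * size] for i in range(k)]
    shares.append(reduce(xor_bytes, shares))       # parity = XOR of all data shares
    return shares

def recover(shares: list, length: int) -> bytes:
    """Rebuild the data when exactly one entry of shares is None."""
    missing = shares.index(None)
    present = [s for s in shares if s is not None]
    rebuilt = reduce(xor_bytes, present)           # XOR of the rest restores the gap
    full = shares[:missing] + [rebuilt] + shares[missing + 1:]
    return b"".join(full[:-1])[:length]            # drop parity, strip padding

data = bytes(range(100))
shares = make_shares(data, 26)
shares[7] = None                                   # simulate one lost article
print(recover(shares, len(data)) == data)
```

Surviving more than one lost share needs more redundancy than a single parity share provides.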
--
Xenu loves you!
FUD? Just read the math, man. Make your own decision, sure, but read the paper first. There's nothing FUD-like about the mathematics in the paper.
--jordan
It should be possible to create self-organizing dynamic networks that can provide the necessary efficiencies. It will be more complex than Napster or Gnutella, but it should be possible. And I think it would be a lot easier if we had reasonable IP multicast to build on.
I wholeheartedly agree, the concept of an R value and figuring out an accurate way to represent realistic demographics is extremely complex and very prone to error. I never really thought that it was completely accurate (and I felt as though I did a good job giving the "grain of salt" talk), because as you so aptly point out, there's a lot of variables in the mix that can affect the demographic. All I wanted to do was present some math and methodologies, and test some example cases to provoke thought.
I hope people don't think that every search for "grateful dead live" will yield such uncharacteristic results, but I do hope that people see the numbers and think to themselves, "wow, an 18 byte search query could potentially generate this much traffic." And it's true, it possibly could happen. Who knows; you point out the relevancy of human psychology in deriving a more realistic number, and I don't think anyone can take an accurate stab at how it factors in to a P2P network of 1M people.
--jordan
Simple, with all that media frenzy going on (the Napster trial even got front-page coverage in my local newspaper) it's a big-scale advertisement for MP3. Yes, Napster has a userbase of 60 million, so using the argument that it's only specific individuals that are doing it is wrong, but if that story made it into my local newspaper (and we could see a mention of gnutella too), guess how many people that didn't know about it or napster will be curious to try different services out.
Now there will be media coverage (other than internet) mentioning other alternatives like IRC, Gnutella, search engines, etc etc, this is really a stupid move... not counting the many people that are going to be pissed off at RIAA and stop buying CDs.
RIAA should have worked closely with napster to bring a decent business model instead of bashing on them, they might have actually profited from that. They've shown how many "copyright material" were leeched every second (around 10,000) but did they show EVIDENCE that their sales decreased DUE to napster? no, they didn't have to, but if they would have, things wouldn't be that way. You bet after napster shuts down, their sales will decrease, I, for a start, will not buy anymore CDs.
I hope a company picks on big artists for digital distribution and doing something like stephen king, a buck a download, money would go STRAIGHT to them and the record label would stop its own piracy (i.e. ripping many artists off and taking the public for complete morons).
For now Gnutella will do for most people, and if people SHARE, maths or not, it will work, not as nicely as napster did, but there will be a bunchload of alternatives if gnutella isn't doing the job.
--- Metamoderating abusive downgraders since my 300th post.
Someone will ask for a recount...
--- Can i borrow your Clue-Stick(tm)? I need to go beat a few people with it...
These numbers are tricky. They speak of a huge transmission size on a big net. What does it mean for a user?
Using the default Gnutella parameters (Number of connections (N) = 4 and Time To Live (T) = 5) I can reach 484 (at most) users on the net. Using the same thinking I can say that 484 users can reach me.
How many queries does an average user make per hour? When we do a query we have to wait a little while for the replies, then we begin the transfer (the transfer does not affect the net). So let's say that an average user does 20 queries per hour. (This is an overestimate.)
So if 484 users do 20 queries of 83 bytes per hour, I have to receive 803440 bytes per hour. That's 223 bytes per second.
We also have to add our outgoing traffic and replies. But the number will surely be less than 1KByte per second.
We can add 100 million users to the net and the behaviour will be exactly the same. But we can only search 484 machines.
If we increase N and T, let's say to N = 5 and T = 7, we can reach 27305 machines. If we all do 20 queries per hour I will receive 45 MBytes per hour. That's 12 KBytes/sec. That's a bit higher, but many people can handle it.
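The reach and bandwidth figures above can be reproduced with a short sketch. It assumes an idealized tree with no redundant connections (each host forwards to N - 1 new hosts per hop), plus the 20 queries per hour and 83 bytes per query used in this comment; the function names are made up for the example.

```python
def reachable_hosts(n: int, ttl: int) -> int:
    """Upper bound on hosts reached: n neighbors, each forwarding to n - 1 others."""
    return sum(n * (n - 1) ** (hop - 1) for hop in range(1, ttl + 1))

def incoming_bytes_per_second(hosts: int,
                              queries_per_hour: int = 20,
                              query_bytes: int = 83) -> float:
    """Query traffic one host receives if every reachable host searches at this rate."""
    return hosts * queries_per_hour * query_bytes / 3600

for n, ttl in [(4, 5), (5, 7)]:
    hosts = reachable_hosts(n, ttl)
    print(f"N={n} T={ttl}: {hosts} hosts, "
          f"{incoming_bytes_per_second(hosts):.0f} bytes/sec incoming")
```

Note that the total population of the network never appears in either function; under these assumptions only N and T set the per-host load.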
It is possible to create more intelligent Gnutella clients, who cache content (some clients already do it), or cache query replies, or try to connect to near (geographicaly) machines.
My conclussion is that Gnutella Net can grow, but the number of machines we can reach is limited by the bandwith of the connections.
That mean, we can add more machines, but we can not reach more content without more average bandwith, or more "Intelligent" clients.
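For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the best case the post describes: no query ever doubles back, and every reachable host's queries arrive at you exactly once.

```python
def reachable_hosts(n: int, ttl: int) -> int:
    """Upper bound on hosts reached: n on the first hop, then every host
    forwards to its other n - 1 connections on each later hop."""
    return sum(n * (n - 1) ** hop for hop in range(ttl))


def incoming_bytes_per_second(hosts: int, queries_per_hour: int = 20,
                              query_bytes: int = 83) -> float:
    """Query traffic arriving at one node, assuming each reachable host's
    queries arrive exactly once."""
    return hosts * queries_per_hour * query_bytes / 3600


for n, ttl in [(4, 5), (5, 7)]:
    hosts = reachable_hosts(n, ttl)
    print(f"N={n} T={ttl}: {hosts} hosts, "
          f"{incoming_bytes_per_second(hosts):.0f} bytes/sec incoming")
```

With the 20 queries per hour and 83 bytes per query used above, this reproduces the 484 and 27305 host figures and the roughly 223 bytes/sec and 12 KBytes/sec rates.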
MOD THE CHILD UP!
Hate to reply to myself (at the risk of acting schizophrenic), but why is this "overrated"? I can't see anything about this that could possibly be overrated. I could even understand "flamebait" or "troll", if you look at it from a certain viewpoint, but why "overrated"? Just because you (J. Random Moderator) disagree with my political views doesn't mean you should mod me (or anyone else) down about it. If you are going to moderate, read the guidelines and follow them.
====
Crudely Drawn Games
But then who needs it to scale? Let there be a dozen or a hundred such networks. I don't think you will need a huge number of individuals for the majority of searches to find appropriate files. Maybe for executable files, source code or other specific data it would not be good. But when I look for music with Napster, I only sometimes get what I actually set out for, and instead get lots of other stuff I stumble upon. And for music, porn and some other things, that will more than satisfy.
The current Slashdot moderation system is made by gay communists!
i mean, each client is only passing a small amount of data to each of its neighbours, so i dont know if the aggregation of the total bandwidth usage is a ... useful ... measurement...
tagline
... hi bingo
Then in 1897, a revolutionary new invention was patented, and it was predicted that current cities would be retrofitted for it and new cities would be built around it. The Telharmonium, invented by Thaddeus Cahill, was titled Art of and Apparatus for Generating and Distributing Music Electronically (pat. no. 580,035). The Telharmonium was the first music synthesizer, which operated by additive (Fourier) synthesis: Sine waves from banks of dynamos, all turning at different speeds, were added and linked through a complex switching system, then played through the speakers of its day, telephone receivers. The Telharmonium's music would be sent over telephone wires into homes, restaurants, and hotels. It was like the Muzak of the early 1900's. This way the best music could be transmitted into homes, and would be available to the poor.
But the Telharmonium was a massive instrument, quite possibly the largest musical instrument ever built, weighing in at over 200 tons, and had to be transported by 30 railroad cars. It was prohibitively expensive for its day, and only 3 were ever built. Broadcasting music by radio came into being, and companies like Hammond and Wurlitzer started making organs which worked on the same principle of additive synthesis, but on a much smaller scale, due to advances in technology. The Telharmonium was rendered obsolete forever. (For more info, see http://www.obsolete.com/120_years/machines/telharmonium/index.html).
Not to stray too far off topic, this post wasn't meant to be the history of electronic music, but rather, the distribution of music. People were meant to subscribe, to pay for the wired connection that delivered the music the Telharmonium produced. But when radio supplanted music delivered by wire, people could listen to radio broadcasted music, essentially plucking it from the ether, without having to pay for it. Nobody was interested in the Telharmonium anymore, and besides that, more "modern" organs made the same sound anyway. Therefore the Telharmonium and the entire business model of distributing its music has completely disappeared. (Indeed, all three of these 200 ton instruments have vanished without a trace.)
Now things have come full circle. We are back to getting our music by the wire we pay to have connected to our homes (be it dialup modem, DSL, cable, etc.). Many of these wired connections we are paying for are owned by companies that also have a stake in the music industry, e.g. Time-Warner. What I want to know is WHY we are being punished, accused of getting our music for free by plucking it from the ether that is contained within the wires, when the wires themselves are owned by companies that also own the music industry?
--jordan
I am a firm believer in the fact that no matter what the RIAA or anyone else says or does, music will continue to pass freely throughout the internet community. Maybe Napster is dead, maybe not. Maybe Gnutella won't work, maybe it will. Who knows at this point. I'm sure someone, somewhere will come up with the "next big thing" and we'll all be enjoying our free music. There are so many people who have had a taste of this free music that even if only a small percentage get together to try and keep it this way, there's no way the RIAA or the federal government can keep plugging all the holes springing up all over the place. We just have to remember that we have strength in numbers, and that paying $15 for a CD with 10 songs on it is the biggest and longest running scam going on right now. Information deserves to be free; let's at least try and keep that true for our music long enough for the record companies to get the message. Maybe eventually the record companies will crumble and there won't be any more middle man sucking up all the profits. Then the fans and the artists can get the full benefits that they deserve.
I am currently working on a fully decentralized searching network. You can read more about it here.
The key aspects of this network will be:
- No forwarding. This is currently eating gnutella alive. A UDP based multiplexed transport protocol is used to maintain hundreds of thousands of direct connections to all the peers you want to communicate with. You can also tailor your peering groups precisely to what you desire, as far as quality, reliability, etc.
- Low Communication Overhead. All queries that are broadcast are performed with minimal overhead within UDP packets. A typical napster breadth query (10,000 peers) would take a few minutes on a modem, and seconds on a DSL line.
- Adaptive Configuration. Peers that have better or more responsive content will gravitate towards the top of your query list, thus, over time you will have a large collection of high quality peers which will greatly increase the chance of you finding what you need.
There are a number of other features, but too many to detail here.
Also, this is under heavy development, and not operational. I am going solo on this at the moment, and so progress is slow. However, once completed, it *should* be a scalable alternative to completely decentralized searching / location.
The math is bogus. There are two fundamental flaws with it:
- it assumes an unbounded number of gnutella clients
- it assumes that the propagation of a query never doubles back to an already visited node
Jordan brushes both these assumptions under the rug in his equation, saying it's for the "maximum number of reachable hosts" and therefore doesn't need to account for this kind of feedback. In fact, it's not as simple as he thinks.
Instead of starting with the exponential propagation of a query, start by looking at the network as a whole. Since there are P clients each with N forwarding routes, there are N*P total forwarding routes in the network. Gnutella clients will only forward a given query once (they keep track of a numeric ID for each query to avoid resending one that loops back to them), so the maximum number of transmissions that can result from a single query is also N*P (to reach this maximum would require an incredibly regular network structure, by the way, and is probably at least an order of magnitude above what real conditions would look like).
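The N*P ceiling is easy to check with a toy simulation. The sketch below is only an illustration under stated assumptions: the network is a ring with random extra links (at most N links per client), the TTL is large enough to be ignored, and `build_network` and `flood` are hypothetical names, not anything from a real Gnutella client.

```python
import random
from collections import deque


def build_network(p: int, n: int, seed: int = 0) -> dict[int, set[int]]:
    """Connected random graph with p clients (p >= 3), at most n links each (n >= 2)."""
    rng = random.Random(seed)
    links: dict[int, set[int]] = {node: set() for node in range(p)}
    for node in range(p):                 # a ring keeps everything connected
        other = (node + 1) % p
        links[node].add(other)
        links[other].add(node)
    for _ in range(p * n):                # then add random extra links
        a, b = rng.sample(range(p), 2)
        if b not in links[a] and len(links[a]) < n and len(links[b]) < n:
            links[a].add(b)
            links[b].add(a)
    return links


def flood(links: dict[int, set[int]], origin: int) -> tuple[int, int]:
    """Flood one query; each client forwards it only the first time it sees it.

    Returns (transmissions, clients_reached)."""
    seen = {origin}
    queue = deque([(origin, None)])
    transmissions = 0
    while queue:
        node, sender = queue.popleft()
        for peer in links[node]:
            if peer == sender:            # never send straight back
                continue
            transmissions += 1
            if peer not in seen:          # duplicates are received but dropped
                seen.add(peer)
                queue.append((peer, node))
    return transmissions, len(seen)


links = build_network(p=200, n=4)
transmissions, reached = flood(links, origin=0)
print(f"reached {reached} clients with {transmissions} transmissions "
      f"(ceiling N*P = {4 * 200})")
```

Because every client forwards a given query at most once, the transmission count can never exceed the number of forwarding routes, whatever the TTL.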
Now let's look at Jordan's numbers. He comes up with a shocking 1.2G of traffic from an 83 byte message in a case where N=8. (For the math, I'm only considering the "outgoing" traffic, since the incoming traffic varies proportionally with outgoing.) That represents about 15 million transmissions, which would mean that the network held about 2 million simultaneously active clients. That's an absolutely huge network compared to Napster (Gnapster reports about 11,000 simultaneous users) or Gnutella (LimeWire reports around 2000). Given that the current network is 3 orders of magnitude away from this size, I don't think we have to fear 1.2G of traffic anytime soon.
What most people care about is how much traffic they'll see on their own node. Let's say that people will start to get annoyed when baseline traffic hits 16K/s. Assume every query on the network reaches you (unlikely, but we're looking at maximums). On average, you will receive that query a maximum of N times, since the message goes through a maximum of N*P transmissions for P clients, for an average of N receptions per client. As Jordan says, the number of queries per second varies proportionally with the size of the network. Let's use his assumption of 1600 clients generating 1 query per second and each query taking about 83 bytes. 16K/s means about 200 query receptions per second. Dividing this number by N -- let's pick a high number, like 8 -- gives the number of originating queries, 25qps. This translates into a network size of 40,000 clients, or about 10 times the current size. Yes, that's closer than we'd like, but it's not nearly as doomsdayish as Jordan proposed.
What Gnutella should do is raise T and lower N. People have gotten scared by the geometric progression argument and think that big values of T are scary. In fact, it's big values of N that are the problem. There's no difference between a T of 20 versus 20,000, since once you've reached the entire network (which 20 will probably do) you've reached it, and your query dies not because of TTL but because all clients have seen it. On the other hand, changing N from 4 to 8 means you'll receive twice as many copies of every message, thus doubling bandwidth and halving the feasible size of the network. Basically, N should be only large enough to keep a query alive, and T should be large enough to reach the entire network with the given N. I would suggest setting N to just 2, with an initial optimization that the originating client sends it out to many more clients for the first hop, say 32, just to make sure that the query doesn't die in its early stages before there are enough "live" copies of it traversing the network. With that optimization, Gnutella could scale to 160,000 clients before it hits the 16K/s baseline limit.
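Here is the same back-of-the-envelope calculation as a short Python sketch, so the effect of N is easy to play with. `max_clients` is a hypothetical helper; it takes the post's worst case (every query reaches you N times) and Jordan's figures of 83 bytes per query and one query per second per 1600 clients.

```python
def max_clients(budget_bytes_per_sec: float, n: int,
                query_bytes: int = 83, clients_per_qps: int = 1600) -> float:
    """Largest network a node can sit in before query traffic alone
    exceeds its bandwidth budget, if every query arrives n times."""
    receptions_per_sec = budget_bytes_per_sec / query_bytes
    originating_qps = receptions_per_sec / n
    return originating_qps * clients_per_qps


for n in (8, 4, 2):
    print(f"N={n}: about {max_clients(16_000, n):,.0f} clients")
```

Without rounding 16K/s to 200 receptions per second, the results come out a little under the 40,000 and 160,000 quoted above, but the point stands: halving N doubles the feasible network size.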
it's not dead. it's just not going to be free.
But if Napster gets squeezed, you can bet your last dollar that it will be made to. Or something like freenet or audiogalaxy will take over.
But if the price of gasoline goes up, you can bet your last dollar that teleportation will be made practical. Or that cars that use fusion will be developed.
Not everything is practical just because there is a need for it.
--
Sometimes it's best to just let stupid people be stupid.
What about the *per-user* bandwidth? Even if you have gigs worth of data to move, if you have millions of users and things are split up evenly, that's only kilobytes per user. The clincher is looking at peak bandwidth at any given node, and comparing that to capacity. Did I skim the paper too fast, or did it not address this rather thorny mathematical question? Not that I believe Gnutella scales smoothly at all.
Dog is my co-pilot.
And for all the ignorant folks out there, 2 million users is not an absolutely huge network for Napster. AAMOF, last I checked Napster peaked at 1.8M aggregate concurrent last week. The problem is that ignorance of Napster's true usage is so pervasive that the comparisons I make and the conclusions I draw from them are sometimes lost.
I don't think my numbers are realistic. Please re-read the previous sentence. I tried my best to convey that impression throughout the paper, and I don't understand how some folks missed it. I do think that the numbers are still useful in formulating a perspective on what will happen if even a few percentage points of the Napster population decide to try and use Gnutella.
--jordan
On the contrary. Napigator is a nifty little freeware tool that lets the Napster client program use other Napster servers. The OpenNap network is huge and not going anywhere anytime soon...
...and will continue to improve if only folks would move to newer, more robust, and more compliant clients. If you're still running gnutella 0.53, or even Gn0tella, check out BearShare at http://www.bearshare.com/. You'll be surprised at how far Gnutella has come - that only hints at how far it may go in the future.
Critics said man would never set foot on the moon. Now critics are saying Gnutella is doomed. Funny, they've been saying that since March of last year and I'm still happily downloading MP3s. Ignore the critics and keep the faith.
Shaun
Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
The parent "takes responsibility" for a child by accepting its list of shared files, a text file small enough to be stored on the parent's disk. Along with the list it saves information about the child's bandwidth, IP address, etc. Now when someone searches the parent they also search the child without causing much extra inconvenience to the parent (a few more lines of filenames to search through). Importantly, no packets pass between the parent and the child during a simple search. Periodically, when there is a bandwidth lull, parents and children send each other "are you still alive" queries. If a parent dies, the child finds a new one by sending a standard "who will adopt me?" query, which locates a machine with:
1. Enough bandwidth
2. Enough spare CPU cycles to be able to churn through a few more search list entries.
3. Relative proximity to the child as defined by ping time. (Perhaps not necessary)
Once the child is adopted it passes its list of files to the parent and is notified only when there is a request for something from its list.
Maximal efficiency would be achieved through superparents, fast computers on fast connections that accumulate and search filelists from many children.
What would be the relationship among parents? Why not just the regular Gnutella protocol? The inefficiency of the current system is because everybody is sending requests to everybody. With the new hierarchy, only parents speak to each other. This way you can search the whole network and pass only a tiny fraction of the packets a similar search of Gnutella would require.
What I am in no position to write is a program for calculating which computer makes an eligible parent, how many parents there should be in a system, or how to tell what a parent's TTL should be from a check of the general network health (busy network -> only search some fraction of parents; tough luck). The point is, algorithms like these should be possible, and when optimized they would automatically keep the network balanced. All this with a whole lot less pinging in the system, an exponential reduction in traffic and no need to pass search queries through slow lines. (Nobody with a dialup modem would be a parent.)
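To make the parent/child idea concrete, here is a rough Python sketch of the bookkeeping a parent would do. Everything in it is hypothetical (the class names, the address string, substring matching as the search rule); it only illustrates that a search over adopted children is a local lookup with no packets sent to them.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    address: str             # made-up identifier, e.g. "10.0.0.5:6346"
    bandwidth_kbps: int
    files: list[str]


@dataclass
class Parent:
    own_files: list[str] = field(default_factory=list)
    children: dict[str, Child] = field(default_factory=dict)

    def adopt(self, child: Child) -> None:
        """Accept a child's file list and connection details."""
        self.children[child.address] = child

    def drop(self, address: str) -> None:
        """Forget a child that stopped answering 'are you still alive'."""
        self.children.pop(address, None)

    def search(self, term: str) -> list[tuple[str, str]]:
        """Search own files plus every adopted list, entirely locally."""
        term = term.lower()
        hits = [("self", name) for name in self.own_files
                if term in name.lower()]
        for child in self.children.values():
            hits += [(child.address, name) for name in child.files
                     if term in name.lower()]
        return hits


parent = Parent(own_files=["pink_floyd-time.mp3"])
parent.adopt(Child("10.0.0.5:6346", 56, ["Pink Floyd - Money.mp3", "notes.txt"]))
print(parent.search("pink"))
```

Only the parents would then need to speak the regular Gnutella protocol to each other, as described above.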
I wish I were more solid on the hardcore technical stuff. So you folks tell me--why wouldn't this work?
Spork
So, Jordan, you provide a nice demonstration of a flaw. It is considered polite in many circles, that when destroying someone's hard-work, that you make a peace offering in the form of some assistance.
:-)
Can we expect therefore to see an equally interesting and thorough discussion of how Napster/Gnutella can grow, evolve and perhaps merge, to provide the "ideal compromise" where we will not need 100Gb networks, but where:
a) The destruction of any significant %age of the network is transparently ignored or healed.
b) The network will not segment as GnutellaNet can.
c) Bandwidth requirements are low[er]
d) Anonymity of participants is maintained where required.
e) The law can't shut it down so easily.
f) Data can be secured, encrypted and/or signed (etc.) for specific users
And MY personal wish:
g) The end result is so globally accepted for file exchange and storage, that FTP dies a death, and we all live without buffer-overflow exploits for the rest of our lives
Note that Napster and Gnutella were very one-sided in their freedom with files. There was no facility available to ensure that the law wasn't honoured where desired.
--
Enjoy Y2K? Roll-on Year 2037!
They stated that the votes should be counted, that the FSP should establish guidelines for doing so, but that it was TOO LATE to count the votes now.
No, they stated that it was too late to establish clear guidelines followed by implementing the guidelines to count votes. They gave no opinion on what the guidelines should be.
It is simply my opinion that any guideline that requires a subjective opinion on the part of a counter is a bogus guideline. Any guideline should require unequivocal evidence that a vote goes one way or another. And every vote that had unequivocal evidence was counted.
Frankly, I find it astounding that anyone would argue that ambiguous votes should count. There is only one explanation -- because you're biased in favor of Gore. Since Gore could only win if you count bogus votes, therefore you're in favor of bogus votes.
Just for the record, I would have exactly the same opinion whether Bush or Gore had won.
--
Sometimes it's best to just let stupid people be stupid.
This was a plea for development assistance ;)
I could very much use some additional C++ development talent to help with this project. Anyone who is interested please let me know.
Thanks...
When you talk solely about downloading mp3's, I've tried both Gnutella and Napigator. I've always found Napigator to be more stable, easier to use, and more likely to provide good downloads than Gnutella. Better yet, Napigator works with existing Napster clients to bring da music to da masses.
If it's the trading of MP3s that's at stake, I believe that Napigator and nap servers like OpenNAP will save the movement, and not Gnutella.
This included the fact that load on each server grows proportionally to the total number of servers, so the total CPU usage for the whole system grows quadratically. There are also serious issues with naming, searching, tagging, and other things that could have been dealt with.
There didn't seem to be much interest in this so I moved to lurking the Freenet mailing list, which seems to be a much more grownup way of doing the same thing.
Return Rant:
I'd say there is no way to put the genie back in the bottle, either by products dying out or by legal action. Now that there has been a taste there will eventually be one or more working models. None is likely to have the instant dominant position Napster had (except possibly Microsoft's offering if they bind it into Windows) but that doesn't mean the concept will die. File-sharing is a simple concept and a very addictive concept, so it's something with a low barrier to market entry and lots of possible market share. That will drive companies to invest. Us geeks will invest our time just to keep the companies from sealing us in and because we like to hack code. I myself was working w/ file-sharing concepts long before Napster existed and am sure I will be long after. The concept has no doubt been growing ever since the invention of email. As a species we like being able to communicate freely. That includes text messages, voice messages, movies, photos, music, games, etc. Therefore there is no way the idea of sharing these things will die out. They'll just get thought about some more and new, better concepts will be tried over and over until we find the perfect one. Email, ftp, gopher, web, instant messaging, Napster, etc. are all steps we've taken.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
>I haven't worked for Napster in 3 months. What do you do now? Are you rich and retired? =)
Will code a sig generator for food
However, I think it could be used for specified servers to talk to one another. A client would therefore only have to know the address of one server in a server pool, which could search the other servers in the pool. Users would log on to one server, but because of the server peer-to-peer architecture, users on other servers could be connected to them, as the servers would query each other (like the current Napster servers).
Servers then only need stay up for a day or two, as long as the pool stays big enough, clients can get bounced along the pool. They could receive a message saying their server is stopping, but their server would pass them on to another pooled server and so on. If the user used their client reasonably often, they could maintain a constant connection to the 'network' even if in a few weeks time none of the same servers were serving the network.
One burning question: why can Napster not move to Cuba? What does the US government do then? I can't see Fidel refusing the investment. Any thoughts?
I hate to be an I-told-you-so but, oh wait, no I don't: I told you so!
Seriously, we who use OpenSource software shouldn't be afraid to accept constructive criticism. Telling ourselves to stop spreading FUD isn't going to fix real problems.
Just use Dijkstra's algorithm on a massive scale! Have you ever played Warcraft? How come the ogres can find the point you clicked on the map so fast? They don't have a "server", they use peer-to-peer networking to find the answer.
Will code a sig generator for food
Uh, right. Hands up everyone who actually needs to compile the latest, greatest kernel? Hands up everyone who did anyway?
Disclaimer: I'm a CompSci grad, but I haven't done a real analysis on this or anything. Be warned: unsubstantiated opinions ahead...
Gnutella can never scale in the sense that Napster does. If you search for a given file, you won't find every occurrence of it on the network (in fact, you probably won't find most occurrences). If too many clients try to connect, the network will break down, at random points, and split into subnetworks. This is, as I understand it, not as big a problem as it might seem at first, though.
The thing is, you don't want to find every instance of a given file, you just want to find one, and it doesn't matter which one. Say that the gnutella network has split into ten major subnetworks, where communication between them has broken down (the networks generate so much traffic that any server attempting to connect to both will be swamped, for instance). If a given file exists in only one copy, you have only one chance in ten to find it. If, also, you are the only one to want it, you are (almost) out of luck.
Now, the gnutella network isn't static. People tend to start their clients, then stop them again at semirandom intervals (they turn off the computer for the night, software crashes, they want to limit bandwidth (for a game of Quake or whatever) and so on). When they reconnect, there is a nonzero chance that they will end up on a different subnet (and I believe that the chance increases if they use an automagic host catcher thingy). So if you want that special file, you might want to try now and again for some days, and get lucky. Also, if people tend to share the stuff they themselves download (and this chance probably increases as bandwidth and storage capacity increase), popular files will get spread over the network, making it steadily more probable that you will find the file in your neighbourhood.
The thing is, gnutella _will_ scale, it just won't scale linearly over the number of hosts. Now, this is just my conclusion; any other takers on this?
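A quick way to put numbers on the "one chance in ten" argument: the sketch below assumes (my simplification, not something the post claims) that the subnetworks are equally sized and that copies of a file land in them independently and at random. The function name is made up.

```python
def chance_of_finding(copies: int, subnets: int) -> float:
    """Probability that at least one copy sits in your own subnetwork."""
    return 1 - (1 - 1 / subnets) ** copies


for copies in (1, 5, 20, 50):
    print(f"{copies:>2} copies, 10 subnets: "
          f"{chance_of_finding(copies, 10):.1%} chance")
```

A single copy gives the one-in-ten chance mentioned above, while a file that has spread to a few dozen hosts is very likely to be in whichever subnetwork you land on.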
Trust the Computer. The Computer is your friend.
Picture the following:
A system much like freenet, with a few differences.
First, the only keys that the storage/communication mechanism cares about are the MD5 checksums of the files in question.
Second, nodes do something like Seti@home, but with storage, not CPU cycles: ALL of the blocks on your disk that are not used by your filesystem are available to cache freenet files.
Fourth, files are split up into blocks, such that if a file spans blocks A, B, and C, if you get any two, you can reconstruct the third (IOW, RAID striping.) Each block is tagged with the MD5 of the whole file, and the sequence number of the block.
Third, the objects transferred are usually not whole files, e.g. if Alice asks for a Metallica tune, and sixty nodes out there have it, Alice randomly picks nodes to ask for particular blocks. This would tread far more lightly on each node out there than Napster does now.
Twelfth, if a node has idle time/storage/bandwidth, it can randomly receive blocks from other nodes.
Thirty-Seventh (yes, these paragraphs may be out of order. ;-) If your filesystem needs space that's being used by the freenet cache, it can just go right ahead and grab disk sectors. (Anything you care enough to store permanently is in your filesystem.) Anything someone else cares about is in his file system.
Indexes are just files that match names to MD5's of files. There need not be any single scheme for indexing these files. There can be any number of names for a given file.
This addresses a few of the problems with Napster, like "D'oh! He logged off when I had all but five seconds of the song!", or "Man, I hate it when someone D/L's from me and I'm on a modem connection." If the typical hit on each node for a D/L is about 20K, bottlenecks go away.
A couple other random thoughts: It might be quite doable to implement this with UDP, if you make the blocks small enough. With this scheme, if I have any *part* of the file you want, I can help fulfill your request.
In Napster, I tended to look for people with the fastest net connections to D/L from. That's not really fair, is it? With the scheme I suggest, I'd ask for blocks from the fastest and the slowest nodes alike, and each of them would decide how helpful they wanted to be.
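The "any two blocks rebuild the third" part can be done with plain XOR parity. The Python sketch below is one way to do it, not necessarily what the poster had in mind: it splits the data into two halves plus a parity block, tags blocks with the MD5 of the whole file and a sequence number, and assumes the original length is known from somewhere (say, the index). All function names are hypothetical.

```python
import hashlib


def xor_bytes(p: bytes, q: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(p, q))


def split_with_parity(data: bytes) -> dict[int, bytes]:
    """Blocks 0 and 1 are the two halves; block 2 is their XOR parity."""
    half = (len(data) + 1) // 2
    first = data[:half]
    second = data[half:].ljust(half, b"\0")   # pad to equal length
    return {0: first, 1: second, 2: xor_bytes(first, second)}


def reconstruct(blocks: dict[int, bytes], length: int) -> bytes:
    """Rebuild the original data from any two of the three blocks."""
    if len(blocks) < 2:
        raise ValueError("need at least two of the three blocks")
    first = blocks[0] if 0 in blocks else xor_bytes(blocks[1], blocks[2])
    second = blocks[1] if 1 in blocks else xor_bytes(blocks[0], blocks[2])
    return (first + second)[:length]          # strip the padding


song = b"pretend these bytes are an MP3"
key = hashlib.md5(song).hexdigest()
tagged = {(key, seq): block for seq, block in split_with_parity(song).items()}

# Suppose the node holding block 0 logged off: rebuild from blocks 1 and 2.
received = {seq: block for (_, seq), block in tagged.items() if seq != 0}
rebuilt = reconstruct(received, len(song))
print(hashlib.md5(rebuilt).hexdigest() == key)
```

Checking the rebuilt file against the MD5 key also tells you whether some node handed you a bad block.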
To put it in Star Trek terms, this idea makes every machine participating act as a pattern buffer for a transporter, as it were.
Comments?
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Now that's a funny one. Move napster to cuba. Well, let's see:
Cuba is communist, thus any private business can get taken by the state
USA has a trade embargo w/Cuba. Thus it would be very hard to do a move, and be hard for napster to get any advertising money from US corps.
he misspelled the first word on his paper as "Forward", not "Foreword"
Gamertag: ChrisCasey
*sigh*
omega_rob -- friend of the dread pirate Napster
Inquiring minds want to gnow.
---
---
Just doin' my job!
Had an idea for a solution to file sharing over the internet without the vulnerability of a centralized static site serving as the database:
The client software would have preference settings allowing a user connecting to the fileshare system to indicate their "eligibility" to become a temporary Database Host. Options of Always, Ask, and No.
Client software would work like this:
Access specific IRC Channel and Query established hosts.
Hosts (the temp. Database Hosts) would respond stating who they are, and requesting the client's share list.
Database Hosts would negotiate which host would accept the new client's list.
Client would then be told which host to transmit its list to, and when its next update would be expected.
Search requests are then transmitted to the Hosts through IRC, and results are returned directly to clients by Hosts. 1 to 1 transfers are then initiated using the client's choice of protocol.
When clients contact hosts indicating they are still online, the Hosts will ask the client program about Server eligibility. Database Hosts will change to those who indicate a preferable host environment.
Of course there are specific things to work out, but what do you guys think? Use IRC as a central communications channel for everything, and use a randomized central group of systems as centralized databases - faster search returns than gnutella can produce, but at the same time, no central server that is easy to shut down.
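A very rough Python sketch of two pieces of this, purely to show the shape of it. The post doesn't say how the Database Hosts would negotiate, so the "least-loaded host takes the new list" rule here is my assumption, and every name is hypothetical. The IRC transport itself is left out.

```python
from enum import Enum


class Eligibility(Enum):
    ALWAYS = "always"
    ASK = "ask"
    NO = "no"


def assign_host(host_loads: dict[str, int]) -> str:
    """Pick the host holding the fewest share lists (assumed rule);
    ties are broken by name so every host reaches the same answer."""
    return min(sorted(host_loads), key=host_loads.get)


def promotion_candidates(clients: dict[str, Eligibility]) -> list[str]:
    """Clients that could become Database Hosts without being asked first."""
    return sorted(name for name, choice in clients.items()
                  if choice is Eligibility.ALWAYS)


loads = {"hostA": 120, "hostB": 45, "hostC": 45}
print(assign_host(loads))
print(promotion_candidates({"alice": Eligibility.ALWAYS,
                            "bob": Eligibility.NO,
                            "carol": Eligibility.ASK}))
```

A deterministic tie-break matters here because the hosts decide independently and have to agree on who accepts the list.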
Just a thought. Don't have the skills or time to write up a trial client.
The only difference between a mom and pop and a corporation is mom and dad can't tell you what to do anymore.
GNUtella is an example of application layer multicast, and it has all those nasty problems.
There are two fundamental issues:
1. How do you find the nearest copy of X?
2. How do you know that it's the right X?
GNUtella could be greatly improved if there were a way to stop a search once you have found what you're looking for. There are two ways to accomplish that goal. The first is to send out some sort of "cancel" that gets queueing priority. At best you'll get a single order of magnitude improvement. And you probably need 5-6 orders, and maybe more.
The other way to cut the search off is to change the model so that instead of multicasting queries to all parties, one multicasts content availability. Content is relatively stable and CPU cycles are cheap. Push the query processing as close to the client as possible - preferably TO the client who makes it. Doing so saves both of us from an explanation of why (#2) above is a big deal.
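Roughly what "multicast availability, query locally" could look like on the client side, as a Python sketch. The class and method names are invented for illustration, and the announcement transport is not shown; the point is only that once announcements have been collected, a query costs no network traffic at all.

```python
from collections import defaultdict


class AvailabilityIndex:
    """Local index built from other peers' availability announcements."""

    def __init__(self) -> None:
        self._holders: dict[str, set[str]] = defaultdict(set)

    def announce(self, peer: str, names: list[str]) -> None:
        """Record that `peer` says it has these files."""
        for name in names:
            self._holders[name.lower()].add(peer)

    def withdraw(self, peer: str) -> None:
        """Forget everything a departed peer announced."""
        for holders in self._holders.values():
            holders.discard(peer)

    def query(self, term: str) -> dict[str, list[str]]:
        """Answer a search from the local index only."""
        term = term.lower()
        return {name: sorted(holders)
                for name, holders in self._holders.items()
                if term in name and holders}


index = AvailabilityIndex()
index.announce("peer1", ["Pink Floyd - Money.mp3"])
index.announce("peer2", ["Pink Floyd - Money.mp3", "Pink Floyd - Time.mp3"])
print(index.query("money"))
```

The trade-off is that announcements cost bandwidth up front, which pays off only if content really is as stable as the post argues.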
I understand that Cuba would pose serious logistical problems. However, as I mentioned in my other reply, there must be jurisdictions where Napster's server activity would not be illegal. What happens if they just relocate there?
Forgot this the first time around. Here are some tips to improve Gnutella's performance for yourself and for everyone.
1. Never connect to more than 5 hosts at a time. There's no need for it and you'll only hurt yourself by doing so. I used to spend a lot of time in the gnutella.wego.com discussion area, and then the GnutellaNews boards, helping out new users. Time after time someone would come in and say, "Gnutella is shit! I type in a search and I don't get results for 10 minutes!" Me: "How many connections do you have open?" Them: "50, and if I try with 100, it goes even slower!!"
The more active connections you have, the slower your Gnutella experience will be... And by being a congested node, you're adding latency to the network for everyone else. Set your max connections to 5. That gives me, on average, an overhead of 6-10K/sec in background chatter, not counting uploads/downloads.
If you're on dialup, max your connections out at 2 and (it hurts to say this) don't share files or you won't be able to do anything else online. If you really want to share - and that's a good thing - cap your uploads at 1. Leave routing up to the people with the fatter pipes.
2. Go for diversity in your connections. If you load up your client and see that you're connected to 5 RoadRunner nodes, dump a few of them and try to connect to other networks. Peer-to-peer file sharing relies a lot on peering, after all. Connecting across ISPs, networks, and even across countries is a good thing.
3. Don't share junk files. Please. Every time I search for Pink Floyd and get a ton of under-1MB MP3s in the results, I want to kill someone. Know which directories, if any, you're sharing... And clean them out from time to time. All those incomplete downloads you made are being sent out as search results, but nobody is going to download them from you. Those are a lot of wasted bytes coming through your query hits.
4. Perhaps most importantly, use a good client. See the parent for details.
Shaun
Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
I've set up some bandwidth limits on our FWs, at first to be less vulnerable to DDoS and then to restrict our users from scanning the net. Well, Gnutella is worse! One of our smaller boxes crashed; the system locked up due to too much traffic (no, it wasn't Linux, nor Windows). It turned out that some loser had changed the minimum connection setting to _100_!
Now I've had to block the default Gnutella ports on the FWs, hoping most users won't find out, and that those who do are less stupid...
--
42 cows on a 42km road on their way to 42.org
Do you really want to uudecode every MP3 you get, and have to worry about part 323 of 3242 not being in the thread? Sorry, not for me.
--
Just a thought....
I was looking at the Gnutella protocol just yesterday because I was planning on writing a client. I ended up sort of laughing at how poorly thought out the protocol is. The "protocol", as it's called, describes how to handshake (it's really simple because it doesn't need to be complex) and how to request and receive files (HTTP-style requests). That's it. That's the entire protocol. There is some vague mention of "passing messages on to people you're connected to" and "pinging", but as far as I could tell, there's no set way to do it. As a result, it's up to the client writer to implement these things, and I sincerely doubt many of the client writers have had any in-depth advanced networking classes. Basically, as it stands right now, Gnutella will never work, not with the protocol as it's currently implemented. The biggest killer is message forwarding: if you can solve message forwarding, you can have a scalable network. If your protocol includes message forwarding, you're dead in the water.
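For what it's worth, here is a minimal sketch of what flood-style forwarding looks like when simulated on a small graph. It is an illustration under assumptions, not a description of any particular client: the `flood` function, the TTL handling, and the "drop anything you've already seen" rule are hypothetical choices, since the complaint above is exactly that the behaviour isn't pinned down.

```python
from collections import deque


def flood(graph, origin, ttl):
    """Flood one message from `origin`; return (hosts_reached, transmissions).

    `graph` maps each node to a list of its connected peers.
    Assumed rules: forward to every peer except the sender, decrement the
    TTL per hop, and drop messages a node has already seen.
    """
    seen = {origin}
    transmissions = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL expired: this node does not forward
        for peer in graph[node]:
            if peer == came_from:
                continue
            transmissions += 1  # the bytes go over the wire either way
            if peer in seen:
                continue  # duplicate: dropped on arrival, bandwidth already spent
            seen.add(peer)
            queue.append((peer, node, remaining - 1))
    return len(seen) - 1, transmissions


if __name__ == "__main__":
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(flood(ring, 0, 7))
```

Even on a four-node ring, reaching 3 hosts costs 5 transmissions in this model, because duplicates are only detected after they have been sent. On a densely connected network that gap is where the bandwidth goes.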
---
we stand in life at midnight, we are always on the threshold of a new dawn.
Lopster also features excellent integration with OpenNap, superior to Gnapster's in my opinion.
:wq
Tools like NewsShark and NewsGrabber make it easy to post or obtain binary formatted files such as multimedia and there is plenty of it available. No waiting for downloads, no acne-faced punk kids aborting them, and you can batch and resume at your convenience.
Usenet isn't that hard to use and there is a lot of music that can be found from your ISP's news server. Grab a client and check it out!
-Pat
Gnutella was not designed to take the type of traffic it has now. I guess it's time to go back to the drawing board.
Fudlike? Hmm... I disagree. While the information may be 100% accurate and the analysis might be perfect, the way it is presented is very FUDlike.
It's well known that math and stats can be used to prove just about anything. Don't just trust something because it looks scientific. To me, big scary numbers presented by a competitor generally feel wrong, so I make sure to check the math carefully when I see things like that.
I've wondered about a mini-server architecture w/ Gnutella. Could there be some way to have gnutella hosts hold mini elections, and elect clients to become mini servers?
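One way such a mini election could work, as a rough sketch only: each host looks at itself and its direct neighbours and votes for whichever advertises the most bandwidth. Everything here is hypothetical (the function name `elect_mini_servers`, the idea of an advertised bandwidth figure, and the tie-break on host name); nothing like this is in the Gnutella protocol as far as I know.

```python
def elect_mini_servers(neighbors, bandwidth):
    """Return a dict mapping each host to the mini server it elects.

    `neighbors` maps a host to the list of hosts it is connected to.
    `bandwidth` maps a host to its advertised bandwidth (assumed honest).
    Each host votes for the fattest pipe among itself and its neighbours;
    ties are broken by host name so every host decides the same way.
    """
    votes = {}
    for node, peers in neighbors.items():
        candidates = [node, *peers]
        votes[node] = max(candidates, key=lambda n: (bandwidth[n], n))
    return votes


if __name__ == "__main__":
    neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
    bandwidth = {"a": 1000, "b": 56, "c": 128, "d": 56}  # kbit/s, made up
    votes = elect_mini_servers(neighbors, bandwidth)
    print(votes)
    print("mini servers:", sorted(set(votes.values())))
```

The dialup hosts end up behind whichever neighbour has the fatter pipe, which is roughly the "leave routing to the people with bandwidth" idea. A real scheme would also have to deal with hosts lying about their bandwidth and with mini servers dropping offline.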
signature smigmature
- James
What is Mojo Nation?
Mojo Nation is a revolutionary new peer-driven content distribution technology. While simple data distribution architectures like Napster or Gnutella may be sufficient to allow users to trade mp3 files, they are unable to scale up to deliver rich-media content while still taking advantage of the cost savings of peer-to-peer systems. Mojo Nation combines the flexibility of the marketplace with a secure "swarm distribution" mechanism to go far beyond any current filesharing system -- providing high-speed downloads that run from multiple peers in parallel. The Mojo Nation technology is an efficient, massively scalable and secure toolkit for distributors and consumers of digital content.
Sounds like you're dancing around the issue. Have you read the paper? Do you even have the math background to do so?
Actually, UDP is almost universally used by game programmers because it is the only way to get around NAT (without central servers, which don't work well for gaming).
Also, no, firewalls and proxies do not filter UDP by default (none that I have encountered), although you can configure them to do so.
Then there weren't any votes at all. If you think that machines are infallible, you're an idiot.
(or more likely, a troll)
Also, when data is placed on Freenet, it's split into pieces and distributed to several nodes making it even tougher.
Actually file splitting hasn't been implemented yet on Freenet. It probably will be by version 0.4.0 (current is 0.3.7) but not yet.
Perhaps you are thinking about how related files, such as the files comprising a website on Freenet, get put on different nodes when they are inserted?
gnutella can't scale... white men can't jump
You also highlight the really big scary numbers that come from doubling the default gnutella settings.
I appreciate you sharing the equations and methodologies you used, and I'm in the process of looking over the math right now. If I had a nice math package to help me and if I weren't brain-dead after a long day of work, it would be going a lot quicker.
Anyhow, I don't mean to offend. I'm expecting the math will be correct and the methodologies you used will be ok. The point I was trying to make is that the speed at which most people posted comments meant they had only skimmed the article too, and some people were saying "You can't argue with him, he's using Math!!" I really think people should check over the math for themselves before they agree with what you're saying.
A hobby of mine is poking holes in things too. Mostly TV commercials. I try to figure out the loopholes that let them say the things they say. "7 out of 8 math profs say your analysis is flawed" (I just haven't told you their names or what mental institution they're currently in).