Finally Real P2P With Brains
dfelznic writes: "The mp3 archives of CodeCon are now available, which is news in itself. But what makes this really interesting is that they are being distributed by BitTorrent. BitTorrent allows users to download a file from multiple different people. Instead of everyone nailing one server, users get the file from other users. Furthurnet uses a similar technology to distribute legal bootlegs of concerts. The archive is available at the BitTorrent demo downloads page. As soon as I started downloading (cable modem) at around 300k I got a request for the file and began uploading at 40k. This could be the answer to the Slashdot effect ;) Now, who is going to be the first to complain about the use of mp3s instead of oggs?"
Nice idea, I have to say, but my biggest problem with file-sharing utilities is the fact that the file you're looking for isn't going to be the same with everyone. NudeCheerleader(part1).mpeg isn't going to be the same as NudeCheerleader(part1).mpeg on someone else's computer. There's no way I know of, besides implementing CRC, to prevent people from just renaming files into other things. Maybe NudeCheerleader(part1).mpeg is really GoatseLiveVideo.mpeg, just renamed.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
edonkey has been doing this for ages..
I hate to rain on the parade but Morpheus et al. as well as the latest version of BearShare both do this, and have for some time.
When you say P2P with brains, to me it means somebody has come up with an elegant balance between centralization and search speeds.
Peer broadcasting is hardly something to write /. about, I'd say.
BitTorrent allows users to download a file from multiple different people. Instead of everyone nailing one server, users get the file from other users.
eDonkey does one better. Even if you only have parts of the file downloaded, you can immediately send the parts you do have to other users. And eDonkey has had a pretty good track record. I thought everyone and their mother knew about this, so why was this a Slashdot headline, especially when it's pretentious and untruthful?
"Love heals scars love left." -- Henry Rollins
They use hashes of the file to compare.
-jason m
It is a browser plugin (IE) that creates mini distributed networks based around a website.
So say you start downloading the latest Counterstrike patch from some server. Well you know how servers giving out the CS patch get filled up quickly.
Well, if the users were running this program (plugs in to IE, no restart necessary; look for a {browser here} version yourself!) then when they started downloading, somebody ELSE could start downloading FROM them.
No file synch issues (same file, same source) the server just re-directs future downloaders to current downloads and has the original downloaders forward the files along.
Need help treating your acne? Come here!
Loudcloud has been working on something like this for a little while... something called "bitcasting"?
EveryDNS. Use it. It works.
AC's need not reply
Ok people, I know we all have this dream of distributed web serving, but as a web developer I feel I must explain why this will not work now:
1) Response Time
To make this work you need more than a fancy P2P network. Remember, sites like Slashdot are database backed and update very quickly. Sure, Slashdot caches pages, but many things like user preferences and comments are updated way too quickly for a P2P network to distribute them.
2) Security
Yes, you can encrypt, but who other than a hobbyist is going to put the content that represents them on several machines at once and expose themselves to someone breaking it? If someone were successful, they could do things like change the Slashdot homepage for those they are distributing to. You cannot be a credible source and distribute yourself like that.
3) Slashcode (yeah I know, slashdot specific)
Have any of you actually read slashcode? I'll tell you what, it is damn complicated. There is no way a simple patch is going to make a site like this distributable. The entire thing would need to be redesigned, which is no small job. I'd say that this would be the case for any database-backed site as well.
4) Databases
Since I've mentioned databases a few times already, I think I'll point out the flaw here. Name one database system that is able to handle an organic network of servers (i.e. constantly going up and down), keep all the data available, keep all the data available on a reasonable connection (not behind 56k lines), give the response time you need, not take up huge amounts of system resources, and be easily set up on one of the P2P nodes by even a reasonably competent user. Oh, that's right: none. And you have to have that in order to have a dynamic site on a P2P network, and dynamic sites are a huge portion of the web at this point.
Well, that's all I can think of right now on this, but I'm sure there are plenty of other reasons why this isn't feasible in the near future.
Cheers
eDonkey has the same feature (with some differences in the publishing process), but is really an application of its own, very file sharing oriented, closed-source and banner-supported. Not exactly what a content provider would want users to download before they can access his files. Still, ed2k has the advantage of a large user base, and also supports ed2k:// URIs that can be used on webpages.
SwarmCast is interesting, but the company behind it mostly died, and now it is somewhat in limbo. Its Java base has made it problematic as a desktop application. The only real alternative to BT is Mojo Nation, which is currently being reworked as "MNet".
If you want to know what CodeCon is all about, check the Feature box on infoAnarchy, we had some detailed coverage.
I'm very surprised at how little attention GNUnet has gotten in the P2P arena. GNUnet is anonymous, distributed, encrypted, reputation based, has accounting, allows for distributed queries, and uses dynamic routing. While GNUnet is still beta software, I think it's a great anti-censorship tool. What all this means in non-buzzword speak is that you have a tool that combines a lot of the great qualities of other similar networks (FreeNet, Mojo Nation, etc.) and doesn't have all of the shortcomings. Give it a shot.
There seem to be a lot of people who really haven't read the site or don't understand how the technology works. Yes, all those P2P filesharing utilities allow you to download the same file from multiple people at once; it's not all that impressive, and many of the problems, such as validating matching files, have been worked out.
:-P
This solution is different in a few very large respects. It allows a company to keep track of who is currently downloading a file from their webserver. This information is then sent to the clients, who can start the P2P portion of the process and download segments of the file from other users, relieving the load on the company's server. This is in contrast to those other P2P FILE SHARING programs, which share all your files, not just the ones you are currently downloading. A system like this makes the file server not only the original source for that file but also the P2P server for finding other people to download that ONE file from.
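The "server keeps track of who is downloading and tells the clients" idea above can be sketched roughly as follows. This is only an illustration under assumptions: the `Tracker` class, its method names, and the peer address strings are hypothetical and are not taken from BitTorrent itself.

```python
from collections import defaultdict


class Tracker:
    """Hypothetical in-memory tracker: remembers which peers are fetching each file."""

    def __init__(self):
        # file id -> set of peer addresses currently downloading that file
        self._peers = defaultdict(set)

    def announce(self, file_id, peer):
        """Register `peer` for `file_id` and return the other peers already known."""
        others = sorted(self._peers[file_id] - {peer})
        self._peers[file_id].add(peer)
        return others

    def leave(self, file_id, peer):
        """Forget a peer that has stopped downloading."""
        self._peers[file_id].discard(peer)


tracker = Tracker()
first = tracker.announce("codecon.mp3", "10.0.0.1:6881")
second = tracker.announce("codecon.mp3", "10.0.0.2:6881")
print(first)   # the first downloader has nobody else to ask yet
print(second)  # the second downloader is told about the first
```

Note that the tracker only knows about peers for the one file they asked for, which is the difference from general file-sharing programs that the comment points out.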
I can see where people may not want their upload bandwidth being used by others. For this reason any site implementing this feature would probably end up having to provide the file for normal download. The selling point would be a possibly faster download for users of the technology.
I would personally love to see huge sites like FilePlanet put this to use. Granted, it would only be truly useful for sites that have a constant stream of concurrent downloads for a file at any point in time, but it would be much better than having to wait 2 hours in line to download a file.
Several people commented that this thing allows you to redistribute files before you finish downloading them. But this is not a big deal, simply because most of the time a file is not being downloaded; it just sits on the HDD for 99.9999% of its life. The gain from the early upload would be next to nothing.
Red Swoosh is a cool technology specifically aimed at distributing the load for things such as images on a website. The client download for IE just involves clicking install and downloading a client that's a few hundred KB, after which you mirror a portion of the site. www.deviantart.com uses this, and to good effect. I'm not sure if you can mirror large files on it. It is of course centralized.
Photos.
The latest BearShare and LimeWire both allow you to "swarm" gnutella downloads.
BitTorrent allows users to download a file from multiple different people.
Or if you're downloading the latest boy band single: multiple identical people.
"As soon as I started downloading (cable modem) at around 300k I got a request for the file and began uploading at 40k."
I have a 1184/160 kbps asymmetric (DSL) connection. This seems like a common ratio with many ISPs these days. A full speed download consumes at least a fifth of my upstream bandwidth. Presumably that's due to things like TCP ACKs. Any kind of serious upstream activity squeezes things and can quickly reduce a download to half speed. I don't find the concept described very useful, especially if I'm in a rush to get something. Is there a way to throttle upstream bandwidth consumption?
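One common way to cap upstream use in general is a token bucket. The sketch below is generic and makes no claim about what BitTorrent offers; the thread does not say whether the client has such an option. The `TokenBucket` class and its parameters are hypothetical, and the clock is passed in explicitly so the example is deterministic (in real use one might pass `time.monotonic()`).

```python
class TokenBucket:
    """Generic upload limiter: at most `rate` bytes per second on average."""

    def __init__(self, rate_bytes_per_s, capacity_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = 0.0               # time of the previous refill

    def allow(self, nbytes, now):
        """Return True if `nbytes` may be sent at time `now` (in seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Cap uploads at about 10 KB/s, leaving the rest of a 160 kbps uplink for ACKs.
bucket = TokenBucket(rate_bytes_per_s=10_000, capacity_bytes=10_000)
sent_first = bucket.allow(8_000, now=0.0)    # fits in the initial bucket
sent_second = bucket.allow(8_000, now=0.0)   # refused: only 2,000 tokens left
sent_later = bucket.allow(8_000, now=1.0)    # allowed again after a one-second refill
print(sent_first, sent_second, sent_later)
```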
nothing new, edonkey2000 has been doing this for months now.
linky linky
Runnin' On Empty
it's a helper app. You can build it for any browser that knows how to open a helper app for certain files.
You're just jealous because the voices only talk to me.
Mojo Radio, a Toronto area radio station ('talk radio for guys') uses something similar to do streaming audio. They use technology from ChainCast Networks to distribute the streaming of Windows Media broadcasts. It installs a little app in your Windows machine and runs whenever you listen to the stream.
As near as I can tell, they aren't using BitTorrent, which is a shame because it's perfect for just this.
You're just jealous because the voices only talk to me.
It does, however, seem like a fair trade.
The gnutella spec specifies the use of SHA, *NOT* CRC32 or MD5, as some others have recommended. Both of the latter two can be exploited to pass garbage by a check (with CRC32, you have some control over the content, even).
MD5 is *not* suitable for ensuring that two files are identical when a malicious user is involved. It *is* suitable for ensuring that a malicious user may not hand you anything that passes but pure garbage (given what we know about MD5 today).
CRC32 is totally unsuitable for any environments that could involve malicious users.
SHA is the only common hash appropriate for this sort of problem.
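To make the size difference between the two checks concrete, here is a small sketch using Python's standard `zlib` and `hashlib` modules. The sample byte strings are made up for illustration. Swapping in `hashlib.sha256` would be a one-line change if a longer digest is wanted.

```python
import hashlib
import zlib


def crc32_hex(data: bytes) -> str:
    """CRC32 as 8 hex digits: a 32-bit checksum meant for accidental errors."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def sha1_hex(data: bytes) -> str:
    """SHA-1 as 40 hex digits: a 160-bit cryptographic hash."""
    return hashlib.sha1(data).hexdigest()


original = b"concert recording, disc 1"
renamed_junk = b"something else entirely"

print(len(crc32_hex(original)) * 4)   # digest size in bits for CRC32
print(len(sha1_hex(original)) * 4)    # digest size in bits for SHA-1
print(sha1_hex(original) == sha1_hex(renamed_junk))  # different content, different hash
```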
So far, this looks like it's going pretty well. Any and all feedback is much appreciated, and will hopefully help make BitTorrent an even better product. Please mail me about your experiences.
You download file chunks from multiple people, and files can even have a completely different filename. All files are given a hash value to compare to.
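A rough sketch of how per-chunk hashes let you accept pieces from strangers, whatever the file is called on their disk. The chunk size, function names, and sample data are assumptions for illustration; the comment does not say which hash or chunk size eDonkey actually uses.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically tiny, so the example data stays readable


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Hash every fixed-size chunk; this list is what would be published."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def verify_chunk(index: int, chunk: bytes, expected: list) -> bool:
    """Accept a chunk from any peer only if it matches the published hash."""
    return hashlib.sha1(chunk).hexdigest() == expected[index]


published = chunk_hashes(b"abcdefghij")        # chunks: abcd, efgh, ij
good = verify_chunk(1, b"efgh", published)     # genuine second chunk
bad = verify_chunk(1, b"EVIL", published)      # renamed or tampered data
print(len(published), good, bad)
```

Because the check is on content, the filename never enters into it.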
Speaking of good things about eDonkey, there is also forced uploads, meaning no losers cutting your downloads on you.
Please, call them "legal live concert recordings", not "bootlegs". That's like saying "legal pirated MP3s".
- A.P.
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
Also, because you're typically downloading a few tens to hundreds of megabyte chunks, you're a useful server for 90-99% of the time you're downloading, rather than the Freenet model where you're only useful *after* you've finished downloading the stuff you want. So instead of a long-term persistent set of users who always want stuff, BitTorrent is designed for temporary communities of people who want stuff Right Now, and it doesn't depend on them hanging around being useful after they've got what they want. (So you can download the latest release of a Debian ISO and then go install it without feeling like you're depriving the community by taking your machine offline.)
BitTorrent might be able to manage larger numbers of smaller files, e.g. a Slashdot event, but I haven't looked lately, and it's more interesting for the bigger things. (Of course, some slashdotting problems aren't file retrievals, but server interactions, like that one-IC web server powered by a potato battery, and it doesn't have anything to offer for that :-)
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Additionally, it makes it very efficient for the first set of people who are downloading the file. Instead of having to download the whole thing from one source, which is probably overloaded, you're able to download pieces from lots of different people. The server takes advantage of this - instead of giving Alice chunks 1, 2, 3,
This also reduces the latency required for later people in the process to get their material - instead of waiting for the entire 600MB CD to be copied N times in a row, the downloading gets pipelined.
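The pipelining gain can be put in numbers with a simplified model. The assumptions here are mine, not the comment's: the file is split into equal chunks, every hop moves one chunk per time unit, and the copies form a single chain.

```python
def store_and_forward_time(chunks: int, hops: int) -> int:
    """Each peer waits for the whole file before passing it on."""
    return chunks * hops


def pipelined_time(chunks: int, hops: int) -> int:
    """Each peer forwards a chunk as soon as it has it."""
    return chunks + hops - 1


# A 600MB CD image as 600 chunks of 1MB, relayed across 5 hops.
slow = store_and_forward_time(600, 5)
fast = pipelined_time(600, 5)
print(slow, fast)  # 3000 604
```

Under this model the last person in line waits roughly as long as the first, instead of five times as long.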
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
I'm not completely versed in morpheus/kazaa/bearshare/whatever, but I understand they allow you to download a file from more than one other person simultaneously, known as "swarming" the download (btw, this is called "anteloping" on furthurnet). It is my further understanding that you can only do this from people who have the *complete* file.
What bitTorrent (I think) and furthurnet (I know) are doing is different than this. If 5 people are downloading a file from the one person who is sharing it, those 5 people can be the beginning of 5 chains of people, relaying each packet down the chain as they get it, regardless of whether or not anyone has the complete file.
Furthurnet uses a protocol called PCP (Packet Chain Protocol) to do this, and it automatically arranges the chains so that those with faster upload speeds are toward the top, with the dialup users toward the bottom.
If the main host goes offline, even if no one on the chain has the entire file, everyone on the chain can still continue downloading everything that the topmost person on the chain has already saved.
A good example: say a dialup user has a large file that is in high demand. A T1 user comes along and spends a long time downloading it at the dialup user's horrible upload speed, and gets about 80% of it before anyone else comes to download. Then you show up with your cable connection, and instead of being at the mercy of the upload speed of the dialup guy, you have access to 80% of the file at the plentiful upload speed of the T1 guy. And of course Furthur knows to hook you up to the fastest open slot available when you come along.
The result of this is that the underlying host and network shape becomes transparent: you just see a list of shows to download, you start downloading one, and all this stuff happens in the background. The longer everyone stays connected to the network, the more efficient it becomes, because it has more time to structure itself with the faster folks in the "middle" and the slower ones on the "outside".
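Why putting fast uploaders near the top helps can be shown with a toy model. This is not PCP's actual algorithm, which the comment does not spell out; it only assumes a single chain where each peer receives from the one before it, and it ignores download caps. The peer names and speeds are invented.

```python
from itertools import accumulate


def arrange_chain(upload_kbps: dict) -> list:
    """Order peers fastest uploader first (the arrangement described above)."""
    return sorted(upload_kbps, key=upload_kbps.get, reverse=True)


def receive_rates(chain: list, upload_kbps: dict) -> dict:
    """Each peer after the first receives at the slowest upload rate upstream of it."""
    slowest_so_far = list(accumulate((upload_kbps[p] for p in chain), min))
    return dict(zip(chain[1:], slowest_so_far[:-1]))


uploads = {"dialup": 33, "t1": 1544, "cable": 128}

chain = arrange_chain(uploads)
sorted_rates = receive_rates(chain, uploads)
unsorted_rates = receive_rates(["dialup", "cable", "t1"], uploads)

print(chain)           # ['t1', 'cable', 'dialup']
print(sorted_rates)    # {'cable': 1544, 'dialup': 128}
print(unsorted_rates)  # {'cable': 33, 't1': 33}
```

With the dialup user at the top, everyone below is stuck at dialup speed; with the T1 at the top, only the dialup user is limited, and only by the cable user's uplink.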
Over at furthurnet, the current record is having 71 people on a downloading chain. Combine PCP with the Anteloping and you can have some serious improvement over "dumb" p2p.
I won't even go into the benefits of the md5 checking furthur does...
If you're shipping around small files, like MP3s, there are lots of transfer systems that can do the job. But the Lossless Compression movement for music means that a concert tape is typically a few hundred megabytes large, maybe 1/3 the size of the uncompressed original, so it takes much longer to download, just as ISOs for Linux distributions are large. In that environment, you can't always depend on connections being up for a long enough time, so you need to be able to download parts of files, and swarming systems like BitTorrent help a lot.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Heh. Get this, I have a box running as a NAT-router only (ie, no firewall) with zonealarm on my desktop. :)
(the reason I'm doing this is mostly because all I need the NAT box for is to share a single IP, and having a real firewall on that got to be too much of a hassle with things like Starcraft and Quake) The NAT box is a P100 running FreeBSD 4.3-Release with natd and ipfw. More interesting is that my NAT box is currently behind *another* NAT box that acts as the gateway router for my ADSL service, also running FreeBSD. (I work for my ISP, which is why I know this
When my download started from the site, it was at ~150Kbps. (pretty much the max for my 1.5M/640K ADSL) It slowed down a little as the upstream bandwidth went up, but that was fine, as it consistently stayed at over 100Kbps.
I have a question though. How the hell is it that my upload is working at all? I'm on a network so private that it's scary.
"No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
Xolox, a gnutella client for Windows, has been out for a long time now, and it allows a client to serve parts of an *incompletely* downloaded file. Never underestimate the abilities of GNUtella.
From the CodeCon website: Welcome, Slashdot visitors! The CodeCon site seems to be holding up just fine, though we've removed our graphics as a precaution. The CodeCon mp3s are also holding up well due to BitTorrent. Please report any client-side problems you encounter.
;-)
I just love this, especially on a site that's about how to handle bandwidth
How will tit-for-tat leech resistance work if someone has an Asymmetric DSL connection? If my download bandwidth is 768 kbps but my upload bandwidth is technically limited to 128 kbps (as is common with many DSL offers for private home users), will the leech resistance feature think I'm guilty?
Idempotent operation: Like MS software, whether you run it once or often, that doesn't make it any better.
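One way to think about the question is a policy that ranks peers by what they send you, relative to each other, rather than by each peer's own upload-to-download ratio. The sketch below is an assumed policy for illustration only; the thread does not describe BitTorrent's actual rule, and the function name, slot count, and rates are hypothetical.

```python
def choose_unchoked(received_kbps: dict, slots: int = 2) -> set:
    """Upload only to the `slots` peers that are currently sending us the most."""
    ranked = sorted(received_kbps, key=received_kbps.get, reverse=True)
    return set(ranked[:slots])


# Rates at which three peers are uploading to us right now.
rates = {"adsl_user": 128, "leech": 0, "cable_user": 40}
favoured = choose_unchoked(rates)
print(sorted(favoured))  # ['adsl_user', 'cable_user']
```

Under a policy like this, an ADSL user who uploads at a full 128 kbps is judged against other peers' contributions, not against their own 768 kbps downlink, so the asymmetry alone would not mark them as a leech.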
If there emerged a distributed downloading standard that was generally agreeable and became better and better the more servers that participated, I would think it would be great to see a plug-in for popular web serving software to support it.
Think about something like this: if you were running a site under Apache and had the option of installing a plug-in that would participate in the file sharing network as a server node. The plug-in would let you allocate a defined amount of disk storage and a defined amount of bandwidth. Then sysadmins who felt this was a good thing could just turn on their participation.
Sure it wouldn't be much at first, but you might get a very large base of servers with good connectivity all playing a role in the system. I think it would help it scale.
Just a thought. I wonder if anyone has considered a scheme like this.
I challenge you to find me any two sets of data with the same md5.
I've read the site and I was at CodeCon for the presentation.
Having said that, most of these comments are ignorant tripe. Before you post, you might want to take a look at the site and read about what actually goes on in BitTorrent. This will help you avoid looking like an ignoranus.
"Let him go, Ralph. He knows what he's doing." --Otto Mann (simpsons)
Sounds like this is similar to what MojoNation is/was trying to do. Their site doesn't seem to be responding right now, but here's the Google Cached version of the technical docs.
fencepost
just a little off
Correct. "Packet passing" is more realtime swarming, if that makes any sense, than Xolox's "parts sharing".
*shrugs*... the truth is truth whether or not you provide references. But if you want references, check out RSA's own FAQ. MD4 is definitely broken, and MD5 might have some significant weaknesses. It's likely to be brute forceable with reasonable resources.
Of course, they're also still in development, so Vorbis has that against it. But don't give up on it yet.
Stating on Slashdot that I like cheese since 1997.