PDTP - The Best of Both FTP and BitTorrent?
ikewillis writes "For a while I've been following the development of PDTP (Peer Distributed Transfer Protocol), which is trying to merge the concepts of FTP and BitTorrent. This sounds like it could be useful for apt-get repositories or other high-demand FTP sites. It's designed to be used as part of scalable networks which could replace manual selection of FTP mirrors. It also supports a number of other nifty features like cryptographic file signatures. Isn't it about time we ditched FTP for something better?"
Next thing it'll be transmitting voice and pictures over radio waves... AS IF!
A feeling of having made the same mistake before: Deja Foobar
I feel sorry for these people. See, this isn't your typical slashdotting... It's a slashdotting that comes after eighteen consecutive nonsense stories being posted over twelve hours on the US April Fool's Day.
So, their chance to build a reputation is going to be damaged by the fact that anybody reading Slashdot today has already given up on finding anything useful, and will be evaluating them as a joke that they're "not getting" rather than as a proposed networking scheme.
Furthermore, the geek world is bored today by Slashdot's denial-of-normal-service throughout the day. So, once word leaks out that this is a real and normal story, they're going to get all of the pent up slashdotting force applied to their server.
Simon, you should have started your set tonight with an NY Times article or two. That would have been a suitable transition between nonsense content and factual content, since NYT operates in that murky space and has a suitable web setup to absorb a larger-than-usual slashdotting. I'm sure the people at PDTP would have not minded at all if their moment in the sun had come an hour later tonight.
BannedMusic.org made a BitTorrent wrapper that installs the application and then automatically launches the download, they call it an "easy downloader" and have instructions and a script for sites that want to make their own. Makes it a *lot* easier for sites to give out big files to non-techy audiences.
Quoth the Debian Troll:
True story.
R.A.I.D. == redundant array of intolerable diversions
or at least on april fool's day....
...mirrors would need to be in sync at all times for this to work. Otherwise your PDTP client is only able to download from the mirrors that are in sync, or worse, will get some chunks from files that aren't up-to-date, causing problems.
Unfortunately, it's (almost) impossible to mirror new files instantaneously, so mirrors are never all in sync, all the time.
Isn't it about time we ditched FTP for something better?
Isn't it about time we ditched floppy disks for something better?
Isn't it about time we ditched IDE drives for something better?
Isn't it about time we ditched x86 for something better?
Isn't it about time we ditched Microsoft Windows for something better?
Isn't it about time we ditched CDs for something better?
Isn't it about time we ditched telnet for something better?
Isn't it about time we ditched CRTs for something better?
Isn't it about time we ditched 20-year-old TV sets for something better?
Isn't it about time we ditched COBOL for something better?
Isn't it about time we ditched BASIC for something better?
Isn't it about time we ditched SCO Unix for something better?
Isn't it about time we ditched DOS for something better?
Isn't it about time we ditched Dubya for something better?
My point is that there is a lot of very old crap out there that should be replaced, but is going to get used and keep getting used for years to come.
Interesting... this could bring piracy back to the ftp world, rather than the emule appz or bittorrent world where it's easier to get caught.
Heh... a few years ago, /. made an April Fools' joke about Python and Perl merging into a new language called "Parrot". Apparently, some people liked the idea, and started the project. I have no idea of its status, though :-(
Doh!
We already have. It is called SCP
"Weapons should be hardy rather than decorative" - Miyamoto Musashi
I think that goes for OS's too
Currently at v0.1.0, awaiting Something Big in Perl 6, it would seem.
You can never go home again... but I guess you can shop there.
There are several P2P research projects that are looking at building reliable and scalable P2P systems.
Take a look at Tapestry, and Chord (and read some of the papers) to understand the issues involved in providing scalable and high performance P2P services. Not only is scalable search and overlay graph connectivity an issue, but also node failure and short session times of P2P nodes.
Additionally, when you actually handle the issue of downloading data, building application-level multicast trees to distribute the data efficiently on a large scale is not easy. Two papers from SOSP '03, SplitStream and Bullet, address that issue.
"...Beer..."
"I am Bittorrentholio...I need PDTP for my Torrenthole!"
"Heh heh heh heh...file transfers RULE!"
Just in case... here's a mirror. Always glad to lend a hand.
Entrepreneur : (noun), French for "unemployed"
Answer to this is the same argument that I've heard sometimes applied to open source:
If we all contribute a little, then the cost to all of us is that much less.
----- Documentation is worth it just to be able to answer all your mail with 'RTFM' - Alan Cox.
I'm waiting for boot disks that fire up a peer to peer client for installing your os, and updates. Debian would be a great start, it would hugely reduce the load of the servers. Also Fedora, the BSDs, etc.
Yes, you can already do BitTorrent for the ISO, but that is its own kind of waste and hassle.
Some day.
Plato seems wrong to me today
SuprNova, the best torrent web site ever, is going Japanese.
:-P
I swear, this has nothing to do with today's date.
There's no problem with that: don't share your bandwidth with anyone, and no one shares bandwidth with you. Then you can only download from one source with limited resources. Other people who share can download from many sources (each one possibly with far fewer resources) that together provide a much greater total bandwidth. What's more, when more people are downloading, they also download faster, whereas you, who don't want to share, have to slow down when more people who don't share start to download... And besides, most people's connection limits are download limits, not upload limits :-P
And unfortunately, it's windows only, and still requires installing the software, which is 3MB+.
What is needed is something along the lines of a very small, very simple Java client or a browser plugin. Azureus is Java, but is huge and has massive feature bloat for the purposes of just downloading (and sharing back) one file. However, Bram and others don't seem terribly interested in expanding possibilities; a Mac developer offered up numerous improvements to the BitTorrent team for the Mac client (which among other things is based on 3.3a, not 3.4.1, weeks after 3.4.1 was released) and was rewarded with deafening silence.
The bittorrent protocol is http based. It's extensively documented on the bitconjurer website. Cmon folks, let's at least see a mozilla plugin or something! :-)
Please help metamoderate.
am sick of trying to determine the april fool day jokes from the real stories.
So if, for example, I write this neat little GPL'd app that everyone loves, and release it as open source, I should be responsible for hosting the file server for everyone? What if hundreds of thousands of people use it every day, and a new patch comes out? Should I have to buy a T-1 (or something bigger) that costs an arm and a leg, to provide the patch for a free program to others with no income for me? Or should I ask others to help out with their extra bandwidth, and get a few seeders out there with BitTorrent and run the tracker on the DSL line I have? I could pay $20 a month for a metered tiered connection in my town, but I pay $50 for an "unlimited" one (notice the quotes). I know that not everywhere has these kinds of services, but you don't have to leave the torrent open forever either, or you can just leave the upload at 1k/s or something. It might slow down your download, but you're still going to get access to the file.
What are we going to do tonight Brain?
and you, for one, maybe don't deserve the bandwidth you get for _free_ downloads. your favorite linux distro has to pay the bills sometime too, and i don't imagine that you've actually paypal'd anyone to host their isos. so contribute (either by cash or bandwidth) or shut up.
I thought something better was sftp. As for distributions... why not HTTP? Set up one reflector that dynamically kicks out redirects as new mirrors come online. This is much better, as we have a ton of clients already installed (curl, wget, etc.). We also have load balancing, DNS round robin, authorization, and security (read: SSL) well defined in the protocol. All we need is a CGI script to kick out the redirects, and another that will make signature files based on the publicly available SSL cert. Whamo, all the same features... and we didn't have to reinvent the wheel.
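For what it's worth, the "reflector" part of this idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the mirror URLs are made up, the names `MirrorPool` and `make_handler` are hypothetical, and the selection is plain round robin with no health checks, sync checks, or signatures.

```python
from http.server import BaseHTTPRequestHandler


class MirrorPool:
    """Round-robin list of mirror base URLs (all URLs here are hypothetical)."""

    def __init__(self, mirrors):
        self.mirrors = [m.rstrip("/") for m in mirrors]
        self._next = 0

    def add(self, base_url):
        # Call this when a new mirror comes online.
        self.mirrors.append(base_url.rstrip("/"))

    def pick(self, path):
        # Rotate through the mirrors and return the full URL for this path.
        base = self.mirrors[self._next % len(self.mirrors)]
        self._next += 1
        return base + path


def make_handler(pool):
    """Build a request handler class that redirects every GET to a mirror."""

    class Redirector(BaseHTTPRequestHandler):
        def do_GET(self):
            # Answer with a 302 pointing at the mirror chosen for this request.
            self.send_response(302)
            self.send_header("Location", pool.pick(self.path))
            self.end_headers()

    return Redirector


pool = MirrorPool(["http://mirror1.example.org", "http://mirror2.example.org"])
```

To actually serve it, something like `HTTPServer(("", 8080), make_handler(pool)).serve_forever()` from `http.server` would do. wget follows the redirect by default; curl needs `-L`.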
That's just great! Now the media will consider FTP a movie-stealing method. Then the MPAA will call a ban to all FTP servers!
Question:
"Skyfire is using a derivative of the Apache License. Doesn't that preclude linking with Qt as the Apache License is incompatible with the GPL?"
Answer:
From the FAQ page: "Qt/X11 is dual licensed under both the GPL and the QPL. The Apache License, while incompatible with the GPL, is not incompatible with the QPL, so when Skyfire is linked with Qt/X11 the terms of the QPL apply. Qt Non-Commercial Edition for Windows has a separate set of license terms which apply to all Windows builds of Skyfire." (emphasis added)
Isn't this license a poor one? Aren't they breaking sourceforge.net rules by using an OSI-unapproved license?
Or maybe I don't know what I am talking about. PLEASE correct me if I am wrong. The operating systems are not going to chuck FTP so soon, nor are they going to include a torrent client as a default program. :-)
I think there's still a while till we ditch FTP and move on to something else completely. Torrents and other p2p stuff are good, but only if you take the effort to get them. What about the masses who want to click and go? It won't happen till they can right-click and it says "Save torrent as".
Lord of the Binges.
HTTP does all that. There are well-defined and well-implemented (Squid) cache-tree protocols for HTTP. This is very old stuff. FTP is just plain obsolete. It adds *zero* value over HTTP, and it's harder to use. Trying to bring FTP up to the standards of HTTP is a futile effort too, since HTTP is mature on many more dimensions, and does not suffer from the gross defects of the more primitive FTP, such as transmission of port numbers as stream data.
-I like my women like I like my tea: green-
... this QT GPL project was ever done, we could just ignore any such issues for ever and ever: http://kde-cygwin.sourceforge.net/qt3-win32/index.php
The guy at autopackage.org was attempting something similar to this, but for package distribution... It looks like with this protocol you just need to set up all the OSS servers with packages on them and boom... you have one huge honkin' FTP site with all the packages necessary for all things... Then you just need to download a description file, and the package manager can grab all the packages with a few PDTP gets and you're done... goodbye RPM hell.
I am the Alpha and the Omega-3
BitTorrent suffers another problem in that the only usable implementations are currently only available in Python. The primary problem with Python is its excessive resource usage
Really? I'm currently running four throttled BT downloads on a PII-350 w/64MB. Max CPU usage is 8%, load average 0.25. If you're really that bothered see here for an alternative.
but other problems arise such as integration of the Python implementation into a native GUI frontend for a given platform
Ever heard of WxGtk? RPMs for most distros, if it wasn't part of your default install.
as well as the need to bundle the Python runtime with the BitTorrent client on most platforms as few deployed systems have a Python runtime available
Now this is just silly. I don't think there is a Linux distro which doesn't include Python libraries, and even for Windows it's a single small executable. Besides (correct me if I'm wrong), isn't one of the reasons for using Python that it has bounds-checking on arrays and is therefore proof against the cause of most exploits, the buffer overrun?
The main problem with the "BitTorrent" idea is that it gets associated with "illegal" actions.
I was on the "Desert Combat" Testers team and we had to download 600-700mb patches once a week... off one ftp server.
When i mentioned the idea of using a modified BitTorrent client/server to ease the strain on the server i was told we could not use "illegal" tools.
First educate the public and then start to think about upgrading things to help the internet not crash and burn.
Phil
A psychopath can't tell the difference between right and wrong. A sociopath knows the difference - he just doesn't care.
Hmm, I was thinking about this earlier... does anyone actually have any statistics for how much server transfer bandwidth was saved by distributing a popular file (latest anime release or something) over BitTorrent? How much does it actually help?
Hello. I'm the project manager for PDTP, and author of Skyfire. There's nothing wrong with the QPL whatsoever, unless you mind the fact that it's GPL incompatible (but then again, so is the Apache license). The QPL is an OSI Approved license, so there's nothing to worry about.
But there are people who keep sharing even after their download completes... And you seem to be missing an important point. Even if you have a T1, if there are lots of people downloading through FTP, you have to share the bandwidth with all of them. With a p2p solution, people stop using your bandwidth and share between themselves, meaning you can serve your files faster and with a greater total bandwidth. When you say "In BitTorrent, your download speed is theoretically capped to your upload speed", you are assuming that the FTP server can serve you faster than you can upload. And that's not the case when there are lots of people trying to access the files in question (never heard of pages being slashdotted?).
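A back-of-the-envelope model of this argument, with the caveat that it is heavily idealized: it assumes the server's upstream is split evenly among clients, that every peer uploads at its full rate, and that the figures (a 1544 kbit/s T1, clients with 1500 kbit/s down and 128 kbit/s up) are just plausible guesses. The function names are made up for the illustration.

```python
def ftp_rate(server_up, client_down, n):
    """Per-client rate when n clients split one server's upstream evenly."""
    return min(client_down, server_up / n)


def swarm_rate(server_up, client_down, client_up, n):
    """Per-client rate when every client also uploads to the others.

    Idealized: total upload capacity (server + all peers) is shared evenly.
    """
    return min(client_down, (server_up + n * client_up) / n)


# Assumed figures in kbit/s: T1 server, clients with 1500 down / 128 up.
for n in (1, 10, 100, 1000):
    print(n,
          round(ftp_rate(1544, 1500, n), 1),
          round(swarm_rate(1544, 1500, 128, n), 1))
```

In this model both sides of the argument show up: as the number of downloaders grows, the swarm rate settles near each client's upload rate, while the FTP rate keeps shrinking toward zero.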
Please don't use straight SHA1 - it requires downloading the entire file to verify.
Bittorrent and some other file sharing networks split the file into chunks and keep metadata with the hashes of chunks. The problem with this idea is how big to make the chunks: too big and you need to download a big chunk before you can verify. Too small and the list of hashes itself takes too long to download (the hashes are what makes .torrent files relatively big).
I think the solution should be to use hash trees. Split the file into relatively small chunks (1k?) and calculate their hashes. Now take every two consecutive hashes and hash them. Repeat with the hash results from the previous step until you have a tree with a single hash at its root. The root hash represents the entire file just like an MD5 or SHA1 sum. The difference is that with a small amount of metadata as hints you can verify any part of the file without downloading the entire file. All you need is a short (log n) chain of hashes leading to the root hash. The server will trickle the hash information interleaved with the download and the client will verify it on the fly and never need to write a single byte to the disk before it's cryptographically verified.
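A minimal sketch of the hash tree idea in Python, to make the "short chain of hashes" concrete. The assumptions: SHA1 and 1k chunks as suggested above, an odd hash at the end of a level is paired with itself (just one possible convention), and the function names are invented for the example. It is not taken from any existing protocol.

```python
import hashlib


def h(data):
    return hashlib.sha1(data).digest()


def build_tree(chunks):
    """Return a list of levels: level 0 is the chunk hashes, the last is [root]."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd count: pair the last hash with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Sibling hashes needed to link chunk `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last hash of an odd-sized level was paired with itself
        proof.append((level[sibling], index % 2))  # flag is 1 if we are the right child
        index //= 2
    return proof


def verify(chunk, proof, root):
    """Check one chunk against the root using only its chain of sibling hashes."""
    node = h(chunk)
    for sibling, is_right in proof:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root


# Five distinct 1k chunks of sample data.
chunks = [bytes([i]) * 1024 for i in range(5)]
levels = build_tree(chunks)
root = levels[-1][0]
print(verify(chunks[3], proof_for(levels, 3), root))
```

The client only needs the root hash from a trusted source. Each chunk then arrives with its own chain of sibling hashes and can be checked before anything is written to disk.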
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
Erasure codes have the property that for a file N packets long, you calculate some number K of coded packets, and the receiver needs only to receive N of any combination of coded or source packets to be able to recreate the original file.
For instance, I could have a file 100 packets long, and calculate 25 coded packets, then I could receive, for example, 80 packets of the original file and 20 coded packets (80+20=100) and recalculate the entire original file. Or 90 original and 10 coded packets. Or 75 original and 25 coded packets...etc.
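The "any N of N+K" property can be demonstrated with a toy code in Python. To be clear about what this is: a Reed-Solomon-style polynomial code over the integers mod 257, not Tornado codes, with one small number standing in for each packet. The function names are invented for the example, and real implementations work on whole packets in GF(2^8) with much faster algorithms.

```python
P = 257  # prime just above 255, so every byte value is a field element


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial (mod P) through the (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def encode(source, k):
    """Return k coded symbols; source symbol i sits at position i, coded ones follow."""
    n = len(source)
    points = list(enumerate(source))
    return [interpolate_at(points, x) for x in range(n, n + k)]


def decode(received, n):
    """Rebuild all n source symbols from any n distinct (position, value) pairs."""
    points = received[:n]
    return [interpolate_at(points, x) for x in range(n)]


source = [(7 * i + 3) % 256 for i in range(100)]  # 100 one-byte "packets"
coded = encode(source, 25)                        # 25 coded packets (values 0..256)

# Lose source packets 0..19; receive the other 80 plus 20 of the coded ones.
received = [(i, source[i]) for i in range(20, 100)]
received += [(100 + j, coded[j]) for j in range(20)]
recovered = decode(received, 100)
print(recovered == source)
```

The decoder here does quadratic work for every recovered symbol, which gives some feel for why this kind of code was considered expensive for large files.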
Before Tornado codes, this was computationally difficult to do in practice for large files. Typical use of Tornado codes is to just send all coded packets, and receivers can "fill their cup from the fountain, dipping it into the stream whenever they want" to get any N packets to recreate the original file. Obviously, this makes a lot of sense in a multicast domain.
But for distributed file transmission, I'm not sure this makes sense. You would need to collect N different packets. If you got the same coded packet more than once, it would not help you.
Keeping track of which unduplicated coded packets you have and need is just as difficult as keeping track of which unduplicated original packets you have and need.
Packet loss is not really much of a problem if you are using TCP. On the other hand, it is a problem when you are doing multicast UDP over the Net or satellite.
So overall I'd say file-level FEC using erasure codes is pretty much useless for distributed file transmission.
Anyone care to differ?
Yes, it is. However, SSH has been around for a significant time and still hasn't replaced telnet, even given the horrific security holes in telnet.