Use BitTorrent To Verify, Clean Up Files
jweatherley writes "I found a new (for me at least) use for BitTorrent. I had been trying to download beta 4 of the iPhone SDK for the last few days. First I downloaded the 1.5GB file from Apple's site. The download completed, but the disk image would not verify. I tried to install it anyway, but it fell over on the gcc4.2 package. Many things are cheap in India, but bandwidth is not one of them. I can't just download files > 1GB without worrying about reaching my monthly cap, and there are Doctor Who episodes to be watched. Fortunately we have uncapped hours in the night, so I downloaded it again. md5sum confirmed that the disk image differed from the previous one, but it still wouldn't verify, and fell over on gcc4.2 once more. Damn." That's not the end of the story, though — read on for a quick description of how BitTorrent saved the day in jweatherley's case.
jweatherley continues: "I wasn't having much success with Apple, so I headed off to the resurgent Demonoid. Sure enough they had a torrent of the SDK. I was going to set it up to download during the uncapped night hours, but then I had an idea. BitTorrent would be able to identify the bad chunks in the disk image I had downloaded from Apple, so I replaced the placeholder file that Azureus had created with a corrupt SDK disk image, and then reimported the torrent file. Sure enough it checked the file and declared it 99.7% complete. A few minutes later I had a valid disk image and installed the SDK. Verification and repair of corrupt files is a new use of BitTorrent for me; I thought I would share a useful way of repairing large, corrupt, but widely available, files."
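The trick works because a .torrent file carries a SHA-1 hash for every fixed-size piece of the payload, so a client can recheck any local copy piece by piece. Here is a minimal sketch of that recheck, not Azureus's actual code: the function names, the 1024-byte piece length, and the sample data are assumptions made for illustration.

```python
import hashlib

def piece_hashes_for(data: bytes, piece_length: int) -> list[bytes]:
    """SHA-1 digest of each fixed-size piece, as a torrent would record them."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]

def find_bad_pieces(data: bytes, piece_length: int,
                    expected: list[bytes]) -> list[int]:
    """Return the indices of pieces whose digest does not match the expected one."""
    bad = []
    for index, digest in enumerate(expected):
        piece = data[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(piece).digest() != digest:
            bad.append(index)
    return bad

# Hypothetical demo: a 16 KiB "disk image" with one flipped byte.
piece_length = 1024
good = bytes(range(256)) * 64
hashes = piece_hashes_for(good, piece_length)

corrupt = bytearray(good)
corrupt[5000] ^= 0xFF  # byte 5000 lives in piece 5000 // 1024 == 4

bad = find_bad_pieces(bytes(corrupt), piece_length, hashes)
percent_complete = 100 * (len(hashes) - len(bad)) / len(hashes)
print(bad, percent_complete)
```

Only the pieces listed in `bad` would need to be fetched again, which is why a mostly intact download shows up as nearly complete.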
I've used bittorrent for this purpose many times in years gone by.
:)
Especially with our slow links, or worse yet, on dialup (if I go enough years back) in Australia.
Before bittorrent I would use rsync. That required me to download the large file to a server in the US on a fast connection, then rsync my copy against the server's copy to fix whatever was corrupt in mine.
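The idea behind that repair can be sketched in a few lines. This is a deliberately simplified stand-in, not rsync's real algorithm: it compares blocks at fixed offsets only, whereas rsync uses a rolling checksum that also copes with shifted data. The block size, the function names, and the `fetch_block` callback are hypothetical.

```python
import hashlib
from typing import Callable

BLOCK = 4096  # hypothetical block size

def block_digests(data: bytes, block: int = BLOCK) -> list[str]:
    """Digest of each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]

def repair(local: bytearray, remote_digests: list[str],
           fetch_block: Callable[[int], bytes], block: int = BLOCK) -> int:
    """Overwrite only the blocks whose digest differs; return bytes fetched.

    Assumes both copies have the same length.
    """
    fetched = 0
    local_digests = block_digests(bytes(local), block)
    for index, (mine, theirs) in enumerate(zip(local_digests, remote_digests)):
        if mine != theirs:
            fresh = fetch_block(index)
            start = index * block
            local[start:start + len(fresh)] = fresh
            fetched += len(fresh)
    return fetched

# Hypothetical demo: the "server" holds the good copy, mine has two bad bytes.
server_copy = bytes(i % 251 for i in range(20000))
my_copy = bytearray(server_copy)
my_copy[100] ^= 1    # damages block 0
my_copy[9000] ^= 1   # damages block 2

fetched = repair(my_copy, block_digests(server_copy),
                 lambda i: server_copy[i * BLOCK:(i + 1) * BLOCK])
print(fetched, bytes(my_copy) == server_copy)
```

Only the two damaged blocks cross the slow link; the rest of the file never has to be downloaded again.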
It works beautifully.
You can tell how powerful someone is by the magnitude of the crime they can commit and be able to get away with.
Ok, you load torrentB in your favorite BitTorrent client and start it up. It will automatically create 0-sized files with the names in filesetB (at least, all clients I know do that). Stop the transfer of torrentB, and replace the 0-sized files in filesetB with the corresponding files from filesetA (this may require some renaming). When you restart torrentB, your BitTorrent client will recheck the whole of filesetB and keep the valid parts, so it does not download them again. Voilà! You have migrated files from one torrent to another.
Note: You should make sure that the files you are substituting in are the same files you want to download through torrentB or, at least, keep a copy around until you see that the restart check accepts most of their contents.
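The substitution step can be scripted. The sketch below assumes you supply the name mapping yourself; `substitute_files`, the directory names, and the file names are all hypothetical. It copies rather than moves, so the originals stay around until the recheck has accepted them, as the note above advises.

```python
import shutil
import tempfile
from pathlib import Path

def substitute_files(download_dir: Path, mapping: dict[str, Path]) -> list[Path]:
    """Copy each existing file over the placeholder the client created for it.

    mapping: name inside torrentB's download directory -> existing file
    from filesetA. Sources are copied, never moved or deleted.
    """
    replaced = []
    for target_name, source in mapping.items():
        target = download_dir / target_name
        if source.is_file():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            replaced.append(target)
    return replaced

# Hypothetical demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old = root / "filesetA" / "video-old-name.avi"
    old.parent.mkdir()
    old.write_bytes(b"mostly valid data")

    placeholder = root / "filesetB" / "video.avi"
    placeholder.parent.mkdir()
    placeholder.touch()  # the 0-sized file the client created

    done = substitute_files(root / "filesetB", {"video.avi": old})
    substituted_size = placeholder.stat().st_size
    original_kept = old.exists()

print(len(done), substituted_size, original_kept)
```

Run it while the torrent is stopped, then restart the torrent (or force a recheck) so the client hashes the substituted files.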
First of all, scene releases are _never_ compressed; the archiving is always done with the -0 argument, which makes it basically equivalent to the Unix split program. If a file is to be compressed, it is done with a zip archive, and the zip archive is placed inside the rar archive. This is because rar archives can be created/extracted easily with FOSS software, but cannot easily be de/compressed. This was more of an issue before Alexander Roshal released source code (note: not FOSS) to decompress rar archives.
Second, people often have parts of, or complete, scene releases and they are unwilling to unrar them (often because it's an intermediary, like a shell account somewhere where law isn't a problem).
Third, people follow "the scene" and try to download the exact releases that are chosen by the social customs of the scene (I am not going to detail those here); thus, "breaking up" (i.e., altering) the original scene release is seen as rude.
Fourth, the archive splitting is in precise sizes so that fitting the archives onto physical media works better; typically the archive size is some rough factor of 698 MB, ~4698 MB or ~8500 MB.
Fifth, archives are split due to poor data integrity on some transfer protocols (though this is largely historical nowadays); redownloading a corrupted 14.3MB archive is easier than redownloading a 350MB file.
Sixth, traffic at this scale is measured in terabytes, with some releases being tens, or sometimes hundreds, of gigabytes in size. Thus, there are efficiency arguments for archive splitting: effective use of connections, limited efficiency of software (sftp scales remarkably poorly, though that is beginning to change, not that sftp is used everywhere), use of multiple coordinated machines and so on. This is an incomplete list of reasons; it is almost as though every time a new challenge is presented to the scene, splitting in some way helps to solve it.
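To put rough numbers on the fifth point, here is a small worked calculation using the 14.3MB and 350MB figures quoted above. The helper name `split_plan` is hypothetical, and the sketch assumes a single corrupted part.

```python
import math

def split_plan(total_mb: float, part_mb: float) -> tuple[int, float]:
    """Return (number of parts, fraction of the release to refetch
    when exactly one part turns out to be corrupt)."""
    parts = math.ceil(total_mb / part_mb)
    redownload_fraction = min(part_mb, total_mb) / total_mb
    return parts, redownload_fraction

parts, fraction = split_plan(350, 14.3)
print(parts, f"{fraction:.1%}")  # 25 parts, about 4.1% refetched
```

With these figures, one bad part costs about 4% of the transfer instead of the whole file.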
AC because I'm not stupid enough to expose my knowledge of this either to law enforcement, or to the scene (who might just hand me over for telling you this - it has been done). Suffice to say that this is more complex than you understand, and that even this level of incomplete explanation is rare.
The first rule of Usenet: don't talk about Usenet.
( Redundancy is ) ^ n