Faster P2P By Matching Similar Files?
Andreaskem writes "A Carnegie Mellon University computer scientist says transferring large data files, such as movies and music, over the Internet could be sped up significantly if peer-to-peer (P2P) file-sharing services were configured to share not only identical files, but also similar files.
"SET speeds up data transfers by simultaneously downloading different chunks of a desired data file from multiple sources, rather than downloading an entire file from one slow source. Even then, downloads can be slow because these networks can't find enough sources to use all of a receiver's download bandwidth. That's why SET takes the additional step of identifying files that are similar to the desired file... No one knows the degree of similarity between data files stored in computers around the world, but analyses suggest the types of files most commonly shared are likely to contain a number of similar elements. Many music files, for instance, may differ only in the artist-and-title headers, but are otherwise 99 percent similar.""
Wait...what?
LOTS of overhead just to find the chunks.
The article talks about 16 KB chunks, which for a DVD image would take more chunks than the torrent protocol currently allows.
(If I remember rightly, torrents can have a max of 65535 chunks and some servers prevent huge .torrent files which contain the chunk breakpoints anyway)
The client would spend more time communicating its chunk lists around than actually getting data.
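To put rough numbers on that objection, here is a back-of-the-envelope sketch. The 4.7 GB image size and the 20-byte SHA-1 digest per chunk are my assumptions, and the 65535 figure is only the limit as recalled above, not something checked against the protocol spec.

```python
# Rough chunk-count arithmetic for the objection above.
DVD_IMAGE_BYTES = 4_700_000_000  # assumption: nominal single-layer DVD capacity
CHUNK_BYTES = 16 * 1024          # the 16 KB chunk size cited from the article
RECALLED_CHUNK_LIMIT = 65_535    # the figure recalled above, unverified
HASH_BYTES = 20                  # assumption: one SHA-1 digest per chunk


def chunk_count(file_bytes, chunk_bytes):
    """Number of fixed-size chunks needed to cover the file (ceiling division)."""
    return (file_bytes + chunk_bytes - 1) // chunk_bytes


chunks = chunk_count(DVD_IMAGE_BYTES, CHUNK_BYTES)
hash_list_bytes = chunks * HASH_BYTES

print(chunks)                         # 286866
print(chunks > RECALLED_CHUNK_LIMIT)  # True
print(hash_list_bytes)                # 5737320 bytes of hashes, before any other metadata
```

Under those assumptions the chunk list is more than four times the recalled limit, and the hash list alone runs to several megabytes.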
liqbase
The only thing I use the file sharing networks for is to download new images of FreeBSD and Linux using BitTorrent.
The last thing I want is a "similar" file.
What would be a "similar" file to a FreeBSD ISO? It would either be a corrupted file or one with an introduced exploit.
Because it gets you published and, thus, increases your chance for tenure, that from which all blessings flow.
I think there is a world market for maybe five personal web logs.
What the parent is saying can be summarized with a simple example:
;)
A 200MB, 30min video that was compressed at 1000kbps DivX is not the "same file with minor changes" as a 200MB, 30min video that was compressed at 900kbps DivX. They ARE different files and should be treated as such. You also can't deduce anything from their filenames, play length, or any other characteristic, so how would you determine which ones can go together and which ones can't? I did not see codecs or compression mentioned at all in the article.
This is the fundamental problem here. You can't recombine video and audio files unless they ARE the same file. You have to account for different bitrates, compression ratios, and who knows what else (I am no expert in this area but this seems obvious...).
Lemme guess -- the mp3s mentioned in the article were ALL encoded at the same bitrate, right? If not, then please correct me, because now you have my attention.
Which is why you would download a .torrent-like file specifying which of those you want. Then you would download the 99.9999% that agrees from any/all of them (essentially making your personal swarm temporarily bigger), and download the missing .0001% from the version you requested.
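A minimal sketch of that idea, for anyone who wants it spelled out. This is not SET's actual algorithm: the fixed-size chunking, the SHA-1 hashing, and the names `chunk_hashes` and `plan_sources` are all my own assumptions for illustration. It also only works when the differing header leaves the rest of the file aligned on chunk boundaries.

```python
import hashlib


def chunk_hashes(data, chunk_size):
    """Hash each fixed-size chunk of a file (hypothetical descriptor format)."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def plan_sources(wanted_hashes, sources):
    """For each chunk of the wanted file, list every source holding an identical chunk.

    `sources` maps a source name to the chunk hashes of the file it offers.
    """
    index = {}
    for name, hashes in sources.items():
        for h in set(hashes):
            index.setdefault(h, []).append(name)
    return [index.get(h, []) for h in wanted_hashes]


# Toy data: two "files" that differ only in a same-length header chunk.
wanted = b"HDR1" + b"AAAA" + b"BBBB" + b"CCCC"
similar = b"HDR2" + b"AAAA" + b"BBBB" + b"CCCC"

sources = {
    "exact": chunk_hashes(wanted, 4),
    "similar": chunk_hashes(similar, 4),
}
plan = plan_sources(chunk_hashes(wanted, 4), sources)

print(plan[0])  # ['exact']             -> the header must come from the requested version
print(plan[1])  # ['exact', 'similar']  -> shared chunks can come from either swarm
```

Every chunk is still checked against the hash list of the file you asked for, so a "similar" source can only contribute chunks that are byte-for-byte identical.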
This is very straightforward. I don't see how people can misunderstand this idea.
After all, I am strangely colored.