Cringely's P2P Backup Idea
gewg_ writes "If Napster and BitTorrent had a baby, would it be Baxter? As a follow-on to Cringely's last column, where he talked about having a backup strategy in the wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."
But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well. I can see people signing up just for the chance to hunt through people's data.
For larger, business-driven uses, you probably want something like DataSafe. They will keep media for you in a very safe place. Or better yet, keep your whole business disaster-protected: have more than one live site for IT operations.
Maybe you don't care about that data today, since the terror of the experience is still fresh, but you might care about it later. For example, assume that data was full of photographs of friends, deceased relatives, and other impossible-to-replace stuff. This backup scheme would've suited you even better than grabbing the hard drive, because you wouldn't even have had to do that.
All those moments will be lost in time, like tears in rain.
I am not trying to minimize your experience with Ivan, so please don't take this comment as such. The story you posted sounds crazy as hell and I wouldn't wish such an episode on anyone except my worst enemies.
I do believe you reacted a little emotionally, which is understandable given your current situation. I think that if you look at the article again, you will find the only reason he mentions hurricanes is because Frances news reports before the fact got him thinking about it.
That being said, I don't think Cringely was trying to insinuate that someone in a situation such as yours should or could worry about data. The point I took away from the article is that a person wouldn't need to worry about data at all under any disaster circumstance if they had implemented a system such as the one he proposes.
I think that if you look at it like that, you will agree that he is not trying to discount the gravity of your experience.
-ft
There are several research groups doing work on distributed P2P backup systems. I know there's a group at MS doing this, as well as a group at MIT (http://catfish.csail.mit.edu/~kbarr/pstore/), and several others that don't come to mind offhand. I did a project on this in grad school, so I'm familiar with the research.
:)
There are a lot of issues here, mostly centering around the fact that you can't trust people in an open P2P network.
1) They might look at your data.
2) They might not be online when you want your data.
3) They might delete your data, or do other malicious things to it (insert viruses, etc.).
4) They might freeload by using space on other hosts and then deleting all the data they receive.
5) If a host leaves the system permanently, you need to detect that and replicate its data somewhere else. Also, how do you know whether it's leaving permanently or just logging off for a while?
#1 is easy: just encrypt the data. #2, #3, #4, and #5 are hard because data integrity is really important in a backup solution. You end up having to replicate the data all over the place to "ensure" that it'll be available when you need it, but then you've got the problem of having to donate more space than you receive in order to use the system. Plus, it's still not certain that your data will be available when you need it.
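To put rough numbers on the replication cost, here is a back-of-the-envelope sketch in Python. It assumes every peer is online independently with the same probability, which real P2P networks almost certainly violate, and the function names are my own, not from any of the systems mentioned above.

```python
def availability(p_online: float, replicas: int) -> float:
    """Chance that at least one of `replicas` hosts is online.

    Assumes hosts are online independently, each with probability p_online.
    """
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be between 0 and 1")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return 1.0 - (1.0 - p_online) ** replicas


def replicas_needed(p_online: float, target: float) -> int:
    """Smallest number of copies that reaches the target availability."""
    if not 0.0 < p_online <= 1.0:
        raise ValueError("p_online must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    k = 1
    while availability(p_online, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical figure: home PCs that are reachable 30% of the time.
    print(replicas_needed(0.3, 0.99))
```

Under those assumptions, peers that are reachable 30% of the time need 13 copies of each block to hit 99% availability, so in a fair system you would be donating roughly 13 times the space you use.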
Basically what I'm trying to say is that it's a hard problem.
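For a flavor of how #3 and #4 can be attacked, here is a minimal spot-check sketch in Python. Before uploading a block (already encrypted), the owner precomputes a few nonce/hash pairs; later it sends a nonce and the peer must hash it together with the stored block. This is only an illustration with made-up function names, not the scheme used by pStore or any other system mentioned here, and it has an obvious limit: each precomputed challenge can be used only once.

```python
import hashlib
import hmac
import os


def make_challenges(block: bytes, count: int = 4) -> list[tuple[bytes, bytes]]:
    """Owner side: precompute (nonce, expected_digest) pairs before upload."""
    challenges = []
    for _ in range(count):
        nonce = os.urandom(16)
        expected = hashlib.sha256(nonce + block).digest()
        challenges.append((nonce, expected))
    return challenges


def peer_respond(stored_block: bytes, nonce: bytes) -> bytes:
    """Peer side: answering correctly requires still holding the whole block."""
    return hashlib.sha256(nonce + stored_block).digest()


def verify(expected: bytes, response: bytes) -> bool:
    """Owner side: constant-time comparison of the peer's answer."""
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    block = b"ciphertext of Aunt Nedda's cookie recipes"
    nonce, expected = make_challenges(block, count=1)[0]
    print(verify(expected, peer_respond(block, nonce)))        # honest peer
    print(verify(expected, peer_respond(b"deleted", nonce)))   # freeloader
```

A peer that deleted or altered the block cannot produce the right digest for a fresh nonce, but this only detects the problem; you still have to re-replicate the data somewhere else, which is where the hard part starts.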