Distributed Internet Backup System
deadfx writes "Since disk drives are cheap, backup should be cheap too. Of course it does not help to mirror your data by adding more disks to your own computer because a fire, flood, power surge, etc. could still wipe out your local data center. Instead, you should give your files to peers (and in return store their files) so that if a catastrophe strikes your area, you can recover data from surviving peers. The Distributed Internet Backup System (DIBS) is designed to implement this vision."
The main problem with this approach (and for that matter Freenet) is that it is slow for all but the smallest files.
Bandwidth is still the most precious commodity in computing. Once we get fibre to every house, then distributed storage will make sense.
Get your own free personal location tracker
What's with all of the "cut and paste" stories lately?
One of the things I like about Slashdot is the different takes on existing news presented by user submissions. Lately, though, many stories seem to be just copied directly from the link's website.
As has been mentioned already, [no this is not redundant, because I am writing this myself] the potential for data being stolen is too great an issue to overlook. This is not a viable option because the potential for theft is too great, and no ammount of encryption will make a difference. Encryption will always be broken.
Saskboy's blog is good. 9 out of 10 dentists agree.
What is to say that the FBI/RIAA won't come to your house, claiming you have terrorest information/stolen music stored on your harddrive? And assuming it was true, would you be legally/crimminally liable for it? This gives a whole new meaning to the excuse "well I was just holding it for a friend".
It's not so much that I wouldn't trust someone not to break the encryption, but what if the person who's holding your backup copies gets tired of giving up disk storage and just deletes the software from his/her computer. Or what if their computer happens to be off when you want to retrieve the backup?
This raises all sorts of interesting questions. Unfortunately the answer to all of these questions is most likely "we won't know until it goes to court and there is a ruling to estabish precedent."
"I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve."
It's been discussed (and even tried) before, the problems were many, namely security speed, and availability. One cannot guarantee any of those three every important variables. As a result it (the idea) died a horrible death--let's hope it dies again.
Unfortunately I think it would be bad *either* way. Now since "stolen music" is somewhat debateble here on /., and most people aren't too worried about being charged with terrorism, I'll try something more clear cut: Kiddie pron.
Ruling 1: You are responsible for what is on your HD
Result: Someone backs up their illegal pics to your harddrive (you don't know this because it's encrypted), you (innocent) get charged for it and sent to jail.
Ruling 2: You are not responsible for encrypted content that appears to have been generated by this netbackup program.
Result: Every pedophiles dream has come true. They simply encrypt their stuff and spoof it to look like someone elses backup file. They are now immune from procecution because "it's someone elses". Same applies to anyone else that wants to store something illegal on a computer system.
Obviously there needs to be a way to positively indentify who "owns" what content on your harddrive before a system like this could become [legally] safe.
I don't see companies using this to backup valuable/private information on the greater internet. But what about those hundreds of work stations with large hard drives that your peons are using? use the DIBS system to back up all your shared company data, it's still all on systems you own, behind your own firewalls, etc. but it gives you untold gigabytes of back up space that is at least as fast as decent tape backup system, but inherently cheaper.
the IT department could distribute the daemon to all work stations, and the users of the systems aren't even required to be aware of it.
Sounds great to me!
-- Obligatory Blog descramble to e-mail.
So what if your entire drive is backed up across a huge distributed network. And let's say Joe User had backed up cache files, etc that contained personal info (credit numbers, child pr0n, etc). Joe User is could become one screwed individual. It's a huge risk that the average user might be making unknowingly...
If water were beans, I'd be 70% beans.
Not being too familiar with this and not delving too deep, is the data all on one computer, or split?
Wouldn't it be more secure to put one third of each file on twelve different computers? Then when you need it, fold all the encrypted pieces back together again. That way-- even if the do crack your code, all they have is gibberish.
Or is that how it's done?
Wouldn't your files be encrypted with your public key so that only you could decrypt it with your private key? This is normally the way things work with public/private key encryption.
First, I think you're misunderstanding the point of DIBS... a public key is required to encode, but doesn't do any good for decoding, so giving someone your public key only allows them to give you things you could decode.
I wouldn't read too much into the fact that they say you're "trading files"... because that is, after all, what you're doing, even if you can't read the files that you recieved in trade.
On the P2P thing, I'm not sure public key cryptosystems would be advantageous at all. First off, the public keys would uniquely identify the participants. On the other hand, if a P2P client were to generate its own keys, then it would be trivial for authorities to join the network and see the traffic unencrypted.
There might be interest in "private" P2P, but that kind of defeats the purpose of P2P, right? Getting files from unknown sources and searching millions of clients worldwide?
Napster would have been boring if it were just me and my friends.
I agree. This may be a perfectly fine way to back up your terrabyte ogg/mp3/pr0n archive, but no way will any major corps take it seriously. Has nothing to do with how secure it really is, but more on executive perception.
- If we aren't supposed to eat animals, then why are they made out of meat? - Steven Wright
Ideally, you should be able to make your computer fail *COMPLETELY* and still be able to recover completely. The distributed backup plan seems to have different specific advantages for two specific groups of home users, but has the same overall beneficial results.
For the average Joe with only one computer running that ancient copy of Windows98 on a P133, the massive ammount of data-cruft is bound to be the weakest point of upgrading or even backing up. I've found that most families only have that one computer, and only have the option of backing up onto floppies. Usually their data can fit on one or two CDR/CDRW discs, but their system is also usually too old to get a cd burner to work reliably. In addition, they're just too stingy with the purse-strings to shell out the $100 or so for a decent, middle-of-the-pack drive, anyway. Sending critical data over the internet might be a better option, if a bit more time-consuming (no broadband, only 56k modem). Frequent backups like this has the potential to be substantially more reliable, not to mention scores easier, than a pile of floppies as you're ideally only sending the new data. I can't tell you how often I wished for something like this when working on a friend's/family's system across town and away from my own network.
And that brings me to my second group that can really take advantage of something like this: Power-users with a small network running at home. My network has a file-server that stores *EVERYTHING* on it for backup purposes. It's got ISO's of all my software and OS's, drivers, stand-alone programs, documents, and media files. Currently, there's about 80GB of data on there. Backing up that data is a Travan-5 drive (10GB/tape, native) and 9 cartridges. At about 3 hours per tape, backing up to 9 TR-5 tapes takes days, not hours. There's two additional tapes for backup of the server's OS and configuration and it easily fits on one tape. But if there are any significant changes to the system, I rotate the tape so that there's always a working copy in case things go terribly wrong. That's a total of 11 tapes. They're not exactly cheap, but it's probably the least expensive backup I can find right now without going to removable HDs (I'm avoiding that solution as HDs are, in my opinion, less reliable and durable than tapes). Using this distributed backup plan would allow me to recover my server's OS from the single tape and retrieve the data from the network when I have time.
The 2 desktops and 2 laptops can be fully recovered with an OS or system recovery cd and the rest is available on the server. In fact, I usually have one of each type of computer down at any given time for something-or-other. Having the data on the server allows me to blow away any of the systems I run at any time and completely recover the system to a working state in just over an hour.
Actually, I had been setting up a distributed backup plan for my own server with some of my friends so we'd all have each others' server's backup. More accurately, the plan was to merge the changes between all the servers' data and share it between all of us in a manner similar to CVS. There's only 3 of us, but we're located all over the state and we all have broadband. 80GB of data is a large ammount to initially transfer. Really, though, all we'd be transmitting is the changes we've made which would limit the total bandwidth used. We'd probably only set it up for once per week in automatic mode to further decrease the load with an option to manually update. In the event of a complete failure of one of the systems, there should be a copy from one of the other two servers that's no older than 1 week. As the storage requirements grow, each server can be updated with additional storage in sequence so that it recovers in a manner similar to how a RAID5 array rebuilds the data on a replaced drive.
Unfortunately, neither of my two friends in question have the resources to afford the hardware and set up their own server to the reliability standards that I'm requiring, so it kind of fell through for now. I'm working with them on how to get everything running, and I may just maintain it for them from a remote console. They'll still host the server on their network and have access to it, of course. But the responsibility of maintaining the system may just have to lie with me.
In short, it's not terribly difficult to implement a solution like this, but there are serious bandwidth concerns. If you're only doing this amongst your friends/peers, it's possible to mitigate the bandwidth issue by using a single removable hard disk to sneakernet the data to a fresh server. This allows for a much more reliable home network for power-users, and gives some peace-of-mind to the average user (and their power-user friends who fix their computer for them)
My sources are unreliable, but their information is fascinating. -- Ashleigh Brilliant