Slashdot Mirror


Distributed Internet Backup System

deadfx writes "Since disk drives are cheap, backup should be cheap too. Of course it does not help to mirror your data by adding more disks to your own computer because a fire, flood, power surge, etc. could still wipe out your local data center. Instead, you should give your files to peers (and in return store their files) so that if a catastrophe strikes your area, you can recover data from surviving peers. The Distributed Internet Backup System (DIBS) is designed to implement this vision."

21 of 303 comments

  1. Problem = bandwidth. by caluml · · Score: 5, Insightful

    The main problem with this approach (and for that matter Freenet) is that it is slow for all but the smallest files.

    Bandwidth is still the most precious commodity in computing. Once we get fibre to every house, then distributed storage will make sense.

    1. Re:Problem = bandwidth. by nano2nd · · Score: 5, Insightful

      You're right in that today's infrastructure isn't made for chuffing massive, hard-drive-sized hunks of data back and forth.

      But what about incremental backups?

      OK so you've got to get your base image uploaded -somehow- but after that, data changes very little on a daily basis and this level of data transfer to some secure backup repository won't be a problem at all with current bandwidth.
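
      A minimal sketch of the incremental idea above, assuming changes are detected by comparing content hashes between two runs. The function names `snapshot` and `changed_files` are hypothetical and are not taken from DIBS; a real tool such as rsync is considerably smarter and sends only the changed parts of a file.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[str(path.relative_to(root))] = digest
    return digests


def changed_files(previous, current):
    """Return the paths that are new, or whose contents differ, since `previous`."""
    return sorted(p for p, d in current.items() if previous.get(p) != d)
```

      Only the paths returned by `changed_files` would need to be sent after the base image is in place; deleted files are ignored in this sketch.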

    2. Re:Problem = bandwidth. by gmuslera · · Score: 4, Insightful

      For internal networks where you have a lot of fast connected servers, sparing a bit of bandwidth and disk space to have a distributed backup across the LAN could be useful, especially when you can back up servers' data to workstations and so on.

    3. Re:Problem = bandwidth. by elgaard · · Score: 2, Insightful

      Depends how much you need to back up.
      For home use it could be very useful. Especially
      if you only back up changes (like rsync).

      The important stuff are things like:

      1. Your digital photo album.
      On average it probably grows >1 MByte/day.

      2. Personal email and documents.
      A few 100KByte/day if you use an efficient document format
      and don't receive movies as attachments.

      3. System settings, list of installed software etc.
      Very small updates.

      By important I mean stuff you would be missing the day
      your house burns down.
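
      As a rough worked figure for the estimates above: the 1 MByte/day of photos comes from the list, while the 300 KByte/day for mail and documents and both line rates are assumptions picked for illustration.

```python
def upload_seconds(megabytes, kilobits_per_second):
    """Seconds needed to send `megabytes` of data at the given line rate."""
    bits = megabytes * 1_000_000 * 8
    return bits / (kilobits_per_second * 1000)


# ~1 MByte of photos plus an assumed ~300 KByte of mail and documents per day.
DAILY_MEGABYTES = 1.0 + 0.3

# Assumed upstream rates in kbit/s; real modem uploads are usually slower than nominal.
for label, rate in [("56k modem (nominal)", 56), ("128 kbit/s upstream", 128)]:
    minutes = upload_seconds(DAILY_MEGABYTES, rate) / 60
    print(f"{label}: about {minutes:.1f} minutes per day")
```

      Under these assumptions the daily changes are a matter of minutes, which is why the first full upload is the real obstacle.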

    4. Re:Problem = bandwidth. by Kentamanos · · Score: 1, Insightful

      A redundant RAID configuration gives you about as much protection as what you're talking about though (a LAN).

      People want it distributed (outside of LAN range) to combat the threat of natural disasters, fires, or any other event that can wipe out a building.

    5. Re:Problem = bandwidth. by NoMoreNicksLeft · · Score: 2, Insightful

      You paypal the other guy $2, and he express mails half a dozen CDs.

  2. "Cut 'n Paste" stories by Roarkk · · Score: 2, Insightful

    What's with all of the "cut and paste" stories lately?

    One of the things I like about Slashdot is the different takes on existing news presented by user submissions. Lately, though, many stories seem to be just copied directly from the link's website.

  3. I can't see this being a go, any time soon. by saskboy · · Score: 4, Insightful

    As has been mentioned already, [no this is not redundant, because I am writing this myself] the potential for data being stolen is too great an issue to overlook. This is not a viable option because the potential for theft is too great, and no amount of encryption will make a difference. Encryption will always be broken.

    --
    Saskboy's blog is good. 9 out of 10 dentists agree.
  4. Would this work in the current [US] legal climate? by Michalson · · Score: 3, Insightful

    What is to say that the FBI/RIAA won't come to your house, claiming you have terrorist information/stolen music stored on your hard drive? And assuming it was true, would you be legally/criminally liable for it? This gives a whole new meaning to the excuse "well I was just holding it for a friend".

  5. Don't trust them to return your files by PepperedApple · · Score: 4, Insightful

    It's not so much that I wouldn't trust someone not to break the encryption, but what if the person who's holding your backup copies gets tired of giving up disk storage and just deletes the software from his/her computer. Or what if their computer happens to be off when you want to retrieve the backup?

    1. Re:Don't trust them to return your files by Salamander · · Score: 2, Insightful
      what if the person who's holding your backup copies gets tired of giving up disk storage and just deletes the software from his/her computer

      That's the same as a simple failure, which the software is designed to handle anyway. What's not clear from the documentation (and I'm too pressed for time to read the code right now) is whether it does The Right Thing when a peer comes back.

      --
      Slashdot - News for Herds. Stuff that Splatters.
  6. Re:Would this work in the current [US] legal clima by kryzx · · Score: 3, Insightful
    This is actually a good question. If I back up my music files on your computer, does that fall under "fair use"? Would whether you access them or not affect the legal position? Is it possible to build something like this so my files can only be accessed, or at least can only be decrypted, by me, and hence are not usable to the person providing the disk space? If so, would that change the legal implications?

    This raises all sorts of interesting questions. Unfortunately the answer to all of these questions is most likely "we won't know until it goes to court and there is a ruling to establish precedent."

    --
    "I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve."
  7. This idea is not new by fudgefactor7 · · Score: 4, Insightful

    It's been discussed (and even tried) before; the problems were many, namely security, speed, and availability. One cannot guarantee any of those three very important variables. As a result it (the idea) died a horrible death--let's hope it dies again.

  8. Re:Would this work in the current [US] legal clima by Michalson · · Score: 3, Insightful

    Unfortunately I think it would be bad *either* way. Now since "stolen music" is somewhat debatable here on /., and most people aren't too worried about being charged with terrorism, I'll try something more clear cut: Kiddie pron.

    Ruling 1: You are responsible for what is on your HD. Result: Someone backs up their illegal pics to your hard drive (you don't know this because it's encrypted), you (innocent) get charged for it and sent to jail.

    Ruling 2: You are not responsible for encrypted content that appears to have been generated by this netbackup program. Result: Every pedophile's dream has come true. They simply encrypt their stuff and spoof it to look like someone else's backup file. They are now immune from prosecution because "it's someone else's". Same applies to anyone else that wants to store something illegal on a computer system.

    Obviously there needs to be a way to positively identify who "owns" what content on your hard drive before a system like this could become [legally] safe.

  9. The True value is internal company usage.. by Barastol · · Score: 2, Insightful

    I don't see companies using this to back up valuable/private information on the greater internet. But what about those hundreds of workstations with large hard drives that your peons are using? Use the DIBS system to back up all your shared company data: it's still all on systems you own, behind your own firewalls, etc., but it gives you untold gigabytes of backup space that is at least as fast as a decent tape backup system, but inherently cheaper.

    The IT department could distribute the daemon to all workstations, and the users of the systems aren't even required to be aware of it.

    Sounds great to me!

    --
    -- Obligatory Blog descramble to e-mail.
  10. Watch what you back up... by kaptin · · Score: 2, Insightful

    So what if your entire drive is backed up across a huge distributed network. And let's say Joe User had backed up cache files, etc. that contained personal info (credit numbers, child pr0n, etc). Joe User could become one screwed individual. It's a huge risk that the average user might be taking unknowingly...

    --
    If water were beans, I'd be 70% beans.
  11. Re:Huh? by Anonymous Coward · · Score: 1, Insightful

    Not being too familiar with this and not delving too deep, is the data all on one computer, or split?

    Wouldn't it be more secure to put one third of each file on twelve different computers? Then when you need it, fold all the encrypted pieces back together again. That way-- even if they do crack your code, all they have is gibberish.

    Or is that how it's done?
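
    One way the splitting asked about here could look is sketched below. This is an illustration only and makes no claim about how DIBS stores data: `split_round_robin` and `reassemble` are hypothetical names, the input is assumed to be already encrypted, and plain striping like this adds no redundancy, so losing any one peer loses the file.

```python
def split_round_robin(data, peers, chunk_size=4):
    """Deal fixed-size chunks of `data` out to `peers` piles in round-robin order."""
    piles = [[] for _ in range(peers)]
    for start in range(0, len(data), chunk_size):
        pile_index = (start // chunk_size) % peers
        piles[pile_index].append(data[start:start + chunk_size])
    return piles


def reassemble(piles):
    """Interleave the piles back into the original byte string."""
    out = []
    longest = max((len(pile) for pile in piles), default=0)
    for round_number in range(longest):
        for pile in piles:
            if round_number < len(pile):
                out.append(pile[round_number])
    return b"".join(out)
```

    Each peer then holds only every Nth chunk, so no single peer has a complete copy even of the ciphertext.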

  12. Re:Private Peer to Peer (PP2P) by Slayne · · Score: 2, Insightful

    Wouldn't your files be encrypted with your public key so that only you could decrypt it with your private key? This is normally the way things work with public/private key encryption.
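
    The public/private split described here can be seen in a toy textbook RSA example. The primes below are tiny and chosen purely for illustration, so this is not secure and is not a description of what DIBS does; a real tool would rely on an established implementation such as GnuPG.

```python
# Toy textbook RSA with tiny primes: illustration only, NOT secure.
P, Q = 61, 53
N = P * Q                    # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent, coprime with PHI
D = pow(E, -1, PHI)          # private exponent via modular inverse (Python 3.8+)

PUBLIC_KEY = (E, N)          # safe to hand to anyone
PRIVATE_KEY = (D, N)         # kept by the owner of the backup


def encrypt(message, public_key):
    """Encrypt an integer smaller than the modulus with the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Recover the integer using the private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)
```

    Anyone holding `PUBLIC_KEY` can produce a ciphertext, but only the holder of `PRIVATE_KEY` can turn it back into the message.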

  13. Re:Private Peer to Peer (PP2P) by Dan+Nordquist · · Score: 2, Insightful

    First, I think you're misunderstanding the point of DIBS... a public key is required to encode, but doesn't do any good for decoding, so giving someone your public key only allows them to give you things you could decode.

    I wouldn't read too much into the fact that they say you're "trading files"... because that is, after all, what you're doing, even if you can't read the files that you received in trade.

    On the P2P thing, I'm not sure public key cryptosystems would be advantageous at all. First off, the public keys would uniquely identify the participants. On the other hand, if a P2P client were to generate its own keys, then it would be trivial for authorities to join the network and see the traffic unencrypted.

    There might be interest in "private" P2P, but that kind of defeats the purpose of P2P, right? Getting files from unknown sources and searching millions of clients worldwide?

    Napster would have been boring if it were just me and my friends.

  14. Re:Security? by Guido69 · · Score: 3, Insightful

    I agree. This may be a perfectly fine way to back up your terabyte ogg/mp3/pr0n archive, but no way will any major corps take it seriously. Has nothing to do with how secure it really is, but more on executive perception.

    --
    - If we aren't supposed to eat animals, then why are they made out of meat? - Steven Wright
  15. Re:Problem = bandwidth. (solution?) by racermd · · Score: 3, Insightful

    Ideally, you should be able to make your computer fail *COMPLETELY* and still be able to recover completely. The distributed backup plan seems to have different specific advantages for two specific groups of home users, but has the same overall beneficial results.

    For the average Joe with only one computer running that ancient copy of Windows98 on a P133, the massive amount of data-cruft is bound to be the weakest point of upgrading or even backing up. I've found that most families only have that one computer, and only have the option of backing up onto floppies. Usually their data can fit on one or two CDR/CDRW discs, but their system is also usually too old to get a cd burner to work reliably. In addition, they're just too stingy with the purse-strings to shell out the $100 or so for a decent, middle-of-the-pack drive, anyway. Sending critical data over the internet might be a better option, if a bit more time-consuming (no broadband, only 56k modem). Frequent backups like this have the potential to be substantially more reliable, not to mention scores easier, than a pile of floppies as you're ideally only sending the new data. I can't tell you how often I wished for something like this when working on a friend's/family's system across town and away from my own network.

    And that brings me to my second group that can really take advantage of something like this: Power-users with a small network running at home. My network has a file-server that stores *EVERYTHING* on it for backup purposes. It's got ISO's of all my software and OS's, drivers, stand-alone programs, documents, and media files. Currently, there's about 80GB of data on there. Backing up that data is a Travan-5 drive (10GB/tape, native) and 9 cartridges. At about 3 hours per tape, backing up to 9 TR-5 tapes takes days, not hours. There are two additional tapes for backup of the server's OS and configuration, which easily fits on one tape. But if there are any significant changes to the system, I rotate the tape so that there's always a working copy in case things go terribly wrong. That's a total of 11 tapes. They're not exactly cheap, but it's probably the least expensive backup I can find right now without going to removable HDs (I'm avoiding that solution as HDs are, in my opinion, less reliable and durable than tapes). Using this distributed backup plan would allow me to recover my server's OS from the single tape and retrieve the data from the network when I have time.

    The 2 desktops and 2 laptops can be fully recovered with an OS or system recovery cd and the rest is available on the server. In fact, I usually have one of each type of computer down at any given time for something-or-other. Having the data on the server allows me to blow away any of the systems I run at any time and completely recover the system to a working state in just over an hour.

    Actually, I had been setting up a distributed backup plan for my own server with some of my friends so we'd all have each others' server's backup. More accurately, the plan was to merge the changes between all the servers' data and share it between all of us in a manner similar to CVS. There's only 3 of us, but we're located all over the state and we all have broadband. 80GB of data is a large amount to initially transfer. Really, though, all we'd be transmitting is the changes we've made which would limit the total bandwidth used. We'd probably only set it up for once per week in automatic mode to further decrease the load with an option to manually update. In the event of a complete failure of one of the systems, there should be a copy from one of the other two servers that's no older than 1 week. As the storage requirements grow, each server can be updated with additional storage in sequence so that it recovers in a manner similar to how a RAID5 array rebuilds the data on a replaced drive.
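
    To put rough numbers on the initial transfer: the 80GB, 9 tapes and 3 hours per tape come from the post, while the 256 kbit/s upstream rate is an assumed figure for home broadband and will vary widely.

```python
def transfer_days(gigabytes, kilobits_per_second):
    """Days needed to send `gigabytes` of data at a sustained line rate."""
    seconds = gigabytes * 1_000_000_000 * 8 / (kilobits_per_second * 1000)
    return seconds / 86_400


def tape_hours(tapes, hours_per_tape):
    """Total hours for a full tape run, ignoring time spent swapping cartridges."""
    return tapes * hours_per_tape


full_tape_run = tape_hours(9, 3)         # figures from the post
initial_upload = transfer_days(80, 256)  # 256 kbit/s upstream is an assumption

print(f"Full tape backup: about {full_tape_run} hours")
print(f"Initial 80GB upload: about {initial_upload:.0f} days")
```

    At that assumed rate the first upload runs to roughly a month of continuous transfer, so only the weekly changes are practical to send over the wire.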

    Unfortunately, neither of my two friends in question has the resources to afford the hardware and set up their own server to the reliability standards that I'm requiring, so it kind of fell through for now. I'm working with them on how to get everything running, and I may just maintain it for them from a remote console. They'll still host the server on their network and have access to it, of course. But the responsibility of maintaining the system may just have to lie with me.

    In short, it's not terribly difficult to implement a solution like this, but there are serious bandwidth concerns. If you're only doing this amongst your friends/peers, it's possible to mitigate the bandwidth issue by using a single removable hard disk to sneakernet the data to a fresh server. This allows for a much more reliable home network for power-users, and gives some peace-of-mind to the average user (and their power-user friends who fix their computer for them).

    --
    My sources are unreliable, but their information is fascinating. -- Ashleigh Brilliant