Slashdot Mirror


Distributed Internet Backup System

deadfx writes "Since disk drives are cheap, backup should be cheap too. Of course it does not help to mirror your data by adding more disks to your own computer because a fire, flood, power surge, etc. could still wipe out your local data center. Instead, you should give your files to peers (and in return store their files) so that if a catastrophe strikes your area, you can recover data from surviving peers. The Distributed Internet Backup System (DIBS) is designed to implement this vision."

18 of 303 comments

  1. Problem = bandwidth. by caluml · · Score: 5, Insightful

    The main problem with this approach (and for that matter Freenet) is that it is slow for all but the smallest files.

    Bandwidth is still the most precious commodity in computing. Once we get fibre to every house, then distributed storage will make sense.

    1. Re:Problem = bandwidth. by nano2nd · · Score: 5, Insightful

      You're right in that today's infrastructure isn't made for chuffing massive, hard-drive-sized hunks of data back and forth.

      But what about incremental backups?

      OK so you've got to get your base image uploaded -somehow- but after that, data changes very little on a daily basis and this level of data transfer to some secure backup repository won't be a problem at all with current bandwidth.
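
      A minimal sketch of that idea, assuming a content hash is enough to decide what changed since the base image. The function names `snapshot` and `changed_since` are made up for illustration and are not part of DIBS or any other tool mentioned here:

```python
import hashlib
import os


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def changed_since(previous, current):
    """Return sorted paths that are new or whose contents differ from the previous snapshot."""
    return sorted(p for p, d in current.items() if previous.get(p) != d)
```

      Only the paths returned by `changed_since` would need to go over the wire each night. Deleted files and whole-file reads of very large files are ignored here to keep the sketch short.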

    2. Re:Problem = bandwidth. by gmuslera · · Score: 4, Insightful

      For internal networks where you have a lot of fast connected servers, sparing a bit of bandwidth and disk space to have a distributed backup across the LAN could be useful, especially when you can back up server data to workstations and so on.

  2. Ok, start sending me your code, Blizzard by Quarters · · Score: 4, Funny

    I've got my terabyte array set up. Your "World of Warcraft" data will be completely secure on my backup node.

    Go ahead, send it.

    I'm waiting....

  3. All my data and software... by ackthpt · · Score: 5, Funny

    All my data and software are backed up on crackers' computers.

    I'm not worried. %-)

    --

    A feeling of having made the same mistake before: Deja Foobar

    1. Re:All my data and software... by BillFarber · · Score: 5, Funny

      Are you saying you only use white-peoples' computers for backup?

  4. do this with schools by octalgirl · · Score: 5, Interesting

    We do this with neighboring school districts. We also back up all buildings, over the WAN and at night, to a file on the hard drive of another building. We do this in two places, so backups criss-cross. Because of the size and time it takes, this can only happen at night and only one building per night, so there is a downside. But if a building goes down, I know I have a secondary (besides the tape in that building) to fall back on.

  5. Security? by vano2001 · · Score: 5, Interesting

    What if it is sensitive data? Do you think even with all that cryptography and secure computing blabla people will trust storing their important files on other people's computers? I think not. There are companies who put their backups into safes ... ask *them* to put it online on a slashdot reader's PC. See what they answer. Freenet and similar networks are only good for general [public] domain data.

  6. I can't see this being a go, any time soon. by saskboy · · Score: 4, Insightful

    As has been mentioned already, [no this is not redundant, because I am writing this myself] the potential for data being stolen is too great an issue to overlook. This is not a viable option because the potential for theft is too great, and no amount of encryption will make a difference. Encryption will always be broken.

    --
    Saskboy's blog is good. 9 out of 10 dentists agree.

  7. Don't trust them to return your files by PepperedApple · · Score: 4, Insightful

    It's not so much that I wouldn't trust someone not to break the encryption, but what if the person who's holding your backup copies gets tired of giving up disk storage and just deletes the software from his/her computer? Or what if their computer happens to be off when you want to retrieve the backup?

  8. And what if by Apparition-X · · Score: 4, Interesting

    I grant that personal backup is time consuming and it is tough to find a good method without resorting to expensive tape or hundreds of CDs. But as intriguing as this approach is, there seem to be a lot of problems with it.

    What if the reason you need to do a recovery is because your system with internet access is toast? How long does it take to restore several hundred thousand files? What about peers that drop off the network, or that are only on sporadically (no, that never happens in peer to peer filesharing networks!).

    Even aside from the issues of speed of restoration, I can't imagine too many circumstances in which you want to rely on an Internet connection as a prerequisite for a successful restore... Although perhaps as a way of complementing existing backup methodologies (i.e. backup root and critical config information to tape or CD, and the rest of your schiznit to DIBS) this might have a place.

  9. Private Peer to Peer (PP2P) by 4/3PI*R^3 · · Score: 4, Informative

    This is just the next evolutionary change in P2P. Encrypting data and exchanging the encryption key so that only those "in the know" can exchange files and the *AA groups don't know what you are trading.

    In the "Perfect Example of Talking Out of Both Sides Of Your Mouth" Department:

    This is posted on the home page:
    Note that DIBS is a backup system not a file sharing system like Napster, Gnutella, Kazaa, etc. In fact, DIBS encrypts all data transmissions so that the peers you trade files with can not access your data.[emphasis mine]

    This is posted on the documentation page:
    Make sure you give your gpg public key to any peers you want to trade files with.[emphasis mine]

  10. Also compare rdiff-backup and duplicity by wfrp01 · · Score: 4, Informative

    Some nice folks at Stanford are also creating a different flavor of network backup called rdiff-backup. I'll just plagiarize the description from the homepage:

    rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership (if it is running as root), and modification times. Finally, rdiff-backup can operate in a bandwidth efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted.

    The homepage also links to a project called duplicity, which operates on a similar principle, but uses GnuPG to encrypt data to prevent spying/modification.

    --

    --Lawrence Lessig for Congress!

  11. This idea is not new by fudgefactor7 · · Score: 4, Insightful

    It's been discussed (and even tried) before; the problems were many, namely security, speed, and availability. One cannot guarantee any of those three very important variables. As a result it (the idea) died a horrible death--let's hope it dies again.

  12. Distributed RAID Like Backups by angry_beaver · · Score: 5, Interesting

    This should work a little differently.
    Why not stripe your data across many hosts, with parity data stored on several? A central server would maintain a list of servers containing your data. In the event of a failure, you would simply fire up the client, which would contact this server for a list of your backup "devices" and start pulling in, reconstructing and decrypting the data.
    This would have a couple bonuses...

    1) You could stripe it across 100 machines, and have another 100 with parity data so that any 50% of the machines can be unavailable and you can still get your data back.

    2) Security - Rather than having a full copy of your data on their machine, each node only has a small subset of your data, and does not know where to find the rest of the data making reconstruction nearly impossible for the storage node. GPG would be used on top of this.
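
    A rough sketch of the striping idea, using a single XOR parity stripe (RAID-4 style). This is a simplification: one XOR parity survives the loss of only one host, so the "any 50%" figure in point 1 would need a real erasure code such as Reed-Solomon. The function names are hypothetical, and encryption is left out entirely:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripes(data, hosts):
    """Split data into `hosts` equal-length stripes (zero-padded) plus one parity stripe."""
    size = -(-len(data) // hosts)  # ceiling division
    padded = data.ljust(size * hosts, b"\0")
    stripes = [padded[i * size:(i + 1) * size] for i in range(hosts)]
    return stripes, xor_blocks(stripes)


def rebuild(stripes, parity):
    """Fill in the single stripe marked None by XOR-ing the parity with the survivors."""
    missing = stripes.index(None)
    survivors = [s for s in stripes if s is not None]
    restored = list(stripes)
    restored[missing] = xor_blocks(survivors + [parity])
    return restored
```

    Each stripe would go to a different peer and the parity stripe to one more. If any one peer vanishes, `rebuild` recovers its stripe from the rest. The original length has to be remembered somewhere so the zero padding can be stripped after a restore.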

  13. Why not just use OpenAFS? by rindeee · · Score: 4, Informative

    It was designed for use in low-bandwidth environments. Not only do you get the benefit of a distributed backup system, but you get inherent fault-tolerance, load-balancing, etc. Yes, over a low-bandwidth connection a file still takes a long time to copy, but OpenAFS is designed to accommodate this (not going into detail here, go to the OpenAFS site if you're curious). I am a fanatic OpenAFS user so I am somewhat biased. We have however implemented OpenAFS on a 1.4TB datastore at one of our customer sites (medical market) that has key data (a couple hundred Gig) distributed to 3 slave RO cells (again, read up on OpenAFS for answers). Rock solid reliability is an understatement.

  14. dibs vs rsync by bromoseltzer · · Score: 4, Interesting

    I peer with another system at another institution using rsync. They rsync their files to a folder on my disk, and I rsync to a folder on theirs. No encryption, but very good performance - 128 kbs DSL upload is fine, running overnight.

    This requires a lot of trust, which is OK because I'm the sysadmin at both places.

    Without trust, you need DIBS-like encryption, which (probably) means no rsync-like differential backups, and you need a "safe" way to find partners.

    How about "DIBS-raid" where your data is spread over many peers? If a peer blows up, you can still recover, and no one peer should have a recognizable piece of your data.

    -Martin

    This .sig donated to Poets Against the War.

    --
    Fiat Lux.

  15. Who would take Pete Townshend's files? by someguyintoronto · · Score: 4, Funny

    Seriously, what would be the legal ramifications if illegal data was stored on someone else's computer?

    Would this backup system be an easy way to hide illegal content?

    What if the RIAA went after someone for keeping a bunch of legal MP3s?

    Too many cans... Too many worms...