Slashdot Mirror


Ask Slashdot: Temporary Backup Pouch?

An anonymous reader writes "It looks simple. I've got a laptop and a USB HDD for backups. With rsync, I only move changes to the USB HDD for subsequent backups. I'd like to move these changes to a more portable USB stick when I'm away, then sync again to the USB HDD when I get home. I figured with the normality of the pieces and the situation, there'd be an app for that, but no luck yet. I'm guessing one could make a hardlink parallel-backup on the laptop at the same time as the USB HDD backup. Then use find to detect changes between it and the actual filesystem when it's time to back up to the USB stick. But there would need to be a way to preserve paths, and a way to communicate deletions. So how about it? I'm joe-user with Ubuntu. I even use grsync for rsync. After several evenings of trying to figure this out, all I've got is a much better understanding of what hardlinks are and are not. What do the smart kids do? Three common pieces of hardware, and a simple-looking task."

15 of 153 comments

  1. unison-gtk by niftydude · · Score: 3, Informative

    Since you are an ubuntu user, and it looks like you just need a nice rsync front-end to handle backup of the same data to two different drives, I'll suggest unison-gtk.

    Very nice, simple front-end, and will do what I think you need.

    --
    You can never know everything, and part of what you do know will always be wrong. Perhaps even the most important part.
    1. Re:unison-gtk by Anonymous Coward · · Score: 4, Informative

      I think people (including you) don't understand what he needs. He has a complete backup at home. When he's on the move, he wants to back up only the modifications that are not already backed up at home, so that the backup fits on a USB stick. To know what has and hasn't changed, he can't access the backup at home, like rsync would need to do. His idea was to keep a space-saving complete copy of the backup on his laptop via hard links. You might think that file modification times could be used instead, but both solutions leave the problem of communicating file deletions. Suppose he needs to recover. He would copy his home backup to the new drive, and then he would have to integrate the incremental backup. How would the incremental backup keep the information about deleted files without access to the base backup? I suppose one could keep a recursive directory listing with the incremental backup, but that's the question: Is there a ready-made solution for this?

    2. Re:unison-gtk by Hatta · · Score: 3, Informative

      To know what has and hasn't changed, he can't access the backup at home, like rsync would need to do.

      If I understand correctly, BackupPC caches the checksums rsync generates to enable exactly that. It would be nice if that were possible with vanilla rsync.

      --
      Give me Classic Slashdot or give me death!
  2. Unison? by Anonymous Coward · · Score: 3, Informative

    I hesitate to offer this, because I've not experimented with it in the precise scenario you describe. However, being another Joe User with Ubuntu, I took a look at rsync as a way to implement backups between my home PC and an Apple Time Capsule that I was using as a secondary backup device.

    After some tinkering I settled on Unison, which is available in the Ubuntu repositories. It's essentially a sophisticated rsync front end, with a few bells and whistles. You get 2-way directory replication between your 'local' and 'remote' file systems [though they could both be local or both remote if you choose] and you can essentially script multiple different backups into the single interface. For example, I have "Office" for documents, spreadsheets and the like, "Photos" for camera images, "Music", and so on.

    Like most tools, Unison is imperfect, but it's simple to use once set up. The key point with it, as with any product you put in this space, will be knowing and keeping track of your definitive data source. If you have a document that exists on both your local and backup systems, and you edit that file separately at each location, then run Unison, only the most chronologically recent copy will be preserved. To go beyond this level of functionality and get to something that can intelligently merge changes, I think you're going to need something more like a CVS tool... There are hugely expensive proprietary solutions (like Livelink), but I've not come across anyone using a good FOSS alternative. HTH...

    1. Re:Unison? by DrVxD · · Score: 5, Informative

      rsync doesn't handle deletions

      rsync handles deletions just fine - that's why it has a --delete option...

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
  3. Duplicity, perhaps by Wizarth · · Score: 3, Informative

    Duplicity uses librsync to generate the changeset that rsync would use, then stores the change set. If you stored the change set to the USB drive, this could then be "restored" to the destination drive, perhaps? I don't know if there's any way to do this out of the box, or with a bit of scripting, or if this would need to be a whole new toolchain.

  4. Re:Hardlinks by Anonymous Coward · · Score: 4, Funny

    Nobody pretended this to be the case. Not even the article's author. Please read again - and use a ghostwriter, you are not at the point of rolling your own comments. Not even close.

  5. Re:SkyDrive by Zemran · · Score: 4, Insightful

    As you say, the internet is really crap at times when you are travelling so why make life difficult? It is also fair to say that you obviously think of travelling as a bit of wandering around in the US. Once you broaden your horizons you will find that the internet is often not even an option.

    SkyDrive is not going to integrate with Ubuntu (have you read the summary yet?), so it is a stupid option, whereas there is at least a Dropbox client. That client is still flaky and not going to be easy to use as required, so he is still better off doing something that will work well and therefore get done regularly. If he is using some client for a service that sometimes works and sometimes doesn't, you can guarantee that the time when he needs that backup will be one of the times that it did not work.

    --
    I love stacking my barbecues in the shed at the end of summer - you can't beat a bit of grill on grill action.
  6. Finally! An interesting question. by colonel · · Score: 5, Insightful

    First, ignore the people who encourage you not to try, and who point you in other directions. Sure, there are much better ways of doing this, but who cares? The whole point is that you should be able to do whatever you want -- and actually doing this is going to leave you _so_ much smarter, trust me.

    Some douche criticized you for not knowing beforehand why hard links wouldn't work. . . . because, you know, you should have been born knowing everything about filesystems. To hell with him, sally forth on your journey of discovery, this can be hella fun and you'll get an awesome feeling of accomplishment.

    First off, you're going to have trouble using rsync with the flash drive, because I assume your constraint is that you can't fit everything on the flash drive, it's only big enough to hold the differences.

    Next, come to terms with the fact that you'll need to do some shell scripting. Maybe more than just some, maybe a lot, but you can do it.

    I'd recommend cutting your hard drive in two -- through partitions or whatever -- to make sure that "system" is fully segmented from "data." No sense wasting all your time and effort getting backups of /proc/ and /dev/, or, hell, even /bin/ and /usr/. Those things aren't supposed to change all that much, so get your backups of /home/ and /var/ and /etc/ working first. Running system updates on the road is rarely worth it, and will be the least of your concerns if you end up needing to recover.

    Next, remind yourself how rsync was originally intended to work at a high level. It takes checksums of chunks of files to see which chunks have changed, and only transfers the changed chunks over the wire in order to minimize network use. Only over time did it evolve to take on more tasks -- but you're not using it for its intended purpose to begin with, since you're not using any network here. So rsync might not have to be your solution while travelling unless you start rsyncing to a personal cloud or something -- but its first principles are definitely a help as you come up with your own design.
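    To make the "checksums of chunks" idea concrete, here is a minimal Python sketch. It is a simplification and not rsync's actual algorithm: real rsync pairs a rolling weak checksum with a strong hash so it can match blocks at arbitrary offsets, while this version only compares blocks at the same aligned position. The function names and the block size are made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative value only; rsync chooses its own block size


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one MD5 hex digest per fixed-size block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in `new` that differ from the aligned block in `old`.

    Blocks past the end of `old` count as changed, since they are new data.
    """
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]
```

    Only the blocks whose indexes come back from changed_blocks would need to cross the wire. The weakness of aligned comparison is that inserting a single byte near the start of a file shifts every later block, which is the case rsync's rolling checksum is designed to handle.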

    The premise is that, while travelling, you need to know exactly what files have changed since your last full backup, and you need to store those changes on the flash drive so that you can apply the changes to a system restored from the full backup you left at home. You won't be able to do a full restore while in the field, and you won't be able to roll back mistakes made without going home, but I don't think either of those constraints would surprise you too much, you likely came to terms with them already.

    So, when doing the full backup at home, also store a full path/file listing with file timestamps and MD5 or CRC or TLA checksums either on your laptop or on the flash disk, preferably both.
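    A rough Python sketch of that listing step follows. The manifest layout (a JSON file mapping each relative path to its mtime, size and MD5) and the function names are my own assumptions for illustration, not an established format. Symlinks are skipped to keep the example short.

```python
import hashlib
import json
import os


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk `root` and record mtime, size and MD5 for every regular file.

    Keys are paths relative to `root`, so the manifest stays valid
    even if the tree is later restored somewhere else.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files in this sketch
            info = os.stat(full)
            manifest[os.path.relpath(full, root)] = {
                "mtime": int(info.st_mtime),
                "size": info.st_size,
                "md5": file_md5(full),
            }
    return manifest


def save_manifest(manifest, out_path):
    """Write the manifest as JSON so a later run can load and compare it."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=1, sort_keys=True)
```

    Running build_manifest right after the full backup at home and saving the result on both the laptop and the stick gives the field script something to compare against without needing the USB HDD.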

    Then, when running a "backup" in the field, have your shell script generate that same report again, and compare it against the report you made with the last full backup. If the script detects a new file, it should copy that file to the flash disk. If the script detects a changed timestamp, or a changed checksum, it should also copy over the file. When storing files on the flash disk, the script should create directories as necessary to preserve paths of changed/new files.

    For bonus points, if the script detects a deleted file, it should add it to a list of files to be deleted. For extra bonus points, it should store file permissions and ownerships in its logfiles as replayable commands.
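    The comparison and copy steps described above might look something like this in Python. It assumes each report has been loaded as a plain dict mapping a relative path to whatever was recorded for it (timestamp, checksum, or both). The "files" subdirectory and the "deleted.txt" name are arbitrary choices of mine, and replaying permissions and ownership is left out.

```python
import os
import shutil


def diff_manifests(old, new):
    """Classify paths by comparing two {relative_path: info} dicts."""
    added = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, deleted


def write_increment(source_root, stick_root, old, new):
    """Copy new and changed files to the stick and list the deletions.

    Files land under <stick_root>/files/<relative path>, with directories
    created as needed so the original paths are preserved.
    """
    added, changed, deleted = diff_manifests(old, new)
    for rel in added + changed:
        dest = os.path.join(stick_root, "files", rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(os.path.join(source_root, rel), dest)  # copy2 keeps mtime
    os.makedirs(stick_root, exist_ok=True)
    # One path per line; file names containing newlines would need a safer format.
    with open(os.path.join(stick_root, "deleted.txt"), "w", encoding="utf-8") as out:
        for rel in deleted:
            out.write(rel + "\n")
    return added, changed, deleted
```

    Back home, the restore side would copy everything under "files" over the restored tree and then remove each path named in "deleted.txt".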

    The script would do a terrible job at being "efficient" for renamed files, but the same is true for rsync, so whatevs.

    I built a very similar set of scripts for managing VMware master disk images and diff files about ten years ago, and it took me two 7hr days of scripting/testing/documenting -- this should be a similar effort for a 10-yr-younger me. I learned *so* much in doing that back then that I'm jealous of the fun that you'll have in doing this.

    Of course, document the hell out of your work. Post it on sourceforge or something, GPL it, put it on your resume.

  7. dar? by safetyinnumbers · · Score: 3, Informative

    If I understand your problem right, how about dar? It can make an empty archive of your main backup to act as a reference (just file info, no files). Then it makes archives relative to that, with just changed files. It can then apply the changes to the original dir, including deletions, if you need that.

  8. Re:Finally! An interesting question. by colonel · · Score: 4, Insightful

    Forgot to mention:

    To accomplish this, you'll need to read up on:
    - bash
    - find
    - grep
    - awk
    - sed
    - md5sum
    - chmod/chown
    - mkdir -p
    - diff/patch (for general reference, and also look up binary diffing tools)

    Extra extra extra bonus points if you compress the changed files when storing them on the flash drive.
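    For the compression step, one option is to pack the changed files into a single gzip-compressed tar instead of copying them loose. A small Python sketch, where the function name and arguments are invented for the example:

```python
import os
import tarfile


def pack_changes(source_root, rel_paths, archive_path):
    """Write the listed files into a gzip-compressed tar archive.

    Each file is stored under its path relative to `source_root`, so
    extracting the archive over a restored tree puts files back in place.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in rel_paths:
            tar.add(os.path.join(source_root, rel), arcname=rel)
    return archive_path
```

    A tar archive also records each file's mode and modification time, which covers part of the permissions bookkeeping. Already-compressed data such as photos or music will not shrink much, so the gain depends on what actually changed.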

  9. Re:SkyDrive by partofme · · Score: 5, Insightful

    Actually, I'm not even a US citizen, and I travel in South East Asia. When talking about shitty internet, I know what shitty internet is. For example, when I'm staying in Cambodia, the internet can (and often does) go down for a whole day and night. The speed is also ridiculously slow. You can try to get around some of the downtime by getting mobile internet as a backup, but if there's a wider outage, there's nothing you can do.

    Yet, I've found Dropbox to be the best backup solution. Files will get there eventually, and I don't need to do anything. There's also revision history of files, so if you upload corrupted files or something like that you can reverse it. You can access them from other computers in case your laptop goes poof (happened to me). And the most important thing - if you get robbed or lose your luggage, you will still have access to your files (and of course, I keep my laptop encrypted).

    The good sides of online cloud backup far outweigh the negative ones or worries about bandwidth. Especially since most of the time the files that need backup aren't large. No one in their right mind would try to sync their media files.

  10. Re:SkyDrive by Lord Crc · · Score: 5, Informative

    Obviously Skydrive is of no use but there are several other alternatives that would be better suited to this purpose although if, as he says, it is for use while travelling an internet based system is useless.

    That's why I liked Crashplan when I first saw it. This may sound like a sales pitch but I'm just a happy customer.

    With Crashplan you can have multiple destinations for your backup set. I usually have three:
    - same HD in case I accidentally deleted some files.
    - USB HD for faster recovery in case my primary HD breaks.
    - Online "in the cloud", in case my house burns down etc.

    Crashplan detects when I plug in the USB HD and automatically starts updating the backup on it. If there's no internet the first two destinations will still keep me pretty safe. Once the internet is back it catches up on the cloud destination.

    It works just fine on my Linux Mint laptop as well as my Windows desktop PC.

  11. Re:SkyDrive by partofme · · Score: 3, Insightful

    Try keeping current on the status of Dropbox and SkyDrive services so you can pull your data before they disappear.

    Email? Twitter? Facebook? All kind of "push notification" technologies where you don't really need to do anything if you use them.

    Besides, we are talking about Microsoft here. A company that has ridiculously long phase-outs for its products as a standard practice so businesses feel safe using them (seriously, they announced that version 4.0 of Silverlight will see end of support two years from now). If there is any tech company in the world that you can trust not to end support suddenly, it's Microsoft.

  12. Re:SkyDrive by comp.sci · · Score: 3, Insightful

    For 99.9% of all users a backup is simply that, a failsafe in case their main HD gets lost or damaged. So what if Dropbox or SkyDrive suddenly were to go out of business (as unlikely as that is, you'd know in advance)? You suddenly lose access to that safety copy of your data and will know right away because the client cannot connect anymore. But you still have your primary copy of everything, nothing was lost, and you can just switch providers or change your backup strategy. The chances that something would happen in the window between the cloud provider failing and you making another copy with another provider are incredibly low. If you can't take that risk then you'd have a third backup anyways.