Slashdot Mirror


Ask Slashdot: Temporary Backup Pouch?

An anonymous reader writes "It looks simple. I've got a laptop and a USB HDD for backups. With rsync, I only move changes to the USB HDD for subsequent backups. I'd like to move these changes to a more portable USB stick when I'm away, then sync again to the USB HDD when I get home. I figured with the normality of the pieces and the situation, there'd be an app for that, but no luck yet. I'm guessing one could make a hardlink parallel-backup on the laptop at the same time as the USB HDD backup. Then use find to detect changes between it and the actual filesystem when it's time to back up to the USB stick. But there would need to be a way to preserve paths, and a way to communicate deletions. So how about it? I'm joe-user with Ubuntu. I even use grsync for rsync. After several evenings of trying to figure this out, all I've got is a much better understanding of what hardlinks are and are not. What do the smart kids do? Three common pieces of hardware, and a simple-looking task."

8 of 153 comments

  1. Re:Hardlinks by Anonymous Coward · · Score: 4, Funny

    Nobody pretended this to be the case. Not even the article's author. Please read again - and use a ghostwriter, you are not at the point of rolling your own comments. Not even close.

  2. Re:SkyDrive by Zemran · · Score: 4, Insightful

    As you say, the internet is really crap at times when you are travelling so why make life difficult? It is also fair to say that you obviously think of travelling as a bit of wandering around in the US. Once you broaden your horizons you will find that the internet is often not even an option.

    SkyDrive is not going to integrate with Ubuntu (have you read the summary yet?), so it is a stupid option, whereas there is a Dropbox client. It is still flaky and not going to be easy to use as required, so he is still better off doing something that will work well and therefore get done regularly. If he is using some client for a service that sometimes works and sometimes doesn't, you can guarantee that the time when he needs that backup will be one of the times that it did not work.

    --
    I love stacking my barbecues in the shed at the end of summer - you can't beat a bit of grill on grill action.
  3. Finally! An interesting question. by colonel · · Score: 5, Insightful

    First, ignore the people who encourage you not to try, and who point you in other directions. Sure, there are much better ways of doing this, but who cares? The whole point is that you should be able to do whatever you want -- and actually doing this is going to leave you _so_ much smarter, trust me.

    Some douche criticized you for not knowing beforehand why hard links wouldn't work. . . . because, you know, you should have been born knowing everything about filesystems. To hell with him, sally forth on your journey of discovery, this can be hella fun and you'll get an awesome feeling of accomplishment.

    First off, you're going to have trouble using rsync with the flash drive, because I assume your constraint is that you can't fit everything on the flash drive; it's only big enough to hold the differences.

    Next, come to terms with the fact that you'll need to do some shell scripting. Maybe more than just some, maybe a lot, but you can do it.

    I'd recommend cutting your hard drive in two -- through partitions or whatever -- to make sure that "system" is fully segmented from "data." No sense wasting all your time and effort getting backups of /proc/ and /dev/, or, hell, even /bin/ and /usr/. Those things aren't supposed to change all that much, so get your backups of /home/ and /var/ and /etc/ working first. Running system updates on the road is rarely worth it, and will be the least of your concerns if you end up needing to recover.

    Next, remind yourself how rsync was originally intended to work at a high level. It takes checksums of chunks of files to see which chunks have changed, and only transfers the changed chunks over the wire in order to minimize network use. Only over time did it evolve to take on more tasks -- but you're not using it for its intended purpose to begin with, since you're not using any network here. So rsync might not have to be your solution while travelling unless you start rsyncing to a personal cloud or something -- but its first principles are definitely a help as you come up with your own design.
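    As a rough illustration of the "checksums of chunks" idea, here is a minimal Python sketch. It is a simplification and not rsync's actual algorithm: real rsync pairs a rolling weak checksum with a stronger hash so it can match blocks at any offset, while this version only compares blocks at the same aligned position. The block size and function names are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one MD5 hex digest per fixed-size block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks in `new` that differ from the same block in `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = []
    for index, digest in enumerate(new_digests):
        # A block counts as changed if it is beyond the end of the old data
        # or its digest no longer matches.
        if index >= len(old_digests) or old_digests[index] != digest:
            changed.append(index)
    return changed
```

    Only the blocks listed by `changed_blocks` would need to be transferred; everything else is already on the other side.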

    The premise is that, while travelling, you need to know exactly what files have changed since your last full backup, and you need to store those changes on the flash drive so that you can apply the changes to a system restored from the full backup you left at home. You won't be able to do a full restore while in the field, and you won't be able to roll back mistakes made without going home, but I don't think either of those constraints would surprise you too much; you likely came to terms with them already.

    So, when doing the full backup at home, also store a full path/file listing with file timestamps and MD5 or CRC or TLA checksums either on your laptop or on the flash disk, preferably both.
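    A minimal sketch of such a listing, written in Python rather than the shell tools suggested in this thread. The function names, the JSON file format and the choice of MD5 are assumptions for illustration, not part of the original suggestion.

```python
import hashlib
import json
import os


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> dict:
    """Map each relative file path under root to [mtime_ns, size, md5]."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, sockets and other special files
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = [info.st_mtime_ns, info.st_size, file_md5(full)]
    return manifest


def save_manifest(manifest: dict, out_path: str) -> None:
    """Write the manifest as JSON so a later run can load and compare it."""
    with open(out_path, "w", encoding="utf-8") as out:
        json.dump(manifest, out, sort_keys=True, indent=0)
```

    Running `build_manifest` on /home right after the full backup, and saving the result on both the laptop and the flash disk, would give the baseline report described above.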

    Then, when running a "backup" in the field, have your shell script generate that same report again, and compare it against the report you made with the last full backup. If the script detects a new file, it should copy that file to the flash disk. If the script detects a changed timestamp, or a changed checksum, it should also copy over the file. When storing files on the flash disk, the script should create directories as necessary to preserve paths of changed/new files.
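    The compare-and-copy step could look roughly like the Python sketch below. It assumes each report has already been loaded as a dictionary mapping a relative path to a record such as (timestamp, checksum); that layout and the function names are hypothetical.

```python
import os
import shutil


def changed_or_new(baseline: dict, current: dict) -> list[str]:
    """Paths in current that are missing from baseline or whose record differs."""
    return sorted(
        path
        for path, record in current.items()
        # tuple() lets a record loaded from JSON (a list) compare equal to a tuple
        if path not in baseline or tuple(baseline[path]) != tuple(record)
    )


def copy_preserving_paths(paths, source_root: str, stick_root: str) -> None:
    """Copy each relative path from source_root to the same relative path on the stick."""
    for rel in paths:
        target = os.path.join(stick_root, rel)
        # Create intermediate directories so the original layout is kept.
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        # copy2 also carries over timestamps and permission bits where it can.
        shutil.copy2(os.path.join(source_root, rel), target)
```

    Because any difference in the record triggers a copy, a changed timestamp alone is enough to put the file on the flash disk, which matches the behaviour described above.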

    For bonus points, if the script detects a deleted file, it should add it to a list of files to be deleted. For extra bonus points, it should store file permissions and ownerships in its logfiles as replayable commands.
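    Both bonus items can be sketched in a few lines of Python. The dictionary-based reports and the function names are assumptions for illustration; the generated lines use the standard chmod and chown commands with numeric modes and IDs, and are meant to be replayed from the root of the restored tree.

```python
import os
import shlex
import stat


def deleted_paths(baseline: dict, current: dict) -> list[str]:
    """Paths present in the baseline report but missing from the current one."""
    return sorted(set(baseline) - set(current))


def permission_commands(root: str, rel_paths) -> list[str]:
    """Return chmod/chown command lines that would restore mode and numeric owner."""
    commands = []
    for rel in rel_paths:
        info = os.stat(os.path.join(root, rel))
        mode = stat.S_IMODE(info.st_mode)  # permission bits only, no file type
        # The "./" prefix keeps names starting with "-" from looking like options.
        quoted = shlex.quote(os.path.join(".", rel))
        commands.append(f"chmod {mode:04o} {quoted}")
        commands.append(f"chown {info.st_uid}:{info.st_gid} {quoted}")
    return commands
```

    Writing the output of `deleted_paths` to a text file on the flash disk gives the restore step the one piece of information that the copied files alone cannot carry.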

    The script would do a terrible job at being "efficient" for renamed files, but the same is true for rsync, so whatevs.

    I built a very similar set of scripts for managing VMware master disk images and diff files about ten years ago, and it took me two 7hr days of scripting/testing/documenting -- this should be a similar effort for a 10-yr-younger me. I learned *so* much in doing that back then that I'm jealous of the fun that you'll have in doing this.

    Of course, document the hell out of your work. Post it on SourceForge or something, GPL it, put it on your resume.

  4. Re:Finally! An interesting question. by colonel · · Score: 4, Insightful

    Forgot to mention:

    To accomplish this, you'll need to read up on:
    - bash
    - find
    - grep
    - awk
    - sed
    - md5sum
    - chmod/chown
    - mkdir -p
    - diff/patch (for general reference, and also look up binary diffing tools)

    Extra extra extra bonus points if you compress the changed files when storing them on the flash drive.
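    For the compression step, one possible approach is a single gzip-compressed tar archive, sketched here with Python's standard tarfile module instead of the shell tools listed above. The function name and arguments are hypothetical.

```python
import os
import tarfile


def archive_changes(paths, source_root: str, archive_path: str) -> None:
    """Write the given relative paths into a gzip-compressed tar archive.

    Each file is stored under its relative name, so unpacking the archive
    at the root of a restored tree puts every file back in place.
    """
    with tarfile.open(archive_path, "w:gz") as archive:
        for rel in paths:
            archive.add(
                os.path.join(source_root, rel),
                arcname=rel,
                recursive=False,  # add exactly the listed entries, nothing more
            )
```

    A tar archive also records mode, owner and modification time for each entry, which may cover part of the permissions bookkeeping.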

  5. Re:SkyDrive by partofme · · Score: 5, Insightful

    Actually, I'm not even a US citizen, and I travel in South East Asia. When talking about shitty internet, I know what shitty internet is. For example when I'm staying in Cambodia, internet can (and often does) go down for the whole day and night. The speed is also ridiculously slow. You can try to get around some of the downtimes by getting mobile internet for backup, but if there's a wider outage, there's nothing you can do.

    Yet, I've found Dropbox to be the best backup solution. Files will get there eventually, and I don't need to do anything. There's also revision history of files, so if you upload corrupted files or something like that you can reverse it. You can access them from other computers in case your laptop goes poof (happened to me). And the most important thing - if you get robbed or lose your luggage, you will still have access to your files (and of course, I keep my laptop encrypted).

    The good sides of online cloud backup far outweigh the negative ones or worries about bandwidth. Especially since most of the time the files that need backup aren't large. No one in their right mind would try to sync their media files.

  6. Re:SkyDrive by Lord Crc · · Score: 5, Informative

    Obviously Skydrive is of no use but there are several other alternatives that would be better suited to this purpose although if, as he says, it is for use while travelling an internet based system is useless.

    That's why I liked Crashplan when I first saw it. This may sound like a sales pitch but I'm just a happy customer.

    With Crashplan you can have multiple destinations for your backup set. I usually have three:
    - same HD in case I accidentally deleted some files.
    - USB HD for faster recovery in case my primary HD breaks.
    - Online "in the cloud", in case my house burns down etc.

    Crashplan detects when I plug in the USB HD and automatically starts updating the backup on it. If there's no internet the first two destinations will still keep me pretty safe. Once the internet is back it catches up on the cloud destination.

    It works just fine on my Linux Mint laptop as well as my Windows desktop PC.

  7. Re:Unison? by DrVxD · · Score: 5, Informative

    rsync doesn't handle deletions

    rsync handles deletions just fine - that's why it has a --delete option...
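    For reference, a minimal Python wrapper around such a call might look like the sketch below. The --archive, --delete and --dry-run options are standard rsync flags; the function names and the paths in the usage note are made up for the example. The dry run is on by default because --delete removes files from the destination.

```python
import subprocess


def rsync_mirror_command(source_dir: str, dest_dir: str, dry_run: bool = True) -> list[str]:
    """Build an rsync command that mirrors source_dir into dest_dir, deletions included."""
    command = ["rsync", "--archive", "--delete"]
    if dry_run:
        command.append("--dry-run")  # report what would change without touching dest_dir
    # A trailing slash on the source means "the contents of this directory".
    command += [source_dir.rstrip("/") + "/", dest_dir]
    return command


def run_mirror(source_dir: str, dest_dir: str, dry_run: bool = True) -> None:
    """Run the mirror; raises CalledProcessError if rsync exits with an error."""
    subprocess.run(rsync_mirror_command(source_dir, dest_dir, dry_run), check=True)
```

    With hypothetical paths, `run_mirror("/home/joe", "/media/usbhdd/joe", dry_run=False)` would make the USB HDD copy match the laptop, removing files that no longer exist on the laptop.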

    --
    Not everything that can be measured matters; Not everything that matters can be measured.
  8. Re:unison-gtk by Anonymous Coward · · Score: 4, Informative

    I think people (including you) don't understand what he needs. He has a complete backup at home. When he's on the move, he wants to back up only modifications that are not already backed up at home, so that the backup fits on a USB stick. To know what has and hasn't changed, he can't access the backup at home, like rsync would need to do. His idea was to have a space-saving complete copy of the backup on his laptop via hard links. You might think that file modification times could be used, but both solutions leave the problem of communicating file deletion. Suppose he needs to recover. He would copy his home backup to the new drive and then he would have to integrate the incremental backup. How would the incremental backup keep the information about deleted files without access to the base backup? I suppose one could keep a recursive directory listing with the incremental backup, but that's the question: Is there a ready-made solution for this?
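    This is not the ready-made solution asked for, but the recovery step itself is small. Below is a minimal Python sketch, assuming the stick holds the changed files under their original relative paths plus a separate list of deleted paths. Both the layout and the function name are hypothetical.

```python
import os
import shutil


def apply_incremental(stick_files_root: str, deleted: list[str], restored_root: str) -> None:
    """Overlay the stick's files onto a restored base backup, then apply deletions."""
    # Step 1: copy every changed or new file over the restored base.
    for dirpath, _dirnames, filenames in os.walk(stick_files_root):
        for name in filenames:
            source = os.path.join(dirpath, name)
            rel = os.path.relpath(source, stick_files_root)
            target = os.path.join(restored_root, rel)
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            shutil.copy2(source, target)
    # Step 2: remove files that were deleted after the base backup was taken.
    for rel in deleted:
        target = os.path.join(restored_root, rel)
        if os.path.lexists(target):  # tolerate entries that are already gone
            os.remove(target)
```

    The deletion list is what answers the question above: it is produced on the laptop by comparing directory listings, so the stick never needs access to the base backup.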