Linux Backups Made Easy
mfago writes "A colleague of mine has written a great
tutorial on how to use rsync to create automatic "snapshot-style" backups. Nothing is required except for a simple script, although it is thus not necessarily suitable for data-center applications. Please try to be gentle on his server: it is the $80 computer that he mentions in the tutorial. Perhaps try the Google cache." An excellent article answering a frequently asked question.
I had the chance to be the first post, but decided to mirror the site first.
My mirror is here
"Live Free or Die." Don't like it? Then keep out of the USA
That's probably one good reason.
It's an expression, it's not particularly abusive.
rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
cp -al backup.0 backup.1
rsync -a --delete source_directory/ backup.0/
There. That's the script basically. Add more snapshot levels as needed, stick it in cron at whatever interval you need.
dump only supports ext2/3. This supports any file system, and retreiving a file from backups is as simple as running "cd" to the directory of the snapshot you need and "cp" the file out.
I run backups from Linux to IRIX and other UNIXs using gnu rsync and openssh. This little trick is going to be very handy for me. I can't waste my time worrying about which filesystem type the files came from originally.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
The backup scheme described here uses hard links to avoid storing multiple copies of identical files, but when a large file changes even in a small way it stores a whole fresh copy of that file. rdiff-backup is more efficient because it stores one complete copy of your current tree with reverse diffs that allow you to step back to previous versions if you need to. If a large file changes in a small way, only the reverse diff is stored to encode that. This is very handy for cases where, for example, a multiple megabyte e-mail inbox has had just a few kilobytes of new messages appended to the end (although the rsync/rdiff-backup algorithm is also efficient with changes in the middle of a file). Being more efficient in this way translates directly to an increase in the number of past versions you can fit in the same space which can make all the difference if it takes you a while to realize that a given file has been accidentally deleted or damaged.
http://rdiff-backup.stanford.edu/
A few years ago I saw a neat (expensive!) disc array that could 'freeze' the disc image at a single point in time so that a backup could be taken from the frozen image. The backup software saw only the frozen image, while the rest of the OS saw the disc as normal including updates made after the freeze occurred. The disc array maintained the frozen image until the backup was complete, guaranteeing a true snapshot as at a specific instant in time.
I wonder whether such a thing would be possible in software. Possibly it can even be done through cunning application of the tools that we already have. I imagined that you might be able to do something like it by extending the loopback device interface. Does anyone out there have any cunning ideas?
The method Mike describes does not create snapshots, so you can't use it to create consistent backups: Files can be written while they are read by rsync, and lots of software (including databases) requires cross-file data consistency (some broken software even expects permanent inode numbers!). rsync can be used for backups (if you trust the algorithm), but in most cases, you have to do other things to get a proper backup.
At home, I store xfsdump output encrypted with GnuPG on an almost public (and thus untrusted) machine with lots of disk space (on multiple disks). At work, I do the same, but the untrusted machine is in turn backed up using TSM. In both cases, incremental backups work in the expected way. Of course, all this doesn't solve the snapshot problem (I'd probably need LVM for that), but with the encryption step, you can more easily separate the backup from your real box (without worrying too much about the implications).
Are you serious? Crush some guys server rather than using the publically available Google copy, because the Google page DOESN'T HAVE ADS?????? Who pays this guy for his server and bandwidth?? Do you make sure every page you view has ads on it? Are you a marketing exec or something??
This "ads pay for everything on the internet" mentality is INSANE!!
The site was never down; it's just that my roommate, a windows user, noticed the connection was slow and reset the cable modem. He's quite upset about being unable to play Warcraft III. :)
I've never had a slashdot nick before, so I just created this one today. I'll try to go through some of the comments and provide useful feedback.
Thanks for your interest everyone!
Mike