Backing up a Linux (or Other *nix) System
bigsmoke writes "My buddy Halfgaar finally got sick of all the helpful users on forums and mailing lists who keep suggesting backup methods and strategies to others which simply don't, won't and can't work. According to him, this indicates that most of the backups made by *nix users simply won't help you recover, while you'd think that disaster recovery is the whole point of doing backups. So, now he explains to the world once and for all what's involved in backing up *nix systems."
I'd say he hasn't seen the "dump" command on FreeBSD:
http://www.freebsd.org/cgi/man.cgi?query=dump&apropos=0&sektion=0&manpath=FreeBSD+6.1-RELEASE&format=html
I still use tar, but ideally I'd like to use dump. As it is now, each server makes its own backups, copies them to a central server, which then dumps them all to tape. The backup server also holds one previous copy in addition to what got dumped to tape. It has come in handy on many occasions.
It does take some planning, though.
The article seems like a good one, though I think it may be a little too cautious. I would need to hear some real world examples before I would give up on incremental backups. Being able to store months worth of data seems so much better than being only able to store weeks because you aren't doing incremental backups.
One thing not mentioned is encryption. The backups should be stored on a medium or machine separate from the source. In the case of a machine, you will likely be backing up more than one system. If it is a centralized backup server, then all someone has to do is break into that system and they have access to the data from all the systems. Hence encrypted backups are a must in my book. The servers should also push their data to the backup server, as a normal user on the backup server, instead of the backup server pulling it from the servers.
I used to use hdup2, but the developer abandoned it for rdup. The problem with rdup is it writes straight to the filesystem. Which brings up all kinds of problems, like the ones mentioned in the article. Lately I have been using duplicity. It does everything I want it to. I ran into a few bugs with it, but once I worked around them it has worked very well for me. I have been able to do restores on multiple occasions.
Havoc Penington, the bane of my Linux desktop.
http://www.amanda.org/
Does the trick for my organization.
Mondoarchive works pretty well for backing up a Linux system. It uses your existing kernel and other various OS parts to make a bootable set of backup disks (via Mindi Linux), which you can use to restore your partitions and files in the event of a crash.
Visual IRC: Fast. Powerful. Free.
The article has been up for over 20 minutes and still no RTFM followed by a cryptic dd command? For shame.
Cron based backup with compression/encryption, rewind, bitlevel verify, send email re: success/failure.
Add a scsi controller, and Drive Of Your Choice, and sleep well.
Technology -- No Place For Wimps! Grateful Dead and Jerry Garcia Chatroom -- http://www.wemissjerry.org
Amanda
Don't you have someone you'd die for?
I have come to the conclusion that, unless a tape backup solution is necessary, it is often easier to back up to a remote machine. Sure, archive to tape once in a while, but for the primary requirement of a backup... rsync your data to a separate machine with a large and cheap RAID array.
I use a wonderful little tool/script called rsnapshot to back up our servers to a remote location. It's fast, as it uses rsync and only transmits the portions of files that have changed. It's effortless to restore, as the entire directory tree appears in each backup folder using hard links, and it's rock solid.
Essentially, the best part of this solution is its low maintenance and the fact that restorations require absolutely no manual work. I even have an intermediate backup server that holds a snapshot of our users' home directories... my users can connect to the server via a network share and restore any file that has existed in their home directory in the last week by simply copying and pasting it... changed files are backed up every hour.
Sure, the data is not as compressed as it could be in some backup solutions, and it's residing on a running server so it's subject to corruption or hack attempts. But my users absolutely love it. And it really doesn't waste much space unless a large percentage of your data changes frequently, which would consume a lot of tape space as well.
Sometimes the best solution is to stop wasting time looking for an easy solution.
A comment about sparse files:
99% of the time there is only one sparse file of any significance on your machine:
/var/log/lastlog
Unless you really care about the timestamp of each user's prior login, you can safely exclude this file from the backup. Following a restore, "touch /var/log/lastlog" and the system will work as normal.
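For anyone who wants to check which files are sparse before deciding what to exclude, here is a minimal Python sketch. It assumes a *nix system where st_blocks is reported in 512-byte units. The helper names and the demo file are made up for illustration, and whether the filesystem really leaves the hole unallocated varies, so treat the result as a hint rather than proof.

```python
import os
import tempfile


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks counts 512-byte units


def looks_sparse(path):
    """Heuristic: fewer bytes allocated than the file claims to contain."""
    apparent, allocated = disk_usage(path)
    return allocated < apparent


# Demo: write a single byte 1 MiB into an otherwise empty file.
demo_path = os.path.join(tempfile.mkdtemp(), "lastlog-like")
with open(demo_path, "wb") as f:
    f.seek(1024 * 1024)
    f.write(b"\x01")

apparent, allocated = disk_usage(demo_path)
print(f"apparent={apparent} allocated={allocated} sparse={looks_sparse(demo_path)}")
```

A backup tool that does not understand holes would store the full apparent size, which is why a file like lastlog can balloon in an archive.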
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
Hi there,
A quick reply from the author of the article before I go to sleep:
About dump. So, that's a FreeBSD command? I've always suspected it existed, doing the very thing the man page described, because of the dump field in /etc/fstab. But I have never actually seen a machine which had the dump command... It's possibly not very safe BTW. If it works like DOS's Archive bit, then it can't be trusted: it can be set manually. Some DOS apps even used it as a copy protection mechanism...
The suggestions (for software and the rest). The comments are very much appreciated. I'll investigate them and adjust the article accordingly. For now, I have to sleep :)
Oh, one more thing, encryption. I was in doubt whether to include it or not. I use different encryption schemes for my backups (LUKS for external HD and GPG for DVD burning), but I decided this can be left to the reader. I may include a chapter on it, after all.
dd if=/dev/ of=/root/backup && gzip -9 /root/backup
I would figure that would work if you're looking at 'disaster recovery' type situations for *restoring* the machine to its previous state. Of course if you wind up with a drive with different geometry after the 'disaster' (and unless this is some kind of 'big' server with a pricetag and support contract to match, you will) then I think you're still boned...
Personally I just back up the 'critical' files I need to rebuild my system (talking purely workstation here) to alternate storage (external HD, network transfer to another system, USB thumbdrive, etc) because I can rebuild/reinstall the OS... the data in my home directory and the relevant configs for my system are the items I'd rather not lose.
So you need a carefully-written, carefully-reviewed, carefully-tested procedure, and you need lockfiles to guarantee that it's not being run twice at once, that nothing else starts the server you shut down while the backup is going, etc. A lot of sysadmins screw this up - they'll do things like saying "okay, I'll run the snapshot at 02:00 and the backup at 03:00. The snapshot will have finished in an hour." And then something bogs down the system and it takes two, and the backup is totally worthless, but they won't know until they need to restore from it.
These systems put a lot of effort into durability by fsync()ing at the proper time, etc. If you just copy all the files in no particular order with no locking, you don't get any of those benefits. Your blind copy operation doesn't pay any attention to that sort of write barrier or see an atomic view of multiple files, so it's quite possible that (to pick a simple example) it copied the destination of a move before the move was complete and the source of the move after it was complete. Oops, that file's gone.
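The lockfile advice above, plus "don't schedule the copy by the clock, run it only after the snapshot has finished", can be sketched roughly like this in Python. It is only an illustration: it assumes a *nix system with flock(), and the names exclusive_lock and run_backup and the two placeholder steps are hypothetical, not anybody's actual procedure.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive, non-blocking lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Raises BlockingIOError if another run already holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def run_backup(lock_path, steps):
    """Run the steps strictly in order under one lock; a failing step aborts the rest."""
    with exclusive_lock(lock_path):
        for step in steps:
            step()


# Demo with placeholder steps standing in for "take snapshot" and "copy snapshot".
lock_path = os.path.join(tempfile.mkdtemp(), "backup.lock")
log = []
run_backup(lock_path, [lambda: log.append("snapshot"), lambda: log.append("copy")])
print(log)
```

Because the copy step is called only after the snapshot step returns, a slow snapshot delays the backup instead of silently invalidating it, and a second invocation fails loudly rather than running on top of the first.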
We've been using Amazon's S3. It has a great API, pretty easy to use. I was concerned about storing sensitive data there, but we worked out a good encryption scheme (that I won't detail) and now I'm able to really restore everything from anywhere with no notice. My city could sink into the ocean and I could be in Topeka, and I could bring things back up as long as I had a credit card.
Nothing great was ever achieved without enthusiasm
OK, my slashdot noobness is revealed. Here's the post again...
:)
"I think his complaints are no longer relevant. rdiff-backup has a --compare-hash option, though I haven't checked the details. Maybe the author should give it another look.. "
The hash is stored in the meta information, and the compare option does only that, comparing the live system to your archive. It does not say anything about the change-detection behaviour used during a backup.
"Besides, if you have an accurate timeserver (you should! time is unbelievably important to software in general!), the timestamp check is pretty safe, barring maliciousness. And if your machine has been compromised, the data coming off it should not be trusted in general. This is just one more case of that."
No, it's not (safe, I mean). Do this:
touch a b
edit a and b to be the same length but different content
stat a b
mv b a
stat a
a will now have the mtime b had first. mtime+size is not changed, file is not backed up.
This is a danger in my opinion.
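The steps above can be reproduced with a short Python sketch. It assumes both files end up with the same mtime (forced here with os.utime, standing in for two files written within the same second); the file names and helper functions are just for illustration. It compares what a size+mtime check sees against a content hash and the inode number.

```python
import hashlib
import os
import tempfile


def size_and_mtime(path):
    """The (size, mtime) pair that a quick change check compares."""
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns


def sha256_of(path):
    """Content hash, which changes whenever the bytes change."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


workdir = tempfile.mkdtemp()
a = os.path.join(workdir, "a")
b = os.path.join(workdir, "b")
with open(a, "w") as f:
    f.write("old contents\n")
with open(b, "w") as f:
    f.write("new contents\n")  # same length, different bytes

# Assumption: both files carry the same mtime, e.g. written in the same second.
stamp_ns = 1_150_000_000 * 10**9
os.utime(a, ns=(stamp_ns, stamp_ns))
os.utime(b, ns=(stamp_ns, stamp_ns))

quick_before, hash_before, inode_before = size_and_mtime(a), sha256_of(a), os.stat(a).st_ino
os.replace(b, a)  # equivalent of `mv b a`
quick_after, hash_after, inode_after = size_and_mtime(a), sha256_of(a), os.stat(a).st_ino

print("size+mtime unchanged:", quick_before == quick_after)
print("hash unchanged:", hash_before == hash_after)
print("inode unchanged:", inode_before == inode_after)
```

The size+mtime pair stays identical across the move while the hash and the inode number both change, so a tool that only compares size and mtime would skip the file.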
"Have you looked at brackup? It seems promising, anyway, but I haven't actually tried it. Maybe when it's a little more mature..."
No, I will have a look. But as I said a few posts below, I'll have to go to sleep now
Encryption, Compression, Bit-Level verification, Bootable disaster recovery, Commercial support...
http://www.microlite.com/
http://www.sweetnam.eu/mediawiki/index.php/Using_Netcat_for_Backup
backing up your system with bash, tar and netcat
Listen to my music.
True, but my assumption (which again, I haven't checked) is that they wouldn't have stored this hash if they weren't doing something with it. I don't think the sanity check uses any information that's not gathered for normal operation.
True. Your backup from before the move will be correct, so if you were to catch this before you got rid of the pre-move increment, you'd have a way to recover manually. I assume you're talking about after that. Yeah, there's a problem, but I'd say it's an incredibly minor failure when compared to not having incremental backups at all. I've sometimes gone for several backup cycles before realizing anything was wrong, so it would be difficult to convince me they're not worthwhile.
Do you have a real-world example where this might happen? The best I've got is moving messages in Cyrus IMAP - if they were both placed in the same second and have the same length, and you moved one away and the other into its folder before any other mail arrived there, I guess this would happen. I just consider that sequence pretty unlikely, and the consequences not too severe.
Ghost (as in Symantec/Norton/whoever owns it these days) works to back up a Linux partition just fine. I know, I know, it's not free as in beer, speech, or ipod, but if you do already have it around, it does the job fine.
Section 4 brings up the issue of data files from running applications and agrees with your recommendation of pg_dump or shutting down to do the backup.
Section 7 recommends syncing and sleeping and warns "consider a tar backup routine which first makes the backup and then removes the old one. If the cache isn't synced and the power fails during removing of the old backup, you may end up with both the new and the old backup corrupted".
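The failure mode quoted from Section 7 can be avoided by making sure the new backup is on stable storage before the old one is touched. A minimal Python sketch of that ordering follows; the function name, the ".partial" suffix and the demo paths are assumptions for illustration, and it targets a *nix system where a directory can be fsync()ed.

```python
import os
import tempfile


def replace_backup(new_bytes, final_path, old_path=None):
    """Write the new backup durably, and only then remove the old one."""
    tmp_path = final_path + ".partial"
    with open(tmp_path, "wb") as out:
        out.write(new_bytes)
        out.flush()
        os.fsync(out.fileno())        # file data reaches the disk before we continue
    os.replace(tmp_path, final_path)  # atomic rename within one filesystem
    dir_fd = os.open(os.path.dirname(final_path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)              # make the rename itself durable
    finally:
        os.close(dir_fd)
    if old_path and os.path.exists(old_path):
        os.remove(old_path)           # the old copy goes last


# Demo with tiny stand-in "archives".
workdir = tempfile.mkdtemp()
old_backup = os.path.join(workdir, "backup-old.tar")
new_backup = os.path.join(workdir, "backup-new.tar")
with open(old_backup, "wb") as f:
    f.write(b"old archive")
replace_backup(b"new archive", new_backup, old_path=old_backup)
print(sorted(os.listdir(workdir)))
```

If the power fails at any point before the last step, at least one complete backup is still on disk.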
I deal with some aggregate 2 terabytes of storage on my home file servers. What works for me won't work for an enterprise corporate data center, but maybe some things are useful...
I think the article does a good job of explaining how to backup, but maybe just as important is "why?". There are some posts that say put everything on a RAID or use mirror or dd. What they fail to address is one important reason to backup: human error. You may wipe a file and then a week later need to recover it. If all you're doing is mirroring or RAID, no matter how reliable, your backups are worthless.
There's also different classes of data. I have gigabytes of videos. Some are transcoded DVDs, some are raw footage. If I lose all my transcoded DVDs it's not as critical as if I lost raw footage. Why? The DVDs can be re-ripped. It will take a long time but the data can be recreated. For the raw footage it's different, even if I keep the original Mini-DV tapes, because re-recording the video from tape won't guarantee that the file is identical. If the file is different then the edits will be different. Then there's also mail spools, CVS, personal files, etc..
What I've found is that I archive my DVD rips once every few months. Other stuff is backed up once a week to another file server.
I couldn't care less about the OS. The file server runs FedoraCore5. The only thing I keep is the Kickstart file so that I can rebuild it within a matter of minutes then restore the data from archives. This is just a matter of copying a samba configuration and restarting.
For the web server, all content is kept within CVS. If the web server fails, it's just a matter of rebuilding the image and pulling the latest copy from CVS. Fifteen minutes to re-image the OS. Five minutes to pull down the latest content.
For DNS, initial configuration for 8 domains is done by a perl script that auto-creates the named.conf and all zone files. Then I just append the host list to the primary domain. Ten minutes at most.
Home directories are centralized on a file server using OpenLDAP and automounts. One filesystem to back up makes it easy. Being easy means it gets done automatically.
Other "machines" are virtual and these are copied to DVD whenever something drastic changes (e.g., major upgrade).
What can I say? I just did two successful system restores today from my "tar cjlf /" created system backups. I did several more in the last few years. Never had problems. I think this guy is just trying to sound mysterious and knowledgeable....
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Works fine with my autoloaders, and it's open source.
Quo usque tandem abutere, Nimbus, patientia nostra?
The article addressed a question that has been nagging at the back of my mind but I haven't gotten around to figuring out the answer to. I like the way that the article is to the point, and very in depth. The author does a good job of explaining the various aspects of the files and the importance of preserving them, and then goes on to detail the steps necessary to preserve them.
He is complaining about people who suggest backing-up with "tar cvz /" but really, the only thing missing is the: "p". I use it extensively and it just works (not for databases, but that should go without saying).
In order to ensure I'm never in a tough spot, I made a custom bootable image using my distro's kernel and utilities. Then I made a bzip2 -9 compressed tar backup of my notebook hard drive, which is just small enough to fit on a single CD... (With DVD-Rs these days, the situation is even better).
After burning it all to CD, I restored from it, and it's still working perfectly to this day. Now I can be sure that no matter what unforeseen events happen, I'll never be stranded with a non-working notebook due to software problems, and a CD is notably lighter than carrying a second, backup notebook.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Here's a real life case in point that I came across with a Fortune 500 company. This company had recently acquired a small startup, whose system administration skills were lacking. Before moving to the new facilities after being acquired, one of the V.P.'s told people not to bother copying their home directories from the server. They then proceeded to simply shut off the power to the server without doing a proper shutdown.
The server was using a Reiser FS, and this filesystem was spectacularly fried when it arrived at the big company. Tons of supposedly petabyte file sizes, and even more in the terabyte range (clearly exceeding the size of the disk array). In the end, over 12,000 files and directories would end up under lost+found.
Anyway, it was decided to backup the filesystem before attempting to recover the files. Absolutely everything broke when trying to do this, as Linux doesn't handle petabyte (or even terabyte) files properly. There are subtle problems with all of the utilities (find, ls, cp, cpio, and tar, to name just a few). While this isn't surprising, when you're trying to make a backup, it presents a serious problem.
Were we dependent upon a closed-source solution, we would've been seriously stuck waiting for a fix.
In the end, I actually had to modify GNU tar to handle these problems. This was particularly amusing, as tar hadn't been modified in years. But it was the only way out of the situation in a timely fashion.
The point here is that you have this option for really nasty disaster scenarios if you are familiar with Open Source tools.
The best way to predict the future is to create it. - Peter Drucker.
It all depends on the skill of the admin. Personally, I get by with
tar czf backup_$date.tar.gz /etc /var /home, and burn to DVD
There is about 90% of the system backed up. I don't bother with apps, since those are generally archived. Most config stuff is properly placed in /etc somewhere.
For 'troublesome' apps, that's where rpm comes in, or better yet, apt-get (ie: anything built from source, with custom mods).
This works if the admin is skilled enough. For those who simply don't have the skills/time/energy, there is always Amanda/snapshots/Ghost, etc.
Depends on how much time, money, storage space one has after all.
Signed
The Helpful People on forums and mailing lists
Slashdot, where armchair scientists get shouted down and armchair theologians get modded up.
When you work in a large environment, you start to develop a different idea about backups. Strangely enough, most of these ideas work remarkably well on a small scale as well.
tar, gtar, dd, cp, etc. are not backup programs. These are file or filesystem copy programs. Backups are a different kettle of fish entirely.
Amanda is a pretty good option. There are many others. The tool really isn't that important other than that (a) it maintains a catalog, and (b) it provides comprehensive enough scheduling for your needs.
The schedule is key. Deciding what needs to get backed up, when it needs to get backed up, how big of a failure window you can tolerate, and such is the real trick. It can be insanely difficult when you have a hundred machines with different needs, but fundamentally, a few rules apply to backups:
For backups:
1) Back up the OS routinely.
2) Back up the data obsessively.
3) Document your systems carefully.
4) TEST your backups!!!
For restores:
1) Don't restore machines--rebuild.
2) Restore necessary config files.
3) Restore data.
4) TEST your restoration.
All machines should have their basic network and system config documented. If a machine is a web server, that fact should be added to the documentation but the actual web configuration should be restored from OS backups. Build the machine, create the basic configuration, restore the specific configuration, recover the data, verify everything. It's not backups, it's not a tool, it's not just spinning tape; it's the process and the documentation and the testing.
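One narrow slice of "TEST your backups" can be automated: checking that every file in an archive still matches its source. The Python sketch below does that for a tar archive using SHA-256. The function name and the demo layout are assumptions for illustration, and it is no substitute for an actual restore onto a rebuilt machine, which is the test that counts.

```python
import hashlib
import os
import tarfile
import tempfile


def verify_archive(archive_path, source_root):
    """Return archive member names whose content is missing or differs under source_root."""
    problems = []
    with tarfile.open(archive_path) as archive:
        for member in archive:
            if not member.isfile():
                continue
            archived = hashlib.sha256(archive.extractfile(member).read()).hexdigest()
            try:
                with open(os.path.join(source_root, member.name), "rb") as f:
                    live = hashlib.sha256(f.read()).hexdigest()
            except FileNotFoundError:
                live = None
            if archived != live:
                problems.append(member.name)
    return problems


# Demo: archive a small tree, verify it, then change a file and verify again.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "data"))
notes = os.path.join(root, "data", "notes.txt")
with open(notes, "w") as f:
    f.write("version 1")
archive_path = os.path.join(root, "data.tar.gz")
with tarfile.open(archive_path, "w:gz") as archive:
    archive.add(os.path.join(root, "data"), arcname="data")

clean_result = verify_archive(archive_path, root)
with open(notes, "w") as f:
    f.write("version 2")
changed_result = verify_archive(archive_path, root)
print(clean_result, changed_result)
```

Run right after the backup, an empty list means the archive is readable and matches what was backed up; anything else is worth a look before the archive is trusted.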
And THAT'S how you save 63 billion dollar companies.
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
I like to use Righteous Backup: http://www.r1soft.com/
It's new, but it shows a lot of promise. It uses a kernel module to take consistent backups of partitions at the file system block level and store them on a remote server. The cool part is, it tracks changes. If you haven't rebooted your machine since the last backup, it takes a few seconds to send the changed blocks and almost no CPU usage. It can also interpret the file system in any incremental backup to restore individual files. Not to mention backing up the entire MBR and partition tables of disks. I don't think anything like it currently exists for Linux servers.
Dirvish: written in Perl and using rsync, it is a fast disk-to-disk backup. Enjoy.
A computer once beat me at chess, but it was no match for me at kick boxing. Emo Philips
My personal favorite is to swap a pair of disks in and out of a super-redundant RAID every week. Simple, predictable, and works fine in the face of lots of small files.
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdf1[6] sdb1[4] sdd1[3] sdc1[2] sde1[1]
488383936 blocks [6/4] [_UUUU_]
[============>........] recovery = 61.6% (301244544/488383936) finish=231.7min speed=13455K/sec
# mount | grep backup
/dev/sdg1 on /backup type reiserfs (ro)
After the plethora of stinky windows based backup systems in the mid to late 90's . . .
. . . I used Legato as both client and server, on both windows and linux. None of the possible combinations ever gave me any real joy. It backed stuff up, and test restores went ok, if you consider how it had to work o.k. I lived with it for 4 or 5 years.
Found http://www.arkeia.com/. Never been happier. I run the server on linux. Using autoloader stacks. First set of *sane* backup configuration patterns I've had the pleasure to work with. I don't suggest the windows "JUI". Not my thing anyway, but still.. I look forward to them getting that part right, if they haven't already in the latest version.
If you're looking for non-commercial... I just use tar and netcat, similar to what is described in posts above. I've used a few of the backup tools available via the Debian repository, but most really do *more* than what I really want for my personal systems.
At work I use SGI IRIX boxes with the (default) XFS filesystem. xfsdump is fantastic for dumping whole filesystems. You can periodically do a level 0 (full) backup and then subsequently do incremental backups from that. I have used it to restore full systems a couple of times and the restores have worked flawlessly.
I also use XFS and xfsdump on a Debian stable box which is a large media server. I regularly test a restore, and the md5 sums of all the files have been just right :)
XFS and xfsdump rock my world.
The hash information feature was included after I suggested a feature for hash-change-checking. The hash is already stored, because that was easy to do, but the change checking never got implemented.
It is very unlikely, yes. I know incremental backups can be very handy, so it's perfectly logical people find it's worth the "risk". I don't know of any concrete significant real world examples, I just don't like situations where the theory shows that the backup may miss files.
I know it's kind of an extreme example, but would you go up in an airliner of which you know the software would crash at a certain speed, angle and fuel consumption combination?
So his complaint about GNU Tar is that it requires you to remember options... Just look at his Dar command! Seriously, I just do "tar -cjpSf foo.tar.bz2 bar/ baz/" and it just works. And since you should be automating this anyway, it doesn't matter at all.
There is also a separate utility which can split any file into multiple pieces. It's called "split". They can be joined together with cat.
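For anyone who hasn't used them, what split and cat do amounts to the following Python sketch. The numeric ".000" suffixes are my own naming for the demo (the real split utility has its own naming scheme), and the chunk size is arbitrary.

```python
import os
import tempfile


def split_file(path, chunk_size):
    """Write consecutive chunk_size-byte pieces as path.000, path.001, ...; return their names."""
    parts = []
    with open(path, "rb") as source:
        while chunk := source.read(chunk_size):
            part = f"{path}.{len(parts):03d}"
            with open(part, "wb") as out:
                out.write(chunk)
            parts.append(part)
    return parts


def join_files(parts, destination):
    """Concatenate the pieces in order, like `cat part1 part2 > whole`."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as piece:
                out.write(piece.read())


# Demo: a 2500-byte "archive" split into 1000-byte pieces and joined again.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "backup.tar")
payload = bytes(range(250)) * 10
with open(original, "wb") as f:
    f.write(payload)

parts = split_file(original, 1000)
rejoined = os.path.join(workdir, "rejoined.tar")
join_files(parts, rejoined)
print(len(parts), os.path.getsize(rejoined))
```

The pieces carry no metadata of their own, so the only thing that matters when joining is the order.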
As for mtimes, I ran his test. touch a; touch b; mv b a... Unless the mtimes are identical, backup software will notice that a has changed. This is actually pretty damned reliable, although I'd recommend doing a full backup every now and then just in case. Of course, we could also check inode (or the equivalent), but the real solution would be a hash check. Reiser4 could provide something like this -- a hash that is kept current on each file, without much of a performance hit. But this is only to prevent the case where one file is moved on top of another, and each has the exact same size and mtime -- how often is that going to happen in practice?
Backing up to a filesystem: Duh, so don't keep that filesystem mounted. You might just as easily touch the file metadata by messing with your local system anyway. Sorry, but I'm not buying this -- it's for people who 'alias rm="rm -i"' to make sure they don't accidentally delete something. Except in this case, it's much less likely that you'll accidentally do something, and his proposed solutions are worse -- a tar archive is much harder to access if you just need a single file, which happens more than you'd expect. We used BackupPC at my last job, but even that has a 1:1 relationship between files being backed up and files in the store, except for the few files it keeps to handle metadata.
No need to split up files. If you have to burn them to CD or DVD, you can split them up while you burn. But otherwise, just use a modern filesystem -- God help you if you're forced onto FAT, but other than that, you'll be fine. Yes, it's perfectly possible to put files larger than 2 gigs onto a DVD, and all three modern OSes will read them.
Syncing: I thought filesystems generally serialized this sort of thing? At least, some do. But by all means, sync between backup and clean, and after clean. But his syncs are overkill, and there's no need to sleep -- sync will block until it's done. No need to sync before umount -- umount will sync before detaching. And "sync as much as possible", taken to a literal extreme, would kill performance.
File system replication: You just described dump, in every way except that I don't know if dump can restrict to specific directories. But this doesn't really belong in the filesystem itself. The right way to do this is use dm-snapshot. Take a copy-on-write snapshot of the filesystem -- safest because additional changes go straight to the master disk, not to the snapshot device. Mount the snapshot somewhere else, read-only. Then do a filesystem backup.
"But the metadata!" I hear him scream. This is 2006. We know how to read metadata through the filesystem. If you know enough to implement ACLs, you know enough to back them up.
As for ReiserFS vs ext3, there actually is a solid reason to prefer ext3, but it's not the journalling. Journalling data is absolutely, completely, totally, utterly meaningless when you don't have a concept of a transaction. I believe Reiser4 attempts to use the write() call for that purpose, but there's no guarantee until they finish the transaction API. This is why databases call fsync on their own -- they cannot trust any journalling, whatsoever. In fact, they'd almost be better off without a filesystem in the first place.
The solid reason to prefer ext3 is that ReiserFS can run out of potential keys. This takes a lot longer than it takes ext3 to run out of inodes, but at least you can check how many inodes you have left. Still, I prefer XFS or Reiser4, depending on how solid I need the system to be. To think that it comes down to "ext3 vs reiserfs" means this person has obviously never looked at the sheer number of options available.
As for network backups, we used both BackupPC and DRBD. BackupPC to keep things sane -- only one backup per day. DRBD to replicate the backup server over the network to a remote copy.
Don't thank God, thank a doctor!
is not to backup at all. Backup and restore are time and resource consuming tasks and should be avoided. Redundance minimizes the risk of data loss in case of drive failure or power loss. Paranoid sysadmins mirror the entire system up to third redundancy level and perform regular backups as well, just in case the devil's horns appear...
Is anyone using backup2l? It comes with the Debian distro and I'm very happy with it. It's a little bit slower at recovery, but that's not the point.
Regards
Webmaster's Talks
On my personal Arch Linux system at home, I prefer to simply back up my home directory and the xorg.conf configuration file. Linux is fast and easy to reinstall (at least Arch and Slackware are), so I don't really worry about bare metal recovery. Windows, which I also like to run, takes forever to install and is far more likely to have problems. That's where I am interested in bare metal recovery.
I backup using rsync-backup to another hdd. I wrapped that in a simple perl script and use it to backup only the most essential of directories: /etc, /home, /root, etc. I've found that there's no real need to back up every file on your system -- they'll just get reinstalled anyway. For larger collections of files (eg. mp3s), I'd recommend another hard drive and just rsync it nightly... or RAID.
oh the joy of having archival snapshots of each day, instantly available.
most of all i miss singing along to yesterday.
Can't use it on a live filesystem. No guarantee.
Now if you use a volume manager you can create snapshots and back those up instead. Unfortunately most filesystems don't have a way of being told that a snapshot is being taken, and to checkpoint themselves. With the exception of XFS. I think there's a patch for ext3 to do this as well, but I don't know which distros include it by default.
I am of the opinion that the safest route is to do a backup at the mounted level of the filesystem from a snapshot from userspace (with any database services quiesced, or not backed up by this procedure). Pick the tool that preserves metadata best and has good support for incrementals (stores CRCs if possible).
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Linux's device mapper can snapshot a block device, which you can then write to your backup medium of choice.
Why on earth would you look at the mtime? That's what ctime is for! Note the new ctime.
Basically, each workstation runs a cron job (or under Windows, task manager, IIRC) at a certain time at night (each WS is staggered to start at a different time to avoid overloading the small server on the network or hard drive throughput), which kicks off a batch file to mount a SMB share and copy the certain directories (mainly documents and development stuff, along with things like workstation email and such) off the workstation and over to the SMB server (which also functions as a web and database server). Then, at a different time (after all the workstations have copied), the server kicks off its own cron job to copy those directories, and others on the server (database images, config files for smb, apache, php, mysql and postgreSQL, mainly) and create an ISO9660 image of those files. This ISO image is then stored in another directory, along with the last 7 days of ISOs. Periodically I make a backup of the last ISO to a CD or DVD.
This works well for my purposes at home. At one time, I had things set up so that the server would automatically burn the ISO to a CD, but due to the location of my server (in a non-climate controlled, dusty attached workshop at my house), the CD burner didn't last long and died, refusing to burn CDs properly. Instead, I just burn the images from my workstation by mounting the directory as an SMB share.
This system has served me well for almost two years now, and like I said, it saved my butt. After I hosed my server (Debian Woody, remember), I ended up installing Mandrake 10.1 on it (all gui options turned off, mind you - just running CLI here), then took my last backup CD image and copied the data over from that (along with restoring my MySQL database from the dump on the ISO). A couple of evenings of work and I was done, and had the new system up and running perfectly as if nothing had happened (and I got an upgrade to everything as well!). The scary thing was the fact that I had never done a "full test" of my backup strategy (in a business IT environment, this is a big no-no) - but it passed with flying colors. My backup system continues to run, with minor tweaks and additions here and there, but it has proved itself "under fire", and I am fine with it so far.
Reason is the Path to God - Anon
I use samba on a fileserver and all the users' computers use roaming profiles, so all the data is in one place. They are all unprivileged users and can't save to C:\. There's no need to aggregate the data at the end of the day. The server uses RAID 5 in case there's a disk failure.
Every night the fileserver rsyncs to an offsite computer (uses ssh). The backup computer makes hardlink copies of the old (yesterday's) data and then unlinks and stores any new (today's) data. Actually it stores the last 30 days' worth of changes and only uses up about 120% of the space that a single backup would (this depends on the amount of changes). For the 100GB or so of data we store, the backup takes about two minutes to complete because rsync transfers the changed "blocks" only.
If a user erases his Word document, I can call it back with scp backup_server:/daily.x/path . where x is the number of days old the copy of the file is.
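The hardlink trick is what keeps 30 daily copies at roughly 120% of one. As a rough illustration (not the actual rsync setup described above), here is a Python sketch that builds a new snapshot by hard-linking files that are unchanged since the previous snapshot and copying only the ones that differ. The snapshot function and the daily.0/daily.1 directory names are assumptions for the demo; symlinks and special files are ignored.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Copy `source` into `target`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the existing inode
            else:
                shutil.copy2(src, new)  # new or changed: store a fresh copy


# Demo: two files, one of which changes between the two snapshots.
base = tempfile.mkdtemp()
live = os.path.join(base, "live")
os.mkdir(live)
for name, text in (("report.doc", "draft 1"), ("notes.txt", "unchanged")):
    with open(os.path.join(live, name), "w") as f:
        f.write(text)

day1 = os.path.join(base, "daily.1")
snapshot(live, None, day1)

with open(os.path.join(live, "report.doc"), "w") as f:
    f.write("draft 2")

day0 = os.path.join(base, "daily.0")
snapshot(live, day1, day0)
print(os.path.samefile(os.path.join(day1, "notes.txt"), os.path.join(day0, "notes.txt")))
```

Each snapshot directory looks like a complete tree, which is why restoring a file is a plain copy, while unchanged files cost no extra disk space. This sketch compares full file contents, which is slower than the size and mtime comparison a real tool would normally use.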
I'm thinking of adding another backup computer, this one on-site. It would be similar to the off-site daily backup, but this one would do hourly backups.
Rdiff-backup uses mtime+size as a unique identifier. The author tried using ctime, but there were problems with that. Unfortunately, he couldn't remember what they were... Ben Escoto (the author) told me so personally.
Well here's the current situation:
You can use it if you want. UFS1 is supported R/W. UFS2 exists as read-only.
However ext3 is capability identical to FFS with additional journaling options. In the beginning linux was using minix/xiafs. ext was introduced to help transition from that while bringing modern features to the table. Each evolution on the FS has been forward compatible to ease transition.
extX, reiser, jfs and xfs.
Each of them have a purpose:
extX: simple, low-overhead, modest size limits, online resize.
reiserfs: more efficient storage of directories, small files, etc. An attempt to bring new techniques to the table like plugins and support for extended metadata and indexing.
jfs: an import from AIX that had support for larger volumes than EXT3 did at the time. More efficient for large files.
xfs: an import from SGI that had support for large volumes, stripe sets, and online inode/block resizing. More efficient for large files.
There are reasons to have different file systems depending on the needs of your data. extX fulfills the same roles as FFS and does a good enough job that trying to get FFS to "work" under linux is not worth the effort except for cases of quorum disks and other chicanery.
ReiserFS, JFS, and XFS (and your other flavor-of-the-month types) all have legitimate uses in various circumstances (typically when dealing with really really big volumes, big files, databases, or lots of tiny files).
Anyhow, good luck with your article, and give dump and cpio a spin
Regards,
--
*Art
We use NetBackup from Symantec (formerly Veritas). Supports all our distros and even FreeBSD & Mac. Works like a charm.
I'm not a troll, but I play one on Slashdot.
"I know it's kind of an extreme example, but would you go up in an airliner of which you know the software would crash at a certain speed, angle and fuel consumption combination?"
Yup. If that speed, angle, and fuel consumption aren't part of my flight, why should I worry? If there was a chance that they would be, they would have been tested and formally verified multiple times over before the plane was ever certified for flight.
Aircraft hardware will crash for certain speeds, angles, and fuel consumption combinations, too. If the software works for all the normal cases, I'm happy with it.
For example, a plane tends to crash when fuel consumption = 100% of fuel available, angle = straight down, and speed = 9.8m/s^2. If the software crashes then, I really don't care.