Backing up a Linux (or Other *nix) System

Posted by ryuzaki0 on Thursday October 12, 2006 @11:38AM from the nothing-can-go-wrong-nothing-can-go-wrong dept.

bigsmoke writes "My buddy Halfgaar finally got sick of all the helpful users on forums and mailing lists who keep suggesting backup methods and strategies to others which simply don't, won't and can't work. According to him, this indicates that most of the backups made by *nix users simply won't help you recover, while you'd think that disaster recovery is the whole point of doing backups. So, now he explains to the world once and for all what's involved in backing up *nix systems."

10 of 134 comments

Dump by Fez · 2006-10-12 11:46 · Score: 3, Informative

I'd say he hasn't seen the "dump" command on FreeBSD:
http://www.freebsd.org/cgi/man.cgi?query=dump&apropos=0&sektion=0&manpath=FreeBSD+6.1-RELEASE&format=html

I still use tar, but ideally I'd like to use dump. As it is now, each server makes its own backups, copies them to a central server, which then dumps them all to tape. The backup server also holds one previous copy in addition to what got dumped to tape. It has come in handy on many occasions.

It does take some planning, though.
1. Re:Dump by Retardican · 2006-10-12 12:56 · Score: 5, Informative
  
  If you are going to talk about dump, you can't leave out why dump is the best. From the FreeBSD Handbook:
  
  17.12.7 Which Backup Program Is Best?
  
  dump(8) Period. Elizabeth D. Zwicky torture tested all the backup programs discussed here. The clear choice for preserving all your data and all the peculiarities of UNIX file systems is dump. Elizabeth created file systems containing a large variety of unusual conditions (and some not so unusual ones) and tested each program by doing a backup and restore of those file systems. The peculiarities included: files with holes, files with holes and a block of nulls, files with funny characters in their names, unreadable and unwritable files, devices, files that change size during the backup, files that are created/deleted during the backup and more. She presented the results at LISA V in Oct. 1991. See torture-testing Backup and Archive Programs.
  
  I find dump to be the best backup tool for unix systems. One disadvantage is that it deals with whole file systems, which means things have to be partitioned intelligently beforehand. I think that's actually a Good Thing (TM).
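
  A scaled-down version of that torture test can be run against whatever backup tool is in use. The sketch below is only an illustration: GNU tar (for its --sparse option) stands in for the tool under test, and the file names and temporary directory are made up for the example. It builds a file with a hole, a few files with awkward names and an unwritable file, round-trips them through an archive, and compares the result.

```shell
#!/bin/sh
# Miniature backup torture test. Assumes GNU tar; swap in the tool under test.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/restored"

# A file with a hole: seek 1 MiB into the file and write a few bytes there.
printf 'tail' | dd of="$work/src/holey" bs=1 seek=1048576 2>/dev/null

# Files with funny characters in their names (spaces, leading dash, newline).
echo one   > "$work/src/name with spaces"
echo two   > "$work/src/-leading-dash"
echo three > "$work/src/$(printf 'new\nline')"

# An unwritable file.
echo four > "$work/src/readonly"
chmod 0444 "$work/src/readonly"

# Back up and restore. --sparse asks GNU tar to store holes as holes.
tar --sparse -cf "$work/backup.tar" -C "$work/src" .
tar -xf "$work/backup.tar" -C "$work/restored"

# diff -r exits non-zero if any file is missing or differs in content.
if diff -r "$work/src" "$work/restored" >/dev/null; then
    result=identical
else
    result=different
fi
echo "restore is $result"
```

  This only compares file contents. Permissions, ownership, ACLs and whether the hole survived as a hole would need separate checks, and files that change during the backup are not covered at all.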
  
  --
  Will the War in Iraq get better or worse in 2007? Vote here
2. Re:Dump by arivanov · 2006-10-12 18:36 · Score: 4, Insightful
  "I find dump to be the best backup tool for unix systems."
  First, looking at this statement it seems that you have never had to run backups in a sufficiently diverse environment. Dump "proper" has a well-known problem: it supports only a limited list of filesystems. It originally supported UFS and was ported to support EXT?FS. It does not support JFS, XFS, ReiserFS, UDF and so on (last time I looked each used to have its own different dump-like utility). In the past I have also run into some entertaining problems with it when dealing with POSIX ACLs (and other bells-n-whistles) on ext3fs. IMHO, it is also not very good at producing a viable backup of heavily used filesystems.
  Second, planning dumps is not rocket science any more. Nowadays, dumps can be planned in advance in an intelligent manner without user intervention. This is trivial. Dump is one of the supported backup mechanisms in Amanda and it works reasonably well for cases where it fits the bill. Amanda will schedule dumps at the correct levels without user attendance (once configured). If you are backing up to disk or a tape library you can leave it completely unattended. If you are backing up to other media you will only need to change cartridges once it is set up. Personally, I prefer to use the tar mechanism in Amanda. While less effective, it supports more filesystems and is better behaved in a large environment (my backup runs at work are in the many-TB range and they have been working fine for 5+ years now).
  Now back to the overall topic: the original Ask Slashdot is a classic example of "Ask Backup Question" on Slashdot. A vague question with loads of answers which I would rather not qualify. As usual, what is missing is what you are protecting against. When planning a backup strategy it is important to decide what you are protecting against: cockup, minor disaster, major disaster or compliance.
  
  Cockup - user deleted a file. It must be retrieved fast and there is no real problem if the backups go south once in a while. Backup to disk is possibly the best solution here. Backup to tape does not do the job. It may take up to 6 hours to get a set of files off a large tape. By the end you will have users taking matters into their own hands.
  
  Minor disaster - server has died taking fs-es with it. Taking a few hours to recover it will not get you killed in most SMBs and home offices. Backup to disk on another machine is possibly the best solution here. In most cases this can be combined with the "cockup" recovery backup.
  
  Major disaster - flood, fire, four horsemen and the like. For this you need offsite backup or a highly rated fire safe and backup to suitable removable media. Tape and high speed disk-like cartridges (Iomega REV) are possibly the best solution for putting in a safe. This cannot be combined with the "cockup/minor disaster" backups because the requirements contradict. You cannot optimise for speed and reliability/security of storage at the same time. Tapes are slow, network backup to remote sites is even slower.
  
  Compliance - that is definitely not an Ask Slashdot topic.
  
  As for what to back up with on unix, IMO the answer is amanda, amanda or amanda:
  
  It plugs into supported and well-known OS utilities, so if worst comes to worst you can extract the dump/tar from tape and use dump or tar to process it by hand. Also, if you change something on the underlying OS the backups do not stop working. For example, a while ago I had that problem with Veritas, which kept going south on anything but old stock RedHat kernels (without updates). So at one point I said enough is enough, moved all of the Unix systems to amanda and have never looked back (that was 5+ years ago).
  
  It is fairly reliable and network backup is well supported (including firewall support on linux).
  
  It is not easy to tune (unix is userfriendly...), but can be tuned to do backup jobs where many high end commercial backup programs fail.
  
  It supports tape backup (including libraries), disk backup and various weird media (like REV)
  
  It works (TM).
  --
  Baker's Law: Misery no longer loves company. Nowadays it insists on it
  http://www.sigsegv.cx/
Backups by StarHeart · 2006-10-12 11:51 · Score: 3, Informative

The article seems like a good one, though I think it may be a little too cautious. I would need to hear some real world examples before I would give up on incremental backups. Being able to store months' worth of data seems so much better than only being able to store weeks' worth because you aren't doing incremental backups.
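
For readers who have not used incremental backups, here is a minimal sketch of the idea using GNU tar's --listed-incremental option. GNU tar is an assumption (neither the article nor this comment prescribes a tool), and the directory and file names are invented for the example. The first run is a full (level 0) backup; the second stores only what changed since.

```shell
#!/bin/sh
# Full plus incremental backup with GNU tar's snapshot file (assumes GNU tar).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/data"
echo "unchanged" > "$work/data/old.txt"

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar --listed-incremental="$work/state.snar" -czf "$work/full.tar.gz" -C "$work/data" .

# Wait so the new file's timestamp is clearly later than the level 0 run.
sleep 1
echo "added later" > "$work/data/new.txt"

# Level 1: only files changed since the snapshot file was written are stored.
tar --listed-incremental="$work/state.snar" -czf "$work/incr.tar.gz" -C "$work/data" .

# List the regular files in each archive (entries ending in / are directories).
full_files=$(tar -tzf "$work/full.tar.gz" | grep -v '/$' | sort)
incr_files=$(tar -tzf "$work/incr.tar.gz" | grep -v '/$' | sort)
echo "full:        $full_files"
echo "incremental: $incr_files"
```

The space saving is what lets you keep months of history. The cost is that a restore needs the full archive plus every later incremental, extracted in order, so one bad archive in the chain hurts more than it would with independent full backups.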

One thing not mentioned is encryption. The backups should be stored on a medium or machine separate from the source. In the case of a machine, you will likely be backing up more than one system. If it is a centralized backup server then all someone has to do is break into that system and they have access to the data from all the systems. Hence encrypted backups are a must in my book. The servers should also push their data to the backup server, as a normal user on the backup server, instead of the backup server pulling it from the servers.

I used to use hdup2, but the developer abandoned it for rdup. The problem with rdup is that it writes straight to the filesystem, which brings up all kinds of problems, like the ones mentioned in the article. Lately I have been using duplicity. It does everything I want it to. I ran into a few bugs with it, but once I worked around them it has worked very well for me. I have been able to do restores on multiple occasions.

--
Havoc Penington, the bane of my Linux desktop.
1. Re:Backups by WuphonsReach · 2006-10-12 12:25 · Score: 4, Informative
  
  The problem with suggesting backup solutions is that everyone's tolerance of risk differs. Plus, different backup solutions solve different problems.
  
  For bare metal restore, there's not much that beats a compressed dd copy of the boot sector, the boot partition and the root partition. Assuming that you have a logical partition scheme for the base OS, a bootable CD of some sort and a place to pull the compressed dd images from, you can get a server back up and running in a basic state pretty quickly. You can also get fancier by using a tar snapshot of the root partition instead of a low-level dd image.
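
  As a rough sketch of the compressed dd approach, the commands below save and restore the first 512-byte sector of a disk. To keep it safe to run, a scratch file (disk.img, a made-up name) stands in for the real device; on a real system the input would be something like /dev/sda. The single 512-byte sector is an assumption that fits a classic MBR layout and may not be enough for other partitioning schemes such as GPT.

```shell
#!/bin/sh
# Save and restore a boot sector with dd + gzip, using a scratch file as the "disk".
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for a real device such as /dev/sda: 8 sectors of random data.
dd if=/dev/urandom of="$work/disk.img" bs=512 count=8 2>/dev/null

# Back up only the first sector (boot code plus MBR partition table).
dd if="$work/disk.img" bs=512 count=1 2>/dev/null | gzip -9 > "$work/bootsector.gz"

# Stand-in for the replacement disk: same size, all zeros.
dd if=/dev/zero of="$work/new.img" bs=512 count=8 2>/dev/null

# Restore the sector. conv=notrunc keeps dd from truncating a regular file;
# it is harmless when the target is a block device.
gunzip -c "$work/bootsector.gz" | dd of="$work/new.img" bs=512 count=1 conv=notrunc 2>/dev/null

# Compare the first sector of both images.
orig_sum=$(dd if="$work/disk.img" bs=512 count=1 2>/dev/null | cksum)
new_sum=$(dd if="$work/new.img" bs=512 count=1 2>/dev/null | cksum)
echo "original: $orig_sum"
echo "restored: $new_sum"
```

  The same pipeline works for whole partitions by dropping count=1 and pointing if= at the partition device, ideally while the filesystem is unmounted or mounted read-only.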
  
  Then there are the fancier methods of bare metal restore that use programs like Bacula, Amanda, tar, tape drives.
  
  After that, you get into preservation of OS configuration, for which I prefer to use things like version control systems, incremental hard-link snapshots to another partition and incremental snapshots to a central backup server. I typically snapshot the entire OS, not just configuration files, and the hardlinked backups using ssh/rsync keep things manageable.
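
  The hard-link snapshot idea can be sketched in plain shell. This illustrates the technique only and is not the poster's script: in practice rsync's --link-dest option does this job, including over ssh, and the function name, directory names and byte-for-byte cmp check here are assumptions made for the example. Files that did not change are hard-linked to the previous snapshot, so every snapshot looks complete but only changed files take new space.

```shell
#!/bin/sh
# Hard-link snapshots: unchanged files share storage with the previous snapshot.
set -eu

# snapshot SRC PREV DEST  (PREV may be "" for the first snapshot)
snapshot() {
    src=$1; prev=$2; dest=$3
    mkdir -p "$dest"
    # Recreate the directory tree first. (Names containing newlines are not handled.)
    (cd "$src" && find . -type d) | while IFS= read -r d; do
        mkdir -p "$dest/$d"
    done
    # Regular files only; symlinks and devices are skipped in this sketch.
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        if [ -n "$prev" ] && [ -f "$prev/$f" ] && cmp -s "$src/$f" "$prev/$f"; then
            ln "$prev/$f" "$dest/$f"      # unchanged: link to the old copy
        else
            cp -p "$src/$f" "$dest/$f"    # new or changed: store a fresh copy
        fi
    done
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/etc/conf.d"
echo "hostname=web1" > "$work/etc/hosts.conf"
echo "port=80"       > "$work/etc/conf.d/httpd.conf"

snapshot "$work/etc" "" "$work/snap.1"

echo "port=8080" > "$work/etc/conf.d/httpd.conf"   # a configuration change
snapshot "$work/etc" "$work/snap.1" "$work/snap.2"

# Same inode number means the two snapshots share one copy on disk.
ls -i "$work/snap.1/hosts.conf" "$work/snap.2/hosts.conf"
```

  Because unchanged files are shared, deleting an old snapshot directory frees only the files no newer snapshot still links to. Hard links cannot cross filesystems, so all snapshots have to live on the same partition.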
  
  Finally we get into data, and there are two goals here: disaster recovery and archival. Archive backups can be less frequent than disaster recovery backups since the goal is to be able to pull a file from 2 years ago. Disaster recovery backup frequency depends more on your tolerance for risk: how many days or hours are you willing to lose if the building burns down (or if someone deletes a file)?
  
  You can even mitigate some data loss scenarios by putting versioning and snapshots into place to handle day-to-day accidental mistakes.
  
  Or there are simpler ideas, like having backup operating systems installed on the partition (a bootable root with an old, clean copy) that can be booted in an emergency, run no services other than SSH, but have the tools to let you repair the primary OS volumes. Or going virtual with Xen where your servers are just files on the hard drive of the hypervisor domain and you can dump them to tape.
  
  --
  Wolde you bothe eate your cake, and have your cake?
Amanda by Neil+Blender · 2006-10-12 11:52 · Score: 5, Informative

http://www.amanda.org/

Does the trick for my organization.
Mondoarchive by Mr2001 · 2006-10-12 11:52 · Score: 3, Informative

Mondoarchive works pretty well for backing up a Linux system. It uses your existing kernel and other various OS parts to make a bootable set of backup disks (via Mindi Linux), which you can use to restore your partitions and files in the event of a crash.

--
Visual IRC: Fast. Powerful. Free.
Re:/. is slipping by LearnToSpell · 2006-10-12 13:19 · Score: 3, Funny

RTFM n00bz!!

dd if=/dev/sda | rsh user@dest "gzip -9 >yizzow.gz"

And then just restore with
rsh user@dest "cat yizzow.gz | gunzip" | dd of=/dev/sda

Jeez. Was that so tough?

--
Haida Manga
Re:Consistent backups by Just+Some+Guy · 2006-10-12 13:58 · Score: 3, Interesting

The '-L' option to FreeBSD's dump command makes an atomic snapshot of the filesystem to be dumped, then runs against that snapshot instead of the filesystem itself. While that might not be good enough for your purposes, it's nice to know that the backup of database backend file foo was made at the same instant as file bar; that is, they're internally consistent with one another.

--
Dewey, what part of this looks like authorities should be involved?
Arguably worthless by swordgeek · 2006-10-12 16:22 · Score: 3, Insightful

When you work in a large environment, you start to develop a different idea about backups. Strangely enough, most of these ideas work remarkably well on a small scale as well.

tar, gtar, dd, cp, etc. are not backup programs. These are file or filesystem copy programs. Backups are a different kettle of fish entirely.

Amanda is a pretty good option. There are many others. The tool really isn't that important other than that (a) it maintains a catalog, and (b) it provides comprehensive enough scheduling for your needs.

The schedule is key. Deciding what needs to get backed up, when it needs to get backed up, how big of a failure window you can tolerate, and such is the real trick. It can be insanely difficult when you have a hundred machines with different needs, but fundamentally, a few rules apply to backups:

For backups:
1) Back up the OS routinely.
2) Back up the data obsessively.
3) Document your systems carefully.
4) TEST your backups!!!

For restores:
1) Don't restore machines--rebuild.
2) Restore necessary config files.
3) Restore data.
4) TEST your restoration.
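
One way to act on the "TEST" step in both lists is to record checksums at backup time and check them after a trial restore. The following is a sketch under assumptions: tar stands in for whatever backup tool is in use, sha256sum from GNU coreutils is assumed to be installed, and all paths and file names are invented.

```shell
#!/bin/sh
# Trial restore checked against a checksum manifest (assumes tar and sha256sum).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/data/db" "$work/restore"
echo "customer records" > "$work/data/db/customers.sql"
echo "site config"      > "$work/data/site.conf"

# At backup time: record a checksum for every file, then take the backup.
(cd "$work/data" && find . -type f -exec sha256sum {} +) > "$work/manifest.sha256"
tar -czf "$work/backup.tar.gz" -C "$work/data" .

# At test time: restore into a scratch directory, never over the live data.
tar -xzf "$work/backup.tar.gz" -C "$work/restore"

# verify DIR: succeed only if every file in the manifest is present and matches.
verify() {
    (cd "$1" && sha256sum -c "$work/manifest.sha256" >/dev/null 2>&1)
}

if verify "$work/restore"; then
    status="restore verified"
else
    status="restore FAILED verification"
fi
echo "$status"
```

A manifest like this catches missing and corrupted files, but not wrong permissions or ownership, and it says nothing about whether the restored service actually starts. That still takes the rebuild-and-verify process described here.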

All machines should have their basic network and system config documented. If a machine is a web server, that fact should be added to the documentation but the actual web configuration should be restored from OS backups. Build the machine, create the basic configuration, restore the specific configuration, recover the data, verify everything. It's not backups, it's not a tool, it's not just spinning tape; it's the process and the documentation and the testing.

And THAT'S how you save 63 billion dollar companies.

--

"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban