Slashdot Mirror


What Software Do You Use for Unix Backups?

jregel asks: "Linus has stated that dump should not be considered a reliable backup program, and both tar and cpio have their limitations. So what are Slashdot readers doing for backing up Linux servers and workstations? (you do backup, right?)" Given this bit of news, have you used anything other than the standard Unix staple to back up your Linux boxes? If you were forced off of tar, cpio and dump, what would you use as a replacement?

27 of 212 comments

  1. dump on solaris... by Polo · · Score: 4, Informative

    You know, I was thinking about the same thing since I had problems with a recent restore from a compressed dump archive. I was missing some files probably because I ran the dump from an active file system.

    I found out that solaris has a very interesting command: fssnap

    It creates a read-only snapshot of your filesystem intended for backup operations.

    You create a snapshot, dump the snapshot, then delete the snapshot and the dump is consistent.

    I wonder if there's something like this for linux...

    1. Re:dump on solaris... by AlexA · · Score: 3, Informative

      Yes there is. It's called LVM. I've used its snapshot capabilities before on my Linux server, it's very nice.

    2. Re:dump on solaris... by Anonymous Coward · · Score: 1, Informative

      But LVM does snapshots at the block device level, not at the file system level like Solaris fssnap. The LVM HOWTO says that for XFS, you should "freeze" the file system (with the xfs_freeze tool) before taking the snapshot, and unfreeze it afterwards (I guess this "freezing" is something like turning it into read-only mode for a couple of seconds, but without unmounting it).

      Does anyone have more information about this? What about other filesystems than XFS?

    3. Re:dump on solaris... by Polo · · Score: 2, Informative

      Hey, you're right, and the wonderful LVM documentation even has a Recipe for performing the backup. I assume that since the snapshot is read-only, dump should work fine without the issues Linus mentioned.

      The snapshot partition just has to contain enough space to hold the changes made to the original volume while the snapshot exists.
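      For anyone reading this later, the snapshot-then-dump idea from this subthread might look roughly like the sketch below. It is not the HOWTO's recipe: the volume group, volume name, snapshot size and output path are made-up placeholders, and the run helper with its DRY_RUN switch is my own addition so the steps can be printed without touching a real system.

```shell
#!/bin/sh
# Sketch: dump an LVM logical volume from a snapshot instead of the live volume.
# All names passed in (vg0, home, sizes, paths) are hypothetical placeholders.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_dump() {
    vg=$1      # volume group, e.g. vg0
    lv=$2      # logical volume to back up, e.g. home
    size=$3    # room for changes made while the snapshot exists, e.g. 1G
    out=$4     # dump file to write
    snap="${lv}-snap"

    # 1. Create the snapshot of the live volume.
    run lvcreate --snapshot --size "$size" --name "$snap" "/dev/$vg/$lv" || return 1

    # 2. Level-0 dump of the snapshot device, which no longer changes.
    run dump -0 -f "$out" "/dev/$vg/$snap"
    status=$?

    # 3. Always remove the snapshot, even if the dump failed.
    run lvremove -f "/dev/$vg/$snap"
    return $status
}
```

      Running it with DRY_RUN=1 set, e.g. snapshot_dump vg0 home 1G /backup/home.dump, only prints the three commands. On XFS, the xfs_freeze step mentioned above would go around step 1, and since dump is filesystem-specific you may prefer to mount the snapshot read-only and tar it instead.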

  2. BackupEDGE vs. Taper by mindslip · · Score: 3, Informative

    I think the 2 above are both excellent, Taper for the less demanding environment, BUpEdge for a system with multiple drives.

    I'm actually doing a 100gb backup as we speak... so good timing on the Ask Slashdot.

    My only beef with Taper (and I'd use it otherwise, on my home system) is that when you do an "e"xclude or "i"nclude of a directory, it scans the entire subtree, which can take *forever*, (like when excluding /var/squid) instead of just simply skipping that directory.

    mindslip

  3. Re:rsync by Colitis · · Score: 3, Informative

    I use rsync over ssh too; I back it up to a machine at work (which I can reach from home). It basically does my whole home directory except for a few excludes for stuff that's a bit sensitive (ssh keys, keychain, ICQ history) which I manually backup to CD now and then. The machine at work is then backed up with TSM.

    The rsync over ssh style of backup is so easy it's addictive!

  4. rsync by heikkile · · Score: 2, Informative
    We have a dedicated backup machine, into which we rsync all the important stuff. We are a smallish shop, so it only has a couple of 120G disks.

    This backup machine keeps seven generations of daily backups on one disk (cp -al, so no duplicating of static data), and a few weekly ones on the other disk. Every night it rsyncs things off-site (to my home). That rsync has turned out to be unreliable (probably my adsl), so I have a script that does it in small bits and pieces. Takes a few hours in the early morning.
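    A rough sketch of the "cp -al" generation trick, in case it helps. The directory layout (daily.0 as the newest copy, daily.1 the day before, and so on) and the function name are my assumptions, not something taken from the setup described above.

```shell
#!/bin/sh
# Sketch: keep N hard-linked generations of a backup directory.
# Assumed layout: $root/daily.0 is the newest copy, $root/daily.1 the
# previous one, and so on up to daily.(N-1).

rotate_generations() {
    root=$1
    keep=${2:-7}
    [ "$keep" -ge 2 ] || return 1

    # Drop the oldest generation.
    oldest=$((keep - 1))
    rm -rf "$root/daily.$oldest"

    # Shift daily.(i-1) to daily.i, starting from the oldest slot.
    i=$oldest
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -d "$root/daily.$prev" ]; then
            mv "$root/daily.$prev" "$root/daily.$i"
        fi
        i=$prev
    done

    # Hard-link copy of the current tree: unchanged files take no extra
    # space. "cp -al" is a GNU cp feature.
    if [ -d "$root/daily.0" ]; then
        cp -al "$root/daily.0" "$root/daily.1"
    fi
}
```

    The idea is to call rotate_generations /backup 7 before each nightly rsync into /backup/daily.0. By default rsync writes a changed file to a new inode and renames it into place, so the older generations keep the old contents.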

    --

    In Murphy We Turst

  5. cdrtools by Masa · · Score: 2, Informative

    I use "mkisofs /etc /root /home -R -T -o backup.iso && cdrecord dev=0,0,0 speed=4 blank=fast -data backup.iso" to create an ISO image, which will be burned to the CDRW disk. That's all I need to backup my workstation. And restoring the data doesn't require any special tools.

  6. Amanda! by nathanh · · Score: 5, Informative

    I have been extremely happy with Amanda. Single centralised backup server running amanda-server. Multiple workstations running the amanda-client. Amanda automagically schedules backups based on sensible heuristics. I just tell Amanda how many tapes I have, how many workstations I have, and Amanda does all the hard work of working out how much tape capacity is required and how often it should schedule incrementals/fulls.

    The server/client protocol has been designed to avoid reliance on dangerous security holes like rsh. The server sends the client a "send me your dump" message. The client then connects back to the server and delivers it the output from dump or tar. You can configure exclusion lists on the client if you're worried about sending certain files or filesystems. You can also encrypt the data stream and/or use Kerberos for authentication.

    If I forget to load a blank tape then Amanda plays it safe. It doesn't overwrite last night's backup: instead it stores incrementals into the "holding disk". Amanda will then flush the held backups to the next blank tape.

    Amanda emails me reports after every backup with a neat summary of what went right/wrong. It also gives you several hours advance warning if you forget to load a blank tape or if any of the workstations are offline.

    The only downside of Amanda is that it is fiddly to set up. The documentation is poor and the configuration files are cryptic. But if you're willing to invest some time and effort then you can't do much better (for free) than Amanda.

    1. Re:Amanda! by riffraff · · Score: 2, Informative

      Yes, and with amanda I was able to open up the command line client, navigate to the file and set the restore path. With that done, it worked out which tapes the file was on and restored.

      Amanda does the same thing, it's no problem. Yes, spanning tapes is a problem, but people might be working on it now. You can get around it by just backing up files, or directories, under the filesystem, in increments that are less than the tape size. I use it at a couple of different work locations, and it has worked really well.

  7. afbackup by Vairon · · Score: 4, Informative
    Website URL: http://sourceforge.net/projects/afbackup/
    Features:
    • Server & Client programs
    • Supports multiple clients streaming backups at the same time
    • Webmin module for easy configuration
    • Support for many tape drives and autoloaders
    • SSL and DES encryption support
    • Remote or local start of backups
    • Compatible with most *NIX systems (personally used it with Linux, Solaris & FreeBSD)
    • Non-root users can restore their own files
    • Unlike AMANDA: afbackup can actually append to tapes

    For those who don't know: AMANDA cannot append to tapes.
    Every time you backup with AMANDA it must start from the beginning of the tape.
    So, if you want backups every day, you must have a tape for every day.
    (http://amanda.sourceforge.net/fom-serve/cache/29.html)
    1. Re:afbackup by martin · · Score: 4, Informative

      amanda doesn't append to tapes, so there is no possibility of blowing away that tape. This is a problem I've experienced with other commercial software that appends to a tape each run - one tape write error and it marks the entire tape bad, which means you have to scrap the entire tape and start again.

      Also, a risk of appending is loss of the tape or drive due to environmental factors - fire/flood (plane being driven into data centre).

    2. Re:afbackup by Vairon · · Score: 2, Informative

      It would seem like this itself would cause more wear on the tape. It's my understanding that the hardest thing on tapes is rewinding them. Every time it runs into the beginning or the end of the tape it "pulls" at the tape, which is why smart tape backup units slow down the speed of the drive as they near the beginning or the end during a rewind. If your backup program causes a rewind every single day, that would seem (IMO) to cause more wear.

      In addition, unless you own an autoloader/robot unit, using a backup program that makes you change tapes every day would cost you more money due to having a person there to change tapes 7 days out of the week. The only alternative is to allow your backup program to overwrite the previous day's backup, which sort of defeats the purpose of having a backup on the weekend.

  8. Re:rsync by GigsVT · · Score: 2, Informative

    I hear rdiff-backup is good, but I still mainly use rsync with the incremental rsync type scripts that use hardlinks for versioning. We use it here to backup over 2TB of data over a 512kbit link. Since you never need to do a "full" backup, the bandwidth is plenty.

    --
    I've had enough abrasive sigs. Kittens are cute and fuzzy.
  9. Backup2L by JLester · · Score: 2, Informative
    We use the backup2l script from Sourceforge to backup about a dozen servers each night to a remote NAS server. It keeps multiple generations (not sure how many, but we can restore files from several months or even years later) and has worked great for us. It is tar based, but that hasn't caused any problems and we're backing up about 150 gigs with it.


    Jason

    --
    "FORMAT C:" - Kills bugs dead!
  10. Re:Roll Your Own by chrestomanci · · Score: 2, Informative
    I wrote my own (Perl) script, that copies all my "important" files (basically stuff in my home directory that can't be reconstructed by other means and all the system config files) to a new directory tree (using cpio) it then burns the copied tree to CD-RW and verifies the CD against the copied tree.

    That's what I used to do (wrapping tar), but the maintenance of the script became a pain and I needed to add support for incremental backups and exclusion lists.

    After some web searching on Google, Freshmeat, etc., I came across dobackup.pl, which is very similar in functionality to what I would have written myself. It wraps afio and supports full, incremental and differential backups to fixed and removable media.

    One of its best features is support for exclusion lists. Users can put a .nobackup file in any directory they like, containing a list of expressions for filenames that should not be backed up, making it easy to exclude all the mp3 or mpeg files from user home directories.

    A downside is that the perl source code is a mess. It looks like it was written by someone who is used to programming in shell, but had very little experience in perl. Just reading through the code, I saw a number of potential bugs, where global variables were being trampled. In short it is a good program, but it needs a re-write.

  11. Try star by Jörg Schilling by Corporate Gadfly · · Score: 4, Informative

    Some people have already mentioned Amanda.

    In addition to amanda, I have had good luck with star, coded by Jörg Schilling. star is very feature-rich, fast, standards compliant and has been around since 1985. Give it a try!

    The star-users mailing list is here. You can also look at the man page and finally download it.

    --
    Corporate Gadfly
    Jonathan Archer: the most beaten up Enterprise captain in Star Trek history
  12. BackupPC by dissy · · Score: 3, Informative

    http://backuppc.sourceforge.net/

    Automated backups to an online disk server, open source, and a really nice web interface as well as command line interface.

    It uses samba and ssh to backup and restore to windows and unix machines.
    You can have it restore any files/folders in a backup you select, using the same methods (samba or ssh) as well as it can send the restore files to your browser in a tar or zip file.

    I recently replaced a machine using amanda and a DLT drive with a fileserver using a raid 5 array and backuppc. Best switch ever.

  13. Re:The dangers of backing up live systems by mph · · Score: 2, Informative
    Unfortunately we are talking a minimum of $40k for this type of solution.
    In FreeBSD 5.0, you can dump(8) a snapshot. I'm not sure if we're using snapshot in exactly the same way, but the point is that you're backing up a static "picture" of the filesystem, while the real filesystem can still be used read/write.

    The best part is that FreeBSD costs considerably less than $40k.

  14. Re:tar does not do incremental backups by dissy · · Score: 3, Informative

    > The problem is tar always archives the entire space which makes it difficult to
    > backup, say gigabytes of data, daily.
    >
    > A decent backup tool (as opposed to an archival tool) must absolutely have
    > incremental backup support.

    Er?

    tar --help
    [snip]
    Operation modifiers:
    -G, --incremental handle old GNU-format incremental backup
    -g, --listed-incremental handle new GNU-format incremental backup
    [snip]
    Local file selection:
    -N, --newer=DATE only store files newer than DATE
    --newer-mtime compare date and time when data changed only
    [snip]

    This is in tar (GNU tar) 1.12
    (Which is really really old actually.. slackware 3.2 dist)

    There are also tons of options to exclude directories and files, to force it to span disks, and to match files in pretty much any way you need.
    I've been making incremental backups (and even restored a few) for awhile now.
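    To make the -g / --listed-incremental option from the help output above a bit more concrete, here is a small sketch. It assumes GNU tar; the archive names, the state.snar snapshot file and the two function names are invented for the example.

```shell
#!/bin/sh
# Sketch: full plus incremental backups with GNU tar's --listed-incremental.
# The snapshot file records what has already been dumped.
# File and function names here are made up for illustration.

# Level 0: start from a fresh snapshot file so everything is archived.
full_backup() {
    src=$1; dest=$2
    rm -f "$dest/state.snar"
    tar --create --file="$dest/full.tar" \
        --listed-incremental="$dest/state.snar" \
        -C "$(dirname "$src")" "$(basename "$src")"
}

# Later runs: reuse the snapshot file, so only files that are new or
# changed since the previous run are stored.
incremental_backup() {
    src=$1; dest=$2; tag=$3
    tar --create --file="$dest/incr-$tag.tar" \
        --listed-incremental="$dest/state.snar" \
        -C "$(dirname "$src")" "$(basename "$src")"
}
```

    Usage would be something like full_backup /home/me /backup on Sunday and incremental_backup /home/me /backup mon on Monday. Because tar updates the snapshot file on every run, each incremental is relative to the previous run, so a restore means extracting the full archive and then every incremental in order.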

  15. Re:rsync by dubl-u · · Score: 2, Informative

    Here's a howto for rsync snapshot backups. I keep daily backups for two weeks, weekly backups for two months, and monthly backups forever. I rolled my own wrappers for this stuff in a few hours.

    It is about eight zillion times better than tapes. I have hot, random access to all versions of all my files. Thanks to the hard linking, space used is moderate. Since it backs up to a remote computer, backups are instantly off site. And if I want to verify my backups, I don't have to feed in eight million tapes; I just write a little perl script.

    I recommend it highly!

  16. Incremental backups with rsync by Bishop · · Score: 2, Informative

    Rsync can also be used to make some very nice incremental "snapshot" backups.
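    One way to do this (a sketch, not necessarily the method the poster had in mind) is rsync's --link-dest option, which hard-links unchanged files against the previous snapshot. The directory naming, the "latest" symlink and the DRY_RUN helper are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: dated snapshot directories where unchanged files are hard links
# into the previous snapshot. Paths and naming are illustrative only.

# Print the command when DRY_RUN is set, otherwise execute it.
run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }

rsync_snapshot() {
    src=$1          # e.g. /home/ (trailing slash: copy the contents)
    dest_root=$2    # absolute path, e.g. /backup/home
    stamp=$3        # e.g. 2003-01-15
    new="$dest_root/$stamp"
    latest="$dest_root/latest"

    if [ -e "$latest" ]; then
        # Files identical to those under $latest become hard links.
        run rsync -a --link-dest="$latest" "$src" "$new" || return 1
    else
        # First run: nothing to link against, so make a plain copy.
        run rsync -a "$src" "$new" || return 1
    fi

    # Repoint the "latest" symlink at the snapshot just taken.
    run ln -sfn "$stamp" "$latest"
}
```

    Each dated directory then looks like a full copy, but only changed files use new disk space. Use an absolute path for the destination, since rsync interprets a relative --link-dest relative to the destination directory.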

  17. Wrong: EVMS by Vairon · · Score: 2, Informative

    I believe you are wrong. EVMS (which was built by IBM and is distributed for free under the GPL) provides software RAID (0, 1, 5) and filesystem snapshots, and has both GUI and CLI tools for Linux.
    It's a simple patch you can add to any 2.4 kernel.

  18. Re:Use dump and lose data by coyote-san · · Score: 3, Informative

    Have you even read Linus's comments?

    Dump works by reading the raw data partition. That works great with an unmounted partition, or if you have a very limited OS that does not perform any caching.

    But Linux is different - it's now using the cached pages as the primary content, usually flushing them to disk only as the pages are dropped. This is the approach used by most mature OSes, but Linux doesn't yet have an interface for "dump" programs to query the OS for updated but unwritten sectors.

    So dump is the worst of all possible things now. Not only will you get incomplete live files, you can get incomplete files even if the users have all terminated but the pages haven't been flushed to disk yet. That's non-deterministic, and there's simply no way for you to perform reliable dumps.

    On the practical side, dump is specific to the filesystem. When everyone ran ext2, that wasn't a problem. But now people may have a mixture of ext2, ext3, reiserfs, xfs, jfs, and probably even other formats. Each requires its own dump and restore, and that requires a lot more effort.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  19. Re:CD's and harddisks by coyote-san · · Score: 2, Informative

    The ISO9660 FS has some pretty strict limits on number of files in a directory (~1024) and length of filenames under Rock Ridge extensions (~30s, I think). If you exceed this, you'll be unable to retrieve those "extra" files - I know after being burned by it in the past.

    (Obviously I don't like working in directories with thousands of entries, but some tools will produce them, and it's easy to accidentally hit numbers like that with mail or news spools, etc.)
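    A quick pre-flight check along these lines can flag trouble before the image is built. This is only a sketch: the default thresholds are the rough figures quoted above, not values checked against the ISO9660 or Rock Ridge specifications, and the function names are made up.

```shell
#!/bin/sh
# Sketch: list directories and file names that may exceed the limits
# mentioned above. Default thresholds (1024 entries, 30 characters) are
# the rough figures from this comment, not verified specification values.

# Print directories holding more than $2 entries.
crowded_dirs() {
    root=$1; max=${2:-1024}
    find "$root" -type d | while IFS= read -r dir; do
        count=$(ls -A "$dir" | wc -l | tr -d ' ')
        if [ "$count" -gt "$max" ]; then
            echo "$dir"
        fi
    done
}

# Print paths whose final component is longer than $2 characters.
long_names() {
    root=$1; max=${2:-30}
    find "$root" | awk -F/ -v max="$max" 'length($NF) > max'
}
```

    Running crowded_dirs /home and long_names /home before mkisofs gives a list of things to rename, split up or tar first. File names containing newlines are not handled.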

    As for the RW media, you do realize that they have a limited lifetime, right? Are you validating the discs you write, or going on blind faith?
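    Validation does not need to be fancy. As a minimal sketch, assuming the burned disc is mounted somewhere readable, you can compare checksums of the source tree against the copy. The function names are invented, and cksum (a CRC) is used only because it is available everywhere; swap in a stronger hash if you prefer.

```shell
#!/bin/sh
# Sketch: verify a copy (for example a mounted CD) against the source tree
# by comparing per-file checksums. Function names are illustrative.

# Emit "checksum size ./path" for every regular file under $1, sorted by path.
make_manifest() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k3 )
}

# Succeed only if both trees contain the same files with the same contents.
verify_copy() {
    src=$1; copy=$2
    a=$(make_manifest "$src") || return 2
    b=$(make_manifest "$copy") || return 2
    [ "$a" = "$b" ]
}
```

    Something like verify_copy /home/me /mnt/cdrom || echo "backup differs" after each burn would catch a disc that has gone bad, as well as files silently dropped from the image.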

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  20. Re:Linux is about the only... by nathanh · · Score: 2, Informative
    Linux is about the only... major OS that doesn't have some kind of filesystem-snapshot support.

    You do realise that dump doesn't give you a filesystem snapshot? Even on Solaris - the most venerable of modern UNIX - the manpage for ufsdump clearly states:

    When running ufsdump, the file system must be inactive; otherwise, the output of ufsdump may be inconsistent and restoring files correctly may be impossible.

    There's a good reason why nobody seriously uses dump anymore.

    And Linux does support filesystem snapshots. The Linux LVM explicitly lists it as a feature.

    Moderators, this person was not informative, they were simply wrong.

  21. Re:Arkeia by Anonymous Coward · · Score: 1, Informative

    Arkeia is alright if you're backing up a small amount of data/servers (10?).

    However, I would highly recommend against using Arkeia in an "enterprise" environment even though they claim it is "enterprise network backup" software.

    Where to start:

    - Since it isn't free, and of course requires client licenses, (different types, for different OS's too), if you exceed your client license limit, the backup won't run at all. This is expected, but the kicker is anyone on your network can install Arkeia, point it to backup.domain.com and use up one of your licenses without you even knowing!!

    - You can't give arbitrary names to clients, and Arkeia defaults to not using a FQDN when naming the clients itself. So you often end up with two or more "www" clients. Renaming these requires changing Arkeia's plain text (more on this later) database, and often results in losing the data for that host. It also defaults to whichever IP address the client decides to send, which often ends up being 127.0.0.1.

    - Plain text database! This is absolutely pathetic. If you need to back up any amount of files, you have to use ReiserFS for the database partition or else you run out of inodes 30mins into a backup. Arkeia creates approx. 2 files per directory it backs up. Our Arkeia database has about 10 million files in it (find . | wc takes about two hours to run on a dual 1GHz SCSI160 RAID1), totalling about 20gb. Most of the CPU time used during a backup is in the fopen/fclose calls.
    It also calls fsync after each change to these files; until we disabled this in the kernel, a full backup job took about 48hrs, which made daily backups impossible to run.

    - Instability. This database decides to corrupt itself about twice a month, even on rock solid hardware.

    - Support. Expensive and useless. I tried contacting them about their fsync issue, and how to get our backup to run within a 24hr window. They couldn't help me. So after we created a .so file to disable fsync, I emailed it to them, and explained it. The next day I received an email from their support along the lines of:

    "Another customer sent us this patch to speed up backups, you might want to give it a try"

    I sent you the patch, Idiot.

    The list goes on, but in short, stay away from Arkeia if you have more than about 10 servers to back up. We attempted 75 servers and 1 terabyte of data per full backup, but it just can't handle it.