Slashdot Mirror


Backing up a Linux (or Other *nix) System

bigsmoke writes "My buddy Halfgaar finally got sick of all the helpful users on forums and mailing lists who keep suggesting backup methods and strategies to others which simply don't, won't and can't work. According to him, this indicates that most of the backups made by *nix users simply won't help you recover, while you'd think that disaster recovery is the whole point of doing backups. So, now he explains to the world once and for all what's involved in backing up *nix systems."

134 comments

  1. Dump by Fez · · Score: 3, Informative

    I'd say he hasn't seen the "dump" command on FreeBSD:
    http://www.freebsd.org/cgi/man.cgi?query=dump&apropos=0&sektion=0&manpath=FreeBSD+6.1-RELEASE&format=html

    I still use tar, but ideally I'd like to use dump. As it is now, each server makes its own backups, copies them to a central server, which then dumps them all to tape. The backup server also holds one previous copy in addition to what got dumped to tape. It has come in handy on many occasions.

    It does take some planning, though.

    1. Re:Dump by Retardican · · Score: 5, Informative
      If you are going to talk about dump, you can't leave out why dump is the best. From the FreeBSD Handbook:

      17.12.7 Which Backup Program Is Best?

      dump(8) Period. Elizabeth D. Zwicky torture tested all the backup programs discussed here. The clear choice for preserving all your data and all the peculiarities of UNIX file systems is dump. Elizabeth created file systems containing a large variety of unusual conditions (and some not so unusual ones) and tested each program by doing a backup and restore of those file systems. The peculiarities included: files with holes, files with holes and a block of nulls, files with funny characters in their names, unreadable and unwritable files, devices, files that change size during the backup, files that are created/deleted during the backup and more. She presented the results at LISA V in Oct. 1991. See torture-testing Backup and Archive Programs.

      I find dump to be the best backup tool for unix systems. One disadvantage is that it deals with whole file systems, which means things have to be partitioned intelligently before hand. I think that's actually a Good Thing (TM).
      --
      Will the War in Iraq get better or worse in 2007? Vote here
    2. Re:Dump by kv9 · · Score: 1

      One disadvantage is that it deals with whole file systems

      NetBSD's dump supports files too, not just filesystems.

    3. Re:Dump by arivanov · · Score: 4, Insightful
      I find dump to be the best backup tool for unix systems.

      First, looking at this statement, it seems that you have never had to run backups in a sufficiently diverse environment. Dump "proper" has a well-known problem: it supports only a limited list of filesystems. It originally supported UFS and was ported to support EXT?FS. It does not support JFS, XFS, ReiserFS, UDF and so on (last time I looked, each had its own dump-like utility). In the past I have also run into some entertaining problems with it when dealing with POSIX ACLs (and other bells-n-whistles) on ext3fs. IMHO, it is also not very good at producing a viable backup of heavily used filesystems.

      Second, planning dumps is not rocket science any more. Nowadays, dumps can be planned in advance in an intelligent manner without user intervention. This is trivial. Dump is one of the supported backup mechanisms in Amanda and it works reasonably well for cases where it fits the bill. Amanda will schedule dumps at the correct levels unattended (once configured). If you are backing up to disk or a tape library you can leave it completely unattended. If you are backing up to other media you will only need to change cartridges once it is set up. Personally, I prefer to use the tar mechanism in Amanda. While less effective, it supports more filesystems and is better behaved in a large environment (my backup runs at work are in the many-TB range and they have been working fine for 5+ years now).

      Now back to the overall topic: the original Ask Slashdot is a classic example of "Ask Backup Question" on Slashdot. A vague question with loads of answers which I would rather not qualify. As usual, what is missing is what you are protecting against. When planning a backup strategy it is important to decide what you are protecting against: cockup, minor disaster, major disaster or compliance.

      • Cockup - user deleted a file. It must be retrieved fast and there is no real problem if the backups go south once in a while. Backup to disk is possibly the best solution here. Backup to tape does not do the job. It may take up to 6 hours to get a set of files off a large tape. By the end you will have users taking matters into their own hands.
      • Minor disaster - server has died taking fs-es with it. Taking a few hours to recover it will not get you killed in most SMBs and home offices. Backup to disk on another machine is possibly the best solution here. In most cases this can be combined with the "cockup" recovery backup.
      • Major disaster - flood, fire, four horsemen and the like. For this you need offsite backup or a highly rated fire safe and backup to suitable removable media. Tape and high speed disk-like cartridges (Iomega REV) are possibly the best solution for putting in a safe. This cannot be combined with the "cockup/minor disaster" backups because the requirements contradict. You cannot optimise for speed and reliability/security of storage at the same time. Tapes are slow, network backup to remote sites is even slower.
      • Compliance - that is definitely not an Ask Slashdot topic.
      As far as with what to backup on unix IMO the answer is amanda, amanda or amanda:
      • It plugs into supported and well known OS utilities, so if worst comes to worst you can extract the dump/tar from tape and use dump or tar to process it by hand. Also, if you change something on the underlying OS, the backups do not stop working. For example, a while ago I had that problem with Veritas, which kept going south on anything but old stock RedHat kernels (without updates). So at one point I said enough is enough, moved all of the Unix systems to amanda and have never looked back since (that was 5+ years ago).
      • It is fairly reliable and network backup is well supported (including firewall support on linux).
      • It is not easy to tune (unix is userfriendly...), but can be tuned to do backup jobs where many high end commercial backup programs fail.
      • It supports tape backup (including libraries), disk backup and various weird media (like REV)
      • It works (TM).
      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    4. Re:Dump by geschild · · Score: 1

      [...]She presented the results at LISA V in Oct. 1991. See torture-testing Backup and Archive Programs.
      (emphasis mine)

      If you're going to go and quote something, please make sure that it is still relevant? I'm not entirely sure that more current versions, say 15 years younger, might not still have the same problems, but I think a re-match is in order to get some real information, here.
      --
      Karma? What's that again?
    5. Re:Dump by arth1 · · Score: 2, Informative
      Dump "proper" has a well-known problem: it supports only a limited list of filesystems. It originally supported UFS and was ported to support EXT?FS. It does not support JFS, XFS, ReiserFS, UDF and so on (last time I looked, each had its own dump-like utility). In the past I have also run into some entertaining problems with it when dealing with POSIX ACLs (and other bells-n-whistles) on ext3fs. IMHO, it is also not very good at producing a viable backup of heavily used filesystems.

      Yes, different file systems need their own versions of "dump" and "restore", because the operations happen at file system level, and need to be able to back up and recover any special features of the file system.
      As for producing a viable backup of a heavily used file system, dump is certainly superior to a tar or otherwise trying to work at a file level. With dump, you will be able to get a snapshot copy of files that are locked. But true, a consistent backup of active file system can only be done by either re-mounting the volumes read only, using a shadow copy, or techniques like breaking a mirror.
      As for ACLs, alternate streams and other fs-specific features, native dumps are one of the very few ways you can back up files and retain this data.

      I back up three machines here faithfully every night with xfsdump, and yes, I've had to restore due to hardware failure and upgrades, so I know they're viable. Since xfsdump supports differential backups (not to be confused with incremental backups), I use a staggered Tower of Hanoi approach. From the crontab of one of the machines:
      1 3 1-31/16 * * /usr/local/sbin/xfsbackup -l 0
      1 3 9-31/16 * * /usr/local/sbin/xfsbackup -l 2
      1 3 5-31/8 * * /usr/local/sbin/xfsbackup -l 4
      1 3 3-31/4 * * /usr/local/sbin/xfsbackup -l 6
      1 3 2-31/2 * * /usr/local/sbin/xfsbackup -l 8
      ... where xfsbackup is a script that performs a dump of all file systems in fstab with the dump flag set, at the level specified, and mounting/unmounting if necessary, and only after completion without errors, removing older backups of the same or lower level in this set. (Directly overwriting one backup with its replacement is a typical newbie mistake -- if the machine crashes during backup, you then have no backup at all.)

      The use of differential backups instead of incremental allows for a much smaller number of required volumes, and diminishes the risk of a deleted file being restored -- in my case, I need at most 5 volumes per set, and usually 3 or less, with each set holding up to 16 days. This makes restores much quicker too.
      The down side is that you will back up the same data more than once; whenever you stay at or go up in backup levels, the same files will be backed up again, even if there are no new changes. In practice, this is a minor problem with a predictable pattern, so resources can be allocated accordingly.
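To make the schedule above easier to follow, here is a minimal Python sketch that expands the five day-of-month fields from the crontab and works out which dumps a restore would need on a given day. The function names are hypothetical, and the sketch assumes that a dump at level N holds everything changed since the most recent dump at a lower level. It is only an illustration of the rotation, not part of the xfsbackup script.

```python
# Only the cron day-of-month fields and levels below come from the crontab in
# the post; every function name here is a hypothetical helper.
SCHEDULE = [
    ("1-31/16", 0),
    ("9-31/16", 2),
    ("5-31/8", 4),
    ("3-31/4", 6),
    ("2-31/2", 8),
]


def expand_days(field):
    """Expand a cron day-of-month field of the form 'start-end/step'."""
    span, step = field.split("/")
    start, end = (int(part) for part in span.split("-"))
    return range(start, end + 1, int(step))


def level_by_day(schedule=SCHEDULE):
    """Map each day of the month to the dump level scheduled for it."""
    levels = {}
    for field, level in schedule:
        for day in expand_days(field):
            levels[day] = level
    return levels


def restore_chain(day, schedule=SCHEDULE):
    """Return the (day, level) dumps needed to restore the state as of `day`.

    Assumption: a level N dump contains everything changed since the most
    recent dump at a lower level, so we walk backwards and pick strictly
    lower levels until a level 0 dump is reached.
    """
    levels = level_by_day(schedule)
    chain = [(day, levels[day])]
    for earlier in range(day - 1, 0, -1):
        if chain[-1][1] == 0:
            break
        if levels[earlier] < chain[-1][1]:
            chain.append((earlier, levels[earlier]))
    return list(reversed(chain))


if __name__ == "__main__":
    for day in (16, 17, 31):
        print(day, restore_chain(day))
```

Under that assumption the longest chain in a month is five dumps (reached on day 16), which agrees with the "at most 5 volumes per set" figure quoted above. Because day 1 always runs a level 0 dump, a chain never reaches back into the previous month.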

      Regards,
      --
      *Art
    6. Re:Dump by arivanov · · Score: 1
      The down side is that you will back up the same data more than once

      This is not a downside, this is an advantage. One of the ways to increase the probability of recovery is to do this. Unfortunately the human brain (without probability theory training) is not very well suited to this. It is even less suited to following the changes in the filesystems over time and changing these estimates on every backup run, so the best is for the backup system to do this for you. This is possibly the best feature in amanda - it does this for you.

      By the way this scares the shit out of traditional "full + incremental" backup tradition sysadmins as they fail to understand what it does.

      As far as dump (and xfsdump) in particular, I agree with you that it is possibly one of the best ways of dumping a specific fully supported filesystem (all versions and spec matching fully) by hand. If it starts getting into the realm which you are describing and if you have 20+ servers to back up (with TB size volumes) I cannot be arsed to do this math. I would rather have a backup system do it for me, and here the fact that xfsdump is not supported as an Amanda frontend becomes really painful.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    7. Re:Dump by joib · · Score: 1

      Good post.

      Have you tried bacula? I've heard stories of people migrating from amanda to it, although probably less so these days now that amanda supports spanning many tapes.

      And my pet peeve: neither amanda, bacula nor any commercial program I know of supports extended attributes (ACLs, SELinux labels). #"@%&

    8. Re:Dump by ryanov · · Score: 1

      Bacula definitely supports ACLs. I'm not sure about SELinux labels, but it seems to me there are ways to back them up independently and then back up the file that is exported. I could be wrong, but I bet you that it's doable.

    9. Re:Dump by ryanov · · Score: 1

      FYI, from an SELinux FAQ:

      What about backup and recovery ?

      When backing up and recovering files with a SELinux system, care must be taken to preserve SELinux context information. Use the star command to backup SE Linux contexts on Fedora, Red Hat Enterprise Linux (and probably most systems with a recent version of star).

      For example,
      star -xattr -H=exustar -c -f output.tar [files]

      Also the dump and restore utilities for Ext2/3 have been updated to work with XATTRs (and therefore SE Linux contexts). They should work on all distributions now.

    10. Re:Dump by arivanov · · Score: 1
      I have heard of bacula and I have looked at the list of supported features on a few occasions, but I have never seen any need to migrate. I also know a few admins who have migrated to it from amanda. It has always been for one of the following reasons:
      • Multitape support - most people simply do not know that amanda can support multiple tapes and tape libraries. Many of the ones who know do not know how to circumvent the file-does-not-span-a-tape limitation. For this I use automounter+nis to move things around across several servers with several TBs of total storage. This adds some extra work when defining backups (you cannot just tell it to do the entire volume) and forces the use of tar instead of dump, but keeps each backup item down to a reasonable size. It is trivial, works, scales to any storage set size and any filesystem and keeps your entire network easily maintainable. You can move items between servers and volumes at will with an ease which nothing short of a very high end SAN can achieve, and do all of this with a commodity OS on commodity hardware.
      • Unpredictability - Amanda schedules multiple full or same level backups in a tape cycle if the space allows. This is done on purpose and improves recovery probability in the event of a tape loss. This drives people who are accustomed to the full + differential paradigm nuts. I know many otherwise good sysadmins who have moved from Amanda to other backup products for this sole reason.
      As far as the ACLs I usually dump them into a separate file before (and in some cases after) the backup run for the entire fs. This is not very efficient and has the major problem of being non-atomic. There is always the probability of losing some ACL information somewhere. It has the advantage of being portable across different ACL systems and not limited by the backup system. It also allows moving things around in a much easier manner on a large installation with multiple servers and systems. Labels (including SE), application info, etc can all be treated in the same manner. While this is not applicable to some systems where the backup/recovery must be strictly atomic it will be enough for 99% of the installs out there.
      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
  2. Backups by StarHeart · · Score: 3, Informative

    The article seems like a good one, though I think it may be a little too cautious. I would need to hear some real world examples before I would give up on incremental backups. Being able to store months worth of data seems so much better than being only able to store weeks because you aren't doing incremental backups.

    One thing not mentioned is encryption. The backups should be stored on media or a machine separate from the source. In the case of the machine you will likely be backing up more than one system. If it is a centralized backup server then all someone has to do is break into that system and they have access to the data from all the systems. Hence encrypted backups are a must in my book. The servers should also push their data to the backup server, as a normal user on the backup server, instead of the backup server pulling it from the servers.

    I used to use hdup2, but the developer abandoned it for rdup. The problem with rdup is that it writes straight to the filesystem, which brings up all kinds of problems, like the ones mentioned in the article. Lately I have been using duplicity. It does everything I want it to. I ran into a few bugs with it, but once I worked around them it has worked very well for me. I have been able to do restores on multiple occasions.

    --
    Havoc Penington, the bane of my Linux desktop.
    1. Re:Backups by WuphonsReach · · Score: 4, Informative

      The problem with suggesting backup solutions is that everyone's tolerance of risk differs. Plus, different backup solutions solve different problems.

      For bare metal restore, there's not much that beats a compressed dd copy of the boot sector, the boot partition and the root partition. Assuming that you have a logical partition scheme for the base OS, a bootable CD of some sort and a place to pull the compressed dd images from, you can get a server back up and running in a basic state pretty quickly. You can also get fancier by using a tar snapshot of the root partition instead of a low-level dd image.

      Then there are the fancier methods of bare metal restore that use programs like Bacula, Amanda, tar, tape drives.

      After that, you get into preservation of OS configuration. For which I prefer to use things like version control systems, incremental hard-link snapshots to another partition and incremental snapshots to a central backup server. I typically snapshot the entire OS, not just configuration files and the hardlinked backups using ssh/rsync keep things manageable.

      Finally we get into data, and there are two goals here: disaster recovery and archival. Archive backups can be less frequent than disaster recovery backups since the goal is to be able to pull a file from 2 years ago. Disaster recovery backup frequency depends more on your tolerance for risk. How many days / hours are you willing to lose if the building burns down (or if someone deletes a file)?

      You can even mitigate some data loss scenarios by putting versioning and snapshots into place to handle day-to-day accidental mistakes.

      Or there are simpler ideas, like having backup operating systems installed on the partition (a bootable root with an old, clean copy) that can be booted in an emergency, run no services other than SSH, but have the tools to let you repair the primary OS volumes. Or going virtual with Xen, where your servers are just files on the hard drive of the hypervisor domain and you can dump them to tape.

      --
      Wolde you bothe eate your cake, and have your cake?
    2. Re:Backups by slamb · · Score: 1
      The article seems like a good one, though I think it may be a little too cautious. I would need to hear some real world examples before I would give up on incremental backups. Being able to store months worth of data seems so much better than being only able to store weeks because you aren't doing incremental backups.

      I think his complaints are no longer relevant. rdiff-backup has a --compare-hash option, though I haven't checked the details. Maybe the author should give it another look...

      Besides, if you have an accurate timeserver (you should! time is unbelievably important to software in general!), the timestamp check is pretty safe, barring maliciousness. And if your machine has been compromised, the data coming off it should not be trusted in general. This is just one more case of that.

      One thing not mentioned is encryption. [...] Lately I have been using duplicity.

      It seems like a great idea, but my impression was that it was missing a lot of the same love, care, testing, documentation, etc. that has been put into rdiff-backup. They're by the same guy, but he obviously has been concentrating largely on the one, and I don't believe they share any code.

      Have you looked at brackup? It seems promising, anyway, but I haven't actually tried it. Maybe when it's a little more mature...

    3. Re:Backups by halfgaar · · Score: 1

      I think his complaints are no longer relevant. rdiff-backup has a --compare-hash option, though I haven't checked the details. Maybe the author should give it another look...

      The hash is stored in the meta information, and the compare option does only that, comparing the live system to your archive. It does not say anything about the change-detection behaviour used during a backup.

      Besides, if you have an accurate timeserver (you should! time is unbelievably important to software in general!), the timestamp check is pretty safe, barring maliciousness. And if your machine has been compromised, the data coming off it should not be trusted in general. This is just one more case of that.

      No, it's not (safe, I mean). Do this:

      touch a b
      (edit a and b to be the same length but different content)
      stat a b
      mv b a
      stat a

      a will now have the mtime b had first. mtime+size is not changed, file is not backed up. This is a danger in my opinion.

      Have you looked at brackup? It seems promising, anyway, but I haven't actually tried it. Maybe when it's a little more mature...

      No, I will have a look. But as I said a few posts below, I'll have to go to sleep now :)
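The mtime+size scenario described above can be reproduced with a short Python sketch. It assumes that both files end up with the same modification time (for example because they were written within one timestamp tick); the sketch forces this with os.utime so the result is repeatable. The helper names are hypothetical and the check shown is a generic size-plus-mtime comparison, not the code of any particular backup tool.

```python
import hashlib
import os
import tempfile


def signature(path):
    """The (size, mtime) pair that a timestamp-based change check would compare."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def digest(path):
    """SHA-256 of the file contents, for a content-based comparison."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def demo():
    """Replace `a` with `b` and return (signature_unchanged, content_unchanged)."""
    with tempfile.TemporaryDirectory() as workdir:
        a = os.path.join(workdir, "a")
        b = os.path.join(workdir, "b")
        with open(a, "w") as handle:
            handle.write("first version\n")
        with open(b, "w") as handle:
            handle.write("other version\n")  # same length, different content

        # Assumption: both files carry the same mtime. We set it explicitly
        # to an arbitrary fixed time (in nanoseconds) to make the demo repeatable.
        stamp = 1_000_000_000 * 10**9
        os.utime(a, ns=(stamp, stamp))
        os.utime(b, ns=(stamp, stamp))

        before_sig, before_hash = signature(a), digest(a)
        os.replace(b, a)  # the `mv b a` step; a rename keeps b's mtime
        after_sig, after_hash = signature(a), digest(a)
        return before_sig == after_sig, before_hash == after_hash


if __name__ == "__main__":
    same_signature, same_content = demo()
    print("size+mtime unchanged:", same_signature)
    print("content unchanged:   ", same_content)
```

In this setup the size and mtime of a are identical before and after the rename while the content hash differs, which is the case a check based only on size and mtime cannot see.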

  3. Amanda by Neil+Blender · · Score: 5, Informative

    http://www.amanda.org/

    Does the trick for my organization.

    1. Re:Amanda by fjf33 · · Score: 1

      Does the trick for me at home. :)

    2. Re:Amanda by Dadoo · · Score: 1

      Yeah, Amanda has all the capabilities you need to do enterprise backups, except possibly the most important one: the ability to span tapes.

      --
      Sit, Ubuntu, sit. Good dog.
    3. Re:Amanda by Noksagt · · Score: 1

      Except that AMANDA now has tape spanning.

    4. Re:Amanda by Neil+Blender · · Score: 1

      Yes, sorry, this article is clearly intended to teach one how to back up in a large scale environment. I reread the article. It's funny, the first time around I missed the part about the author's preferred backup file size being 650MB (he likes to burn them to CDs). I italicized the part about CDs because I didn't want anyone to get scared. It's a very enterprisey technology.

    5. Re:Amanda by halfgaar · · Score: 1

      Actually, I burn them to DVDs. But, I don't really have one specific target audience in mind. Large enterprise setups require more work and more specific apps, of course. I didn't know of Amanda, but I have it on my TODO.

    6. Re:Amanda by WuphonsReach · · Score: 1

      No recommendations for bacula? Or are they not even comparable?

      --
      Wolde you bothe eate your cake, and have your cake?
    7. Re:Amanda by Anonymous Coward · · Score: 0

      Except that amanda does that. Don't believe the anti-amanda propaganda on the bacula page. It's rather out-of-date at best, amanda does tape spanning, user-defined backup methods, backups over ssh, and a load of other weird and wonderful stuff. And it works on all unix family members we have including very weird ones (fucks and aches, nuff said), it's not a "linux" backup solution.

      As to amanda vs. bacula, bacula has a very traditional tape rotation, okay if you like to micromanage, but it doesn't scale well. We use amanda and an LTO tape library to back up 100s of hosts, having tried bacula first because (er) we believed the propaganda on the bacula page.

    8. Re:Amanda by ryanov · · Score: 1

      Report it to the Bacula project then! It's not anti-Amanda propaganda, it was likely true at the time it was written.

    9. Re:Amanda by Dadoo · · Score: 1

      Except that amanda does that. Don't believe the anti-amanda propaganda on the bacula page.

      Actually, I remember reading about it on the Amanda page. How long has it had this capability?

      --
      Sit, Ubuntu, sit. Good dog.
  4. Mondoarchive by Mr2001 · · Score: 3, Informative

    Mondoarchive works pretty well for backing up a Linux system. It uses your existing kernel and other various OS parts to make a bootable set of backup disks (via Mindi Linux), which you can use to restore your partitions and files in the event of a crash.

    --
    Visual IRC: Fast. Powerful. Free.
    1. Re:Mondoarchive by shellbeach · · Score: 1

      Yes, I couldn't believe someone had written an article about backing up a linux system and didn't refer even once to Mondo! (Or to any other backup software, either! I mean, OK, it's cool to know how to back things up yourself, but data recovery isn't a game ... I'd stick with something straightforward and reliable, personally, rather than rolling your own!)

      Mondo is absolutely vital in this regard - it allows you to restore from bare metal, and backs up and restores systems flawlessly. I've had to use it to recover my system when I had a warranty claim on a notebook, and it worked perfectly.

      My only piece of advice, if creating optical backups, is to back up to your hard disk, then burn the images and verify the burns against the images, rather than burning the discs on the fly.

    2. Re:Mondoarchive by Mr2001 · · Score: 1
      My only piece of advice, if creating optical backups, is to back up to your hard disk, then burn the images and verify the burns against the images, rather than burning the discs on the fly.

      Is it any faster that way? My only real complaint about Mondo is that it takes several hours to back up my 26 GB system to DVD+R, even with compression turned off... and for most of that time, I'm watching a progress bar stuck at 100% ("Now backing up large files") even as it burns disc after disc after disc.
      --
      Visual IRC: Fast. Powerful. Free.
    3. Re:Mondoarchive by halfgaar · · Score: 1

      I put it on my TODO list to check out, Mondoarchive I mean.

    4. Re:Mondoarchive by shellbeach · · Score: 1
      It's faster in that you don't have to change discs during the backup, so you can schedule it at 1am and wake up to some fresh backup images to burn (thus, no need to spend hours looking at the progress bar! :) But the really nice thing is being able to verify the burnt discs against the iso images: there's nothing worse than finding that your vital backup had an error in burning and your disc can't be read!!!

      NB: You need to do a bit of fiddling with isoinfo and dd to get the md5sum of the disc to ignore the 0's at the end, but it's easy when you've worked that out. If you're interested, I've copied the little perl script I use to check the md5sums beneath:

       
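      ----- isocompare.pl -----
      #!/usr/bin/perl -w
      use strict;

      # Usage: isocompare.pl IMAGE DEVICE (for example: isocompare.pl backup.iso /dev/hdc)
      die "usage: $0 IMAGE DEVICE\n" unless @ARGV == 2;
      my ($image, $device) = @ARGV;

      # Read the logical block size and the block count of the ISO filesystem from the image.
      $_ = `isoinfo -d -i $image`;
      my ($blocksize, $volsize) = m/Logical block size is: (\d+).*Volume size is: (\d+)/is;

      # Checksum only that many blocks of the disc, so the zero padding at the end is ignored.
      my $dev_md5 = `dd if=$device bs=$blocksize count=$volsize | md5sum`;
      print "disc md5 is \n$dev_md5\n\n";

      my $iso_md5 = `md5sum $image`;
      print "image md5 is \n$iso_md5\n\n";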
  5. /. is slipping by MalleusEBHC · · Score: 1

    The article has been up for over 20 minutes and still no RTFM followed by a cryptic dd command? For shame.

    1. Re:/. is slipping by LearnToSpell · · Score: 3, Funny

      RTFM n00bz!!

      dd if=/dev/sda | rsh user@dest "gzip -9 >yizzow.gz"

      And then just restore with
      rsh user@dest "cat yizzow.gz | gunzip" | dd of=/dev/sda

      Jeez. Was that so tough?

    2. Re:/. is slipping by rwa2 · · Score: 1

      .... you forgot to

      cat /dev/zero > /frickenlargefillerfile; rm /frickenlargefillerfile

      so the unused fs blocks compress well. /noob/ ;-]

      Well, I'd really use a filesystem backup tool (that way you can restore to an upgraded filesystem / partitioning scheme, as well as not bothering to backup unused inodes). The only thing I ever use dd for is backing up the partition table & MBR:

      dd if=/dev/sda of=/mnt/net/backup/asdf.img bs=512k count=1

      Just remember, after you restore, re-run fdisk to adjust your partition tables if necessary, but more importantly just to tell the linux kernel to re-read the partition table. And if you're using LILO instead of GRUB, you'd probably have to rerun LILO before rebooting from your main disk drive.
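One way to sanity-check a saved image such as the asdf.img above before relying on it is to read the partition table back out of its first sector. The Python sketch below assumes a classic MBR layout (512-byte first sector, four 16-byte primary entries starting at offset 446, boot signature 0x55 0xAA at offset 510) and would not apply to a GPT disk. The function name and the dictionary keys are hypothetical.

```python
import struct

SECTOR_SIZE = 512           # classic MBR sector size (assumed)
TABLE_OFFSET = 446          # the partition table follows the boot code
ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, first LBA, sector count


def read_partition_table(path):
    """Return the non-empty primary partition entries from the first sector of `path`."""
    with open(path, "rb") as handle:
        sector = handle.read(SECTOR_SIZE)
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")

    entries = []
    for index in range(4):
        start = TABLE_OFFSET + index * 16
        status, _, ptype, _, first_lba, sectors = struct.unpack(
            ENTRY_FORMAT, sector[start:start + 16]
        )
        if ptype != 0:  # partition type 0 marks an unused slot
            entries.append({
                "slot": index + 1,
                "bootable": status == 0x80,
                "type": ptype,
                "first_lba": first_lba,
                "sectors": sectors,
            })
    return entries
```

Calling read_partition_table on the image and comparing the result with what fdisk reports for the live disk gives a quick indication that the copy is usable. Only the first 512 bytes are inspected, regardless of how large the image is.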

    3. Re:/. is slipping by reed · · Score: 1

      Why would you gzip it *after* copying it across the network??

    4. Re:/. is slipping by Anonymous Coward · · Score: 0

      And why use rsh? Who uses this still? Try SSH.

  6. Lone-Tar. by mikelieman · · Score: 2, Insightful

    Cron based backup with compression/encryption, rewind, bitlevel verify, send email re: success/failure.

    Add a scsi controller, and Drive Of Your Choice, and sleep well.

    --
    Technology -- No Place For Wimps! Grateful Dead and Jerry Garcia Chatroom -- http://www.wemissjerry.org
  7. Simple by talksinmaths · · Score: 0, Redundant
    --
    Don't you have someone you'd die for?
  8. Alternative to backup by jhfry · · Score: 2, Informative

    I have come to the conclusion that, unless a tape backup solution is necessary, it is often easier to back up to a remote machine. Sure, archive to tape once in a while, but for the primary requirement of a backup... rsync your data to a separate machine with a large and cheap RAID array.

    I use a wonderful little tool/script called rsnapshot to backup our servers to a remote location. It's fast as it uses rsync and only transmits the portions of files that have changed. It's effortless to restore as the entire directory tree appears in each backup folder using symlinks, and it's rock solid.

    Essentially the best part of this solution is its low maintenance and the fact that restorations require absolutely no manual work. I even have an intermediate backup server that holds a snapshot of our users' home directories... my users can connect to the server via a network share and restore any file that has existed in their home directory in the last week by simply copying and pasting it... changed files are backed up every hour.

    Sure, the data is not as compressed as it could be in some backup solutions, and it's residing on a running server so it's subject to corruption or hack attempts. But my users absolutely love it. And it really doesn't waste much space unless a large percentage of your data changes frequently, which would consume a lot of tape space as well.

    --
    Sometimes the best solution is to stop wasting time looking for an easy solution.
    1. Re:Alternative to backup by Anonymous Coward · · Score: 0
    2. Re:Alternative to backup by Retardican · · Score: 1

      I think your solution is pretty good, but Backup should protect you from all of the following:

      1. Hardware failure: Oops, I just spilled juice all over the motherboard, and shorted the HDD.
      2. Accidents: Oops, I just deleted a file.
      3. Accident discovery: Oops, I deleted a file a week ago, that I didn't mean to.
      4. Accident discovery2: Damn, I need the file I deleted 6 months ago.
      5. Once restored, the file should have all the exact time stamps it did when it was backed up.

      A REAL backup should let you recover from all of the above.

      --
      Will the War in Iraq get better or worse in 2007? Vote here
    3. Re:Alternative to backup by Anonymous Coward · · Score: 0
      1. Hardware failure: Oops, I just spilled juice all over the motherboard, and shorted the HDD.

      rsnapshot backups reside on a different hard drive, on a different computer. No problem.

      2. Accidents: Oops, I just deleted a file.

      Copy it back. No problem.

      3. Accident discovery: Oops, I deleted a file a week ago, that I didn't mean to.

      Assuming you keep your backups at least a week, that's no problem. Just copy the relevant version out.

      4. Accident discovery2: Damn, I need the file I deleted 6 months ago.

      If your backups go back six months, you're in luck. If not, get a bigger hard drive.

      5. Once restored, the file should have all the exact time stamps it did when it was backed up.

      Rsync with the -a flag. Again, no problem.

      A REAL backup should let you recover from all of the above.

      Ergo, rsnapshot is a real backup solution.

      Disclaimer: I authored an article related to it.

    4. Re:Alternative to backup by Retardican · · Score: 1

      Disclaimer: I authored an article related to it.

      Then please link to the article. I would like to know about it.

      --
      Will the War in Iraq get better or worse in 2007? Vote here
    5. Re:Alternative to backup by Alchemist253 · · Score: 1

      My rsnapshot scheme fulfills all those requirements... hourly backups (24 archived), daily (seven archived), weekly (four archived), monthly (six archived). Since the backups are on a remote machine in a different facility it deals with hardware failures very well.

      And grandparent wasn't quite right... the backup uses HARDlinks, not SYMlinks, so restoration is truly effortless (and yes, time/date/gid/uid/mode are all preserved).
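
      To make the hard-link point concrete, here is a minimal Python sketch of the idea, not rsnapshot's actual code. The function name `snapshot` and the directory layout are made up for illustration, and it compares file contents where rsync would normally compare size and mtime. It also does not preserve uid/gid, which a real tool run as root would.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Copy `source` into `target`, hard-linking files unchanged since `previous`.

    `previous` is the path of the prior snapshot, or None for the first run.
    All names here are hypothetical; this is a sketch of the technique only.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(target, rel), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            old = os.path.join(previous, rel, name) if previous else None
            new = os.path.join(target, rel, name)
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the inode, no extra space
            else:
                shutil.copy2(src, new)  # new or changed: real copy, keeps mtime and mode
```

      Because every snapshot directory is a complete tree of ordinary files, restoring is just a copy, and deleting an old snapshot only frees the data that no newer snapshot still links to.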

    6. Re:Alternative to backup by arth1 · · Score: 1

      rsync can't handle files that are locked or modified during sync, nor can it handle alternate streams and security labels. ACLs and extended attributes only work if the remote system has the exact same users/groups as the source machine.

      Especially the inability to properly handle files that are in use makes it a poor choice for backing up a running system. Unless you can kick off all users and stop all services and scheduled jobs on a machine while the rsync runs, I wouldn't recommend it at all. You may be in for a nasty surprise.

      Regards,
      --
      *Art

    7. Re:Alternative to backup by Anonymous Coward · · Score: 0
  9. Sparse files by Spazmania · · Score: 1

    A comment about sparse files:

    99% of the time there is only one sparse file of any significance on your machine: /var/log/lastlog

    Unless you really care about the timestamp of each user's prior login, you can safely exclude this file from the backup. Following a restore, "touch /var/log/lastlog" and the system will work as normal.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    1. Re:Sparse files by arth1 · · Score: 1
      99% of the time there is only one sparse file of any significance on your machine: /var/log/lastlog

      Obviously, you don't have database files, nor use p2p, then. (Start the download of a few ISO's with a p2p program, and you'll have gigabytes of sparse file non-data.)
    2. Re:Sparse files by Spazmania · · Score: 1

      Lots of database files. Which database are you using that has huge amounts of empty space in the files?

      As for p2p, no. Then again, I'm not clear why you would even -try- to back up p2p downloads in progress. Seems like prime candidates for exclusion from the backup process.

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    3. Re:Sparse files by arth1 · · Score: 1

      Then again, I'm not clear why you would even -try- to back up p2p downloads in progress. Seems like prime candidates for exclusion from the backup process.
      You exclude files from a backup on a system level, not a user level. You can't go into each and every user's home directory and scan for what can be backed up and what can be excluded. You back that all up. Period.
      If a user can cripple or trash your backup by creating a 2 TB sparse file, then you don't have a viable backup system.
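
      For anyone who wants to see the gap between what a sparse file claims to be and what it occupies, here is a small Python sketch. The helper names are invented for this example, and it assumes a POSIX system where `st_blocks` is reported in 512-byte units (as on Linux); `st_blocks` does not exist on Windows.

```python
import os


def make_sparse(path, size):
    """Create a file of `size` bytes by seeking past the end and writing one byte."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def sparse_report(path):
    """Return (apparent_size, allocated_bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    return st.st_size, st.st_blocks * 512
```

      A backup tool that simply reads the file gets the full apparent size back as zeros, so on a filesystem that supports holes the difference between the two numbers is roughly what a sparse-unaware backup wastes.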

      Regards,
      --
      *Art

    4. Re:Sparse files by Spazmania · · Score: 1

      You must be working in a very different environment than I am. My users have yet to create a large sparse file.

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  10. A quick reply from the author of the article by halfgaar · · Score: 2, Interesting

    Hi there,

    A quick reply from the author of the article before I go to sleep:

    About dump. So, that's a FreeBSD command? I've always suspected it existed, doing the very thing the man page described, because of the dump field in /etc/fstab. But I have never actually seen a machine which had the dump command... It's possibly not very safe BTW. If it works like DOS's Archive bit, then it can't be trusted: it can be set manually. Some DOS apps even used it as a copy protection mechanism...

    The suggestions (for software and the rest). The comments are very much appreciated. I'll investigate them and adjust the article accordingly. For now, I have to sleep :)

    1. Re:A quick reply from the author of the article by adam872 · · Score: 1

      DUMP has existed in various incarnations on various O/S's for eons. I used ufsdump/ufsrestore on Solaris just the other day to recover from a failed root disk on one of our old Sun servers. Worked an absolute treat. Boot from the CD or network (if you have Jumpstart): format, newfs, ufsrestore, installboot, reboot, done....

    2. Re:A quick reply from the author of the article by Jonathan+C.+Patschke · · Score: 2, Informative
      About dump. So, that's a FreeBSD command? I've always suspected it existed, doing the very thing the man page described, because of the dump field in /etc/fstab. But I have never actually seen a machine which had the dump command... It's possibly not very safe BTW. If it works like DOS's Archive bit, then it can't be trusted: it can be set manually. Some DOS apps even used it as a copy protection mechanism...

      Please don't take this the wrong way, but how in the world could you do any sort of proper research for a technical article on backing up Unix systems without having run across the dump command (and its OS-specific variants: ufsdump, xfsdump, efsdump, and AIX backup)? It's not a FreeBSD-specific command. It or a similarly-named variant exists just about everywhere except on Linux. Linux used to have a proper ext2dump, but Linus decided that dump was deprecated because it was too difficult to make it work in the grand new VM/disk-cache subsystems of recent Linux kernels.

      It works nothing like MS-DOS backup programs that used the FAT archive bit. It uses date comparisons and dumps low-level filesystem structures to a storage medium. That means:

      • Backups are not portable between operating systems, and sometimes not between filesystems of different types on the same OS (though this is rarely a concern on a commercial Unix).
      • dump really wants to be run on a quiesced or unmounted filesystem or on a filesystem snapshot.
      • Backups do not contain files from more than one filesystem.
      • Extended attributes (ACLs, etc.) tend to be preserved best with dump because dump has to understand the innards of the filesystem to work, anyway.

      To operate dump, you have dump "levels". Level 0 is a full filesystem dump. Level 1 contains files that changed since the last level 0 dump. Level 2 contains files that changed since the last level 1 dump, and so on. A file /etc/dumpdates contains a log of backup activity and is used for date comparisons when doing dumps at levels other than 0. In a classic tape rotation, you'd do a level 0 dump once a week, a level 1 the rest of the week (to separate tapes), a level 0 dump to a different tape the next week, and rotate through the level 1 backup tapes again.
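
      The level logic is easy to sketch in Python. This is only an illustration of the date comparison, not how dump is implemented: the dictionary stands in for /etc/dumpdates, the function names are made up, and it uses the rule as I understand it from dump(8), namely that a level N dump compares against the most recent dump of any lower level. Real dump also looks at inode change times, which this ignores.

```python
def reference_time(level, dumpdates):
    """Return the timestamp a dump at `level` compares against, or None for a full dump.

    `dumpdates` maps level -> time of the most recent dump at that level
    (a hypothetical stand-in for /etc/dumpdates).
    """
    earlier = [when for lvl, when in dumpdates.items() if lvl < level]
    return max(earlier) if earlier else None


def files_to_dump(level, dumpdates, mtimes):
    """Return the sorted paths from `mtimes` (path -> mtime) that this level would include."""
    since = reference_time(level, dumpdates)
    return sorted(p for p, m in mtimes.items() if since is None or m > since)
```

      With a level 0 at time 100 and a level 1 at time 200, a new level 1 picks up everything modified after 100, while a level 2 picks up only what changed after 200.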

      Dump and restore are particularly useful for doing system images on systems like Solaris, where the native tar command doesn't always know about extended filesystem attributes.

      --
      Pining for the days when The Glorious MEEPT!!! graced SlapDash with his wisdom.
    3. Re:A quick reply from the author of the article by halfgaar · · Score: 1

      Stupid me, there's simply a Linux package available... I'll check it out.

    4. Re:A quick reply from the author of the article by halfgaar · · Score: 1

      As I stated in the intro, my experience is with Linux (perhaps I should remove the word "mostly" ...). The lack of dump on Linux can be blamed for my ignorance. But, I will investigate it, of course.

    5. Re:A quick reply from the author of the article by arth1 · · Score: 1
      The author of the article wrote:
      But I have never actually seen a machine which had the dump command...

      That's ... interesting. I've never seen a system without a dump command. All commercial Unix varieties I've used (SunOS,IRIX,HPUX,AIX,DecOS) have them, and so do GNU/Linux distributions like SuSE and Redhat. The above tidbit of information makes me wonder about the credentials of the author.
    6. Re:A quick reply from the author of the article by halfgaar · · Score: 1

      I don't have any official "credentials"... When I talk about machines, I mean Linux machines. I should clarify the intro about that (the word "mostly" should be removed). Debian, Slackware, Gentoo didn't have it, per default.

      But I don't need credentials to let this article be useful to anybody, I would say. People can agree or disagree with the points I make of their own accord.

  11. One more thing by halfgaar · · Score: 2, Interesting

    Oh, one more thing, encryption. I was in doubt whether to include it or not. I use different encryption schemes for my backups (LUKS for external HD and GPG for DVD burning), but I decided this can be left to the reader. I may include a chapter on it, after all.

  12. Backup? by Anonymous Coward · · Score: 0

    dd if=/dev/ of=/root/backup && gzip -9 /root/backup

    I would figure that would work if you're looking at 'disaster recovery' type situations for *restoring* the machine to its previous state. Of course if you wind up with a drive with different geometry after the 'disaster' (and unless this is some kind of 'big' server with a pricetag and support contract to match, you will) then I think you're still boned...
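
    One wrinkle with the one-liner is that it writes the whole uncompressed image before gzip ever runs, so you need room for the raw image first. A rough Python sketch of compressing while reading is below. The function name and both paths are placeholders of mine (the device in the command above is left as written), and imaging a mounted, busy filesystem this way still gives no consistency guarantee.

```python
import gzip
import shutil


def image_to_gzip(source, dest, chunk=1024 * 1024):
    """Stream `source` (a device node or image file) into a gzip file at `dest`.

    Data is compressed chunk by chunk, so no uncompressed copy is ever written.
    """
    with open(source, "rb") as src, gzip.open(dest, "wb", compresslevel=9) as out:
        shutil.copyfileobj(src, out, chunk)
```

    Restoring is the same copy in the other direction: open the archive with `gzip.open(..., "rb")` and write it to the target.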

    Personally I just backup the 'critical' files I need to rebuild my system (talking purely workstation here) to alternate storage (external HD, network transfer to another system, USB thumbdrive, etc) because I can rebuild/reinstall the OS... the data in my home directory and the relevant configs for my system are the items I'd rather not lose.

  13. Consistent backups by slamb · · Score: 2, Informative
    This article totally neglects consistency. Recently I've put a lot of effort into getting consistent backups of things:
    • PostgreSQL by doing pg_dump to a file (easiest, diffs well if you turn off compression), pg_dump over a socket (better if disk space is tight, but you send the whole thing every time), or an elaborate procedure based on archive logs. (It's in the manual, but essentially you ensure logfiles aren't overwritten during the backup and that you copy files in the proper order.)
    • Other ACID databases with a write-ahead log in a similar way.
    • Subversion fsfs is really easy - it only changes files through atomic rename(), so you copy all the files away
    • Subversion bdb is a write-ahead log-based system, easiest way is "svnadmin hotcopy".
    • Perforce by a simple checkpoint (which unfortunately locks the database for an hour if it's big enough) or a fancy procedure involving replaying journals on a second metadata directory...and a restore procedure that involves carefully throwing away anything newer than your checkpoint.
    • Cyrus imapd...I still haven't figured out how to do this. The best I've got is to use LVM to get a snapshot of the entire filesystem, but I don't really trust LVM.
    • ...
    • If you're really desperate, anything can be safely backed up by shutting it down. A lot of people aren't willing to accept the downtime, though.

    So you need a carefully-written, carefully-reviewed, carefully-tested procedure, and you need lockfiles to guarantee that it's not being run twice at once, that nothing else starts the server you shut down while the backup is going, etc. A lot of sysadmins screw this up - they'll do things like saying "okay, I'll run the snapshot at 02:00 and the backup at 03:00. The snapshot will have finished in an hour." And then something bogs down the system and it takes two, and the backup is totally worthless, but they won't know until they need to restore from it.
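
    As a sketch of the lockfile part, assuming Python and a made-up lock path: creating the file with O_EXCL fails if it already exists, so a second run refuses to start instead of overlapping the first. This minimal version does not handle a stale lock left behind by a crashed run, which a real procedure would need to.

```python
import os
from contextlib import contextmanager


@contextmanager
def backup_lock(path):
    """Hold an exclusive lock file for the duration of a backup run.

    Raises FileExistsError if another run already holds the lock.
    """
    # O_EXCL makes creation fail if the file already exists, so only one run wins.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.unlink(path)
```

    Running the snapshot and the copy as consecutive steps inside one `with backup_lock(...)` block also removes the "snapshot at 02:00, backup at 03:00" guess: the copy starts when the snapshot has actually finished, however long that took.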

    These systems put a lot of effort into durability by fsync()ing at the proper time, etc. If you just copy all the files in no particular order with no locking, you don't get any of those benefits. Your blind copy operation doesn't pay any attention to that sort of write barrier or see an atomic view of multiple files, so it's quite possible that (to pick a simple example) it copied the destination of a move before the move was complete and the source of the move after it was complete. Oops, that file's gone.

    1. Re:Consistent backups by slamb · · Score: 1
      or see an atomic view of multiple files

      Oops, I meant "consistent" here. "Atomic view" is nonsense.

    2. Re:Consistent backups by Mostly+a+lurker · · Score: 1

      In fairness to the author, while he does not go into the details, TFA does stress the importance of alternative methods for transactional systems such as the ones you are referring to.

    3. Re:Consistent backups by Just+Some+Guy · · Score: 3, Interesting

      The '-L' option to FreeBSD's dump command makes an atomic snapshot of the filesystem to be dumped, then runs against that snapshot instead of the filesystem itself. While that might not be good enough for your purposes, it's nice to know that the backup of database backend file foo was made at the same instant as file bar; that is, they're internally consistent with one another.

      --
      Dewey, what part of this looks like authorities should be involved?
    4. Re:Consistent backups by WuphonsReach · · Score: 1

      Subversion fsfs is really easy - it only changes files through atomic rename(), so you copy all the files away

      I was under the impression that even with FSFS you still needed to use the hotcopy.py script in order to get a guaranteed consistent backup.

      --
      Wolde you bothe eate your cake, and have your cake?
    5. Re:Consistent backups by slamb · · Score: 1
      I was under the impression that even with FSFS you still needed to use the hotcopy.py script in order to get a guaranteed consistent backup.
      I originally thought so, too, but check out this thread. Old revision files are never modified, old revprop files are modified only when you do "svn propset --revision", and new files are created with a unique tempfile name then svn_fs_fs__move_into_place. My backup script does some additional sanity checking (ensures the dir is an fsfs repository of version 1 or 2, etc.) but you can really get away with just copying the files.
    6. Re:Consistent backups by Anonymous Coward · · Score: 0

      I use several backup schemes for different reasons, but the most consistent backups I get for bare-metal recovery come from using mirror disks.

      1) say hda hdc and hdd.
      2) hda and hdc are active
      3) quick reboot leaving only hda as active raid partition
      4) after reboot raidhotadd hdd
      5) Snapshot is now in hdc (mount partitions directly, not raid)
      6) Means to store that snapshot have already been discussed ;-))

  14. It costs a little money by fishdan · · Score: 1

    We've been using Amazon's S3. It has a great API, pretty easy to use. I was concerned about storing sensitive data there, but we worked out a good encryption scheme (that I won't detail) and now I'm able to really restore everything from anywhere with no notice. My city could sink into the ocean and I could be in Topeka, and I could bring things back up as long as I had a credit card.

    --
    Nothing great was ever achieved without enthusiasm
  15. Re:Backups (with right formatting) by halfgaar · · Score: 1

    OK, my slashdot noobness is revealed. Here's the post again...

    "I think his complaints are no longer relevant. rdiff-backup has a --compare-hash option, though I haven't checked the details. Maybe the author should give it another look.. "

    The hash is stored in the meta information, and the compare option does only that, comparing the live system to your archive. It does not say anything about the change-detection behaviour used during a backup.

    "Besides, if you have an accurate timeserver (you should! time is unbelievably important to software in general!), the timestamp check is pretty safe, barring maliciousness. And if your machine has been compromised, the data coming off it should not be trusted in general. This is just one more case of that."

    No, it's not (safe, I mean). Do this:

    touch a b
    edit a and b to be the same length but different content
    stat a b
    mv b a
    stat a
    a will now have the mtime b had first. mtime+size is not changed, file is not backed up.

    This is a danger in my opinion.
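
    Here is the same experiment as a small Python sketch, for anyone who wants to reproduce it. The function names are mine, and `os.utime` is used to force both files to the same mtime, standing in for two files that happened to be written in the same second.

```python
import hashlib
import os


def quick_signature(path):
    """The (mtime, size) pair that a timestamp-based change check looks at."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def content_hash(path):
    """SHA-256 of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo(workdir):
    """Return (signature_unchanged, hash_unchanged) after moving b over a."""
    a = os.path.join(workdir, "a")
    b = os.path.join(workdir, "b")
    with open(a, "w") as f:
        f.write("AAAA")
    with open(b, "w") as f:
        f.write("BBBB")
    # Same length, different content, identical mtime.
    os.utime(a, ns=(1_000_000_000, 1_000_000_000))
    os.utime(b, ns=(1_000_000_000, 1_000_000_000))
    before = (quick_signature(a), content_hash(a))
    os.replace(b, a)  # the "mv b a" step
    after = (quick_signature(a), content_hash(a))
    return before[0] == after[0], before[1] == after[1]
```

    The signature comparison says nothing changed while the hash comparison says it did, which is exactly the case a size-plus-mtime check will miss.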

    "Have you looked at brackup? It seems promising, anyway, but I haven't actually tried it. Maybe when it's a little more mature..."

    No, I will have a look. But as I said a few posts below, I'll have to go to sleep now :)

  16. Backup Edge by ArkiMage · · Score: 1

    Encryption, Compression, Bit-Level verification, Bootable disaster recovery, Commercial support...

    http://www.microlite.com/

  17. bash, tar and netcat by kruhft · · Score: 1
    I wrote an article a while back about how to do backups over the network using command line tools. I did it to bounce my system to a bigger hard drive, but I'm sure it could be automated and put to some good use if you wanted. Disaster recovery is as easy as booting with a livecd and untarring.

    backing up your system with bash, tar and netcat

    1. Re:bash, tar and netcat by Anonymous Coward · · Score: 0

      You know, I read the first word of your blog and had already decided it wasn't going to be worth my time read. You never capitalized any of sentences. It makes you look unprofessional, ignorant, and most of all someone who doesn't know how to write. So if you expect to be taken seriously, go back and fix your work.

    2. Re:bash, tar and netcat by kruhft · · Score: 1

      That's unfortunate that you feel that way. I don't have time to fix it, and I guess, I prefer that you not read it. I find content to be more important, and as you can see, I can use capitols when I feel like it. Considering this article made it to the front page of digg, I assumed that it's lack of capitols was undershot by it's actual useful and interesting content.

      It's good to see Anonymous Cowards bitching about other people's work. Maybe you could post some of yours so we can pick it apart.

    3. Re:bash, tar and netcat by jesboat · · Score: 1
      The GP said:

      You know, I read the first word of your blog and had already decided it wasn't going to be worth my time read. You never capitalized any of sentences.


      I haven't read your blog, but I'm assuming it is as described. Both of those sentences are almost certainly true and aren't things you can argue about (fact and opinion presented as such.)

      It makes you look unprofessional, ignorant, and most of all someone who doesn't know how to write. So if you expect to be taken seriously, go back and fix your work.


      These statements are on shakier ground, but they're still true. You may not like it, but there is a stereotype of professionalism, and that stereotype does include grammatically correct writing. Some people will most certainly not take you seriously because of your presentation. (Whether or not that's what they should be doing is another question.)

      In short, the GP's points, while they could have been presented a bit more nicely, were valid.

      -------

      That's unfortunate that you feel that way. I don't have time to fix it, and I guess, I prefer that you not read it. I find content to be more important, and as you can see, I can use capitols when I feel like it. Considering this article made it to the front page of digg, I assumed that it's lack of capitols was undershot by it's actual useful and interesting content.


      That's all fine too; if you care about the part of your audience that will disregard grammar, others can think that's a bad decision, but that's all they can do, for it's certainly your decision to make.

      It's good to see Anonymous Cowards bitching about other people's work. Maybe you could post some of yours so we can pick it apart.


      This, however, is just plain immature and is the reason I reply. (Really? Wow.)

      In order to make valid points, perhaps-constructive criticism (not bitching), the GP should post non-anonymously? Should post some of his own work? Do you honestly think his identity or credentials should have an impact on facts he notes about your work?
    4. Re:bash, tar and netcat by kruhft · · Score: 1

      I'm not a professional; I don't get paid to write articles on how to back up your system, nor do I get paid for the music I write and artwork I make. If I was, I'd make very sure that what I did was perfect, since I was doing it and being a professional. This article was not, but contained very useful info that I happened to write on a day when I didn't feel like holding down the shift key.

      As for my audience, I write for those that are interested in information and not bitching about grammar. It's funny how people just completely disregard things right off the bat for little reasons. Maybe I post my articles to filter out such people; you can never be quite sure. They're the ones that lose out and I couldn't care less.

      And yes, he should post non-anon, because if he did, he probably wouldn't have bitched like he did, since hiding behind AC makes it a lot easier to say stupid things.

  18. Re:Backups (with right formatting) by slamb · · Score: 1
    The hash is stored in the meta information, and the compare option does only that, comparing the live system to your archive. It does not say anything about the change-detection behaviour used during a backup.

    True, but my assumption (which again, I haven't checked) is that they wouldn't have stored this hash if they weren't doing something with it. I don't think the sanity check uses any information that's not gathered for normal operation.

    [Time-based checking is not safe...touch example]

    True. Your backup from before the move will be correct, so if you were to catch this before you got rid of the pre-move increment, you'd have a way to recover manually. I assume you're talking about after that. Yeah, there's a problem, but I'd say it's an incredibly minor failure when compared to not having incremental backups at all. I've sometimes gone for several backup cycles before realizing anything was wrong, so it would be difficult to convince me they're not worthwhile.

    Do you have a real-world example where this might happen? The best I've got is moving messages in Cyrus IMAP - if they were both placed in the same second and have the same length, and you moved one away and the other into its folder before any other mail arrived there, I guess this would happen. I just consider that sequence pretty unlikely, and the consequences not too severe.

  19. I know it's not GPL kosher but... by Anonymous Coward · · Score: 0

    Ghost (as in Symantec/Norton/whoever owns it these days) works to backup a linux partition just fine. I know, I know, it's not free as in beer, speech, or ipod, but if you do already have it around, it does the job fine.

    1. Re:I know it's not GPL kosher but... by notanatheist · · Score: 1

      g4u = ghost for unix. Amanda, Mondo, Bacula, rsync, plenty of options. Seems a rather silly question to me. There's this place associated with Slashdot called SOURCEFORGE which is just great for looking up stuff. Of course searching for 'backup' only gives 567 options.

    2. Re:I know it's not GPL kosher but... by AWhistler · · Score: 1

      So use PartImage. It is available on RIPLinuX, SystemRescueCD, Knoppix Live CD and many others. Use dd to save the MBR, sfdisk to save the partition, and partimage on all the partitions. This is pretty much what g4u does. I haven't figured out how to get LVM's to work with this, but it will work with Windows and Linux partitions. Like Ghost, the image is intended to be restored to the original hardware, but can be moved to other hardware if you're willing to put up with the headaches of reinstalling some stuff afterward (device drivers, etc) after reboot.
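
      For the MBR step, the usual dd form is a one-sector copy (bs=512 count=1). A Python sketch of the same thing is below. The function name and paths are placeholders, and the signature check relies on the standard MBR layout, where the last two bytes of the sector are 0x55 0xAA.

```python
def save_mbr(device, dest):
    """Copy the first 512-byte sector of `device` to `dest`.

    Returns True if the sector ends with the 0x55AA boot signature.
    """
    with open(device, "rb") as src:
        sector = src.read(512)
    if len(sector) != 512:
        raise ValueError("source is shorter than one 512-byte sector")
    with open(dest, "wb") as out:
        out.write(sector)
    return sector[510:512] == b"\x55\xaa"
```

      The saved sector includes the primary partition table, so it only makes sense to write it back to the same or an identically partitioned disk.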

      I use this on all my PC's at home, and it works great. I use a bootable CD, but if you set up PXE booting, you can schedule your PC to backup at specific times. Individual file restoration is not possible nor are incremental backups, however.

    3. Re:I know it's not GPL kosher but... by DarcZide · · Score: 1

      Now 568 results :)

      --
      That was either the start of something bad or the end of something stupid. -Bun Bun
    4. Re:I know it's not GPL kosher but... by Slashdot+Parent · · Score: 1

      Ahh, but how do you know which of those 567 options to trust your precious data with? Or do you just keep 567 backups?

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  20. It's mentioned, just buried by Beryllium+Sphere(tm) · · Score: 1

    Section 4 brings up the issue of data files from running applications and agrees with your recommendation of pg_dump or shutting down to do the backup.

    Section 7 recommends syncing and sleeping and warns "consider a tar backup routine which first makes the backup and then removes the old one. If the cache isn't synced and the power fails during removing of the old backup, you may end up with both the new and the old backup corrupted".
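
    The ordering concern in that quote can be sketched in Python. Note that this swaps the article's "sync and sleep" for explicit `fsync` calls on the new file and its directory, which is my substitution, not what the article describes. The function name is invented, and the directory `fsync` assumes a POSIX system.

```python
import os


def replace_backup(new_data, new_path, old_path):
    """Write the new backup durably, and only then remove the old one."""
    with open(new_path, "wb") as f:
        f.write(new_data)
        f.flush()
        os.fsync(f.fileno())  # force the file contents to stable storage
    # Also sync the directory so the new file's entry survives a power cut (POSIX only).
    dir_fd = os.open(os.path.dirname(os.path.abspath(new_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    if os.path.exists(old_path):
        os.remove(old_path)
```

    The point is simply that the old backup is never touched until the new one has been pushed out of the cache, so a power failure at any moment leaves at least one intact copy.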

  21. Random thoughts by digitalhermit · · Score: 1

    I deal with some aggregate 2 terabytes of storage on my home file servers. What works for me won't work for an enterprise corporate data center, but maybe some things are useful...

    I think the article does a good job of explaining how to backup, but maybe just as important is "why?". There are some posts that say put everything on a RAID or use mirror or dd. What they fail to address is one important reason to backup: human error. You may wipe a file and then a week later need to recover it. If all you're doing is mirroring or RAID, no matter how reliable, your backups are worthless.

    There's also different classes of data. I have gigabytes of videos. Some are transcoded DVDs, some are raw footage. If I lose all my transcoded DVDs it's not as critical as if I lost raw footage. Why? The DVDs can be re-ripped. It will take a long time but the data can be recreated. For the raw footage it's different, even if I keep the original Mini-DV tapes, because re-recording the video from tape won't guarantee that the file is identical. If the file is different then the edits will be different. Then there's also mail spools, CVS, personal files, etc..

    What I've found is that I archive my DVD rips once every few months. Other stuff is backed up once a week to another file server.

    I could care less about the OS. The file server runs FedoraCore5. The only thing I keep is the Kickstart file so that I can rebuild it within a matter of minutes then restore the data from archives. This is just a matter of copying a samba configuration and restarting.

    For the web server, all content is kept within CVS. If the web server fails, it's just a matter of rebuilding the image and pulling the latest copy from CVS. Fifteen minutes to re-image the OS. Five minutes to pull down the latest content.

    For DNS, initial configuration for 8 domains is done by a perl script that auto-creates the named.conf and all zone files. Then I just append the host list to the primary domain. Ten minutes at most.

    Home directories are centralized on a file server using OpenLDAP and automounts. One filesystem to back up makes it easy. Being easy means it gets done automatically.

    Other "machines" are virtual and these are copied to DVD whenever something drastic changes (e.g., major upgrade).

    1. Re:Random thoughts by Anonymous Coward · · Score: 0
      ... I could care less about the OS. THe file server runs FedoraCore5. The only thing I keep is...

      You could care less? Then why don't you care less? Are you saying you do actually care what OS the file server runs? Does our sentences make senses?

  22. Nonsense by gweihir · · Score: 1

    What can I say? I just did two successful system restores today from my "tar cjlf /" created system backups. I did several more in the last few years. Never had problems. I think this guy is just trying to sound mysterious and knowledgeable....

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Nonsense by gweihir · · Score: 1

      ... of course that would be "tar cjlf target_file.tar.bz2 /". ...

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    2. Re:Nonsense by sl3xd · · Score: 1

      I agree completely. I work for a company whose method of Linux installation is frequently... boot CD, unpack tarball. It takes a bit of care to make sure you don't mangle the permissions & other metadata, but it's not that mystical.

      The article also has outright falsehoods in it: For instance, ReiserFS can be configured to do data journaling (it just doesn't call it that), and has had this ability for quite some time now. And IIRC, ReiserFS4 can't be configured to disable data journaling.

      It's odd how passionate people can get about their filesystem of choice. It's almost like the author has a bone to pick. When you have the opportunity to power cycle 1,000 identical systems at the same time (just yank the power -- none of that graceful writing out buffers to disk crap), it's fairly easy to see patterns emerge. To be honest, I haven't seen ReiserFS to be more or less reliable than ext2/3.

      --
      -- Sometimes you have to turn the lights off in order to see.
  23. www.bacula.org by Penguinisto · · Score: 2, Insightful
    Bacula, baby!

    Works fine with my autoloaders, and it's open source.

    /P

    --
    Quo usque tandem abutere, Nimbus, patientia nostra?
    1. Re:www.bacula.org by WuphonsReach · · Score: 1

      And moderately difficult to install... Don't get me wrong, it's our platform of choice and I'm working on setting up a central backup server using it. But I reckon that I still have a few hours of reading before I'll have it up and running and making backups.

      (OTOH, I prefer it that way in the long run, because it forces me to learn the ins/outs of the system. Which is better than click-click-click-done and then not knowing how to fix it when things go pear-shaped.)

      --
      Wolde you bothe eate your cake, and have your cake?
    2. Re:www.bacula.org by Penguinisto · · Score: 1
      Actually, it's not as bad as it first appears. When I first eyeballed it (and was looking at alternatives that weren't so OSS), I realized even then that it was worth the time I spent learning it - the tech support call savings alone would be well beyond valuable, let alone the price tag (free!) :).

      The docs onsite are pretty valuable, and they walk you through setup nicely. Installation isn't too bad; even a default MySQL or PostgreSQL installation on the box can be prepped and ready to go with the provided scripts, or you can just use the built-in DB.

      My only real gripes so far are that:

      • I had to learn more about postgres (my fault). Even though Bacula comes with an internal DB engine if you choose it, I figured that I already somewhat knew how to support pgsql, so I chose the devil I knew as opposed to the one I didn't.
      • Needs a few more pre-built tools for certain jobs. I have scripts (based heavily on mt and mtx) that do quite a few jobs, but I'd love to see a binary that can automatically make bootstrap CDs, and perhaps a standalone brestore-based set of tools - that would be nice.
      • I have a couple 'doze servers buried in a huge Linux-centric environment. It's not tough to set up or run on any of them really (on Linux and FBSD it's a frickin' breeze, client-side), but I can see most MCSEs scratching their heads going "WTF!?" a lot if they tried it. Prolly has to do w/ the *nix-like roots of the thing.

      OTOH, if my company had gone out and bought Legato, Bakbone, or one of the big boys, it would've required learning it fairly well and paying a ginormous pricetag. Also, if things ever go too far south, I know Bacula well enough now that I can handle it, pretty much no matter what "it" is.

      The best part is, it runs just fine on a Dell PV-124T (16-tape LTO-2), a Dell PV-132 (~28-tape LTO-3), as well as some ungodly old DLT IV kit I have stashed off to one side but can't seem to quite get rid of (yet).

      YMMV as usual, though :)

      /P

      --
      Quo usque tandem abutere, Nimbus, patientia nostra?
    3. Re:www.bacula.org by WuphonsReach · · Score: 1

      We used to use NovaStor... but that has never worked well on the Windows boxes. So now I'm setting up a 1.3TB 4-disk RAID10 server (expandable to 2.6TB) and we're going to use Bacula for the Unix/Linux boxes and to back up the data on the Windows servers as well. There's also a set of 500GB IDE drives that we take offsite weekly that are on a WinXP box that I have to work into the equation. The amount of data that we have to back up daily is about 200GB but only a few percent changes daily.

      All this just happens to be occurring at the same time that I'm setting up Xen, a central log server, a subversion server, etc. So I'm a bit overwhelmed this week.

      Tape drives have proven to be too problematic for us (4mm DAT then Sony's 50GB tape drive). So instead we're going with a central backup "vault" server with hard drives as the offsite component (6+ drives in rotation). And we're slowly moving more and more files into a version control system (SVN) which eliminates a lot of the "oops" that would require us to pull a backup tape for a quick restore.

      --
      Wolde you bothe eate your cake, and have your cake?
  24. Informative, well written by dave562 · · Score: 1

    The article addressed a question that has been nagging at the back of my mind but I haven't gotten around to figuring out the answer to. I like the way that the article is to the point, and very in depth. The author does a good job of explaining the various aspects of the files and the importance of preserving them, and then goes on to detail the steps necessary to preserve them.

    1. Re:Informative, well written by SawanGupta · · Score: 1

      Disaster recovery plans vary from user to user, depending on what you want to back up. For example, I wouldn't bother backing up permissions on some files (maybe MP3s), but I might need them for others.

      This article explains many things that I hope will be very useful to many of us.

      Good Job Dude.

  25. Little to say... by evilviper · · Score: 1

    He is complaining about people who suggest backing-up with "tar cvz /" but really, the only thing missing is the: "p". I use it extensively and it just works (not for databases, but that should go without saying).

    In order to ensure I'm never in a tough spot, I made a custom bootable image using my distro's kernel and utilities. Then I made a bzip2 -9 compressed tar backup of my notebook hard drive, which is just small enough to fit on a single CD... (With DVD-Rs these days, the situation is even better).

    After burning it all to CD, I restored from it, and it's still working perfectly to this day. Now I can be sure that no matter what unforeseen events happen, I'll never be stranded with a non-working notebook due to software problems, and a CD is notably lighter than carrying a second, backup notebook.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:Little to say... by halfgaar · · Score: 1

      You also have to specify --numeric-owner...

    2. Re:Little to say... by evilviper · · Score: 1

      Helpful in certain situations, but not necessary.

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    3. Re:Little to say... by halfgaar · · Score: 1

      Depends on the situation. If you're restoring from another system, such as a live CD, --numeric-owner is vital, or users can be messed up. The GIDs and UIDs of the system you're using to restore are not necessarily identical to those of the system you're restoring. Therefore, matching owners by name is not reliable.
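
      A minimal sketch of the tar invocation being argued about here, with both -p and --numeric-owner spelled out. The function names and paths are made up for illustration; the options themselves are standard GNU tar.

```shell
# Hypothetical helper names; the options are GNU tar's.
# --preserve-permissions is the long form of -p.
# --numeric-owner stores and restores raw UID/GID numbers instead of
# looking names up in the passwd/group files of the running system.
backup_tree() {
    # $1 = archive to write, $2 = directory to back up
    tar --create --gzip --preserve-permissions --numeric-owner \
        --file="$1" --directory="$2" .
}

restore_tree() {
    # $1 = archive to read, $2 = directory to restore into
    mkdir -p "$2"
    tar --extract --gzip --preserve-permissions --numeric-owner \
        --file="$1" --directory="$2"
}
```

      When restoring from a live CD, something like `restore_tree /media/backup/root.tar.gz /mnt/target` (both paths assumed) would put the original numeric owners back regardless of what the live system's passwd file contains. Ownership is only restored when tar runs as root.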

    4. Re:Little to say... by evilviper · · Score: 1
      If you're restoring from another system, such as a live CD,
      ...which I doubt many people do. And even if they do, chances are that passwd file will be empty (except for root) anyhow.

      Still, it's really not worth arguing. It doesn't change the point of my post at all.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  26. Another very key point which was missed by btarval · · Score: 1
    That was a good article. However, there was one very key point which was missed. Specifically, the importance of using Open Source tools. While this might be implied by the references, the author (like most people) has never faced a disaster situation where Open Source was the only way to do the backup and recovery.

    Here's a real life case in point that I came across with a Fortune 500 company. This company had recently acquired a small startup, whose system administration skills were lacking. Before moving to the new facilities after being acquired, one of the V.P.'s told people not to bother copying their home directories from the server. They then proceeded to simply shut off the power to the server without doing a proper shutdown.

    The server was using a Reiser FS, and this filesystem was spectacularly fried when it arrived at the big company. Tons of supposedly petabyte file sizes, and even more in the Terabyte range (clearly exceeding the size of the disk array). In the end, over 12,000 files and directories would end up under lost+found.

    Anyway, it was decided to back up the filesystem before attempting to recover the files. Absolutely everything broke when trying to do this, as Linux doesn't handle petabyte (or even terabyte) files properly. There are subtle problems with all of the utilities (find, ls, cp, cpio, and tar, to name just a few). While this isn't surprising, when you're trying to make a backup, it presents a serious problem.

    Were we dependent upon a closed-source solution, we would've been seriously stuck waiting for a fix.

    In the end, I actually had to modify GNU tar to handle these problems. This was particularly amusing, as tar hadn't been modified in years. But it was the only way out of the situation in a timely fashion.

    The point here is that you have this option for really nasty disaster scenarios if you are familiar with Open Source tools.

    --
    The best way to predict the future is to create it. - Peter Drucker.
    1. Re:Another very key point which was missed by Barnoid · · Score: 1
      Anyway, it was decided to back up the filesystem before attempting to recover the files. Absolutely everything broke when trying to do this, as Linux doesn't handle petabyte (or even terabyte) files properly. There are subtle problems with all of the utilities (find, ls, cp, cpio, and tar, to name just a few). While this isn't surprising, when you're trying to make a backup, it presents a serious problem.

      In the end, I actually had to modify GNU tar to handle these problems. This was particularly amusing, as tar hadn't been modified in years. But it was the only way out of the situation in a timely fashion.


      Dude, next time use 'dd' to backup a broken filesystem you might want to restore 1:1.
    2. Re:Another very key point which was missed by btarval · · Score: 1
      dd wasn't an option as we didn't have enough free disk space for making such an image. We'd have had to either set up another large RAID array or buy a new NAS server. Both would take time just to get the approval; and in the case of the NAS server, it would be a significant amount of time.

      And time was of the essence, because having a bunch of engineers sitting around waiting for their files adds up to a significant amount of money.

      IT departments in large companies are a little funny, in that they'd prefer to go with the NAS server. Mostly because of job security in case things go wrong. And it was the home-grown RAID array which triggered this mess in the first place.

      --
      The best way to predict the future is to create it. - Peter Drucker.
    3. Re:Another very key point which was missed by cr0sh · · Score: 1

      I won't pretend to know the situation, because you were there and I wasn't, but from your description it doesn't sound like the problem was a "home-grown RAID array" which triggered the mess. What triggered the mess was a failure to follow a good process for the move. The fact that they didn't allow the copying of the home folders to each user's desktop was the first mistake, the second mistake was just shutting off the power instead of performing a proper shutdown. While a real RAID array sub-system would likely have handled that much more gracefully (even so, in such a situation you still wouldn't just cut the power), and a proper backup solution for the data instead of copying the data to the desktops would be preferred, the improper steps (for shutdown and data retention) the company took ultimately caused the end result, and not the way the system was implemented. With all that said, though, it does sound like the system ultimately was a fragile one, and it is amazing you came up with such a unique method of restoring the fubared data. Good points for Open Source!

      --
      Reason is the Path to God - Anon
    4. Re:Another very key point which was missed by btarval · · Score: 1
      I agree completely; you are quite correct. Thank you for pointing that out.

      It was clearly a failure of process here. Having built my own home-grown RAID systems from scratch, I find them quite useful. Like any system, incorrect usage will lead to problems. Such was the case here.

      Indeed, one of the options was to build one simply for the storage here. Had I been guaranteed reimbursement for this, one could've been put together in a day or so.

      Unfortunately, the reimbursement was an issue.

      --
      The best way to predict the future is to create it. - Peter Drucker.
  27. Depends on the Skill of the Admin by Anonymous Coward · · Score: 0

    It all depends on the skill of the admin, personally, I get by with

    tar czf backup_$date.tar.gz /etc /var /home, and burn to DVD

    there is about 90% of the system backed up. I don't bother with apps, since those are generally archived. Most config stuff is properly placed in /etc somewhere.
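
    The `$date` in that command is never set. A small sketch of one way to fill it in; the wrapper name and the output directory are assumptions, not part of the original post.

```shell
# Hypothetical wrapper: writes backup_YYYY-MM-DD.tar.gz into directory $1,
# archiving the remaining arguments (default: /etc /var /home).
# Prints the name of the archive it created.
dated_backup() {
    outdir=$1
    shift
    [ "$#" -gt 0 ] || set -- /etc /var /home
    stamp=$(date +%Y-%m-%d)
    tar -czpf "$outdir/backup_$stamp.tar.gz" "$@" || return 1
    echo "$outdir/backup_$stamp.tar.gz"
}
```

    Called as `dated_backup /mnt/dvd-staging` it archives the three directories named above; the resulting file can then be burned to DVD.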

    For 'troublesome' apps, that's where rpm comes in, or better yet, apt-get (ie: anything built from source, with custom mods).

    This works if the admin is skilled enough. For those who simply don't have the skills/time/energy, there is always Amanda/snapshots/Ghost, etc.

    Depends on how much time, money, storage space one has after all.

  28. Hey thanks... by Seraphim_72 · · Score: 2, Funny
    My buddy Halfgaar finally got sick of all the helpful users on forums and mailing lists...
    Hey thanks, Fuck You too.

    Signed
    The Helpful People on forums and mailing lists
    --
    Slashdot, where armchair scientists get shouted down and armchair theologians get modded up.
  29. Arguably worthless by swordgeek · · Score: 3, Insightful

    When you work in a large environment, you start to develop a different idea about backups. Strangely enough, most of these ideas work remarkably well on a small scale as well.

    tar, gtar, dd, cp, etc. are not backup programs. These are file or filesystem copy programs. Backups are a different kettle of fish entirely.

    Amanda is a pretty good option. There are many others. The tool really isn't that important other than that (a) it maintains a catalog, and (b) it provides comprehensive enough scheduling for your needs.

    The schedule is key. Deciding what needs to get backed up, when it needs to get backed up, how big of a failure window you can tolerate, and such is the real trick. It can be insanely difficult when you have a hundred machines with different needs, but fundamentally, a few rules apply to backups:

    For backups:
    1) Back up the OS routinely.
    2) Back up the data obsessively.
    3) Document your systems carefully.
    4) TEST your backups!!!

    For restores:
    1) Don't restore machines--rebuild.
    2) Restore necessary config files.
    3) Restore data.
    4) TEST your restoration.

    All machines should have their basic network and system config documented. If a machine is a web server, that fact should be added to the documentation but the actual web configuration should be restored from OS backups. Build the machine, create the basic configuration, restore the specific configuration, recover the data, verify everything. It's not backups, it's not a tool, it's not just spinning tape; it's the process and the documentation and the testing.
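
    The "TEST your backups" rule can be automated at its most basic level. A rough sketch, assuming a gzip-compressed tar archive created relative to the backed-up directory; the helper name is invented, and `diff -r` only compares file contents, not ownership or permissions.

```shell
# Hypothetical helper: restore an archive into a scratch directory and
# compare it with the live tree. Returns 0 only if the contents match.
verify_backup() {
    # $1 = archive made with "tar -C <dir> -czpf <archive> ."
    # $2 = the directory that was backed up
    scratch=$(mktemp -d) || return 1
    tar -xzpf "$1" -C "$scratch" && diff -r "$2" "$scratch"
    status=$?
    rm -rf "$scratch"
    return "$status"
}
```

    This is only meaningful on data that has not changed since the backup ran, so in practice it would be pointed at a snapshot or a quiet directory.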

    And THAT'S how you save 63 billion dollar companies.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  30. Righteous Backup by krappie · · Score: 1

    I like to use Righteous Backup: http://www.r1soft.com/

    It's new, but it shows a lot of promise. It uses a kernel module to take consistent backups of partitions at the file system block level and store them on a remote server. The cool part is, it tracks changes. If you haven't rebooted your machine since the last backup, it takes a few seconds to send the changed blocks and almost no CPU usage. It can also interpret the file system in any incremental backup to restore individual files. Not to mention backing up the entire MBR and partition tables of disks. I don't think anything like it currently exists for linux servers.

  31. dirvish by Sfing_ter · · Score: 1

    Dirvish, written in Perl and using rsync, is a fast disk-to-disk backup. Enjoy.

    --
    A computer once beat me at chess, but it was no match for me at kick boxing. Emo Philips
  32. mdadm by jab · · Score: 1

    My personal favorite is to swap a pair of disks in and out of a super-redundant RAID
    every week. Simple, predictable, and works fine in the face of lots of small files.


    # cat /proc/mdstat
    Personalities : [raid1]
    md1 : active raid1 sdf1[6] sdb1[4] sdd1[3] sdc1[2] sde1[1]
                488383936 blocks [6/4] [_UUUU_]
                [============>........] recovery = 61.6% (301244544/488383936) finish=231.7min speed=13455K/sec

    # mount | grep backup
    /dev/sdg1 on /backup type reiserfs (ro)

    1. Re:mdadm by Logreybaby · · Score: 1

      WTF is "super-redundant" RAID?
      Is that like level 1111?

  33. My short opinion - Two commercial systems by Anonymous Coward · · Score: 0

    After the plethora of stinky windows based backup systems in the mid to late 90's . . .

    . . . I used Legato as both client and server, on both windows and linux. None of the possible combinations ever gave me any real joy. It backed stuff up, and test restores went ok, if you consider how it had to work o.k. I lived with it for 4 or 5 years.

    Found http://www.arkeia.com/. Never been happier. I run the server on linux. Using autoloader stacks. First set of *sane* backup configuration patterns I've had the pleasure to work with. I don't suggest the windows "JUI". Not my thing anyway, but still.. I look forward to them getting that part right, if they haven't already in the latest version.

    If you're looking for non-commercial.. I just use tar and netcat similar to as described in posts above. I've used a few of the backup tools available via the Debian repository, but most really do *more* than what I really want for my personal systems.

  34. xfsdump by Anonymous Coward · · Score: 0

    At work I use SGI IRIX boxes with the (default) XFS filesystem. xfsdump is fantastic for dumping whole filesystems. You can periodically do a level 0 (full) backup and then subsequently do incremental backups from that. I have used it to restore full systems a couple of times and they have worked flawlessly.

    I also use XFS and xfsdump on a Debian stable box which is a large media server and I regularly do a test restore and the md5 sums of all the files have been just right :)

    XFS and xfsdump rock my world.

  35. Re:Backups (with right formatting) by halfgaar · · Score: 1
    True, but my assumption (which again, I haven't checked) is that they wouldn't have stored this hash if they weren't doing something with it. I don't think the sanity check uses any information that's not gathered for normal operation.

    The hash information feature was included after I suggested a feature for hash-change-checking. The hash is already stored, because that was easy to do, but the change checking never got implemented.

    True. Your backup from before the move will be correct, so if you were to catch this before you got rid of the pre-move increment, you'd have a way to recover manually. I assume you're talking about after that. Yeah, there's a problem, but I'd say it's an incredibly minor failure when compared to not having incremental backups at all. I've sometimes gone for several backup cycles before realizing anything was wrong, so it would be difficult to convince me they're not worthwhile. Do you have a real-world example where this might happen? The best I've got is moving messages in Cyrus IMAP - if they were both placed in the same second and have the same length, and you moved one away and the other into its folder before any other mail arrived there, I guess this would happen. I just consider that sequence pretty unlikely, and the consequences not too severe.

    It is very unlikely, yes. I know incremental backups can be very handy, so it's perfectly logical people find it's worth the "risk". I don't know of any concrete significant real world examples, I just don't like situations where the theory shows that the backup may miss files.

    I know it's kind of an extreme example, but would you go up in an airliner of which you know the software would crash at a certain speed, angle and fuel consumption combination?

  36. Oh, so many problems... by SanityInAnarchy · · Score: 2, Informative

    So his complaint about GNU Tar is that it requires you to remember options... Just look at his Dar command! Seriously, I just do "tar -cjpSf foo.tar.bz2 bar/ baz/" and it just works. And since you should be automating this anyway, it doesn't matter at all.

    There is also a separate utility which can split any file into multiple pieces. It's called "split". They can be joined together with cat.
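
    A quick sketch of that split/cat round trip. The helper names and the `.part.` suffix are invented; split(1) and cat(1) do the actual work.

```shell
# Hypothetical helper names.
split_archive() {
    # $1 = file to split, $2 = piece size as accepted by split -b (e.g. 650M)
    split -b "$2" "$1" "$1.part."
}

join_archive() {
    # $1 = original file name; pieces are $1.part.aa, $1.part.ab, ...
    # The shell expands the glob in sorted order, which matches split's suffixes.
    cat "$1".part.* > "$1"
}
```

    For example, `split_archive backup.tar.bz2 650M` produces CD-sized pieces, and `join_archive backup.tar.bz2` puts them back together on the restore side.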

    As for mtimes, I ran his test. touch a; touch b; mv b a... Unless the mtimes are identical, backup software will notice that a has changed. This is actually pretty damned reliable, although I'd recommend doing a full backup every now and then just in case. Of course, we could also check inode (or the equivalent), but the real solution would be a hash check. Reiser4 could provide something like this -- a hash that is kept current on each file, without much of a performance hit. But this is only to prevent the case where one file is moved on top of another, and each has the exact same size and mtime -- how often is that going to happen in practice?
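
    Until a filesystem keeps such a hash for you, a userspace approximation is possible, at the cost of reading every file. A rough sketch with invented helper names, using sha256sum from GNU coreutils; it catches changed and missing files but not newly created ones.

```shell
# Hypothetical helper names.
make_manifest() {
    # $1 = directory to scan; prints one "hash  path" line per regular file
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

changed_files() {
    # $1 = directory, $2 = absolute path of the manifest from the last backup
    # Prints the files whose content no longer matches the manifest.
    (cd "$1" && sha256sum --check --quiet "$2" 2>/dev/null) |
        sed 's/: FAILED.*$//'
}
```

    In the `mv b a` case with equal sizes and mtimes, `changed_files` still reports `./a`, because the content hash differs even though mtime+size does not.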

    Backing up to a filesystem: Duh, so don't keep that filesystem mounted. You might just as easily touch the file metadata by messing with your local system anyway. Sorry, but I'm not buying this -- it's for people who 'alias rm="rm -i"' to make sure they don't accidentally delete something. Except in this case, it's much less likely that you'll accidentally do something, and his proposed solutions are worse -- a tar archive is much harder to access if you just need a single file, which happens more than you'd expect. We used BackupPC at my last job, but even that has a 1:1 relationship between files being backed up and files in the store, except for the few files it keeps to handle metadata.

    No need to split up files. If you have to burn them to CD or DVD, you can split them up while you burn. But otherwise, just use a modern filesystem -- God help you if you're forced onto FAT, but other than that, you'll be fine. Yes, it's perfectly possible to put files larger than 2 gigs onto a DVD, and all three modern OSes will read them.

    Syncing: I thought filesystems generally serialized this sort of thing? At least, some do. But by all means, sync between backup and clean, and after clean. But his syncs are overkill, and there's no need to sleep -- sync will block until it's done. No need to sync before umount -- umount will sync before detaching. And "sync as much as possible", taken to a literal extreme, would kill performance.

    File system replication: You just described dump, in every way except that I don't know if dump can restrict to specific directories. But this doesn't really belong in the filesystem itself. The right way to do this is use dm-snapshot. Take a copy-on-write snapshot of the filesystem -- safest because additional changes go straight to the master disk, not to the snapshot device. Mount the snapshot somewhere else, read-only. Then do a filesystem backup.
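
    One common way to get a dm-snapshot is through LVM. A sketch of the sequence just described, assuming the filesystem lives on an LVM logical volume; the volume names, the 1G snapshot size and the mount point are placeholders. The commands need root, so the sketch can print them instead of running them when DRY_RUN is set.

```shell
# Sketch only: names, sizes and mount point are assumptions.
run() {
    # Print the command when DRY_RUN is set, otherwise execute it.
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

snapshot_backup() {
    # $1 = volume group, $2 = logical volume, $3 = archive to write
    vg=$1; lv=$2; archive=$3
    snap="${lv}_snap"
    mnt="/mnt/$snap"
    # Copy-on-write snapshot: later writes go to the origin, not the snapshot.
    run lvcreate --snapshot --size 1G --name "$snap" "/dev/$vg/$lv" || return 1
    run mkdir -p "$mnt"
    # Mount read-only and do an ordinary file-level backup from it.
    run mount -o ro "/dev/$vg/$snap" "$mnt" &&
        run tar -czpf "$archive" --numeric-owner -C "$mnt" .
    status=$?
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
    return "$status"
}
```

    Running `DRY_RUN=1 snapshot_backup vg0 root /backup/root.tar.gz` shows the six commands without touching anything.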

    "But the metadata!" I hear him scream. This is 2006. We know how to read metadata through the filesystem. If you know enough to implement ACLs, you know enough to back them up.

    As for ReiserFS vs ext3, there actually is a solid reason to prefer ext3, but it's not the journalling. Journalling data is absolutely, completely, totally, utterly meaningless when you don't have a concept of a transaction. I believe Reiser4 attempts to use the write() call for that purpose, but there's no guarantee until they finish the transaction API. This is why databases call fsync on their own -- they cannot trust any journalling, whatsoever. In fact, they'd almost be better off without a filesystem in the first place.

    The solid reason to prefer ext3 is that ReiserFS can run out of potential keys. This takes a lot longer than it takes ext3 to run out of inodes, but at least you can check how many inodes you have left. Still, I prefer XFS or Reiser4, depending on how solid I need the system to be. To think that it comes down to "ext3 vs reiserfs" means this person has obviously never looked at the sheer number of options available.

    As for network backups, we used both BackupPC and DRBD. BackupPC to keep things sane -- only one backup per day. DRBD to replicate the backup server over the network to a remote copy.

    --
    Don't thank God, thank a doctor!
    1. Re:Oh, so many problems... by Anonymous Coward · · Score: 0

      On certain old systems, sync doesn't block ...

  37. The best solution... by Anonymous Coward · · Score: 0

    is not to backup at all. Backup and restore are time and resource consuming tasks and should be avoided. Redundance minimizes the risk of data loss in case of drive failure or power loss. Paranoid sysadmins mirror the entire system up to third redundancy level and perform regular backups as well, just in case the devil's horns appear...

  38. Backup2l by sacx13 · · Score: 0

    Is anyone using backup2l? It comes with the Debian distro and I'm very happy with it. It's a little bit slower at recovery, but that's not the point.
    Regards

  39. Just The Files (For Linux) by Q7U · · Score: 1

    On my personal Arch Linux system at home, I prefer to simply back up my home directory and the xorg.conf configuration file. Linux is fast and easy to reinstall (at least Arch and Slackware are), so I don't really worry about bare metal recovery. Windows, which I also like to run, takes forever to install and is far more likely to have problems. That's where I am interested in bare metal recovery.

  40. rsync-backup by zero-g · · Score: 1

    I back up using rsync-backup to another hdd. I wrapped that in a simple perl script and use it to back up only the most essential of directories: /etc, /home, /root, etc. I've found that there's no real need to back up every file on your system -- they'll just get reinstalled anyway. For larger collections of files (eg. mp3s), I'd recommend another hard drive and just rsync it nightly... or RAID.

    1. Re:rsync-backup by zero-g · · Score: 1

      Sorry, misspoke: rdiff-backup.

  41. deep sigh by rpeppe · · Score: 1
    i'm now using linux after having been on a plan 9 system for years, and i really, really, really miss venti and fossil.

    oh the joy of having archival snapshots of each day, instantly available.

    most of all i miss singing along to yesterday.

  42. The problem with dump. by Ayanami+Rei · · Score: 1

    Can't use it on a live filesystem. No guarantee.
    Now if you use a volume manager you can create snapshots and back those up instead. Unfortunately most filesystems don't have a way of being told that a snapshot is being taken, and to checkpoint themselves. With the exception of XFS. I think there's a patch for ext3 to do this as well, but I don't know which distros include it by default.

    I am of the opinion that the safest route is to do a backup at the mounted level of the filesystem from a snapshot from userspace (with any database services quiesced, or not backed up by this procedure). Pick the tool that preserves metadata best and has good support for incrementals (stores CRCs if possible).

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:The problem with dump. by kamochan · · Score: 1

      Unfortunately most filesystems don't have a way of being told that a snapshot is being taken, and to checkpoint themselves.

      FFS in *BSD has had snapshot support for some time. Although I never had problems restoring FFS's dump'd in the pre-snapshot days (and yes, I had to recover from a few complete RAID array blow-outs in very actual practice... flaky MegaRAID controllers, and whatnot... ugh).

      dump(8) rules.

      Also I have never quite understood the Linux XYZFS of the day. Sure, FFS ain't the latest and greatest, but it's quite darn robust and snappy enough for most practical uses, and basic stuff like restores work, and work reliably. So why no FFS in *nux?

  43. http://linuxgazette.net/114/kapil.html by Ivan+Matveich · · Score: 1

    Linux's device mapper can snapshot a block device, which you can then write to your backup medium of choice.

  44. Re:Backups (with right formatting) by arth1 · · Score: 1
    No, it's not (safe, I mean). Do this:

    touch a b
    edit a and b to be the same length but different content
    stat a b
    mv b a
    stat a
    a will now have the mtime b had first. mtime+size is not changed, file is not backed up.

    This is a danger in my opinion.


    Why on earth would you look at the mtime? That's what ctime is for!
    % echo foo >a
    % echo bar >b
    % stat a b | grep -v Uid
      File: `a'
      Size: 4 Blocks: 8 IO Block: 4096 regular file
    Device: 343h/835d Inode: 18437034 Links: 1
    Access: 2006-10-13 13:26:51.360667383 -0400
    Modify: 2006-10-13 13:26:51.360667383 -0400
    Change: 2006-10-13 13:26:51.360667383 -0400
      File: `b'
      Size: 4 Blocks: 8 IO Block: 4096 regular file
    Device: 343h/835d Inode: 18437045 Links: 1
    Access: 2006-10-13 13:26:54.640489932 -0400
    Modify: 2006-10-13 13:26:54.640489932 -0400
    Change: 2006-10-13 13:26:54.640489932 -0400
    % mv b a
    % stat a | grep -v Uid
      File: `a'
      Size: 4 Blocks: 8 IO Block: 4096 regular file
    Device: 343h/835d Inode: 18437045 Links: 1
    Access: 2006-10-13 13:26:54.640489932 -0400
    Modify: 2006-10-13 13:26:54.640489932 -0400
    Change: 2006-10-13 13:27:52.041384599 -0400
    Note the new ctime.
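
    The same idea can drive an incremental file selection. A minimal sketch, assuming a stamp file that was touched when the previous backup started; the helper name and the stamp path are invented. find's -cnewer compares each file's ctime against the stamp's mtime.

```shell
# Hypothetical helper: list regular files under $1 whose ctime is newer
# than the mtime of the stamp file $2.
changed_since() {
    find "$1" -type f -cnewer "$2"
}
```

    Something like `changed_since /home /var/backups/last-run` would then list the replaced `a` from the example above, even though its mtime and size look unchanged.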
  45. My personal backup solution... by cr0sh · · Score: 1
    I have a backup solution I use at home which saved my butt once after I fubar'ed my server with a bad Debian update (was trying to do an update to Woody, but they had already switched things over to Sarge, and things got really messed up). While it isn't something that would be scalable for business (ah, who am I kidding - do not use this in a real IT department, please!), it has worked pretty well for me at home on my small network.


    Basically, each workstation runs a cron job (or under Windows, a scheduled task, IIRC) at a certain time at night (each WS is staggered to start at a different time to avoid overloading the small server on the network or hard drive throughput), which kicks off a batch file to mount a SMB share and copy the certain directories (mainly documents and development stuff, along with things like workstation email and such) off the workstation and over to the SMB server (which also functions as a web and database server). Then, at a different time (after all the workstations have copied), the server kicks off its own cron job to copy those directories, and others on the server (database images, config files for smb, apache, php, mysql and postgreSQL, mainly) and create an ISO9660 image of those files. This ISO image is then stored in another directory, along with the last 7 days of ISOs. Periodically I make a backup of the last ISO to a CD or DVD.
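
    The "last 7 days of ISOs" part is easy to get wrong, so here is a sketch of just the pruning step. It assumes date-stamped names such as backup-2006-10-13.iso, which the post does not specify; building the image itself (with mkisofs or genisoimage, depending on the distro) is left out.

```shell
# Hypothetical helper: keep only the newest $2 images in directory $1.
# Assumes names like backup-YYYY-MM-DD.iso, so sorting by name sorts by date.
prune_images() {
    dir=$1
    keep=${2:-7}
    # head -n -N (GNU) prints everything except the last N lines,
    # i.e. every image older than the newest $keep.
    ls -1 "$dir"/backup-*.iso 2>/dev/null | sort | head -n "-$keep" |
        while IFS= read -r old; do
            rm -f -- "$old"
        done
}
```

    Run from the server's nightly cron job after the new image is written, `prune_images /srv/iso 7` (path assumed) leaves exactly one week of images behind.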


    This works well for my purposes at home. At one time, I had things set up so that the server would automatically burn the ISO to a CD, but due to the location of my server (in a non-climate controlled, dusty attached workshop at my house), the CD burner didn't last long and died, refusing to burn CDs properly. Instead, I just burn the images from my workstation by mounting the directory as an SMB share.


    This system has served me well for almost two years now, and like I said, it saved my butt. After I hosed my server (Debian Woody, remember), I ended up installing Mandrake 10.1 on it (all gui options turned off, mind you - just running CLI here), then took my last backup CD image and copied the data over from that (along with restoring my MySQL database from the dump on the ISO). A couple of evenings of work and I was done, and had the new system up and running perfectly as if nothing had happened (and I got an upgrade to everything as well!). The scary thing was the fact that I had never done a "full test" of my backup strategy (in a business IT environment, this is a big no-no) - but it passed with flying colors. My backup system continues to run, with minor tweaks and additions here and there, but it has proved itself "under fire", and I am fine with it so far.

    --
    Reason is the Path to God - Anon
  46. samba and rsync by Anonymous Coward · · Score: 0

    I use samba on a fileserver and all the users' computers use roaming profiles, so all the data is in one place. They are all unprivileged users and can't save to C:\. There's no need to aggregate the data at the end of the day. The server uses RAID 5 incase there's a disk failure.

    Every night the fileserver rsyncs to an offsite computer (uses ssh). The backup computer makes hardlink copies of the old (yesterday's) data and then unlinks and stores any new (today's) data. Actually it stores the last 30 days' worth of changes and only uses up about 120% of the space that a single backup would (this depends on the amount of changes). For the 100GB or so of data we store, the backup takes about two minutes to complete because rsync transfers the changed "blocks" only.
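
    The hardlink trick can be sketched locally in Python as below. This is only an illustration of the idea, not the poster's actual script: the function name `snapshot` and its arguments are made up, it compares whole file contents instead of using rsync's delta transfer, and it ignores symlinks, ownership and special files. With rsync itself, the `--link-dest` option gives a similar result.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source, previous, new):
    """Create `new` as a snapshot of `source`, hard-linking files unchanged since `previous`."""
    source, previous, new = Path(source), Path(previous), Path(new)
    for root, _dirs, files in os.walk(source):
        rel = Path(root).relative_to(source)
        (new / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            old = previous / rel / name
            dst = new / rel / name
            if old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the data of yesterday's copy
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

    Every snapshot directory looks like a full backup, but an unchanged file occupies disk space only once, which is why 30 days of history can cost little more than a single copy.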

    If a user erases his Word document, I can call it back with scp backup_server:/daily.x/path . where x is the number of days old the copy of the file is.

    I'm thinking of adding another backup computer, this one on-site. It would be similar to the off-site daily backup, but this one would do hourly backups.

  47. Re:Backups (with right formatting) by halfgaar · · Score: 1

    Rdiff-backup uses mtime+size as a unique identifier. The author tried using ctime, but there were problems with that. Unfortunately, he couldn't remember what they were... Ben Escoto (the author) told me so personally.
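
    To make clear what "mtime+size as an identifier" means, here is a small Python sketch. It is not rdiff-backup's code; the names `signature` and `changed_files` are hypothetical.

```python
import os


def signature(path):
    """Return the (mtime, size) pair used here to decide whether a file changed."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def changed_files(paths, recorded):
    """Return the paths whose current signature differs from the one in `recorded`."""
    return [p for p in paths if recorded.get(p) != signature(p)]
```

    A file whose contents change while both its size and its mtime stay the same is not noticed by this check, and neither is a change that only touches permissions or ownership.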

  48. Why not FFS? by Ayanami+Rei · · Score: 1

    Well here's the current situation:
    You can use it if you want. UFS1 is supported R/W. UFS2 exists as read-only.

    However, ext3 is identical in capability to FFS, with additional journaling options. In the beginning Linux was using minix/xiafs. ext was introduced to help transition from that while bringing modern features to the table. Each evolution of the FS has been forward compatible to ease transition.

    extX, reiser, jfs and xfs.
    Each of them has a purpose:

    extX: simple, low-overhead, modest size limits, online resize.
    reiserfs: more efficient storage of directories, small files, etc. An attempt to bring new techniques to the table like plugins and support for extended metadata and indexing.
    jfs: an import from AIX that had support for larger volumes than EXT3 did at the time. More efficient for large files.
    xfs: an import from SGI that had support for large volumes, stripe sets, and online inode/block resizing. More efficient for large files.

    There are reasons to have different file systems depending on the needs of your data. extX fulfills the same roles as FFS and does a good enough job that trying to get FFS to "work" under Linux is not worth the effort except for cases of quorum disks and other chicanery.
    ReiserFS, JFS, and XFS (and your other flavor-of-the-month types) all have legitimate uses in various circumstances (typically when dealing with really really big volumes, big files, databases, or lots of tiny files).

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  49. Re:Backups (with right formatting) by arth1 · · Score: 1
    I'd love to know what problem he had with ctime, too. To quote Randal L. Schwartz (better known as Merlyn):

    Think of "mtime" as "what make looks at"
    Think of "ctime" as "what backup looks at, for incrementals"

    So, anything that would make a file "backupworthy" updates ctime.
    This has been the principle since the earliest days of Unix. If Linux
    didn't maintain this historical meaning, then Linux really *isn't*
    Unix. :)
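
    The distinction in the quote can be checked with a few lines of Python. This is a rough sketch for Unix-like systems only (on Windows, `st_ctime` means creation time), and the function name `chmod_effect` is made up for the example.

```python
import os
import stat
import time


def chmod_effect(path):
    """Change only the permissions of `path`; report whether mtime and ctime moved."""
    before = os.stat(path)
    time.sleep(0.1)  # let the clock advance past the timestamp granularity
    # Toggle the owner-execute bit: a metadata-only change, contents untouched.
    os.chmod(path, stat.S_IMODE(before.st_mode) ^ stat.S_IXUSR)
    after = os.stat(path)
    return {
        "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
        "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
    }
```

    On a typical Linux file system the chmod leaves mtime alone and bumps ctime, so an incremental backup that looks only at mtime would not pick up the new permissions.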


    Anyhow, good luck with your article, and give dump and cpio a spin :-)

    Regards,
    --
    *Art
  50. NetBackup for us... by PenguinBoyDave · · Score: 1

    We use NetBackup from Symantec (formerly Veritas). Supports all our distros and even FreeBSD & Mac. Works like a charm.

    --
    I'm not a troll, but I play one on Slashdot.
  51. Re:Backups (with right formatting) by Anonymous Coward · · Score: 0

    I know it's kind of an extreme example, but would you go up in an airliner whose software you know would crash at a certain speed, angle and fuel consumption combination?

    Yup. If that speed, angle, and fuel consumption aren't part of my flight, why should I worry? If there was a chance that they would be, they would have been tested and formally verified multiple times over before the plane was ever certified for flight.

    Aircraft hardware will crash for certain speeds, angles, and fuel consumption combinations, too. If the software works for all the normal cases, I'm happy with it.

    For example, a plane tends to crash when fuel consumption = 100% of fuel available, angle = straight down, and speed = 9.8m/s^2. If the software crashes then, I really don't care.