What is Your Backup Policy?
higuita asks: "A few days ago, I was asked to check our backup policy and how it is being applied, and to try to make it safer and more useful. Being new to the company, I started to check what is being done right now and found several problems. Since I don't have much experience with enterprise backups, what are the most used backup policies, software and general ideas about this issue? We have less than 1000 workstations (Windows and Macs), about 20 Oracle and Exchange servers (split between Windows, Solaris, and Linux), and it all needs to be backed up. Right now, we use HP Data Protector with several tapes; most things get a weekly full backup and daily incremental backups, and most full backups are archived permanently in a safe we have for this purpose. We also have off-site storage for backups. What practices and policies do Slashdot users implement for backups they perform at their office (I am not interested in home backup practices)?"
"I've investigated Veritas NetBackup and other solutions, and I'm also curious if Amanda could be better than, or at least approximate, the features offered by HP Data Protector. What backup software have you used that you found enjoyable with the least bit of hassle?
I've thought about using Dirvish to backup the user's homes to a cheap server with several HDs, and only backup to tapes once every 15 days or even once a month. They will lose their Windows permissions, but I don't think that matters much, since this is just for safekeeping the users' work. I thought about making full backups of the servers every 15 days with daily incremental backups. This way I will free up tape drives' time and gain more flexibility with the backup schedule.
I would love it if users worked off of file servers, but right now this just isn't possible. It's a planned addition that we still don't have the time to make."
For that many systems, use a professional, enterprise grade, commercial solution. The open source stuff doesn't supply the same manageability.
AND FOR GOD'S SAKE, REGULARLY VERIFY THAT YOU CAN READ THE TAPES BACK... More sites have been screwed by backup tapes that weren't readable than by any other failure mode. Verifying every tape is best. Second best is verifying every weekly full. Random samples, as long as they cover every single drive's tape output at least once a month, are a poor third.
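One way to make "can we read it back" checkable is to record checksums at backup time and compare them after a test restore. The following is only a minimal sketch under that assumption: the function names and the manifest format are hypothetical, and it says nothing about how any particular backup product verifies its own tapes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large restores do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Return the relative paths that are missing or differ after a test restore."""
    problems = []
    for relative_path, expected in manifest.items():
        candidate = restored_root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(relative_path)
    return problems
```

Build the manifest from the live data when the backup runs, restore the tape to a scratch directory, and treat any non-empty result from `verify_restore` as a failed backup.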
The two obvious software suggestions are Veritas/Symantec NetBackup and Legato Networker.
Weekly fulls and daily incrementals are good. Your offsite schedule should be checked to ensure that you have a relatively recent restore point both onsite (in case of data loss) and offsite (in case of building loss).
In terms of offsites, having a prepared plan for where and how to restore (Disaster Recovery and Business Continuity) is also important. But those all start with "Go get the tapes...".
don't make the mistake that one guy did
the office was in the North Tower --- The "offsite backup" was in the South Tower
oops
i would suggest, at minimum, different zip codes; different time zones would be best
other than that: Grandfather > Father > Son, and the GF gets sent offsite
Any person using FTFY or editing my postings agrees to a US$50.00 charge
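For anyone unfamiliar with the grandfather-father-son rotation mentioned above, here is a rough sketch of one common way to assign tiers. The calendar convention (first Sunday of the month is the grandfather, other Sundays are fathers, every other day is a son) is an assumption for illustration, not something the poster specified.

```python
from datetime import date, timedelta


def gfs_tier(day: date) -> str:
    """Classify a backup date under one assumed grandfather-father-son scheme.

    Assumption: the first Sunday of each month is the grandfather,
    other Sundays are fathers, and all remaining days are sons.
    """
    if day.weekday() == 6:  # Monday is 0, so Sunday is 6
        return "grandfather" if day.day <= 7 else "father"
    return "son"


def rotation_plan(start: date, days: int) -> list[tuple[date, str]]:
    """List (date, tier) pairs for a run of consecutive days."""
    return [
        (start + timedelta(days=offset), gfs_tier(start + timedelta(days=offset)))
        for offset in range(days)
    ]
```

Under this convention the grandfather tapes are the ones that leave the building, which matches the "GF gets sent offsite" advice.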
This will take a LOT of research on your part.
You'll need to identify each application that is being used, where its data is being stored and what type of "backup" is needed for it.
Don't forget to include "backups" of the system software. There's nothing more annoying than having to rebuild a system, and you have a backup of the data, but you cannot find the install CD.
Older *nix systems were far easier than the "modern" PC-based servers. I could backup my old Sequent box to a bootable tape. If anything went wrong, I could boot the tape and re-write the system. This is somewhat supported now on some of the PC-based servers.
Anyway, back to the "backups". Once you have the systems identified, then you'll need to look at what scenarios you'll need to plan for.
#1. Server crash.
The data on the disk is destroyed. The OS is destroyed. But the hardware is okay.
#2. The building burns down.
All of your servers are now smoking heaps of plastic. So's your desk. And all the CD's you had.
#3. 5 years from now someone wants a critical policy that was deleted 3 years ago.
I spend most of my time kicking co-workers to get them to NOT just dump data anywhere that has free space and to NOT just throw up a new web server without telling me.
I can't think of any good reason to do that. All the important data should be on the server. If the user wants to save a picture on the local disk to use as a background or something that's one thing (although I wouldn't allow that myself) but nothing important should be on those disks.
Past that, I don't have the experience to help you. All I can do is reiterate what another poster has already put up. Check the backups. I can't tell you how many stories I've heard about backups that "went fine" until someone needed data. Stories where the tapes were so old they almost shredded themselves in the drives. Stories of "backing up" for at least 6 months onto a cleaning tape (I bet the drive was in good condition though!). Stories of the backup data being garbage because of a faulty cable or something. The backup is worthless if you can't get the data back off it successfully.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
My backup strategy consists of hoping that my hard drive doesn't fail before I get a new computer/hard drive. It's worked so far, even with a laptop.
Stupidity is like nuclear power, it can be used for good or evil. And you don't want to get any on you.
Fire, Flood, Mud, and Earthquake
In which case, the best case for off site backup is out of state, like Las Vegas or something. This also gives you an excellent excuse for monthly road trips to "check out the quality of the backups"
That said, for simple off site backups, solutions like MOZY.com do just fine for a small business. Otherwise, something like LiveVault.com is recommended. There are plenty of vendors out there.
Another thing is the insurance for replacements for each of your software media. Things like MS can be done in bulk via several MSDN subscriptions, a bargain even if you never develop anything. (300 bucks get you copies of everything MS is currently shipping, along with extra CDKeys for many items). In fact insurance for the media and other details is a very good idea.
It's very nice if the backup facility is also located at the bottom of a retired ICBM missile silo, or something similar.
"It is a greater offense to steal men's labor, than their clothes"
At work we do the same, only to a larger extent. We've got an on-site and off-site storage, and each piece of information is printed in two copies to be stored at each. All that in addition to your usual Veritas tape and CD-RW backups, which we do for convenience of restoring lost data, but which we don't trust enough to eliminate paper copies.
I think you're jumping the gun a little here.
The first question you need to ask is:
What is the time frame in which your servers must be restored should they completely fail?
If you don't know the answer to that question then how does your company know how much money to budget? Are you bound by HIPAA or Sarbanes-Oxley? You should know how much your company's data is worth prior to assigning a budget.
Are some of your database servers supposed to be up 24x7? Maybe you should look at distributed transactions across databases located at different sites so if one server fails you still have everything live? Have you timed how long it takes to rebuild your servers to confirm your allotted time in your disaster recovery plan? Has your company considered imaging servers? Is it possible to?
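Before timing a real rebuild, a back-of-the-envelope estimate can show whether the allotted window is even plausible. This is a rough sketch only: the function names are hypothetical, the throughput is whatever you actually measure on your own drives, and the overhead term (sourcing hardware, reinstalling the OS, fetching tapes) is a guess you have to supply.

```python
def restore_hours(data_gb: float, throughput_mb_per_s: float,
                  overhead_hours: float = 0.0) -> float:
    """Estimate hours to restore data_gb at a sustained throughput.

    overhead_hours covers everything that is not raw data transfer,
    such as fetching tapes or rebuilding the OS (an assumed lump sum).
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    transfer_seconds = (data_gb * 1024) / throughput_mb_per_s
    return transfer_seconds / 3600 + overhead_hours


def fits_window(data_gb: float, throughput_mb_per_s: float,
                window_hours: float, overhead_hours: float = 0.0) -> bool:
    """True if the estimated restore finishes inside the allowed window."""
    return restore_hours(data_gb, throughput_mb_per_s, overhead_hours) <= window_hours
```

An estimate like this is no substitute for a timed test restore, but if the arithmetic alone already blows the window, the plan needs changing before anything breaks.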
Have you consulted your disaster recovery plan? Have you checked with suppliers to see how long replacement parts will take to order? I can't tell you how many administrators get caught out by buying an expensive tape drive only to have it fail along with the server, so nothing can be restored until a new one can be sourced.
Without requirements and a disaster recovery time frame, you will never be in control in the event of a disaster.
Your company's board of directors/owners will need this information. It's called operating under conditions of "due care and diligence".
If something goes wrong and you can't tell your boss exactly what is required and how long it will take to recover then you're working in the wrong job - a big part of being a network administrator is planning for ANY event.
Oh, most of the time my customers are happy with Robocopy. I hate paying for expensive hardware and backup software solutions when I can write something much simpler and document it properly rather than depending on someone else's buggy software. Of course this depends on the industry and their requirements.
Make sure that your boss completely understands these questions and issues. Ask him to see the current Business Continuity plan and Disaster Recovery documentation before you touch anything on those servers - can't stress that enough.
Hope that helps, sorry it's brief but if you're in charge of backups it's your job to be ANAL and PEDANTIC.
Please God... please say someone took the project home on CD, or we're fucked!
I scream. You scream. I assume that means we're both acquainted with the problem. We proceed.
I don't give two hoots for a backup policy. What you need is a data recovery policy. When will I need to recover data, and how will that be attained?
/. message if you want more info.
I've been working with Symantec (formerly Veritas) Netbackup in my workplace for the past 6 years. About 6 months ago I became one of the backup admins, and the biggest barrier I have to break with our clients is the backup mentality - I must backup everything all the time...
Generally your data recovery will happen from two triggers:
1. A user broke his own stuff and needs a file restored.
2. Disaster Recovery.
Each has different requirements: the user wants the backup copy to be onsite, while DR wants it to be offsite.
User PC's typically can be rebuilt/imaged in a disaster, you're not going to have a hot-site contract for PC's. If your DR plan is to install an OS, install a backup/restore client software then recover databases/applications, then why fret about backing up the OS?
Our policy is as follows
NT/Unix OS and flat files
Monthly full backups retained for 13 months
Weekly full backups retained for 6 weeks
Daily cumulative incremental (everything changed since the last full) retained for 15 days
Oracle Datafiles
Weekly cold for 13 months
Daily hot for 6 weeks
1-6 hour archive logs for 15 days
Exchange Datastores
Daily full for 6 weeks
Weekly full for 13 months
Every day any full backups that are more than 10 days old (not copy 0) are sent offsite.
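The retention schedule above can be written down as data so that expiry dates are computed rather than remembered. This is a sketch only: the key names are made up, 13 months is approximated as 13 * 30 days, only backup kinds with "full" in the name are treated as full, and the "not copy 0" detail is ignored. None of this is NetBackup configuration syntax.

```python
from datetime import date, timedelta

# Retention periods transcribed from the policy above.
# Assumption: 13 months is approximated as 13 * 30 days.
THIRTEEN_MONTHS = timedelta(days=13 * 30)
RETENTION = {
    ("os_files", "monthly_full"): THIRTEEN_MONTHS,
    ("os_files", "weekly_full"): timedelta(weeks=6),
    ("os_files", "daily_cumulative"): timedelta(days=15),
    ("oracle", "weekly_cold"): THIRTEEN_MONTHS,
    ("oracle", "daily_hot"): timedelta(weeks=6),
    ("oracle", "archive_logs"): timedelta(days=15),
    ("exchange", "daily_full"): timedelta(weeks=6),
    ("exchange", "weekly_full"): THIRTEEN_MONTHS,
}

# Assumption: only these kinds count as "full" for the offsite rule.
FULL_KINDS = {"monthly_full", "weekly_full", "daily_full"}


def expires_on(system: str, kind: str, taken: date) -> date:
    """Date after which a backup image may be recycled."""
    return taken + RETENTION[(system, kind)]


def is_expired(system: str, kind: str, taken: date, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today > expires_on(system, kind, taken)


def due_offsite(kind: str, taken: date, today: date) -> bool:
    """Full backups more than 10 days old are candidates for offsite shipping."""
    return kind in FULL_KINDS and (today - taken) > timedelta(days=10)
```

Keeping the table in one place also makes it easy to show a customer exactly what "retained for 6 weeks" will mean for a given backup date.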
Any customer that has a DRP contract (banks etc) with a 4 hour recovery policy (we have 3 days to get the system back to how it was 4 hours before the disaster) we either run inline tape copies, one for onsite and one for offsite, or else we backup overnight and duplicate during the day.
Your most important backup (for Netbackup) is your catalog. If you go to DR and all you have is a box of tapes, good luck. You need to know what data is on what tape, and the only thing that knows that is the Netbackup catalog.
I don't know much about other backup products (HP OpenView and BackupExec are the only others I've touched), but I'm sure they'll have something similar.
I've got lots more to say, but I don't have time to put it all down now. Send me a /. message if you want more info.
Think of the children!
http://alternatives.rzero.com/
Servers - how long can they be down? Do you have replacement plans in case your data center gets hit by the next earthquake/hurricane/fill_in_the_disaster? Having tapes off site means nothing if you don't have hardware for the restore. Can you get Hardware X if everyone else is looking for X? Maybe Y is the new standard and your application needs X version 1.2.
Desktops - are files on a server or local? Do you have a standard desktop image that can be rolled out, and copies of the server? Can the desktops go 2 weeks while you need the servers back in 12 hours? You need a plan before things get ugly.
Speaking of tapes, as mentioned you need to periodically check your restore. Backups don't matter, it's whether or not you can restore your data that counts. How often? Incremental or full? Be careful shipping tapes. Since 9/11 I've noticed tapes shipped with certain carriers have read issues at the remote site. Is this X-rays on cargo or just a bad run of tapes?
I use plenty of stuff for which I have the source code. Going back to the 4.2mumble BSDs, through SunOS, Linux, Solaris, the various x86 BSDs, and plenty of applications (this is Mozilla I'm /.ing with, and before that a long line of other open source browsers). I have no problem with installing large Linux farms, using Apache for an enterprise web deployment, using MySQL for moderate sized databases (or PostgreSQL, though I haven't deployed it personally).
Tape backup... NBU wins. Legato's a close second. Sorry, charlie. Open source as a category does not suck. The open source backup stuff doesn't suck, for small to medium sized sites. It's not enterprise class, though, and most of the trick to succeeding in IT is knowing when the tools you use aren't applicable anymore and how to figure out what are.
NBU can't RAIT, but it can stream across multiple tapes, and can write duplicate tapes if you want redundancy. And you can extract the files off tape with tar if you have to.
Amanda certainly doesn't suck, but it's not NBU.
I've used Amanda, Bacula, Netbackup, Networker and by far the best of the bunch for enterprise size networks is TSM. Easily. Netbackup is something I still have cold sweats and nightmares about, ok, not quite nightmares, just the occasional cold sweat. It's really a small network system which has been kludged to "enterprise" class. TSM was designed for managing large network backups from the start.
i would suggest, at minimum, different zip codes; different time zones would be best
Sounds funny but very true. Backups across town aren't terribly useful if across town is flat too. Sound farfetched? Ask a sysadmin in Miami how far off he ships his backups. If he was there when Andrew visited, I'll bet they're in New Mexico.
This may seem a tad offtopic, but it is relevant:
You have to think through both distance from and access to your backups as a part of disaster recovery planning. Backup isn't just recovering the CEO's email, though that is a (hopefully) far more frequent occurrence than recovering from a hurricane/fire/mudslide/blizzard. Easy access to the backup media is important for daily operations. Recovery from disaster is quite a bit more complex. Your backup solution needs to be able to cover the full spectrum - from yesterday's lost spreadsheet to the area flattened by mother nature.
Personally, I keep two backups - one here locally, one 1000 miles away in another state. Backup to CD here, online rsync in NC.
"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." - Variously attributed, frequently to Andrew Tanenbaum
-- "Never underestimate the power of human stupidity." - R.A.H.