Advice on Remote Backup Services?
a-freeman asks: "Faced with the prospect of doing automated weekly backups for several servers with some 200 GB of files each, I have been looking for a remote backup solution. A couple of recent articles consider backup to hard drives, although I feel this still fails the 'separate snapshot in time' aspect of good backup policy, since with many of the solutions that I have seen, you will likely lose all your backups if your array gets corrupted. However, CD-Rs and DVDs are just too damn small. Can anyone recommend a remote backup service or interesting combination of hosting service + FTP/RSync/etc., or am I stuck buying a tape drive?"
You can probably keep using tape as an easy way to get data onto a portable backup medium that can be locked away. Alongside that, take snapshot backups to a mirrored or RAID-5 array, and from the array send the data over a decent pipe to a remote server via rsync or scp (or even NFS). The remote transfer would probably run less frequently, given the time it would take to move the snapshot data (which may compress well, but I'm not sure).
Just a thought.
Are you looking for remote as in "in the next rack over" or "somewhere across the internet" or somewhere in between? In short, define "remote."
Tape is probably the best bet so far. As far as getting a good 'image' of it, tar it and stick it on a tape. Since you don't want a hard drive array, and optical is out, tape is going to be the best way, I think, unless another /. reader has a better idea.
Nobodies Prefect
Tidbits for Techs Technology Blog
"...or am I stuck buying a tape drive?"
What's wrong with a tape drive? It is a medium that was designed for backups. If you are going to be backing up large amounts of data, you need a tape library and remote backup software. If you want the convenience of hard drives, then attach the tape library to a machine with a whole lot of disk. You can back up to the disks first and then archive what's on the backup server to tape. Most backup software allows you to do this.
Does anyone also know of any good open source backup solutions for tape libraries? Don't say Amanda; it doesn't span tapes.
"Talent does what it can; genius does what it must."
Storix makes a product that allows for backups over IP to either tape or disk. This way you could do disk backups offsite. You can also perform disaster recovery over the network. You can't use bootp, but you can boot from the install floppy/CD and install from the remote image.
Flexible bare-metal recovery for Linux/UNIX
I'm dealing with pretty much the same issue as you right now. I've set aside a dedicated backup server (cheap K6/2 400) with a lot of disk space, which uses rsync to back up the other servers in the office. Then I'm using rsync on an offsite server to keep a backup of the backup. Seems to work well. Having some sort of RAID setup on this box would be even more insurance, I suppose.
I'm not using tape because the office I'm doing this for doesn't have a dedicated IT staff, and I'm not going there nightly/weekly to rotate tapes.
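A minimal sketch of the two-stage setup described above (pull each office server onto the backup box, then copy the backup box offsite), written in Python as a thin wrapper that only builds rsync command lines. The host names, paths, and the choice to push offsite rather than have the offsite server pull are assumptions for illustration; -a, --delete, and -z are standard rsync options.

```python
import subprocess

# Hypothetical hosts and paths, for illustration only.
SERVERS = ["web1", "files1"]
BACKUP_ROOT = "/backup"
OFFSITE = "offsite.example.com:/backup-of-backup/"


def rsync_command(source, dest, delete=True, compress=False):
    """Build an rsync argv list; source/dest may be local paths or host:path."""
    cmd = ["rsync", "-a"]        # -a: archive mode (recursive, keeps perms and times)
    if delete:
        cmd.append("--delete")   # drop files on dest that no longer exist on source
    if compress:
        cmd.append("-z")         # compress in transit, useful on a slow WAN link
    cmd += [source, dest]
    return cmd


def plan():
    """Stage 1: pull each server to the backup box. Stage 2: send it offsite."""
    cmds = [rsync_command(f"{host}:/srv/", f"{BACKUP_ROOT}/{host}/") for host in SERVERS]
    cmds.append(rsync_command(f"{BACKUP_ROOT}/", OFFSITE, compress=True))
    return cmds


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling run(plan()) only prints the commands; run(plan(), dry_run=False) would execute them, typically from cron on the backup box. Note that --delete mirrors deletions too, so this gives a copy, not a history.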
I have an rsync backup walkthrough. Once everything is on a large array, you can run backups to a locally mounted tape drive.
Rsyncing keeps the network traffic to a minimum, and the local tape drive speeds up the backup to tape.
rdiff-backup does incremental snapshots.
http://rdiff-backup.stanford.edu/
I've had enough abrasive sigs. Kittens are cute and fuzzy.
rdiff-backup is based on rsync but allows you to keep incrementals as well as full backups. Great for disk-based backups while maintaining lots of history.
For redundancy and recoverability, just run it against multiple backup disks at whatever level of redundancy you need. Each one will have its own full set of incrementals, so if you lose one, no big deal.
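Roughly what "run it against multiple backup disks" could look like, as a Python sketch that only builds the command lines. The mount points and the four-week retention are made up for the example; the commands use the classic rdiff-backup command-line form (source then destination, and --remove-older-than with a time spec like 4W).

```python
def rdiff_backup_commands(source, dest_disks, keep="4W"):
    """Return one backup command and one pruning command per destination disk.

    Each destination ends up with its own mirror plus its own increments,
    so losing one disk does not affect the history on the others.
    """
    cmds = []
    for disk in dest_disks:
        # Back up: destination holds the current mirror and reverse increments.
        cmds.append(["rdiff-backup", source, disk])
        # Prune increments older than `keep`. rdiff-backup may want --force
        # when more than one increment would be removed in a single run.
        cmds.append(["rdiff-backup", "--remove-older-than", keep, disk])
    return cmds


# Hypothetical mount points for two independent backup disks.
for cmd in rdiff_backup_commands("/srv/data", ["/mnt/backup1/data", "/mnt/backup2/data"]):
    print(" ".join(cmd))
```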
"It's hard to beat the bandwidth of a stationwagon filled with DAT tapes"
If you're talking: 1) several servers and 2) 200ish gigs per, welcome to needing a real backup solution.
One thing to keep in mind is the three 'kinds' of backups. You will need to cover (or choose not to) all three.
1) DR. Disaster recovery. A full image of ALL data, usually duplicated so you have an in-house copy and a remote copy. Either full system images plus a software package that can blast a full system image to a box, or full data and config backups that require an OS install before you restore. Usually this is somewhat light on tapes, since you'll only keep 2-3 weeks of them.
2) File Recovery. Someone deleted something that they shouldn't have and needs it restored. Or the database equivalent: "We dropped this table 5 weeks ago, and discovered just now this random important process that hits it every 2 months. Can we restore the DBF file so we can get that table, data and schema back?" Sometimes DR feeds into File Recovery: you just keep the tapes longer. That is more expensive in tapes, though, and you have data you'll not use (like OS images) wasting tape space. It's easier, though.
3) Archival. E.g.: the IRS mandates that we keep this data for N years (where N is usually greater than 7). Thankfully, this is a thin amount of data, but it's important nonetheless. CD/DVD rock for this, but tapes are good too, so long as you're under 10 years (media and reader issues will kill you after that).
Good luck. Backups are a huge pain. Be sure to test the DR portion of it at least once a year. You'll be thankful you do.
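One way to keep the three kinds straight is to write the retention rule down explicitly. A small Python sketch follows; the 3-week DR figure comes from the list above, while the 12-week file-recovery window and the flat 7-year archival period are placeholder assumptions that would need to match your own requirements.

```python
from datetime import date, timedelta

# Retention per backup kind. Only the DR figure is taken from the post;
# the other two are illustrative assumptions.
RETENTION = {
    "dr": timedelta(weeks=3),
    "file_recovery": timedelta(weeks=12),
    "archival": timedelta(days=365 * 7),
}


def can_recycle(kind, written, today):
    """True if a tape of the given kind, written on `written`, is past retention."""
    return today - written > RETENTION[kind]


# A DR tape written a month ago can go back into rotation;
# a file-recovery tape of the same age cannot.
print(can_recycle("dr", date(2003, 1, 1), date(2003, 2, 1)))
print(can_recycle("file_recovery", date(2003, 1, 1), date(2003, 2, 1)))
```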
Zapman
Dear Slashdot,
Please do my job for me. Plz k thx bye.
HP 8 * 200Gb Ultrium Autoloader
Veritas Backup software.
Just make sure you buy the library options and remote backup agent licences too.
I currently have about 10 GB of "live" data on my office network. Each night I back it up, using rsync, to a server at my house. This happens over a 768k DSL line. On an average day I am pushing about 120 MB over the wire, and it takes 10-20 minutes. Also in the office I have a 120 GB drive where I keep 10 copies of the data; each hour during the work day, from 10AM to 8PM, it makes an rsync copy. Over a 100 Mbit link it takes about 6 minutes to replicate the changes. I also have a nightly tape backup.
This system mainly gets used for the "oh shit, I deleted my file" kind of users. I can't say enough about rsync: it works great and has saved my ass a number of times.
Mike Rubel has an excellent page on snapshot backups using rsync.
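For anyone who hasn't seen the trick: rotating snapshots stay cheap because unchanged files are hard-linked to the previous snapshot instead of being copied again. Below is a simplified pure-Python stand-in for that idea, not the script from the page mentioned above (which, as far as I know, does it with rsync and hard links). The snap.N naming and the size-plus-mtime "unchanged" test are assumptions of this sketch.

```python
import os
import shutil


def _unchanged(src, old):
    """Treat a file as unchanged if size and whole-second mtime both match."""
    if not os.path.isfile(old):
        return False
    a, b = os.stat(src), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def snapshot(source, snap_root, keep=10):
    """Rotate snap.0 .. snap.(keep-1) and build a fresh snap.0 from source."""
    os.makedirs(snap_root, exist_ok=True)

    def name(i):
        return os.path.join(snap_root, f"snap.{i}")

    # Drop the oldest snapshot, then shift every remaining one up by one.
    if os.path.isdir(name(keep - 1)):
        shutil.rmtree(name(keep - 1))
    for i in range(keep - 2, -1, -1):
        if os.path.isdir(name(i)):
            os.rename(name(i), name(i + 1))

    prev, new = name(1), name(0)
    for dirpath, _dirs, files in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(new, rel), exist_ok=True)
        for fname in files:
            src = os.path.join(dirpath, fname)
            dst = os.path.join(new, rel, fname)
            old = os.path.join(prev, rel, fname)
            if _unchanged(src, old):
                os.link(old, dst)        # unchanged: share the same data on disk
            else:
                shutil.copy2(src, dst)   # new or changed: store a fresh copy
    return new
```

Each snap.N directory looks like a full copy, but only changed files take extra space. This needs a filesystem that supports hard links, and it ignores symlinks, ownership, and special files, which a real tool would handle.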
As far as I can tell tapes are awful, but they are still unfortunately the best choice.
It looks like the Ultrium 2s might be the current best of the bunch capacity/cost-wise, with 200 GB (uncompressed) per tape. I wish they were FireWire/USB2 rather than having to prat about with SCSI cards.
The Maxtor MaxLine 2 hard drives do look tempting though; it will be interesting to see if the 300GB versions ever become available. (They were originally listed as 320GB!)
I recommend BakBone's NetVault software. It's cheap and I've used it a lot. DLT tapes and drives are expensive - nearly 100 USD for a tape and over 1000 USD for a station - but very fast and stable. You could also do backup to IDE disks with NetVault. NetVault is by far the most flexible and least expensive (approx. 1000 USD for a 5-server license) product I have ever seen or worked with. See http://www.bakbone.com for details.
What kind of dog barks "BOFH! BOFH!"? A rootweiler of course...
There is an old joke. "How do I move 3 TB from NYC to LA?" Answer: FedEx or UPS.
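The joke holds up when you run the numbers. A quick Python sketch of the effective bandwidth of shipping media, assuming decimal units (1 TB = 10^12 bytes) and a 24-hour door-to-door transit time, which is my assumption and not part of the joke:

```python
def effective_mbps(terabytes, transit_hours):
    """Effective throughput, in megabits per second, of shipping data physically."""
    bits = terabytes * 1e12 * 8            # decimal terabytes to bits
    seconds = transit_hours * 3600
    return bits / seconds / 1e6


# 3 TB delivered overnight works out to a bit under 280 Mb/s sustained.
print(round(effective_mbps(3, 24), 2))
```

That ignores the time to write and read the tapes at each end, so the real figure is lower, but it is still far more than most WAN links of the day.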
If it's multiple remote sites, as in WAN sites all over the country, and you have approx. 200 GB per site to back up, then you need something like a Compaq DLT tape library on each server and someone to rotate the tapes for offsite storage.
We have many field locations, so we back up to these DLT tape libraries and have an outsourced company like Unisys or Siemens go to these offices and rotate the backup tapes. (A DLT library holds many tapes, so it's about once or twice a month.) The tapes are shipped to an off-site archival storage company. Reports are sent to the LAN Center to track the tapes' locations. If we need one and it's off-site, then it's couriered to the LAN Center for restoration.
Yeah, it's expensive; yeah, we are a big company. But moving the amount of data we need backed up over the WAN would cost more in bandwidth than it does to have someone pick up the tapes and store them for us.
Alternatively, you could train a primary and a secondary onsite person to rotate the tapes and ship them to you. But you would have to trust them, and that can be tough depending on the data. Also, high staff turnover may mean you need to constantly train new people. Attempt to automate it as much as possible. Also ensure the person is not an idiot who will jam a tape and get it stuck in the DLT autoloader, etc.
rsync is fine when you have the bandwidth available, and when the time it takes to run the backup remains tolerable. However, unless you can write-lock the filesystem during the process, one interesting problem with a backup that takes a long time to run is that the backup copy represents a state that never really existed: the filesystem might have changed significantly while the backup was running.
My favorite solution to this is breaking 3-way mirrors; see my earlier comment about this (along with the informative replies). It gives you an instant, "checkpoint" backup of your drive.
Let's not forget, of course, that you can use a versioning filesystem (some Veritas products come to mind) to achieve a "checkpoint" state for your backups. These systems also usually include plenty of built-in backup strategies to answer your question... but we're talking about some bigger dollar signs here.
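Without mirror splitting or a checkpoint-capable filesystem, you can at least detect that the filesystem moved underneath a long-running backup. A rough Python sketch: fingerprint the tree before and after the backup and compare. This only detects the inconsistency described above, it does not prevent it, and the size-plus-mtime fingerprint is a simplifying assumption (it misses changes that preserve both).

```python
import os


def fingerprint(root):
    """Map each file's path (relative to root) to its (size, mtime_ns)."""
    out = {}
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            st = os.stat(path)
            out[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return out


def changed_during(before, after):
    """Paths that were added, removed, or modified between two fingerprints."""
    return sorted(
        path for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Typical use would be: take before = fingerprint(src), run the backup, take after = fingerprint(src), and treat a non-empty changed_during(before, after) as a sign that the copy is not a true point-in-time image (re-run it, or at least log the affected paths).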
"Words have meaning, and names have power." -- Lorien
Mod the parent of this up. The Mike Rubel article is excellent.
Am I missing something, or does remote (as in off-site) backup for large amounts of data imply vast bandwidth costs?
Transferring 200 GB across a 2 Mb/s leased line (point to point) would take roughly nine days of continuous transfer, and a line fast enough to finish the job overnight would sit idle most of the day while you're paying a monthly fee.
The only remote backup solutions I've ever heard of are remote as in fiber to the next room (or building if you're lucky). Then it goes to tape.
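For a rough sense of the bandwidth involved, transfer time is easy to estimate. A Python sketch, assuming decimal units (1 GB = 10^9 bytes), a full (non-incremental) transfer with no compression, and an optional efficiency factor for protocol overhead that I've left at 1.0:

```python
def transfer_hours(gigabytes, link_mbps, efficiency=1.0):
    """Hours needed to move `gigabytes` over a link of `link_mbps` megabits/s."""
    bits = gigabytes * 1e9 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


# Full 200 GB copy over a few link speeds (in Mb/s).
for mbps in (2, 10, 100):
    print(mbps, round(transfer_hours(200, mbps), 1))
```

Incremental tools like rsync only send what changed, so day-to-day traffic can be a small fraction of this; the full-copy figure mostly matters for the first sync and for a complete restore.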
JJ
"And the meaning of words; when they cease to function; when will it start worrying you?"