Slashdot Mirror


Advice on Remote Backup Services?

a-freeman asks: "Faced with the prospect of doing automated weekly backups for several servers with some 200 GB of files each, I have been looking for a remote backup solution. A couple of recent articles consider backup to hard drives, although I feel this still fails the 'separate snapshot in time' aspect of good backup policy, since with many of the solutions that I have seen, you will likely lose all your backups if your array gets corrupted. However, CD-Rs and DVDs are just too damn small. Can anyone recommend a remote backup service or interesting combination of hosting service + FTP/RSync/etc., or am I stuck buying a tape drive?"

30 comments

  1. Maybe a combination? by questionlp · · Score: 1

    You can probably keep using tape as an easy way to get data onto a portable medium that can be locked away, and add snapshot backups to a mirrored or RAID-5 array. The array can then send the data over a decent pipe to a remote server via rsync or scp (or even NFS), though probably less frequently because of the time it takes to transfer the snapshot data (which may compress well, but I'm not sure).

    Just a thought.

  2. Remote in what way? by toygeek · · Score: 2, Insightful

    Are you looking for remote as in "in the next rack over" or "somewhere across the internet" or somewhere in between? In short, define "remote."

    Tape is probably the best bet so far. As for getting a good 'image', tar it and stick it on a tape. Since you don't want a hard drive array, and optical is out, tape is going to be the best way, I think, unless another /. reader has a better idea.

  3. Why wouldn't you want a tape drive? by SpaFF · · Score: 4, Informative

    "...or am I stuck buying a tape drive?"

    What's wrong with a tape drive? It is a medium that was designed for backups. If you are going to be backing up large amounts of data, you need a tape library and remote backup software. If you want the convenience of hard drives, attach the tape library to a machine with a whole lot of disk. You can back up to the disks first and then archive what's on the backup server to tape. Most backup software allows you to do this.

    --
    -----BEGIN GEEK CODE BLOCK----- Version: 3.12 GIT d? s: a-- C++++ UL++++ P++ L+++ E- W++ N o-- K- w--- O- M+ V PS+ P
    1. Re:Why wouldn't you want a tape drive? by override11 · · Score: 2, Insightful

      I agree. We have a Maxtor MaxAttach NAS, which has a SCSI port on the back just for a tape drive. I think that the largest Maxtor drive right now is right around 600 gigs (ours is 300 gigs), but it would be simple enough to replace the drives with larger ones. Then just slap an LTO drive on the back, set up your backup to tape to run every night, and overwrite your backup to disk to keep the Maxtor clean, and you will have a fast recent backup and slower snapshots on tapes. :)

      --
      No I didnt spell check this post...
    2. Re:Why wouldn't you want a tape drive? by Anonymous Coward · · Score: 1, Funny

      Never Underestimate the Bandwidth of an Eighteen Wheeler full of 600 gig tapes....But the Latency blows.

    3. Re:Why wouldn't you want a tape drive? by OpenYourEyes · · Score: 1

      While it was designed for backups, it's not exactly designed for restores. I've known a number of people over the years who never realized their backups were failing, and found out the hard way when they needed something in a pinch.

      Other media have the advantage that you can access them directly, validate that they're actually writing the data correctly, and have more random access to them. Yes, you can do these things with tapes, but it's more difficult.

    4. Re:Why wouldn't you want a tape drive? by Anonymous Coward · · Score: 0

      The breakeven point is getting higher all the time--unless you actually need to store a couple dozen tapes' worth of data simultaneously (or absolutely have to cram your backups into a safe deposit box), a RAID of fixed disks has overwhelming advantages in reliability, price, and throughput. As for a tape library, what else can you say about a labor-saving device that actually costs more than any labor you could ever expect it to save you?

  4. OpenSource Backup Solution?? by klupo · · Score: 1

    Does anyone also know if there are any good open source backup solutions for tape libraries? Don't say Amanda, it doesn't span tapes.

    --
    "Talent does what it can; genius does what it must."
    1. Re:OpenSource Backup Solution?? by SpaFF · · Score: 2, Informative

      Arkeia has a nice backup suite that, while not open source, does have a free (as in beer) edition. When I was evaluating it, it seemed to work great with my ADIC library.

      I would be using that now, except that our company already had a license for the Veritas Backup Exec software for Windows, so I was able to just download the Linux client software for free.

      --
      -----BEGIN GEEK CODE BLOCK----- Version: 3.12 GIT d? s: a-- C++++ UL++++ P++ L+++ E- W++ N o-- K- w--- O- M+ V PS+ P
    2. Re:OpenSource Backup Solution?? by Anonymous Coward · · Score: 0

      Don't know what docs you are reading, but Amanda does support spanning tapes, changers, etc.

    3. Re:OpenSource Backup Solution?? by klupo · · Score: 2, Informative

      No it doesn't. It even says it doesn't in the online FAQ.

      --
      "Talent does what it can; genius does what it must."
  5. Storix System Backup Administrator by bigredradio · · Score: 1

    Storix makes a product that allows for backups over IP to either tape or disk. This way you could do disk backups offsite. You can also perform disaster recovery over the network. You can't use bootp, but you can insert the install floppy/CD and install from the remote image.

  6. rsync + remote server by Lerxst · · Score: 2, Informative

    I'm dealing with pretty much the same issue as you right now. I've set aside a dedicated backup server (cheap K6/2 400) with a lot of disk space, which uses rsync to back up the other servers in the office. Then I'm using rsync on an offsite server to keep a backup of the backup. Seems to work well. Having some sort of RAID setup on this box would be even more insurance, I suppose.

    I'm not using tape because the office I'm doing this for doesn't have a dedicated IT staff, and I'm not going there nightly/weekly to rotate tapes.

  7. Rsync to large local disk array, then to tape by buttahead · · Score: 3, Informative

    I have an rsync backup walkthrough. Once everything is on a large array, you can run backups to a locally mounted tape drive.

    Rsyncing keeps the network traffic to a minimum, and the local tape drive speeds up the backup to tape.

  8. rdiff by GigsVT · · Score: 2, Informative

    rdiff-backup does incremental snapshots.

    http://rdiff-backup.stanford.edu/

    --
    I've had enough abrasive sigs. Kittens are cute and fuzzy.
  9. look at rdiff-backup if you use disks by Splork · · Score: 2, Informative

    rdiff-backup is based on rsync but allows you to keep incrementals as well as full backups. great for disk based backups while maintaining lots of history.

    for redundancy and recoverability just run it against multiple backup disks, at whatever level of redundancy you need. each one will have its own full set of incrementals, so if you lose one, no big deal.

  10. Tape drives. by Zapman · · Score: 4, Informative

    "It's hard to beat the bandwidth of a stationwagon filled with DAT tapes"

    If you're talking: 1) several servers and 2) 200ish gigs per, welcome to needing a real backup solution.

    One thing to keep in mind is the three 'kinds' of backups. You will need to cover (or choose not to) all three.

    1) DR. Disaster recovery. A full image of ALL data, usually duplicated so you have an in-house copy and a remote copy. Either full system images plus a software package that can blast a full system image to a box, or full data and config backups that require an OS install before you restore. Usually this is somewhat light on tapes, since you'll only keep 2-3 weeks of them.

    2) File Recovery. Someone deleted something that they shouldn't have and needs it restored. Or the database equivalent: "We dropped this table 5 weeks ago, and discovered just now this random important process that hits it every 2 months. Can we restore the DBF file so we can get that table, data and schema back?" Sometimes DR feeds into File Recovery: you just keep the tapes longer. That is more expensive in tapes, though, and you have data you'll not use (like OS images) wasting tape space. It's easier, though.

    3) Archival. E.g.: the IRS mandates that we keep this data for N years (where N is usually greater than 7). Thankfully, this is a thin amount of data, but it's important nonetheless. CD/DVD rock for this, but tapes are good too (so long as you're under 10 years; media and reader issues will kill you after that).

    Good luck. Backups are a huge pain. Be sure to test the DR portion of it at least once a year. You'll be thankful you do.

    --
    Zapman
    1. Re:Tape drives. by GigsVT · · Score: 2, Interesting

      rdiff-backup and rsync with rotating incrementals are both able to do the first two very well, with the advantage of never needing a "full" backup after you do the first one, something tape will never be able to do. This makes things like offsite backups over slow and cheap links possible that would not be otherwise.

      We back up about 1TB of total data to an offsite backup over a 512kbit fractional T1, with daily rsync incremental snapshots that we keep for 30 days. Our data velocity is about 3-6GB per day of data that changes or is added. The backup easily finishes between 5pm and 8am.
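
      As a rough sanity check on those figures, the raw capacity of the 5pm-to-8am window can be worked out directly. This is a back-of-the-envelope sketch that assumes "512kbit" means 512,000 bits per second, that the whole link is available all night, and that protocol overhead is negligible; the function name is made up for illustration.

```python
# Rough capacity of a nightly backup window over a slow link.
# Assumptions (not from the post): "512kbit" = 512,000 bits/s,
# the link is fully available, protocol overhead is ignored.

def window_capacity_bytes(link_bits_per_sec: float, window_hours: float) -> float:
    """Return how many bytes fit through the link during the window."""
    bytes_per_sec = link_bits_per_sec / 8
    return bytes_per_sec * window_hours * 3600


# 5pm to 8am is a 15-hour window.
capacity = window_capacity_bytes(512_000, 15)
print(f"{capacity / 1e9:.2f} GB per night")  # prints "3.46 GB per night"
```

      That is roughly the low end of the 3-6GB of daily change quoted here. The difference is presumably covered by rsync sending only the changed parts of files, and possibly by compression, though the post doesn't say.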

      For #3, long term archival of small amount of data, hard disks probably aren't a good choice.

      I resent it when people say "the only real backup solution is tape".

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    2. Re:Tape drives. by Stinking+Pig · · Score: 1

      Your solution brings up the other two components to consider when looking at backup though:

      1) Data loss window: How much data can you lose if your system fails right before the backup starts? Sounds like you're initiating nightly, so you can lose twenty-four hours of data. Rsync would support reducing that (in fact, I've seen rsync solutions with three-minute windows; you just have to make sure your script detects the presence of active rsyncs and alerts someone that the window's gotten too narrow). However, reduce it too much and you've created a new dependency on that T1; if it is down when your backup wants to fire, you've just doubled the time that the next rsync will take. I'm sure you can guess where that's going if the network is problematic for some time...

      2) Time to recovery: It's fast to rsync a single file back to the original copy; it's not fast to restore the file if it was lost. It's extremely slow to rebuild an entire server over a 512K line. Mitigation of these factors is going to cost; spool on this side of the circuit? get a bigger circuit?
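
      One way to do the "detect an active rsync" check from point 1 is an exclusive lock file that each run tries to take before starting. The sketch below is only an illustration: it assumes a Unix-like system (it relies on fcntl.flock), and the lock path, the function names, and the placeholder where the rsync call would go are all hypothetical.

```python
import fcntl
import sys

LOCK_PATH = "/tmp/offsite-rsync.lock"  # hypothetical location


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if another run holds it."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_backup_once(path=LOCK_PATH):
    """Run one backup cycle unless a previous cycle is still active."""
    lock = try_lock(path)
    if lock is None:
        # This is where a real script would page or mail someone.
        print("previous backup still running; skipping this cycle", file=sys.stderr)
        return False
    try:
        # Placeholder: the actual rsync invocation would go here.
        return True
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

      Skipping the cycle and alerting keeps a slow transfer from piling up behind itself; the lock is also released automatically if the process dies, so a crash doesn't block later runs.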

      --
      "Nothing was broken, and it's been fixed." -- Jon Carroll
    3. Re:Tape drives. by GigsVT · · Score: 1

      Rsync would support reducing that ...

      Correct, locally we run rsync with 30 minute cycles over 100BASE-T. The script simply does a killall rsync at the beginning in case of a sudden data influx that causes an older script to run too long. Your suggestions are good; the killall rsync was a quick and dirty way to buy a little insurance.

      Sounds like you're initiating nightly, so you can lose twenty-four hours of data.

      Offsites are a third level backup, we have local rsync incrementals, and also mirroring. In total, we always have from 2 to 4 copies of any given file in the archive. The archive is interacted with only through wrapper shell scripts from a user's point of view at least, which allow me to present a level of abstraction.

      This system is pretty tailored to the type of thing we do, sticking large files into an archive. The shell scripts also allow logging and do basic locking when a file is out for changes. SSH plays a big role in all this, rsync is over SSH, the abstraction shell scripts use scp and ssh.

      However, reduce it too much and you've created a new dependency on that T1

      Yep. There is a possibility of a snowball effect, especially with slow lines like the 512K. Since the 512K is also used for many other things during business hours, I had to throttle the rsync down to 30 kilobytes a second. We decided to just go to nightly only and run it unthrottled, that has worked a lot better for us.

      Time to recovery: It's fast to rsync a single file back to the original copy; ... Mitigation of these factors is going to cost; spool on this side of the circuit? get a bigger circuit?

      Well this is an easy one for us. Since we also have redundant local backups, the offsite backup is generally for severe disaster recovery only. In that case, we would do one of two things:

      a) Just use the offsite server as the active archive. It would slow us down some, but it would work. The nature of our archive is that there aren't small incremental updates to files, files get checked in and out for changes, or copied for read-only access. We could operate at reduced capacity from it remotely. We also have a full T1 to the Internet (from a different ISP) in addition to the 512K WAN link, it could be used to transfer files in an emergency.

      b) This is the primary plan in case of severe storage disaster. We just get them to ship us the server or RAIDset. Insert witty comment about 747s full of disks. :)

      Our system isn't perfect, but for the shoestring budget we built it with, I think it's pretty robust.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    4. Re:Tape drives. by Stinking+Pig · · Score: 1

      That's pretty much how I'd do it too :-)

      I've gotten to the point where I absolutely hate talking about DR though; everyone I meet wants synchronous transfers and hot GSLB failover to a 100% capacity site on the other side of the country. It takes weeks to design, the price tag comes in at 150% of what they're paying now, they choke and disappear. It's been like this for years.

      Expensive things are expensive. Don't know why people keep thinking that an MSP/VAR/consultant is able to buy them for less money than an end-user.

      --
      "Nothing was broken, and it's been fixed." -- Jon Carroll
  11. Another pointless "Ask Slashdot" by Anonymous Coward · · Score: 0

    Dear Slashdot,

    Please do my job for me. Plz k thx bye.

    HP 8 * 200Gb Ultrium Autoloader

    Veritas Backup software.

    Just make sure you buy the library options and remote backup agent licences too.

  12. Rsync local and remote by bzant · · Score: 2, Interesting

    I currently have about 10G of "live" data on my office network. Each night I back it up using rsync to a server at my house, over a 768k DSL line. On an average day I am pushing about 120MB over the wire, and it takes 10-20 minutes. Also, in the office I have a 120GB drive holding 10 copies of the data; each hour during the work day, from 10AM to 8PM, it makes an rsync copy, and over a 100 Mbit link it takes about 6 minutes to replicate the changes. I also have a nightly tape backup.

    This system mainly gets used for the "oh shit, I deleted my file" kind of users. I can't say enough about rsync; it works great and has saved my ass a number of times.

  13. Easy Automated Snapshot-Style Backups with Rsync by Anonymous Coward · · Score: 0

    Mike Rubel has an excellent page on snapshot backups using rsync.
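
    The general idea behind this style of backup, as I understand it, is that every snapshot looks like a full copy, but files that haven't changed are hard links to the previous snapshot and so take no extra space. The sketch below is not taken from the linked page and doesn't call rsync at all; it is a pure-Python stand-in for the same hard-link trick, with made-up function names, and it ignores symlinks, ownership, and special files.

```python
import os
import shutil
from typing import Optional


def unchanged(src: str, old: str) -> bool:
    """True if old exists and matches src in size and modification time."""
    if not os.path.isfile(old):
        return False
    a, b = os.stat(src), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def snapshot(source: str, previous: Optional[str], target: str) -> None:
    """Build target as a complete tree, hard-linking unchanged files to previous."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(target, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and unchanged(src, old):
                os.link(old, dst)       # same data on disk, no extra space
            else:
                shutil.copy2(src, dst)  # new or changed: copy data and mtime
```

    Calling snapshot(src, None, "snap.0") and later snapshot(src, "snap.0", "snap.1") would leave two browsable trees. Both directories must live on the same filesystem, since hard links can't cross filesystems.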

  14. Tapes are horrible - but still the best possible by prestwich · · Score: 1

    As far as I can tell tapes are awful, but they are still unfortunately the best choice.
    It looks like the Ultrium 2's might be the current best of the bunch capacity/cost wise, with 200GB (uncompressed) per tape. I wish they were firewire/USB2 rather than having to prat with SCSI cards.

    The Maxtor MaxLine 2 hard drives do look tempting though; it will be interesting to see if the 300GB versions ever become available. (They were originally listed as 320GB!)

  15. Try DLT tapes and Bakbone's NetVault by WinterSilence · · Score: 1

    I recommend BakBone's NetVault software. It's cheap and I've used it a lot. DLT tapes and drives are expensive (nearly 100 USD for a tape and over 1000 USD for a drive) but very fast and stable. You can also back up to IDE disks with NetVault. NetVault is by far the most flexible and least expensive (approx. 1000 USD for a 5-server license) product I have ever seen or worked with. See http://www.bakbone.com for details.

    --
    What kind of dog barks "BOFH! BOFH!"? A rootweiler of course...
  16. Remote Backup Solutions by Anonymous Coward · · Score: 2, Informative

    There is an old joke: "How do I move 3 TB from NYC to LA?" Answer: FedEx or UPS.

    If it's multiple remote sites, as in WAN sites all over the country, and you have approx. 200 GB per site to back up, then you need something like a Compaq DLT tape library on each server and someone to rotate the tapes for offsite storage.

    We have many field locations, so we back up to these DLT tape libraries and have an outsourced company like UNISYS or Siemens go to these offices and rotate the backup tapes. (A DLT library holds many tapes, so it's about once or twice a month.) The tapes are shipped to an off-site archival storage company. Reports are sent to the LAN Center to track the tapes' locations. If we need one and it's off-site, it's couriered to the LAN Center for restoration.

    Yeah, it's expensive; yeah, we are a big company. But backing up the amount of data we have over the WAN would cost more in bandwidth than it does to have someone pick up the tapes and store them for us.

    Alternatively, you could train a primary and a secondary onsite person to rotate the tapes and ship them to you. But you would have to trust them, and that can be tough depending on the data. Also, high staff turnover may mean you need to constantly train new people. Attempt to automate it as much as possible. Also ensure the person is not an idiot who will jam a tape and get it stuck in the DLT auto-loader, etc.

  17. Break 3-way mirrors by patbernier · · Score: 1

    rsync is fine when you have the bandwidth available, and when the time it takes to run the backup remains tolerable. However, unless you can write-lock the filesystem during the process, one interesting problem with a backup that takes a long time to run is that the backup copy represents a state that never really existed: the filesystem might have changed significantly while the backup was running.

    My favorite solution to this is breaking 3-way mirrors; see my earlier comment about this (along with the informative replies). It gives you an instant, "checkpoint" backup of your drive.

    Let's not forget of course that you can use a versioning filesystem (some Veritas products come to mind) to achieve a "checkpoint" state for your backups. These systems also usually include plenty of built-in backup strategies to answer your question... but we're talking about some bigger dollar signs here.
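
    If none of those options are available, a cheap way to at least notice the "state that never really existed" problem is to record when the backup started and afterwards list the files modified since then. This is only a detection sketch with a made-up function name; it does not give you a consistent checkpoint, and it assumes modification times on the source are trustworthy.

```python
import os
from typing import List


def modified_since(root: str, since: float) -> List[str]:
    """Return paths under root (relative to root) whose mtime is at or after since."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime >= since:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)


# Intended use (paths are hypothetical):
#   start = time.time()
#   ... run the long backup of /srv/data ...
#   suspects = modified_since("/srv/data", start)
```

    Anything in the returned list changed while the backup was running, so its copy may not match the rest of the backup and is a candidate for a second pass.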

    --
    "Words have meaning, and names have power." -- Lorien
  18. Re:Easy Automated Snapshot-Style Backups with Rsyn by Anonymous Coward · · Score: 0

    Mod the parent of this up. The Mike Rubel article is excellent.

  19. Remote backup == massive cost? by gilgongo · · Score: 1

    Am I missing something, or does remote (as in off-site) backup for large amounts of data imply vast bandwidth costs?

    Transferring 200 GB a day across a 2 Mb/s leased line (point to point) isn't even possible: at that rate 200 GB takes over nine days. A line fast enough to move it in a few hours would then sit idle most of the time while you're paying a monthly fee.

    The only remote backup solutions I've ever heard of are remote as in fiber to the next room (or building if you're lucky). Then it goes to tape.

    JJ

    --
    "And the meaning of words; when they cease to function; when will it start worrying you?"