Slashdot Mirror


Backing Up 100 Gigs in an Hour?

cybrthng asks: "I am faced with finding a backup solution capable of archiving to tape about 200 gigs of financial data in a 2 hour window. I originally looked into DLT8000 jukeboxes with 2-4 drives but have recently discovered the new LTO drives. I am interested in real world experiences with these drives, as there has to be a catch. I mean, there is a threefold performance increase in data transfers, a twofold increase in tape capacity and a minimal price increase overall. With these drastic differences, is there something I'm giving up with LTO over DLT or vice versa? Which backup applications are more geared to handling volume and integrate with Oracle RDBMS? Restore speed is even more critical than backup speed, so I'm curious about how these two drives compare and which applications are best geared for this much data on a nightly basis. Mind you, there will also be about 500 gigs of data in an end-of-week backup as well."

11 of 79 comments

  1. Why... by Stone+Rhino · · Score: 5, Insightful

    Does the backup medium have to be tape? Hard drives are in fact more reliable than some tape, and would have a faster data transfer rate. A pair of hard drives hooked into a RAID array could back up over 200 GB of data and then be taken offsite just as easily as tape. Considering that the drives would likely cost $400, tops, and could be reused many more times than tapes, I don't understand why people bother with tape anymore.

    --
    Remember, there were no nuclear weapons before women were allowed to vote.
    1. Re:Why... by sharkey · · Score: 3, Insightful

      Wouldn't your tape drives be toast, too?

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    2. Re:Why... by Perdo · · Score: 3, Insightful

      You can move a hard drive offsite just as easily as a tape. The difference is that tapes are more expensive. A 200 gig tape starts at $3000. That same $3000 spent on hard drives will get you TWO TERABYTES of more reliable, faster storage.

      --

      If voting were effective, it would be illegal by now.

  2. is removable necessary? by stilwebm · · Score: 4, Insightful

    You never mentioned it, so I thought I'd ask. If you don't need something easily removable, you can still have the data backed up to the other side of the data center, or possibly even the other side of the campus. With a storage array on a fibre loop you can back data up hourly, all 100GB in a full backup. Even a Gbit ethernet link could do this in under an hour, provided the link is not shared too much. Then you could run daily or twice daily tape backups off of that archive to send to your offsite safe archive location.
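As a rough check on the "under an hour" estimate above, here is a minimal sketch of the arithmetic. The function name and the `efficiency` parameter are illustrative assumptions, not anything from the comment; protocol overhead and disk speed are ignored unless folded into `efficiency`.

```python
def transfer_minutes(data_gb: float, link_gbit_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Minutes needed to move data_gb (decimal gigabytes) over a link.

    efficiency is the assumed fraction of the raw line rate that the
    backup traffic actually gets (1.0 = the whole link, unshared).
    """
    bits = data_gb * 1e9 * 8
    seconds = bits / (link_gbit_per_s * 1e9 * efficiency)
    return seconds / 60


# 100 GB over gigabit ethernet at full line rate, then at half the link.
print(round(transfer_minutes(100, 1.0), 1))
print(round(transfer_minutes(100, 1.0, efficiency=0.5), 1))
```

Even with only half the link available, the copy fits in the hour, which is consistent with the "not shared too much" caveat.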

  3. Quite a feat by FreeLinux · · Score: 2, Insightful

    You certainly have quite a challenge here. Some of the newer LTO drives can, theoretically, achieve 55GB per hour transfer rates, but realistically you will get far less. Using an array of drives will allow you to increase this performance, but definitely not as much as one would like to think. With a two drive array, most people expect to double their performance, but a 30% increase is likely as much as you will see. That said, it would require a 4 or 5 drive array in order to achieve the theoretical throughput that you desire. This all assumes that your server hardware, OS and backup software can in fact feed the array fast enough.

    I'm afraid that your only realistic options are to either get a larger window, which is probably unlikely, or perform live backups and bear the performance degradation during that time. The only other alternative that I can think of would be to mirror your data to secondary disk based storage and then backup the secondary storage off line. Any which way, I'd be amazed if you got 100 Gig an hour.

  4. Mirrors and Backups by acaird · · Score: 2, Insightful
    Although this doesn't address your need for fast restores, one method of doing this that you likely already know is to mirror all of the data, break the mirror, back up the static half of the mirror for as long as you like, bring the mirror back together, and let the RAID software worry about the sync'ing of the data. (Veritas can do this; I've done it using Veritas for disk mgmt and Legato for backups.) This means your backup window, from the perspective of the application, is 0, and from the perspective of the backups is nearly as long as you need.

    However, you aren't the first person to have this problem, and I'm sure Oracle has solved this problem. If it's as important as you say, I would pay them for this knowledge.

    --
    Power corrupts. PowerPoint corrupts absolutely. E. Tufte
  5. Re:IDE? You ignorant shit! by foobar104 · · Score: 3, Insightful

    I run a major financial bank institution on a bunch of overclocked Athlon XP's. we use IDE RAID and linux 2.4.10.

    That sound you hear is the rustling of ten million "withdrawal" slips being hurriedly filled out.

  6. Not such a big deal... by j.e.hahn · · Score: 4, Insightful

    It's by no means an easy feat, but the following should probably get you there.

    First, you need to consider how the data is getting to your backup server. This looks like a job for gig-e (since you don't really want to run your DBs on the same machine as your backup server). You should use multiple streams: either break the backup into multiple smaller jobs or enable the multiple streams option in your backup software if it has one. Many do. It's hard to flood even a 10base network with a single TCP/IP connection. (Your achievable throughput per connection is inversely proportional to your latency. I forget the exact formula though.)
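The relationship the comment is probably reaching for is the standard single-connection bound: throughput cannot exceed the TCP window size divided by the round-trip time. A minimal sketch follows; the function names, the 65535-byte window (the maximum without window scaling) and the 5 ms round trip are assumptions chosen for illustration, not figures from the thread.

```python
import math


def max_tcp_throughput_mbit(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP connection's throughput, in Mbit/s.

    At most one window of data can be in flight per round trip, so
    throughput <= window / RTT.
    """
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def streams_needed(target_mbit: float, window_bytes: int, rtt_ms: float) -> int:
    """Parallel connections needed to reach target_mbit under that bound."""
    per_stream = max_tcp_throughput_mbit(window_bytes, rtt_ms)
    return math.ceil(target_mbit / per_stream)


# Assumed values: 65535-byte window, 5 ms round trip, and a target of
# 224 Mbit/s (roughly 28 Mbytes/s).
print(max_tcp_throughput_mbit(65535, 5.0))
print(streams_needed(224, 65535, 5.0))
```

With a fixed window, halving the latency doubles the bound, which is why several parallel streams help when a single connection cannot fill the pipe.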

    Next there's how you're getting it to tape. I recommend running the backups to disk first if you can. This means you won't stall a network connection if you change tapes, or the like. But it does mean you need a lot of storage on the bkup server. Also, if that's a 2h from DB to tape window, this might not be useful. However, barring using a SAN and snapshots (or the like) your only other option is to go straight to tape.

    To go straight to tape you'll need at least 6 DLT drives, assuming you can keep the tape streaming and get 6Mbytes/s per drive, and you balance them across a wide enough SCSI and PCI bus (or whatever system bus you choose). This will give you N+1 redundancy and meet your bandwidth requirement of 28Mbytes/s.
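The 28Mbytes/s and six-drive figures above can be reproduced with a short calculation. This is only a sketch of the arithmetic: the function names are made up for illustration, 1 GB is taken as 1024 MB, and it assumes every drive sustains its streaming rate the whole time.

```python
import math


def required_mb_per_s(data_gb: float, window_hours: float) -> float:
    """Sustained rate needed to move data_gb within the backup window."""
    return data_gb * 1024 / (window_hours * 3600)


def drives_needed(required_rate: float, per_drive_rate: float,
                  spares: int = 1) -> int:
    """Drives needed to sustain required_rate, plus spares (N+1 by default)."""
    return math.ceil(required_rate / per_drive_rate) + spares


rate = required_mb_per_s(200, 2)   # 200 GB in a 2 hour window
print(round(rate, 1))              # Mbytes/s
print(drives_needed(rate, 6))      # drives at 6 Mbytes/s each, N+1
```

Five drives at 6Mbytes/s cover the requirement and the sixth is the spare, so losing one drive does not blow the window.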

    As for the LTO/DLT trade off. We're moving to an LTO solution where I work, and it generally seems to be the way to go. It's worth evaluating, but I don't think your choice of tape either way should be your restricting factor. And there is something to be said for the reliability of DLT.

  7. A different take on the HD idea by sigemund · · Score: 5, Insightful

    By no means do I know a whole lot about backup technologies or any of that, but I do have a suggestion that kind of takes a different angle on the hard drive idea.

    I understand that you would want and need to keep the data off-site on tape (requirement). However, getting that transfer rate is going to be difficult. Perhaps you could do something like this:

    Use the hard drive backup (SCSI RAID perhaps?) idea to back up the data quickly and reliably. THEN, you've got it backed up in your time limit. Now, you can back up that backup with a tape, but you don't have the incredible time requirement. Get it?

    Concept:

    Original Data on Hard Drive
    --> Back it up onto a separate Hard Drive within the time limit
    --> Now, back up that hard drive that has just backed up the original. You have a backup done already, so you've met the time needed. Now, you can back it up with tape or whatever without having to do it within such a short amount of time. You can use the technology you desire to back up the hard drive copy while the original data drive keeps working.

    Then to restore, you can do it from whatever the removable media is.

    Again, I don't know a ton about this, but it's just a thought of another way to accomplish this.
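The two-stage idea in this comment can be sketched in a few lines. This is a toy illustration only: the function names and paths are hypothetical, a plain directory copy stands in for the fast disk-to-disk backup, and a tar file stands in for the tape or other removable media. A real database would also need to be quiesced or snapshotted before stage 1.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def stage_to_disk(source: Path, staging: Path) -> Path:
    """Stage 1: fast disk-to-disk copy, done inside the backup window."""
    target = staging / source.name
    shutil.copytree(source, target)
    return target


def archive_staged_copy(staged: Path, archive: Path) -> Path:
    """Stage 2: slow copy to removable media, done outside the window.

    Reads only the staged copy, so the original data keeps working.
    """
    with tarfile.open(archive, "w") as tar:
        tar.add(staged, arcname=staged.name)
    return archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        live = root / "live_data"
        live.mkdir()
        (live / "ledger.dat").write_text("example records")
        staging = root / "staging"
        staging.mkdir()

        staged = stage_to_disk(live, staging)                     # time-critical
        tape = archive_staged_copy(staged, root / "offsite.tar")  # at leisure
        with tarfile.open(tape) as tar:
            print(tar.getnames())
```

Only stage 1 has to fit in the 2 hour window. Restores can come from the staged disk copy when it is still available, and from the removable media otherwise.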

  8. Off Site Backup by Bryan+Andersen · · Score: 3, Insightful
    Hard drives may be more reliable than tapes, but when the server room has water spewing from the AC and your controllers short out, guess what?

    Your "backups" are toast.

    Floods, tornadoes, fires, etc happen. Sometimes people fly planes into buildings. When that happens, tapes are the only thing that keeps your business in business.

    No, actually it is off site backups that save your ass. All the tapes in the world won't save your ass unless you have carried backup sets off site.

    Off site backups can be done with hard disks too. The main advantage of tapes is that they are usually less fragile than hard disks, but the tapes for some large capacity tape backup systems cost more per MB than multi-GB consumer IDE disks, and they don't provide random access.

    An idea I had for backups was to have a system be a mirror for the main disks. As the day went on it would mirror all the changes to the main file server. At 6PM or so (end of business day plus an hour) the current DB after image file would be copied to the mirror, the mirror would be broken, the disks pulled, and a set from the week before installed. The mirror would then be restarted, bringing the old backups up to date. The removed set would then go home as the current offsite backup. Tape and DB backups would happen as normal. The DB backup would be written to a partition on the disk set. I would think this becomes infeasible if one has to back up more than 4 to 8 disks' worth. At this point that could be more than one TB. There would be a set of disks for each day of the week. Weekly tape backups would be the long term archive, while the disk sets would be the offsite backup.

  9. Re:Easy - History. by silicon_synapse · · Score: 2, Insightful

    Good point. Also remember though that hard drives have many more moving parts than tapes. HDDs may be more reliable than tapes sitting on a rack, but not while being carried/driven off site and back frequently.