Hard Drives Instead of Tapes?
An anonymous reader writes "Tom's Hardware News weekly newsletter has a very interesting article about Dr. Koch of Computertechnik AG, who won the contract to build a RAID backup system for the University of Tübingen. Dr. Koch took several standard entry-level servers, such as the dual-Athlon MP, and added modern components and three large-caliber IDE-RAID controllers per computer, and a total of 576 x 160GB drives."
This is a much better solution than tape, really. It's predictable that the industry will probably move in this direction, now that the hardware is cheap enough and of high enough capacity to serve this function.
Imagine: instant recovery. Your backup could be a usable image of your live server.
So a BIG RAID is somehow safer than many small RAIDs? Backups aren't just for the heck of it... some of them are required for compliance, e.g. in the financial industry.
With the huge size of some databases, it would make more sense to connect to your offsite storage via fiber and store it there. There is no reason the backup disks need to be in the same room or building or state as the primary disks. Then you also solve the problem of reliably getting the data offsite in the first place. This is of course more expensive than renting a storage locker and driving a DAT tape over to it every night, but I don't think Citibank is driving too many tapes around town. (Just a guess.)
Right now, Sony is shipping Super-AIT tapes. The cartridges are about 3/8 of an inch thick, and each holds 500GB, before compression (which is integrated in the drive hardware). The drive can read or write at 30MB/s, before compression. With typical IT compression of 2:1, you get just under 60MB/s. The cartridge goes for about $150. Just try and get a terabyte of disk for that much. No, the drives aren't cheap, but they get paid off quickly.
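For a rough sense of those numbers, here is a minimal Python sketch. The capacity, speed, and price are the figures quoted in the post above; the 2:1 ratio is the poster's "typical" compression, and decimal units (1 GB = 1000 MB) are an assumption.

```python
# Figures quoted in the post; everything else is illustrative.
NATIVE_GB = 500          # cartridge capacity before compression
NATIVE_MB_PER_S = 30     # drive speed before compression
CARTRIDGE_USD = 150      # price per cartridge
COMPRESSION = 2.0        # poster's "typical" 2:1 ratio (assumption for any real data set)


def cost_per_gb(price_usd, capacity_gb):
    """Media cost per gigabyte; ignores the cost of the tape drive itself."""
    return price_usd / capacity_gb


def hours_to_fill(capacity_gb, mb_per_s):
    """Hours of continuous streaming to write capacity_gb (1 GB = 1000 MB)."""
    return capacity_gb * 1000 / mb_per_s / 3600


native_cost = cost_per_gb(CARTRIDGE_USD, NATIVE_GB)                    # 0.30 USD/GB
compressed_cost = cost_per_gb(CARTRIDGE_USD, NATIVE_GB * COMPRESSION)  # 0.15 USD/GB
fill_hours = hours_to_fill(NATIVE_GB, NATIVE_MB_PER_S)                 # about 4.6 hours
```

Compression raises both the effective capacity and the effective speed, so the time to fill a cartridge stays roughly the same either way.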
Yes, disk is good if you need instant access to your backup, and for small installations of under a couple of TB, using disk backups makes sense, but for larger data pools, tape is far more economical.
Also, as mentioned in the article, disk is terrible if you need off-site backups. In addition, a tape library consumes far less power, takes up less space, and produces less heat than a drive array of the same capacity.
Basically, the death of tape has been predicted for years, but it hasn't happened yet.
Haven't you ever put a CDR in a microwave? Pretty lights! (I take no responsibility for any damage to your microwave...)
Dupe posts are
Well, they kind of flitted over it with one sentence:
There's one aspect in which Dr. Koch's backup system can't keep up with tape solutions: storing the backup medium in another location after the backup has been completed.
The article didn't address what to do in this case. Instead, they continued:
As long as this isn't necessary, Dr. Koch's backup system offers some rather unique advantages.
Given that it's hardware-focussed, maybe one can understand this omission, but here in the real world it's still important. So, yes, what does one do if one does need offsite storage? Realistically, I think your suggestion of a big pipe is about the only way. It's hardly feasible to hotswap loads of drives for your offsite storage every morning. (Yes, I know they're using IDE, but think Promise controllers.)
The question then becomes a comparison of the cost of providing for offsite storage in this manner versus the saved cost of replacing your tape library with associated robots, etc.
However, the article also discusses (very briefly) associated costs for specialized backup administrators, delays inherent in recovering from tape backups, etc., so they're not totally unaware of the real-world issue. I suspect they may have chosen to ignore this specific issue because (i) it wasn't an issue in this case study, and (ii) examining it would've been a touch difficult.
High end mag tape cartridges store 50GB. One hard drive can replace three tape cartridges. When sending the drive off site for storage, just use the same box you used for the tapes and fill the extra space with shock absorbent padding.
But wait, there's more. Those mag tape cartridges have a transfer rate of about 10 MB/sec. With hard drives, your backups will take a fraction of the time they took under the old system. That leaves plenty of extra time to pack the drives up extra securely. You may even be tempted to do extra backups to send copies to multiple off-site locations!
Double plus good!
Good point. He does talk about 100 nodes, so why not put them on separate ends of campus, or even across town? Using long-haul fibre adapters they could go up to 16 miles, I believe, without a repeater. So just divide the nodes into two groups and mirror the data to both sites; it would still be cheaper than tape. Sure, it wouldn't work for a multinational corporation (for instance, the telephone and transmitters in NY were often mirrored by being in each of the twin towers, which is now seen as being "not a good idea"), but anything that takes out both ends of campus or both ends of town is probably so big that the university's last concern will be the backup data.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Low end dell server: about 500 bucks
:)
Escalade 4 port IDE RAID card: around 200 bucks
200 Gig Drives X 5: say 249 each, = 1245
Total Cost: 1945
And that's 1000 GB of un-RAIDed space, so it will end up being more than 600 GB in RAID 5.
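The arithmetic in the parts list above can be checked with a short Python sketch. Prices and sizes are the poster's; the RAID 5 rule used here (one drive's worth of space goes to parity) is the standard one, and how five drives attach to a 4-port card is not stated in the post, so the four-drive case is worth checking too.

```python
# Prices and sizes as quoted in the post.
SERVER_USD = 500
RAID_CARD_USD = 200
DRIVE_USD = 249
DRIVE_GB = 200
DRIVE_COUNT = 5


def raid5_usable_gb(drive_count, drive_gb):
    """Usable space of a RAID 5 set: one drive's worth is spent on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_gb


total_usd = SERVER_USD + RAID_CARD_USD + DRIVE_COUNT * DRIVE_USD  # 1945
raw_gb = DRIVE_COUNT * DRIVE_GB                                   # 1000
usable_gb = raid5_usable_gb(DRIVE_COUNT, DRIVE_GB)                # 800
usd_per_usable_gb = total_usd / usable_gb                         # about 2.43
```

With only four drives on the card, `raid5_usable_gb(4, 200)` gives 600 GB, which may be where the "more than 600 GB" figure comes from.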
No I didnt spell check this post...
4.7 GB vs. 300 GB AIT3 (or even 100 GB AIT2). That's a whole hell of a lot of DVDs. I've got clients in the graphic arts industry who archive to DVD-RAM (2.6 GB/side, in this case), and one job spans several of these. Imagine backing up the whole office (and all of the jobs) every night. Their backup (truly rather insufficient at this time) is a 4 tape AIT 25/50 autoloader for each day. And I agree with the above poster about HD vs. tape. Remember, MTBF means Mean Time Between Failures. Not Mean Time Between It Might Fail. It means FAIL. As in, they will always fail. And what happens to the RAID when multiple drives fail?
Just a disclaimer to start things off - I am in the tape library business, so take what I say with a grain of salt. OTOH, I am a technical person, so it isn't going to be a polished marketing twist either.
.53 failure rate is good (I'm not sure what the published rates for new tape drive technology are), but the rate 5 years down the line is going to be much higher, in my opinion.
The article mentions one major drawback, the inability to do offsite storage. You could work something out with offsite mirroring, but bandwidth costs at 70TB would get excessive. Not to mention needing the same hardware setup on the other end.
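To put a number on the mirroring concern, here is a minimal Python sketch of how long a full 70 TB copy would take over a given link. The 70 TB figure is from the post; the link speeds are hypothetical, and the calculation assumes decimal units and a fully saturated link unless `efficiency` says otherwise.

```python
def transfer_days(data_tb, link_mbit_per_s, efficiency=1.0):
    """Days needed to push data_tb terabytes through a network link.

    Decimal units: 1 TB = 8,000,000 megabits. efficiency is the fraction
    of the nominal link speed actually achieved (hypothetical parameter).
    """
    if link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    megabits = data_tb * 8_000_000
    seconds = megabits / (link_mbit_per_s * efficiency)
    return seconds / 86_400


full_copy_gigabit = transfer_days(70, 1000)   # about 6.5 days at 1 Gbit/s
full_copy_100mbit = transfer_days(70, 100)    # about 65 days at 100 Mbit/s
```

An incremental mirror would move far less data per day than a full copy, so these figures are closer to a worst case for the initial sync.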
The other major advantage that tape has over disk is the archive ability. Once you write a tape, that data is static. I can have it sit in a slot in the library for a long time. Since this system is only designed for 5 years, archive is not a big deal, but in a lot of industries it is huge. Altering data seamlessly is a lot easier on a disk drive than on a tape.
The person who mentioned the shock/vibe values for a disk drive VS a tape cartridge: #1 I have dropped PLENTY of cartridges, and have only had one chip a corner. That chip did not affect my ability to use the tape further. Additionally, if the housing is destroyed, the process to spool off the tape and splice it onto a different tape is not that difficult. I would not lose the data permanently. If there is a major mechanical failure inside a disk drive, getting the data off the platters is a lot harder.
I would be interested in seeing numbers for throughput of the system, power consumption, backup window lengths, average restore time. Some of these might stack up favorably to tape, others might not.
The comment on moving to optical as a backup medium - maybe someday, but for now the space needed/time to back up to optical does not compare well with tape. A DVD of 4.5 GB VS a tape of 100GB (currently available; yes, I know blue lasers will improve that).
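The media-count gap is easy to work out. A minimal Python sketch, using the 4.5 GB and 100 GB sizes from the paragraph above and the 70 TB system size mentioned earlier in this post (decimal units assumed):

```python
import math


def media_needed(data_gb, media_gb):
    """Number of discs or cartridges needed to hold data_gb, rounded up."""
    return math.ceil(data_gb / media_gb)


dvds_per_tape = media_needed(100, 4.5)        # 23 DVDs to match one 100 GB tape
dvds_for_70tb = media_needed(70_000, 4.5)     # 15556 DVDs
tapes_for_70tb = media_needed(70_000, 100)    # 700 tapes
```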
As for a robot failure, worst-case scenario, you put the tape in the drive manually. Realistically, at least at our company, we have solved this problem for our customers by providing the ability to easily replace components. This can happen either with a field engineer, or even the customer themselves. Generally all you need is a Phillips screwdriver, 20 minutes max, and the ability to follow instructions.
Again, I'm not in the sales department, so I can't quote costs, but a 435K total cost for 70TB is not that cheap. With tape systems, a lot of the cost depends on how fast the backups need to occur. I could build out a 70 TB system with 1 drive, a SCSI connection and a huge wall of tapes relatively cheaply. As you add more drives, use fibre or gigabit Ethernet interfaces, etc., costs go up, but access times go down. Cost can also be brought down by not going with the 500 lb gorilla of the field - StorageTek.
Yes, disk is growing, but generally it does not replace tape; it only pushes it back a layer. This won't change for a while.
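The trade-off between drive count and backup window can be sketched in a few lines of Python. The 435K and 70 TB figures are from the post; the 30 MB/s per-drive rate is an assumption borrowed from the Super-AIT figure quoted elsewhere in the thread, and the model assumes drives stream in parallel with no time lost to tape changes.

```python
def backup_window_days(data_tb, drive_count, mb_per_s_per_drive):
    """Days to stream data_tb terabytes across drive_count tape drives.

    Decimal units (1 TB = 1,000,000 MB); ignores tape swaps and robot time.
    """
    if drive_count < 1 or mb_per_s_per_drive <= 0:
        raise ValueError("need at least one drive and a positive rate")
    total_mb = data_tb * 1_000_000
    seconds = total_mb / (drive_count * mb_per_s_per_drive)
    return seconds / 86_400


cost_per_gb = 435_000 / 70_000                  # about 6.21 per GB (currency not given in the post)
one_drive = backup_window_days(70, 1, 30)       # about 27 days
ten_drives = backup_window_days(70, 10, 30)     # about 2.7 days
```

A single drive is cheap but takes weeks for a full pass, which is why the drive count and interface choice dominate the price.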
The drives just keep getting more expensive. It's hard to budget for these kinds of things without making people laugh. And my tape drives are always less reliable than the disk drives. Only 1/2 of my DLT7000 drives still work.
Tangent: FreeBSD 5.0 has filesystem snapshots. Anyone interested in a more home-grown setup should take a look at that... Is there anything similar for ext3 or reiserfs?
First, I should note that I do consider sanity checks and cost/benefit analysis when making backup/recovery plans. So I agree with many of these comments. BUT...
(1) Disasters happen more often than people expect. And they can happen to you, not just the other guy. Wildpackets almost went out of business as a result of underestimating that.
(2) Being out of business anyway - well - that's a discussion I had with the owner of one small company. I pointed out to him that one of his core values was loyalty to his employees. In the event of a big disaster, he and his family would collect the insurance check and sell the site, but his (former) employees' mortgage payments would continue. He got the point and agreed to improve disaster recovery plans.
(3) "Both of our sites will never get hit at the same time". I had a friend in charge of DR for a large company who analyzed 10 years of data center disasters and came to the same conclusion. He put the backup right down the street from the primary. Ever hear of the Great Chicago Flood? Luckily for my friend he was working at another company when both his primary and backup were taken out by that event!
That's my 0.02 anyway.
sPh
Devil's advocate for a second: if you're going to do this, you put loads of hot spares into that array. Maybe 20% of your drives are hot spares. Then, of course, your RAID system will automatically fail over to hot spares in the event of a drive's failure. (If your RAID system doesn't do that, there's no sense in worrying about the safety of your data, because you're doomed.) Naturally, you'd also maintain a big stack of, say, 50 replacement drives somewhere.
So what you do is check the RAID array once every day or so. If, say, 3/4 of your hot spares are in use, then only 5% of your online drives are available as hot spares. At that point, which hopefully only happens every few months, you make a day of it and start replacing hard drive after hard drive. You take your big heap of 15 or 20 failed hard drives and mail them back for warranty repair all at once, in one big box. Hopefully the paperwork isn't too hard because probably you're on a first-name basis with the warranty department. :-)
This seems like a good idea until I think about what the RAID system is going to do when you try to replace 15 failed hard drives at once. It will have quite a time migrating all the data over. The power drain and heat generated from this activity may be more than the hardware can bear. It is also a scenario that the manufacturer's testing department is unlikely to have considered!
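The daily hot-spare check proposed above could be sketched roughly as follows in Python. The 20% spare pool and the 3/4 trigger are the poster's numbers; the function name, its parameters, and the reading of "5% of your online drives" as a fraction of all drives in the array are assumptions for illustration.

```python
def spare_status(total_drives, hot_spares, spares_in_use, swap_threshold=0.75):
    """Summarize hot-spare usage and flag when a replacement day is due.

    swap_threshold is the fraction of the spare pool that may be consumed
    before failed drives should be swapped out (3/4 in the post).
    """
    if hot_spares <= 0 or not 0 <= spares_in_use <= hot_spares <= total_drives:
        raise ValueError("inconsistent drive counts")
    used_fraction = spares_in_use / hot_spares
    idle_fraction_of_array = (hot_spares - spares_in_use) / total_drives
    return {
        "used_fraction": used_fraction,
        "idle_fraction_of_array": idle_fraction_of_array,
        "swap_day": used_fraction >= swap_threshold,
    }


# Example: 100 drives, 20 of them hot spares, 15 spares already consumed.
status = spare_status(total_drives=100, hot_spares=20, spares_in_use=15)
```

In this example three quarters of the spares are gone, leaving 5% of the array idle as spares, so `status["swap_day"]` is true. The sketch says nothing about the rebuild load raised in the last paragraph; staggering the replacements would be one way to limit it.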