Hard Drives as Backup Media?
rootus-rootus asks: "A funny thought struck me as I was going over the life expectancy for tape media for backups... Since the size of 3.5" hard disks is surpassing 100GB in a reasonably inexpensive package, has anyone thought of using them as backup media, as in a jukebox or autoloader? The access times and data transfer rate for data stored on them would make backing up databases, etc. MUCH more palatable (200+GB takes a LONG time to dump to tape for a full backup). Any thoughts on the matter?" Bet you've thought about this question before, haven't you? Has anyone done anything like this? If so, how well did it work?
I've done a fair number of backup-solution deployments, from the simple "tape drive in a server" to multi-element DLT libraries. I've had customers "invent" a version of this idea on many occasions. Typically, the customer's "invention" takes the form of one of several similar ideas.
What it comes down to, though, is that having multiple media, stored _away_ from the production copy of the data, is a good thing. Until recently, this has only been really convenient with tape media. With the advent of very convenient hot-swappable hard drive carriages and support for hot swapping of hard disk media in nearly every commonly used operating system, I don't see why hard drives could not be used, but they would need to be treated with a little more physical care than tapes.
The "problem" seems to come when the (typically small-business) customer "invents" this idea, buys one of those cruddy "centronics connector on the back" sub-consumer-grade plastic "drive bays", slaps a hard drive in it, and starts doing backups to one hard drive from another. The cycle is something like: (1) insert 2nd hard drive, (2) wipe 2nd hard drive, (3) copy contents of production hard drive(s) to 2nd hard drive, (4) remove 2nd hard drive. They don't think about what would happen if, say, between steps 2 and 3 the production hard drive(s) failed: the only backup has just been wiped, so there is nothing left to restore from.
If you're going to use hard disks as "tapes", I don't think there's anything fundamentally wrong with that. Just buy the same number of hard disks as you'd buy tapes, and rotate them in the same manner. Treat them as large, mechanical tapes. Keep them away from the production data except when in use.
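A minimal Python sketch of that kind of rotation, assuming a simple round-robin over a pool of labelled disks. The function name `disk_for_night` and the disk labels are made up for illustration, and real tape schemes (grandfather-father-son and the like) are more elaborate.

```python
from datetime import date


def disk_for_night(day: date, pool: list[str]) -> str:
    """Return the label of the disk to overwrite on `day`.

    Disks are used round-robin, so the disk being wiped tonight is
    always the one holding the oldest backup in the pool.
    """
    if len(pool) < 2:
        # With a single disk, wiping it destroys the only backup copy.
        raise ValueError("need at least two disks in the rotation")
    # toordinal() counts days, so consecutive nights pick consecutive disks.
    return pool[day.toordinal() % len(pool)]
```

With three disks in the pool, each one is overwritten every third night, so two older copies still exist while the third is being wiped and rewritten.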
The Attitude Adjuster, I hate me, you can too.
I'm reading some of the replies and thinking to myself that the /. readers don't understand what a backup system is.
A backup system is not simply redundancy (e.g. RAID). A backup system for files typically can recreate any version of a file requested by the user (as backed up according to the backup regimen). Thus, if you have nightly backups, you might keep every night for the past month, every month end, and every year end for a given document. RAID won't give you this: it mirrors a deleted or corrupted file just as faithfully as a good one.
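The retention regimen described above can be sketched in a few lines of Python. The cutoffs here are assumptions, not anything a particular product does: nightlies are kept for 30 days, month-end backups for 365 days, and year-end backups forever. The name `should_keep` is hypothetical.

```python
import calendar
from datetime import date


def is_month_end(day: date) -> bool:
    """True if `day` is the last calendar day of its month."""
    return day.day == calendar.monthrange(day.year, day.month)[1]


def should_keep(backup_day: date, today: date,
                nightly_days: int = 30, monthly_days: int = 365) -> bool:
    """Decide whether the backup taken on `backup_day` is still retained."""
    age = (today - backup_day).days
    if age < 0:
        raise ValueError("backup_day is in the future")
    if age <= nightly_days:
        return True                      # recent nightly backup
    if backup_day.month == 12 and backup_day.day == 31:
        return True                      # year-end backup, kept indefinitely
    # Older month-end backups survive until they age out.
    return is_month_end(backup_day) and age <= monthly_days
```

A backup tool would run a check like this over its catalogue each night and expire whatever no longer qualifies.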
I'm familiar with some IBM products that do this. However, they're expensive. Basically, ADSM (ADSTAR Distributed Storage Manager) is a product that allows regular backups of documents, and access to every incremental version of those documents. On the backend, it can be hooked up to a huge disk cache and a robotic tape library. The end result is terabytes of near-online access data, with automatic versioning. Pretty nice. And if your disk cache was large enough, it would never hit the tapes. It seems to me that this could be modified to remove the tapes and present what the user requires.
I'm not aware of anything open source or free (as in beer) that does this. It would be really nice, though.
Hell, I've always dreamed about an automatic versioning filesystem. Documents would be automatically versioned. You could use CVS to handle this. Perhaps you could do something as simple as have some code executed upon every file close for files that are opened with write access. When these files are closed, they are added as new versions of the document within CVS.
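As a rough illustration of the "version on close" hook, here is a Python sketch that stands in for the filesystem-level trigger with a context manager. It stores plain numbered copies in a side directory instead of committing to CVS; the names `versioned_open`, `snapshot`, and the `.versions` directory are all hypothetical.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


def snapshot(path: Path, store_dir: Path) -> Path:
    """Copy `path` into `store_dir/<filename>/` under the next version number."""
    target_dir = store_dir / path.name
    target_dir.mkdir(parents=True, exist_ok=True)
    # Use max+1 rather than a count so numbering survives pruned versions.
    latest = max((int(p.name) for p in target_dir.iterdir()), default=0)
    dest = target_dir / f"{latest + 1:06d}"
    shutil.copy2(path, dest)
    return dest


@contextmanager
def versioned_open(path, mode="w", store=".versions"):
    """Open a file; if it was opened for writing, record a version on close."""
    path = Path(path)
    with open(path, mode) as handle:
        yield handle
    # The file is closed at this point. An exception raised in the caller's
    # block propagates before reaching here, so no snapshot is taken.
    if any(flag in mode for flag in "wax+"):
        snapshot(path, path.parent / store)
```

Usage is the same as a normal `with open(...)` block: every `with versioned_open("report.txt") as f:` that completes leaves a new numbered copy under `.versions/report.txt/`.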
When the disk reaches some capacity watermark, a disk cleanup agent would be invoked. Its goal would be to remove redundant versions of old binary files from CVS. Rules could be attached to the agent to perform tasks such as retaining specific versions of binary files (e.g. retaining the first version, the latest version, and all versions from the last named version).
Users could tag specific versions of files. These versions would always be retained.
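The example cleanup rule can be sketched in Python as follows: keep the first version, the latest version, every tagged version, and everything from the most recent tagged version onward. The `Version` record and `versions_to_keep` function are invented for illustration and are not tied to CVS or any other tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    number: int                 # monotonically increasing revision number
    tag: Optional[str] = None   # user-assigned name, if any


def versions_to_keep(history: list[Version]) -> list[Version]:
    """Return the versions the cleanup agent must not delete."""
    if not history:
        return []
    ordered = sorted(history, key=lambda v: v.number)
    first, latest = ordered[0].number, ordered[-1].number
    tagged = [v.number for v in ordered if v.tag is not None]
    last_tagged = tagged[-1] if tagged else None
    keep = []
    for v in ordered:
        since_last_tag = last_tagged is not None and v.number >= last_tagged
        if v.number in (first, latest) or v.tag is not None or since_last_tag:
            keep.append(v)
    return keep
```

Anything not returned is a candidate for removal once the watermark is hit.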
I know this would incur a significant performance hit for disk access. Perhaps I could limit such disk access to specific directories or mount points. In this manner, I could have a mount point for documents, all of which would be automatically versioned.
Plugins for Explorer could be built to allow users to tag versions of documents and retrieve specific old versions of files. I'm thinking something like TortoiseCVS, a beautiful piece of software. In fact, for prototyping, TortoiseCVS would be enough.
Now, is anything like that available? No? Perhaps I should do something about that.
Cheers.
--Be human.