How Often Do You Replace Your Hard Drives?
Telemachas asks: "I recently purchased a Dell P4 2.8 GHz swap meet computer with a 200 gig hard disk for a good price and all is working fine. It does not seem prudent, however, to trust my data on a swap meet item. For another @ $ 75.00 each I can purchase new 200 gig HDDs. I would also like to do my first RAID system. I am now wondering how often, if at all, do Slashdot readers replace their HDDs?"
10 Gig HDD that came with an IBM PII (I think) is still spinning Windows for my Athlon64. Geek cred +1
- Frans.
I replace my hard drive when the S.M.A.R.T. info starts to signify problems, such as too many reallocated sectors.
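For anyone who wants to automate that check, here is a rough sketch. It assumes the attribute table that smartmontools prints for ATA drives with `smartctl -A` (ID in the first column, attribute name in the second, raw value in the tenth). The function names, the sample text, and the limit of 10 sectors are all made up for illustration; pick a limit you are comfortable with.

```python
# Sample text in the layout of `smartctl -A` output (values invented).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   099   099   036    Pre-fail  Always       -       24
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       13210
"""


def reallocated_sectors(smartctl_output):
    """Return the raw Reallocated_Sector_Ct value, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value is the 10th.
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


def should_replace(smartctl_output, limit=10):
    """True when the drive reports more reallocated sectors than `limit`.

    The default limit is an arbitrary assumption, not a vendor figure.
    """
    count = reallocated_sectors(smartctl_output)
    return count is not None and count > limit


print(should_replace(SAMPLE))
```

In real use the text would come from running smartctl against your own drive instead of the hard-coded sample.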
As I've already seen a couple of people say, don't preemptively replace your hard drives.
Allow me to add: Here's why.
Hardware failure rates follow a curve on average. Components fail a lot right after initial purchase, then the failure rate slopes down to its minimum after a couple of [relevant time periods] (probably "weeks" or "months" for hard drives; it varies by what kind of thing it is), then slowly slopes upwards again.
(Please do not miss the phrase "on average". Certain specific flaws can cause a certain product line to have unusual characteristics, like a sudden spike at six months or something. However, unless you somehow figure out a way to guess which hard drives are going to have such failures in six months when it's pretty amazing for the exact same hard drive to even be on the market for six months, the fact that these things can theoretically happen can't have much impact on your decisions. After all, if you knew that was going to happen, you'd just plain not buy the drive, period, regardless of the argument in this post.)
Therefore, if you've got a "burned in" drive, you will be replacing a known-high-reliability component with a component with a lower expected reliability. (I use "expected" in the probability/statistics sense here.) Unless you've discovered that you do have one of those funky products that all die in ten months, this is a bad move on average.
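The curve described above is usually called a bathtub curve. Here is a toy model of it, built as the sum of a falling early-failure term, a flat random-failure term, and a rising wear-out term. Every number in it is invented purely to get the shape; none of it is measured drive data, and the function names are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(months):
    """Toy failure rate (per month) for a drive that is `months` old.

    All parameters below are illustrative assumptions.
    """
    infant = weibull_hazard(months, shape=0.5, scale=100.0)  # falls over time
    random_failures = 0.002                                  # constant
    wear_out = weibull_hazard(months, shape=4.0, scale=60.0) # rises over time
    return infant + random_failures + wear_out


# Compare a brand-new drive with a burned-in one and with an old one.
for age in (1, 12, 24, 72):
    print(age, round(bathtub_hazard(age), 4))
```

With these made-up parameters the one-month-old drive has a higher failure rate than the one- or two-year-old drive, which is the whole argument against swapping out a burned-in drive early.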
I replace hard drives when they fail. I try to act as if they could die at any minute, although I fail.
(But I try to get better. I'm in an all-laptop house, so it's difficult to have the convenience of an integrated backup solution and an automated, unforgettable script. However, with the recent Linux kernels finally supporting my SD card reader, I've gotten a high-capacity, slow, cheap SD card to stick in the previously-useless slot and I now have an rsync running every hour, backing up the files I'd cry if I lost. Sure, 1GB can't back up my entire system but most people's "cry if I lost it" datasets would fit into that. (Yes, there are exceptions... but if you're one of them, you've already got another backup solution in place, right? Right?))
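A minimal sketch of that kind of job, for the curious. It only assumes that rsync is installed and uses its standard `-a` (archive) flag; the paths and function names are hypothetical, and running it every hour is left to cron or whatever scheduler you already use.

```python
import subprocess


def build_rsync_command(sources, destination):
    """Build an rsync command line that copies `sources` into `destination`.

    -a (archive) preserves permissions and timestamps and recurses.
    """
    return ["rsync", "-a", *sources, destination]


def run_backup(sources, destination):
    """Run the backup; raises CalledProcessError if rsync reports failure."""
    subprocess.run(build_rsync_command(sources, destination), check=True)


# Hypothetical usage (paths are placeholders, not from the post):
# run_backup(["/home/me/documents", "/home/me/mail"], "/mnt/sdcard/backup/")
```

Note that this deliberately leaves out `--delete`, so a file you remove by accident stays in the backup copy.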
There's really no replacement for backing up your files.
RAID 5 (or mirrored RAID, if that's your favorite flavor) protects against a single hard drive dying. But if the RAID card dies, you lose everything, especially if it's a proprietary card that's hard to find (more likely on a personal server); I've tried interchanging 3ware controllers and Highpoint controllers, and they couldn't read each other. Additionally, if more than one drive dies, you lose everything. Or, if there's some other problem (you know, the one you didn't think about before you set up the RAID) and the array gets corrupted somehow... well, you lose everything.
RAID can be a good supplement in addition to regular backups, but it's not a complete replacement.
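To see why RAID 5 survives one dead drive but not two, here is a simplified sketch of the XOR parity idea. It assumes three data blocks and one parity block; real RAID 5 rotates parity across the drives and works on much larger stripes, and the names here are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three "drives" worth of data plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Drive 1 dies: XOR of the survivors and the parity gives its contents back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'BBBB'

# Drives 1 and 2 both die: what is left only yields the XOR of the two
# missing blocks, not either block on its own.
leftover = xor_blocks([data[0], parity])
print(leftover == data[1], leftover == data[2])
```

None of this helps with a dead controller or a corrupted array, which is why the backups still matter.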
Just a tip on the power supply situation. Spend a bit of extra cash and get a name brand one. The fans are quieter and the lifetime is a great deal longer, plus they are generally a lot more efficient.
I'd always skimped on the power supply, but ever since I took the plunge and got a good one I'll never go back.
What's more, there are a lot of other data-loss scenarios for which RAID won't help you at all: namely, anything that either destroys the pc as a unit or anything that causes your machine to actively destroy data.
To name a few:
* Disasters, natural or otherwise, that fry, crush or soak the pc as a whole. (Lightning, earthquake, broken water pipe.)
* Theft or confiscation of your computer. (Sure, you can argue with the DEA that your drug dealing roommate never used your computer, and you might win and get your hardware back. On the other hand, if your roommate manages to pawn it first, you're out of luck.)
* Any trojan, virus, hacker, or dumb friend who deletes your files or screws up your file systems or partition tables. Sure, in the case of a dumb friend (or a dumb you), you may be able to recover if you discover it soon enough... but in that case hardware RAID is likely to make it far MORE of a pain in the ass than it otherwise would have been.
Sure, they're probably all less likely to happen to most home pcs than the failure of a single hard drive. But they're not so unlikely as to be worth ignoring, if you care about your data.
In choosing between RAID and buying a couple of spare drives in portable enclosures and keeping a weekly backup in your desk at work, the latter seems quite a lot more attractive to me. Of course both is an even better solution. (Both, with an identical spare RAID card in your desk at work, is best of all...)
Then there's mdadm for Linux. I've found it to work wonderfully. And it's free.
Except that RAID 0+1 can't be implemented with 2 drives; it requires a minimum of 4.
Rule number one: always keep an extra drive around. Drives are cheap, and they die regularly. Also, the cost of buying that _one_ extra drive is constant. You always have an extra drive around. It's not like you have to buy two each time you go to the store. Otherwise, your drive will die at 8pm on a Sunday night, just before you go on that 3-week business trip. I promise.
Rule number two: never spend more than $100. The best $/GB always seems to me to be in the $100 range these days. I usually make sure to pick up drives at Fry's whenever I see something substantially larger than what I have now for less than $100.
Rule number three: Stay ahead of drive failures. If you have important data on those crappy, cheap $100 IDE drives, replace them every two years at least. In those two years, you can double your capacity for less cost. Use the old drives for backups of important stuff, just in case a newer drive bites the dust. Or, leave it as-is, and use it like a snapshot of your working data.
No, that's why you have maintenance packs on your servers.
When your controller fails, it gets replaced on site by a service technician, no matter how old it is. We use IBM xSeries, and still have some older machines operating. We bought Out-Of-Warranty ServicePacks for them; they're now 5 years old.
A controller in one of them failed, and 3 hours later an IBM technician was on site with a new, identical controller, replaced the card, and the machine was up and running again. That was a 5-year-old IBM xSeries with dual PIII at 1.1 GHz, mind you.
Of course, you don't want to buy service packs that cost more than the machine is worth now (but less than the money involved to migrate the existing setup..) in a private environment. That's why you do only RAID1 there. I've been able to recover RAID1s from any sort of RAID controller with a bit of fiddling. Most involve no fiddling at all, because they keep the metadata at the "end" of the drive, so the drive just appears as a plain disk on a normal SCSI controller.
Every drive failure I've experienced has had two things in common.
1. There were obvious warning signs (strange noises, etc.) while the drive was still functioning properly.
2. The failure wasn't a sudden death of the drive; the drive had a fairly long period of almost working right.
Replacing the drive as soon as it starts to exhibit problems is much more important than worrying about the age of the drive.
Everything that's critical (and not so secret) goes as soon as possible on a backup CD/DVD (the more the merrier), on other home/office computers, even on memory sticks or whatever other removable media you might have at hand... and if possible, also some remote (and remotely accessible) location.
Um, if you have a 20-40 GB drive, don't fill it up, and only have a CD burner, that might be a solution. The best affordable solution for most people is to buy an external USB drive enclosure and a couple of HDs. Last Christmas, my mom gave me a 250GB drive and enclosure that was only about $150 from tigerdirect. I used to trust CDs/DVDs for backup purposes, but I've been burned by bad copies of the CD/DVD not working on other machines. The HD solution may be slightly more expensive, but you just don't have to worry about it working unless all your backup drives fail, which is unlikely.
You need to add more to that.
Buy decent hard drives. I used to have to replace IBM drives like mad; they would not last more than 12 months if you were lucky. Switched to Seagate's higher quality class of drives (which means more expensive per gigabyte, but it's well worth it) and the problems stopped.
I find that the server class hard drives, even the IDE ones, are so much better than the consumer class garbage that it's a no brainer. I happily spend $140 for a 200 gig drive, because I know that my 200 gig of illegally ripped DVDs is safer on that drive than on a $68.00 200 gig consumer drive, just from my experience.
Granted, I'd rather have them on a RAID 50 array of 15K SCSI U320 drives... but not everyone can afford $40,000 for movie storage.
>If you replace them on a schedule, you're still not guaranteed 100% reliability because a drive can fail way before MTBF...
It is a common misperception that MTBF ratings mean anything about how long an individual device is supposed to last. It's only a measure across a large number of units in total power-on hours, and only within the expected "useful life."
For example, consider a hard drive that has an MTBF of 100,000 hours (about 11 years), and a 5-year intended useful life. If you have 1,000 of these drives, you can expect, on average, one to fail every 100 hours within the first five years. After that, all bets are off.
So not only does a 100,000 hour MTBF not mean you'll get 11 years, you're lucky (or, more precisely, not unlucky) if you get 5 years.
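The arithmetic behind that example, as a quick sketch. It assumes a constant failure rate during the useful life (the usual exponential model behind MTBF figures) and that failed drives get replaced so the fleet stays at full size. The function names are made up for this post.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760


def hours_between_failures(mtbf_hours, drive_count):
    """Average time between failures somewhere in a fleet of drives."""
    return mtbf_hours / drive_count


def expected_failures(mtbf_hours, drive_count, hours):
    """Expected number of failures in the fleet over `hours` of operation."""
    return drive_count * hours / mtbf_hours


def annual_failure_probability(mtbf_hours):
    """Chance that one drive fails within a year of continuous operation."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


print(hours_between_failures(100_000, 1_000))                 # 100.0
print(expected_failures(100_000, 1_000, 5 * HOURS_PER_YEAR))  # 438.0
print(round(annual_failure_probability(100_000), 3))          # 0.084
```

So with 1,000 of these drives you would expect roughly 438 failures over the five-year useful life, and any single drive has about an 8% chance of dying in a given year even while it is "within spec".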
As many others have said, if you intend to keep it, back it up. Every drive is only guaranteed to work until it fails.
IBM once described it this way:
http://web.archive.org/web/20001202154100/http://
FIXME: Add a sig here