How Often Do You Replace Your Hard Drives?
Telemachas asks: "I recently purchased a Dell P4 2.8 GHz swap meet computer with a 200 gig hard disk for a good price and all is working fine. It does not seem prudent, however, to trust my data on a swap meet item. For another $75.00 each, I can purchase new 200 gig HDDs. I would also like to do my first RAID system. I am now wondering how often, if at all, do Slashdot readers replace their HDDs?"
I started building computers twelve years ago.
:)
The only drive I've had die before I retired it myself from sheer obsolescence was an IBM 20GB "DeskStar" model; this happened about five years ago, IIRC. The drive made noise and froze the system when I would read particular files; to my frustration, it occurred when I read some of the files that were important to me (documents, programming projects, one folder of MP3s, etc.)
My solution was to put the drive in the freezer for a few hours; UNBELIEVABLY, it worked - I would have about ten minutes to copy as much as I could off the drive before it would start making noise again. I got most of what I needed off of it.
Incidentally, IBM was very good about the whole thing; they sent me a new drive the day I called them. Too bad they sold their HD division to Hitachi...
Anyway, I've had FAR worse luck with power supplies; I usually go through one of those every other year. Recently, ALL of the drives in my RAID 5 array (4x 120GB Seagate drives) as well as a fifth one (an identical Seagate 120GB that's standalone) started making noise at around the same time; of course I assumed there was some defect with this particular drive model.
But thankfully, it turned out only to be my power supply (the +5V line would deliver +4.4V ~ +4.6V, while the +12V line would fluctuate between +11V and +13V). I can only conclude that Seagate drives are less tolerant than IBM/Hitachi's of power supply fluctuations, since I also have an old 80GB IBM/Hitachi Deskstar and a much newer 250GB SATA IBM/Hitachi drive, and neither batted an eye.
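For a sense of how far off those readings were, here is a minimal sketch that checks them against a +/-5% tolerance on the +5V and +12V rails. The 5% figure is the one commonly cited from the ATX power supply spec and is an assumption here, as is the helper name `in_tolerance`; the voltages are the ones quoted in the post.

```python
# Tolerance assumed from the commonly cited ATX figure of +/-5% on +5V and +12V.
TOLERANCE = 0.05


def in_tolerance(nominal, measured, tolerance=TOLERANCE):
    """Return True if `measured` lies within nominal +/- tolerance (a fraction)."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= measured <= high


# Readings reported in the post: nominal rail voltage -> measured extremes.
readings = {5.0: [4.4, 4.6], 12.0: [11.0, 13.0]}

for nominal, values in readings.items():
    for measured in values:
        status = "ok" if in_tolerance(nominal, measured) else "OUT OF SPEC"
        print(f"+{nominal:g}V rail measured at {measured:g}V: {status}")
```

Under that assumption the +5V rail should stay between 4.75V and 5.25V and the +12V rail between 11.4V and 12.6V, so every reading in the post falls outside the window.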
Likewise, the system showed no other symptoms that pointed at the power supply; so a week or so ago, this post would have looked very different, with a few "F-You Seagate"'s thrown in there.
I don't upgrade one drive at a time. I have dedicated file servers to put the majority of my data on. The first was 8x20GB drives, then 8x120GB drives, and my current one is 8x250GB drives. I rebuild when I run out of space and can afford the upgrade. When I do, I take down the old system and have several drives to throw around in spare systems and friends' computers. This happens every few years, I guess. The file servers are all RAID 5, and I upgraded to a gigabit network with the last one, so it's pretty speedy and redundant. It's also handy when you have data to share between several computers and several users. Though, I believe my next system will simply be a MacPro with 3x750GB drives. I'm getting to the point where I wish the majority of my data was on my computer locally so I don't have to worry about permissions and resource forks. I'm also getting tired of the whole second-computer-for-data thing. I'm ready to consolidate. I guess I'll finally have to do decent backups though, in case a drive goes down.
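As a rough guide to what those arrays actually hold: RAID 5 spends one drive's worth of space on parity, so usable capacity is (number of drives - 1) x drive size. A small sketch of that arithmetic for the three servers mentioned above (the function name is just for illustration, and filesystem overhead is ignored):

```python
def raid5_usable_gb(drive_count, drive_size_gb):
    """Usable capacity of a RAID 5 array: one drive's worth goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_size_gb


# The three 8-drive servers described in the post.
for size_gb in (20, 120, 250):
    usable = raid5_usable_gb(8, size_gb)
    print(f"8 x {size_gb} GB -> {usable} GB usable")

# Prints:
# 8 x 20 GB -> 140 GB usable
# 8 x 120 GB -> 840 GB usable
# 8 x 250 GB -> 1750 GB usable
```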
Seriously. In my experience, the older a drive is, the less likely it is to die. The first six months are the worst.
But then I'm running a pair of drives as RAID 0 for speed, and figure if you lose important files due to a disk crash, you needed to learn your lesson about backups the hard way.
Next time I'll do RAID 1, as I'm told that some controllers manage to combine reads from both drives to get the same read speed as RAID 0. Size is so cheap these days there isn't much point not to do RAID 1. Twice the read speed of a normal drive and a vastly reduced chance of having to reinstall everything.
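To put a rough number on "vastly reduced", here is a back-of-the-envelope sketch comparing two-drive RAID 0 and RAID 1. It assumes drive failures are independent and ignores rebuilds; the 5% per-drive failure probability is a made-up figure for illustration, not a measured rate. A shared cause such as a bad power supply would make failures correlated and the mirror less protective than this suggests.

```python
def raid0_loss_probability(p, drives=2):
    """Striping: the array is lost if ANY drive fails."""
    return 1 - (1 - p) ** drives


def raid1_loss_probability(p, drives=2):
    """Mirroring: the array is lost only if ALL drives fail."""
    return p ** drives


p = 0.05  # hypothetical per-drive failure probability over some period

print(f"single drive: {p:.4f}")
print(f"2-drive RAID 0: {raid0_loss_probability(p):.4f}")
print(f"2-drive RAID 1: {raid1_loss_probability(p):.4f}")
```

With those assumptions, striping roughly doubles the chance of losing everything compared to a single drive, while mirroring cuts it to a small fraction.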
"When they break" is the correct answer.
I replaced a drive because the new drive was getting rave reviews. One year later, the Deathstar died. The drive that had been replaced is still running in a friend's computer.
Remember, RAID with mirroring or parity is just for fault tolerance. RAID is not a backup. In a normal desktop, I would rather buy a faster drive than spend the money on a RAID.
A typical configuration for the smartmontools package for Linux will run a full SMART self-test every day. That test has caught three hard drive failures in the last three years for me (two Maxtors, one Seagate), all of which started screaming before any data was lost. In one of the Maxtor cases, the drive went down in flames so fast after the initial warning that I lost some data; the other two gave me enough time to make (another!) backup before tossing or RMA'ing the drive.
I have considerably less faith in any of the Windows-based SMART monitoring tools, as I haven't found any that seem to run an equally rigorous test on the drive every day. As you suggest, unless you run a good test, the drive is unlikely to generate useful SMART errors until it's too late. You can go crazy staring at the low-level statistics trying to figure out whether changes in the error rates there mean anything, but when the self-test reports an error, that drive is done. For me, that's been early enough to be helpful while not causing me to toss the drive before it's truly worn out.
I'm sold on SMART. It's saved my bacon in a major way at least twice. I use it on my SuSE boxes and my WinXP machines. I have the schedule set up to run self-tests every night and a long test every weekend, which causes almost no impact on the drive while the test is running. The testing algorithm is built into the drive, it runs on the drive, and doesn't consume memory or CPU on the host machine. Watch the logs carefully for reallocated sectors and other tell-tales, like lengthening seek times. http://smartmontools.sourceforge.net/ It works.
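For anyone who wants to automate the "watch for reallocated sectors" part, here is a minimal sketch that parses the attribute table printed by `smartctl -A` and flags non-zero raw values on a few attributes. The choice of watched attributes, the "anything above zero" threshold, and the function names are assumptions for illustration; attribute names also vary between drive vendors, so treat this as a starting point rather than a definitive check.

```python
import subprocess

# Attributes assumed worth watching; names as smartmontools usually prints them.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_attributes(text):
    """Map attribute name -> integer raw value from `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                attrs[parts[1]] = int(parts[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip it
    return attrs


def find_warnings(attrs, watched=WATCHED):
    """Return a message for each watched attribute with a non-zero raw value."""
    return [f"{name} = {attrs[name]}" for name in watched if attrs.get(name, 0) > 0]


def check_device(device):
    """Run smartctl on a device (usually needs root) and return any warnings."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return find_warnings(parse_attributes(result.stdout))
```

Calling `check_device("/dev/sda")` from a nightly cron job and mailing yourself any non-empty result would be one way to use it; the device path is only an example.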