Disk Failure Rates More Myth Than Metric
Lucas123 writes "Mean time between failure (MTBF) ratings suggest that disks can last from 1 million to 1.5 million hours, or 114 to 170 years, but study after study shows that those metrics are inaccurate for determining hard drive life. One study found that some disk drive replacement rates were greater than one in 10. This is nearly 15 times what vendors claim, and all of these studies show failure rates grow steadily with the age of the hardware. One former EMC employee turned consultant said, 'I don't think [disk array manufacturers are] going to be forthright with giving people that data because it would reduce the opportunity for them to add value by 'interpreting' the numbers.'"
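For anyone who wants to check the summary's arithmetic, here is a minimal sketch. It assumes a drive running 24/7 and a constant failure rate (the exponential model), which is a simplification and not something the vendors' spec sheets necessarily state. The function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days ignored


def mtbf_hours_to_years(mtbf_hours: float) -> float:
    """Convert an MTBF rating in hours to years of continuous running."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that one drive fails within a year of 24/7 use,
    ASSUMING a constant failure rate (exponential lifetimes)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


for mtbf in (1_000_000, 1_500_000):
    print(f"{mtbf:>9,} h = {mtbf_hours_to_years(mtbf):5.1f} years, "
          f"nominal AFR = {annualized_failure_rate(mtbf):.2%}")
```

Under those assumptions both ratings come out to a nominal annual failure rate below 1%, which is the number to hold up against replacement rates of "greater than one in 10".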
I don't understand how people are always complaining about their hard drives failing. In 30 years it hasn't happened to me yet.
I'm about to lug a huge Wang hard drive out to the trash pickup on Monday - weighs over 100 pounds... still runs. Actually it uses removable platters but still...
This space available.
...those that make backups and those that never had a hard drive fail.
If everyone knows how much a disk drive costs, and nobody can find out how long a disk drive really will last, there is no way the marketplace can reward the vendors of durable and reliable products.
The inevitable result is a race to the bottom. Buyers will reason they might as well buy cheap, because they at least know they're saving money, rather than paying for quality and likely not getting it.
"How to Do Nothing," kids activities, back in print!
Maybe they mean the MTBF for drives that are just on, but not being used. I've never put any stock into those numbers, because I've had too many drives fail to believe that they're supposed to be lasting 100 years. I've had 3 die in the last 3 years alone (all in my server, so probably getting more than average use, but still...)
My anecdotal converse is I have never had a hard drive not fail. I am a bit on the cheap side of the spectrum, I'll admit, but having lost my last 40GB drives this winter I now claim a pair of 120s as my smallest.
I always seem to have a use for a drive, so I run them until failure.
I dub thee... Sir Phobos, Knight of Mars, Beater of Ass.
As drive sizes have been going up, overall the warranty periods have been going down. With few exceptions (Seagate does three years) drives have a one year expected life time.
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
The best metric is probably going to be the length of warranty the manufacturer offers. They have financial incentive to find out the REAL mean time until failure in calculating the warranty.
'Every story, if continued long enough, ends in death.' --Ernest Hemingway
put the 500GB drive into your bottom drawer ... the unused disk will break when thrown out by your great great grand kids - who will simultaneously wonder if you really did use storage of such tiny capacity.
I remember back in the mid 1980s when I received a service management manual from DEC, it had some information that really opened my eyes about what MTBF was really intended for. It had a calculation (I have long since forgotten the details) that allowed you to estimate how many service spares you would need to keep in stock to service any installed base of hardware, based on MTBF. This was intended for internal use in calculating spares inventory level for DEC service agents. High MTBF products needed fewer replacement parts in inventory, low MTBF parts needed lots of parts in stock. Presumably internal MTBF ratings were more accurate than those released to end users.
So anyway.. MTBF is not intended as an indicator of a specific unit's reliability. It is a statistical measurement to calculate how many spares are needed to keep a large population of machines working. It cannot be applied to a single unit in the way it can be applied to a large population of units.
Perhaps the classical example is the old tube-based computers like ENIAC: if a single tube has an MTBF of 1 year, but the computer has 10,000 tubes, you'd be changing tubes (on average) more than once an hour, and you'd rarely even get an hour of uptime. (I hope I got that calculation vaguely correct)
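The tube calculation does hold up. A quick sketch, assuming the tubes fail independently at a constant rate and that any single tube failure stops the machine (the function name is hypothetical):

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def system_mtbf_hours(component_mtbf_hours: float, n_components: int) -> float:
    """MTBF of a machine that stops when ANY one of n identical parts fails.
    Assumes independent parts with constant failure rates, so the rates add."""
    return component_mtbf_hours / n_components


tube_mtbf = 1 * HOURS_PER_YEAR          # one tube: MTBF of 1 year
machine = system_mtbf_hours(tube_mtbf, 10_000)
print(f"{machine:.3f} hours between tube failures")
```

That is a little under an hour between failures on average, so "more than once an hour" is right.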
I think that a lot of people are mis-understanding MTBF. A HD might have a MTBF of 100 years. This doesn't mean that the company expects the vast majority of consumers to have that HD running for 100 years without problems.
MTBF numbers are generated by running say thousands of hard-drives of the same model and batch/lot, and seeing how long it takes before 1 fails. This may be a day or so. You then figure out how many total HD running hours it took before failure. If you have 1,000 HD's running, and it takes 40 hours before one fails, that's a 40,000 hr MTBF. But this number isn't generated by running say 10 hard-drives, waiting for all of them to fail, and averaging that number.
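As a sketch of the pooled calculation described above: total drive-hours divided by the number of failures. It assumes every unit is counted for the full test window, which is a simplification; the function name is made up for illustration.

```python
def mtbf_estimate(n_units: int, hours_each: float, failures: int) -> float:
    """Pooled MTBF estimate: total unit-hours on test / failures observed."""
    if failures == 0:
        raise ValueError("no failures observed; the test only gives a lower bound")
    return n_units * hours_each / failures


print(mtbf_estimate(1_000, 40, 1))  # 40000.0
```

Note that the same 40,000 hours would come out of 100 drives running 400 hours with one failure. The figure says nothing about how long the test actually ran.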
Thus, because of the way MTBF numbers are generated, they may or may not reflect hard-drive reliability beyond a few weeks. It depends on our assumptions about hard-drive stress and usage beyond the length of time before the 1st HD of the 1,000 or so they were testing failed. Most likely, it says less and less about hard-drive reliability beyond that initial point of failure (which is on the order of tens or hundreds of hours, not hundreds of thousands of hours or millions of hours!).
To be sure, all else equal, a higher MTBF is better than a lower one. But as far as I'm concerned, those numbers are more useful for predicting DOA, duds, or quick failure, and are more useful to professionals who might be employing large arrays of HD's. They are not particularly useful for getting a good idea of how long your HD will actually last.
HD manufacturers also publish an expected life-cycle of their HD. But I usually put the most stock in the length of the warranty. That's what they're willing to put their money behind. Albeit, it's possible their strategy is just to warranty less than how long they expect 90% of HD's to last, so they can then sell them cheaper. But if you've had a HD and you've had it for longer than what the manufacturer publishes as the expected-life, what they're saying by that is you've basically got a good value, and will probably want to have something else on hand, and be backed up.
social sciences can never use experience to verify their statemen
Disk MTBF is quoted for 20C.
Here is an example from my server. At 18C ambient in a well cooled and well designed case with dedicated hard drive fans, the Maxtors I use for RAID1 run at 29°C. My media server, which is in the loft with sub-16C ambient, runs them at 24-34C depending on the position in the case (once again, a proper high-end case with dedicated hard drive fans).
Very few hard disk enclosures can bring the temperature down to 24-25C.
SANs or high density servers usually end up running disks at 30C+ while at 18C ambient. In fact I have seen disks run at 40C or more in "enterprise hardware".
From there on it is not surprising that they fail at a rate different from the quoted one. In fact I would have been very surprised if they didn't.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Everyone who's ever had a hard drive already knows that.
From a wonderful satire site Married to the Sea, comes this little gem.
Drive makers have always relied on questionable statistics and outright misrepresentation to make sales, and as we all know, statistics are worse than even damned lies.
I am not a supporter of industry regulation or class action lawsuits; I think that both are used far too much these days, but it would be nice if these companies were given a hard kick in the pants. They've gotten away with this for far too long.
Love sees no species.
He should look at the escalating price of gold too. The older the computer component, the more gold in the connectors and the thicker the gold on the traces, etc. Not to mention other precious metals involved in some of the components, such as platinum, palladium, etc. Perhaps the greatest consideration should be given to the fact that it would increase the heavy metal pollution at the dump it goes to.
:P
Probably some nice magnets inside to play with too.
My drives work great ... until someone comes along and puts stickers on other drives that say they are more "ready" than my drives.
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
However, I strongly suspect that the problem lies in the fact that one manufacturer would have to be the first to change the documented lifespan of their products, and the danger is that unless their competitors follow, their products could be interpreted as inferior and they could lose a lot of business.
Amnesty International
Didn't Google present data on their disk failure rates? How about other large purchasers? Who cares if the manufacturers don't report them. If you have some very large purchasers report them, it may be more useful information, anyway.
Help! I'm a slashdot refugee.
I would put the quotation marks around "add value" instead of adding them around "interpreting".
They are obviously interpreting the numbers.
How the hell they can be adding value is way beyond me.
Adding price, maybe, but VALUE ????
We are Turing O-Machines. The Oracle is out there.
While we are on the topic of failing drives, I think it would be appropriate to include a warning about USB drives and warranties.
I purchased a 500GB Western Digital My Book about a year and a half ago. I figured that a pre-fab USB enclosed drive would somehow be more reliable than building one myself with a regular 3.5" internal drive and my own separately purchased USB enclosure (you may dock me points for irrational thinking there). Of course, I started getting the click-of-death about a month ago, and I was unpleasantly surprised to discover that the warranty on the drive was only for 1 year, rather than the 3 year warranty that I would have gotten for a regular 3.5" 500GB Western Digital drive at the time. Meanwhile, my 750GB Seagate drive in an AMS VENUS enclosure has been chugging along just fine, and if it fails sometime in the next four years, I will still be able to exchange it under warranty.
The moral of the story is that, when there is a difference in the warranty periods (e.g., 1 year vs. 5 years), it makes a lot more sense to build your own USB enclosed drive rather than order a pre-fab USB enclosed drive.
An unjust law is no law at all. - St. Augustine
To make this sort of test work, it must be run over a much longer period of time. But in the process of designing, building, testing and refining disk drive hardware and firmware (software), there isn't that much extra time to test drive failure rates. Want to wait an extra 9 months before releasing that new drive, to get accurate MTBF numbers? Didn't think so. How many different disk controllers do they use in the MTBF tests, to approximate different real-world behaviors? Probably not that many.
Could they run longer tests, and revise MTBF numbers after the initial release of a drive? Sure, and many of them do, but that revised MTBF would almost always be lower, making it harder to sell the drives. On the other hand, newer drives are certainly available every quarter, so it may not be a bad idea to lower the apparent value of older drive models.
So, it's better to assume a drive will fail before you're done using it. They're mechanical devices with high-speed moving parts and very narrow tolerable ranges of operation (that drive head has to be far enough away from the platters not to hit them, but close enough to read smaller and smaller areas of data). Anyone who's worked in a data center, or even a small server room, knows that drives fail. When I've had around two hundred drives, of varying ages, sizes and manufacturers, in a data center, I observed a failure rate of five to ten drives per year. That works out to a time between failures well below the rated MTBF for enterprise disk array drives (SCSI, FC, SAS, whatever); drives fail. That's why we have RAID. Storage Review has a good overview of how to interpret MTBF values from drive manufacturers.
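To put numbers on that observation, here is a rough sketch of what five to ten failures a year in a 200-drive fleet implies. It assumes the drives run around the clock and that failed drives are swapped out so the fleet size stays constant; the helper names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def observed_afr(failures_per_year: float, fleet_size: int) -> float:
    """Fraction of the fleet that fails in a year."""
    return failures_per_year / fleet_size


def implied_mtbf_hours(failures_per_year: float, fleet_size: int) -> float:
    """Fleet drive-hours per year divided by failures per year."""
    return fleet_size * HOURS_PER_YEAR / failures_per_year


for failures in (5, 10):
    print(f"{failures:>2} failures/year: AFR = {observed_afr(failures, 200):.1%}, "
          f"implied MTBF = {implied_mtbf_hours(failures, 200):,.0f} hours")
```

That is an implied MTBF of roughly 175,000 to 350,000 hours, against the million-plus hours on the spec sheets.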
But since 1981 I have had exactly zero catastrophic PC drive crashes. That's not to say I haven't seen some bad/relocated sectors, but hard failures? None. Granted that's only 20 drives. But in fact in my experience in PC's, midranges and mainframes in almost 30 years I have seen zero hard drive crashes.
I'm at work right now...copying terabytes of data from an array that has failing drives and cannot rebuild itself due to the number of simultaneous drive failures. I have been here for 32 hours. So, please don't give me this "hard drives never fail" crap!
From my first 200MB Seagate (bought in 1993) to a 20GB Maxtor that failed last year. Fortunately they fail when they're no longer my primary drive. I would say they last something about 5-6 years...
Anecdotal reports of failures also need to consider the operating environment. If I have a server rack, and most servers in the rack have a drive failure in the first year, is it the drive design or the server design? Given the relative effort that usually goes into HDD design and box design, it's more likely to be due to poor thermal management in the drive enclosure. Back in the day when Apple made computers (yes, they did once, before they outsourced it) their thermal management was notoriously better than that of many of the vanilla PC boxes, and properly designed PC-format servers like the HP Kayaks were just as expensive as Macs. The same, of course, went for Sun, and that was one reason why elderly Mac and Sparc boxes would often keep chugging along as mail servers until there were just too many people sending big attachments.
One possibly related oddity that does interest me is laptop prices. The very cheap laptops are often advertised with optional 3 year warranties that cost as much as the laptop. Upmarket ones may have three year warranties for very little. I find myself wondering if the difference in price really does reflect better standards of manufacture so that the chance of a claim is much less, whether cheap laptops get abused and are so much more likely to fail, or whether the warranty cost is just built into the price of the more expensive models because most failures in fact occur in the first year.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
Hard drives have been becoming less and less reliable as densities increase. Seagate, WD, Hitachi, Maxtor, Toshiba, heck, they all die, often sooner than their warranties are up. They're mechanical devices, for crying out loud. So here's a bit of good advice: If you really care about your data, use a RAID array with redundancy (RAID 1 or 5). It will cost a bit more, but you'll sleep better at night. Thank you all for your kind attention. That is all.
Panama Hat: SO DO YOU!
Disk reliability metrics are much more science than myth. Like all science, this means you actually need to put some minimal effort into understanding them. Unlike myths :-)
Disks have two separate reliability metrics. The first is their expected lifetime. In general, disk failure follows a "bathtub" distribution. Disks are much more likely to fail in the first few weeks of operation. If they make it past this phase, they become very reliable - for a while anyway. Once their expected lifetime is reached, their failure rate starts climbing steeply.
The often quoted MTBF numbers express the disk reliability during the "safe" part of this probability distribution. Therefore, a disk with an expected lifetime of, say, 4 years, can have an MTBF of 100 years. This sounds theoretical until you consider that if you have 200 such disks, you can expect that on average two of them will fail each year.
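The fleet arithmetic is easy to check. A minimal sketch, assuming every disk is in the flat "safe" part of the bathtub curve and failures are independent, so the yearly failure count is roughly Poisson. The function names are made up for illustration.

```python
import math


def expected_failures_per_year(n_disks: int, mtbf_years: float) -> float:
    """Average number of failures per year across the fleet."""
    return n_disks / mtbf_years


def prob_at_least_one_failure(n_disks: int, mtbf_years: float) -> float:
    """Chance of seeing one or more failures in a year (Poisson approximation)."""
    return 1.0 - math.exp(-expected_failures_per_year(n_disks, mtbf_years))


for n in (100, 200):
    print(f"{n} disks: {expected_failures_per_year(n, 100):.1f} failures/year expected, "
          f"P(at least one) = {prob_at_least_one_failure(n, 100):.0%}")
```

With a 100-year MTBF, a fleet of 100 disks averages one failure a year and a fleet of 200 averages two.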
People running large data warehouses are painfully aware of these two separate numbers. They need to replace all "expired" disks, and also have enough redundancy to survive disk failures in the meantime.
The article goes so far as to state this:
"When the vendor specs a 300,000-hour MTBF -- which is common for consumer-level SATA drives -- they're saying that for a large population of drives, half will fail in the first 300,000 hours of operation," he says on his blog. "MTBF, therefore, says nothing about how long any particular drive will last."
However, this obviously flew over the head of the author:
The study also found that replacement rates grew constantly with age, which counters the usual common understanding that drive degradation sets in after a nominal lifetime of five years, Schroeder says.
Common understanding is that 5 years is a bloody long life expectancy for a hard disk! It would take divine intervention to stop failures from rising after such a long time!
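A side note on the quoted "half will fail in the first 300,000 hours": that treats the mean as if it were the median. Under the usual constant-failure-rate model (an assumption here; it ignores wear-out entirely), the two differ, as this sketch shows. The function names are hypothetical.

```python
import math


def fraction_failed(hours: float, mtbf_hours: float) -> float:
    """Share of a large population that has failed by `hours`,
    ASSUMING exponentially distributed lifetimes."""
    return 1.0 - math.exp(-hours / mtbf_hours)


def median_life_hours(mtbf_hours: float) -> float:
    """Time by which half the population has failed under the same assumption."""
    return mtbf_hours * math.log(2)


mtbf = 300_000
print(f"failed by {mtbf:,} h: {fraction_failed(mtbf, mtbf):.1%}")
print(f"half have failed by: {median_life_hours(mtbf):,.0f} h")
```

In that model about 63% of drives have failed by the MTBF and half are gone at roughly 69% of it. Either way the quote's conclusion stands: the number says nothing about any particular drive.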
MTBF is only valid during the "lifetime" of a drive. (For example, "lifetime" might mean the five years during which a drive is under warranty.) Thus, the MTBF is the mean time before failure if you replace the drive every five years with other drives with identical MTBF. Thus the 100-some year MTBF doesn't mean that an individual drive will last 100+ years, it means that your scheme of replacing every 5 years will work for an average time of 100+ years.
Of course, I think this is another deceptive definition from the hard drive industry... To me, the drive's lifetime ends when it fails, not "5 years".
Source: http://www.rpi.edu/~sofkam/fileserverdisks.html
Was this even a question? I mean, did anybody actually believe the claims from the hard drive manufacturers?
Oh, you're not stuck, you're just unable to let go of the onion rings.
People say, 'Tape is kind of boring.' Well, I say go in and tell your customer that you have lost their back-up tapes and you'll see excitement pretty quickly.
Have you tried to get a drive replaced from Sun recently?
It'll take you an hour just to reach someone and then they aren't even in the right department.
Why?
Within the past six months the number of hard drive failures on their equipment has skyrocketed.
They can't hire people fast enough to answer the phones/web requests for new ones.
disclaimer: I work for Samsung 3.5" HDD Lab
One difference would be that the voice coil motor that pushes the head back and forth on seeks on Samsung drives runs slower, but quieter and lower power. Samsung drives generally have a reputation for being lower power. That has been one differentiating factor between Samsung versus Seagate, Fujitsu and Western Digital. However, an even bigger difference is the number of disks in the drive. The more disks, the harder all of the motors have to work.
There are differences from model to model within vendors as well. For each new model of hard drive you have a custom designed motor, enclosure, ICs, media, etc. The technology is moving so fast it is hard to follow. The current generation is the 1TB disks.
One funny example is that right now Western Digital is pushing their so-called "Green" 5400 rpm drives. Running at 5400 rpm does indeed use less power -- but they didn't set out to make a low power drive. Engineering was simply unable to get their 1TB drive to work at the higher performance 7200 rpm. So, they marketed it as a "green" drive, and had a huge success!
A little off-topic, but maybe someone should start complaining about stupid crappy power supplies failing for external hard drives.
I have had TWO failed power supplies happen on me, one a couple days ago because i accidentally kicked it when i tripped (on the floor...)
And it wasn't even a full-force kick either, it's like one of those nudges you'd use to check if the guy you ran over was dead.
Also, both drives were from Seagate, it could be that they just fail at making anything good (i'd have to stick with that one)
I'm almost considering ripping the drive out, breaking the plastic into pieces then mailing it back to Seagate with a little message, something like "LOOK AT ME! I'M A FUCKING WRECK! THIS IS YOUR FAULT DAMN IT! WHY DID YOU USE THOSE STUPID SECURE TORX SCREWS? YOU GOD DAMN SHITPENGUIN!"
Last I checked.
-Clio
Karma: Bad (mostly from not giving a fuck)
Blog: http://clintjcl.wordpress.com
I would expect that MTBF for new and old kit would differ significantly, and found http://www.tech-faq.com/mtbf.shtml which defines the 'MTBF Curve' - the variation of MTBF across a product's lifecycle.
However, by the very nature of the beast, this cannot be grasped by a single number, which is what marketeers prefer.
This would be a nice application for Sparklines [http://en.wikipedia.org/wiki/Sparkline]
change is inevitable
on this workstation (pIII500 1GB RAM XP/98) my 20GB western digital has been on more or less 24/7 since 1999 and is still going strong, sure it's a bit noisy (till it spins down) but no errors or bad blocks and the transfer rate is as good as ever
my server on the other hand has gone through 2 drives in 4 years and in my day job i have seen many a maxtor/wd fail 6 - 18 months after the customer purchased it
they just don't make 'em like they used to
Things which have moving parts have shorter lifespans than things which are solid state.
Of the pieces of hardware in a computer, these have moving parts:
Cooling fans
Hard drives
Removable media drives (CD, floppy, tape, etc)
Switches (power button and the like)
Ports (if you count things like the pins inside network and modem ports)
All of those things except hard drives could fail simultaneously, and you'd be pretty likely to be able to have the server running again in short order - by pulling the hard drive(s) and transplanting them into identical hardware. If any one of those non-hard drive things fails alone, the server is likely to continue running long enough to effect replacement before catastrophe (excepting maybe cooling fans, but you still wouldn't have a recovery scenario).
If the hard drives all simultaneously fail, you need to restore from backup, which is a lot of downtime.
Because failed drives result in so much downtime, fudging the reliability statistics on hard drives creates an exponentially higher risk to the consumer, one they are not generally aware of.
Note: I know about RAID, and everyone should use redundant disks. I'm talking a complete failure of all hard drives in a system. Unlikely, yes, but not out of the question. I had two of three drives in a RAID5 in predictive failure on a brand new HP server that I had just finished a customer project with, and had to wait until after the weekend to receive new drives. I was damned lucky.
Web 2.0 == Giant Blogspam Circle Jerk
Sorry. Mod me -1 Bad Pun.
Have gnu, will travel.
...that by the time the drive fails beyond that warranty, the vendor more likely than not won't have any drives that small in stock. So they'll replace it with whatever's on the shelf, which is usually an order of magnitude larger, at the very least.
I would not haul MY wang out to the trash...especially if it weighed over 100 pounds! *ducks and runs*
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
Given enough time, the failure rate for any kind of device is 100%.
Circumcision is child abuse.
Which would you rather own? A model that lasts 5 years to the day no matter how many you own, or a model that on average lasts 5 years but is as likely to fail on its 10th day as it is on its 10th anniversary of service?
Bell curves can be steep or shallow. Steeper curves, having values tightly clustered around the mean, lend themselves to far more predictable results. And the worst part about drive failures is that they're so freaking unpredictable.
I'm surprised so many have commented on the MTBF without so much as mentioning the relevance of the variation.
To the guys who claim they've never lost a drive, you've had what? Maybe 3 or 4? I deal with several large RAIDs, encompassing a few hundred drives and running 24/7. The power and cooling are very tightly controlled. Looking at our statistics, we have about a 5% failure rate for drives within the first year. About 10% over four years. SCSI drives seem to last longer than SATA drives, but they are also much more expensive. The MTBF numbers from the manufacturers are total BS. The best number to go by is the warranty, because that's what matters to the manufacturer. Depending on the expected failure rate of a particular model and the profit margin, they set the warranty period to minimize the number of replacements and still be able to make a profit. Some models might see a 5% or even 10% warranty replacement rate.
All of this is WRONG.
All this is just confusion, admittedly happily encouraged by the hardware manufacturers.
MTBF is NOT and has NOTHING to do with the expected time before a drive fails.
MTBF is the expected time between failures in a SYSTEM which is REGULARLY MAINTAINED.
What does regularly maintained mean? It means that when a component reaches the end of its SERVICE LIFE that component is REPLACED.
TO WIT: If at the end of the warranty period of your drive you replace said drive with a new burned-in hard drive, copying the data from old drive to new drive, and you keep doing this over and over again, on average it will take the MTBF before you encounter a failure.
Also, MTBF figures are notoriously inaccurate as they are arrived at using a formula which takes into account the MTBF of each component that goes into a system -- components which often have incorrect MTBF times.
Example: An electrolytic capacitor might have an MTBF of 100,000 years, assuming you replace it with a new tested electrolytic capacitor of the same type every year before all the electrolyte evaporates!
Knowing the MTBF without knowing the service life of the component or the burn-in procedures for a component is meaningless.
For more info see: http://www.apcmedia.com/salestools/VAVR-5WGTSB_R0_EN.pdf
This has been hashed and rehashed over and over.
7 Conclusion
Many have pointed out the need for a better understanding of what disk failures look like in the field. Yet hardly any published work exists that provides a large-scale study of disk failures in production systems. As a first step towards closing this gap, we have analyzed disk replacement data from a number of large production systems, spanning more than 100,000 drives from at least four different vendors, including drives with SCSI, FC and SATA interfaces. Below is a summary of a few of our results.
There are far more than two kinds of peeps.
"A great democracy must be progressive or it will soon cease to be a great democracy." --Theodore Roosevelt
The manufacturer tests a population of drives, and waits for a significant fraction of the drives under test to fail, recording the failure time for each. In this way, it is possible to separate "infant mortality" failures from "random event" failures. Typically, the failure times are fitted to a Weibull distribution http://en.wikipedia.org/wiki/Weibull_distribution. This process also provides a value for the post-manufacture burn-in time which will kill most units which are prone to "infant mortality" type failure. The MTBF is estimated based on the "random event" failure rate, giving absurdly large MTBF values. Unfortunately, the test rarely lasts long enough to identify the third type of failure: "wear out", which determines the end of life, and which is often less than the MTBF.
Think of estimating human life expectancy in the U.S. as MTBF, using data from http://www.data360.org/dsg.aspx?Data_Set_Group_Id=587. A small fraction (683 out of 100000) die in the first year. Death rates are low for the next 60 years, then climb to a very steep peak. The 85+ category is not subdivided, because very few live long enough to make separate 85-94 and 95+ categories worthwhile. However, if life expectancy were calculated for humans in the same way as MTBF for disk drives, then only deaths between ages 1 and 24 would be used. Since from the 99317 who survived infancy, only 126 die in that time, the MTBF for humans would be estimated at several centuries. If only we didn't wear out...
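For the curious, the human "MTBF" can be sketched the same way a drive test would do it: person-years observed divided by deaths. The comment does not say whether the 126 deaths are per year or the total over the whole age span, and the linked data set was not re-checked here, so both readings are shown. The 23-year span (age 1 to age 24) and the function name are assumptions, and the shrinking population over the window is ignored.

```python
def mtbf_years(population: int, deaths: int, years_observed: float) -> float:
    """Person-years observed divided by deaths, i.e. an MTBF in years."""
    return population * years_observed / deaths


survivors = 99_317   # made it past the first year, per the comment
deaths = 126         # deaths between ages 1 and 24, per the comment

# Reading 1 (assumption): 126 is an annual figure.
per_year = mtbf_years(survivors, deaths, years_observed=1)
# Reading 2 (assumption): 126 is the total across the 23-year span.
whole_span = mtbf_years(survivors, deaths, years_observed=23)

print(f"126 deaths per year:      MTBF of about {per_year:,.0f} years")
print(f"126 deaths over 23 years: MTBF of about {whole_span:,.0f} years")
```

The per-year reading gives the "several centuries" mentioned above; the other reading gives an even more absurd figure. Neither comes anywhere near an actual human lifespan, which is the point.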
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
However, I generally run out of HD space long before the drive has had time to wear out and I buy drives in pairs, one main, one mobile rack with rsync backup, and I occasionally rotate the main and backup drives to equalize wear.
Tech Public Policy stuff
I do straight disk image backups to a mobile rack. I do this 3x a week using a modified Knoppix disk with backup scripts (to find out how, google on alizard Knoppix) taking about 15 minutes to rsync the drives together. Upside? If my main HD fails, I'm back up and running in 15 minutes. (I'm running LVM, so I have to change the volume ID to boot normally from a Linux initrd) For you, a bare-metal restore is going to take a lot longer.
Tech Public Policy stuff
I back up to a mirror drive in a mobile rack, which is unplugged from the computer when not actively backing up. I back up 3x/week (IOW, when KAlarm tells me to) using a Knoppix disk modified with rsync and dar backup scripts.
The problem with RAID is the obvious one. If a disk drive in a RAID environment fails due to factors extrinsic to the drive (e.g. a lightning bolt blowing up the UPS and surge protector and dumping into the PSU), every redundant drive probably goes with it. The way to avoid this is secondary backup storage, whether to another site, NAS, a backup server, or a pile of DVD-Rs.
Tech Public Policy stuff
I use a drive mirror in a mobile rack that is unplugged when not backing up (scheduled 3x a week, and I'm pretty religious about it), and back up to an offline DVD-R pile monthly. My rack cost me about $20; it's a very nice aluminum case with a SATA plug. And the last time I had to replace the drive, I was up and running in 10 minutes and at the Maxtor site working on my RMA a few minutes later. (props to Maxtor, the warranty replacement was hassle-free)
And yes, I do sleep better at night. If your RAID array doesn't have some sort of separate offline/nearline backup, you shouldn't.
Tech Public Policy stuff
LOL, great post & analogy. Yea, if only we didn't wear out.
Thanks for the more detailed analysis of how MTBF is calculated (and how burn-in failures are ignored -- shouldn't that also be something that they report?). So it seems like this calc is just enough to get beyond burn-in duds or DOA, and into maybe the "mid-life" of the HD, but not into the burn-out phase. Although with 3 or 5 years being burn-out, that would be impractical to calculate. Albeit, they could provide an estimate based on reported burnouts (StorageReview.com) of their similar HDs manufactured with similar processes.
social sciences can never use experience to verify their statemen
I see this a lot in my line of work. While I have had the occasional hard drive die (I look after a lot of machines, so my odds are higher than a normal person's - I also have backups so failures aren't a problem), most people who say "my hard drive died" usually mean something more like "windows refused to boot so I got sold a new hard drive".
I'm really surprised at the rush to sell people new hard drives in the asshole stores. Just this weekend I got given a drive by someone to 'recover' all their info from it after it had 'died'. I stick it in a case... it's fine. The only thing wrong with it was a fucked up windows registry. So I happily copy all their files off onto some DVDs. Too bad I don't get paid extra for that. I also wonder what would have normally happened to this 'dead' hard drive after it had been replaced. I bet it would have just been reformatted and gone into the next machine.
Google released a report about a year ago with the surprising finding that heat had no apparent effect on the rate of hard disk failure. This was based on Google's set of several tens of thousands of always-on hard disks.
...is the StorageReview.com Hard Drive Reliability Survey: http://www.storagereview.com/map/lm.cgi/survey_login
You basically input all the hard drives you possess into their database and then they let you see the statistics collected so far.
When one of your drives fails, you ought to update its status (the age at which it failed).
The database is still a bit sparse, but it's the best I could locate on the Internet.
I bought some hard drives from a company I found through one of those online cheap price location sites (not mentioning the name because I don't want to encourage them), and it was about 10% cheaper than the next least expensive vendor. My company's policy at the time was to record the serial numbers of all the drives. I noticed that I could not find a serial number printed, but there was a barcode where the serial number field should be. I scanned the barcodes of the drives, noticed that they were all the same, and figured I was just looking in the wrong place. I called Maxtor (the drives were labeled with Maxtor labels), and they had me run some more tests, and they came to the conclusion, "We never made those drives". They were all counterfeit. Needless to say, the drives all failed after only a few days / weeks of use.
Did you mount a military-grade, variable-focus MASER on an unlicensed artificial intelligence?
I just wrote W.D. and asked about *actual* expected lifespan of their hard drives, and received this response:
===============
We no longer measure the reliability of our drives using Mean Time Between Failure (MTBF). Our current drive reliability is measured using Component Design Life (CDL) and Annualized Failure Rate (AFR). The Component Design Life of the drive is 5 years and the Annualized Failure Rate is less than 0.8%.
================
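Out of curiosity, here is what that 0.8% AFR looks like when translated back into the old MTBF units. This is a sketch only: it assumes 24/7 operation and a constant failure rate, which may not be how W.D. defines its AFR, and the function name is made up.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def afr_to_mtbf_hours(afr: float) -> float:
    """MTBF (hours) that corresponds to a given annualized failure rate,
    ASSUMING exponential lifetimes and round-the-clock operation."""
    return -HOURS_PER_YEAR / math.log(1.0 - afr)


print(f"AFR 0.8% is equivalent to an MTBF of about {afr_to_mtbf_hours(0.008):,.0f} hours")
```

That lands a little over a million hours, so "AFR less than 0.8%" is much the same claim as the old million-hour MTBF, restated per year. The 5-year Component Design Life is the part that is actually new information.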
~REZ~ #43301. Who'd fake being me anyway?