Slashdot Mirror


For First Three Years, Consumer Hard Drives As Reliable As Enterprise Drives

nk497 writes "Consumer hard drives don't fail any more often than enterprise-grade hardware, despite the price difference. That's according to online storage firm Backblaze, which uses a mix of both types of drive. It studied its own hardware, finding consumer hard drives had a failure rate of 4.2%, while enterprise-grade drives failed at a rate of 4.6%. CEO Gleb Budman noted: 'It turns out that the consumer drive failure rate does go up after three years, but all three of the first three years are pretty good. We have no data on enterprise drives older than two years, so we don't know if they will also have an increase in failure rate. It could be that the vaunted reliability of enterprise drives kicks in after two years, but because we haven't seen any of that reliability in the first two years, I'm skeptical.'"

23 of 270 comments

  1. You're buying an extended warranty by John3 · · Score: 4, Insightful

    "Enterprise" drives may have longer warranty coverage, so you are essentially just buying an extended warranty that is built into the selling price. This is how water heaters are priced...a 5 year warranty water heater is often identical to a 10 year warranty unit, but the manufacturer has crunched the failure rate numbers and will just wind up replacing a percentage of 10 year models when they start to leak in 8 years.

    --
    "We make our world significant by the courage of our questions and by the depth of our answers." Carl Sagan
    1. Re:You're buying an extended warranty by SJHillman · · Score: 3, Insightful

      Yeah, because no business ever adds computers to a domain, has users log in via Remote Desktop, uses group policies or roaming profiles.

    2. Re:You're buying an extended warranty by Guspaz · · Score: 3, Informative

      Let's presume that consumer drives don't fail for 3 years, and enterprise drives don't fail for whatever their warranty period is (or at least neither suffers significant failure figures during those time periods). Let me then compare the price of a comparable consumer and enterprise drive on NewEgg:

      Consumer drive: WD3001FAEX (3TB, 7200RPM, 64MB cache, 6gbit/s): $220, 2y warranty
      Enterprise drive: WD3000FYYZ (3TB, 7200RPM, 64MB cache, 6gbit/s): $340, 5y warranty

      Now, we know the data shows consumer drives are highly reliable for 3 years, after which they get less reliable, so let's presume you replace at your own cost every 3 years. Enterprise drives are probably no more reliable, but replacements are free between years 3 and 5, so let's say you replace at your own cost every 5 years. You get:

      Consumer drive, average cost per year: ~$73
      Enterprise drive, average cost per year: ~$68

      Not a huge difference there, and if both drives are really equally reliable, it doesn't really make much of a difference which you pick.
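The arithmetic in the comment above can be sketched in a few lines of Python. The prices and warranty periods are the ones quoted from NewEgg in the comment; the "replace every N years" policy is the commenter's own assumption, and the function name `cost_per_year` is made up for illustration.

```python
# Prices and replacement intervals come from the comment above.
# The replacement policy (every 3 or 5 years) is the commenter's assumption,
# not measured data.

def cost_per_year(price_usd: float, replace_every_years: float) -> float:
    """Average yearly cost if a drive is bought new every `replace_every_years`."""
    return price_usd / replace_every_years

consumer = cost_per_year(220, 3)    # WD3001FAEX, replaced at own cost every 3 years
enterprise = cost_per_year(340, 5)  # WD3000FYYZ, replaced at own cost every 5 years

print(f"Consumer:   ${consumer:.2f}/year")
print(f"Enterprise: ${enterprise:.2f}/year")
print(f"Difference: ${consumer - enterprise:.2f}/year")
```

Changing either replacement interval by a single year moves the result more than the price gap does, so the conclusion depends heavily on that assumption.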

    3. Re:You're buying an extended warranty by gstrickler · · Score: 3, Informative

      I got the warranty info directly from WD's site and spec sheets. RPM is NOT the primary factor in determining seek time, that only affects rotational latency, which is one of at least 4 components of access time, the other three being track seek time, head settling time, and head select time. Seek time is generally the largest of those, rotational latency second largest, and the others are minor by comparison.

      The amount of ECC does not depend only on whether the drive uses 512-byte or 4K (AF) sectors. That's one factor, but most "enterprise" drives from most manufacturers have greater ECC and most use lower track densities to allow faster positioning (faster seek). For instance, compare the data sheets for the 7200RPM desktop and Enterprise (Constellation ES) drives from Seagate. Note the "enhanced error correction" and better "non-recoverable read error" rates (which are directly related to ECC recoverability) on the ES (enterprise) drive, and that's comparing a 512b sector ES drive to a 4K/AF desktop drive.

      As I said, your analysis was generally good, you just missed the 3 items I noted.

      --
      make imaginary.friends COUNT=100 VISIBLE=false
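As a rough sketch of gstrickler's point that RPM only sets rotational latency: average rotational latency is the time for half a revolution, so it can be computed from RPM alone, while seek time has to come from the drive's data sheet. The function names and the 8.5 ms seek figure below are hypothetical, chosen only to show how the components add up.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

def avg_access_time_ms(rpm: float, seek_ms: float,
                       settle_ms: float = 0.0,
                       head_select_ms: float = 0.0) -> float:
    """Sum of the four access-time components named in the comment above."""
    return seek_ms + avg_rotational_latency_ms(rpm) + settle_ms + head_select_ms

for rpm in (5400, 7200, 15000):
    print(f"{rpm:>5} RPM: {avg_rotational_latency_ms(rpm):.2f} ms rotational latency")

# Hypothetical seek time, for illustration only; real values are on the data sheet.
print(f"7200 RPM with 8.5 ms seek: {avg_access_time_ms(7200, seek_ms=8.5):.2f} ms")
```

Two drives with the same RPM share the same rotational latency, so any difference in their access time has to come from the other three terms.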
  2. Re:Common knowledge by cmseagle · · Score: 3, Informative

    What? There's absolutely a difference between 87 octane and 92+ octane. While many high end cars are able to compensate for this difference by sacrificing efficiency, it's certainly not wise to put the lower grade gasoline in a high performance vehicle. Not a good analogy at all.

  3. Re:But but but by g0bshiTe · · Score: 4, Funny

    Sadly no, they are just Intrepid's SSDs

    --
    I am Bennett Haselton! I am Bennett Haselton!
  4. Re:Common knowledge by ShanghaiBill · · Score: 4, Interesting

    What? There's absolutely a difference between 87 octane and 92+ octane.

    For 99% of cars, there is no difference. Unless a car is specifically designed to use a higher compression ratio, there is no benefit whatsoever to a higher octane rating. Besides, you are assuming that the premium gas actually has a higher octane rating. Years ago, it actually cost more to make high octane gas. Today the octane rating can be tweaked with cheap additives. So it is common to just make it all 92, then just use one tanker truck to make the delivery and just fill all the tanks with identical gas.

  5. Google proved this years ago by Anonymous Coward · · Score: 4, Interesting

    Google already published many detailed reports on various issues surrounding the HDD business, proving that the money saved by buying cheaper hard-drives, and using them in data 'defending' situations (replicating data on multiple drives) made far more sense than using so-called 'enterprise' class equipment in complex, expensive configurations. Once again, to the surprise of no alpha, the KISS (keep it simple, stupid) principle wins out in engineering.

    The buzz wordy, mock intellectual, synthetically complex world of 'enterprise' solutions is designed to appeal to the mind of the 'beta', a class of technocrat for whom rote-learning is everything. IT people are mostly of this class, so the 'paraphernalia' and 'jargon' make such people feel 'special'. The fundamentals of Computer Science fly right over the heads of most people involved in computer decision making.

    It shames people to not even understand why the capitalist society works best with mass manufactured items, and that limited run items will always have significant compromises. Make more of an item, and it gets cheaper AND more reliable through necessity of efficiency.

    But only a few days back, in some forum, people were dribbling in ecstasy because some fake enterprise HDD (RED series from Seagate?) was being 'discounted' to only 40% above the cost of the cheapest quality 3TB HDD. Many people gave EXPENSE as the primary reason for buying the vastly inferior Xbox One over the PS4 (in other words they were 'big' individuals because they could afford the more expensive console).

     

  6. Re:Common knowledge by brianwski · · Score: 5, Insightful

    Disclaimer: Backblaze engineer here. I don't think all "commercial storage systems" get exactly the same "hammering". Some commercial systems are used to store data quietly for a long time (let's say online backup or shutterfly storage of photos), some commercial systems are hammered constantly (google's homepage search). I reject the concept that "enterprise" or "commercial" is a thing. You MUST look at the specific application. Some consumers use their hard drives quite a bit, some don't. Some corporations are hammering away at their drives, some are not.

  7. Re:Common knowledge by SJHillman · · Score: 5, Funny

    "Realistically the first part to fail on a PC will be the hard drive."

    Only because the user isn't technically part of the PC.

  8. Re:Common knowledge by SpaceManFlip · · Score: 4, Informative
    Source on the tanker claim?

    Also FYI the octane requirement can be related to timing advance, where a lower-compression turbocharged engine with more advanced timing would need higher octane gas to make longer burns from each spark (higher octane gas burns longer than lower octane gas). The earlier spark sets off a longer-burn time of gas timed to the timing, needing the longer-burn ability of the 92+ octane. An old simple truck with 0 BTDC timing would be happy with 87 octane, where a newer engine with 15 BTDC timing advance would be better with 92+ octane.

    Fuck this is way off topic from hard drives, sorry. Just needed to fill in some missing info.

    As for hard drives, the more, the better. RAID is for safety now, and SSD's are for speed where we used to have RAID-0. ETC

  9. Warranty isn't the only factor by vhfer · · Score: 5, Informative
    We have hundreds of drives in Coraid SAN shelves. In our first batch of maybe four or five 15-drive shelves, we bought our own drives-- Seagate with 5 year warranties. We had a high initial rate of failure in the first 6 months, followed by a low but steady rate from then until the warranties were up. We had spares, Seagate was good about getting us replacements relatively quickly, we weren't happy, but it was workable.

    All the newer shelves came preloaded with Coraid-approved drives. As I said, there's hundreds of drives involved here, a lot of SATA 1TB and 2TB and some SAS 600GB. I think out of the later drives, we've had two fail. Maybe three.

    Asked about it, Coraid said, yes, the warranty is better on "Enterprise-class" or "RAID-class" drives, but also, the firmware is different. They claim that drives intended for the consumer / SOHO market spend a lot of time retrying marginal reads before declaring an unreadable sector and sparing it. They say that SAN-class drives limit the retry time, because the array controller handles it more efficiently, since it has the big-picture view.

    They also say that the drives are optimized for close-quarters operation, all jammed together in an array, handling vibration and heat build-up slightly differently, and that they have minor differences to keep lubrication from migrating out of the spindle bearing under continuous operation. I don't know but I imagine loss of spindle bearing lube would add vibration and make any but the best reads more marginal.

    I don't know for sure, but we've spent a great deal of US dollars on their products and our experience has borne out the fact that there's a definite difference in arrays.

    As for corporate desktop and/or server use, well, I don't really know. Our servers that have one to four drives were mostly shipped with those drives, so we didn't choose them. I can't tell you if they are enterprise class drives, but I imagine they are, based on the replacement costs. And I know about what some of those costs are, or anyhow I know they were way more than I personally pay for drives for home desktop and server use. I know that because occasionally they fail, and I have to buy new ones.

  10. Re:Not only that, by gstoddart · · Score: 4, Informative

    But consumer hard drives are so much cheaper that it's not really cost effective anymore to buy Enterprise drives.

    Do you actually do Enterprise Storage? Because I know people who do.

    At the really high end, the machines automatically call home and report a fault to the vendor. The vendor then dispatches someone to replace the faulty bit within the SLA.

    In my experience, and from what I've been told by people who do this for a living, the Enterprise class drives come with the benefit of a warranty in which the manufacturer is contractually obligated to get you a replacement within a fixed amount of time.

    For anyone doing real enterprise class storage for real mission critical things, using commercial SATA drives is just not done unless it's cheap/bulk storage. Sure, you pay through the nose to the vendor for that kind of support, but you also have guaranteed service time and availability.

    I just don't see evidence of people who do this at an enterprise scale cheaping out on disks for the important stuff.

    --
    Lost at C:>. Found at C.
  11. Used to design HDD's by loose+electron · · Score: 4, Informative

    No difference between enterprise and home HDD's that I know of.

    As for what "hammering and heavy use" of a drive is?

    The biggest killer of HDD's is something called the CSS test cycle.

    CSS = Contact Start Stop where the drive is booted up, spun up, and then shut down repetitively.

    Generally, a HDD sitting there spinning away is not what kills it off,
    however turning it on-off-on-off a lot is the most abusive thing that you can do.

    I still think WD makes the best quality out there, but that's just my opinion.

    just my 0.02 worth...

    --
    www.effectiveelectrons.com "chips that work" Analog, RF, Mixed Signal
  12. Re:Common knowledge by brianwski · · Score: 3, Informative

    Our Dell shelves (billing servers and store customer account info) have hot spares already spinning inside the shelves. NetApp Filers do this also. If a drive fails, the storage system begins IMMEDIATELY transitioning to the spare. So I agree with you wholeheartedly there. Backblaze uses RAID6 for the customer backup storage where we group 15 drives into a RAID group with 2 parity drives. So we can lose any 2 drives out of 15 and the data is still 100% intact. I really, REALLY cannot recommend RAID5 to anybody. Having a lone hard drive is fine for some applications (my laptop), and having RAID6 with 2 parity drives is fine for some applications. I cannot imagine why you would have RAID because you care about your uptime, but not care enough to use more than RAID5.
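A minimal sketch of the trade-off brianwski describes, assuming the 15 drives in a group include the 2 drives' worth of parity (which is how "lose any 2 drives out of 15" reads). The function name `raid_group_summary` is hypothetical, and real RAID5/RAID6 layouts distribute parity across all drives rather than dedicating whole drives to it; only the capacity and fault-tolerance arithmetic is shown.

```python
def raid_group_summary(total_drives: int, parity_drives: int) -> dict:
    """Capacity and fault tolerance of one parity-protected RAID group.

    `parity_drives` is the number of drives' worth of parity:
    1 for RAID5, 2 for RAID6.
    """
    data_drives = total_drives - parity_drives
    return {
        "data_drives": data_drives,
        "usable_fraction": data_drives / total_drives,
        "failures_tolerated": parity_drives,
    }

raid6 = raid_group_summary(total_drives=15, parity_drives=2)
raid5 = raid_group_summary(total_drives=15, parity_drives=1)

for name, group in (("RAID6", raid6), ("RAID5", raid5)):
    print(f"{name}: {group['data_drives']} data drives, "
          f"{group['usable_fraction']:.1%} usable, "
          f"survives {group['failures_tolerated']} failed drive(s)")
```

On a 15-drive group, going from RAID5 to RAID6 costs one drive of capacity and doubles the number of simultaneous failures the group survives.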

  13. Re:Common knowledge by ewibble · · Score: 3, Insightful

    Consumer drives have this thing called being half the price. Keep one spare; what the heck, if it breaks, go out and buy a new one in an hour, still faster than 4 hours. What kind of enterprise organization wouldn't have a few hard drives spare just in case a few failed? Send the old one back to be replaced, in their own good time.

    I don't see why you would have to pay a 100% markup for what is basically insurance against manufacturer's defects.

    Sort of like airline tickets that you can reschedule, more than 2x the price and still subject to availability (last time my company bought one), just buy the non refundable ticket, if your plans change then buy another one, the average cost is going to be less, unless you change your plans a lot, perhaps you need better planning? You also have travel insurance for such things which is not the cost of the plane ticket, and covers other things too.

  14. Re:Common knowledge by UnknowingFool · · Score: 3, Insightful

    Are you saying that the enterprise drives last longer?

    I didn't say that.

    Or just that they are replaced for free when they die at the same or higher rates? If you want to save money, I think the answer is *NOT* buy the warranty (so buy consumer drives) because the warranty costs more than just replacing the failed drives?

    If your company wants to do that, then do it. But I would think that is a hard sell to the IT directors who want service and replacement parts quickly. Here's the scenario:

    1. HD fails
    2. Log ticket with HD company and get replacement drive with little cost
    or
    2. Put in a purchase order for a new drive.

    At some companies, buying a new drive outright is more troublesome/bureaucratic than getting a replacement drive.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  15. Re:Not like the 90's by houstonbofh · · Score: 3, Interesting

    You may want to check your environment for heat or dust, or get better power supplies. I can not remember the last drive I have had fail in the warranty period.

  16. Re:Common knowledge by brianwski · · Score: 4, Interesting

    The only major company I know that uses consumer grade HDs in volume is probably Google

    What qualifies as "major"? :-) This article is about Backblaze, we have 25,000 consumer hard drives, are we "major"?

  17. Re:Common knowledge by lgw · · Score: 3, Insightful

    You can't use a consumer drive in a RAID array if that drive will spend 90 seconds trying to recover a normal read error before sparing the sector out. TLER means "give up almost immediately" on media errors.

    Yes, it's a bit of a scam that you have to buy a high-end drive to get TLER, since it's just a flag in the firmware, but it's still critical to RAID.

    --
    Socialism: a lie told by totalitarians and believed by fools.
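A toy model of the timeout problem lgw describes: the array controller gives up on a drive that stays silent longer than it is willing to wait. The 90-second recovery figure is the one quoted in the comment; the controller timeout and the TLER limit below are hypothetical values picked for illustration, not vendor specifications, and the function name is made up.

```python
def drive_dropped_from_array(drive_recovery_s: float,
                             controller_timeout_s: float) -> bool:
    """True if the drive stays busy with error recovery longer than the
    controller waits before marking it as failed."""
    return drive_recovery_s > controller_timeout_s

CONTROLLER_TIMEOUT_S = 8.0   # hypothetical controller setting
DESKTOP_RECOVERY_S = 90.0    # worst-case retry time quoted in the comment
TLER_RECOVERY_S = 7.0        # hypothetical time-limited recovery setting

for label, recovery_s in (("desktop firmware", DESKTOP_RECOVERY_S),
                          ("TLER firmware", TLER_RECOVERY_S)):
    dropped = drive_dropped_from_array(recovery_s, CONTROLLER_TIMEOUT_S)
    print(f"{label}: {'dropped from array' if dropped else 'stays in array'}")
```

With TLER the drive reports the read error quickly and the controller rebuilds that sector from redundancy, instead of treating the whole drive as dead.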
  18. Re:Not only that, by roc97007 · · Score: 3, Interesting

    > Do you actually do Enterprise Storage? Because I know people who do.

    > At the really high end, the machines automatically call home and report a fault to the vendor. The vendor then dispatches someone to replace the faulty bit within the SLA.

    Yes, I deal directly with that, with Big Company and Really Big Company, and I have to say the process doesn't work very well, for many reasons that I won't enumerate here for keep-my-job reasons. In all honesty, we had better uptime and much faster response when we stocked our own spares and hired someone to walk through the machine room daily looking for yellow lights. Sorry, but that has been my experience. After outsourcing storage, the lag from warning light to replacement is significant, with many hilarious hijinks along the way. (My favorite being when they remotely updated the firmware during the same service call as disk replacement and bricked the device.) It's a great example of not getting what you pay for, except the ability to check off managerial line items.

    --
    Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
  19. Re: Common knowledge by nabsltd · · Score: 4, Informative

    Here in Australia, 92 is the standard fuel and 97 is the premium. I can't imagine putting 87 in my car...

    Australia displays the "Research Octane Number" on the pumps, while the US displays the "Anti-Knock Index", which is:
    ((Research Octane Number) + (Motor Octane Number)) / 2

    Since MON is often 8-10 points lower for the same fuel, this results in 4-5 points lower on the pump display in the US.
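The formula above is easy to check against the pump numbers mentioned in the thread. This sketch takes the Australian RON figures from the quoted comment and applies nabsltd's "8-10 points lower" rule of thumb for MON; the exact MON of any real fuel is an assumption here, and the function name is made up.

```python
def anti_knock_index(ron: float, mon: float) -> float:
    """US pump rating: the average of Research and Motor Octane Numbers."""
    return (ron + mon) / 2

# RON values from the thread (Australian regular and premium).
# The MON gap of 8 to 10 points is the rule of thumb quoted above.
for ron in (92, 97):
    for gap in (8, 10):
        mon = ron - gap
        print(f"RON {ron}, MON {mon} -> AKI {anti_knock_index(ron, mon):.0f}")
```

Under that rule of thumb, Australian 92 lands at about 87 to 88 on a US pump and 97 lands at about 92 to 93, so the two countries' grades are closer than the displayed numbers suggest.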

  20. Re:Common knowledge by fnj · · Score: 3, Insightful

    What the heck? The error retry and sector sparing are within the drive itself. ZFS doesn't even see this. What ZFS can see is a drive not responding for 90 seconds after a write command, and ZFS or the driver below the ZFS level does not like this. There is real danger of multiple drives being kicked out of the storage pool quickly and the whole pool failing, when proper drive behavior lets the pool continue undegraded even in the face of bad sectors on multiple disks.

    There are plenty of consumer drives that can be set to the same TLER (time-limited error recovery) behavior as enterprise drives, though.

    Right on the money about using ZFS, though. I will never understand losers using old fashioned expensive caching RAID controllers when ZFS on dumb SATA/SAS ports is far superior in every way. Many or most of them are Windows losers, of course.