Slashdot Mirror


SSD Annual Failure Rates Around 1.5%, HDDs About 5%

Lucas123 writes "On the news that Linus Torvalds's SSD went belly up while he was coding the 3.12 kernel, Computerworld took a closer look at SSDs and their failure rates. While Torvalds didn't specify the SSD manufacturer in his blog, he did write in a 2008 blog that he'd purchased an 80GB Intel SSD — likely the X25, which has become something of an industry standard for SSD reliability. While they may have no mechanical parts, making them preferable for mobile use, there are many factors that go into an SSD being reliable. For example, a NAND die, the SSD controller, capacitors, or other passive components can — and do — slowly wear out or fail entirely. As an investigation into SSD reliability performed by Tom's Hardware noted: 'We know that SSDs still fail.... All it takes is 10 minutes of flipping through customer reviews on Newegg's listings.' Yet, according to IHS, client SSD annual failure rates under warranty tend to be around 1.5%, while HDDs are near 5%. So SSDs not only outperform, but on average outlast spinning disks."

18 of 512 comments

  1. Poor statistics by Anonymous Coward · · Score: 5, Insightful

    "client SSD annual failure rates under warranty tend to be around 1.5%, while HDDs are near 5%"
    So they are less likely to fail early in their life.

    NOT:
    "So SSDs not only outperform, but on average outlast spinning disks."

    This is completely unsubstantiated by the evidence provided.

    1. Re:Poor statistics by BancBoy · · Score: 5, Funny

      One of the few benefits of a spinning platter is that they can briefly generate their own juice when the power goes out.

      As many of us do, when the power goes out...

      --
      [UID-HeinzIntel]
    2. Re:Poor statistics by hairyfeet · · Score: 4, Interesting

      I would just add that the whole thing ignores the big old rotting elephant in the room, which is HDDs. I have found that in damned near every case, not all but most, they will give you PLENTY of warning before they go completely tits up, whereas the SSD? One day it's working and the next....nothing. No warning, no noise, no indication at all that there was a problem, just...poof, buh bye data. This also ignores the fact that if the circuit board fails in an HDD you can swap in one from the same model and get the drive back in most cases, at least long enough to get the data. The SSD? Hope you are good with a soldering iron and a chip reader, and I have heard even then it's unlikely.

      I may be just a little country shop guy but when my gamer customers have all experienced multiple failures when it comes to SSDs, and these guys don't go cheap, sorry but ATM I still don't trust it. I tell folks that if they want an SSD, don't keep anything on it they would feel bad about losing. Now does that mean there aren't still uses for SSDs? Of course not. For one thing, if you have a laptop where most if not all of your data is in the cloud? Knock yourself out, just make a weekly disk image so you can re-image when it goes tits up and you are golden. I also have several customers who have bought either hybrid drives or that SanDisk caching drive for Win 7, and in both cases they have seen pretty big speed boosts without having to worry, because if it dies all you do is go back to HDD speeds, as it is just a cache.

      Oh and one final thing....it's gonna get worse. It's common knowledge that with each shrink the number of writes goes down and the number of failures goes up, and with all of the major chip companies seeming to care only about how many bits they can stuff per nanometer? The failure rate WILL get worse, you can count on it. It's too bad that the price of SLC is so insanely high, as those seem to have lower failure rates than MLC, but as long as all the companies care about is getting that GB number up at all costs, it's really not gonna be getting better, it's gonna be getting worse.

      Ironic that they talk about how supposedly high HDD failure rates are when I cleaned out a whole drawer of them before moving into the new place, we are talking drives going back to Quantum Fireballs in the 200Mb size, yes Mb not Gb, and they all fired up. Granted, some of them were noisy as hell, but I could still get files off of them, while not a single one of my gamer customers still has their first SSD; they are all dead. Yes, I know it's an anecdote, but I'm not the only one who has seen this; Coding Horror puts SSDs on the hot/crazy scale, as you trade red hot performance for crazy failure rates. Call me old fashioned but I think I'll just pick up a caching SSD and keep the 5Tb in spinning rust, thanks ever so Intel.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    3. Re:Poor statistics by girlintraining · · Score: 5, Insightful

      I have found that in damned near every case, not all but most, [HDDs] will give you PLENTY of warning before they go completely tits up, whereas the SSD?

      Yeah, sure, okay. If you're sitting next to your computer, then yeah, maybe you notice. How about the hundreds of millions of drives that are sitting in a rack somewhere, and will only see a human being twice: Once when it gets installed in the rack, and then only when it stops working for whatever reason and a tech is sent out to replace it.

      The "it made a funny noise first" line item is a joke either way. This is like saying "Well, I prefer diesel engines because they make more noise when they die." Hookay. Yeah.

      I may be just a little country shop guy but when my gamer customers have all experienced multiple failures when it comes to SSDs, and these guys don't go cheap, sorry but ATM I still don't trust it.

      I may just be a Ferrari repair shop owner, but when my car owners have all experienced multiple failures when it comes to ceramic brakes and high end engine components, and these guys don't go cheap, sorry but ATM I still don't trust it.

      Now do you see how utterly ridiculous that sounds? High performance almost always means less robust. That graphics card you just plunked over $200 on? Its operating temperature is so high from the current being pumped through it that it's literally cooking itself at the molecular level from the moment you plug it in -- it's called electromigration, and in three to five years, depending on how often you use it, it's going to shit itself. But that's okay... because in two years, you'll be spending even more on a new one.

      Ironic that they talk about how supposedly high HDD failure rates are when I cleaned out a whole drawer of them before moving into the new place, we are talking drives going back to Quantum Fireballs in the 200Mb size, yes Mb not Gb, and they all fired up. Granted, some of them were noisy as hell, but I could still get files off of them, while not a single one of my gamer customers still has their first SSD; they are all dead.

      Yeah, and? How many gamers are still using their 200Mb Quantum Fireballs in an actual computer? I know it's a common geek pastime to see what kind of antiquated hardware you can pull out with your friends... that old parallel port Zip drive, or floppies the size of your head... and yeah, it's fun to talk about to show you had IT chops before the person you're talking to was even a glint in daddy's eye... but that's the only value they have.

      Nobody's coming up to me and asking for an AT command initialization string for their modem -- AT&F&C1&D2S95=55 in case you were wondering -- because it's not a technology very many use anymore. Yeah, I can dig out an old 2400 baud modem and get it working... but that doesn't mean 2400 baud modems are superior to cable modems that "have a higher failure rate".. and so, you know... I don't know if I trust such 'new' technology.

      Now, get off my lawn.

      --
      #fuckbeta #iamslashdot #dicemustdie
    4. Re:Poor statistics by icebike · · Score: 5, Informative

      Yeah, sure, okay. If you're sitting next to your computer, then yeah, maybe you notice. How about the hundreds of millions of drives that are sitting in a rack somewhere, and will only see a human being twice: Once when it gets installed in the rack, and then only when it stops working for whatever reason and a tech is sent out to replace it.

      Hmm, my drives send me emails when they start having problems. (And having gotten one of these emails a few years after setting up the drive initially, I was shocked to find that the email arrived in plenty of time. I was pleasantly surprised to find the drive and all data still intact, and had time to swap a replacement into the raid).

      Why don't you find out how this is handled by people who actually have hundreds of drives to deal with.
      If you let them fail before servicing them you are doing it wrong.

      Look into: man 8 smartd
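
As a rough illustration of the kind of check smartd automates, here is a minimal Python sketch that reads the overall health verdict from `smartctl -H`. It assumes smartmontools is installed and that the drive is an ATA device, whose report contains an "overall-health self-assessment test result" line; SCSI and NVMe devices word this differently. The helper names `health_passed` and `check_drive` are made up for this example.

```python
import subprocess


def health_passed(smartctl_output: str) -> bool:
    """Return True if `smartctl -H` text reports an overall PASSED verdict.

    Assumes the ATA-style wording of the health line.
    """
    for line in smartctl_output.splitlines():
        if "overall-health self-assessment test result" in line:
            # The verdict is whatever follows the last colon on that line.
            return line.rsplit(":", 1)[1].strip() == "PASSED"
    # No health line found: treat the drive as not known to be healthy.
    return False


def check_drive(device: str) -> bool:
    """Run `smartctl -H` on a device (usually requires root privileges)."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return health_passed(result.stdout)


# Illustrative input only; a real report contains more lines than this.
sample = "SMART overall-health self-assessment test result: PASSED\n"
sample_ok = health_passed(sample)
```

In practice smartd does this polling for you and can mail a warning, so a script like this is only useful for ad hoc checks.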

      --
      Sig Battery depleted. Reverting to safe mode.
    5. Re:Poor statistics by nedlohs · · Score: 4, Interesting

      Because Linus, who apparently uses SSDs, would never regularly compile a kernel or anything like that.

    6. Re:Poor statistics by Vanderhoth · · Score: 4, Informative

      I think he's referring to the power supply in the machine, not just an outright outage where all the lights in your house go off. I've had power supplies that, just before they die altogether, will flicker on and off repeatedly; that would cause the SSD to flicker as well, causing the issue outlined above.

  2. Re:Do the math by Anonymous Coward · · Score: 4, Informative

    Alright, I'll do the math....

    9ms average access times on a 7200RPM spinning drive == ~100 IOPS.
    High-end SSD: 100K IOPS.

    Yes, a thousand times the number of disk accesses. If you're really a developer, you'll see your compile times cut by a factor of 5-10 (depending on how much CPU power you have to spare). Things load from disk like magic.

    You don't buy SSDs for the raw capacity, you buy them for the *fast* access times. Period.
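
The arithmetic above can be sketched in a few lines of Python. This assumes each random access costs the full average access time, so IOPS is simply 1000 ms divided by that time; the 9 ms and 100K figures are the ones quoted in the comment, and `hdd_random_iops` is a name invented for this example.

```python
def hdd_random_iops(avg_access_ms: float) -> float:
    """Random IOPS for a drive that spends avg_access_ms on each access."""
    return 1000.0 / avg_access_ms


hdd_iops = hdd_random_iops(9.0)  # 9 ms average access, from the comment
ssd_iops = 100_000               # "high-end SSD" figure, from the comment
ratio = ssd_iops / hdd_iops      # how many times more accesses per second

print(f"HDD: ~{hdd_iops:.0f} IOPS")
print(f"SSD is ~{ratio:.0f}x the HDD")
```

The exact ratio comes out a little under a thousand, which matches the comment's order-of-magnitude claim.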

  3. Hard drives warranty by danbob999 · · Score: 4, Insightful

    5 years should be mandatory by law. If you can't support your drive for 5 years, you shouldn't be allowed to manufacture hard drives at all.
    I don't understand this new trend in making new hard drives with only 1-2 years warranty. The same goes for SSD.

  4. Re:SSD failure rates by gander666 · · Score: 5, Informative

    Bullcrap. They can be replaced. Look up http://macsales.com/ they sell several sizes for the airs and the pro retinas.

    --
    Suppose you were an idiot and suppose you were a member of Congress ... but I repeat myself. - Mark T
  5. Yawn. by Anonymous Coward · · Score: 4, Insightful

    Anyone who isn't using an SSD by now for at least their boot drive is stuck in the past.
    It's the single best upgrade you can make anymore.

    Either way stop the fucking articles about it.
    Leave them with their warm feelings for spinning rust full of multi gigs of stuff they never touch.

    They'll wise up eventually. Or not.
    Either way it won't hurt you any. Enjoy your speedy pc and laugh at the rusties if you must.

  6. Re:Do the math by camperdave · · Score: 5, Insightful

    > as a developer, I have no use for SSD in my desktop system.

    Do you compile code?

    SSDs are for booting. RAM disks are for compiling, and hdd is for long term storage.

    --
    When our name is on the back of your car, we're behind you all the way!
  7. Re:Do the math by gman003 · · Score: 5, Informative

    Actually, you'd be surprised. The Samsung 840 EVO, a low-cost consumer drive (the high-end is the 840 Pro) that gets down to $0.70/GB, can hit 90K IOPS read on every model, and 90K IOPS write on 500GB models and up.

    Sure, older or ultra-cheap drives won't hit that (my new Chronos doesn't get there), but rounding to the nearest order of magnitude will get you 100K IOPS even on medium-end consumer drives.

  8. Stay away from OCZ and SandForce by JDG1980 · · Score: 5, Interesting

    OCZ's failure rates are higher than the rest of the industry's by an order of magnitude. Also, earlier SandForce drives have reliability problems because the firmware was written by paranoid loons who were deathly afraid of reverse-engineering and the drive goes into irrecoverable 'panic mode' when any abnormality of any kind is sensed. I think that newer SandForces (post-LSI acquisition), especially Intel's, are less likely to do this, but the original failures still taint the brand with the stigma of flakiness.

    If you stick with Samsung, Intel, and SanDisk, you should be fine. Stay away from OCZ at all costs, and be skeptical of any SandForce drive not made by Intel.

  9. Re:Do the math by Dahamma · · Score: 4, Informative

    Wait, you are basing the improvements in compile times on one guy's anecdotal results? Well, here's another: when I switched to an SSD at work my compile times were cut by more than half. It was a huge difference in compile time, i.e. productivity.

    It all depends on your codebase and tools, really. He was probably compiling a relatively small codebase, and for all we know his methodology sucked so a lot of it was in the RAM cache. I can tell you for a fact that a clean build on a large code base was drastically improved.

  10. Re:Do the math by MikeBabcock · · Score: 4, Informative

    No, he's doing the math right -- at an annual failure rate of 1%, you need to replace 1% of your total capacity every year. With an annualized failure rate of 5%, you need to replace 5% of that capacity overall. The averaging is done because over time, it works out, just like insurance. Sure, in any given year *if* a drive fails, you have to pay for the whole thing, but that's not how one accounts for such failures.
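
A minimal sketch of that insurance-style accounting, in Python. The fleet size and unit cost are hypothetical numbers chosen only to make the arithmetic visible; the 1% and 5% rates are the ones used in the comment.

```python
def expected_failures(drive_count: int, annual_failure_rate: float) -> float:
    """Expected number of drives that fail in one year."""
    return drive_count * annual_failure_rate


def expected_replacement_cost(
    drive_count: int, annual_failure_rate: float, unit_cost: float
) -> float:
    """Expected yearly spend on replacements, averaged over the fleet."""
    return expected_failures(drive_count, annual_failure_rate) * unit_cost


FLEET = 1000       # hypothetical fleet size
UNIT_COST = 100.0  # hypothetical price per drive

low = expected_replacement_cost(FLEET, 0.01, UNIT_COST)   # 1% AFR
high = expected_replacement_cost(FLEET, 0.05, UNIT_COST)  # 5% AFR

print(f"1% AFR: about {low:.0f} per year")
print(f"5% AFR: about {high:.0f} per year")
```

For a single drive the yearly outcome is all or nothing, but across many drives or many years the average cost approaches these expected values.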

    --
    - Michael T. Babcock (Yes, I blog)
  11. Re:But the disc can store much more by Rockoon · · Score: 4, Insightful

    That's a silly thing to do. Let's examine this, shall we?

    A 5% chance to lose 2TB vs a 1.5% chance to lose 250GB.

    You argue that since it requires 8 of these 250GB SSD's to equal the capacity of the 2TB HDD that we should multiply 1.5% by 8, so a 12% chance... a 12% chance of what, though? In actuality, there isn't a 12% chance of anything...

    The chance of losing at least 1 of those 8 SSD's (that is specifically 1 or more) over the period is (1 - (1 - 0.015)^8) = 0.114, but the chance of losing all of those 8 drives over the period is 0.015^8 ≈ 2.56 × 10^-15. In other words, losing all 2TB in the SSD scenario is effectively never going to happen while it remains 5% for the HDD scenario.

    The actual breakdown of all possibilities of drive failings (0 drives, 1 drive, 2 drives, etc..) rounded to thousandths of a percent is:

    0 drives: 88.611%
    1 drive: 10.795%
    2 drives: 0.575%
    3 drives: 0.018%
    4 drives: 0.000%
    5 drives: 0.000%
    6 drives: 0.000%
    7 drives: 0.000%
    8 drives: 0.000%

    So we see that you would be more than twice as likely to lose some data as in the HDD scenario, but almost invariably it will only be 250GB of data instead of 2TB of data (only about 1 in 169 of these 8 drive experiments will witness more than 1 drive fail, and the majority of those will be exactly 2 drives failed)

    So no, you do not need to multiply the failure rate of the SSD's by the number of SSD's that you would need to equal the HDD. What you need to do is define the problem better because as it stands SSD's look a hell of a lot better when you suppose that you need a pile of them.
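
The breakdown above is a binomial distribution, and a short Python sketch can reproduce it. This assumes the 8 drives fail independently, each with the 1.5% annual rate quoted in the summary, which is a simplification since drives from one batch or one machine can fail together. The function name `failure_distribution` is invented for this example.

```python
from math import comb


def failure_distribution(n: int, p: float) -> list[float]:
    """P(exactly k of n drives fail) for k = 0..n, assuming independence."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


dist = failure_distribution(8, 0.015)

p_at_least_one = 1 - dist[0]             # one or more drives lost
p_more_than_one = 1 - dist[0] - dist[1]  # two or more drives lost
p_all = dist[8]                          # every drive lost

for k, prob in enumerate(dist):
    print(f"{k} drives: {prob * 100:.3f}%")
print(f"at least one: {p_at_least_one:.3f}")
print(f"more than one: about 1 in {1 / p_more_than_one:.0f}")
```

Changing `n` and `p` makes it easy to compare other drive counts and failure rates under the same independence assumption.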

    --
    "His name was James Damore."
  12. Ancient data. by Reeses · · Score: 5, Informative

    All this discussion on this and no one has commented that TFA is from 2011??

    This article isn't reliable information. It's from when SSDs were relatively new and definitely doesn't apply to the in-the-field results people are seeing in 2013.

    --
    Reeses