Slashdot Mirror


Taking a Hard Look At SSD Write Endurance

New submitter jyujin writes "Ever wonder how long your SSD will last? It's funny how bad people are at estimating just how long '100,000 writes' are going to take when spread over a device that spans several thousand of those blocks over several gigabytes of memory. It obviously gets far worse with newer flash memory that is able to withstand a whopping million writes per cell. So yeah, let's crunch some numbers and fix that misconception. Spoiler: even at the maximum SATA 3.0 link speeds, you'd still find yourself waiting several months or even years for that SSD to start dying on you."

32 of 267 comments

  1. Holy idiocy batman by Anonymous Coward · · Score: 4, Insightful

    100000 writes? 1M writes?

    What the fuck is this submitter smoking?

    Newer NAND flash can sustain maybe 3000 writes per cell, and if it's TLC NAND, maybe 500 to 1000 writes.

    1. Re:Holy idiocy batman by Anonymous Coward · · Score: 5, Informative
      • SLC NAND flash is typically rated at about 100k cycles (Samsung OneNAND KFW4G16Q2M)
      • MLC NAND flash used to be rated at about 5k – 10k cycles (Samsung K9G8G08U0M) but is now typically 1k – 3k cycles
      • TLC NAND flash is typically rated at about 1k cycles (Samsung 840)
    2. Re:Holy idiocy batman by craznar · · Score: 4, Funny

      Obviously the TLC NAND is named for the Tender Loving Care you need to give it during use.

      I think the Slack Lazy Careless stuff is more robust.

      --
      EMail: 0110001101100010010000000110001101110010 0110000101111010011011100110000101110010 0010111001100011011011110110
    3. Re:Holy idiocy batman by craznar · · Score: 4, Informative
      --
      EMail: 0110001101100010010000000110001101110010 0110000101111010011011100110000101110010 0010111001100011011011110110
    4. Re:Holy idiocy batman by CajunArson · · Score: 5, Insightful

      The AC is dead-on right. At 25nm the endurance for high-quality MLC cells is about 3,000 writes. That's a relatively conservative estimate so you are pretty much guaranteed to get the 3K writes and likely somewhat more, but it's a far far cry from the 100K writes you can get from the highly expensive SLC chips. Intel & Micron claimed that one of the big "improvements" in the 20nm process was hi-K gates that are claimed to maintain the 3K write endurance at 20nm, which otherwise would have dropped even more from the 25nm node.

      The author of the article went to all the time & trouble to do his mathematical analysis without spending 10 minutes to find out the publicly available information about how real NAND in the real world actually performs....

      --
      AntiFA: An abbreviation for Anti First Amendment.
    5. Re:Holy idiocy batman by jyujin · · Score: 5, Informative

      I specifically had SLCs in mind when I ran the numbers. As for the 100k writes I used in my original calculations, I took those from this PDF here: http://www.datasheetcatalog.org/datasheets2/16/1697648_1.pdf - see section 1.5, it lists "Endurance : 100K Program/Erase Cycles". As for the 1M write cycles: http://investors.micron.com/releasedetail.cfm?ReleaseID=440650 - that one came out in 2008, so using it as a baseline for "newer" SLCs didn't seem that far off. I'll have to revise the article to include those links methinks...

    6. Re:Holy idiocy batman by ioconnor · · Score: 5, Informative

      Citation needed? The manufacturers typically tell you. For instance here http://www.newegg.com/Product/Product.aspx?Item=N82E16820239045 it states "Budget-minded gamers and enthusiasts will benefit from the lower price of Kingston’s new HyperX 3K SSD. This solid-state drive combines premium 3000 program-erase cycle Toggle NAND with the second-generation SandForce controller" So it gets only 3% of the author's most optimistic graph! Kind of a funny article actually. Like the mad scientist doing lots of good math but overlooking the most obvious information, which the dingbat brought along for comedy plot complications sees in a flash. I wrote a tutorial yesterday on how to make a RAM drive on Linux so as to avoid using your fancy fast flash drive. It can be found here: https://ioconnor.wordpress.com/2013/02/18/tutorial-on-automatically-moving-home-to-ram-drive-and-back-on-startup-and-shutdown/

    7. Re:Holy idiocy batman by Anonymous Coward · · Score: 3, Insightful

      He referenced specific models. A hyperlink is not the only way to refer to a source. You were given enough information to find the source easily.

    8. Re:Holy idiocy batman by ebh · · Score: 5, Interesting

      RAM disks are cool and all, but except on live CDs they're usually unnecessary. The kernel's buffer cache and directory-name-lookup cache (in RAM) can often outperform RAM disks on second reads and writes.

      (Claimer: I worked on file systems for HP-UX, and we measured this when we considered adding our internal experimental RAM FS to the production OS.)

    9. Re:Holy idiocy batman by CajunArson · · Score: 5, Funny

      17 December 2008.

      5 years? Might as well write a white paper on the benefits of drum memory over mercury delay lines.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    10. Re:Holy idiocy batman by tlhIngan · · Score: 5, Interesting

      100000 writes? 1M writes?

      What the fuck is this submitter smoking?

      Newer NAND flash can sustain maybe 3000 writes per cell, and if it's TLC NAND, maybe 500 to 1000 writes.

      Actually, NAND flash doesn't "die" when you try to do the N+1 erase-write cycle (it's cycles, not writes. A cycle consists of flipping bits from 1 to 0 (aka write), and then from 0 to 1 (aka erase)). In practically all controllers, you do partial writes. With SLC NAND, it's fairly easy - you can write a page at a time, or even half pages. MLC lets you do page at a time as well - given typical MLC "big block" NAND of 32 4k pages, a block can be written 32 times before it's erased (once per page - you cannot do less than a page at a time).
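
      A toy model of the cycle-versus-write distinction above, as a rough sketch only. The class name NandBlock and its methods are made up for illustration; only the 32 x 4k geometry comes from the post. Each page can be programmed once, and only a whole-block erase frees the pages again and uses up one endurance cycle.

```python
class NandBlock:
    """Hypothetical model of one NAND erase block (not any vendor's API)."""

    def __init__(self, pages=32, page_size=4096):
        self.page_size = page_size
        self.pages = [None] * pages  # None means erased (all bits 1)
        self.erase_count = 0         # completed erase cycles

    def program(self, page, data):
        # Programming only flips bits 1 -> 0, so a page cannot be rewritten in place.
        if self.pages[page] is not None:
            raise ValueError("page already programmed; erase the block first")
        if len(data) > self.page_size:
            raise ValueError("data larger than one page")
        self.pages[page] = bytes(data)

    def erase(self):
        # Erase resets every page in the block and costs one endurance cycle.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


block = NandBlock()
for page in range(32):
    block.program(page, b"data")  # 32 page writes ...
print(block.erase_count)          # prints 0: no cycle consumed yet
block.erase()
print(block.erase_count)          # prints 1
```

      So 32 host-visible writes cost a single cycle in this model, which is why "writes" and "cycles" are not interchangeable.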

      And... other dirty little secret - the quoted cycle life is guaranteed. It means your part will be able to be written and erased 3000 times. Most typically, they're an order of magnitude more conservative - so a 3000 cycle flash can really get you 30,000 with proper care and tolerance.

      Of course, a really big problem with cheap SSDs is lame firmware because what you need is a good flash translation layer (FTL) which does wear levelling, sector translations, etc. These things are VERY proprietary and HEAVILY patented. A dirt cheap crappy controller you might find on low end thumbdrives and memory cards may not even DO translation or wear levelling. The other problem is the flash translation table must be stored somewhere so the device can find your data (because of wear levelling, where your data is actually stored differs from where your PC thinks it is - again, the FTL handles this). For some things, it's possible to just scan the entire array and generate the table live, but generally it's impractical at the large scale because it requires time to perform the scan. So usually the table is stored in flash as well, which of course is not protected by the FTL. Depending on how things go, this part could corrupt itself easily, leading to an unmountable device or basically, a dead SSD.
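
      For a feel of what the translation table does, here is a minimal sketch. Real FTLs are proprietary, so everything here (the ToyFTL name, the dict-based table, the lack of garbage collection) is an assumption for illustration and not how any actual controller works.

```python
class ToyFTL:
    """Hypothetical flash translation layer: logical sector -> physical page."""

    def __init__(self, physical_pages):
        self.table = {}                          # the translation table
        self.free = list(range(physical_pages))  # erased pages ready for use
        self.stale = set()                       # old copies awaiting erase
        self.flash = {}                          # physical page -> data

    def write(self, logical, data):
        if not self.free:
            raise RuntimeError("no free pages; a real FTL would garbage-collect here")
        page = self.free.pop(0)      # never overwrite in place, take a fresh page
        old = self.table.get(logical)
        if old is not None:
            self.stale.add(old)      # the previous copy is now garbage
        self.flash[page] = data
        self.table[logical] = page   # the host's view of the sector moves

    def read(self, logical):
        return self.flash[self.table[logical]]


ftl = ToyFTL(physical_pages=4)
ftl.write(7, b"first")
ftl.write(7, b"second")            # same logical sector, different physical page
print(ftl.table[7], ftl.read(7))   # prints: 1 b'second'
```

      Lose `table` in this sketch and the data is still sitting in `flash`, but nothing knows which page is the current copy of which sector. That is the corruption scenario described above.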

      For some REAL analysis, some brave souls have been stressing cheap SSDs to their limits until failure - http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm

      Some of those SSDs are actually still going strong.

      The best bet is to buy from people who know what they're doing - the likes of Samsung (VERY popular with the OEM crowd - Dell, Lenovo, Apple, etc.), Toshiba, and Intel - who all make NAND memory and thus actually do have experience on how to best balance speed and reliability. Everyone else is just using the datasheet and just assembling them together like they would any other PC part.

  2. 100,000? by rgbrenner · · Score: 5, Informative

    100,000 is only for SLC NAND. MLC, what is currently in most SSDs, is only 3,000, and TLC (found in usb drives, samsung 840, and probably more SSDs soon because it's cheaper) is only 1,000.

    Is 1,000 fine for most people, yes.. but you should be aware of it. I have a fileserver that writes 200gb per day.. which would kill a Samsung 840 in about 6-7 months.
    http://www.anandtech.com/show/6459/samsung-ssd-840-testing-the-endurance-of-tlc-nand

    1. Re:100,000? by rgbrenner · · Score: 5, Informative

      I own 2 840s... they are fine. If you're really concerned, samsung has a tool that will let you adjust the spare space.. so you can take a 256gb drive, set aside 20gb to use for spares as cells wear out, and use 236gb for your data.

      If you read the article I linked to, an 840 128gb drive will last for about 272TB in writes... or about 11.7 years at 10gb/day.

      It's much more likely that another part will wear out before the cells do.

    2. Re:100,000? by beelsebob · · Score: 5, Interesting

      Luckily, while he's about 30 times out for the write endurance on the bad side, he's about 100-1000 times out on the speed at which you're likely to ever write to the things, on the good side, so in reality, SSDs will last about 3-30 times longer than he's indicating in the article. The fact that he's discussing continuous writes at max sata 3 speed suggests that he's really concerned with big ass databases that are writing continuously, and use SLC NAND. The consumer case is in fact much better than that, even despite MLC/TLC.
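
      A back-of-the-envelope sketch of the worst case being discussed: writing nonstop at the SATA 3.0 payload rate of roughly 600 MB/s. The 256 GB capacity is a hypothetical example, and the model assumes perfect wear leveling and no write amplification.

```python
SATA3_BYTES_PER_SEC = 600e6  # ~6 Gbit/s line rate after 8b/10b encoding


def days_to_wear_out(capacity_bytes, pe_cycles, bytes_per_sec=SATA3_BYTES_PER_SEC):
    """Days of nonstop writing before every cell has used its rated cycles.

    Assumes perfect wear leveling and no write amplification.
    """
    return capacity_bytes * pe_cycles / bytes_per_sec / 86400


capacity = 256e9  # hypothetical 256 GB drive
for name, cycles in [("SLC", 100_000), ("MLC", 3_000), ("TLC", 1_000)]:
    print(name, round(days_to_wear_out(capacity, cycles), 1), "days")
```

      On those assumptions SLC survives well over a year of saturated writes while MLC lasts about two weeks. Divide the write rate by the 100-1000x mentioned above and the MLC figure stretches to years.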

  3. Re:Tried It - Disappointed by CajunArson · · Score: 4, Informative

    Obvious Troll is Obvious but... while SSDs can & do fail (just like old hard drives can & do fail), the reason for SSD failure in the real world is very rarely due to flash memory wear. Hint: If your flash drive suddenly stops working one day, that ain't due to flash wear, which would manifest as gradual failure over time.

    --
    AntiFA: An abbreviation for Anti First Amendment.
  4. Our first age-related failure was a 2008 drive. by urbanriot · · Score: 5, Interesting

    Our company experienced what we believe was its first age-related failure in October of 2012, an office PC with an Intel SSD drive in the value oriented line of 2008 (which was still high at the time). Basically the drive behaved as a mechanical drive would behave with an occasional bad sector and we were able to successfully image the data to a new one. Out of 200 Intel drives, that's pretty good. (We did have one failure in 2010 but that was an outright dead drive and we were able to RMA it). Not sure if this contributes anything to the conversation but I figured I'd throw this out there.

    The Intel X25's in my PC, from 2009, are still humming along nicely and my last benchmark produced the same results in 2012 as they did in 2010. But I've gone so far as to set environment variables for user temp files to a mechanical drive, internet temp files to a RAM drive and system temp files to a RAM drive, offsetting the wear leveling.

  5. Re:Tried It - Disappointed by neokushan · · Score: 4, Informative

    Had an SSD in my laptop for just over a year and a half now, no issues what so ever. Daily use as well.

    --
    +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
  6. Re:If SSd is nearly full? by Colonel+Korn · · Score: 4, Interesting

    But if your SSD is nearly full with data that you never change, wouldn't all the writing happen in the small area that is left? This would significantly reduce lifetime.

    I believe all the major brands actually move your data around periodically, which costs write cycles but is worth it to keep wear balanced.

    --
    "I zero-index my hamsters" - Willtor (147206)
  7. Re:If SSd is nearly full? by Anonymous Coward · · Score: 3, Informative

    Actually they thought about that. Newer SSDs have a special wear leveling algorithm: if it notices that you write some parts a lot and the remainder of the disk is static, it moves the static data into the used-up space and uses the underused (ex-static) part of the disk for the stuff that changes a lot. More or less, you can expect every cell to be used an equal number of times, even if you only ever write to one 1MB file and the rest is static.
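
    A toy simulation of that idea, as a sketch under loud assumptions: 100 blocks, 90 of them holding static data, and a made-up rule that swaps static data onto a worn block once the wear gap passes a threshold. It tracks only erase counts, not data, and is not any vendor's actual algorithm.

```python
def simulate(num_blocks=100, num_static=90, writes=20_000,
             static_leveling=True, threshold=50):
    """Return per-block erase counts after `writes` rewrites of the hot data."""
    erases = [0] * num_blocks
    static = set(range(num_static))            # blocks holding data that never changes
    hot = list(range(num_static, num_blocks))  # blocks that absorb all the writes
    for _ in range(writes):
        # Dynamic leveling: send the write to the least-worn hot block.
        target = min(hot, key=erases.__getitem__)
        erases[target] += 1
        if not static_leveling:
            continue
        coldest = min(static, key=erases.__getitem__)
        if erases[target] - erases[coldest] > threshold:
            # Static leveling: park the static data on the worn block
            # (one extra erase/program) and put the fresh block to work.
            erases[target] += 1
            static.remove(coldest)
            static.add(target)
            hot.remove(target)
            hot.append(coldest)
    return erases


without = simulate(static_leveling=False)
with_leveling = simulate(static_leveling=True)
print("no static leveling:", min(without), max(without))
print("static leveling:   ", min(with_leveling), max(with_leveling))
```

    Without the swap rule the 10 hot blocks take 2,000 erases each while the other 90 sit at zero. With it, the wear spreads across all 100 blocks at the cost of a few extra erases for moving the static data.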

  8. Life is tricky for flash by Anonymous Coward · · Score: 5, Interesting

    meaningful life specs are tough to come by for flash. Yes, as noted above, SLC NAND has a rated life of 100k erases/page on the datasheet, but that's really a guaranteed spec under all rated conditions, so in reality, it lasts quite a bit longer. If you were to write the same page once a second, you'd use it up in a bit more than a day.

    However, in real life, the "failure" criterion is when a page written with a test pattern doesn't read back as "erased" in a single readback. Simple enough, except that flash has transient read errors: that is, you can read a page, get an error, read the exact same page again and not get the error. Eventually, it does return the same thing every time, but that's longer than the "first error".

    There's also a very strong non-linear temperature dependence on life. Both in terms of cycles and just in terms of remembering the contents. Get the package above 85C and it tends to lose its contents (I realize that the typical SSD won't be hot enough that the package gets to 85C, although, consider the SSD in a ToughBook in Iraq at 45C air temp..)

    In actual life, with actual flash devices on a breadboard in the lab at "room temperature", I've cycled SLC NAND for well over a million cycles (hit it 10-20 times a second for days) without failure. This sort of behavior makes it difficult to design meaningful wear leveling (for all I know, different pages age differently) and life specs, without going to a conservative 100k/page uniform standard, which, in practice, grossly understates the actual life.

    What you really need to do is buy a couple drives and beat the heck out of them with *realistic* usage patterns.

  9. Re:100,000? (AWS?) by rgbrenner · · Score: 3, Informative

    Almost certainly MLC. SLC is really only found in industrial SSDs these days. Enterprise and consumer SSDs are all MLC, with the exception of Samsung 840, the first SSD to use TLC.

  10. Re: Tried It - Disappointed by h4rr4r · · Score: 4, Insightful

    So then you only use magnetic tape for storage?
    How long does it take to boot from that?

    I have backups, so I can always restore.

  11. Re:If SSd is nearly full? by higuita · · Score: 4, Interesting

    SSDs should run at a maximum of 75% of their capacity... 50% or less is recommended

    some chips try to move blocks to rotate the writes and have a lot of spare zones, so they can remap/use other sectors on write... but that is a problem: working on a full SSD will shorten its life

    --
    Higuita
  12. Re:What about swap? by h4rr4r · · Score: 3, Informative

    I don't expect most servers to swap at all. If your server is swapping, buy more ram. Cell phones are still ram starved enough to need to do that.

  13. Re:Tried It - Disappointed by Luckyo · · Score: 3, Insightful

    I have a very old laptop (I think I bought it circa 2004 or so, it has a Turion CPU). The display hinges failed on it, as did the cooling, so I can't play games on it anymore (discrete GPU).

    Hard drive is trucking on fine.

    Some hard drives obviously last less. However if you have a systemic problem with hard drives lasting less than two years, it's time to take a look at the factor that remains the same between these hard drives: user.

  14. Re:Number crunching != empirical evidence by bobbied · · Score: 3, Interesting

    Which is why most SSD drives implement some kind of wear leveling. They will move the often written sectors around the physical storage space in an effort to keep the wear even.

    Rotating media drives do similar things and can physically move "bad" sectors too, but this usually means you lose data. Many drives actually come from the factory with remapped sectors. You don't notice it because these sectors are already remapped on the drive onto the extra space the manufacturers build into the drive, but don't let you see.

    Reminds me of when I interviewed with Maxtor, years ago. They were telling me that the only difference between their current top of the line storage (which was like 250G at the time) and their 40 Gig OEM drive was the controller firmware configuration and the stickers. Both drives came off the same assembly line and only the final drive power up configuration and test step was different, and then only in the values configured in the controller and what stickers got put on the drive. If you had the correct software, you could easily convert the OEM drive to the bigger capacity, by writing the correct contents to the right physical location on the drive. The reason they did this was it was cheaper than having to stop and retool the production line every time an OEM wanted 10,000 cheap drives.

    I'm sure drive builders still do that sort of thing today. Set up a 3TB drive line, then just downsize the drives which are to be sold as 1TB drives.

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  15. Re:Curve or Cliff? by Luckyo · · Score: 3, Informative

    Sudden failures are controller failures. Especially budget controllers tend to fail before flash does.

    Flash failure is "usually" about not being able to write to the disk, but being able to read from the disk. Problem is that when you're getting it, that means you've gone through all the reserve flash and controller no longer has any flash to assign to use from reserve. I.e. drive has been failing for a while.

    Modern wear leveling also means that failure would likely cascade very quickly.

  16. Re: Tried It - Disappointed by gman003 · · Score: 4, Funny

    No, magnetic tape is too vulnerable to EMP. He boots from punch card.

  17. Re:Tried It - Disappointed by blueg3 · · Score: 3, Interesting

    Actually, better SSD controllers sense that a page has reached its rewrite limit. The end effect of this is that the size of the overprovisioned space gets reduced by one page. (The controller stops ever writing to the used-up page.) The write performance of the SSD degrades until it goes below a certain amount of overprovisioned space, at which point it refuses to write any more. The disk is still entirely readable, so it's a binary failure mechanism, but a pretty safe one.
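
    The retire-then-go-read-only behaviour can be sketched with a toy model. ToySSD, its tiny block counts, and the "read-only once no spares remain" rule are all assumptions for illustration; real controllers use their own thresholds.

```python
class ToySSD:
    """Hypothetical drive that retires worn blocks until the spares run out."""

    def __init__(self, user_blocks, spare_blocks, rated_cycles):
        self.user_blocks = user_blocks
        # Remaining cycles per good block; retired blocks are removed from the list.
        self.good = [rated_cycles] * (user_blocks + spare_blocks)
        self.read_only = False

    def write_block(self):
        if self.read_only:
            raise IOError("drive is read-only: spare blocks exhausted")
        # Wear leveling: write to the block with the most life left.
        i = max(range(len(self.good)), key=self.good.__getitem__)
        self.good[i] -= 1
        if self.good[i] == 0:
            del self.good[i]  # retire the block; the spare pool shrinks by one
        if len(self.good) <= self.user_blocks:
            self.read_only = True  # no spares left: reads still work, writes stop


ssd = ToySSD(user_blocks=8, spare_blocks=2, rated_cycles=5)
writes = 0
while not ssd.read_only:
    ssd.write_block()
    writes += 1
print(writes)  # prints 42
```

    With even wear, the first 40 writes retire nothing and then the two spares are used up within two more writes, so the end arrives abruptly even though the drive stays readable.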

    Gradual failure over time means either you have a crap controller or that your electronics are failing in ways other than running out of write cycles.

  18. Re: Tried It - Disappointed by Anonymous Coward · · Score: 4, Funny

    Fire susceptible.

    I've implemented a filesystem on top of OpenCV that uses a laser to read bits carved into granite slabs.
    If the laser fails, various sun alignments will allow the passive CdS sensor to take over, at a performance penalty of several years (about one IOP per year).

  19. Re:Tried It - Disappointed by AmiMoJo · · Score: 3, Informative

    My desktop Intel X25 died after 8 months due to running out of spare blocks and an ADATA drive I had in my occasional use laptop lasted about a year and a half. My two anecdotes cancel out your anecdotes.

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  20. How about some real numbers by m.dillon · · Score: 3

    So far I see a lot of complaints from people who don't appear to even know how to run SMART tools to get write cycle and wear statistics from their SSDs... you know, so real actual numbers can be posted.

    So far none of my SSDs have failed, and I have almost 20 installed in various places. The one with the most wear is one of the first SSDs I purchased, an Intel 40G device:

    da0: Fixed Direct Access SCSI-4 device
    da0: Serial Number CVGB951600AC040GGN
    da0: supports TRIM

    Power on hours - 19127
    Power cycle count - 48
    Unsafe shutdown count - 32
    Host writes x 32MiB - 375697
    Workld media wear - 5120
    Available reserved - 99/99/10
    Media wearout - 91%

    Basically 12TB worth of writes on this 40G drive over the last 2.18 years. No failures. Media wearout indicator 99 -> 91. Estimated durability based on the wear indicator is around 132TB. Roughly comes to ~3300 cycles/cell. This vintage of SSD uses MLC flash whose cells are roughly spec'd at ~10000 cycles.
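
    For anyone who wants to redo this arithmetic on their own SMART output, a rough sketch follows. The input numbers are the ones listed above; treating 91% remaining as 9% of life used, and extrapolating linearly, are assumptions of this sketch.

```python
HOST_WRITE_UNITS = 375697  # "Host writes x 32MiB" from the SMART output
UNIT_BYTES = 32 * 2**20    # 32 MiB per unit
CAPACITY_BYTES = 40e9      # 40G drive
POWER_ON_HOURS = 19127
WEAR_USED = 0.09           # assumption: 91% remaining read as 9% used


def project(write_units, wear_used, capacity_bytes=CAPACITY_BYTES):
    written = write_units * UNIT_BYTES    # bytes written so far
    durability = written / wear_used      # linear extrapolation to 100% wear
    cycles = durability / capacity_bytes  # average full-drive rewrites
    return written, durability, cycles


written, durability, cycles = project(HOST_WRITE_UNITS, WEAR_USED)
years = POWER_ON_HOURS / (24 * 365.25)
print(f"{written / 1e12:.1f} TB written in {years:.2f} years")
print(f"projected durability {durability / 1e12:.0f} TB, about {cycles:.0f} cycles/cell")
```

    The unrounded byte count gives a slightly higher projection than the figures quoted above; starting from a rounded 12TB instead gives about 133TB and roughly 3,300 cycles.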

    While firmware issues are well documented for various SSD vendors over the last few years, and cell erase cycle life has gone down as the chips have gotten more dense, I would still expect the vast majority of failures to be due to wear-out.

    Lots of things can cause premature wear-out but probably the most common would be using the SSD for something really stupid, like to host a database doing a lot of random writes or with a high frequency of fsync()s, using the SSD for swap on a system which is paging heavily 24x7, using the SSD for WWW log files on a busy web server, formatting an unaligned filesystem on the SSD or a filesystem which uses too-small a block size, and any number of other things.

    Venerable but still mostly correct:

    http://leaf.dragonflybsd.org/cgi/web-man?command=swapcache

    The only adjustment I would make is that as the Intel 40G continues running, the wear I'm getting on it is pointing closer to ~130TB of durability and not 400TB (400TB is the theoretical max at 10,000 cycles/cell). Still reasonable. Generally speaking, that's the older 34nm technology. The newer 24nm technology will get fewer cycles but devices tend to have more storage so, as I say in the manual, you could expect similar total wear out of a newer 120GB 310 series SSD whose flash cells have 1/3 the cycle life.

    -Matt