Consumer-Grade SSDs Survive Two Petabytes of Writes
crookedvulture writes The SSD Endurance Experiment previously covered on Slashdot has reached another big milestone: two freaking petabytes of writes. That's an astounding total for consumer-grade drives rated to survive no more than a few hundred terabytes. Only two of the initial six subjects made it to 2PB. The Kingston HyperX 3K, Intel 335 Series, and Samsung 840 Series expired on the road to 1PB, while the Corsair Neutron GTX faltered at 1.2PB. The Samsung 840 Pro continues despite logging thousands of reallocated sectors. It has remained completely error-free throughout the experiment, unlike a second HyperX, which has suffered a couple of uncorrectable errors. The second HyperX is mostly intact otherwise, though its built-in compression tech has reduced the 2PB of host writes to just 1.4PB of flash writes. Even accounting for compression, the flash in the second HyperX has proven to be far more robust than in the first. That difference highlights the impact normal manufacturing variances can have on flash wear. It also illustrates why the experiment's sample size is too small to draw definitive conclusions about the durability of specific models. However, the fact that all the drives far exceeded their endurance specifications bodes well for the endurance of consumer-grade SSDs in general.
Just out of curiosity, how well do traditional HDDs fare in comparison?
No, I think it means that the first ones were over-engineered, and the next generation will meet their stated MTBF number to within 1 standard deviation.
Mission: To provide products that consume time and energy as entertainingly as permitted by the laws of thermodynamics.
Most hard drives I see in consumer and business use write far less than that over their lifetimes. I have a customer's hard drive I am copying data from currently. It has 15,147 power-on hours and has only written 1.3TB of data. It's very uncommon to see drives with over 6TB of data written (in the 500GB to 1TB drive range).
The other client SSD in my computer is a Samsung 830 256GB SSD that I just migrated to a 1TB SSD for a customer. It was used for about a year and a half before they needed a bigger drive. They used Outlook, a number of Autocad applications, lots of project files, and a good-sized collection of work-related photos. The drive has 995GB of writes and is showing no SMART issues.
Average computer users have nothing to worry about when it comes to wearing an SSD out. Power users might have a problem depending on the nature of their work, but they also get the most benefit from high write speeds and IOPS. Servers, depending on their usage patterns, could have a problem; I certainly recommend the enterprise-style drives that reserve a much larger amount of space.
Great, so now we just need to fix the sudden random failures where the drive completely fails but it is 6 months old and showed no signs of degradation. A coworker of mine just had that happen with a Crucial SSD.
Unfortunately these tests don't say much about the drives you can buy NOW, and write endurance in consumer drives is probably getting worse as geometry shrinks and relentless price pressure causes corners to be cut. It's good that the Samsung 840 Pro is holding up so well (its predecessor the 830 was also ridiculously durable) but it's now replaced by the 850 Pro which uses radical new technology (stacked chips). The Intel 320 was also very durable so the failure of the 335 doesn't bode very well for the idea that newer models should hold up better than older ones.
Write wear isn't everything anyway. Another thing to test is whether the drive can brick if the power fails while the drive is writing. Better drives have capacitors to deal with this event. Consumer drives lack them and can lose data or fail unrecoverably.
Because the brand is Crucial
I write a lot more to my SSDs than most do because of lots of application installs, playing with audio, etc. 6TB to date, and the drive was purchased about 20 months ago. OK, well, assuming I maintain that rate of writing (3.6TB/year) it would be 13 years before I'd hit 50 TB of writes, on a 512GB drive which can probably take 1PB or more.
Even if you hit it harder than the norm, you still don't hit it that hard. It really has to be used for something like database access or a file server or the like before endurance becomes an issue.
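For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic. The function name is made up for illustration, and it assumes the write rate stays constant and that 1PB = 1000TB. With the figures above it gives about 12 years from today to reach 50TB (about 14 counted from the purchase date), the same ballpark as the 13 quoted.

```python
def years_until(target_tb, written_tb, months_elapsed):
    """Years from now until total writes reach target_tb.

    Assumes the observed average write rate continues unchanged.
    """
    rate_tb_per_year = written_tb / (months_elapsed / 12)
    return (target_tb - written_tb) / rate_tb_per_year


if __name__ == "__main__":
    # Figures from the comment: 6 TB written over 20 months.
    print(f"rate: {6 / (20 / 12):.1f} TB/year")
    print(f"years to 50 TB: {years_until(50, 6, 20):.1f}")
    # 1 PB taken as 1000 TB (an assumption about units).
    print(f"years to 1 PB:  {years_until(1000, 6, 20):.0f}")
```

At that rate the 1PB mark is centuries away, which is the point: the drive will be obsolete long before the flash wears out.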
I think this has been a fantastic experiment, but do you still have any criticism regarding their test methodologies? Can we trust the results? For example, would we get different results if we leave the same data sitting on the drives for a longer time? Anything else that they are possibly not taking into account?
I think the idea is neat, but nothing meaningful can be said by sampling _one_ of each drive.
Moreover, from what I understand about flash, the more writes you make to a cell, the more quickly those bits tend to rot when left alone.
So being able to overwrite again and again and again isn't particularly important if those worn cells would just forget their contents over a few hours, days, weeks, etc.
I'd much rather have a drive that can take a moderate write load and hold on to my data than an Alzheimer's disk that can take endless information in for short periods of time, but would forget it an hour later.
From my experience, most SSD failures come from dead controllers and not wear. Or bad firmware; I'm looking at you, Crucial, and your 5000 hour bug. Also your weird incompatibilities on your MX100 series.
Yes but there is an inverse relationship between number of writes and data retention. Try reading the data in a few months ... then you'll see the errors!
Where do they get Intel 335 Series? It's no longer manufactured. Couldn't they find something more up to date?
The test is meaningless unless it reads back the data and verifies it with a checksum *after* the internal buffer no longer has the written data. E.g., write the whole drive once, read it back once, and see if the checksum matches for all the bytes the drive can fit.
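A minimal sketch of that kind of write-then-verify pass, for illustration only. It works on an ordinary file rather than a raw device, the function names and the 1 MiB chunk size are my own choices, and it is not what the experiment's tool (Anvil) does. Note the big caveat: `os.fsync` pushes the data to the device, but the OS page cache can still serve the read-back, so a real test would need to write more data than fits in RAM or bypass the cache.

```python
import hashlib
import os
import tempfile

CHUNK = 1024 * 1024  # 1 MiB per write; arbitrary illustrative size


def write_random(path, total_bytes):
    """Write total_bytes of random data to path; return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(CHUNK, remaining))
            f.write(block)
            digest.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return digest.hexdigest()


def read_digest(path):
    """Read the file back in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_pass(path, total_bytes):
    """One write-then-read-back pass; True if the checksums match."""
    return write_random(path, total_bytes) == read_digest(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "pattern.bin")
        print("match" if verify_pass(target, 4 * CHUNK) else "MISMATCH")
```

To test retention rather than just immediate read-back, you would keep the digest from `write_random`, leave the drive alone for weeks or months, and only then call `read_digest`.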
Unfortunately, their tests are pretty much pointless. To quote their original article, "Anvil's endurance test writes files sequentially".
The problem with most consumer flash drives (at least, cMLC drives) is not primarily one of flash endurance, but one of write amplification that causes the endurance of the drive to be significantly less than the underlying flash.
Write amplification occurs when data is overwritten, especially when the drive is nearer to full. Sequential writes will never cause this to occur, so the endurance levels that they are seeing in these tests are basically pointless for all but a very, very limited use case.
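To put numbers on it: the write amplification factor is just flash writes divided by host writes. A quick sketch (function names are mine). The first figure comes from the summary, where the second HyperX turned 2PB of host writes into 1.4PB of flash writes. The factor of 3 in the second example is purely hypothetical, picked to show how a random-write workload would eat into the same flash budget.

```python
def write_amplification(flash_writes, host_writes):
    """Write amplification factor: bytes written to flash per byte written by the host."""
    return flash_writes / host_writes


def host_endurance(flash_endurance, waf):
    """Host writes a drive can absorb if its flash survives flash_endurance at a given WAF."""
    return flash_endurance / waf


if __name__ == "__main__":
    # From the summary: 2 PB of host writes became 1.4 PB of flash writes.
    waf = write_amplification(1.4, 2.0)
    print(f"observed WAF: {waf:.2f}")  # below 1 because of compression

    # Hypothetical: same 1.4 PB of flash wear, but a workload with WAF = 3.
    print(f"host writes at WAF 3: {host_endurance(1.4, 3.0):.2f} PB")
```

So a factor below 1 flatters the drive, and anything above 1 shrinks the host-visible endurance in direct proportion.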
How many units of each type did they test? Because if they only tested one of each, you cannot make any assumptions and can't even call it a good test. I've seen so many different results with the same HDDs. Or even with SSDs, where my SSD died after a couple of months, but my colleague's is still plowing away after 2 years (SSDs from the same batch).
A few years back I ran my own test. I had an unused 16 MB Canon SD card that came with one of my digital cameras (I bought a much larger one with the camera). Since it was unused I decided to see how long it would last. I wrote a script that repeatedly overwrote the entire card with one of several files of random data, then checked it against the original. Each cycle of overwriting, reading, and verifying the card took about 17 seconds. I had my first error after 120K writes. After that I got errors every 20K to 60K writes. Someone suggested I reformat the card, and afterward it came out 114K smaller, so I guess it marked some cells as bad. After this it went the longest stretch without an error, from write 1.9M to 2.5M. From this test one might conclude that there are a small number of frail cells that fail early on, and that the rest are more robust and just keep going.
Of course this is the case: the entire purpose and design of the SSD controller that surrounds the actual Flash memory is to avoid writing/erasing the Flash chips precisely because Flash has limited endurance cycles. So instead the SSD controller operates as a cache and optimizer to achieve this.
Claiming "peta-writes" doesn't mean endurance - it means successful avoidance of endurance!!!