Slashdot Mirror


Endurance Experiment Writes One Petabyte To Six Consumer SSDs

crookedvulture (1866146) writes "Last year, we kicked off an SSD endurance experiment to see how much data could be written to six consumer drives. One petabyte later, half of them are still going. Their performance hasn't really suffered, either. The casualties slowed down a little toward the very end, and they died in different ways. The Intel 335 Series and Kingston HyperX 3K provided plenty of warning of their imminent demise, though both still ended up completely unresponsive at the very end. The Samsung 840 Series, which uses more fragile TLC NAND, perished unexpectedly. It also suffered a rash of cell failures and multiple bouts of uncorrectable errors during its life. While the sample size is far too small to draw any definitive conclusions, all six SSDs exceeded their rated lifespans by hundreds of terabytes. The fact that all of them wrote over 700TB is a testament to the endurance of modern SSDs."

164 comments

  1. Re:Sigh. by MasterOfGoingFaster · · Score: 2

    Yes, they are sooo reliable, every single SSD I've bought has been dead within 3 months.

    Odd - I've got 5 and all are well. 1 Intel, 2 Samsung and 1 Critical. I guess I'm lucky and you are not.

    --
    Place nail here >+
  2. Re:Sigh. by Anonymous Coward · · Score: 0

    I've had 150 of them, and all of them are half dead.

  3. context by pezpunk · · Score: 1

    has anyone tried this with platter drives? would it simply take too long?

    it's hard for me to judge whether this is more or less data than a platter drive will typically write in its lifespan. I feel like it's probably a lot more than the average drive processes in its lifetime. and anyway, platter drive failure might be more a function of total time spent spinning or seeking or simply time spent existing for all I know.

    --
    i could live a little longer in this prison
    1. Re:context by travisco_nabisco · · Score: 1

      I am sure someone has done it with platter drives; however, it would take substantially longer to reach the same transfer quantities, as SSDs have much higher transfer rates than the spinny drives.

    2. Re:context by thesupraman · · Score: 2

      Why? The failure modes are completely different (and yes there are quite a few reports around on this subject..)

      SSDs have a write capacity limitation due to write/erase cycle limitations (they also have serious long term data retention issues).
      Mechanical drives tend to be more limited by seek actuations, head reloads, etc. The surfaces don't really have a problem with write/erase cycles.

      Neither are particularly good for long term storage at today's densities. Tape is MUCH better.

    3. Re:context by ShanghaiBill · · Score: 3, Informative

      has anyone tried this with platter drives?

      A few years ago, Google published a study of hard disk failures. Failures were not correlated with how much data was written or read. Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use. Failures were negatively correlated with temperature: drives kept cooler were MORE likely to fail.

    4. Re:context by pezpunk · · Score: 2

      the problem with tape is by the time you can retrieve the data you're interested in, it no longer matters.

      --
      i could live a little longer in this prison
    5. Re:context by afidel · · Score: 3, Interesting

      Not that much higher for streaming reads and writes, the new Seagate 6TB can do 220MB/s @128KB streaming reads or writes. That works out to ~19TB/day so it would only take around 2 months to hit 1PB.
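      As a rough check of those numbers, here is a minimal Python sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 PB = 1000 TB) and a drive that sustains its full streaming rate around the clock, which a real drive would not. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(mb_per_second):
    """Terabytes written per day at a sustained rate given in MB/s (decimal units)."""
    return mb_per_second * 1e6 * SECONDS_PER_DAY / 1e12


def days_to_write(total_tb, mb_per_second):
    """Days of nonstop writing needed to reach total_tb terabytes."""
    return total_tb / tb_per_day(mb_per_second)


if __name__ == "__main__":
    rate = 220  # MB/s, the streaming figure quoted above
    print(f"{tb_per_day(rate):.1f} TB/day")
    print(f"{days_to_write(1000, rate):.0f} days to reach 1 PB")
```

      Under those assumptions it comes to about 19 TB per day and roughly 53 days for a petabyte, in line with the "around 2 months" estimate.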

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    6. Re:context by fnj · · Score: 0

      Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use.

      That makes no logical sense unless the statement is missing a "not" somewhere, or unless you WANT failures.

    7. Re:context by fuzzyfuzzyfungus · · Score: 1

      I suspect that direct comparisons are tricky: magnetic platter surfaces should, at least in theory, have virtually infinite read and erase capability; but every mechanical part dies a little when called on to move (and if the lubricants are a problem, when not called on to move for too long).

      With SSDs, we know that the NAND dies a bit every time it is erased and rewritten; sometimes after surprisingly few cycles with contemporary high density MLC NAND; but the supporting solid state stuff should last longer than the person who owns the drive, barring firmware bugs or severe shoddiness.

    8. Re:context by viperidaenz · · Score: 3, Informative

      While ShanghaiBill apparently struggles with the English language, the phrase "you should idle a drive not in active use" means the drive will spin up fewer times. You should disable spin down and leave the drive idling, not on standby.
      You'll reduce the number of head load/unloads.
      You'll reduce peak current consumption of the spindle motor.
      The drive will stay at a more stable temperature.

    9. Re:context by compro01 · · Score: 2

      Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use.

      That makes no logical sense unless the statement is missing a "not" somewhere, or unless you WANT failures.

      You're reading the sentence wrong. You're reading it as "Times the disk was spun up".

      What they mean is the total amount of time the disk has spent spinning over its lifetime.

      --
      upon the advice of my lawyer, i have no sig at this time
    10. Re:context by ShanghaiBill · · Score: 1

      What they mean is the total amount of time the disk has spent spinning over its lifetime.

      Yes, this is correct. It is the total amount of time spent spinning that you want to minimize, not the number of "spin-up/spin-down" cycles. The longer the disk spins, the more wear on the bearings.

    11. Re:context by LordLimecat · · Score: 3, Informative

      Tape actually has pretty high transfer rates. Its seek times are what sucks, but if you're doing a dump of tape you aren't doing any seeking at all.

    12. Re:context by dshk · · Score: 1

      I regularly do restores from an LTO-3 drive, and the whole process takes no more than 5 minutes. If your data is useless after 5 minutes, then it is indeed unnecessary to back it up, not to mention archiving it.

    13. Re:context by dgatwood · · Score: 4, Interesting

      That's curious. Almost all of the drive failures I've seen can be attributed to head damage from repeated parking prior to spin-down, whereas all the drives that I've kept spinning continuously have kept working essentially forever. And drives left spun down too long had a tendency to refuse to spin up.

      I've had exactly one drive that had problems from spinning too much, and that was just an acoustic failure (I had the drive replaced because it was too darn noisy). With that said, that was an older, pre-fluid-bearing drive. I've never experienced even a partial bearing failure with newer drives.

      It seems odd that their conclusions recommended precisely the opposite of what I've seen work in practice. I realize that the plural of anecdote is not data, and that my sample size is much smaller than Google's sample size, so it is possible that the failures I've seen are a fluke, but the differences are so striking that it leads me to suspect other differences. For example, Google might be using enterprise-class drives that lack a park ramp....

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    14. Re:context by timeOday · · Score: 4, Informative
      But contiguous writes is the absolute (and unrealistic) best case in terms of MB transferred before failure for an HDD, because it minimizes the number of revolutions and seeks per megabyte written. For whatever it's worth, it used to be said that "enterprise grade" drives were designed to withstand constant seeking associated with accesses from multiple processes, instead of fewer seeks associated with sporadic, single-user access.

      If seeking does wear a drive, then using an SSD for files that generate lots of seeks will not only greatly speed up the computer, but also extend the life of HDDs relegated to storing big files.

    15. Re:context by Anonymous Coward · · Score: 0

      For the cost of a single tape drive I could buy many, at least ten or twenty, 4TB hard drives for backups.

    16. Re:context by larryjoe · · Score: 2

      A few years ago, Google published a study of hard disk failures. Failures were not correlated with how much data was written or read. Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use. Failures were negatively correlated with temperature: drives kept cooler were MORE likely to fail.

      Actually the paper says that the Google guys approximated power-on hours with a notion of age, which I assume was approximated by a knowledge of either the manufacture date or the delivery date. From the paper, annualized failure rate (AFR) is somewhat correlated with age, but not necessarily strongly enough to predict probability of failure. Even with their large drive population, the paper points out that the drive model mix is not consistent over time and therefore, not much can be made of the apparently weak correlation between AFR and age, which could perhaps be more greatly influenced by drive model.

      The negative correlation with very cold temperatures is interesting but hard to understand without further analysis. Perhaps some drive models didn't handle fly height adjustments well at low temperatures. It's hard to figure out without more data. It should also be pointed out the temperatures were obtained via SMART, and the SMART standard doesn't mandate how temperature is reported. So, different manufacturers could report temperatures in different ways, i.e., different locations (which can easily vary by up to 30 degrees C), different aggregation methods (time windows, sampling frequency), etc. So, the aggregate data is probably not as useful as the data per drive model.

    17. Re:context by _Shad0w_ · · Score: 1

      The main difference is LTO tapes (and similar) are actually designed so they can be used for archival storage (in the region of 30 years). Hard drives just aren't. If you can get a drive that's been sat in storage - no matter how good - for 20 years to spin up then you're very lucky.

      --

      Yeah, I had a sig once; I got bored of it.

    18. Re:context by Anonymous Coward · · Score: 0

      Why? The failure modes are completely different (and yes there are quite a few reports around on this subject..)

      A link would be nice. From what I recall back in the 100MB disk days when the spinning platter disks didn't have wear leveling it was very easy to wear out often used blocks of the platter. While the mechanics behind the wear is completely different the wear was still very noticeable and I went through a few disks even without using swap.
      As it is anecdotal and human memory is unreliable I'd like to see some reports that actually cover the subject in detail.
      Well, it's not that important. I don't consider going back to spinning platter but it could be interesting nonetheless.

    19. Re:context by Ed+Avis · · Score: 1

      Right, but while you may well need to archive data for 30 years, that doesn't mean you need to store a particular physical tape or disk for that long. It would make more sense to store the volume for five years and then transfer the data to a new one, to take advantage of capacity increases. Your warehouse full of tapes from 30 years ago might fit in a desk drawer now. So if I wanted to back up to hard disks, I'd keep a pool of them and replace the oldest disk every year or so. Admittedly software support for this is not great - RAID implementations don't always support cobbling together a random mixture of disk sizes which change over time.

      --
      -- Ed Avis ed@membled.com
    20. Re:context by Anonymous Coward · · Score: 0

      Modern hard drives get the opposite. They experience virtually zero wear while running, except the seek arm. Writing to a sector increases the longevity of that sector rather than decreasing it.

    21. Re:context by tlhIngan · · Score: 2

      That's curious. Almost all of the drive failures I've seen can be attributed to head damage from repeated parking prior to spin-down, whereas all the drives that I've kept spinning continuously have kept working essentially forever. And drives left spun down too long had a tendency to refuse to spin up.

      The problem is that there are two ways for the drive to park the heads. (FYI - ALL spinning rust drives these days park the heads on power down). One of them is more violent than the other.

      There is the normal ATA spin down command which the OS issues to stop the drive, which causes a nice orderly movement of the heads to the parking area (or unloads the heads). On a drive datasheet, I saw they were rated for about 50,000 cycles of this.

      Then there's the emergency poweroff park, which uses the rotational momentum of the platters to provide power to the voice coils that slam the heads against the parking area (basically the power is funnelled straight into the voice coils). It needs to move ASAP as the platters are slowing down and the air cushion that keeps the heads from hitting the platters is dissipating the slower the platters move. So by dumping the back EMF into the voice coil, the heads are forced into the parking area while there's still enough movement to keep the heads away from the platter. This is so much more violent on the heads that a drive can easily be rated for 10,000 cycles of this or so (under "emergency park").

      Modern OSes typically send a spindown command prior to shutdown because it allows an orderly flushing of the caches into non-volatile storage, the heads can seek to the parking area in a controlled fashion and then the platters can spin down without worrying that the heads may contact the platters.

      You can easily tell which is the case - a normal spindown is very quiet (in a quiet room you can hear it), while an emergency park is heard by a loud clunk from the drive followed by a dying whine.

      Of course, there are bugs, and some OSes unload/load the heads so excessively that they could easily exceed the 50,000 number in the span of months (I think one distribution of Linux suffered from this due to a BIOS-Linux interaction).
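      To put the "span of months" in perspective, here is a back-of-the-envelope Python sketch. The idle-timer intervals are hypothetical, the drive is assumed to park at that interval around the clock, and a cycle rating is not a hard failure point, so treat the output as an order-of-magnitude guide only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_rated_cycles(rated_cycles, seconds_between_unloads):
    """Days until a drive parked every N seconds uses up its rated load/unload cycles."""
    return rated_cycles * seconds_between_unloads / SECONDS_PER_DAY


if __name__ == "__main__":
    rated = 50_000  # normal park rating quoted above
    for interval in (60, 600, 3600):  # hypothetical idle timers, in seconds
        days = days_until_rated_cycles(rated, interval)
        print(f"unload every {interval:>4} s -> {days:.0f} days")
```

      With a one-minute timer the rating is used up in about 35 days; at ten minutes it takes just under a year.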

      As for those complaining about the unusual nature of the test - well, it's stressing the weakest aspect of each storage medium. Reading/writing massive amounts of data doesn't really impact the drive longevity, but mechanical motion does - keeping a drive spinning wears the bearings, while spinning up and down wears the motor, and emergency parks put extreme stress on the entire mechanical arm. On an SSD, none of those really do anything - but writing massive amounts of data DOES wear it out.

      So the tests aren't directly comparable, but they address the weakest part of each storage medium - spinning rust wears mechanically, while SSDs wear electronically.

      The other thing SSDs can suffer from is the emergency power off - because to achieve the speed they cache the entire FTL tables in RAM and then lazy-sync it to the storage medium. On power down, they need to flush the dirty changes to media. Some SSDs use capacitor banks to do this, others use journalled writes to allow safe updates.

      Either way, FTL table corruption is the #1 reason why SSDs die today - rarely do they actually die from the media actually wearing out. Luckily, an ATA SECURITY_ERASE command fixes that (in most cases) since it reinitializes the tables but generally keeps all the wear indicators as they were. And this happens almost always on powerdown.

    22. Re:context by Anonymous Coward · · Score: 0

      For the cost of a single tape drive I could buy many, at least ten or twenty, 4TB hard drives for backups.

      Have fun taking those twenty drives out of a safe every two months and connecting them up just to give them a spin.

    23. Re:context by BronsCon · · Score: 1

      Grab your soldering gun, we're gonna have some fun! There's no reason you can't connect them to a power supply that spins each drive up for a minute or so every week (one at a time, so the PSU doesn't even need to be beefy). Hell, it could monitor starting/running current and light up an LED next to any drives that show a sudden increase in power draw, so you know that drive might not spin up next week (e.g. so you can move the data to a new disk).

      What would be involved in continually verifying the viability of a warehouse full of tapes and mark failing tapes for replacement before they become unreadable?

      --
      APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
    24. Re:context by lsatenstein · · Score: 1

      But contiguous writes is the absolute (and unrealistic) best case in terms of MB transferred before failure for an HDD, because it minimizes the number of revolutions and seeks per megabyte written. For whatever it's worth, it used to be said that "enterprise grade" drives were designed to withstand constant seeking associated with accesses from multiple processes, instead of fewer seeks associated with sporadic, single-user access.

      If seeking does wear a drive, then using an SSD for files that generate lots of seeks will not only greatly speed up the computer, but also extend the life of HDDs relegated to storing big files.

      In regard to mixing SSDs and HDDs, there are some great caching programs that allow a single SSD to act as a front end to several HDDs. The driver looks at the traffic coming through, and if it is "mainly sequential writes", bypasses the SSD to write direct to the disk. For random stuff, across the x HDD drives, the SSD acts as a cache. The percentage of Sequential to Random is selectable.

      Very recently I purchased my first SSD at about $0.48 per gigabyte (128gig for $59.00). I expect that next year I should be seeing $0.35 per gig, to where I expect by next year, we will have excellent terabyte SSDs for $50.00. Can someone confirm that in SSD size measurement, we are back to 1k=1024 and 1meg=1024x1000, etc.?

      --
      Leslie Satenstein Montreal Quebec Canada
    25. Re:context by Anonymous Coward · · Score: 0

      There is plenty of evidence from what I've seen that leaving disks spinning 24x7 (with adequate cooling) has a significant effect on drive life. If you read that Google White Paper and keep your drives in the 25-40C range and leave them spinning all the time you'll see exceptionally low failure rates. No clue why it matters, I just know that it does. Some people have claimed less than 5% of drives failing over 3+ years of continuous use.

      There's lots of prevailing theories as to why this is the case, and lots of discussion regarding it being the thermal cycles, the motor's design, firmware that sucks, etc.

      I don't know what the reason is, but I leave my system on 24x7 unless I plan to be going out of town for more than a week or two and I know I won't be accessing my server while gone (for example, if on a cruise).

      Posting anonymously to ensure I don't endorse a storage product shamefully (my username would link to a particular project).

    26. Re:context by Anonymous Coward · · Score: 0

      The failure mechanisms between SSD and HDD are very different (my company makes test systems to predict the use lifetimes of the raw devices/elements of both SSD and HDD - our customers are all the "big names" you know in these industries).

      SSDs (technically the Flash memory they are based on) fail as a function of write/erase cycles. Secondary are conventional CMOS failure mechanisms like BTI which are "power duty cycle" dependent (simply having power applied causes aging). SSD circuitry can be summarized as "circuits designed to avoid Flash memory write/erase cycles at any cost".

      HDDs fail as a function of time operating (strictly a dependent probability of operating time and probability of mechanical shock) and power cycling rate (transitioning from park to active to park) - actual head or media degradation otherwise is not significant in terms of quantity of data written/erased in contrast to SSD/Flash. HDD circuitry can be summarized as "circuits designed to avoid mechanical impact between head and media".

      This means usable lifetime very much depends on how you operate each technology. Similarly the "ideal usage for maximum life" follows from this as well: HDDs are best for static use (e.g. in data centers) and for ultrahigh capacity (again data centers) because that matches the best usage model AND aligns with the economics of price/performance which HDD will always have an advantage in because of the economies of scale natural to data centers. SSD has the sweet-spot for portable usage both in terms of usage for lifetime but also in terms of price/performance when you have shock hazards. SSD is closer to the "end of the line" for scaling than HDDs but there are still some games in the bag but not quite as many. HDDs have more scaling future than SSDs.

    27. Re:context by Anonymous Coward · · Score: 0

      Why would I buy twenty drives at once? You have some pretty stupid ideas.

    28. Re:context by Vastad · · Score: 1

      Is there any way to tell a WD Caviar Black drive to behave this way? Mine automatically spins down after 30 minutes of inactivity I believe.

    29. Re:context by viperidaenz · · Score: 1

      It should be under OS control, not the drive.

    30. Re:context by Vastad · · Score: 1

      Thanks for the reply. I'll go do a little research on that.

    31. Re:context by Anonymous Coward · · Score: 0

      Seagate suggests that their enterprise platter drives can take 550TB of writes (presumably) per year, while their consumer drives only take about 50.

    32. Re:context by Anonymous Coward · · Score: 0

      I'd be very interested to know what this "great caching program" is. Also is it free?

    33. Re:context by dgatwood · · Score: 1

      I've never had a drive that did emergency parking until my HD-based MacBook Pro. All my dead drives were too dumb to have the needed sensors, as were the machines that they were in.

      With that said, I'm terrified at the aggressiveness with which that MacBook Pro parks its heads. I literally can't pick the thing up and place it gently on my bed without the heads doing an emergency park. I don't have a lot of faith in that drive lasting very long. Non-emergency parking is hard enough on the heads. Emergency parking is downright bad.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

  4. Rated Lifespan by Anonymous Coward · · Score: 0

    "all six SSDs exceeded their rated lifespans by hundreds of terabytes" - Interesting and probably relevant data, but doesn't the "rated lifespan" include retaining the data for at least one year after the last write is performed?

    1. Re:Rated Lifespan by AK+Marc · · Score: 1

      Can you link to that claim, or did you make it up?

    2. Re:Rated Lifespan by dc_gap · · Score: 1

      Did not make it up, or at least not on purpose. Link: http://www.jedec.org/sites/def... Go to page 24 "Endurance Rating" and you see the last requirement is: "4) the SSD retains data with power off for the required time for its application class." Then go down to page 25 and you will see that the above "required time for its application class" for a "client class" device is 1yr. This is consistent with many NAND device datasheets that I've been dealing with. It is common to spec 10 years min power off data retention when new and 1 year when they've reached their max write rating.

    3. Re:Rated Lifespan by AK+Marc · · Score: 1

      Though, I'd note that as far as we know, all would pass that. None were tested 1 year after max write rating, but they were all run past max write rating, in an attempt to make them fail. Did anyone stop writing at the max write rating and wait a year?

  5. Intel - weird failure mode. by Anonymous Coward · · Score: 0
    TFA:

    After a reboot, the SSD disappeared completely from the Intel software. It was still detected by the storage driver, but only as an inaccessible, 0GB SATA device.

    According to Intel, this end-of-life behavior generally matches what's supposed to happen. The write errors suggest the 335 Series had entered read-only mode. When the power is cycled in this state, a sort of self-destruct mechanism is triggered, rendering the drive unresponsive. Intel really doesn't want its client SSDs to be used after the flash has exceeded its lifetime spec.

    *blink*

    Nice that the MWI provided advanced warning, but the actual behavior when it ran out seems to be the opposite of what's supposed to happen: the drive should be readable but not writable.

    I had an X-25M that failed in similar fashion; although it had an MWI of 100% when it died and had barely seen its first couple of terabytes of writing, it was in a situation where there would have been heavy write amplification on whatever space it had left. When it died, applications fell over, and it showed up as an 8MB drive on powerup. 100% data loss. I should probably pull the chips off it and dump them - it was one of the pre-encryption drives - just to see if I can get anything back.

    1. Re:Intel - weird failure mode. by marcomarrero · · Score: 2

      The 8MB problem is an Intel firmware bug (older, non-Sandforce controllers). If you don't care about your data, ATA "security erase" can make it usable again. I think I used the DOS-based hdderase, and after a few problems it went through. Intel's DOS-based flash idiotically ignores the SSD because it identifies itself as "BAD_CTX"...

    2. Re:Intel - weird failure mode. by viperidaenz · · Score: 1

      but the actual behavior when it ran out seems to be exactly what's supposed to happen

      FTFY
      When a flash cell fails, it can no longer hold the charge that stores the bit.
      It will always be read as if it had no charge, therefore read checksums will fail and the drive is unreadable.

    3. Re:Intel - weird failure mode. by Anonymous Coward · · Score: 0

      The 8MB problem is an Intel firmware bug (older, non-Sandforce controllers). If you don't care about your data, ATA "security erase" can make it usable again. I think I used the DOS-based hdderase, and after a few problems it went through. Intel's DOS-based flash idiotically ignores the SSD because it identifies itself as "BAD_CTX"...

      Was there ever a published recovery procedure?

      You made me look around for the first time in a couple of years, and I see some commercial data recovery service claims (ugh, what a hideous industry, with a de facto policy of hiding anything useful for fear of profit margins, and pretending that they do so in order to "protect" users from the real risk of data loss "trying" to do it themselves), but nothing of interest. If anyone can direct me to a solution I'm not averse, having rescued a few Barracudas from that firmware bug a few years ago, to a soldering iron. I don't actually need the data from the drive, I'm just curious as to how to rebuild or work around a failed FTL (flash translation layer).

    4. Re: Intel - weird failure mode. by Anonymous Coward · · Score: 0

      For that block, sure. Doesn't mean every other block becomes unreadable simultaneously.

    5. Re: Intel - weird failure mode. by Anonymous Coward · · Score: 0

      Unless the unreadable sector is in a critical metadata block. At which point, when the checksum fails, the drive is bricked.

      This has been the failure mode of every dead SSD I've come across. You get some sort of write error or uncorrectable read error and the OS/driver/SATA controller hangs. Reboot. Drive fails to initialize.

      I suspect that it is a stochastic process, once the flash starts developing uncorrectable errors. If they're in user data, OK. Once one hits core metadata, boom!

  6. Re:Sigh. by ArcadeMan · · Score: 3, Funny

    Rejoice then, you still have 75 SSDs!

  7. Re:Sigh. by pezpunk · · Score: 2

    hey thanks for sharing your anecdotal experience as if it carries any weight whatsoever compared to actual controlled experiments and statistics.

    for comparison, I've owned 8 and no failures yet. I have a raid0 array of SSDs upstairs that has been working flawlessly since 2008. an aberration maybe. anecdotal evidence works like that.

    --
    i could live a little longer in this prison
  8. Re:Sigh. by pezpunk · · Score: 5, Insightful

    that reminds me ... I should do a backup ....

    --
    i could live a little longer in this prison
  9. Re: And the winners are... by Anonymous Coward · · Score: 1

    100TB a day? Roughly 1.2GB per second? No. No you won't.

  10. Re:And the winners are... by travisco_nabisco · · Score: 4, Insightful

    Good luck with that. This experiment has been running since Aug 20, 2013 and running almost continuously at that. Even the heaviest consumer/prosumer work load would have trouble reaching the amount of data written in this experiment.

  11. Re:Sigh. by AK+Marc · · Score: 1

    Stop storing them in the oven...

  12. Re:And the winners are... by ShanghaiBill · · Score: 1

    By the way, 700TB isn't all that much these days. Betcha I could do it in a week's worth of video editing.

    I'll take that bet. Most SSDs have physical bandwidths of less than 1GB/sec. So even if you were writing continuously, without sleep or bathroom breaks, and reading nothing back, you would still need more than a week to write that much data.

  13. Re:And the winners are... by jcochran · · Score: 4, Informative

    You might want to do a bit of math before making such a statement. 700TB is a very large amount of data. And in order to do that in a week, would require quite a bit of data transfer bandwidth. To wit:

    700,000,000,000,000 / 7 days = 100,000,000,000,000 / 24 hours = 4,166,666,666,666 / 3600 seconds = 1,157,407,407 bytes per second.

    Do you really write 1.157GB/second every second for a week? And if so, what data interface are you using? I'd really like to know since SATA 3.0 can only handle 600MB/second. Perhaps you're using SATA 3.2 which does have the required speed?

    Now in an environment using multiple drives, you can get to the 700TB mark much more rapidly with much lower per drive bandwidth. But then again, that's not the test criteria. They are testing how much endurance individual SSDs have.
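    The division above is easy to re-run. A minimal Python sketch, assuming decimal terabytes (1 TB = 10^12 bytes) and taking the 600MB/second SATA figure as a sustained ceiling; the helper names are just for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY


def required_bytes_per_second(total_bytes, seconds):
    """Sustained write rate needed to move total_bytes within the given time."""
    return total_bytes / seconds


def days_at_rate(total_bytes, bytes_per_second):
    """Days of nonstop writing needed at a fixed rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    total = 700 * 10**12  # 700 TB
    rate = required_bytes_per_second(total, SECONDS_PER_WEEK)
    print(f"{rate:,.0f} bytes/s needed to finish in a week")
    print(f"{days_at_rate(total, 600 * 10**6):.1f} days at 600 MB/s")
```

    It gives the same 1,157,407,407 bytes per second, and shows that even a saturated 600MB/second link would need about 13.5 days of uninterrupted writing.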

  14. Re:Sigh. by msauve · · Score: 5, Funny

    "I've got 5 and all are well. 1 Intel, 2 Samsung and 1 Critical. "

    That apparently doesn't prevent you from dropping bits, though. 1+2+1=4.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
  15. All still going by m.dillon · · Score: 1

    I have around 30 ranging from 40G to 512G, all of them are still intact including the original Intel 40G SSDs I bought way at the beginning of the SSD era. Nominal linux/bsd use cases, workstation-level paging, some modest-but-well-managed SSD-as-a-HDD-cache use cases. So far wearout rate is far lower than originally anticipated.

    I'm not surprised that some people complain about wear-out problems, it depends heavily on the environment and use cases and people who are heavy users who are not cognizant of how they are using their SSDs could easily get into trouble.

    For the typical consumer however, the SSD will easily outlast the machine. Even for a pro-sumer doing heavy video editing. Which, strangely enough, means that fewer PCs get sold because many consumers use failed or failing HDDs as an excuse to buy a new machine, and that is no longer the case if a SSD has been stuffed into it.

    A more pertinent question is what the unpowered shelf-life for typical SSDs is. I don't know anyone who's done good tests (storing a SSD in a hot area unpowered to simulate a longer shelf time). Flash has historically been rated for 10-years data retention but as the technology gets better it should presumably be possible to retrieve the data after a long period on a freshly written (only a few erase cycles) SSD. HDDs which have been operational for a time have horrible unpowered shelf lives... a bit unclear why, but any HDD I've ever put on the shelf (for 6-12 months) that I try to put back into a machine will typically spin-up, but then fail within a few months after that.

    -Matt

    1. Re:All still going by Anonymous Coward · · Score: 0

      >but as the technology gets better it should presumably be possible to retrieve the data after a long period on a freshly written

      Nope. It is not like you can update the physical structure of a silicon chip by firmware upgrade.

      As technology gets "better", the margins get smaller and smaller. Instead of trapping a certain number of electrons to represent a '0', you now have a smaller number of electrons (smaller geometry) and are trying to use that number to represent a few bits instead of 1.

      As a matter of fact, read the datasheet of the MLC chips and see for yourself. It is all there under data retention.

    2. Re:All still going by BitZtream · · Score: 1

      a bit unclear why, but any HDD I've ever put on the shelf (for 6-12 months) that I try to put back into a machine will typically spin-up, but then fail within a few months after that.

      The lubrication in the bearings of the platters and head arms gets thicker over time after being heated a few times. It needs to stay warm to keep a lower/workable viscosity. The drag becomes too great fairly rapidly after even a few months initial use when then stored on the shelf.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  16. Good news for me by Snotnose · · Score: 1

    Considering 90% of my storage is write once, read many (email, mp3, dvds, programs, etc), this is good for me as long as the drive has a good, errr, brain fart, scheme so when I write a byte it chooses one I haven't written to in a while. My SSD should last forever, or until the electron holes break free of their silicon bonds.

  17. Re:Sigh. by Anonymous Coward · · Score: 0

    hey thanks for sharing your anecdotal experience as if it carries any weight whatsoever compared to actual controlled experiments and statistics.

    A controlled experiment with statistics. Too bad this article is not that.

  18. Re:Sigh. by beelsebob · · Score: 1

    Let me guess, every single SSD you bought was a low-capacity SandForce-controlled one.

  19. Re:And the winners are... by fustakrakich · · Score: 0

    See for yourself. Sure, that's high end now, but in the future? Anyway, there you go, ten days (so sue me) will eat a little more than a petabyte. So now I would have to stripe 10 or 20 of these SSDs to hold it all. Now what will my failure rate be?

    On the other hand I still prefer SSDs over all the monkey motion going on in a hard drive. I'm just pointing out that a petabyte doesn't mean much anymore. And I still remember having a 20 meg drive and thinking I'll never use it all.

    --
    “He’s not deformed, he’s just drunk!”
  20. Re: And the winners are... by fustakrakich · · Score: 0

    100TB a day? Roughly 1.2GB per second? No. No you won't.

    Yes. Yes I will! Any other questions?

    --
    “He’s not deformed, he’s just drunk!”
  21. Re:And the winners are... by Anonymous Coward · · Score: 0

    He's using OooA interface.

  22. Re:Sigh. by fuzzyfuzzyfungus · · Score: 5, Funny

    Yes, they are sooo reliable, every single SDD I've bought has been dead within 3 months.

    A happy OCZ customer, I take it?

  23. Re:Sigh. by Anonymous Coward · · Score: 0

    not to mention a write error: "Critical" instead of "Crucial"

  24. Re:Sigh. by MasterOfGoingFaster · · Score: 1

    not to mention a write error: "Critical" instead of "Crucial"

    Hee hee. That's a "loose nut behind the keyboard" error - not an SSD error.

    --
    Place nail here >+
  25. Re:And the winners are... by Anonymous Coward · · Score: 0

    Protip: A computer is capable of performing actions without a person sitting in front of it 24/7.

  26. Re:Sigh. by MasterOfGoingFaster · · Score: 2

    I don't recall the brand of the fourth, got distracted and forgot to edit. But I knew someone would have fun pointing it out, so it would be rude for me to deny you the pleasure. So - yeah - I dropped a bit. :D

    --
    Place nail here >+
  27. Re:And the winners are... by Anonymous Coward · · Score: 0

    I have to reply anonymously to avoid the trolls mod bombing the account, but read the links in the other replies I made. A video house will chew these things up. And yes, you would have to stripe at least ten of these things to a very fast interface.

  28. Re:Dammit by Anonymous Coward · · Score: 0

    I just bought that same drive a week ago. It has very good reviews and very few failures.

    I've been using a 256GB G.Skill Sniper SSD for years now, and it has served me well. I figure it was time to upgrade as SSDs have proven themselves to be very reliable; often more reliable than their platter counterparts. Still, it doesn't seem to stop this spew of "lol but ssds sux n fail alot!" nonsense from the fools that can't actually be bothered to do any research on the subject.

    In short, no worries. Your drive will almost certainly be fine. And if it isn't, then contact Samsung for a replacement. You should *always* backup sensitive data no matter what storage medium you are using.

  29. Re: And the winners are... by Anonymous Coward · · Score: 0

    100TB a day? Roughly 1.2GB per second? No. No you won't.

    Yes. Yes I will! Any other questions?

    Um, sir? Yes, um, I have a question...what sort of device can I stick in my computer that will write data at 1.2 GB/sec?

  30. Re:Sigh. by Anonymous Coward · · Score: 0

    They are now half dead, half something else. Wait for them to replenish the chips, violently.

  31. Re:And the winners are... by Anonymous Coward · · Score: 0

    You need to read the replies I already made. There's a link that shows how fast you can wreck an SSD, and if you have that kind of money, you probably wouldn't care.

    And here's a big fuck you! to the idiot moderator(s).

  32. Re:And the winners are... by swb · · Score: 1

    You couldn't sustain that bit rate on a SATA interface. No normal workflow would sustain that volume of writes or encoding, especially prosumer or lower.

    There may be broadcast or industrial uses but they would be writing to industrial strength storage via 16 gig fc to SAS SLC arrays.

  33. Re:And the winners are... by gman003 · · Score: 1

    Good luck with that.

    The Intel 335 has a sequential write speed of about 350MB/s (the rest are around the same speed). Writing 700TB at that speed would take 24 days and change, with no breaks to do things like read any of that data.
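    A quick sketch of that estimate, under stated assumptions: the 350MB/s figure is the one quoted above, and whether you land on 23 or 24 days depends on whether TB and MB are read as decimal or binary units. The helper name is hypothetical.

```python
# Days needed to write a volume of data at a fixed sequential write speed.
# Assumption: the drive sustains the quoted speed the whole time, with no reads.

def days_to_write(total_bytes: float, bytes_per_sec: float) -> float:
    """Return the number of days needed at a constant write rate."""
    return total_bytes / bytes_per_sec / 86_400


decimal_days = days_to_write(700 * 10**12, 350 * 10**6)  # TB and MB as powers of 10
binary_days = days_to_write(700 * 2**40, 350 * 2**20)    # TiB and MiB as powers of 2

print(f"decimal units: {decimal_days:.1f} days")
print(f"binary units:  {binary_days:.1f} days")
```

    Either way it is more than three weeks of uninterrupted writing per drive.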

  34. Re:And the winners are... by gman003 · · Score: 2

    Which will also spread around the writes. If you're writing a 4TB video across 10 disks, that's only 410GB to each, so you only get that much endurance used up.

  35. Re: And the winners are... by viperidaenz · · Score: 1

    SATA 3.2 isn't out yet for consumer drives, so no you won't.

    1.2GB/s is twice the bandwidth of SATA 3.0.

  36. Re:And the winners are... by viperidaenz · · Score: 1

    You failed at math.

    You won't be writing 1.2GB/s to any SSD currently available. They all max out at SATA 3.0 - 600MB/s.
    Since you'd need at least 3 striped drives to even try to sustain 1.2GB/s, your endurance has now tripled from 700TB to 2.1PB.
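    The striping argument can be sketched in a few lines. This assumes writes are spread evenly across the stripe, that each drive dies at exactly 700TB, and that write amplification is ignored; none of that is guaranteed in practice, and the function names are illustrative only.

```python
# Aggregate write endurance of a stripe set, and how long it lasts at a given rate.
# Assumptions: perfectly even striping, identical drives, no write amplification.

def striped_endurance_bytes(per_drive_bytes: int, drives: int) -> int:
    """Total bytes the stripe set can absorb before every drive hits its limit."""
    return per_drive_bytes * drives


def days_until_worn(total_endurance_bytes: float, write_rate_bytes_per_sec: float) -> float:
    """Days of continuous writing before the endurance budget is used up."""
    return total_endurance_bytes / write_rate_bytes_per_sec / 86_400


total = striped_endurance_bytes(700 * 10**12, 3)  # 2.1 PB across three drives
print(f"{total / 10**15:.1f} PB aggregate")
print(f"{days_until_worn(total, 1.2 * 10**9):.1f} days at 1.2 GB/s")
```

    So even at the claimed 1.2GB/s, three striped drives would take about three weeks to reach the figure from the article, not one.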

  37. Re:Sigh. by gukin · · Score: 4, Funny

    Amen to this, I STUPIDLY bought a REFURBISHED OCZ drive which, coincidentally, failed shortly after OCZ announced bankruptcy. The other drive I bought was a Corsair that, like its OCZ brethren, died three weeks after being put into service. The speed is wonderful but the life is pathetic. Despite this, I have a Kingston and a Samsung which are both going strong so I can confidently state that HALF OF ALL SSDs FAIL AFTER THREE WEEKS, THE OTHER HALF RUN FOREVER!

    Perhaps I need to work on my sample set and my over-use of capital letters.

  38. Re:And the winners are... by viperidaenz · · Score: 1

    Protip: less than 1GB/sec is much less than 700TB/week.
    Protip 2: SATA 3.0 is only 600MB/sec, the peak interface bandwidth is only 346GB/week.

  39. I am sticking to rated lifespan by iamacat · · Score: 1

    Ability to write hundreds of terabytes more is nice. But it's reading them back that I am really worried about. Great news for someone deploying a short term cache.

  40. Re:And the winners are... by Anonymous Coward · · Score: 0

    Since you'd need at least 3 striped drives...

    That just multiplies the possible failure rate by three, regardless of the reason for such failures.

  41. Re:Sigh. by Anonymous Coward · · Score: 0

    Stop defragmenting them.

  42. Re:Sigh. by kimvette · · Score: 1

    I have two different Crucial mSATA drives - one runs VMware in one workstation (well, "server"), and the other runs virtualbox in another. Each is a different generation SSD - and no problems. I've also shipped many to customers in servers (real servers on RAID controllers, not workstations posing as servers). Not one failure.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  43. Re:Sigh. by msauve · · Score: 3, Funny

    "I don't recall the brand of the fourth"

    There you go again. :-)

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
  44. Re:And the winners are... by kcitren · · Score: 1

    I think your math is off a bit by a factor of 1000:

    600 MB/s and 604,800 sec/wk = 362,880,000 MB/wk = ~362,880 GB/w = ~ 362 TB/w

  45. extremesystems test by 0111+1110 · · Score: 3, Informative

    There was also a very interesting endurance test done on extremesystems.org. Very impressive stuff. I don't yet own an SSD, but I'll continue to consider buying one! Maybe next Black Friday. Just waiting for the right deal.

    --
    Quite an experience to live in fear, isn't it? That's what it is to be a slave.
    1. Re:extremesystems test by camperdave · · Score: 1

      I bought two a few years back, and both are working like champs. The only problem I encountered is that my laptop now boots too fast. The keyboard becomes unresponsive for about 30 seconds (both Win7 and Linux), so I have to twiddle my thumbs at the login prompt. Before, this was hidden by the slow turning of the platters.

      --
      When our name is on the back of your car, we're behind you all the way!
    2. Re:extremesystems test by WuphonsReach · · Score: 1

      You really don't know what you're missing. For business laptops, we've made the switch to 100% SSDs for 2-3 years now (ever since they dropped down to $1.50-$1.75 per GB). Granted, these are all users who can function with only a 128GB SSD. Which holds true for probably 90% of office workers who have access to a file server (instead of storing business critical data on their HD).

      Now, instead of waiting on their HD to seek around and find information (a boot process measured in minutes, program loading times measured in 10s of seconds), boot-up takes under 20-30s and program loading times are near instant. What you *will* notice is that your CPU is now the bottleneck (oops). For development work or anything where you need to do two or three things at once, or run something disk-intensive like a scan or search of files, SSDs are a must-have. I will regularly kick off compiles / version control updates / searches, and still be able to use the machine for other things while it thinks.

      Just make sure you have a good backup system in place. On the Windows side, I recommend Acronis True Image writing to a 2nd old-style HD inside the case. Or an external 1TB USB3 drive that you leave connected during the backup window. That is not because SSDs are unreliable (unless you buy crap like OCZ), it's because their failure modes are such (if the controller goes crazy) that data recovery is highly unlikely.

      --
      Wolde you bothe eate your cake, and have your cake?
    3. Re:extremesystems test by Threni · · Score: 1

      How does this compare to hard drives, though? That's the key metric. I don't mind my pc booting up in 30 rather than 10 seconds if I don't have to do disaster recovery and pay far more per gig.

    4. Re:extremesystems test by Anonymous Coward · · Score: 0

      Am assuming USB keyboard. Have you tried toggling the BIOS settings for legacy support for USB keyboards and mice? Also, PS/2 works well if you've got the ports and a keyboard.

    5. Re:extremesystems test by camperdave · · Score: 1

      Am assuming USB keyboard. Have you tried toggling the BIOS settings for legacy support for USB keyboards and mice? Also, PS/2 works well if you've got the ports and a keyboard.

      Nope. This is the laptop's native keyboard. There is no problem if I use a USB keyboard. However, carrying an extra keyboard kind of defeats the purpose of having a laptop.

      The really weird thing is that this delay happens *AFTER* grub. The keyboard works fine in grub. However, once I choose an OS, I get stuck for 30ish seconds at the login prompt, whether I choose Windows or Linux.

      --
      When our name is on the back of your car, we're behind you all the way!
  46. Re:And the winners are... by viperidaenz · · Score: 1

    Off by a letter.
    s/G/T

  47. Re: And the winners are... by Anonymous Coward · · Score: 0

    Who says he's using SATA? There are PCIe 4x SSDs out there you know.

  48. Re:And the winners are... by Anonymous Coward · · Score: 0

    Not everyone - I work in broadcast. We're cheap. Not just cheap, fucking ultra cheap. Our sister station gets whatever they want whenever they ask for it, but we get nothing. We use outdated consumer grade IT gear. Our network is much slower than 100 megabit should be. Our top editing computers are a couple of years old. We have single terabyte drives in them, and use those as our capture and export drives.

  49. Re: And the winners are... by LordLimecat · · Score: 1

    You're editing 4k video 24/7? That's quite impressive, but not terribly believable.

  50. Re: And the winners are... by LordLimecat · · Score: 1

    It's all irrelevant, because there's no SSD out there that could handle that write rate, and there's no way he's generating that much data 24/7.

    He's full of crap and doesn't want to admit it.

  51. Re:And the winners are... by Anonymous Coward · · Score: 0

    Protip: A PCIe 4x SSD can reach 930MB/s write speed

    Protip 2: When you make sweeping assumptions, you look like an idiot.

  52. Re:And the winners are... by LordLimecat · · Score: 1

    That's not really how it works. The wear is leveled across cells, so increasing the number of drives in a RAID0 really does increase the amount of data till a predictable failure (i.e., the "write limit").

  53. Re:And the winners are... by Anonymous Coward · · Score: 0

    Not true... SSDs with direct PCI connections, such as those from FusionIO, Intel, and in current Macs can reach those speeds.

  54. IO pattern by ThePhilips · · Score: 3, Insightful

    That's a heck of a lot of data, and certainly more than most folks will write in the lifetimes of their drives.

    Continued write cycling [...]

    That's just ridiculous. Since when is reliability measured in how many petabytes can be written?

    Spinning disks can be forced into inefficient patterns, speeding up the wear on mechanics.

    SSDs can be easily forced to do a whole erase/write cycle just by writing single bytes into the wrong sector.

    There is no need to waste bus bandwidth with a petabyte of data.

    The problem was never the amount of the information.

    The problem was always the IO pattern, which might accelerate the wear of the media.

    --
    All hope abandon ye who enter here.
    1. Re:IO pattern by m.dillon · · Score: 2

      Yes, but it's a well-known problem. Pretty much the only thing that will write inefficiently to an SSD (i.e. cause a huge amount of write amplification) is going to be a database whose records are updated (effectively) randomly. And that's pretty much it. Nearly all other access patterns through a modern filesystem will be relatively SSD-efficient. (keyword: modern filesystem).

      In the past various issues could cause excessive write amplification. For example, filesystems in partitions that weren't 4K-aligned, filesystems using too small a block size, less efficient write-combining algorithms in earlier SSD firmwares. All of those issues, on a modern system, have basically been solved.

      -Matt

    2. Re:IO pattern by Anonymous Coward · · Score: 0

      >The problem was never the amount of the information.
      To you it isn't. The rest of your post can be cheerfully ignored.

    3. Re:IO pattern by martin_dk · · Score: 1

      I agree, measuring reliability like this is strange.

      Even more disturbing is the number of drives being tested. What is the statistical significance of their results?

    4. Re:IO pattern by Anonymous Coward · · Score: 0

      For SSDs, the effect of whole erase/write cycle just by writing single bytes is called write amplification. You can't write bytes anyway, but only sectors (512 bytes, possibly 4096 for some drives). Since write amplification is just a factor, you can just divide the endurance in PB written by the appropriate write amplification for that drive and your workload, and you get the proper endurance for your specific use case. For typical workloads, write amplification is around 1, there are even claims of write amplification < 1.
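      A minimal sketch of that division, assuming the 700TB figure was measured under a workload with write amplification close to 1. The amplification factors in the loop are illustrative values, not measurements of any particular drive, and the function name is hypothetical.

```python
# Scale a measured endurance figure by the write amplification of your workload.
# Assumption: the measured figure was obtained with write amplification near 1.

def effective_endurance_tb(measured_endurance_tb: float, write_amplification: float) -> float:
    """Host-visible TB you can expect to write before wear-out."""
    if write_amplification <= 0:
        raise ValueError("write amplification must be positive")
    return measured_endurance_tb / write_amplification


for wa in (0.8, 1.0, 4.0, 256.0):  # illustrative factors only
    print(f"WA {wa:>5}: {effective_endurance_tb(700, wa):8.1f} TB of host writes")
```

      The spread between the best and worst rows is why the workload matters at least as much as the headline number.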

    5. Re:IO pattern by Anonymous Coward · · Score: 0

      The statistical significance is effectively nil. However, it is an *interesting* result which may result in a more in-depth study being conducted in the future.

  55. Re:And the winners are... by Anonymous Coward · · Score: 0

    I did some calculations on it as well, I think Tera is actually correct here. That also makes more sense to me, as I've certainly moved a few hundred GB from HD to HD within a day before.

  56. Re:Sigh. by ColdWetDog · · Score: 5, Funny

    We seem to have the beginning of a trend here - AC's don't have very good luck with SSD's.

    Try logging in and see if that changes your outlook.

    --
    Faster! Faster! Faster would be better!
  57. Re:Sigh. by Anonymous Coward · · Score: 0

    Single board Computers and Sata Express...

  58. Re: And the winners are... by tysonedwards · · Score: 1

    Um... a good PCI-E drive, such as a Fusion-IO board will certainly handle that write rate. That *he* is generating enough content to fill that pipe for a week straight is unlikely though as it would require multiple 10Gbase connections to do so. Since he is talking about video editing, let's say this is a surveillance system taking uncompressed HD streams that are being written natively to disk without transcoding prior to editing; we are still talking about 188 cameras coming to this one server.

    That the likes of Facebook would be generating sufficient content to saturate these cards, again possible in terms of server to server replication to keep their cluster in sync and maintain live backups and hot standbys, however unlikely that they would want to fully saturate their bandwidth to single nodes as opposed to just adding some more servers to ensure that capacity exists so their users can connect.

    --
    Thirty four characters live here.
  59. limited endurance? by Anonymous Coward · · Score: 0

    why does flash memory have limited endurance? Too bad companies can't use regular DDR3 memory in SSDs.

    1. Re:limited endurance? by camperdave · · Score: 1

      DDR3 must be continuously powered and actively accessed in order to keep the data alive. No power - no data. Great for RAM, but completely wrong for long term storage.

      --
      When our name is on the back of your car, we're behind you all the way!
  60. Re:And the winners are... by nabsltd · · Score: 2

    See for yourself.

    Why didn't you just refer to the LHC web page and imply that you are writing at that same data rate to a single SSD...it would have exactly the same value as an argument.

  61. Re:Sigh. by Anonymous Coward · · Score: 0

    I seem to have the exception. I have almost 2 years on an OCZ drive I purchased from NewEgg. It's seen daily use for those two years.

  62. Graceful Failover ? What Graceful Failover? by citizenr · · Score: 1

    Even Intel, behemoth of reliable server hardware, wasn't able to fix Sandforce problems.
    According to an Intel representative, Graceful Failover of an SSD means you _kill_ the drive in software during a reboot :DDD and not switch it to read-only mode (like you promise in the documentation).

    Kiss your perfectly readable data goodbye.

    --
    Who logs in to gdm? Not I, said the duck.
    1. Re:Graceful Failover ? What Graceful Failover? by Jack+Malmostoso · · Score: 1

      That was also my question when I RTFA. It says that the Intel drive entered some sort of "read-only" mode, and that at that point the drive was still OK. Then a new write cycle was forced (how?), and the drive committed seppuku and became unreadable.

      Which is it? Can I be confident that my SSD will fail to a graceful read-only mode? All my ~ is in RAID1 and backed up so I'm not worried, but it'd be nice to be able to just copy the / from a read-only SSD to a new one when the time comes.

  63. Re:And the winners are... by Anonymous Coward · · Score: 0

    I think you're wrong about the week part, but otherwise correct. When we attempted to swap-out the drives on a MySQL server that had a pair of five year-old 15k drives quit, it was less than ninety days before we had the first failure and four days later the second drive quit. They were Samsung 840 250GB drives. I was able to swap them out quickly because I had bought several extra for desktops. I think at the end of five months before we finally replaced them with new 300 GB 15k SAS drives, we had replaced nine drives. The RAID array was only eight drives! Fortunately we were using RAID 6 so we didn't lose any data, but it was scary. I hated to pay more for slower drives, but for a server you must have reliable drives. There's a reason Dell's cheapest (well, the last time I looked) server SSD is $1,500.

    Samsung would only replace four of the drives. Those four have been in heavy use in developer machines since then without a failure, but they just didn't work for a write-heavy DB server. I would be wary of buying another Samsung product because of their horrible support, but the drives do work for their intended purpose.

  64. Re:Sigh. by Anonymous Coward · · Score: 0

    Which reminds me, my VISA extended warranty claim on 1 of my terrible OCZ drives came back last week. VISA actually paid for the original price of the drive + tax + registered mail cost.
     
    I had to do this because OCZ refuses to honor my 'intermittent' disconnect-prone ssd harddrives.

  65. Endurance Experiment Writes One Petabyte To Three by Culture20 · · Score: 1

    Endurance Experiment Writes One Petabyte To Three Consumer SSDs
    "how much data could be written to six consumer drives. One petabyte later, half of them are still going."

  66. Re:How is 700TB "endurance"? by unrtst · · Score: 2

    How is 700TB "endurance"? I copy near a TB of data from Backups at work almost daily. So 1-2 years (if that) is "endurance"? Screw that! Sounds more like modern SSD's suck hard and aren't designed to last past 1-2 years of work. I'll stick with traditional HD's until they figure out DRAM drives that don't need batteries or constant power.

    How large is your backup filesystem(s)? This was 700TB written to a 250GB drive. If you're copying "near a TB of data from Backups ... almost daily", then I'm betting you have many many TB of storage in the backup pool... so divide that by 250GB and multiply that by 700TB and that's the endurance the SSDs would have. However, even then it doesn't really apply... your backups are not likely to be rewriting a lot of sectors (ex. deduplication, if used, means few files are actually written). You also said you copied FROM backups, so those are just reads (I'm presuming those are going out to multiple clients).

    In any case, the 700TB "endurance" figure is still accurate, even if you consider that fragile - it's a level of endurance under a specific use case.

    FWIW, for a backup system, I'd also stick with spinning disks (or tape) for now and well into the foreseeable future. Throughput and IOPs are not very important to backup storage, and you'll get way more GB/dollar from HDD's.
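    The "divide by 250GB, multiply by 700TB" scaling can be sketched as below. The 10TB pool size is a made-up example, the 1TB/day rate is the one from the quoted post, and the model assumes writes are spread evenly over the whole pool.

```python
# Scale the per-drive endurance figure up to a larger pool of SSD capacity.
# Assumptions: 250 GB drives that each absorb 700 TB, writes spread evenly.

def pool_endurance_tb(pool_capacity_gb: float,
                      drive_capacity_gb: float = 250,
                      drive_endurance_tb: float = 700) -> float:
    """Total TB the pool can absorb if every drive wears at the same pace."""
    return pool_capacity_gb / drive_capacity_gb * drive_endurance_tb


def years_at_daily_rate(endurance_tb: float, tb_per_day: float) -> float:
    """Years of service at a constant daily write volume."""
    return endurance_tb / tb_per_day / 365


single = pool_endurance_tb(250)    # one drive
pool = pool_endurance_tb(10_000)   # hypothetical 10 TB pool
print(f"single drive: {years_at_daily_rate(single, 1):.1f} years at 1 TB/day")
print(f"10 TB pool:   {years_at_daily_rate(pool, 1):.1f} years at 1 TB/day")
```

    The single-drive case lands in the 1-2 year range the parent complained about; the pool case is where the comparison stops being meaningful.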

  67. Re:Sigh. by Hamsterdan · · Score: 1

    Might be luck, might be an exception, but my Agility 2 is still kicking after 3 years, half of that was under XP (no TRIM).

    I've had 4 spinning drives (Seagate) die or get bad sectors in the same time frame.
     

    --
    I've got better things to do tonight than die.
  68. Re: And the winners are... by Anonymous Coward · · Score: 0

    Automation, baby! It's the next big thing.

    Posting AC because the asshole moderators are trolling my account

    -F

  69. Re:And the winners are... by Anonymous Coward · · Score: 0

    SAS 12gbps exists.... 1.5GB/s

    I have a loverly LSI megaraid controller that does 1.5GB/s read/write(to bbu cache)

  70. Re: And the winners are... by Anonymous Coward · · Score: 0

    You're acting like a smug little bitch, it's no wonder people want to mod you down.

  71. Re: And the winners are... by Anonymous Coward · · Score: 0

    you underestimate the porn overlords of our time.

  72. Times spun up was a factor too by dutchwhizzman · · Score: 2

    Stopping and starting a drive is also a moment where you can break/wear down a drive. This can be explained by the fact that heads rest on platters (unless in parked position) when the platters are not spinning at the right speed. Also, because a drive that is being spun down will cool down and warm up again when being spun up. These temperature fluctuations will be of influence on the drive reliability. The most plausible explanation I can come up with is that temperature shifts will make parts inside the drive align differently, possibly permanently changing alignment enough for head-misalignment to occur.

    --
    I was promised a flying car. Where is my flying car?
  73. Re:And the winners are... by m.dillon · · Score: 1

    And... that's it? What did SMART say? Did you actually wear the SSDs out as-per the wear indicator? Or did you hit a bug in the samsung controller before the wear-indicator maxed out?

    To be fair, the precise situation you describe, particularly if you did not retune the RAID-6 setup or the mysql server, and if the server was fsync()ing on every transaction (instead of e.g. syncing on a fixed time-frame as postgres can be programmed to do)... that could result in el-cheapo samsungs not being able to do any write-combining and cause a 256:1 write-amplification of the data.

    With proper tuning the write-amplification could easily be reduced to 4:1 and you would probably be able to run the server with SSDs. Maybe use Intel or Crucial though, and not Samsung. It isn't just the controller that matters... just using stock firmware doesn't really net you a good, robust SSD and there aren't too many real vendors who work on the firmware vs just OEM whatever was supplied with the controller. Intel is probably one of the better ones. They actually fix bugs, as does Crucial. Samsung... I dunno.

    -Matt

  74. Re:Sigh. by complete+loony · · Score: 1

    How do you know he wasn't listing them chronologically? "1 Intel, 2 Samsung[, 1 don't recall] and 1 Critical. "

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
  75. media wearout indicator (MWI)? by Anonymous Coward · · Score: 0

    Has anyone found a way to get this media wearout indicator (MWI) that the article claims to exist? Sounds like BS since none of the tools I have ever used including smartmontools or HD Sentinel show that value.

    1. Re:media wearout indicator (MWI)? by Emetophobe · · Score: 2

      If you read the article (I know this is Slashdot) they explain that MWI is an Intel-only SMART attribute. They use different SMART attributes for the Kingston and Samsung drives.

      Intel:

      This SMART attribute starts at 100 and decreases as the NAND's rated write tolerance is exhausted. It's completely unaffected by the number of reallocated sectors, and it's been ticking down steadily since the experiment began. The remaining life estimate in Intel's SSD Toolbox utility is based on the MWI, and so is the general health assessment offered by HD Sentinel, the third-party tool we've been using to grab raw SMART data.

      Kingston:

      On the HyperX 3K, the SSD life left attribute tracks flash wear. Like Intel's media wearout indicator, it counts down from 100 and is tied directly to the rated lifespan of the NAND.

      Samsung:

      The wear-leveling count is sort of like the MWI and life-left attributes on the Intel and Kingston SSDs. It's "directly related to [the] lifetime of the SSD," according to Samsung, and it bottomed out after 300TB of writes.

  76. Re:Sigh. by Twinbee · · Score: 1

    All you had to do was check at Amazon to see the star ratings people are giving them. The Samsungs were/are at about 4.8/5 for hundreds of reviews, while the OCZ was closer to 3/5 (again for loads of reviews). I'm still amazed how few actually bother doing this simple step.

    --
    Why OpalCalc is the best Windows calc
  77. Re: And the winners are... by Blaskowicz · · Score: 1

    No, with 1920x1080 24bit per pixel and 30fps I'm getting 178MB/s so six cameras almost saturate the SSD. Of course it would be a great deal more reasonable to acquire the data compressed in H264 frame by frame, or H265 if it comes out and is better for that.
    Though, we might get mad and use uncompressed 1080p at 60fps. Then you can have a realistic "zoom in and enhance" sequence as in the dumb movies and TV shows, with an algorithm able to combine data from multiple pictures and see a face more clearly esp. if the subject was reasonably stationary for a short while.
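    The per-camera figure can be reproduced with a short sketch. It assumes raw frames with no blanking, audio or container overhead; the 178MB/s above matches the result when MB is read as 2**20 bytes. The function name is illustrative.

```python
# Data rate of an uncompressed video stream.
# Assumption: raw frames only, no audio, blanking or container overhead.

def video_bytes_per_sec(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Bytes per second for one uncompressed stream."""
    return width * height * bits_per_pixel // 8 * fps


rate_30 = video_bytes_per_sec(1920, 1080, 24, 30)
rate_60 = video_bytes_per_sec(1920, 1080, 24, 60)

print(f"1080p30: {rate_30 / 10**6:.1f} MB/s ({rate_30 / 2**20:.0f} MiB/s)")
print(f"1080p60: {rate_60 / 10**6:.1f} MB/s")
print(f"six 1080p30 streams: {6 * rate_30 / 10**9:.2f} GB/s")
```

    Six such streams come to a little over 1.1GB/s, which is why a handful of cameras, not hundreds, is enough to reach the rate being argued about.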

  78. Re:And the winners are... by Blaskowicz · · Score: 1

    But the SSD controllers aren't fully aware of what is going on and the free or reserved space for wear leveling is split as many times as you have drives. An SSD that is n times bigger is still better than n SSDs in a RAID 0.
    Then we're limited by interface for speed, but we have good incremental progress on the horizon. PCIe storage is already standardized in the form of M.2 and SATA Express; the former works in PCIe 2.0 2x or 4x, the latter is 2x: that gives a theoretical 1GB/s and 2GB/s. Upgrade to PCIe 3.0 doubles that.
    SAS 12Gb/s is also an option.

  79. Re:And the winners are... by Blaskowicz · · Score: 1

    As far as I know SAS 12Gb gives you 1.2GB/s theoretical.
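    The theoretical numbers being traded here fall out of the line rate and the line encoding. A minimal sketch, assuming 8b/10b encoding for SATA 3.0, SAS 12Gb/s and PCIe 2.0, and 128b/130b for PCIe 3.0, and ignoring all protocol overhead above the encoding layer (so real throughput is lower):

```python
# Theoretical payload bandwidth of a serial link after line-encoding overhead.
# Assumption: only the encoding overhead is counted, not protocol framing.

def payload_bytes_per_sec(line_rate_bits: float, data_bits: int,
                          coded_bits: int, lanes: int = 1) -> float:
    """Usable bytes/second given the raw line rate and an N-bit/M-bit encoding."""
    return line_rate_bits * data_bits / coded_bits / 8 * lanes


links = {
    "SATA 3.0 (6 Gb/s, 8b/10b)":    payload_bytes_per_sec(6e9, 8, 10),
    "SAS 12 Gb/s (8b/10b)":         payload_bytes_per_sec(12e9, 8, 10),
    "PCIe 2.0 x2 (5 GT/s, 8b/10b)": payload_bytes_per_sec(5e9, 8, 10, lanes=2),
    "PCIe 2.0 x4 (5 GT/s, 8b/10b)": payload_bytes_per_sec(5e9, 8, 10, lanes=4),
    "PCIe 3.0 x4 (8 GT/s, 128b/130b)": payload_bytes_per_sec(8e9, 128, 130, lanes=4),
}

for name, rate in links.items():
    print(f"{name:34s} {rate / 1e9:.2f} GB/s")
```

    On these assumptions the 12Gb/s SAS link works out to 1.2GB/s rather than 1.5GB/s, because two of every ten bits on the wire are encoding overhead.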

  80. Re:And the winners are... by Blaskowicz · · Score: 1

    Also, the Samsung 840 (non-Pro) uses TLC chips, three bits stored per flash cell. They'd be the drives that suffer the most from that write amplification, along with the 840 EVO, which is similar and is very aggressive about running with very little overprovisioning.

    840 Pro would have taken a lot longer to die, while still being a consumer drive. Still that was an interesting experiment.

  81. Re:Sigh. by Anonymous Coward · · Score: 0

    I've got 3 SSDs. One is a Toshiba that came stock with an Apple laptop used constantly. Fine for 3 years of use until I replaced it with a larger-capacity Corsair SSD (i.e. no problems, just too small). The other is an OCZ as the boot drive in a very heavily used Windows 7 desktop machine for ~3 years. These aren't top-of-the-line drives and OCZ is particularly notorious for reliability problems (I bought that drive before their problems became so well-known). People are probably laughing at the brand choices. I'm practically asking for problems by using cheap drives rather than, say, Intel ones. And yet the drives have all been fine.

    So, that AC apparently has bad luck, or I'm running on borrowed time.

  82. Re:Sigh. by MachineShedFred · · Score: 1

    I'll see your incredibly small sample size, and raise you with "the company I work for has bought hundreds of Kingston SSDs, and we haven't had even one fail in the last two years."

    --
    Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
  83. Re:Sigh. by MasterOfGoingFaster · · Score: 1

    How do you know he wasn't listing them chronologically? "1 Intel, 2 Samsung[, 1 don't recall] and 1 Critical. "

    Thanks, but no - I just screwed up. Yesterday was just my turn to be in the barrel.

    --
    Place nail here >+
  84. Re:And the winners are... by Anonymous Coward · · Score: 0

    Maybe buy SSDs meant for high transaction rates and extended write cycles next time. Throwing consumer SSDs into a SQL server is just dumb, and asking for a failure.

    Kingston has drives that are far less than the price you quote that they warranty for 5 years, and they came up with that number by how many days the drive could operate at 10GB of writes per day. Also, SLC drives are far more robust than MLC, but more expensive.

  85. Re:Sigh. by Hodr · · Score: 1

    I purchased two OCZ 64GB SSDs and they both failed right around the end of the warranty. One was replaced under warranty, the other not. They replaced the 64GB drive with a 60GB drive, which was a little upsetting, but better than nothing I suppose.

    Both died suddenly, and with no warning.

  86. Re:How is 700TB "endurance"? by Emetophobe · · Score: 1

    1TB of writes per day to an SSD probably isn't a normal usage scenario for your average consumer. Samsung for example claims that the average consumer writes no more than 10GB/day to an SSD:

    The 840 Series demonstrates impressive lifespan results under industry-standard methods of simulating real-world use-cases. BAPCo's SYSMARK, a third party benchmarking tool, shows a 20 year lifespan under a moderate workload consisting of 35% random writes. Applying JEDEC's testing methodology, the minimum lifespan is 7 years, despite an extremely severe workload containing 75% random writes. Keep in mind that these testing scenarios, especially the JEDEC workload, are used primarily for enterprise computing applications (e.g. workstations, servers). Under consumer workloads (internally estimated not to exceed 10GB/day for most users) and more appropriate testing scenarios, the 840 Series will show considerably better endurance numbers.

    (emphasis mine)
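
    To put that in perspective, a back-of-the-envelope sketch of how long it would take to reach the 700TB all six drives survived, at a few daily write volumes. This assumes decimal units and a constant write rate, and ignores write amplification; the function name is just for illustration.

```python
def years_to_write(total_tb, gb_per_day):
    """Years needed to write total_tb terabytes at gb_per_day (decimal units)."""
    days = total_tb * 1000 / gb_per_day
    return days / 365


for gb_per_day in (10, 100, 1000):
    years = years_to_write(700, gb_per_day)
    print(f"{gb_per_day:>4} GB/day -> {years:.1f} years to reach 700TB")
```

    At Samsung's 10GB/day estimate that works out to roughly 190 years; only at around 1TB/day does it drop to about two years.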

  87. Re:Sigh. by Anonymous Coward · · Score: 0

    OCZ's SSDs' overall average failure rate was about 2x-4x the industry average, which places them about where mechanical hard drives are, but there were several models in a row with failure rates over 50%. Worse than flipping a coin. No idea how their QA managed that, but their bankruptcy tells the story.

  88. Re:Dammit by KozmoStevnNaut · · Score: 1

    I'm using a 500GB 840 EVO as my main drive in my system. I've moved stuff like /var to a separate hard drive (because of log files and constant tiny read/writes that aren't speed-sensitive), and I do all compiles on a RAM disk, I've upgraded to 16GB to avoid swapping.

    Based on reviews etc., I fully expect the 840 EVO to outlast every other component in my PC.

    --
    Eat the rich.
  89. Re:Sigh. by Anonymous Coward · · Score: 0

    Actually, this *was* a controlled experiment, and it *does* have statistics. Unfortunately, the sample size is too small to extrapolate to the population at large, but studies often *start* like this, with interesting results triggering larger, more significant studies to determine if the result is typical, or atypical.

  90. Re:Sigh. by Nimey · · Score: 4, Funny

    No, I logged in and I've still got Outlook 2007.

    --
    Hail Eris, full of mischief...

    E pluribus sanguinem
  91. Re:Sigh. by Anonymous Coward · · Score: 0

    The OCZ drive I had lasted 2.5 years, so it was replaced within the 3 year warranty. Recently I bought a Mushkin 180GB which failed after only 10 months. Again, replaced under warranty, but nobody seems to be badmouthing them. I'm sure anecdotal data is anecdotal.

  92. Re:And the winners are... by Anonymous Coward · · Score: 0

    I'm interested in this question because we do nightly calculations rewriting PostgreSQL databases. I'm thinking of trying to do it on SSDs for speed, but I'm worried that I'll just kill the drives in a month or so. Still, if I could get 2 months out of a sub-$200 drive, it might just be worth it - make a kickstart file to ease the reinstall, then just pop in a new one when it dies.

  93. Re:Sigh. by Anonymous Coward · · Score: 0

    HAHA

  94. Re:Sigh. by Anonymous Coward · · Score: 0

    Hmm... Out of the four I bought, 1 failed after a few weeks. So I can confidently state that the typical lifetime of a SSD is somewhere between 3 weeks and 3 million years.

  95. Re:Dammit by Jesus_666 · · Score: 1

    I used to say that SSDs aren't mature yet and cost way too much. These days I find that most of the technical issues I have with them have been addressed, the sole remaining one being that HDDs tend to have kinder failure modes. SSDs have come of age and are desirable even to a relatively conservative buyer like me.

    Now if I could only justify the price... (The only computer where an SSD would be relevant for me is a laptop with a 500 GB HDD that typically sees heavy load. An SSD that fits my storage requirements would put a serious dent in my finances. Oh well, perhaps next year.)

    --
    USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
  96. Re:Sigh. by Anonymous Coward · · Score: 0

    And run the winsat formal before starting to use the new Windows installation.

  97. Supported by DrYak · · Score: 1

    RAID implementations don't always support cobbling together a random mixture of disk sizes which change over time.

    Linux's software RAID supports this without any problem. Once you've finished a cycle of yearly swaps over the whole pool, you can grow the RAID to the new maximum (= shared minimum across the drives). The resize is done online and is gracefully restartable (in fact, you can even migrate to bigger RAIDs with more drives gracefully).
    (e.g.: After 6 years, once you've upgraded a RAID6 from 6x 1TB to 6x 4TB, you can easily grow the system from 4TB to 16TB).
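
    The capacity numbers in that example follow from the "shared minimum" rule. A minimal sketch, assuming a classic parity RAID where every member contributes only as much as the smallest drive (the function is hypothetical, not part of any RAID tool):

```python
def raid_usable_tb(drive_sizes_tb, parity_drives=2):
    """Usable capacity when every member is limited to the smallest drive."""
    if len(drive_sizes_tb) <= parity_drives:
        raise ValueError("need more drives than parity drives")
    data_drives = len(drive_sizes_tb) - parity_drives
    return data_drives * min(drive_sizes_tb)


print(raid_usable_tb([1] * 6))        # original 6x 1TB RAID6: 4
print(raid_usable_tb([4] * 5 + [1]))  # five drives swapped, one still 1TB: 4
print(raid_usable_tb([4] * 6))        # all six swapped: 16
```

    The middle case is the point: until the last small drive is replaced, the extra space on the bigger drives goes unused.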

    In addition to that, modern filesystems like BTRFS and ZFS can entirely handle a random mixture of disks. Just specify the level of redundancy (I want to be able to lose 2 drives and still suffer no data loss), plug in drives, add them to the pool, and let BTRFS or ZFS handle the actual details.
    (e.g.: throw in whatever mix you want, total size would always be the sum of the drives minus what's needed for the level of redundancy you asked for).

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
    1. Re:Supported by Ed+Avis · · Score: 1

      I think that's what I was saying: a random mixture of disk sizes is not supported by this particular RAID implementation - it will only use the same size across each disk, meaning you are constrained to the size of the smallest disk in the pool. You have to upgrade all of the disks to a larger size before starting to use that size. Btrfs and ZFS sound like they handle it much better.

      --
      -- Ed Avis ed@membled.com
  98. BTRFS and ZFS by DrYak · · Score: 1

    I think that's what I was saying: a random mixture of disk sizes is not supported by this particular RAID implementation - it will only use the same size across each disk, meaning you are constrained to the size of the smallest disk in the pool.

    Okay, I was thinking that you were comparing with other RAID implementations (most fake RAID cards can't even *grow* the RAID once you've cycled the drives and the "smallest disk in pool" is now bigger).

    Btrfs and ZFS sound like they handle it much better.

    Yup, they would handle whatever you throw at them, as long as they can manage to fit the constraints you've asked for.

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]