Slashdot Mirror


Samsung's SSD 840 Read Performance Degradation Explained

An anonymous reader writes with a link to TechSpot's explanation of the reason behind the performance degradation noticed by many purchasers of certain models of Samsung SSD (the 840 and 840 EVO), and an evaluation of the firmware updates that the firm has released to address it. From the piece, a mixed but positive opinion of the second and latest of these firmware releases: "It’s not an elegant fix, and it’s also a fix that will degrade the lifetime of the NAND since the total numbers of writes it’s meant to withstand is limited. But as we have witnessed in Tech Report’s extensive durability test there is a ton of headroom in how NAND is rated, so in my opinion this is not a problem. Heck, the Samsung 840 even outlasted two MLC drives. As of writing, the new firmware has only been released for the 2.5” model of the SSD 840 EVO, so users of the 840 EVO mSATA model still have to be patient. It should also be noted that the new firmware does not seem to work well with the TRIM implementation in Linux, as this user shared how file system corruption occurs if discard is enabled."

65 comments

  1. Samsung should fix it for 840 owners also by paradigm82 · · Score: 2

    Clearly the problem is there also on the 840!

    1. Re:Samsung should fix it for 840 owners also by Anonymous Coward · · Score: 0

      Can't you go back to the apping apps thing? Not that it was funny, but this just makes me cringe. Be a good troll, will ya?

    2. Re:Samsung should fix it for 840 owners also by Anonymous Coward · · Score: 0

      it's actually worse on 840 - on 840 evo they do about 50mb/sec, on 840 it's about 10mb/sec.

    3. Re:Samsung should fix it for 840 owners also by Anonymous Coward · · Score: 0

      The article is about the 840, the piece quoted might not be the most relevant, from the conclusion the article reads:

      "Reliability, as in data loss, has not been put into question. So here’s my open request to Samsung: admit the problem exists in all the affected drives as evidenced in this article and in the countless reports found in this lengthy thread on the Overclockers.net forums and elsewhere online.
      As of writing, this single discussion has gathered over 2,770 replies and 345,000 views. Thus far Samsung has decided to ignore the SSD 840 and all the aforementioned variants even though the drives carry 3-year warranties. Samsung, the ball is on your court now..."

    4. Re:Samsung should fix it for 840 owners also by Anonymous Coward · · Score: 0

      Moo Cow? Who let you out? Shouldn't you be back home at Before It's News?

    5. Re:Samsung should fix it for 840 owners also by BoogieChile · · Score: 1

      Thank you for that on-topic demonstration of the file system corruption problem.

  2. Sigh by Anonymous Coward · · Score: 0

    This reminds me of all of the abuse spewed by Samsung owners who were either completely ignorant or refused to acknowledge a problem.

  3. No Degradation Here by cdxta · · Score: 1

    Although that's because my 840 EVO is still in the box waiting to be installed...

    1. Re:No Degradation Here by Anonymous Coward · · Score: 0, Troll

      Well technically, not using it is when it degrades.

    2. Re:No Degradation Here by mister_playboy · · Score: 1

      Return it and get a different drive.

      --
      Do what thou wilt shall be the whole of the Law ::: Love is the law, love under will
  4. My older drive is worse. by glsunder · · Score: 1

    I tried hdtune on my older sata 2 samsung 470, and its hdtune graph looks worse, with a minimum of 40MB/s. The drive's still in my system, but I don't use it anymore.

    1. Re:My older drive is worse. by Bengie · · Score: 1

      Make a good swap or temp drive.

  5. why i bought the 840 PRO by Anonymous Coward · · Score: 0

    There is a reason why I bought the 840 PRO and it's MLC. MLC is a bit more proven and reliable.

    1. Re:why i bought the 840 PRO by Anonymous Coward · · Score: 0

      Because installing the updated firmware to get around the issue was too hard?

    2. Re: why i bought the 840 PRO by Anonymous Coward · · Score: 0

      Fair call. The 5 year warranty was what clinched the choice for me at the time I bought mine. Only this morning though I've found the drive may have integrity issues - so swings and roundabouts...

  6. To keep the performance up the advertised values by Anonymous Coward · · Score: 1

    So, what they're going to do to keep the performance up to the advertised value, is to rewrite all the data at least once per 2 months. That's actually a good chunk of the rated TB written for SSDs, whose low values in that regard are only acceptable if you take into account that most data isn't continuously rewritten. If I had one of those SSDs, I'd consider returning them for a refund. They are obviously defective, as in significantly deviating from their advertised performance, either in speed or longevity.

  7. The new firmware misreports its supported features by Ingenium13 · · Score: 5, Informative

    Apparently the new firmware now advertises that it supports queued TRIM, when in fact it doesn't https://bugs.launchpad.net/ubu...

    The old firmware did not advertise queued TRIM support, so it wasn't an issue. The solution is a kernel patch to blacklist queued TRIM on all Samsung 8xx drives.
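
A rough sketch of how a model-based quirk list of this kind can work, assuming glob-style matching on the drive's model string. The table, the pattern and the function names are hypothetical illustrations, not the kernel's actual blacklist entries or code.

```python
from fnmatch import fnmatchcase

# Hypothetical quirk table: glob pattern on the model string -> set of quirks.
# The pattern is illustrative only, not copied from any kernel source.
QUIRKS = [
    ("Samsung SSD 8*", {"no_queued_trim"}),
]

def quirks_for(model):
    """Collect every quirk whose pattern matches the reported model string."""
    found = set()
    for pattern, quirks in QUIRKS:
        if fnmatchcase(model, pattern):
            found |= quirks
    return found

def use_queued_trim(model, advertises_queued_trim):
    """Trust the drive's advertised capability only if no quirk overrides it."""
    return advertises_queued_trim and "no_queued_trim" not in quirks_for(model)

print(use_queued_trim("Samsung SSD 840 EVO 250GB", True))
print(use_queued_trim("Some Other SSD", True))
```

The point of the design is that the drive's own capability bits are overridden by the host, so a drive that wrongly advertises a feature can still be used safely.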

  8. toy anyway by dshk · · Score: 1, Interesting

    It has no power loss protection, so now it could lose data much faster. It should be good for worthless data but that is all. I am not sure if it has at least small capacitors, the half-assed power loss mitigation technique which does not protect new flushed data, but at least prevents the loss of old, unrelated data.

    1. Re:toy anyway by ledow · · Score: 4, Informative

      Most drives sold in the world today don't have power loss protection either.

      If it matters to you, you put that stuff in the controller, not the drive.

    2. Re:toy anyway by dgatwood · · Score: 4, Insightful

      I think the concern is that this would somehow dramatically increase the probability of data loss caused by powering the drive even while it appears to be inactive. After all, it randomly rewrites flash blocks. However, in practice, this should not be an issue.

      Presumably, their firmware never erases and rewrites a flash page in place. And presumably it does not write the log entry that causes the drive to look for those blocks in the new location until after the page has been fully written. Assuming they do, in fact, follow those rules, then a power interruption during a block clone should never result in loss of any data, because the data still exists in the old page, which will not be invalidated in favor of the replacement copy until that replacement copy is fully written. If they aren't doing that, then they are incompetent, and their drives should never be trusted with cat pictures, much less valuable data.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.
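
The copy-then-commit rule described above (write the new page fully, only then update the mapping) can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `ToyFlash`, `PowerLoss` and the `fail_during_program` flag are invented for the simulation and say nothing about how Samsung's firmware is actually written.

```python
class PowerLoss(Exception):
    """Raised by the simulation to model power being cut mid-operation."""

class ToyFlash:
    def __init__(self):
        self.pages = {}      # physical page number -> data
        self.mapping = {}    # logical block -> physical page (stands in for the log)
        self.next_free = 0

    def _program(self, data, fail=False):
        page = self.next_free
        self.next_free += 1
        if fail:
            # Torn write: the new page holds garbage and power dies before commit.
            self.pages[page] = None
            raise PowerLoss()
        self.pages[page] = data
        return page

    def write(self, logical, data, fail_during_program=False):
        new_page = self._program(data, fail_during_program)  # 1: write new copy
        old_page = self.mapping.get(logical)
        self.mapping[logical] = new_page                      # 2: commit mapping
        if old_page is not None:
            del self.pages[old_page]                          # 3: old page reclaimable

    def read(self, logical):
        return self.pages[self.mapping[logical]]

    def refresh(self, logical, fail_during_program=False):
        """Background rewrite of a block to a fresh page."""
        self.write(logical, self.read(logical), fail_during_program)

flash = ToyFlash()
flash.write(7, b"cat picture")
try:
    flash.refresh(7, fail_during_program=True)  # power cut during the refresh
except PowerLoss:
    pass
print(flash.read(7))  # the old copy is still the one the mapping points to
```

Because the mapping is only updated after the new page is complete, an interrupted refresh leaves the original data reachable.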

    3. Re:toy anyway by Gr8Apes · · Score: 1

      My guess is they do exactly this, because if they didn't, there would be large numbers of reports of data corruption.

      --
      The cesspool just got a check and balance.
    4. Re:toy anyway by dgatwood · · Score: 1

      I would certainly hope so. It's what the rest of the industry does for every write. Loss of data that isn't being actively modified should be almost impossible if the people writing the firmware are even halfway competent (ignoring unlucky filesystem metadata changes aborted halfway through).

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    5. Re:toy anyway by tlhIngan · · Score: 4, Interesting

      It has no power loss protection, so now it could lose data much faster. It should be good for worthless data but that is all. I am not sure if it has at least small capacitors, the half-assed power loss mitigation technique which does not protect new flushed data, but at least prevents the loss of old, unrelated data.

      You don't need power protection if you take precautions and design your system around the fact that power can be removed at any time.

      Some SSDs cheaped out and didn't have power protection AND used features that require it (usually to get better performance - obviously if you're not worried about power dropping abruptly, you can avoid writing code to protect against it). It's no surprise those SSDs corrupted data liberally because their translation tables got corrupted.

      But there are plenty of SSDs that aren't concerned with performance. In fact, if you're on SATA, performance is no longer important as they're all maxing out the SATA bus. If you're wondering why they all seem to be at 540MB/sec read and writes, that's because SATA is now the bottleneck. So now you can spend lots of time working on power-fail-safe firmware - because if you're stuck at 540MB/sec, it doesn't matter what performance tweaks you do because you're stuck there. If you can do 1GB/sec internally, and power safe code loses 40% of that, you do it. 1GB/sec is wasted on SATA, but you can save a few bucks by not needing power backup parts. 40% loss brings you down to only 600MB/sec, which is faster than SATA still.

      It's why next gen SSDs are going PCIe - 540MB/sec is nothing compared to 1.5GB/sec you can find on Apple's machines.

      Power fail is nice to have, but given everything's limited by SATA more than anything else, it's currently optional. For PCIe SSDs, you'd expect power fail components because you need performance.

      Ironically, the faster it is, the less you need since you just need to dump your tables to storage ASAP, and if you're able to do 1.5GB/sec writes, and your tables are 500MB in size, you only need power for a third of a second. While if your media speed was only 500MB/sec, you'd need power for a whole second.
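
The two back-of-envelope calculations in this post can be written out as follows. This is only a sketch using the post's own figures (500MB tables, 540MB/sec SATA ceiling, 1GB/sec internal speed, 40% overhead); the function names are made up for illustration. With those numbers, flushing 500MB at 1.5GB/sec works out to about a third of a second.

```python
def hold_up_seconds(table_mb, write_mb_per_s):
    """Time needed to flush the mapping tables at a given media write speed."""
    return table_mb / write_mb_per_s

def host_visible_mb_per_s(internal_mb_per_s, overhead_fraction, link_mb_per_s):
    """Throughput the host sees: internal speed after overhead, capped by the link."""
    return min(internal_mb_per_s * (1 - overhead_fraction), link_mb_per_s)

print(round(hold_up_seconds(500, 1500), 2))   # fast media
print(hold_up_seconds(500, 500))              # slow media
print(host_visible_mb_per_s(1000, 0.4, 540))  # power-safe firmware behind SATA
```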

    6. Re:toy anyway by gweihir · · Score: 1

      No modern drive except some very expensive data-center drives has it. In addition, basically all modern drives lie about having flushed data to permanent storage. This includes spinning disks. The Linux file-system folks found out about it some while ago. It is not a big deal if the OS filesystem driver knows about it. Also, if you have a good PSU, that will keep power up for 20ms or more after the respective signal line signals loss of power and then drop power slow enough that the drive has time to find out itself and can still do something about it. The Linux kernel folks had to do a hard switch off in order to get defects. No reasonable PSU does them.

      The world is not perfect, deal with it. Probably the only reason why this came to light at all is because Samsung sold so many units. I do not want to know what other manufacturers get wrong, but I have a dead OCZ here and one that returns data with bit-errors, even when freshly written, but never gives any error messages.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    7. Re:toy anyway by m.dillon · · Score: 1

      Actually, more and more SSDs today *DO* have power loss protection. Take it apart... if you see a bunch of capacitors on the mainboard all bunched together with no obvious purpose it's probably to keep power good long enough to finish writing out meta-data. Cheaper to use a lot of normal caps than to use thin-film high capacity caps.

      -Matt

  9. Re:To keep the performance up the advertised value by ledow · · Score: 4, Informative

    Say you bought the 1Tb version (which is big for an SSD).

    In this case, you rewrite it six times a year. That's 6Tb of write. That's...well... pathetic compared to the write expectancy of an SSD anyway.

    So, actually, it's not that big a deal at all.

  10. Solution: by Anonymous Coward · · Score: 0

    Use a copy on write filesystem, and make sure to image and reimage every 8 weeks. For optimal performance: heat up the NAND chips but cool down the controller itself.

    And vote with your money.

  11. Re:To keep the performance up the advertised value by Anonymous Coward · · Score: 0

    It's low enough that most users probably won't have any disadvantage from these additional writes, but it's not entirely negligible. "Write expectancy" of SSDs isn't really that generous. It's typically on the order of a few hundred complete writes. Samsung claims an expected lifetime of close to 30 years for typical workstation loads. Over that time, 180 complete writes is a significant portion of the expected write endurance. Now, I wouldn't expect any of these drives to still be in use in 2045, but just because the downgraded life expectancy is probably still good enough for most people does not mean it's not a significant downgrade.

  12. Re:The new firmware misreports its supported featu by Anonymous Coward · · Score: 0

    Apparently the new firmware now advertises that it supports queued TRIM, when in fact it doesn't https://bugs.launchpad.net/ubu...

    The old firmware did not advertise queued TRIM support, so it wasn't an issue. The solution is a kernel patch to blacklist queued TRIM on all Samsung 8xx drives.

    Whoa there! The Samsung 840PRO in addition to all the 850 series were not affected by this issue. You are painting with a very wide brush. I sincerely hope this is not the kind of decision-making that makes its way into the Linux kernel (though I suspect that is sometimes the case).

  13. Re:To keep the performance up the advertised value by Anonymous Coward · · Score: 0

    2045? Really? How many hard drives do you have running from 1985?

  14. Re:To keep the performance up the advertised value by ArcadeMan · · Score: 1

    I still have one, but it requires 1.21 gigawatts.

  15. Just like Floppies then? by Anonymous Coward · · Score: 4, Informative

    Goodness me.
    We had this problem back in the 1970s/1980s with floppy disks!

    When the disc drive writes to a part of the surface of the disc it energises the magnetic particles to saturation. This ability of the material to keep so much of its original pulse of energy was called the clipping level of the floppy.

    As soon as the area is energised, it starts to decay (hopefully) very slowly over time. Once it decays below 40% of the energy originally given, that bit is lost and data is lost.

    Some cheap floppies had a nasty low clipping level as they'd use cheap materials, over time of say a year the area that hasn't been rewritten to would decay and that bit was then unreadable. You lost that data. We had various programs that would take the 8", 5 1/4" and 3.5" floppies and read then rewrite the entire disk to ensure that the disc was refreshed. As I worked in Ferranti for the UK space and military, I could ask the likes of TDK, Maxell, etc. what the clipping levels of their discs were. Something the public didn't have access to.

    If the sellers wouldn't say, we simply didn't buy from them. Let me tell you most low-medium priced suppliers hid this value and we didn't do business. Glad to say the top disc suppliers were always open and we'd buy discs with an over 80% clipping level!

    With these MLC SSDs the voltage level is very important. It'll decay over time, nothing can stop it.

    1. Re:Just like Floppies then? by gweihir · · Score: 1

      Well indeed. The demodulation problem for decaying signals is not going away. It is basic physics.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  16. Dear Samsung by wbr1 · · Score: 1

    Admit you're wrong and fix the vanilla 840s and OEM devices based on the same or similar firmware/NAND. Keep mistreating customers and they will leave. This is not the smart tv market with average joes not caring. Most SSDs are purchased by pros, are specced en masse for machines by pros, or by well informed enthusiasts. Treat this customer base ill at your peril.

    --
    Silence is a state of mime.
  17. Re:To keep the performance up the advertised value by Anonymous Coward · · Score: 0

    Can you read? I said I don't expect any of the drives to be in use for 30 years, but the 30 year lifetime isn't a claim I made, that was Samsung's. Anyway, let's say you expect to use it for 7 years. That's not an unrealistic expectation. Over that time, 40 additional full rewrites amount to 5% of the total expected write endurance of the drive. Not a big issue, but not nothing either. I'm sure Samsung would object if all their customers came up 5% short on their end of the deal. And besides, a full rewrite of a 1TB SSD every two months isn't negligible from a power consumption point of view in a mobile device, because these additional writes will occur at times when the drive would normally be idle. That is 17GB of additional writes per day and roughly doubles the typical total writes of a desktop system.
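
The arithmetic in this sub-thread can be checked with a few lines. A sketch only: the 800 rated full-drive writes is an assumption implied by the "40 rewrites is 5%" figure above, not a number from Samsung, and the function name is invented for illustration.

```python
def refresh_overhead(capacity_gb, refresh_interval_days, years, rated_full_writes):
    """Return (full rewrites over the period, extra GB/day, share of rated endurance)."""
    rewrites = years * 365 / refresh_interval_days
    extra_gb_per_day = capacity_gb / refresh_interval_days
    share = rewrites / rated_full_writes
    return rewrites, extra_gb_per_day, share

# 1TB drive, everything rewritten every 2 months, kept for 7 years.
# rated_full_writes=800 is an assumed value, see the note above.
rewrites, gb_per_day, share = refresh_overhead(1000, 60, 7, 800)
print(f"{rewrites:.1f} rewrites, {gb_per_day:.1f} GB/day, {share:.1%} of rated endurance")
```

This is the worst case where the whole drive is full of old data; a firmware that only rewrites stale blocks would write less.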

  18. Re:The new firmware misreports its supported featu by Anonymous Coward · · Score: 0

    Seriously. How does this kind of knee-jerk bullshit get modded +5 informative?

  19. 840 by Anonymous Coward · · Score: 0

    I'm still getting 10mb/sec on samsung 840. they have released no fix. only for evo :(

  20. Re:To keep the performance up the advertised value by Anubis+IV · · Score: 2

    Say you bought the 1Tb version (which is big for an SSD).

    1Tb was big maybe five years ago for an SSD, but these days pretty much every SSD-based laptop has that much or more. If you want to go big now, 1TB is around where you'd look. Or 1TiB. But not 1Tb.

  21. Re: To keep the performance up the advertised valu by Anonymous Coward · · Score: 0

    You obviously failed to get his point.

  22. Strange Linux behavior by tlhIngan · · Score: 3, Insightful

    We have a bunch of shared build PCs with 840 Evo SSDs in them and we noticed strange problems when we build off the SSD (over say, the HDD).

    Basically what would happen after a little while (a month), all of a sudden during the build the entire system would practically lock up - all the cores are pegged at 99% system time, and system responsiveness collapses - it can literally be minutes for the system to respond. It makes a little headway, but compilation speed drops (since 99% of every core is spent in the kernel). It's completely fine off the hard drive, and if it wasn't for this loss in speed, the SSD would be faster (right now because it pauses a few minutes every 15 or so, the HDD is faster).

    It's completely unusual - I did try to analyze the kernel, which appeared to have all the cores tied up in ext4 spinlocks. Not sure if it's a result of the tables being slow and blocking or what.

    It happens under high load - I normally set the build at 12 threaded builds (8 cores!). Thought at first it was Linux collapsing under the weight of the build, but it's actually the SSD. Building off hard drive on the same system is no problem at all.

    1. Re:Strange Linux behavior by Anonymous Coward · · Score: 1

      LKML would probably love to have a detailed report of that problem. Sounds like quite an interesting situation.

    2. Re:Strange Linux behavior by Anonymous Coward · · Score: 0

      And what crappy AV program do you have installed with on-access file scan?

    3. Re:Strange Linux behavior by m.dillon · · Score: 1

      This is not related to the SSD. If your cpus are pegged then it's something outside the disk driver. If it's system time it could be two things: (1) Either the compilers are getting into a system call loop of some sort or (2) The filesystem is doing something that is causing lock contention or other problems.

      Well, it could be more than two things, but it is highly unlikely to be the SSD.

      One thing I've noticed with fast storage devices is that sometimes housekeeping operations by filesystems can stall out the whole system because the housekeeping operations assume the disk I/O will block when, in many cases, the disk I/O completes instantly and essentially does not block, causing the kernel thread to eat more cpu than intended.

      -Matt

    4. Re:Strange Linux behavior by Anonymous Coward · · Score: 0

      No, most likely the SSD is forcing multiple interrupts on the SATA bus. I've seen similar behavior with Windows too upon a failing drive. Just part of the x86 architecture in how interrupts are performed.

    5. Re:Strange Linux behavior by tlhIngan · · Score: 1

      This is not related to the SSD. If your cpus are pegged then it's something outside the disk driver. If it's system time it could be two things: (1) Either the compilers are getting into a system call loop of some sort or (2) The filesystem is doing something that is causing lock contention or other problems.

      Well, it could be more than two things, but it is highly unlikely to be the SSD.

      One thing I've noticed with fast storage devices is that sometimes housekeeping operations by filesystems can stall out the whole system because the housekeeping operations assume the disk I/O will block when, in many cases, the disk I/O completes instantly and essentially does not block, causing the kernel thread to eat more cpu than intended.

      True, however, it seems to be caused by the SSD. As in the same machine with SSD and HDD, the SSD will cause the issue, the HDD will not.

      And that's the real level of granularity I have into the problem.

      I do note it only happens when there is a lot of I/O going on - even the simple act of tarballing a big build directory stalls out (I was actually trying to avoid this issue with the 840EVO by simply refreshing my build tree by tarring up the build onto the HDD, then deleting the SSD, doing a TRIM, then untarring).

      The problem is it's not consistent at all. A similar PC (same model, different SSD and HDD) using an 840 Pro (and now 850 Pro as an upgrade) never suffered from the problem.

      And given no one else seems to have found the issue with Linux, I'd hesitate posting to the LKML - the 840 Evo's have been out for ages, and if there was a real problem, it would've been reported.

      It's just strange when you look at the CPU graphs in Gkrellm and it goes from blue (user) to all orange (system) time and even it stalls out. Like the kernel goes into some sort of introspective state where it contemplates the universe and ignores everything else.

      Like I said, it may not be the SSD, but the SSD seems to be an important contributor to the problem.

  23. Re:The new firmware misreports its supported featu by gweihir · · Score: 3, Informative

    Judging from the kernel's blacklist, queued TRIM does cause issues on quite a few SSDs, the 840EVO just did not announce that capability before the patch and now does (but cannot do it), and hence the problem. The kernel folks are now adding all Samsung 8xx to the blacklist, which will likely fix the issue. As Windows is traditionally behind in these areas it may just not use queued TRIM at all. I do hope that Samsung adds (more) Linux test systems to their qualification process now, though. Side-note: The 850PRO is apparently affected as well, but the kernel already blacklists it.

    The conclusion here is that apparently getting SSD firmware right is a pretty big challenge and that SSD technology is still evolving. Also, not enough testing on Linux and likely not enough really smart people in the SSD firmware team. It is a learning process and the prevalent "clueless MBA bean-counter plague" will likely affect Samsung as well, just as it does any other large company.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  24. Re:The new firmware misreports its supported featu by jones_supa · · Score: 1

    As Windows is traditionally behind in these areas it may just not use queued TRIM at all.

    That is my suspicion as well. This is sooo often the issue with all sorts of firmware. Linux tries to implement cutting edge features by spec, but in practice the hardware makers just write everything against Windows spec. The hardware might announce ACPI 5.0 support or queued TRIM support, but the actual codepaths are stubs that don't work properly. When such hardware is used under Linux, unexpected error states can be encountered. Sad trombone.

  25. Re:The new firmware misreports its supported featu by Anonymous Coward · · Score: 0

    Because the issue is in Samsung's general SSD codebase it would seem.
    They released firmware XM02B6Q for the 850 Pro in February this year, only to pull it because it was bricking drives.
    But that firmware also introduced queued TRIM support, something the SSD drive itself does not support!
    Now we see the same with the new firmware EXT0DB6Q released for the 2.5" 840 EVO.
    If that is not proof enough that Samsung does not know what they are doing and the kernel developers do I don't know what to say!

  26. Re:The new firmware misreports its supported featu by gweihir · · Score: 1

    That and apparently many hardware makers do not test against Linux or do not do it well. As reverse-engineering what Windows does is really not easily possible, we will see things like this from time to time. I hope this will result in more and more public shaming, as that is the only way to make the bean-counters realize they are not investing enough effort.

    The only good thing is that the Linux kernel folks are pretty fast to respond.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  27. Re:To keep the performance up the advertised value by Anonymous Coward · · Score: 0

    Oh come on, man. You should have done it in Authentic Western-Style Anonymous Coward.

    Like this: "small b means bits you shithead".

  28. Re:To keep the performance up the advertised value by udippel · · Score: 2

    Wow. I'm surprised how you take a crappy drive like that so easily. I bet you didn't pay the serious money they actually charged for it? And it's no cheapo brand neither; rather the self-declared Mercedes.
    And the prescribed maintenance is rewriting all data twice a month because it tends to be forgetful. What a fantastic piece of hardware! That is what we ought to discuss here; rather than if a brute-force workaround ... just works. Sure it does!

  29. Hyped up by Anonymous Coward · · Score: 0

    I dunno if everyone is aware that Samsung pays through the nose yearly to get large numbers of all sorts of journalists of technical papers on free business trips to Korea, several days, all included: business class regular flights, 5*-hotels, all meals, to cover the regular launch of their latest and greatest SSDs, including visits to the factories.
    No wonder that Samsung SSDs get hyped up after such incentives by the writers of those journals.
    If in doubt: no fantasy, but an enjoyment that a family member of mine partakes in regularly, the 'SSD-man' of one of those hardware journals. And now you won't tell me that those expenses are not factored into the selling price.
    Luckily, I bought a Crucial on his recommendation, though.

    I think, AC is the adequate authorship in this particular case! ;-)

  30. Re:To keep the performance up the advertised value by Christian+Smith · · Score: 1

    And the prescribed maintenance is rewriting all data twice a month because it tends to be forgetful. What a fantastic piece of hardware! That is what we ought to discuss here; rather than if a brute-force workaround ... just works. Sure it does!

    No. The prescribed maintenance is to rewrite old data that hasn't been written to for 2 months, because it tends to be slow to read otherwise. No-one has reported data loss as a result of this.

    Notice also, that all the reporting so far uses the artificial benchmarks to demonstrate the problem. In normal use, you'd be unlikely to ever notice, unless you're copying big old data files from one location to another.
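
For drives without a firmware fix, the "rewrite data older than 2 months" idea can be approximated from user space. This is a hedged sketch, not what the firmware does: it works on files rather than flash blocks, the 60-day default is taken from the discussion above, and it ignores hard links, ownership and files that are open elsewhere. The function name is hypothetical.

```python
import os
import shutil
import tempfile
import time

def refresh_old_files(root, max_age_days=60):
    """Rewrite every regular file under root whose mtime is older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    refreshed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or os.path.getmtime(path) >= cutoff:
                continue
            # Write a fresh copy next to the original, then swap it in atomically.
            fd, tmp = tempfile.mkstemp(dir=dirpath)
            os.close(fd)
            shutil.copyfile(path, tmp)   # new data, new mtime
            shutil.copymode(path, tmp)   # keep permission bits
            os.replace(tmp, path)
            refreshed.append(path)
    return refreshed
```

Because the copy gets a new modification time, a file is not rewritten again until it has aged past the limit, which keeps the extra writes down.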

  31. Samsung vs Crucial by Anonymous Coward · · Score: 0

    Why do so many people seem to use these Samsung drives when they seem to have so many problems?

    I never hear anything about Crucial SSDs; It would seem logical that those would be better than Samsung SSD's, no?

  32. Re:To keep the performance up the advertised value by udippel · · Score: 0

    Sorry, my mistake: wrong wording. I didn't mean data loss, but serious slowdown.

    'unless' what? Why so apologetic for Samsung? I still think you can't be affected, or you must be rich. If I pay four-fold for SSD, I won't find excuses for the manufacturer when the SSD is slower than a standard disk running on platters. Why would you?
    Yes, my movie collection for example: some movies that I didn't watch for years. "Oh, sorry, Samsung 840!" when the movies stutter, or "oh, sorry, Samsung 840!" when copying close to 1 TB takes ... wait ... 100,000 seconds, that is some 30 hours? And this is still generous, assuming 10 MB/s; while some report even lower transfer rates.

    Are you by any chance related to Samsung and try to stonewall criticism?

  33. Let's put everything we know on a row and think by udippel · · Score: 1

    0. About half a year ago, performance at read of data not read for a long time is observed to degrade.

    1. Samsung acknowledges this fact.
    The work around is obvious to everyone with a common sense: re-read data with old access dates. This would be no fix, but the work-around of choice.

    2. Samsung offers a fix some months later. Immediate observation: this fix doesn't fix the problem.
    Samsung asks for more time and promises a fix.

    3. Some more months later, Samsung provides the 'fix', which isn't, but the almost obvious work-around: regularly re-write (!!) the data.

    Conclusions:

    4. Samsung has tried to find a fix during 6 months, but in vain. The final solution is a brute-force work-around.

    5. Strangely, though, the obvious work-around, that is re-read the data regularly, is not chosen. Instead, the data are re-written. This points to

    6. There is more than meets the eye, because the path of no-wear, lower power re-read is not taken, but the one that uses additional power and additional life-cycles. This can't be an oversight on the side of Samsung, but intentional.

    7. Why? What is it that we all don't know? There must be additional problems (unknown to me, at least) for Samsung to take this path for the work-around.

    1. Re:Let's put everything we know on a row and think by Anonymous Coward · · Score: 0

      @udippel: While it's clear that Samsung's first fix didn't solve the issue, I don't think this suggests "additional problems", merely that the fix didn't work. As for the details of the fix, it seemed to be tweaking the calibration parameters for reading the old data. Maybe this led to an improvement in the lab, but clearly didn't lead to a sustainable solution in real life. It's not "obvious" the problem can be solved with calibration parameters since after all the underlying phenomenon is changing/degrading of the data over time without any active involvement/action of the controller. Maybe a solution could have been for the controller to keep track of when the data was written and calibrate read parameters accordingly (expecting lower voltages for old data). But maybe the notion of "time" is hard for the controller. There's no clock inside the SSD and when it's switched off it is switched off. How will the SSD know how long has passed since it was switched off? So in other words I think Samsung's first solution was simply to do some general tweaking of calibration parameters or maybe some rough distinction of old vs new data, which clearly wasn't enough.

      Given enough time, the data will degrade to the point it can't even be read back using error correction (with the associated slow down) because the data isn't there. If this is an unsatisfactory short time, the only solution is to periodically rewrite the data.

      Now I don't know how the re-write was done. But I hope they periodically try and re-read all the data, and then rewrite the data that appears to be getting weaker (compared to just rewriting everything all the time). In that way, you are at least only rewriting where necessary.
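
      The "rewrite only where necessary" idea above can be sketched as follows. This is purely an illustration and not Samsung's firmware: the `BlockScan` record, the `blocks_to_refresh` function and the `margin` parameter are hypothetical names, and the sketch assumes a background scan can report how many bit errors ECC had to correct per block.

```python
from dataclasses import dataclass


@dataclass
class BlockScan:
    """Result of one background scan read of a block (hypothetical record)."""
    block_id: int
    corrected_bits: int  # bit errors ECC had to fix during the scan read
    ecc_limit: int       # most bit errors ECC can correct for this block


def blocks_to_refresh(scans, margin=0.5):
    """Return IDs of blocks whose corrected-error count has reached
    margin * ecc_limit, i.e. blocks that look weak enough to rewrite."""
    return [s.block_id for s in scans if s.corrected_bits >= margin * s.ecc_limit]


scans = [
    BlockScan(block_id=0, corrected_bits=2, ecc_limit=40),   # still healthy
    BlockScan(block_id=1, corrected_bits=25, ecc_limit=40),  # past half the ECC budget
    BlockScan(block_id=2, corrected_bits=39, ecc_limit=40),  # almost unrecoverable
]
print(blocks_to_refresh(scans))  # [1, 2]
```

      A lower `margin` rewrites earlier and costs more write cycles; a higher one saves wear but leaves less safety before the data becomes unreadable.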

      Still, what happens if the disk is switched off for a very long time (or is only switched on for brief periods, too short to allow rewriting to take place to a significant degree)? Will the data just get progressively degraded and become unreadable? I'm especially concerned about data that is rarely touched but that the user might just need some day.

      There's one unexplained thing in the whole affair though: Why didn't the read performance for the old data (even before any FW upgrades) increase after the initial slow read? After all, if the disk controller has to use ECC to recover the data (explaining the slow performance), it should schedule that data to be rewritten to another block soon.

  34. Power usage? by Anonymous Coward · · Score: 0

    Ugh.

    So data drift starts happening at 8 weeks, and the old data starts going from 500 MB/s to as slow as 30 MB/s. The first update tried to use different voltages when trying to access the data so it'd be less slow, and now the second update continuously rewrites data in the background in order to keep it fresh. So the drive I bought a few months ago for its low power usage, speed and reliability now has 2 out of the 3 compromised. I'm trying to find updated power usage statistics if it's constantly rewriting in the background, but not finding anything. Really wishing I'd picked up the Crucial m500 at this point...

    Let the class action lawsuits start.

  35. Re:The new firmware misreports its supported featu by Anonymous Coward · · Score: 0

    I think in this case this is really inexcusable. So, they've got this shiny new queued trim feature they'd like to announce support for. They know neither Windows nor OS X will actually use it. They decide not to test it with the only easily available system (Linux) which does support it.

    So how did they actually test it I wonder, if at all?

    Maybe they used the same test equipment as Crucial does, I guess... It took Crucial years to finally acknowledge it is broken on all their SSDs. But Samsung is even more stupid here: despite knowing (or being in a position to know) that this feature caused problems on SSDs from other manufacturers, they decide to now support this feature (which almost no one really would have missed anyway), don't test it, and to everybody's surprise it fails just the same. Mind-bogglingly stupid...

  36. Re:The new firmware misreports its supported featu by gweihir · · Score: 1

    Given that the kernel 4.0.2 blacklist lists Micron, Crucial and Samsung as broken for this feature, you may be on to something. I also completely agree on the stupid. It is likely more complex though: They may have tried to produce the firmware cheaper than possible, by having only semi-competent (cheaper) people on it and replacing technological insight with "processes". Would not surprise me one bit. The MBA-bean-counter plague is strong in the industry these days. Save a penny, lose a billion is the name of the game.

    Fortunately, Linux Kernel development is still a meritocracy (user-space only very partially so, just look at systemd or Gnome....) and things tend to get resolved fast, and in full view of everybody. The fix is in next-20150511, and I expect will go into all other maintained kernels soon. And more conservative distros did not turn on TRIM anyways and are unaffected.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  37. Head Up Ass by WhoBeDaPlaya · · Score: 1

    It's taking them ages to admit that the vanilla 840 has the same problem. Just like the silent data corruption firmware problem with their 2TB Spinpoint F4, which they purported to have fixed (just like the 840 EVO). Now before you mark me as a troll / flamebait, know that I do like some of Samsung's storage products. I have a bunch of 1TB Spinpoint F1, 1TB Spinpoint F3 and 512GB 840 Pro drives, all performing brilliantly and reliably.