Slashdot Mirror


Samsung Finds, Fixes Bug In Linux Trim Code

New submitter Mokki writes: After many complaints that Samsung SSDs corrupted data when used with Linux, Samsung found that the bug was in the Linux kernel and submitted a patch to fix it. It turns out that kernels without the final fix can corrupt data if the system is using Linux md RAID with RAID0 or RAID10 and issues trim/discard commands (either via fstrim or by the filesystem itself). The vendor of the drive did not matter, and the previous blacklisting of Samsung drives for broken queued trim support can most likely be lifted after further tests. According to this post the bug has been around for a long time.
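
For readers wondering why striping matters here: on a striped array, one discard over a logical range has to be cut into per-member pieces at chunk boundaries. The sketch below illustrates only that arithmetic for a simplified RAID0 layout (equal-size members, one zone, units in sectors). It is not the kernel's md code, and the function name and parameters are hypothetical. If a step like this is computed wrongly, a discard can presumably land on sectors that still hold live data.

```python
def split_discard(start, length, chunk_size, num_devices):
    """Split the logical range [start, start + length) into per-device pieces.

    Assumes a simplified RAID0 layout: chunks are dealt round-robin to
    equal-size member devices. All values are in sectors. Illustrative only.
    """
    pieces = []
    offset = start
    end = start + length
    while offset < end:
        chunk_index = offset // chunk_size      # which logical chunk we are in
        within = offset % chunk_size            # position inside that chunk
        # Never cross a chunk boundary in a single piece.
        piece_len = min(chunk_size - within, end - offset)
        device = chunk_index % num_devices      # member that owns this chunk
        device_offset = (chunk_index // num_devices) * chunk_size + within
        pieces.append((device, device_offset, piece_len))
        offset += piece_len
    return pieces


# A discard starting mid-chunk and spilling into the next chunk is split in two.
print(split_discard(32, 64, chunk_size=64, num_devices=2))
```

The invariant worth checking is that the piece lengths add up to the original length and that no piece crosses a chunk boundary.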

12 of 184 comments

  1. awkward! by Anonymous Coward · · Score: 4, Insightful

    Well, that's gotta be embarrassing for everyone bashing Samsung over this. I remember reading some rather strong opinions about who was at fault.

    1. Re:awkward! by mwvdlee · · Score: 2, Insightful

      Even more so for the kernel developers that blacklisted the Samsung drives.
      These developers should probably be banned from kernel development or at least banned from making decisions regarding functionality.
      Creating code with a bug is human; not doubting your own code and blaming somebody else is stupid.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    2. Re:awkward! by Khyber · · Score: 2, Insightful

      If the kernel devs and Linus don't apologize, they're all a bunch of self-absorbed shitlords and should be smacked off the face of this planet.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    3. Re:awkward! by Anonymous Coward · · Score: 5, Insightful

      The firmware bug of Samsung drives, a very severe one actually, was confirmed by Samsung. The RAID 0 issue is a totally different one, hardly affecting anyone.

      So yes, the severe issue was a bug on Samsung's side, while the very rare RAID 0 bug is a Linux kernel one.

  2. Re:Bravo by gstoddart · · Score: 4, Insightful

    After many complaints that Samsung SSDs corrupted data when used with Linux

    There was definitely some self-interest there.

    Samsung can't have people saying their SSDs corrupt data when it's not them doing it.

    --
    Lost at C:>. Found at C.
  3. Just another case.... by darkain · · Score: 4, Insightful

    This is just another case of "Not My Problem" syndrome that too many techs get into. They think their code/tools/systems/whatever must be perfect, and others are the ones fucking up. Samsung drives went on a blacklist for issuing the commands to them due to this bug? "WALP, LINUX IS PERFECT, MUST BE THE HARDWARE GUYS, even though their devices perform perfectly on other OSes" - and instead now we're left with a bug in Linux that corrupts data until the patch can make its way through the distro channels and be pushed out to end users.

  4. Re:Why did it only happen on Samsung's SSDs? by Anonymous Coward · · Score: 5, Insightful

    Confirmation bias. It was happening with other brands, but for one reason or another, people focused in on Samsung as the culprit, and once that happened, there was no getting out of it.

  5. Re:Crying wolf by beernutz · · Score: 4, Insightful

    The point however is that in a closed source system, Samsung could not have found and fixed the bug themselves.

    --
    (stolen from DaBum) I am dyslexia of borg - your ass will be laminated.
  6. Re:Crying wolf by Anonymous+Brave+Guy · · Score: 3, Insightful

    Is that really the point, though?

    Vendors of products affected by bugs in closed source software collaborate all the time. It's usually in their mutual interests, and it has been going on forever. Just look at the extraordinary lengths Microsoft used to go to in order to maintain compatibility of Windows with older applications.

    On the other hand, the existence of this issue in the first place, the fact that other vendors whose products may also have been affected did not act as Samsung did, and particularly the denial and active yet unjustified blacklisting of Samsung products by the people running the project with the real fault are indictments of that project, no matter how open it claims to be or how big and famous it is.

    This whole affair does not look good for Linux, and more importantly, it does not reflect well on the people currently running development of Linux.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  7. Re:Bravo by Anonymous Coward · · Score: 4, Insightful

    Of course, this is only possible when the "other person's" code is Free Software. If this had been a problem in Windows/OSX that Microsoft/Apple was refusing to fix, there's little Samsung could have done about it.

  8. Re:fairly common to blacklist devices by Midnight+Thunder · · Score: 4, Insightful

    Hardware firmware is commonly buggy. Device drivers often have to work around buggy hardware, so blacklisting devices for various functionality is not at all unusual.

    If the code seems to work with other devices and breaks with a new device, then the first instinct is going to be to assume the new device is doing something wrong.

    Another way of seeing things is that even if the bug is in the kernel, blacklisting still prevents damage to data on said vendor's hardware. When it comes to data corruption the first thing to do is limit damage, no matter who is at fault. Afterwards, you can work together to try to isolate the source of the problems. Having unhappy users and customers is never good, unless you are the competition.

    --
    Jumpstart the tartan drive.
  9. How was this recreated before the bug existed? by godamntheman · · Score: 4, Insightful

    Something doesn't add up ... The fix for this addressed an oversight in a relatively new "bio_split()" routine that was merged with the immutable bio vector patch set for Linux kernel 3.15. The Algolia blog referenced in the Samsung patch claims it was able to replicate the discard issue using kernels 3.2, 3.10, and 3.14, before the bug existed. What gives?