Slashdot Mirror


Samsung Finds, Fixes Bug In Linux Trim Code

New submitter Mokki writes: After many complaints that Samsung SSDs corrupted data when used with Linux, Samsung found that the bug was in the Linux kernel and submitted a patch to fix it. It turns out that kernels without the final fix can corrupt data if the system uses Linux md RAID with raid0 or raid10 and issues trim/discard commands (either via fstrim or from the filesystem itself). The vendor of the drive did not matter, and the previous blacklisting of Samsung drives for broken queued trim support can most likely be lifted after further tests. According to this post the bug has been around for a long time.
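
As a rough illustration of the conditions described above, the following minimal sketch lists md arrays whose level is raid0 or raid10 by scanning text in the format of /proc/mdstat. The function name affected_md_arrays and the sample text are hypothetical, made up for this example. The sketch only identifies the array levels named in the summary; it cannot tell whether a given kernel carries the fix or whether discards are actually being issued.

```python
import re

# RAID levels named in the summary as exposed to the bug.
AFFECTED_LEVELS = {"raid0", "raid10"}

# Hypothetical sample in the general layout of /proc/mdstat.
SAMPLE_MDSTAT = """\
Personalities : [raid0] [raid1] [raid10]
md0 : active raid0 sdb1[1] sda1[0]
      1953262592 blocks super 1.2 512k chunks

md1 : active raid1 sdd1[1] sdc1[0]
      976631488 blocks super 1.2 [2/2] [UU]

md2 : active (auto-read-only) raid10 sdh1[3] sdg1[2] sdf1[1] sde1[0]
      1953262592 blocks super 1.2 512K chunks 2 near-copies [4/4] [UU]

unused devices: <none>
"""


def affected_md_arrays(mdstat_text):
    """Return {array_name: level} for arrays whose level is raid0 or raid10."""
    found = {}
    for line in mdstat_text.splitlines():
        # Array status lines start in column 0 with the device name, e.g. "md0 : ...".
        match = re.match(r"^(md\S*)\s*:\s*(.*)$", line)
        if not match:
            continue
        name, rest = match.groups()
        # The level appears as a bare token such as "raid0"; member
        # devices look like "sda1[0]" and so never match.
        for token in rest.split():
            if token in AFFECTED_LEVELS:
                found[name] = token
                break
    return found


for name, level in sorted(affected_md_arrays(SAMPLE_MDSTAT).items()):
    print(f"{name}: {level}")
```

On a real system the same function could be given the contents of /proc/mdstat instead of the sample text.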

8 of 184 comments

  1. Bravo by Virtucon · · Score: 4, Interesting

    Nice to see vendors working together to improve Linux.

    --
    Harrison's Postulate - "For every action there is an equal and opposite criticism"
    1. Re:Bravo by DarkOx · · Score: 5, Interesting

      Sure, there was self-interest. Still, I think they deserve a lot of credit here. Rather than giving the typical "It's not my code" response of a developer who is sure the problem is elsewhere (rightly or wrongly), they actually found and fixed the problem. That is good behavior!

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    2. Re: Bravo by bill_mcgonigle · · Score: 4, Interesting

      Yeah, the outcome is great. I just wonder why they waited more than a year to look into it. Maybe this will set a good example for the industry that with a little bit of effort you can take care of your customers and sell more product.

      If this were the '80s and a hard drive vendor had more than two reports of data loss under, say, VMS, there would have been engineers on a plane to DEC by morning to get it solved by the coming weekend.

      Now we have thousands of users with reports and millions of units sold, and a wealthy vendor, and it's all crickets, leaving some kernel hackers to half-ass a blacklist. It's not like this is BeOS - there are millions of servers running in the target market. I don't mean to absolve the bad troubleshooting by kernel devs, but want to know what drove the apathy at Samsung (and other vendors behaving poorly). It's obviously not profit motive.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  2. Re:awkward! by Anonymous Coward · · Score: 2, Interesting

    I'd be interested to see if anyone has apologized. Doing so is exceedingly rare on internet forums.

  3. Vote with your wallet by jwkane · · Score: 4, Interesting

    Vote with your wallet. My next SSD will be a Samsung.

  4. Re:Just another case.... by 0123456 · · Score: 4, Interesting

    Devices working perfectly in other OSes is no indicator that the device is not at fault. Witness the vast amount of crap laptop hardware, whose disastrous ACPI implementations only worked because their Windows drivers were chock-full of workarounds.

    Back when I was writing Windows drivers for plugin cards, there were certain motherboards that we'd detect and switch the motherboard bus to the slowest possible speed, because the chipset was a heap of junk that didn't work properly at higher speeds. Anyone who said 'but it works on Windows!' clearly had no idea that it only worked because we'd intentionally turned off most of the features.

  5. Re:Just another case.... by Anonymous Brave Guy · · Score: 3, Interesting

    A pro-Linux bias on Slashdot is not exactly a surprise, but an equally accurate headline on another forum might have read "Critical bug in Linux corrupts data on SSDs", and the subtitle "Linux maintainers deny serious fault, blame innocent parties for data loss" would probably have been fair too.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  6. Re:Just another case.... by nojayuk · · Score: 4, Interesting

    We did workarounds on the ATA bus spec for known hardware bugs in older VIA chipsets. These were silicon bugs, not chipset firmware so they couldn't be fixed afterwards with patches and there were millions of these boards out there. Declaring our devices (CD-ROM and DVD-ROM drives) wouldn't work with these boards was not going to happen for sales reasons so our code included a lockup-recovery function that was invoked when the rare bug conditions were met and the IDE bus froze. The average user never noticed these lockups and we didn't tell them about them.

    Out-of-spec bugs like this were well-known in the industry and workarounds were easy to produce as long as you had access to a few million bucks worth of test equipment and a good team of professional engineers with decades of experience, not something that's common in the Linux world.