Slashdot Mirror


TRIM and Linux: Tread Cautiously, and Keep Backups Handy

An anonymous reader writes: Algolia is a buzzword-compliant ("Hosted Search API that delivers instant and relevant results") start-up that uses a lot of open-source software (including various strains of Linux) and a lot of solid-state disk, and as such sometimes runs into problems with each of these. Their blog this week features a fascinating look at troubles that they faced with ext4 filesystems mysteriously flipping to read-only mode: not such a good thing for machines processing a search index, not just dishing it out. "The NGINX daemon serving all the HTTP(S) communication of our API was up and ready to serve the search queries but the indexing process crashed. Since the indexing process is guarded by supervise, crashing in a loop would have been understandable but a complete crash was not. As it turned out the filesystem was in a read-only mode. All right, let's assume it was a cosmic ray :) The filesystem got fixed, files were restored from another healthy server and everything looked fine again. The next day another server ended with filesystem in read-only, two hours after another one and then next hour another one. Something was going on. After restoring the filesystem and the files, it was time for serious analysis since this was not a one time thing."

The rest of the story explains how they isolated the problem and worked around it; it turns out that the culprit was TRIM, or rather TRIM's interaction with certain SSDs: "The system was issuing a TRIM to erase empty blocks, the command got misinterpreted by the drive and the controller erased blocks it was not supposed to. Therefore our files ended-up with 512 bytes of zeroes, files smaller than 512 bytes were completely zeroed. When we were lucky enough, the misbehaving TRIM hit the super-block of the filesystem and caused a corruption."
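The corruption signature quoted above (runs of 512 zero bytes, and small files zeroed entirely) is simple enough to look for. What follows is only a minimal sketch, not Algolia's actual tooling: the function names are hypothetical, and it assumes the zeroed regions sit on 512-byte-aligned offsets, which the quote does not state.

```python
SECTOR = 512  # size of the zeroed regions, taken from the quoted description


def zeroed_sectors(data: bytes, sector: int = SECTOR) -> list[int]:
    """Return byte offsets of sector-aligned chunks that contain only zero bytes.

    Assumption: corruption is aligned to `sector`-byte boundaries.
    A short trailing chunk that is all zeroes is reported too, so a file
    smaller than one sector that was wiped shows up at offset 0.
    """
    offsets = []
    for off in range(0, len(data), sector):
        chunk = data[off:off + sector]
        if chunk and not any(chunk):  # no non-zero byte in this chunk
            offsets.append(off)
    return offsets


def scan_file(path: str) -> list[int]:
    """Hypothetical helper: read a whole file and report suspicious offsets."""
    with open(path, "rb") as fh:
        return zeroed_sectors(fh.read())
```

A hit is only a hint, since many file formats legitimately contain zero-filled regions; comparing against a known-good copy or a checksum is the more reliable test.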

Since SSDs are becoming the norm outside the data center as well as within, some of the problems that their analysis exposed for one company probably would be good to test for elsewhere. One upshot: "As a result, we informed our server provider about the affected SSDs and they informed the manufacturer. Our new deployments were switched to different SSD drives and we don't recommend anyone to use any SSD that is anyhow mentioned in a bad way by the Linux kernel."
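One thing that can be tested elsewhere is whether a machine issues TRIM continuously at all. The summary does not say how Algolia's systems issued TRIM, so the following is only a sketch under one assumption: that online TRIM is enabled through the `discard` mount option, which on Linux appears in the options column of `/proc/mounts`. The function name is hypothetical.

```python
def mounts_with_discard(mounts_text: str) -> list[tuple[str, str]]:
    """Return (mountpoint, fstype) pairs whose mount options include discard.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        opts = options.split(",")
        # Match "discard" and variants such as "discard=async",
        # but not "nodiscard".
        if any(o == "discard" or o.startswith("discard=") for o in opts):
            hits.append((mountpoint, fstype))
    return hits
```

On a Linux machine it could be called as `mounts_with_discard(open("/proc/mounts").read())`. An empty result only means no filesystem is mounted with that option; a periodic batch TRIM job would not show up here.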

9 of 182 comments

  1. Is there a site maintaining a list of "bad" SSDs? by msobkow · · Score: 4, Interesting

    I'll Google in a moment, but I was wondering if anyone knew of any good sites that maintain lists of good/bad SSDs for Linux. With the number of vendors out there nowadays, having to scan the source seems like a poor way to track the information.

    --
    I do not fail; I succeed at finding out what does not work.
  2. TRIM -- command of mass destruction by m.dillon · · Score: 5, Interesting

    The only TRIM use I recommend is running it on an entire partition, e.g. like the swap partition, at boot, or before initializing a new filesystem. And that's it. It's an EXTREMELY dangerous command which results in non-deterministic operation. Not only do SSDs have bugs in handling TRIM, but filesystem implementations almost certainly also have ordering and concurrency bugs in handling TRIM. It's the least well-tested part of the firmware and the least well-tested part of the filesystem implementation. And due to cache effects, it's almost impossible to test it in a deterministic manner.

    You can get close to the same performance and life out of your SSD without using TRIM by doing two simple things. First, use a filesystem with at least a 4KB block size so the SSD doesn't have to write-combine stuff on 512-byte boundaries. Second, simply leave a part of the SSD unused. 5% is plenty. In fact, if you have swap space configured on your SSD, that's usually enough on its own (since swap is not usually filled up during normal operation), as long as you TRIM it on boot.

    -Matt
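The arithmetic behind the "leave 5% unused, use 4KB blocks" advice above can be sketched as follows. This is an illustration only: the function name and the choice to round down to a whole block are assumptions, not something the comment specifies.

```python
BLOCK = 4096  # 4KB filesystem block size, as suggested in the comment above


def usable_bytes(capacity_bytes: int, reserve_percent: int = 5,
                 block: int = BLOCK) -> int:
    """Return how many bytes to partition, leaving `reserve_percent` unused.

    Integer arithmetic avoids floating-point rounding; the result is
    rounded down to a multiple of `block` (an assumption for tidy alignment).
    """
    usable = capacity_bytes * (100 - reserve_percent) // 100
    return usable - (usable % block)


if __name__ == "__main__":
    capacity = 4096 * 1000  # toy capacity chosen so the numbers stay readable
    print(usable_bytes(capacity), "of", capacity, "bytes would be partitioned")
```

The remaining space is simply never allocated to any partition, which is what lets the drive use it as spare area.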

  3. standards? by Anonymous Coward · · Score: 0, Interesting

    It's not the fault of TRIM... but Linux guys will code a fix for the offending hw before we can blink. Is this shady maneuvering at top levels of hardware design by competing OS parties to cause Linux to take a reliability hit? Or just an oversight bug?

  4. Apple TRIM Whitelist? by LDAPMAN · · Score: 3, Interesting

    I wonder if this issue has anything to do with why Apple only supports TRIM on specific drives they OEM?

  5. Re:Name and shame by Rockoon · · Score: 4, Interesting

    The way I am reading the comments, the issue is that the buggy SSDs are flagging physical blocks as RZAT or DRAT when a trim request on a logical block is ignored. The bug presents itself later if the SSD performs wear leveling that swaps out the logical block with another, the bug being that the physical block is left tagged RZAT or DRAT.

    --
    "His name was James Damore."
  6. Re:Is there a site maintaining a list of "bad" SSD by Gaygirlie · · Score: 4, Interesting

    Not directly an answer to your question, but related: after Googling for a bit I actually cannot find any mention of Samsung SSD 840 PRO having issues with TRIM under Windows. If it was, indeed, a controller problem, then it would have to happen under all OSes as long as TRIM is enabled, but all the evidence I'm finding only points towards Linux or these guys' own setup as being the culprit.

  7. Re:Is there a site maintaining a list of "bad" SSD by MrBingoBoingo · · Score: 3, Interesting

    There are two easy solutions to Ext4 vs. SSD problems. The first is ReiserFS which is still eminently usable on Gentoo. The second is UFS which is available on the BSDs.

  8. Re:trim by Nkwe · · Score: 3, Interesting

    While poorly written, I think the author was suggesting that any model of SSD for which the Linux kernel has specific special handling logic should be avoided. In my opinion, it is not an unreasonable statement.

    It probably is an unreasonable statement. If Linux has special logic to handle the drive, then someone else probably already had the problem and now there's a fix in so it probably won't happen to you.

    Perhaps. But if the drive was broken and someone had to write special software to fix it, how can you be sure that it was fixed correctly and completely? Can you also be sure that the "fix" works for all versions of firmware on the drive? While you might be confident of these things, I would suggest that it would be better to use a drive that follows the standards and doesn't require special code to make it work right. Granted that as always, your mileage may vary -- and it could vary in either direction.

  9. Re: Is there a site maintaining a list of "bad" SS by Bert64 · · Score: 3, Interesting

    If you're booting from the SSD, chances are the machine will crash...
    Would be much better to just stay in read-only mode, and give you the chance to copy data off (and yes I'm aware this is no substitute for a backup, but think of the use case of a travelling laptop far away from its backup server etc).

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!