Slashdot Mirror


Ext4 Data Losses Explained, Worked Around

ddfall writes "H-Online has a follow-up on the Ext4 file system: last week's news about data loss with the Linux Ext4 file system is explained, and Ted Ts'o has provided new solutions that allow Ext4 to behave more like Ext3."

4 of 421 comments

  1. Quick workaround - no patches required by canadiangoose · · Score: 5, Informative

    If you mount your ext4 partitions with nodelalloc you should be fine. You will of course no longer benefit from the performance enhancements that delayed allocation brings, but at least you'll have all of your freaking data. I'm running Debian on Linux 2.6.29-rc8-git4, and so far my limited testing has shown this to be very effective.
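
To confirm that a partition really is mounted with nodelalloc, one option is to look at the option column of /proc/mounts. The following is a rough sketch only: the helper names and the sample device names are made up for illustration, and it assumes the kernel lists nodelalloc among the options it reports for that mount.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry shadows an earlier one
    return found


def has_nodelalloc(mounts_text, mount_point):
    """True if mount_point is listed and its options include nodelalloc."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "nodelalloc" in opts


# Hypothetical sample in /proc/mounts format; the devices are invented.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime,nodelalloc 0 0\n"
    "/dev/sda2 /home ext4 rw,relatime 0 0\n"
)
```

On a live system the text would come from open("/proc/mounts").read() rather than from SAMPLE.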

    --
    Never eat more than you can lift -- Miss Piggy
  2. Re:Workaround is disaster for laptops by Kjella · · Score: 5, Informative

    Fixed code:
    fwrite()
    fsync() - sync this file before close
    fclose()
    rename()

    Either you're a troll or an idiot; since you're AC'ing, I guess I got trolled. This will sync immediately and kill performance and battery life, since every block must be confirmed written before the process can continue. What you need to fix this is a delayed rename that happens after the delayed write.

    Problem:
    fwrite()
    fclose()
    rename()
    *ACTUAL RENAME*
    *TIME PASSES* <-- crash happens here = lose old file
    *ACTUAL WRITE*

    Real solution:
    fwrite()
    fclose()
    rename()
    *TIME PASSES* <-- crash happens here = keep old file
    *ACTUAL WRITE*
    *ACTUAL RENAME*
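
The two orderings above can be replayed in a toy model to see what a crash leaves behind. This is only an illustration, not how any filesystem is implemented: the file names, the inode numbers and the state_after_crash helper are invented, and "disk" here is just two dictionaries.

```python
PROBLEM = ["rename", "write"]   # rename reaches the disk before the data does
WANTED = ["write", "rename"]    # data reaches the disk before the rename does


def state_after_crash(order, crash_after):
    """Return what 'config' contains if only the first crash_after steps hit the disk."""
    data = {1: "old contents", 2: ""}        # inode -> bytes that reached the disk
    names = {"config": 1, "config.tmp": 2}   # directory entries on disk
    for op in order[:crash_after]:
        if op == "write":
            data[2] = "new contents"         # the delayed write is finally flushed
        elif op == "rename":
            names["config"] = names.pop("config.tmp")
    return data[names["config"]]


print(repr(state_after_crash(PROBLEM, 1)))  # '' (old file gone, new data not there yet)
print(repr(state_after_crash(WANTED, 1)))   # 'old contents'
print(repr(state_after_crash(WANTED, 2)))   # 'new contents'
```

In both orderings the application made the same calls; the only difference is the order in which the two delayed operations reach the disk.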

    --
    Live today, because you never know what tomorrow brings
  3. A bad design that it is used everywhere by diegocgteleline.es · · Score: 5, Informative

    "No write is guaranteed to be written to disk until the OS is shut down, everything can be cached in RAM for an indefinite amount of time." However that'd be real flaky and lead to data loss. That makes my FS useless. Doesn't matter if it is well documented, what matters is that the damn thing loses data on a regular basis.

    It turns out that all modern operating systems work exactly like that. In ALL of them you need to use explicit synchronization (fsync and friends) to get a notification that your data has really been written to disk (and that's all you get, a notification, because the system could oops before fsync finishes). You can also mount your filesystem as "sync", which sucks.
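
For illustration, explicit synchronization from an application might look like the sketch below, written in Python, whose os.fsync wraps the fsync system call. The write_durably name and the demo file are hypothetical. Note that there are two layers of caching to get through: the program's own buffer and the kernel's page cache.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and block until the kernel reports it is on storage."""
    with open(path, "wb") as f:
        f.write(data)          # lands in the program's userspace buffer
        f.flush()              # userspace buffer -> kernel page cache
        os.fsync(f.fileno())   # page cache -> storage device; blocks until done


# Hypothetical demo file in a scratch directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "settings.txt")
write_durably(demo_path, b"colour=blue\n")
```

Skipping the flush() would leave the data in the userspace buffer, so the fsync would have nothing to push to disk.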

    Journaling and COW/transaction-based filesystems like ZFS only guarantee the integrity of the filesystem, not that your data is safe. It turns out that Ext3 has the same problem, it's just that the window is smaller (5 seconds). And I wouldn't bet that HFS and ZFS don't have the same problem (btrfs is COW and transaction based, like ZFS, and has the same problem).

    Welcome to the real world...

    1. Re:A bad design that it is used everywhere by Tacvek · · Score: 5, Informative

      The Ext3 5 seconds thing is true, but that is not the important difference.

      On Ext3, with the default mount options, if one writes a file to disk and then renames the file, the write is guaranteed to come before the rename. This can be used to ensure atomic updates to files, by writing a temporary copy of the file with the desired changes, and then renaming the temporary copy over the original.

      On Ext4, if one writes a file to the disk, and then renames the file, the rename can happen first. The result of this is that it is not possible to ensure atomic updates to files unless one uses fsync between the writing and the renaming. However, that would hurt performance, since fsync will force the file to be committed to disk right now, when all that is really important is that it is committed to disk before the rename is.
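
A minimal sketch of that idiom with the fsync in place, in Python, where os.replace wraps rename. The atomic_replace name and the demo paths are made up for illustration. It does not preserve the original file's permissions, and it does not fsync the containing directory.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace path with data so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # force the data out before the rename
        os.replace(tmp_path, path)    # atomically swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)       # do not leave a stray temp file behind
        raise


# Hypothetical demo in a scratch directory.
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "config")
atomic_replace(target, b"version 1\n")
atomic_replace(target, b"version 2\n")
```

The fsync call is exactly the cost being complained about here: it forces the data out immediately, when all the idiom needs is for it to be out before the rename.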

      Thankfully the Ext4 module will be gaining a new mount option that will ensure that a file is written to disk before the renaming occurs. This mount option should have no real impact on performance, but will ensure the atomic update idiom that works on Ext3 will also work on Ext4.

      --
      Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524