Slashdot Mirror


Data Corrupting ext3 Bug In Latest Linux 2.4.20

An anonymous reader writes "Andrew Morton alerted readers of the Linux Kernel mailing list today that ext3 in the 2.4.20 kernel has a new bug that can easily cause file data corruption at unmount time. The bug will only affect people using ext3 in "data=journal" mode, which fortunately is not the default... Full details can be read on KernelTrap."

6 of 50 comments

  1. From LKML -- GET MIRRORS PEOPLE! by fire-eyes · · Score: 3, Informative

    In 2.4.20-pre5 an optimisation was made to the ext3 fsync function
    which can very easily cause file data corruption at unmount time. This
    was first reported by Nick Piggin on November 29th (one day after 2.4.20 was
    released, and three months after the bug was merged. Unfortunate timing.)

    This only affects filesystems which were mounted with the `data=journal'
    option, or files which are operating under `chattr +j'. So most people
    are unaffected. The problem is not present in 2.5 kernels.

    The symptoms are that any file data which was written within the thirty
    seconds prior to the unmount may not make it to disk. A workaround is
    to run `sync' before unmounting.

    The optimisation was intended to avoid writing out and waiting on the
    inode's buffers when the subsequent commit would do that anyway. This
    optimisation was applied to both data=journal and data=ordered modes.
    But it is only valid for data=ordered mode.

    In data=journal mode the data is left dirty in memory and the unmount
    will silently discard it.

    The fix is to only apply the optimisation to inodes which are operating
    under data=ordered.

    --- linux-akpm/fs/ext3/fsync.c~ext3-fsync-fix Sat Nov 30 23:37:33 2002
    +++ linux-akpm-akpm/fs/ext3/fsync.c Sat Nov 30 23:39:30 2002
    @@ -63,10 +63,12 @@ int ext3_sync_file(struct file * file, s
              */
             ret = fsync_inode_buffers(inode);

    -        /* In writeback mode, we need to force out data buffers too. In
    -         * the other modes, ext3_force_commit takes care of forcing out
    -         * just the right data blocks. */
    -        if (test_opt(inode->i_sb, DATA_FLAGS) == EXT3_MOUNT_WRITEBACK_DATA)
    +        /*
    +         * If the inode is under ordered-data writeback it is not necessary to
    +         * sync its data buffers here - commit will do that, with potentially
    +         * better IO merging
    +         */
    +        if (!ext3_should_order_data(inode))
                     ret |= fsync_inode_data_buffers(inode);

             ext3_force_commit(inode->i_sb);
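
    The mode logic that the patch changes can be laid out as a small truth
    table. What follows is a minimal sketch in plain C, not kernel code: the
    enum and function names are made up for illustration, and the two
    predicates only mirror the `if' conditions on the removed and added lines
    of the diff.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
enum data_mode { MODE_JOURNAL, MODE_ORDERED, MODE_WRITEBACK };

/* Removed condition: fsync flushed file data only in writeback mode. */
bool flushes_data_before_fix(enum data_mode m)
{
    return m == MODE_WRITEBACK;
}

/* Added condition: fsync flushes file data in every mode except ordered. */
bool flushes_data_after_fix(enum data_mode m)
{
    return m != MODE_ORDERED;
}

/* A mode is affected when the patch changes what fsync does for it. */
bool affected_by_bug(enum data_mode m)
{
    return flushes_data_before_fix(m) != flushes_data_after_fix(m);
}
```

    Of the three modes, only MODE_JOURNAL gets a different answer from the
    two predicates, which is consistent with ordered and writeback mounts
    being unaffected.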


    --
    -- Note: If you don't agree with me, don't bother replying. I won't read it.
  2. So it was a dumb idea... by Ayanami+Rei · · Score: 2, Informative

    JUST DON'T SHUT YOUR SYSTEM OFF! MUWAHAHAHAHAA!!

    just kiddin'

    Fortunately, this bug didn't make it into 2.5 so it won't be propagated forward. Hint: the quick fix ISN'T a quick fix, it doesn't work.
    Either stick with 2.4.19, don't use journaled file data, or sync before umounting (I do that anyway... just superstitious I guess ^_^).

    It will take a few days to add some extra magic to the umount logic to flush all buffers in an intelligent way. Hopefully this optimization is worth the effort for dudes with high uptime.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  3. Re:So I'm clueless by J'raxis · · Score: 4, Informative

    Unmounts happen at shutdown. You also need to unmount before scanning/fixing a filesystem. The whole bug here pertains to the fact that it isn't flushing ("syncing") the last 30 seconds of cached data to the disk beforehand. A cold reboot without unmounting could potentially cause all kinds of other data inconsistency problems to pop up.

    The temporary fix seems to be to run sync manually. Stick "sync" in your /etc/rc.d/init.d/mountfs (or whatever it's called on your system) script right before the "umount" line.
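
    For anyone doing the unmount from a program rather than an init script, the same ordering can be sketched in C. This is only an illustration under assumptions: sync() is the standard call from <unistd.h>, but sync_then_unmount, unmount_fn and dry_run_unmount are hypothetical names invented here, and the unmount step is passed in as a callback so the sketch can be tried without root.

```c
#include <unistd.h>

/* Hypothetical callback type: whatever performs the actual unmount. */
typedef int (*unmount_fn)(const char *target);

/* Stand-in unmount step for trying the sketch out; it unmounts nothing. */
int dry_run_unmount(const char *target)
{
    (void)target;   /* unused in the dry run */
    return 0;       /* report success */
}

/* Ask the kernel to write out dirty buffers first, then unmount. */
int sync_then_unmount(const char *target, unmount_fn do_unmount)
{
    sync();                     /* sync(2): schedule/flush dirty data */
    return do_unmount(target);  /* e.g. a thin wrapper around umount(2) */
}
```

    Calling sync_then_unmount("/mnt/data", dry_run_unmount) exercises the ordering without touching any mount; the path is just a placeholder. A real caller would pass a function that wraps umount(2) and would need the privileges to unmount.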

  4. Re:another victory for open source by the+eric+conspiracy · · Score: 2, Informative

    Slashdot needs a bit more balance in the way it covers things. If this had been a problem with the goddamn filesystem (!) in Windows you'd be seeing 900 posts to the tone of "Hah! M$ sucks!!!1!!".

    Oh baloney.

    The fact is that the open source development process is just that, open. This means that users have access to versions of the kernel at all stages of development. This build is only a few days old. Clearly everyone should realize the amount of testing is too small for widespread production use.

    This kernel and bug have NOT made it into any significant distributions of Linux. The only people using this version are bleeding-edge types and testers who routinely compile their own kernels from source.

    If this were a case of, say, Red Hat 8.0 showing up with a file corruption bug, then, yes, it should be a front page article. This is nothing of the sort. This is a kernel version that might have shown up in Red Hat 8.1, say six months from now, had it passed the test of time.

    I shudder to think what kinds of problems we would be reporting here if Microsoft gave its customers anything like the same level of access to its development process.

    After all, Microsoft is the company that shipped Windows ME and MS Smartphone.

    Score: -1, Pro-Microsoft

    If this is your typical posting, yes.

  5. Re:another victory for open source by Anonymous Coward · · Score: 1, Informative

    Klez and ILOVEYOU have been patched by MS, a LONG time ago. The only people getting hit by them are people not willing to run Windows Update. You're a freaking idiot.

  6. Re:another victory for open source by Anonymous Coward · · Score: 1, Informative

    Do you actually work on computers for home users and small business?

    Yes, I do. And a fully patched computer will NOT be infected by Klez. My guess is that somehow you're screwing the system up. How about this - stay away from Windows machines, they obviously don't like you.