Slashdot Mirror


Apps That Rely On Ext3's Commit Interval May Lose Data In Ext4

cooper writes "Heise Open posted news about a bug report for the upcoming Ubuntu 9.04 (Jaunty Jackalope) which describes a massive data loss problem when using Ext4 (German version): A crash occurring shortly after the KDE 4 desktop files had been loaded results in the loss of all of the data that had been created, including many KDE configuration files." The article mentions that similar losses can come from some other modern filesystems, too. Update: 03/11 21:30 GMT by T : Headline clarified to dispel the impression that this was a fault in Ext4.

7 of 830 comments

  1. Not a bug by casualsax3 · · Score: 5, Informative
    It's a consequence of not writing software properly. Relevant links from later in the same comment thread, for those who might otherwise miss them:

    https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45

    https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54

    1. Re:Not a bug by Anonymous Coward · · Score: 5, Informative

      Quoting Ts'o:

      "The final solution, is we need properly written applications and desktop libraries. The proper way of doing this sort of thing is not to have hundreds of tiny files in private ~/.gnome2* and ~/.kde2* directories. Instead, the answer is to use a proper small database like sqllite for application registries, but fixed up so that it allocates and releases space for its database in chunks, ...

      Linux reinvents the Windows registry?
      Who knows what they will come up with next.

    2. Re:Not a bug by OeLeWaPpErKe · · Score: 5, Informative

      Let's not forget that the only consequence of delayed allocation is the write-out delay changing. Instead of data being "guaranteed" on disk in 5 seconds, that becomes 60 seconds.

      Oh dear God, someone inform the president! Data that is NEVER guaranteed to be on disk according to spec is only guaranteed on disk after 60 seconds.

      You should not write your application to depend on filesystem-specific behavior. You should write it to the standard, and that means fsync(). Without a call to fsync() there is no guarantee; look it up in the documentation (man 2 write).

      The rest of what Ted Ts'o is saying is optimization, speeding up the boot time for GNOME/KDE; it is not necessary for correct operation.

      Please don't FUD.

      You know what, I'll look up the docs for you:

      (quote from man 2 write)

      NOTES
      A successful return from write() does not make any guarantee that data has been committed to disk. In fact, on some buggy implementations, it does not even guarantee that space has successfully been reserved for the data. The only way to be sure is to call fsync(2) after you are done writing all your data.

      If a write() is interrupted by a signal handler before any bytes are written, then the call fails with the error EINTR; if it is interrupted after at least one byte has been written, the call succeeds, and returns the number of bytes written.

      That brings up another point: almost nobody is ready for the second remark either (write() might return after a partial write, necessitating a second call).

      So the normal case for a "reliable write" would be this code :

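      #include <errno.h>
      #include <unistd.h>

      const char *p = (const char *)&data;
      size_t written = 0;
      while (written < sizeof(data)) {
              ssize_t r = write(fd, p + written, sizeof(data) - written); // resume where the last write stopped
              if (r < 0) {
                      if (errno == EINTR) continue; // interrupted before any byte was written, retry
                      break; // real error, handled below
              }
              written += r;
      }
      if (written < sizeof(data)) { // error handling code, at the very least looking at EIO, ENOSPC and EPIPE for network sockets
      }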

      and *NOT*

      write(fd, &data, sizeof(data)); // will probably work

      Just because programmers continually use the second method (just check a few sf.net projects) doesn't make it the right method (and as there is *NO* way to fix write() to make that call reliable in all cases, you're going to have to shut up about it eventually).

      Hell, even Firefox doesn't check for either EIO or ENOSPC and certainly doesn't handle either of them gracefully, at least not for downloads.
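
The two points made in the comment above (loop until every byte is written, then call fsync()) can be combined in one helper. What follows is only a minimal sketch: the names write_all_and_sync and save_new_file are hypothetical and do not come from the thread, and the error handling is limited to returning -1 with errno left set for the caller to inspect (EIO, ENOSPC and so on).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying after partial writes
 * and EINTR, then ask the kernel to flush the file with fsync().
 * Returns 0 on success, -1 on failure with errno set. */
int write_all_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t written = 0;

    assert(buf != NULL || len == 0);

    while (written < len) {
        ssize_t r = write(fd, p + written, len - written);
        if (r < 0) {
            if (errno == EINTR)
                continue;           /* nothing was written, try again */
            return -1;              /* EIO, ENOSPC, EPIPE, ... */
        }
        written += (size_t)r;       /* partial write: continue after it */
    }
    return fsync(fd) < 0 ? -1 : 0;  /* data is not durable before this returns */
}

/* Hypothetical usage: create a brand-new file (fails with EEXIST if the
 * path already exists) and store buf in it durably. */
int save_new_file(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    if (write_all_and_sync(fd, buf, len) < 0) {
        int saved = errno;          /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

A caller would check the return value and look at errno, for example to tell the user that the disk is full instead of silently leaving a truncated download behind.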

  2. Re:Bull by Anonymous Coward · · Score: 5, Informative

    This is NOT a bug. Read the POSIX documents.

    Filesystem metadata and file contents are NOT required to be synchronous, and a sync is needed to ensure they are synchronised.

    It's just down to retarded programmers who assume they can truncate/rename files and any data pending writes will magically meet up a-la ext3 (which has a mount option which does not sync automatically btw).

    RTFPS (Read The Fine POSIX Spec).

  3. Re:Bull by pc486 · · Score: 5, Informative

    Ext3 doesn't write out immediately either. If the system crashes within the commit interval, you'll lose whatever data was written during that interval. That's only 5 seconds of data if you're lucky, much more data if you're unlucky. Ext4 simply made that commit interval and backend behavior different than what applications were expecting.

    All modern fs drivers, including ext3 and NTFS, do not write immediately to disk. If they did then system performance would really slow down to almost unbearable speeds (only about 100 syncs/sec on standard consumer magnetic drives). And sometimes the sync call will not occur since some hardware fakes syncs (RAID controllers often do this).

    POSIX doesn't define flushing behavior when writing and closing files. If your application needs data to be in NV memory, use fsync. If it doesn't care, good. If it does care and it doesn't sync, it's a bad application and is flawed, plain and simple.
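
The figure of roughly 100 syncs/sec quoted above is easy to check on a given machine. The sketch below is an assumption-laden micro-benchmark, not a rigorous one: the function name fsyncs_per_second is hypothetical, it writes one byte per iteration to a scratch file, and the result depends heavily on the filesystem, the drive, and whether the hardware really honours the flush. For orientation, a 7200 RPM disk completes 120 rotations per second, which is the same order of magnitude as the number in the comment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical micro-benchmark: write one byte and fsync(), `iterations`
 * times, and return the observed fsync() calls per second.
 * Returns -1.0 on any error. The scratch file at `path` is removed. */
double fsyncs_per_second(const char *path, int iterations)
{
    struct timespec start, end;
    double elapsed;
    int fd, i;

    assert(path != NULL && iterations > 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    if (clock_gettime(CLOCK_MONOTONIC, &start) < 0) {
        close(fd);
        unlink(path);
        return -1.0;
    }
    for (i = 0; i < iterations; i++) {
        if (write(fd, "x", 1) != 1 || fsync(fd) < 0) {
            close(fd);
            unlink(path);
            return -1.0;
        }
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) < 0) {
        close(fd);
        unlink(path);
        return -1.0;
    }
    close(fd);
    unlink(path);

    elapsed = (double)(end.tv_sec - start.tv_sec)
            + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    if (elapsed <= 0.0)
        return -1.0;                /* clock too coarse to measure */
    return iterations / elapsed;
}
```

Running it against a file on the disk in question (not on a RAM-backed /tmp) gives the relevant number; a result far above what the drive's rotation speed allows is a hint that something in the stack is faking the sync.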

  4. man 2 fsync by Nicolas+MONNET · · Score: 5, Informative

    The filesystem doesn't guarantee anything is written until you've called fsync and it has returned.

  5. Re:Excuses are false. This is a severe flaw. by Tadu · · Score: 5, Informative

    KDE is *broken* to delete a file and expect it to still be there if it crashes before the write.

    Nope, it writes a new file and then renames it over the old file, since rename() is documented as an atomic operation: you either have the old file or the new file. What happens with ext4 is that you get the new file except for its data. While that may be correct from a POSIX lawyer's point of view, it is still heavily undesirable.
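
The write-then-rename pattern described in the last comment only avoids the empty-file outcome if the new file's data is flushed before the rename. A minimal sketch of that ordering follows, assuming a hypothetical helper name (replace_file_atomically) and a hypothetical ".tmp" suffix for the temporary file; neither comes from the thread or from KDE. It deliberately leaves out fsync() on the containing directory, which a stricter version might add.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with the contents of buf so that
 * after a crash the file holds either the old or the new contents.
 * Order: write temp file, fsync() it, then rename() it over the target.
 * Returns 0 on success, -1 on failure with errno set. */
int replace_file_atomically(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    const char *p = buf;
    size_t written = 0;
    int fd, n;

    assert(path != NULL && (buf != NULL || len == 0));

    n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);   /* assumed naming scheme */
    if (n < 0 || (size_t)n >= sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (written < len) {
        ssize_t r = write(fd, p + written, len - written);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        written += (size_t)r;
    }
    if (fsync(fd) < 0)              /* flush the data BEFORE the rename */
        goto fail;
    if (close(fd) < 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;
    if (rename(tmp, path) < 0)      /* atomically swap in the new file */
        goto fail;
    return 0;

fail:
    {
        int saved = errno;          /* preserve the first error */
        if (fd >= 0)
            close(fd);
        unlink(tmp);
        errno = saved;
    }
    return -1;
}
```

The cost is one fsync() per saved file, which is the trade-off the thread is arguing about: hundreds of small configuration files saved this way mean hundreds of flushes.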