Slashdot Mirror


Ext3 Filesystem Explained

sheckard writes: "The next installment of the wonderful Advanced filesystem implementor's guide, part 7, details the ext3 filesystem in all of its glory. This is another great voyage into the world of journaling filesystems, and ext3 has been rock-solid in my experience."

7 of 174 comments

  1. Distro battles? Nah. Journaling fs battles! by Deal-a-Neil · · Score: 5, Informative

    ext3 catches my fancy because there's no real ext2 --> ext3 conversion involved -- you just have to unmount, create a journal, and remount. Migrating to reiserfs, on the other hand, is a challenge for huge partitions.
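
The unmount, add-a-journal, remount sequence described above can be sketched as a small dry-run script. This is a minimal sketch, not a tested migration procedure: `tune2fs -j` is the e2fsprogs option that adds a journal to an existing ext2 filesystem, while the device `/dev/hdb1`, the mount point `/mnt/data`, and the helper name `ext2_to_ext3_plan` are hypothetical. The script only prints the commands and runs nothing.

```python
def ext2_to_ext3_plan(device, mountpoint):
    """Return the commands for the unmount / add journal / remount sequence.

    Each command is an argv list. Nothing is executed here.
    """
    return [
        ["umount", device],                            # take the filesystem offline
        ["tune2fs", "-j", device],                     # add a journal to the ext2 filesystem
        ["mount", "-t", "ext3", device, mountpoint],   # remount it as ext3
    ]


# Hypothetical device and mount point, for illustration only.
for argv in ext2_to_ext3_plan("/dev/hdb1", "/mnt/data"):
    print(" ".join(argv))
```

Running the printed commands needs root, and an /etc/fstab entry for the filesystem would presumably also need its type changed from ext2 to ext3 so that later mounts use the journal.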

  2. ext3 by FeeDBaCK · · Score: 4, Insightful

    One thing I have to agree on about ext3 is that the machine can be booted with a kernel that understands only ext2, not ext3, and the filesystem can still be read. This is a major strong point in my book.

    --
    wolf31o2 Developer, Gentoo Linux Games Team
  3. Excellent engineering by ppetru · · Score: 5, Insightful

    The very existence of ext3, and its complete forward and backward compatibility with ext2, shows that ext2 was extremely well designed by its authors. Kudos to Remy Card, Ted Tso, and the rest of the ext2 team!

    Also, based on the same extensibility of ext2, Daniel Phillips is working on a directory indexing patch which speeds up ext2 by a huge factor when working with lots of files in a directory. You can get the preliminary patches here and see a graph of a simple file creation benchmark here. Amazing!

    --

    Petru
  4. Re:Partition resizing? by Sapien__ · · Score: 5, Informative
    This thread might be useful.

    To summarize: yes, it's possible to resize ext3 partitions, so long as your resizer doesn't mind. Don't use Partition Magic to do it. It doesn't like it. Badly.

  5. Re:The journalling filesystem myth by cowbutt · · Score: 5, Informative
    Let's say the journaling file system has 5% overhead (it probably has more). That means you lose more than 1h per day on a busy server--it's spread out, but it's still lost. You'd have to do a lot of rebooting in order to make up for that in terms of "saved" fsck time.

    Actually, Andrew Morton reckons ext3 is quicker than ext2 in spite of the journalling. Go figure. :)
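
To put the quoted 5% figure in numbers, here is a minimal sketch of the arithmetic. It assumes, as the quoted post does, that filesystem overhead translates directly into lost wall-clock time on a server that is busy 24 hours a day, which gives about 72 minutes per day. The 30 minutes of fsck time saved per crash is a hypothetical value chosen only for illustration, and both function names are made up for this example.

```python
def daily_overhead_minutes(overhead_fraction, busy_hours_per_day=24.0):
    """Minutes per day 'lost' if overhead maps directly onto wall-clock time."""
    return overhead_fraction * busy_hours_per_day * 60.0


def crashes_per_day_to_break_even(overhead_fraction, fsck_minutes_saved_per_crash,
                                  busy_hours_per_day=24.0):
    """Crashes per day at which saved fsck time equals the daily overhead."""
    lost = daily_overhead_minutes(overhead_fraction, busy_hours_per_day)
    return lost / fsck_minutes_saved_per_crash


# 5% is the figure from the quoted post; 30 minutes saved per crash is hypothetical.
print(f"{daily_overhead_minutes(0.05):.0f} minutes per day at 5% overhead")
print(f"{crashes_per_day_to_break_even(0.05, 30.0):.1f} crashes per day to break even")
```

The model is deliberately crude: it treats overhead and downtime as interchangeable, which is exactly the assumption other replies in this thread dispute.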

  6. Re:The journalling filesystem myth by edhall · · Score: 5, Insightful

    A few points:

    1. You can't equate down-time to a slightly slower response time. Having a reboot time of tens of seconds vs. tens of minutes for (e.g.) a large source repository or a critical web server is well worth a minor performance hit. Reboot time is dead time for all who need access to the server.
    2. If your file server is running so close to capacity that a 5% decrease in maximum filesystem throughput represents a 5% slowdown in actual throughput, your server is dangerously overloaded already.
    3. In general, journaling affects write performance, not read performance. If your server performs mostly reads, the overall overhead of journaling may amount to much less than your 5% figure. Most (though not all) applications for file servers are read-intensive with incidental writes apart from the initial "load" of the server.
    4. Fast fscks aren't the main reason for journaled filesystems. Rather, it's the improvement in filesystem integrity that is the main attraction -- an improvement that incidentally allows for fast fscks.
    -Ed
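
Point 3 above can be put in numbers with a simple weighted average. This is a minimal sketch under stated assumptions: the 5% penalty applies only to writes, reads are unaffected, and `write_fraction` is the share of filesystem time spent on writes. The workload mixes below and the function name `effective_overhead` are hypothetical.

```python
def effective_overhead(write_fraction, write_overhead, read_overhead=0.0):
    """Overall overhead as a weighted average of write and read overhead."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    return write_fraction * write_overhead + (1.0 - write_fraction) * read_overhead


# Hypothetical workload mixes, each with a 5% penalty on writes only.
for write_fraction in (1.0, 0.5, 0.1):
    overall = effective_overhead(write_fraction, 0.05)
    print(f"{write_fraction:.0%} writes -> {overall:.2%} overall overhead")
```

Under these assumptions, a server doing 10% writes sees a tenth of the headline figure, which is the sense in which the overall overhead may be much less than 5%.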
  7. You are missing the point by sigwinch · · Score: 4, Insightful
    However, if you want reliability and avoid downtime, you must have redundant servers or replication; journaling will not protect against most of the problems that cause downtime.
    Here in the real world we cannot afford triple redundant drives, motherboards, RAM, CPUs, power supplies, keyboards, mice, monitors, NICs, routers, and network cables for every single computer on every desktop in the entire organization. Sure, we could do it, but the cost would be ludicrous for a very small payback.

    Most computers simply don't need guaranteed zero downtime. What they need is bounded downtime. It's OK if they crash every once in a while, as long as they reboot cleanly within a few minutes. The biggest contributor to boot time after a crash is the file system check. Since a journalling file system can recover the file system within a few minutes, it is a huge win.

    Relying on it for "filesystem integrity" or "reduced downtime" or "reliability" is foolish.
    Here in the real world, even the big real-time transaction processing systems occasionally have common-mode failures that wipe out all the redundant subsystems at the same time. Lightning strikes, idiots frob the emergency power switch, etc. Thus, the big real-time systems need journalling even more desperately than the small systems.
    You pay for fast reboots in slower performance and more complex file system code.
    Sheer ignorance. Replication of filesystems and databases has at least as much of a performance hit as journalling, and the complexity is likely to be vastly higher.
    --
    Kuro5hin.org: where the good times never end. ;-)