Ext3 Filesystem Explained
sheckard writes: "The next installment of the wonderful Advanced filesystem implementor's guide, part 7, details the ext3 filesystem in all of its glory. This is another great voyage into the world of journaling filesystems, and ext3 has been rock-solid in my experience."
One thing I'd have to agree on about ext3 is that the machine can be booted with a kernel that does not understand ext3 (only ext2) and the filesystem can still be read. This is a major strong point in my book.
wolf31o2 Developer, Gentoo Linux Games Team
"ext3 catches my fancy because there's no ext2 --> ext3 conversion "
In addition, you can actually read ext3 from a kernel that only supports ext2. The only catch is that the partition has to be cleanly unmounted for this to work. This is a "Really Good Thing (TM)", because then you can boot from an old bootdisk and still access your files, and it also helps if you are running multiple distributions.
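For the curious, here is a rough sketch of why the "cleanly unmounted" catch exists. As far as I know, ext3 marks a filesystem whose journal still needs replaying with an "incompatible" feature bit in the superblock, and an ext2-only kernel refuses to mount anything with an incompat bit it doesn't recognize. The offsets and flag values below are the standard ext2/ext3 superblock layout as I remember it; the function names are made up for illustration, so treat this as a sketch and not as a replacement for dumpe2fs.

```python
import struct

SUPERBLOCK_OFFSET = 1024      # the superblock starts 1024 bytes into the partition
EXT2_MAGIC = 0xEF53           # s_magic, little-endian u16 at byte 56 of the superblock
COMPAT_HAS_JOURNAL = 0x0004   # bit in s_feature_compat (byte 92): a journal exists
INCOMPAT_RECOVER = 0x0004     # bit in s_feature_incompat (byte 96): journal needs replay


def journal_status(superblock: bytes) -> str:
    """Classify a raw superblock (hypothetical helper, not a real tool's API)."""
    (magic,) = struct.unpack_from("<H", superblock, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    compat, incompat = struct.unpack_from("<II", superblock, 92)
    if not compat & COMPAT_HAS_JOURNAL:
        return "ext2 (no journal)"
    if incompat & INCOMPAT_RECOVER:
        # An ext2-only kernel sees an unknown incompat bit and refuses to mount.
        return "ext3, journal needs recovery"
    return "ext3, clean"


def read_superblock(path: str) -> bytes:
    """Read the 1024-byte superblock from a device or image file."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return f.read(1024)
```

You would call something like journal_status(read_superblock("/path/to/image")) on a filesystem image; reading a real block device normally needs root.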
Nothing can ensure data integrity in case of a mid-write shutdown. That's logically obvious.
Journaling ensures filesystem integrity, which is very important. Mounting an unclean ext3 fs will take seconds - no need to check the whole filesystem for mid-write evidence, etc. - the journal says exactly what mid-write problems there are, and whether to delete them or keep them as files.
If your system crashes in the middle of your work, and your hard drive wasn't physically damaged (it can happen; use RAID if you're that paranoid), everything but your open files will be normal. Your open files might be 'un-journaled' (new official term? no) back to before you wrote them.
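To make the idea concrete, here is a toy sketch of journal replay. It is not ext3's on-disk format or its actual recovery code, just the general write-ahead idea: updates are logged first, a transaction only counts once its commit record is in the journal, and after a crash anything without a commit record is thrown away. The record layout and the replay name are invented for the example.

```python
def replay(disk, journal):
    """Apply committed transactions from a toy journal to a toy disk.

    disk    -- dict mapping block number to contents
    journal -- list of ("write", txid, block, data) and ("commit", txid) records
    """
    pending = {}  # txid -> writes logged but not yet committed
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "write":
            _, _, block, data = record
            pending.setdefault(txid, []).append((block, data))
        elif kind == "commit":
            # The transaction is complete: its writes are safe to apply.
            for block, data in pending.pop(txid, []):
                disk[block] = data
    # Whatever is left in `pending` never committed (the crash hit mid-write),
    # so it is simply dropped and the disk keeps its old contents.
    return disk


disk = {0: "old inode table"}
journal = [
    ("write", 1, 0, "new inode table"),
    ("commit", 1),
    ("write", 2, 1, "half-finished update"),  # no commit: crash happened here
]
replay(disk, journal)
```

After the replay, block 0 holds the committed update and block 1 was never touched, which is the "back to before you wrote them" behaviour for the interrupted write. Recovery only has to walk the journal, not the whole disk, which is why it takes seconds.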
My other
The very existence of ext3, and its complete forward and backward compatibility with ext2, shows that ext2 was extremely well designed by its authors. Kudos to Remy Card, Ted Tso, and the rest of the ext2 team!
Also, based on the same extensibility of ext2, Daniel Phillips is working on a directory indexing patch which speeds up ext2 by a huge factor when working with lots of files in a directory. You can get the preliminary patches here and see a graph of a simple file creation benchmark here. Amazing!
Petru
Let's say the journaling file system has 5% overhead (it probably has more). That means you lose more than 1h per day on a busy server--it's spread out, but it's still lost. You'd have to do a lot of rebooting in order to make up for that in terms of "saved" fsck time.
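The arithmetic behind that claim can be sketched as follows. The 5% figure is the one guessed at above; the half-hour fsck time and the function names are assumptions picked purely for illustration, and the model takes the server to be busy around the clock.

```python
def daily_overhead_hours(overhead_fraction, busy_hours=24.0):
    """Hours of throughput lost per day to journaling overhead."""
    return overhead_fraction * busy_hours


def break_even_crashes_per_day(overhead_fraction, fsck_hours, busy_hours=24.0):
    """Crashes per day at which the fsck time saved equals the overhead paid."""
    return daily_overhead_hours(overhead_fraction, busy_hours) / fsck_hours


lost = daily_overhead_hours(0.05)                              # 5% of 24 hours, about 1.2 h
crashes = break_even_crashes_per_day(0.05, fsck_hours=0.5)     # assumed 30-minute fsck
print(f"lost per day: {lost:.2f} h, break-even: {crashes:.1f} crashes/day")
```

With those assumed numbers the server would need to crash a couple of times a day before the journal pays for itself in raw hours. Note this only counts total time; it says nothing about whether the downtime comes in one unbounded lump or is spread thin.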
A few points:
Most computers simply don't need guaranteed zero downtime. What they need is bounded downtime. It's OK if they crash every once in a while, as long as they reboot cleanly within a few minutes. The biggest contributor to boot time after a crash is the file system check. Since a journalling file system can recover the file system within a few minutes, it is a huge win.
Here in the real world, even the big real-time transaction processing systems occasionally have common-mode failures that wipe out all the redundant subsystems at the same time. Lightning strikes, idiots frob the emergency power switch, etc. Thus, the big real-time systems need journalling even more desperately than the small systems. Sheer ignorance. Replication of filesystems and databases has at least as much of a performance hit as journalling, and the complexity is likely to be vastly higher. -- ;-)
Kuro5hin.org: where the good times never end.