EXT4 Is Coming
ah admin writes "A series of patches has been proposed in Linux kernel mailing list earlier by a team of engineers from Red Hat, ClusterFS, IBM and Bull to extend the Ext3 filesystem to add support for very large filesystems. After a long-winded discussion, the developers came forward with a plan to roll these changes into a new version — Ext4."
This'll fill the gap between now and when Reiser4 is declared stable - some time after Duke Nukem Forever gets released.
Interesting bit from wiki/ZFS:
LWN had an interesting article on ext4 not long ago.
engineers from Red Hat, ClusterFS, IBM
OK, hands up - who wants to run ClusterFS so that they can say they needed to do a "clusterfsck"?
Reiser4 does this.
Bogtha Bogtha Bogtha
Let me put it this way, it's a little past the average slashdot porn collection:
ext3: 8TB total, 4TB files
ext4: 32 zettabyte (1024*1024*1024 TB), 1 exabyte files (1024*1024 TB)
Beyond that, it doesn't seem to actually change much.
Live today, because you never know what tomorrow brings
Ext4 is an extention of ext3, much like ext3 is an extention of ext2. The plan is to ensure backwards compatability and sanity for when things break, and with filesystems.. things break.
There are many factors that influence filesystems, not just "how fast it can write", but rather.. how it breaks when it does.
While the fanboys of XFS, JFS, ZFS may promise that their filesystems are faster, had no problems, secure and will not eat your data, it simply is not as proven as ext2 and ext3.
Scream fanboys scream, someone will listen, but the problem is that these filesystems are not proven in the field, or in some circumstances even in the kernel itself.
With a block size of 32 kB (64 kB is expected to be supported soonish) the 48-bit numbers will take you 1 byte over the maximum file size that apps can support. There is no UNIX-like OS that lets an app handle files bigger than 2**63.
We'll need to adjust other things if filesystems ever get so huge. The whole design probably needs a rethink, but we can't do it now. We don't know what the future holds in terms of seek times, transfer rates, sector sizes, etc.
It's dishonest to put something in quotes when it's not a direct quote. The exact quote is:
There's a substantial difference between saying that something is more stable "as a result" of something and more stable "because" of something. He's not claiming that being out longer intrinsically makes it more stable as your misquote suggests, he's claiming that it led to reiserfs becoming more stable - because of the practices he mentioned.
In short - something being out longer == more stable? No. Something being exposed to lots of real-world use and receiving only bugfixes == more stable? Yes.
He didn't quote Adam Smith, he drew an analogy between what he was saying and the network effect. It's an entirely reasonable analogy.
What ridicule? He's actually supporting that approach. For example:
Would you care to point out where you thought he was ridiculing the UNIX approach?
Yeah, they look dumb, don't they?
I can only assume you mean something other than "technical".
Dancing trees are a fundamental part of the design. How are you meant to understand the filesystem without understanding dancing trees?
Ah, you don't mean technical at all, you mean practical for somebody who is entirely uninterested in the way the filesystem works. Perhaps Reiser4 Transaction Design Document is what you are after, but I doubt it.
Bogtha Bogtha Bogtha
Ext2...Ext3...Ext4
Wait... I think I can detect a pattern. The next number has to be Ext7½!
GAAH! MY PRINTER IS ON FIRE!!! PUT IT OUT! PUT IT OUT!
Nobody has a fsck that can compare to e2fsck (ext2/ext3/etc.) for quality.
The e2fsck program has a huge test suite that it must pass before a release. A set of corrupted filesystems must be correctly repaired to be bit-for-bit identical to the desired result.
A typical fsck has a good chance of crashing (SIGSEGV, the "segmentation violation") when the going gets tough.
While FreeBSD's UFS developers were messing around with sync writes to avoid testing a fsck that would often crash, the ext2 developers ran full async and wrote a damn fine fsck to put things back in order. Now you can choose from three different levels of journalling, and you still get the ass-kicking fsck program.
There basically is no fsck for XFS, Reiserfs, or Reiser4. JFS doesn't have much AFAIK, and ZFS is a newborn.
What are you going to do when your fancy filesystem gets trashed? I hope you keep excellent backups, very recent and tested to be readable.
This is true, but let's look at the case of 1-2 drives:
Assuming we still want mirroring or volume management on our two drives:
The overhead is still greater for SVM or for linux md and sistina lvm. Both require more administration knowledge, time, and commands to accomplish the same tasks that ZFS can do in a couple commands. (Yes, I'm aware that mdadm helps the process a *bit*, but it's still obtuse.) Anyone who has setup either knows how annoying anything is with either choice. (having to micromanage partitions, etc.)
The biggest thing for ZFS in a ``small'' 1-2 drive usage case is, in my opinion, the pooling: ZFS doesn't require one to set volume sizes in advance. Since everything pulls out of a common pool, the size of volumes can grow or shrink accordingly. (Affected by free pool space or volume quotas.) So, that means that one can just create their volumes, and not have to worry about making them the wrong size.
I'd also argue that fault tolerance is important anywhere, large or small.
Another thing is on-disk, low overhead, compression that can be enabled just by toggling one filesystem paramater, live. For a lot of things that people store, this compression would save a lot of space.
They really put a lot of thought in ZFS. It scales amazingly well, from small to large. I'm not really giving it justice explaining it here, so I'd encourage you to look at the documentation with an open mind before just writing it off as an ``enterprise only'' thing.
dks
(I have no affiliation with Sun in any way.