EXT4 Is Coming
ah admin writes "A series of patches was proposed earlier on the Linux kernel mailing list by a team of engineers from Red Hat, ClusterFS, IBM and Bull to extend the Ext3 filesystem with support for very large filesystems. After a long-winded discussion, the developers came forward with a plan to roll these changes into a new version: Ext4."
LWN had an interesting article on ext4 not long ago.
That post makes more sense if you realize that there should be ^ marks to show exponentiation, such as 10^51 and 2^140. Otherwise it just looks like gibberish numbers that someone made up and stuck in the wiki for shits and giggles.
Reiser4 does this.
Bogtha Bogtha Bogtha
The kernel mailing list message:
Subject Proposal and plan for ext2/3 future development work
From "Theodore Ts'o"
Date Wed, 28 Jun 2006 19:55:39 -0400
Given the recent discussion on LKML two weeks ago, it is clear that many
people feel they have a stake in the future development plans of the
ext2/ext3 filesystem, as it is one of the most popular and commonly used
filesystems, particularly amongst the kernel development community. For
this reason, the stakes are higher than they would be for other
filesystems. The concerns that were expressed can be summarized in the
following points:
* Stability. There is a concern that while we are adding new
features, bugs might cause developers to lose work.
This is particularly a concern given that 2.6 is a
"stable" kernel series, but traditionally ext2/3
developers have been very careful even during
development series since kernel developers tend to get
cranky when all of their filesystems get trashed.
* Compatibility confusion. While the ext2/3 superblock does
have a very flexible and powerful system for
indicating forwards and backwards compatibility, the
possibility of user confusion has caused concern by
some, to the point where there has been one proposal
to deliberately break forwards compatibility in order
to remove possible confusion about backwards
compatibility. This seems to be going too far,
although we do need to warn against kernel and
distribution-level code from blindly upgrading users'
filesystems and removing the ability for those
filesystems to be mounted on older systems without an
explicit user approval step, preferably with tools
that allow for easy upgrading and downgrading.
* Code complexity. There is a concern that unless the code is
properly factored, it may become difficult to
read due to a lot of conditionals to support older
filesystem formats.
Unfortunately, these various concerns were sometimes mixed together in
the discussion two weeks ago, and so it was hard to make progress.
Linus's concern seems to have been primarily the first point, with
perhaps a minor consideration of the third. Others dwelled very heavily
on the second point.
To address these issues, after discussing the matter amongst ourselves,
the ext2/3 developers would like to propose the following path forward.
1) The creation of a new filesystem codebase in the 2.6 kernel tree in
/usr/src/linux/fs/ext4 that will initially register itself as the
"ext3dev" filesystem. This will be explicitly marked as a
CONFIG_EXPERIMENTAL filesystem, and will in effect be a "development
f
Actually, XFS (SGI), JFS (IBM), and ZFS (Sun) are very well proven in the field, on their respective native operating systems. Given the situations they're used in (financial sector, pharmaceutical research data, supercomputing), they're far more proven than EXT(anything). Now, whether the average Linux user knows how to install, tune, and use them is a different issue, but if I were worried about scalable, mission-critical filesystems, those three would be at the top of my list. (And my personal history says that while XFS never gave me any trouble, JFS would be my first choice. Nobody ever let me have a budget large enough to buy a machine that would justify ZFS.)
With IBM's know-how in the mix, EXT4 may be able to join the above three, but it would seem to be time better spent fixing XFS/JFS support in Linux first, rather than worrying about backwards compatibility with EXT2.
the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
"ZFS (Sun) are very well proven in the field"
Um, I have yet to see a production installation of ZFS in an enterprise environment, and it hasn't been out as an actual release for even a year yet. You probably mean UFS. HTH.
However, a kernel which didn't support EXT3 could still read and write EXT3. EXT3 is completely backwards compatible with EXT2. While you're running in EXT2 mode, none of the journalling stuff is done, but the data can still be read and written. Then you can unmount, and remount the drive as EXT3, and everything will be fine. At least that's my understanding. This might be harder to do with certain features. You can't just ignore encryption, especially when trying to read data.
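For anyone wondering how an older driver decides whether it can safely touch a newer filesystem: the ext2/3 superblock carries three sets of feature flags (compatible, incompatible, and read-only compatible). Here is a rough Python sketch of the decision as I understand it. The flag values are the ones commonly listed in the ext2/ext3 headers, but treat them as illustrative; `Features` and `mount_decision` are names made up for this example, not kernel code.

```python
from dataclasses import dataclass

# Illustrative flag values (assumed to match the ext2/ext3 headers).
COMPAT_HAS_JOURNAL = 0x0004      # a journal exists; older code may ignore it
INCOMPAT_FILETYPE = 0x0002       # directory entries carry a file type
INCOMPAT_RECOVER = 0x0004        # journal must be replayed before use
RO_COMPAT_SPARSE_SUPER = 0x0001  # fewer superblock backups
RO_COMPAT_LARGE_FILE = 0x0002    # files larger than 2 GiB may exist


@dataclass(frozen=True)
class Features:
    """The three feature bitmasks (hypothetical container for this sketch)."""
    compat: int = 0
    incompat: int = 0
    ro_compat: int = 0


def mount_decision(on_disk: Features, supported: Features) -> str:
    """Return 'refuse', 'read-only', or 'read-write'."""
    # Any unknown incompatible feature means the driver must not mount at all.
    if on_disk.incompat & ~supported.incompat:
        return "refuse"
    # Unknown read-only-compatible features allow reading but not writing.
    if on_disk.ro_compat & ~supported.ro_compat:
        return "read-only"
    # Unknown compatible features are simply ignored.
    return "read-write"


# A driver with no journalling support (roughly "ext2 mode").
ext2_driver = Features(
    compat=0,
    incompat=INCOMPAT_FILETYPE,
    ro_compat=RO_COMPAT_SPARSE_SUPER | RO_COMPAT_LARGE_FILE,
)

clean_ext3 = Features(
    compat=COMPAT_HAS_JOURNAL,
    incompat=INCOMPAT_FILETYPE,
    ro_compat=RO_COMPAT_SPARSE_SUPER,
)

# Same filesystem after a crash: the journal still needs to be replayed.
unclean_ext3 = Features(
    compat=COMPAT_HAS_JOURNAL,
    incompat=INCOMPAT_FILETYPE | INCOMPAT_RECOVER,
    ro_compat=RO_COMPAT_SPARSE_SUPER,
)
```

With these assumptions, `mount_decision(clean_ext3, ext2_driver)` gives "read-write" because the journal flag sits in the compatible set, while `mount_decision(unclean_ext3, ext2_driver)` gives "refuse". A feature like encryption would have to go in the incompatible set for the same reason.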
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Nobody has a fsck that can compare to e2fsck (ext2/ext3/etc.) for quality.
The e2fsck program has a huge test suite that it must pass before a release. A set of corrupted filesystems must be correctly repaired to be bit-for-bit identical to the desired result.
A typical fsck has a good chance of crashing (SIGSEGV, the "segmentation violation") when the going gets tough.
While FreeBSD's UFS developers were messing around with sync writes to avoid testing a fsck that would often crash, the ext2 developers ran full async and wrote a damn fine fsck to put things back in order. Now you can choose from three different levels of journalling, and you still get the ass-kicking fsck program.
There basically is no fsck for XFS, Reiserfs, or Reiser4. JFS doesn't have much AFAIK, and ZFS is a newborn.
What are you going to do when your fancy filesystem gets trashed? I hope you keep excellent backups, very recent and tested to be readable.
The new data structures take up less space. They are thus faster to write and faster to read. They also seem to make delayed allocation easier.
On Linux, XFS has slightly better large-file performance but worse small-file performance than EXT3. EXT3 is comparable in performance to reiser3 on small files (a few kilobytes), and is stable and reliable, unlike reiserfs. JFS is lacking quota support. EXT3 also has the option to do data journalling, not just meta-data journalling like the other journalling filesystems. Right now, unless you are larger than a few terabytes, EXT3 is the way to go. If you're larger, use XFS and accept the performance penalty and the occasional (massive) xfs_repair or restore (XFS is more likely to become corrupt due to memory or block layer errors, and recovers poorly compared to EXT3).
From what I understood the sector index will be configurable as either 32 or 64 bit, so pick it if you need it... Since there's no reason to use it unless the disk is that big, I imagine this can be set automatically. Also, the whole reason this will be ext4 is that they'll change the way it stores the sectors (ranges instead of singles) which will be better for big files, and since one sector is 4kB almost any file is "big".
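To put numbers on that, here is a small Python sketch under the comment's own assumptions (4 kB units, a 32-bit or 64-bit index). What the comment calls a "sector" is treated as a filesystem block here. The function names are made up for illustration, and the range encoding is only a toy version of "ranges instead of singles", not the actual on-disk format.

```python
BLOCK_SIZE = 4096  # bytes per block, as assumed in the comment above


def max_fs_bytes(index_bits, block_size=BLOCK_SIZE):
    """Largest addressable size when block numbers have index_bits bits."""
    return (2 ** index_bits) * block_size


def to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) runs."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the previous run, so extend that run.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Otherwise start a new run of length 1.
            extents.append((block, 1))
    return extents


TIB = 2 ** 40
limit_32 = max_fs_bytes(32) // TIB  # size limit in TiB with a 32-bit index

# A 24 kB file stored in two contiguous pieces.
file_blocks = [100, 101, 102, 103, 200, 201]
file_extents = to_extents(file_blocks)
```

Here `limit_32` comes out to 16, meaning a 32-bit index with 4 kB blocks tops out at 16 TiB, and the six block numbers shrink to two entries, `[(100, 4), (200, 2)]`. A contiguous file needs one entry no matter how long it is, which is where the win for big files comes from.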
Live today, because you never know what tomorrow brings
This is simply not true. ZFS is not just for big iron. Its strongest feature is perhaps the melding of the volume manager and RAID into one single unit, which greatly simplifies administration. Not to mention other nice features, either new or greatly simplified from their past versions, such as pooling, dynamic striping, CoW, instant snapshots and cloning, fault tolerance, etc.
I'd suggest reading through these links before spreading more misinformation:
http://unixconsult.org/zfs_vs_lvm.html - ZFS vs. Linux Raid vs. Linux LVM vs. Linux LVM + Raid
http://uadmin.blogspot.com/2006/05/why-zfs-for-home.html - Why ZFS for home
dks
This is true, but let's look at the case of 1-2 drives:
Assuming we still want mirroring or volume management on our two drives:
The overhead is still greater for SVM or for linux md and sistina lvm. Both require more administration knowledge, time, and commands to accomplish the same tasks that ZFS can do in a couple of commands. (Yes, I'm aware that mdadm helps the process a *bit*, but it's still obtuse.) Anyone who has set up either knows how annoying everything is with either choice. (Having to micromanage partitions, etc.)
The biggest thing for ZFS in a ``small'' 1-2 drive usage case is, in my opinion, the pooling: ZFS doesn't require one to set volume sizes in advance. Since everything pulls out of a common pool, the size of volumes can grow or shrink accordingly. (Affected by free pool space or volume quotas.) So, that means that one can just create their volumes, and not have to worry about making them the wrong size.
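To make the pooling idea concrete, here is a toy Python model. It is not the ZFS API or anything close to how ZFS is implemented; `Pool`, `create`, and `write` are hypothetical names, and sizes are plain integers. It only shows the accounting: volumes have no fixed size, they all draw from one shared pool, and a quota is an optional cap.

```python
class Pool:
    """Toy model of pooled storage with optional per-volume quotas."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = {}    # volume name -> space consumed so far
        self.quota = {}   # volume name -> optional cap (None means no cap)

    def create(self, name, quota=None):
        # No size is chosen up front; the volume starts out empty.
        self.used[name] = 0
        self.quota[name] = quota

    def free(self):
        # Free space is shared by every volume in the pool.
        return self.capacity - sum(self.used.values())

    def write(self, name, amount):
        limit = self.quota[name]
        if limit is not None and self.used[name] + amount > limit:
            raise ValueError("quota exceeded")
        if amount > self.free():
            raise ValueError("pool is full")
        self.used[name] += amount


pool = Pool(capacity=100)
pool.create("home")             # can grow until the pool runs out
pool.create("media", quota=30)  # capped at 30 units
pool.write("home", 60)
pool.write("media", 30)
```

After those two writes, `pool.free()` is 10. Any further write to "media" fails on its quota, and "home" can still take up to 10 more units. Nothing had to be sized in advance, which is the point.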
I'd also argue that fault tolerance is important anywhere, large or small.
Another thing is on-disk, low-overhead compression that can be enabled just by toggling one filesystem parameter, live. For a lot of things that people store, this compression would save a lot of space.
They really put a lot of thought into ZFS. It scales amazingly well, from small to large. I'm not really doing it justice explaining it here, so I'd encourage you to look at the documentation with an open mind before just writing it off as an ``enterprise only'' thing.
dks
(I have no affiliation with Sun in any way.)
Actually, I think you'll find that ZFS has been out as a production release (GA, or Generally Available) for just under 2 weeks now. That's weeks!
There is no way in hell that ZFS is even _remotely_ proven in the field. And since we're still fighting a bug in Sun Disksuite where you can't boot off the second disk when a disk in a mirror breaks, I'd be VERY loath to mention Sun, Filesystems and Disk management as being stable right now.
What are you talking about? I said I didn't like the coding standards. I then had us change the code to conform to them.