EXT4 Is Coming
ah admin writes "A series of patches was proposed earlier on the Linux kernel mailing list by a team of engineers from Red Hat, ClusterFS, IBM and Bull to extend the Ext3 filesystem to add support for very large filesystems. After a long-winded discussion, the developers came forward with a plan to roll these changes into a new version: Ext4."
Interesting bit from wiki/ZFS:
What about a modularizable filesystem, which can be upgraded with modules for compression, encryption, larger file support, etc.? Is this impossible, or is it an unknown area for the Linux developers?
It's BS that people think it should be considered stable. Other than XFS under very heavy writes, I've never had more corruption than with Reiser4. It needs at least another year. ext3 on its own, though not awesome in all areas, hasn't lost me any data yet.
Ext4 is an extension of ext3, much like ext3 is an extension of ext2. The plan is to ensure backwards compatibility and sanity for when things break, and with filesystems, things break.
There are many factors that influence filesystems, not just "how fast it can write", but rather.. how it breaks when it does.
While the fanboys of XFS, JFS, and ZFS may promise that their filesystems are faster, have had no problems, are secure, and will not eat your data, those filesystems simply are not as proven as ext2 and ext3.
Scream, fanboys, scream; someone will listen. But the problem is that these filesystems are not proven in the field, or in some circumstances even in the kernel itself.
With a block size of 32 kB (64 kB is expected to be supported soonish) the 48-bit numbers will take you 1 byte over the maximum file size that apps can support. There is no UNIX-like OS that lets an app handle files bigger than 2**63.
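The "1 byte over" arithmetic is easy to check. A quick sketch, assuming 48-bit block numbers, 32 kB (2**15-byte) blocks, and a signed 64-bit file offset as the largest size an application can address (the variable names are just for illustration):

```python
BLOCK_NUMBER_BITS = 48
BLOCK_SIZE = 32 * 1024  # 32 kB = 2**15 bytes

# Largest size addressable with 48-bit block numbers and 32 kB blocks.
max_fs_bytes = (2 ** BLOCK_NUMBER_BITS) * BLOCK_SIZE

# Assumption: applications see file sizes through a signed 64-bit offset.
max_signed_offset = 2 ** 63 - 1

overshoot = max_fs_bytes - max_signed_offset
print(max_fs_bytes == 2 ** 63)  # True
print(overshoot)                # 1
```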
We'll need to adjust other things if filesystems ever get so huge. The whole design probably needs a rethink, but we can't do it now. We don't know what the future holds in terms of seek times, transfer rates, sector sizes, etc.
Reiser4 will never be declared stable in the Linux kernel because Hans Reiser refuses to make his code conformant to kernel coding standards. There has been long and wearying discussion of this on the LKML.
Though this may be needed in some rare applications, I don't see ext4 as something needed in the near future. As I understand it, the larger the maximum partition and file size, the more space the indexes will need (not to mention that speed will probably drop).
For example, if we have 20-bit indexes (2^20 clusters max) and use 4-kilobyte clusters, to increase the maximum space we'll either have to add one bit to the indexes to double the maximum space or we'll have to increase the cluster size and have problems storing small files (remember the FAT16->FAT32 transition?)
ext4's maximum size is thousands of times larger than ext3's, which will probably mean that indexes will need a lot more space, which will be bad for 8TB volumes (and besides, no one would notice any benefits!)
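To put numbers on the 20-bit example above, here is a rough sketch. It assumes a simple FAT-style model where capacity is the number of addressable clusters times the cluster size; the helper names are made up for illustration and say nothing about how ext3 or ext4 actually lay out their indexes.

```python
KIB = 1024

def max_volume_bytes(index_bits, cluster_bytes):
    # Capacity = number of addressable clusters * cluster size.
    return (2 ** index_bits) * cluster_bytes

def slack_bytes(file_bytes, cluster_bytes):
    # Space wasted in the last, partially filled cluster of a file.
    clusters = -(-file_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes - file_bytes

base = max_volume_bytes(20, 4 * KIB)             # 4 GiB
one_more_bit = max_volume_bytes(21, 4 * KIB)     # 8 GiB, wider index entries
bigger_clusters = max_volume_bytes(20, 8 * KIB)  # 8 GiB, same index width

print(base // KIB**3, one_more_bit // KIB**3, bigger_clusters // KIB**3)
# A 100-byte file wastes more space as clusters grow:
print(slack_bytes(100, 4 * KIB), slack_bytes(100, 8 * KIB))
```

Both routes double the capacity; the difference is whether the cost lands on the index entries or on small files.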
In ext4 they should get rid of some legacy stuff to foster development and usage of new technologies. The users of legacy technologies could still use ext3, and it would be very nice for ext4 users. I'm talking mostly about dropping support for the old-style octal file access permissions system, making the ACL system the default, and enabling the metadata features by default.
The fact that nothing ever pressures the distribution builders into using anything new has led to majorly slowed-down development of Linux.
Who cares? Linux has more than its fair share of filesystems, including XFS. I'm still wondering why XFS isn't used universally on desktop and server Linux installations everywhere. Is ext2/3 just 'traditional'?
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
"ZFS is an exotic beast with a totally ridiculous maximum capacity and tons of advanced features that do not exist in any other Unix filesystem, but are only useful for Big Iron."
Actually, except for its highly advanced algorithms, ZFS code is very small and simple, and on top of that, ZFS is really nice in small desktop deployments, where its "big iron" features give it the ability to detect and automatically correct garbage being delivered by that cheap SATA drive.
In fact, having been ported (compiles, doesn't yet run) to Linux and in the process of being ported to OS X and FreeBSD, ZFS is on a pretty good track to becoming ubiquitous... which would be the exact opposite of exotic.
There are or were a few quirks.
First off the bat: you can't install the bootloader in a XFS partition since XFS uses the first 512 byte block on the partition. Of course, most people install the bootloader in the MBR but for some it's an issue.
GRUB had a bug with XFS. When you tried to use a XFS partition as /boot, you could corrupt XFS.
For a considerable period of time, ext3's code was more stable than XFS.
ext3 has an ordered data mode (which is the default). Other journaled file systems only support writeback mode. In general, ordered data mode doesn't provide any better guarantee of consistency than writeback mode, but it does matter in a few special cases, and those can make a substantial difference to a desktop user.
Typical annoying case:
- You're editing a file in your favorite text editor and you save it.
- The editor opens the file in overwrite mode, meaning the file is actually deleted and a new one is created (under Linux's default settings, the OS will commit the changes to the metadata in 5 seconds or less and the changes to the data in 30 seconds or less).
- The changes to the metadata are committed to disk.
- The system crashes!
When the system comes back up, the new file is there but it's full of garbage.
With ext3's ordered data mode, the contents of the file would have been committed to disk before the associated changes to metadata. It's probable (but not assured!!) that after a crash you'll have either the old version or the new version of the file.
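The scenario above can be sketched as a toy model. This is not real filesystem code: it assumes a save is just two commits, "data" and "metadata", that ordered mode always commits data first, and that writeback mode happens to commit metadata first (as in the 5 second versus 30 second case described). The function name is made up for illustration.

```python
def visible_after_crash(mode, steps_completed):
    """Return what the file looks like if the crash hits after
    `steps_completed` commits (0, 1 or 2) in the given mode."""
    order = {
        "ordered": ["data", "metadata"],    # data reaches disk first
        "writeback": ["metadata", "data"],  # assumed worst case for this scenario
    }[mode]
    done = set(order[:steps_completed])
    if "metadata" not in done:
        return "old"      # metadata still describes the old file
    if "data" not in done:
        return "garbage"  # metadata points at blocks never written
    return "new"

for mode in ("ordered", "writeback"):
    print(mode, [visible_after_crash(mode, n) for n in range(3)])
# ordered ['old', 'old', 'new']
# writeback ['old', 'garbage', 'new']
```

In this simplified model, only the writeback ordering has a crash window that leaves garbage behind; real behavior also depends on the hardware and the application, hence the "not assured" above.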
Just a quick chime in, take it with a grain of salt. Some rambling thoughts.
I've just converted my main partition (non-/boot) on a notebook from XFS to reiser3, mainly because I work with huge svn working copies and svn loves to keep small files around, as well as create lots of small files (lock files, etc.) during routine svn work. xfs is considerably slower than reiserfs for svn status, update, commit, and cleanup. Besides, reiser3's tail feature means svn's penchant for small files uses less space overall on my tiny notebook hard drive. Not sure if the performance of reiser3 will degrade over time (I've been on xfs on this partition for longer than a year), but we'll see.
BTW, http://www.debian-administration.org/articles/388 My observations differ from theirs (operations on file tree). I do have a significantly larger number of files, and many of those are smaller than the default block size, so that might affect things.
On the server side, XFS, on multiple concurrent large, random, writes (postgresql) just creams reiser3 and ext3. (IIRC, battery backed SCSI raid controller, tested with both RAID1+0 and RAID5, Linux 2.6.x, 6 x 15000RPM 132(?)GB HDD) Read operations and single thread seq/random writes are too similar in performance for the various filesystems.
Another feature of XFS I used a lot (before converting to reiser3) is xfs_fsr, which defrags a mounted xfs filesystem. Oddly buggy though, as after some runs, some inodes tend to have max_extents corrupted (endian problem?). I'd recommend an xfs_repair after an xfs_fsr, which effectively makes xfs_fsr a utility for defragging *UN*mounted filesystems. So yeah, xfs is a tad unstable. I've only had one real corruption, though, and that's from killing the notebook power during some writes. Not sure if that's from the fs, or the hard disk misbehaving.
The main described change / advantage in this proposed ext4 is that a file's allocation is tracked via "extents" (a specified number of contiguous 2k blocks) rather than a chain of inode pointers (with up to 3 levels of indirection).
This is based not only on the need for a larger maximum file system, but a recognition that there is significant performance advantage to reducing read/write head movement and initiating large reads from consecutive blocks that can take advantage of the high transfer rates of today's drives. (this assumes that the OS filesystem doesn't attempt/require that the entire disk drive be cached in RAM to get decent performance)
Except for "write once" files, over time this will cause files to become physically spread over the disk and the performance benefit is reduced, unless a process periodically consolidates the blocks back into a contiguous series of blocks (ignoring for the moment that on today's disk drives, blocks may be "spared" into place that are not really physically consecutive, but just logically appear to be)...
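To make the extent idea and its sensitivity to fragmentation concrete, here is a small sketch. It assumes an extent is just a (start block, length) pair and collapses a per-block list into extents; the function name and block numbers are invented for illustration and this is not the on-disk format from the patches.

```python
def blocks_to_extents(blocks):
    """Collapse block numbers (in file order) into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # Block continues the current run: grow the last extent.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Block starts a new run: open a new extent.
            extents.append((block, 1))
    return extents

contiguous = list(range(1000, 1008))
fragmented = [1000, 1001, 5000, 5001, 5002, 42, 43, 9000]

print(blocks_to_extents(contiguous))  # [(1000, 8)]
print(blocks_to_extents(fragmented))  # [(1000, 2), (5000, 3), (42, 2), (9000, 1)]
```

A contiguous file needs a single entry no matter how long it is, while a file scattered by repeated rewrites needs one entry per run, which is where the benefit erodes.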
One of the "proofs" that *nix is superior to other O/Ss has been the absence of a need to "Defrag" the file system.
A commenter on the article also raises the question of why the "right" solution isn't to increase the 2k block size limit rather than rework the internals of the block pointers, and got the response that since the Linux kernel manages memory in 2k blocks, it is a nightmare in the kernel to support larger I/O transfers (although others here seem to indicate this is one of the solutions people have implemented).
Isn't "extents" a concept contained in NTFS? Has anyone looked into the patent implications of these proposed changes?
Final 2006 "Proof of Global Warming" US Hurricane Count -> 0
ext2fsck has a history of plenty of problems, just like everyone. I get reports from users swearing they will never again use ext*. Ted Tso goes walking around FUD'ing everyone else's fsck. He does this because ext* performance is poor, so there is not much else to do but FUD. Some users suspect that high performance is a little sinful, so this works on some.
All of the major filesystems have a decent fsck, and all of them are by now stable to the point that you should worry about your hardware and backups failing, not your FS. The only qualifier on that is that ZFS is new, and I hope no one will view that as my FUDing.
But if the code's already been changed, why hasn't it been included yet?
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.