EXT4 Is Coming

How does it compare to zfs? by smartin · 2006-07-01 01:38 · Score: 0, Offtopic

I've heard good things about zfs, even that Apple may adopt it. Does anyone know how it compares to ext4?

--
The difference between Canada and the USA is that in Canada healthcare is a right and gun ownership is a privilege.

Re:How does it compare to zfs? by Ignominious+Cow+Herd · 2006-07-01 02:35 · Score: 3, Insightful

Ummm...zfs exists, ext4 doesn't. Yet.

--
Lump lingered last in line for brains, and the ones she got were sorta rotten and insane.
Re:How does it compare to zfs? by The_Wilschon · 2006-07-01 08:17 · Score: 1

Wow. I realized most people didn't RTFA, but this is the first instance I've seen of not RTFS(ummary)... It is proposed. That is, it doesn't exist yet.

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:How does it compare to zfs? by bensch128 · 2006-07-01 10:02 · Score: 1

You do know that ZFS is only for Solaris right? (excuse the crappy spelling)

Sun may or may not port it to Linux and then maybe you'll see it submitted to the main branch.

So right now, ZFS DOESN'T exist on Linux, nor does ext4.

Cheers
Ben
Re:How does it compare to zfs? by Ignominious+Cow+Herd · 2006-07-01 13:31 · Score: 1

No. And I wasn't trying to be excessively sarcastic there either (+3 insightful, -1 troll). My only point was that since ext4 doesn't really exist yet, it is a bit premature to start comparing it with other filesystems. Sure there are currently ideas and patches to get the ball rolling. But AFAIK it is far from a working new/different filesystem. And who knows what it will look like when it finally gets called ext4 (vs ext3-dev).

Then again, I'm pretty sarcastic. :)

--
Lump lingered last in line for brains, and the ones she got were sorta rotten and insane.
Re:How does it compare to zfs? by miro+f · 2006-07-01 15:05 · Score: 1

actually all you need is RTFT(itle) since it clearly says "Ext4 Is Coming" not "Ext4 Is Here"

--
being vague is almost as cool as doing that other thing...
Re:How does it compare to zfs? by The_Wilschon · 2006-07-10 14:40 · Score: 1

Which was pretty much exactly what I said... Ext4 is proposed, not extant. Thus, it would be largely futile to ask for a real comparison between ext4 and zfs.

In other words, I did RTFT, and I in fact said that ext4 is not here, it is coming. RMFC (My Comment).

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:How does it compare to zfs? by The_Wilschon · 2006-07-10 14:43 · Score: 1

Oh geez. I'm an idiot. Please disregard my other reply to you... Sorry I flamed you. I misread your comment as "actually you need to RTFT ...". And then I flamed you for misreading my comment. Kinda makes me look like a big dummkopf, eh? My apologies.

--
SIGSEGV caught, terminating

wait... not that kind of sig.

Sounds like a good idea. by Ant+P. · 2006-07-01 01:39 · Score: 5, Funny

This'll fill the gap between now and when Reiser4 is declared stable - some time after Duke Nukem Forever gets released.

Re:Sounds like a good idea. by Anonymous Coward · 2006-07-01 02:00 · Score: 2, Interesting

It's BS that people think it should be considered stable. I've never had more corruptions, other than using XFS w/ very heavy writes, than with Reiser4. It needs at least another year. ext3 on its own, though not awesome in all areas, hasn't lost me any data yet.
Re:Sounds like a good idea. by CRCulver · 2006-07-01 03:14 · Score: 4, Interesting

This'll fill the gap between now and when Reiser4 is declared stable

Reiser4 will never be declared stable in the Linux kernel because Hans Reiser refuses to make his code conformant to kernel coding standards. There has been long and wearying discussion of this on the LKML.
Re:Sounds like a good idea. by Anonymous Coward · 2006-07-01 03:47 · Score: 0

Interesting? He's just repeating what Ant P said.
Re:Sounds like a good idea. by Jesus_666 · 2006-07-01 08:27 · Score: 1

No, Ant P said what happens; the GP explained why it happens.

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Re:Sounds like a good idea. by mnmn · 2006-07-01 09:05 · Score: 2, Interesting

Who cares? Linux has more than its fair share of filesystems, including XFS. I'm still wondering why XFS isn't used universally on desktop and server Linux installations everywhere. Is ext2/3 just 'traditional'?

--
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
Re:Sounds like a good idea. by kimvette · 2006-07-01 09:06 · Score: 1

Well, for what it's worth, based on ReiserFS 3.6, I'd trust Reiser4 before I'd trust Ext2 (and derivatives - Ext3 is Ext2 with journaling on top) again. I've lost data more than once on Ext2, but never on Reiser. In fact, Reiser's journaling was able to rescue data for me when an ABIT motherboard caught fire (bad caps burst, the motherboard started smouldering, and one of the processors exploded and punched a hole through the board). One of the NTFS drives got scrambled, yet the Reiser drives on the same controller were intact. I could not access the data, but reiserfsck --fix-fixable brought the data back, to its latest-saved state. Very nice. I'd call that stable. Ext2 has lost data on me due to lesser mishaps. Also, I LOVE the zero-slack feature of Reiser. Why do most systems STILL have fixed block sizes? Why can they not just map to sectors instead? No, you have to waste an ENTIRE 64K/32K/16K/8K/4K/etc. block of disk space when it's only partially filled.

Of course I haven't read up on Ext4 yet to see if it's significantly changed, but based on the hack that is Ext3's journaling I personally wouldn't run it (Disclaimer: I WILL read up on it when it's released in a "stable" kernel and give it a fair chance if it's something more than Ext2 plus journaling). Granted, much of the work that went into making the implementation of ReiserFS stable is Novell's doing, but it's still a good design to start with.

Where does the current "Stable" Reiser fall short? Lack of the immutable bit, compression, and other extended attributes supported by Ext2/3. I do miss those, but IMHO the gain in data integrity and AVAILABLE usable space is worth the tradeoff. Reiser does have its faults, but in my experience it's been very, very stable. I have not tried Reiser4 yet but the documents outlining the design show a lot of promise.

(Note: this is not intended to be a troll, just my anecdotal experience, so Ext3 fanboys can just chill. If you disagree, you're welcome to post your own opinions.)

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:Sounds like a good idea. by kimvette · 2006-07-01 09:10 · Score: 1

I don't trust Ext3, because the journaling in it is a hack riding on top of Ext2, and if you mount the partition as Ext2 (say, from a rescue disk, or move the HDD to another partition) you lose the journal, or at least render it useless. What they ought to have done was made it NOT backwards-compatible so it could not be mounted as Ext2, then the journal and data remain intact and in sync. IMHO of course.

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:Sounds like a good idea. by Anonymous Coward · 2006-07-01 09:25 · Score: 0

Linux XFS + Linux Kernel NFS server == BAD. I was using XFS happily for _years_ on my workstations. But it just doesn't get on with kernelspace NFS, as I discovered to my cost when using it as the file store for a small cluster. I don't think it's exactly "XFS' fault" or anything. It's just two lumps of code, written by two different groups of people, living in the same memory space, that Just Don't Get On.
Re:Sounds like a good idea. by raxx7 · 2006-07-01 11:08 · Score: 3, Interesting

There are or were a few quirks.

First off the bat: you can't install the bootloader in an XFS partition since XFS uses the first 512 byte block on the partition. Of course, most people install the bootloader in the MBR but for some it's an issue.

GRUB had a bug with XFS. When you tried to use an XFS partition as /boot, you could corrupt XFS.

For a considerable period of time, ext3's code was more stable than XFS.

ext3 has an ordered data mode (which is the default). Other journaled file systems only support writeback mode. In general, ordered data mode doesn't provide any better guarantee of consistency than writeback mode, but it does make an important difference in a few special cases, and those can matter substantially to a desktop user.

Typical annoying case:
- You're editing a file on your favorite text editor and you save it.
- The editor opens the file in overwrite mode, meaning the file is actually deleted and a new one is created (under Linux's default settings, the OS will commit the changes to the metadata in 5 seconds or less and the changes to the data in 30 seconds or less).
- The changes to the metadata are committed to disk.
- The system crashes!
When the system comes back up, the new file is there but it's full of garbage.

With ext3's ordered data mode, the contents of the file would have been committed to disk before the associated changes to metadata. It's probable (but not assured!!) that after a crash you'll have either the old version or the new version of the file.
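
For what it's worth, the crash scenario above comes from overwriting the file in place. A common application-side workaround (not something ext3 or this post prescribes, just a sketch) is to write a temporary file, fsync it, and rename it over the original, so a crash should leave either the old or the new contents. The name `atomic_save` is made up for illustration.

```python
import os
import tempfile


def atomic_save(path, data):
    """Replace `path` with `data` so a reader sees the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file data to disk before the rename
        os.replace(tmp_path, path)  # rename over the original in one step
    except BaseException:
        # Remove the leftover temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Usage would be something like `atomic_save("notes.txt", b"new contents")`. How much this helps still depends on the filesystem and mount options, so treat it as a mitigation rather than a guarantee.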
Re:Sounds like a good idea. by Nutria · 2006-07-01 12:02 · Score: 1

I'd trust Ext2 (and derivatives - Ext3 is Ext2 with journaling on top) again. I've lost data more than once on Ext2

While ext3 definitely started as a fork of ext2, I'm pretty sure that it's been totally rewritten by now.

--
"I don't know, therefore Aliens" Wafflebox1
Re:Sounds like a good idea. by SaDan · 2006-07-01 15:03 · Score: 2, Insightful

I've read the arguments on LKML, and it seems to me Hans isn't the only one being stubborn about filesystems and whatnot in the kernel. The kernel developers are unyielding to modernizing the VM subsystem, which is causing a lot of grief for ReiserFS.

It's ugly, and annoying, especially for people like me who rely on ReiserFS in production. I'd love to see ReiserFS 4 in the standard kernel, it'd make my life a lot easier.

I can't use EXT2/3, it's too slow and just kills the machine for the amount of files we deal with on a lot of our systems. Going from Ext3 to ReiserFS 3 took us from a machine load of over 50 down to about 3 during stress testing recently.

Hans knows what he's doing, I just wish the kernel developers would accept and respect that (regardless of the retarded ego wars on the LKML).
Re:Sounds like a good idea. by szap · 2006-07-01 16:29 · Score: 2, Interesting

Just a quick chime in, take it with a grain of salt. Some rambling thoughts.

I've just converted my main partition (non-/boot) on a notebook from XFS to reiser3, mainly because I work with huge svn working copies and svn loves to keep small files around, as well as create lots of small files (lock files, etc) during routine svn work. xfs is just considerably slower than reiserfs for svn status, update, commit, cleanup. Besides, reiser3's tail feature means svn's penchant for small files uses less space overall on my tiny notebook harddrive. Not sure if performance of reiser3 will degrade over time (I've been on xfs on this partition for longer than a year), but we'll see.

BTW, http://www.debian-administration.org/articles/388 My observations differ from theirs (operations on file tree). I do have a significant larger amount of files, and many of those are smaller than the default block size, so that might affect things.

On the server side, XFS, on multiple concurrent large, random, writes (postgresql) just creams reiser3 and ext3. (IIRC, battery backed SCSI raid controller, tested with both RAID1+0 and RAID5, Linux 2.6.x, 6 x 15000RPM 132(?)GB HDD) Read operations and single thread seq/random writes are too similar in performance for the various filesystems.

Another feature of XFS I used a lot (before converting to reiser3) is xfs_fsr, which defrags a mounted xfs filesystem. Oddly buggy though, as after some runs, some inodes tend to have max_extents corrupted (endian problem?). I'd recommend an xfs_repair after an xfs_fsr, which effectively makes xfs_fsr a utility for defragging *UN*mounted filesystems. So yeah, xfs is a tad unstable. I've only had one real corruption, though, and that's from killing the notebook power during some writes. Not sure if that's from the fs, or the harddisk misbehaving.
Re:Sounds like a good idea. by teg · 2006-07-02 04:02 · Score: 1

Is that ext3 with directory hashing, or without?
Re:Sounds like a good idea. by hansreiser · 2006-07-02 05:16 · Score: 1

Reiser4 was fairly stable 1-2 years ago. We need more users to find bugs.
Re:Sounds like a good idea. by hansreiser · 2006-07-02 05:19 · Score: 4, Informative

What are you talking about? I said I didn't like the coding standards. I then had us change the code to conform to them.
Re:Sounds like a good idea. by bzipitidoo · 2006-07-02 17:37 · Score: 1

A pity that Reiser4 isn't available for 2.4 kernels. I read somewhere that v4 is even better than v3 for disk space. Every byte counts on old systems, both disk and RAM, so v4 interested me for that reason. But, I often use a 2.4 kernel because 2.6 takes a bit more RAM. I am more willing to experiment with something like v4 on an old computer where I'm not keeping anything critical.

--
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Re:Sounds like a good idea. by fbjon · 2006-07-02 19:32 · Score: 2, Interesting

But if the code's already been changed, why hasn't it been included yet?

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:Sounds like a good idea. by Anonymous Coward · 2006-07-03 00:26 · Score: 0

I actually found speed improvements on my system when using ext3 over reiserFS 3. I assumed ext3 was slow, but at least in my case (using the dir_index option), I found it just as fast as reiserfs.

Yes but by Anonymous Coward · 2006-07-01 01:40 · Score: 5, Interesting

Yes, but will it be enough if you had energy to boil all the oceans?

Interesting bit from wiki/ZFS:

ZFS is a 128-bit file system, which means it can provide 16 billion billion times the capacity of current 64-bit systems. The limitations of ZFS are designed to be so large that they will never be encountered in any practical operation. When contemplating the capacity of this system, Bonwick stated "Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

In reply to a question about filling up the ZFS without boiling the ocean, Jeff Bonwick, an engineer at Sun Microsystems who led the team in developing ZFS for Solaris, offered this answer:

"Although we'd all like Moore's Law to continue forever, quantum mechanics imposes some fundamental limits on the computation rate and information capacity of any physical device. In particular, it has been shown that 1 kilogram of matter confined to 1 liter of space can perform at most 1051 operations per second on at most 1031 bits of information [see Seth Lloyd, "Ultimate physical limits to computation." Nature 406, 1047-1054 (2000)]. A fully-populated 128-bit storage pool would contain 2128 blocks (nibbles) = 2137 bytes = 2140 bits; therefore the minimum mass required to hold the bits would be (2140 bits) / (1031 bits/kg) = 136 billion kg.

To operate at the 1031 bits/kg limit, however, the entire mass of the computer must be in the form of pure energy. By E=mc2, the rest energy of 136 billion kg is 1.2x1028 J. The mass of the oceans is about 1.4x1021 kg. It takes about 4,000 J to raise the temperature of 1 kg of water by 1 degree Celsius, and thus about 400,000 J to heat 1 kg of water from freezing to boiling. The latent heat of vaporization adds another 2 million J/kg. Thus the energy required to boil the oceans is about 2.4x106 J/kg * 1.4x1021 kg = 3.4x1027 J. Thus, fully populating a 128-bit storage pool would, literally, require more energy than boiling the oceans."

Re:Yes but by Anonymous Coward · 2006-07-01 01:48 · Score: 4, Informative

That post makes more sense if you realize that there should be ^ marks to show exponentiation, such as 10^51 and 2^140. Otherwise it just looks like gibberish numbers that someone made up and stuck in the wiki for shits and giggles.
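
With the exponents put back, the arithmetic can be checked in a few lines. This is only a sketch using the figures from the quoted text (10^31 bits/kg, 2^140 bits, 1.4x10^21 kg of ocean, 2.4x10^6 J/kg to boil); the rounded speed of light is my own assumption.

```python
# Figures from the quoted text, with exponents restored.
BITS_PER_KG = 10**31        # quoted limit on bits per kilogram of matter
POOL_BITS = 2**140          # fully populated 128-bit pool, as quoted
OCEAN_MASS_KG = 1.4e21      # quoted mass of the oceans
BOIL_J_PER_KG = 2.4e6       # 400,000 J to heat plus 2,000,000 J to vaporize

C = 3.0e8                   # speed of light in m/s (rounded; an assumption)

mass_kg = POOL_BITS / BITS_PER_KG        # minimum mass needed to hold the bits
storage_energy_j = mass_kg * C**2        # rest energy of that mass, E = mc^2
boil_energy_j = BOIL_J_PER_KG * OCEAN_MASS_KG

print(f"mass needed:       {mass_kg:.3g} kg")
print(f"rest energy:       {storage_energy_j:.3g} J")
print(f"energy to boil:    {boil_energy_j:.3g} J")
print("storage > boiling:", storage_energy_j > boil_energy_j)
```

The mass comes out a little under 1.4x10^11 kg, which is in the same ballpark as the quoted 136 billion kg, and the rest energy is a few times the energy needed to boil the oceans, so the conclusion holds either way.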
Re:Yes but by thephotoman · 2006-07-01 02:37 · Score: 1

The limitations of ZFS are designed to be so large that they will never be encountered in any practical operation. When contemplating the capacity of this system, Bonwick stated "Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans.

That is, until next week, when some guy in Peoria manages to do just that by trying to create a single mirror of all the pr0n on the Internet.

--
Haec merda tauri est. Ceterum censeo Carthaginem esse delendam.
Re:Yes but by AccUser · 2006-07-01 02:37 · Score: 1

Wow. Imagine the size of the fan you would need to keep that hard drive cool.

--
Any fool can talk, but it takes a wise man to listen.
Re:Yes but by Anonymous Coward · 2006-07-01 05:49 · Score: 2, Funny

Nature 406, 10^47-10^54 (2000)
Volume 406 is really thick.
Re:Yes but by Anonymous Coward · 2006-07-01 06:22 · Score: 0

hey, thanks, man. I was thinking and thinking about what "1 kilogram of matter confined to 1 liter of space can perform at most 1051 operations per second on at most 1031 bits of information" could possibly mean? Can't a 286 do that? I was thinking "do they mean 2^1031 ? Then WTF??" Thanks for clearing that up :)
Re:Yes but by Anonymous Coward · 2006-07-01 06:31 · Score: 1, Funny

"That's no moon. It's a full ZFS SAN."
Re:Yes but by Anonymous Coward · 2006-07-01 07:41 · Score: 0

In reply to a question about filling up the ZFS without boiling the ocean, Jeff Bonwick, an engineer at Sun Microsystems who led the team in developing ZFS for Solaris, offered this answer

Sun, boiling oceans... Am I the only one who suddenly feels very uncomfortable on this planet?..
Re:Yes but by Anonymous Coward · 2006-07-01 11:12 · Score: 0

I can't believe there are at least 10 wasted mod points on anonymous coward. Wait...yes I can. Wait...I can't believe I'm replying to this.

LWN article on ext4 by ElMiguel · 2006-07-01 01:42 · Score: 5, Informative

LWN had an interesting article on ext4 not long ago.

Modularizable filesystem by Square+Snow+Man · 2006-07-01 01:48 · Score: 2, Interesting

What about a modularizable filesystem, which can be upgraded with modules for compression, encryption, larger file support etc.? Is this impossible or is it an unknown area for the linux developers?

Re:Modularizable filesystem by Bogtha · 2006-07-01 02:00 · Score: 5, Informative

Reiser4 does this.

--
Bogtha Bogtha Bogtha
Re:Modularizable filesystem by dbIII · 2006-07-01 02:28 · Score: 2, Insightful

Interesting article - the premise that Reiser is more stable than ext3 "because it has been out longer", the quote from Adam Smith, the ridicule of the unix approach of everything as a file and all the naked people covered in newsprint?
Anyone have a "more technical" link without dancing trees and with a bit about how to recover your filesystem when something goes weird with the hardware even if the filesystem is perfect?
Re:Modularizable filesystem by hskinnemoen · 2006-07-01 02:59 · Score: 1

What's the point? One of the main reasons for creating a new filesystem with a new name is that the on-disk format changes. No matter how you implement the change, using plugins or otherwise, you need to make it very clear to the users that if they upgrade to the new on-disk format, they will not be able to use older kernels. The most obvious way of doing this is giving the filesystem a new name -- if your kernel doesn't support ext4, you won't be able to read an ext4 filesystem.
Besides, filesystems can be considered modularized components of the VFS. There will always be room for another layer of abstraction, but that doesn't mean that adding it will make things any better.
Re:Modularizable filesystem by Bogtha · 2006-07-01 03:01 · Score: 4, Insightful

the premise that Reiser is more stable than ext3 "because it has been out longer"

It's dishonest to put something in quotes when it's not a direct quote. The exact quote is:

"We don't touch the V3 code except to fix a bug, and as a result we don't get bug reports for the current mainstream kernel version. It shipped before the other journaling filesystems for Linux, and is the most stable of them as a result of having been out the longest. We must caution that just as Linux 2.6 is not yet as stable as Linux 2.4, it will also be some substantial time before V4 is as stable as V3."

There's a substantial difference between saying that something is more stable "as a result" of something and more stable "because" of something. He's not claiming that being out longer intrinsically makes it more stable as your misquote suggests, he's claiming that it led to reiserfs becoming more stable - because of the practices he mentioned.

In short - something being out longer == more stable? No. Something being exposed to lots of real-world use and receiving only bugfixes == more stable? Yes.

the quote from Adam Smith

He didn't quote Adam Smith, he drew an analogy between what he was saying and the network effect. It's an entirely reasonable analogy.

the ridicule of the unix approach of everything as a file

What ridicule? He's actually supporting that approach. For example:

Can we do everything that can be done with {files, directories, attributes, streams} using just {files, directories}? I say yes--if we make files and directories more powerful and flexible. I hope that by the end of reading this you will agree.

Would you care to point out where you thought he was ridiculing the UNIX approach?

all the naked people covered in newsprint

Yeah, they look dumb, don't they?

Anyone have a "more technical" link

I can only assume you mean something other than "technical".

without dancing trees

Dancing trees are a fundamental part of the design. How are you meant to understand the filesystem without understanding dancing trees?

and with a bit about how to recover your filesystem when something goes weird with the hardware even if the filesystem is perfect?

Ah, you don't mean technical at all, you mean practical for somebody who is entirely uninterested in the way the filesystem works. Perhaps Reiser4 Transaction Design Document is what you are after, but I doubt it.

--
Bogtha Bogtha Bogtha
Re:Modularizable filesystem by Znork · 2006-07-01 03:06 · Score: 1

Take a look at the device mapper and associated stuff like EVMS, LVM2, dm-crypt, etc.

Arbitrary block device layering is the way forward.
Re:Modularizable filesystem by jZnat · 2006-07-01 03:09 · Score: 1

You can't really get more technical about filesystems than talking about things like dancing trees and other algorithms for storing and retrieving data quickly, safely, and efficiently. Don't forget, Hans Reiser is a filesystem expert, so don't expect his site to not have that sort of information on it.

--
'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
Re:Modularizable filesystem by CastrTroy · 2006-07-01 03:22 · Score: 2, Informative

However, a kernel which didn't support EXT3 could still read and write EXT3. EXT 3 is completely backwards compatible with EXT2. While you're running in EXT2 mode, none of the journalling stuff is done, but the data can still be read and written. Then you can unmount, and remount the drive as EXT3, and everything will be fine. At least that's my understanding. This might be harder to do with certain features. You can't just ignore encryption. Especially when trying to read data.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Modularizable filesystem by Antique+Geekmeister · 2006-07-01 03:30 · Score: 1

Like the Abrams tank going 60 miles an hour, ReiserFS is fine for doing all sorts of amazing things, once. Actually trying to recover any data from ReiserFS if any hardware errors creep in, such as a failing drive in a RAID array, is like the drive train seizing up on the Abrams: it's a big development effort to get spectacular features which don't actually work well in the field.

Ext2 and its descendants have been less ambitious and thus considerably more robust.
Re:Modularizable filesystem by dbIII · 2006-07-01 03:50 · Score: 1

Sorry it's beyond midnight in this part of the world - so here is the correct quote:
"and is the most stable of them as a result of having been out the longest"
The context is supplied in the article and the portion above - do you see what I am getting at here and why I do not agree that it is a definition of stability? It certainly gives me no confidence to the questions of stability and recovery, which I'm sure are answered elsewhere, but no - I didn't think much of the article - perhaps the annoying graphics distracted too much for me to get much out of the content.
Thanks for the second link. No pictures of people dancing to describe dancing trees.
Re:Modularizable filesystem by dbIII · 2006-07-01 04:00 · Score: 1

OK - I should have phrased that as without pictures of people pretending to be trees that are dancing, or even pictures of leafy trees dancing. Something boring with only the sort of diagrams and descriptions you would see in a published paper would be nice.
Re:Modularizable filesystem by Khazunga · 2006-07-02 10:34 · Score: 1

Ext2 and its descendants have been less ambitious and thus considerably more robust.
Ext2 yes. It's a workhorse, with limitations but extremely robust. Its descendants, no. I've lost one partition to ext3 when I first tried it out -- yes, it was unstable -- and one a year after when journaling failed and fsck then ate my data. On the other side, I've been using Reiserfs ever since SuSE began setting it as the default (longer than I remember). I've yet to see a data failure on reiser.

--
If at first you don't succeed, skydiving is not for you
Re:Modularizable filesystem by Antique+Geekmeister · 2006-07-02 13:09 · Score: 1

I have seen ReiserFS failures, in Terabyte scale RAID arrays on several operating systems, where ext3 filesystems remained reliable and usable long-term. The early 2.5 kernel ext3 versions were not that stable: after 2.6 was released, it's been a serious workhorse, competing well with ReiserFS for speed and reliability.

When you start seeing randomly named files that cannot be deleted in a ReiserFS system, you may as well scrub the whole partition and revert to your previous backup: it can't be recovered, can't be fixed, and can't even be reliably backed up anymore, unless ReiserFS has done some serious evolution in the last 18 months.
Re:Modularizable filesystem by Anonymous Coward · 2006-07-05 21:37 · Score: 0

Hmmm, prior to this post I had no idea how this filesystem was supposed to work - but this post has made me very very interested - specifically the naked people, and the dancing trees - not together or it sounds too hippy-ish, but apart the ideas sound freaking awesome!

ClusterFS by schon · 2006-07-01 01:51 · Score: 5, Funny

engineers from Red Hat, ClusterFS, IBM

OK, hands up - who wants to run ClusterFS so that they can say they needed to do a "clusterfsck"?

Re:ClusterFS by ScrewMaster · 2006-07-01 02:15 · Score: 0

Dammit! You beat me to it.

--
The higher the technology, the sharper that two-edged sword.
Re:ClusterFS by Anonymous Coward · 2006-07-01 02:57 · Score: 1, Insightful

Unfortunately, lustre's fsck is just called lfsck. You run it across the aggregate of file stores making up the cluster filesystem to ensure total consistency, after running clusterfs' e2fsck friendly-fork on each individual object store to ensure local consistency. Most of these changes to ext3 -> ext4 are driven by the needs of Lustre. Lustre FS is basically a virtual filesystem striped across a load of ext3 filesystems on servers. It is blazingly fast and very stable in our tests (clusterfs are _very_ conservative about tagging something "stable", their stuff is used on some serious computers).

define very large by frovingslosh · 2006-07-01 02:07 · Score: 2, Insightful

OK, I've read both links. What does this mean? Can anyone give a breakdown of ext3 vs. ext4, particularly in terms of what size files and what size partitions they both support, as well as any other differences that can be quantified?

--
I'm an American. I love this country and the freedoms that we used to have.

Re:define very large by gsnedders · 2006-07-01 02:22 · Score: 1

ext3 has a 32TiB max volume size, and 2TiB max file size.
Re:define very large by Kjella · 2006-07-01 02:31 · Score: 5, Insightful

Let me put it this way, it's a little past the average slashdot porn collection:

ext3: 8TB total, 4TB files
ext4: 32 zettabyte (1024*1024*1024 TB), 1 exabyte files (1024*1024 TB)

Beyond that, it doesn't seem to actually change much.

--
Live today, because you never know what tomorrow brings
Re:define very large by runep · 2006-07-01 03:17 · Score: 3, Funny

Let me put it this way, it's a little past the average slashdot porn collection:
I think you underestimate the combination of lonely geeks, OCD, unemployment, broadband and wget.
Re:define very large by zlogic · 2006-07-01 03:50 · Score: 2, Interesting

Though this may be needed in some rare applications, I don't see ext4 as something needed in the near future. As I understand, the larger the max partition&file size, the more space indexes will need (not to mention that speed will probably drop).
For example, if we have 20-bit indexes (2^20 clusters max) and use 4-kilobyte clusters, to increase the maximum space we'll either have to add one bit to the indexes to double the maximum space or we'll have to increase the cluster size and have problems storing small files (remember the FAT16->FAT32 transition?)
ext4 is thousands of times larger than ext3, which will probably mean that indexes will need a lot more space, which will be bad for 8TB volumes (and besides, no one would notice any benefits!)
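
The tradeoff in the 20-bit example can be put in numbers. This is a rough sketch of that hypothetical scheme only (fixed-width cluster indexes, fixed cluster size), not of how ext3 or ext4 actually lay things out; the function names are made up.

```python
KIB = 1024
GIB = 1024**3


def max_capacity_bytes(index_bits, cluster_bytes):
    """Largest volume addressable with `index_bits`-bit cluster indexes."""
    return (2 ** index_bits) * cluster_bytes


def slack_bytes(file_bytes, cluster_bytes):
    """Space wasted in the last, partially filled cluster of one file."""
    remainder = file_bytes % cluster_bytes
    return 0 if remainder == 0 else cluster_bytes - remainder


# Baseline from the example: 20-bit indexes, 4 KiB clusters.
print(max_capacity_bytes(20, 4 * KIB) // GIB, "GiB")   # 4 GiB
# Option 1: one more index bit doubles the capacity.
print(max_capacity_bytes(21, 4 * KIB) // GIB, "GiB")   # 8 GiB
# Option 2: doubling the cluster size also doubles the capacity...
print(max_capacity_bytes(20, 8 * KIB) // GIB, "GiB")   # 8 GiB
# ...but a small file then wastes more of its last cluster.
print(slack_bytes(1000, 4 * KIB), "bytes wasted with 4 KiB clusters")
print(slack_bytes(1000, 8 * KIB), "bytes wasted with 8 KiB clusters")
```

So in this toy model wider indexes cost space per index entry, while bigger clusters cost space per small file.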
Re:define very large by glwtta · 2006-07-01 04:16 · Score: 3, Insightful

ext3: 8TB total, 4TB files
ext4: 32 zettabyte (1024*1024*1024 TB), 1 exabyte files (1024*1024 TB)

Are they just going to work on improving the 8TB paper limitation, or are they actually trying to improve on ext3 scalability, which currently tends to suck the big one, especially on a significant number of disks (eg: http://scalability.gelato.org/DiskScalability/Results)?

I also seem to keep coming up against a pretty hard 2TB block device limit in Linux (eg LVM2 lv size, LUN size for fibre attached SAN, etc). I don't really know what the reasons for it are, anyone know what technologies allow for larger single partitions?

Anyway, I've long ago settled on reiserfs (3) for speedy random access to small files, and XFS for file server type applications; though I still wonder why RedHat doesn't include any "enterprise" filesystems by default in their "enterprise" products (I know, I know, you can enable it - I did say "by default").

--
sic transit gloria mundi
Re:define very large by Kjella · 2006-07-01 04:29 · Score: 3, Informative

From what I understood the sector index will be configurable as either 32 or 64 bit, so pick it if you need it... Since there's no reason to use it unless the disk is that big, I imagine this can be set automatically. Also, the whole reason this will be ext4 is that they'll change the way it stores the sectors (ranges instead of singles) which will be better for big files, and since one sector is 4kB almost any file is "big".

--
Live today, because you never know what tomorrow brings
Re:define very large by Anonymous Coward · 2006-07-01 09:35 · Score: 0

I also seem to keep coming up against a pretty hard 2TB block device limit in Linux

Well one particular Stupid Limit isn't linux's fault at all: the ancient and crappy MS-DOS partition-table format doesn't support partitions larger than 2TB. Use a different partition-table / disklabel format on your disks, and single partitions/slices can be bigger than 2TB. Even windoze users hit this, nowadays.

Also, some limits are a result of 32-bit architecture, we had actually moved to x86_64 linux servers before any of our individual raid arrays crossed 2TB in size. 2TB limits are typically related to the maximum size of a 32-bit signed integer...
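As a rough sketch of where a 2TB ceiling can come from: the MS-DOS partition table stores sector counts in 32 bits, and with 512-byte sectors that tops out at 2 TiB. A signed 32-bit count of 1 KiB units lands on the same number. The sector and unit sizes below are assumptions for illustration, and the helper name is hypothetical.

```python
TIB = 1 << 40

def limit_bytes(count_bits: int, unit_bytes: int) -> int:
    """Capacity reachable with a counter of count_bits bits over fixed-size units."""
    return (1 << count_bits) * unit_bytes

# 32-bit unsigned sector count, 512-byte sectors (assumed sector size).
partition_table_limit = limit_bytes(32, 512)

# 31 usable bits of a signed 32-bit integer, counting 1 KiB units (assumed unit).
signed_kib_limit = limit_bytes(31, 1024)

print(partition_table_limit // TIB, signed_kib_limit // TIB)
```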
Re:define very large by rcamans · 2006-07-01 10:18 · Score: 1

ext3 is limited, unless it has LFS (Large File System) support in it. Then, if you have 8k block size, you support much larger than the claimed 8 TB total, 4 TB files. See http://www.suse.de/~aj/linux_lfs.html for the table at the end of the article, although I believe the table has errors (swaps). Then you also have to have a 2.6 kernel, and CONFIG_LBD set.

--
wake up and hold your nose
Re:define very large by Anonymous Coward · 2006-07-01 10:22 · Score: 0

There's something here I don't understand, namely how ext4 will support 1 exabyte files with the structure defined in the LWN article:

__le32 ee_block; /* first logical block extent covers */

If there's only 32 bits to specify the extent's first logical block (the extent's first block in the file), and block size remains at 4k, isn't 16 TB the farthest away from the start of the file that an extent can start? So isn't the max filesize 16 TB + however big the last extent is? What am I missing here?
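The arithmetic behind the question can be checked directly. This sketch only restates the poster's assumptions (a 32-bit ee_block and 4k blocks); it says nothing about what the final ext4 format will do, and the function name is hypothetical.

```python
TIB = 1 << 40

def logical_range_bytes(logical_block_bits: int = 32, block_bytes: int = 4096) -> int:
    """Bytes of file offset reachable by a logical block number of the given width."""
    return (1 << logical_block_bits) * block_bytes

reach = logical_range_bytes()
print(reach // TIB)  # TiB of file offset covered by 32-bit logical block numbers
```

Under those assumptions the last extent can start just short of 16 TiB into the file, which is the figure the question is built on.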
Re:define very large by Kjella · 2006-07-01 12:33 · Score: 1

I think you underestimate the combination of lonely geeks, OCD, unemployment, broadband and wget.

We're talking about a dozen full 750GB hdds (with no redundancy) to hit 8TiB. That's over four grand just for the disks without controllers, never mind the broadband you need. Do tell where you can get that on unemployment benefits...

--
Live today, because you never know what tomorrow brings
Re:define very large by Nutria · 2006-07-01 15:13 · Score: 1

That's over four grand just for the disks without controllers, never mind the broadband you need. Do tell where you can get that on unemployment benefits...

Well, "he" lives in his parents' basement, and earns some cash doing errands for them...

--
"I don't know, therefore Aliens" Wafflebox1

LKML Message by Anonymous Coward · 2006-07-01 02:20 · Score: 3, Informative

The kernel mailing list message:

Subject Proposal and plan for ext2/3 future development work
From "Theodore Ts'o"
Date Wed, 28 Jun 2006 19:55:39 -0400

Given the recent discussion on LKML two weeks ago, it is clear that many
people feel they have a stake in the future development plans of the
ext2/ext3 filesystem, as it is one of the most popular and commonly used
filesystems, particularly amongst the kernel development community. For
this reason, the stakes are higher than it would be for other
filesystems. The concerns that were expressed can be summarized in the
following points:

* Stability. There is a concern that while we are adding new
features, bugs might cause developers to lose work.
This is particularly a concern given that 2.6 is a
"stable" kernel series, but traditionally ext2/3
developers have been very careful even during
development series since kernel developers tend to get
cranky when all of their filesystems get trashed.

* Compatibility confusion. While the ext2/3 superblock does
have a very flexible and powerful system for
indicating forwards and backwards compatibility, the
possibility of user confusion has caused concern by
some, to the point where there has been one proposal
to deliberately break forwards compatibility in order
to remove possible confusion about backwards
compatibility. This seems to be going too far,
although we do need to warn against kernel and
distribution-level code from blindly upgrading users'
filesystems and removing the ability for those
filesystems to be mounted on older systems without an
explicit user approval step, preferably with tools
that allow for easy upgrading and downgrading.

* Code complexity. There is a concern that unless the code is
properly factored, that it may become difficult to
read due to a lot of conditionals to support older
filesystem formats.

Unfortunately, these various concerns were sometimes mixed together in
the discussion two months ago, and so it was hard to make progress.
Linus's concern seems to have been primarily the first point, with
perhaps a minor consideration of the 3rd. Others dwelled very heavily
on the second point.

To address these issues, after discussing the matter amongst ourselves,
the ext2/3 developers would like to propose the following path forward.

1) The creation of a new filesystem codebase in the 2.6 kernel tree in /usr/src/linux/fs/ext4 that will initially register itself as the
"ext3dev" filesystem. This will be explicitly marked as an
CONFIG_EXPERIMENTAL filesystem, and will in effect be a "development
f

Too late... by Anonymous Coward · 2006-07-01 02:31 · Score: 0, Troll

Reiser4 is stable for at least 1 1/2 years now. Why not include that? Because of the changes that would go beyond of the focus of the FS layer?

First of all, they should return to the old development model and put all the broken stuff from 2.6.x into 2.7.x instead of continuing this 2.6.x.y BS.

I don't want to make these decisions myself by abandoning Linux for FreeBSD.

Why EXT4 ? by Anonymous Coward · 2006-07-01 02:36 · Score: 4, Interesting

Ext4 is an extension of ext3, much like ext3 is an extension of ext2. The plan is to ensure backwards compatibility and sanity for when things break, and with filesystems.. things break.

There are many factors that influence filesystems, not just "how fast it can write", but rather.. how it breaks when it does.

While the fanboys of XFS, JFS, ZFS may promise that their filesystems are faster, had no problems, secure and will not eat your data, it simply is not as proven as ext2 and ext3.

Scream fanboys scream, someone will listen, but the problem is that these filesystems are not proven in the field, or in some circumstances even in the kernel itself.

Re:Why EXT4 ? by Frumious+Wombat · 2006-07-01 02:51 · Score: 4, Informative

Actually, XFS (SGI), JFS (IBM), and ZFS (Sun) are very well proven in the field, on their respective native operating systems. Given the situations they're used in (financial sector, pharmaceutical research data, supercomputing), they're far more proven than EXT(anything). Now, whether the average Linux user knows how to install, tune, and use them is a different issue, but if I were worried about scalable, mission-critical, filesystems, those three would be on the top of my list. (and my personal history says that while XFS never gave me any trouble, JFS would be my first choice. Nobody ever let me have a budget large enough to buy a machine that would justify ZFS).

With IBM's know-how in the mix, EXT4 may be able to join the above three, but it would seem to be time better spent fixing XFS/JFS support in Linux first, rather than worrying about backwards compatibility with EXT2.

--
the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
Re:Why EXT4 ? by Anonymous Coward · 2006-07-01 03:01 · Score: 0

ZFS, well proven in the field? As in, being first announced stable in Solaris 10 just a month ago?

Troll.

EXT3 has been solid for around 5 years in commercial environments.
Re:Why EXT4 ? by Znork · 2006-07-01 03:20 · Score: 3, Informative

"ZFS (Sun) are very well proven in the field"

Um, I have yet to see a production installation of ZFS in an enterprise environment, and it hasn't been out as an actual release for even a year yet. You probably mean UFS. HTH.
Re:Why EXT4 ? by dnaumov · 2006-07-01 03:41 · Score: 1

"While the fanboys of XFS, JFS, ZFS may promise that their filesystems are faster, had no problems, secure and will not eat your data, it simply is not as proven as ext2 and ext3."

I am sorry, but you got this quite the wrong way around :)

XFS and JFS have been used in enterprise environments far longer than EXT2 (not to mention EXT3) has been in existence.
Re:Why EXT4 ? by Carewolf · 2006-07-01 03:53 · Score: 2, Interesting

In enterprise.. Exactly!

Note that servers with extensive mirroring and other hardware error-handling rarely need error-recovery from the filesystem. Filesystem errors happen on ordinary people's hard drives when they grow old, and ext* have a million times more experience in handling those than any enterprise FS..
Re:Why EXT4 ? by Anonymous Coward · 2006-07-01 04:24 · Score: 1, Informative

On Linux, XFS has slightly better large-file performance but worse small-file performance than EXT3. EXT3 is comparable in performance to reiser3 on small files (a few kilobytes), and is stable and reliable, unlike reiserfs. JFS is lacking quota support. EXT3 also has the option to do data journalling, not just meta-data journalling like the other journalling filesystems. Right now, unless you are larger than a few terabytes, EXT3 is the way to go. If you're larger, use XFS and accept the performance penalty and occasional (massive) xfs_repair or restore (XFS is more likely to become corrupt due to memory or block layer errors, and recovers poorly compared to EXT3).
Re:Why EXT4 ? by eviltypeguy · 2006-07-01 05:03 · Score: 1

Um, I have yet to see a production installation of ZFS in an enterprise environment...

Then you haven't been looking very hard. SUN has been using ZFS internally in their enterprise environment for a while. In addition, there are several special customers that were using ZFS in production working closely with SUN engineers. Not only that, I know of a hosting company that posted about using ZFS already for their production environment. In addition, ZFS is now officially supported and part of Solaris 10 as of Update 2, so there are definitely many production installations already. If you read the ZFS discussion forums on opensolaris.org, you will see a lot of posts from people that have already set up ZFS installations in production environments.
Re:Why EXT4 ? by demotivator · 2006-07-01 05:35 · Score: 1

JFS on AIX has certainly been around for awhile, but it was not without problems or limitations. JFS2 is really what people use on AIX these days, and is generally a good feature-rich filesystem. But between the LVM bugs and JFS2 bugs I've encountered over the last couple years AIX is not nearly as reliable as I would have hoped. One of the more problematic features with JFS2 was the dynamic resizing, as there was a lovely bug where the resize would hang, which in turn hung all I/O to that filesystem. The only option at that point was a reboot. Needless to say, we were forced to do all resizes offline for a while (even after we'd applied the patches to fix the issue, as we simply couldn't take the chance of it happening again).
Re:Why EXT4 ? by 51mon · 2006-07-01 07:16 · Score: 1

Mod parent up -- oh he had 5 points -- still give it a try.

Please complete the following;

We need another enterprise file system;

- like we need another web browser.
- like we need another Window manager.
- like we need another Bourne shell derivative.
- more than we need improved network filesystem support.
- more than we need Hans Reiser to rip out the limitations of the VFS from the kernel.
- because the other Enterprise level filesystems just don't support big enough filesystems/files.
- because the other Enterprise level filesystems are too fast.
- because backwards compatibility to ext2 is written in stone on tablets handed to McKusick on Mount Berkeley by someone with a burning beard.
- because kludgy journaling add-ons are cool and we should strive to preserve kludges as long as possible.
- other - please specify.
Re:Why EXT4 ? by fimbulvetr · 2006-07-01 08:12 · Score: 1

I got this:
xfs_force_shutdown(sdb1,0x8) called from line 1088 of file fs/xfs/xfs_trans.c. Return address = 0xf8c3043b Filesystem "sdb1": Corruption of in-memory data detected. Shutting down filesystem: sdb1 Please umount the filesystem, and rectify the problem(s)
Last week on a debian sarge box running 2.6.8 while setting up a file system on a box that built, ran and read from ext3 just fine. A format and reinstall of the 400GB array got XFS working on the second try.

This was minutes after creating the partition and in the process of making 3 million 4k files.

After doing it the second time, it's created the files and hasn't crashed yet (crosses fingers).
Re:Why EXT4 ? by Builder · 2006-07-01 08:47 · Score: 2, Informative

Actually, I think you'll find that ZFS has been out as a production release (GA or Generally available) for just under 2 weeks now. That's weeks!

There is no way in hell that ZFS is even _remotely_ proven in the field. And since we're still fighting with a bug with Sun Disksuite where you can't boot off the second disk when a disk in a mirror breaks, I'd be VERY loath to mention Sun, Filesystems and Disk management as being stable right now.
Re:Why EXT4 ? by kimvette · 2006-07-01 09:25 · Score: 2, Insightful

SUN has been using ZFS internally in their enterprise environment for a while.

Most people would not consider that to be "proven in the field"

By your logic, Windows Vista should have been released a year ago because it's long been "proven" stable via widespread deployment at Microsoft.

Internally, Sun has Sun software running mostly on Sun hardware, not the mishmash of SANs, external and internal third-party hard drives, and custom RAIDs that many enterprises will have. When it's used and stable across a variety of configurations in real-world far-away-from-Sun's-debug-environments without a(n unreasonable/unexpected) glitch, it can be considered "proven in the field."

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:Why EXT4 ? by kimvette · 2006-07-01 09:45 · Score: 1

Also, if choosing between data integrity and performance, one should choose integrity (unless doing something like heavy NLE where the data is not stored locally, just pulled down to get work done). Just as with RAID-5 vs. "RAID-0" there are advantages for performance and there are advantages of integrity with a (hopefully minor) performance penalty with various filesystems.

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:Why EXT4 ? by Anonymous Coward · 2006-07-01 10:06 · Score: 0

If you think that enterprise systems with extensive mirroring and other hardware error-handling rarely need error-recovery from the filesystem, then you've obviously never had to deal with such a system crashing!

If your system goes tits-up and panics, no amount of hardware mirroring is going to prevent you from having to fsck the filesystem when the system is brought back up again.
Re:Why EXT4 ? by qbwiz · 2006-07-01 11:07 · Score: 1

I'm not using their native operating systems, so how stable the filesystems are there doesn't really affect me. What matters to me is whether the code in Linux that uses them is stable.

--
Ewige Blumenkraft.
Re:Why EXT4 ? by Chandon+Seldon · 2006-07-01 11:10 · Score: 1

What's unstable and unreliable about reiserfs? Have you been hibernating since 1998?

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:Why EXT4 ? by oso · 2006-07-01 19:00 · Score: 1

Holy It's GA So It Must Be Proven Batman!

ZFS proven? Give me a break.
Re:Why EXT4 ? by Anonymous Coward · 2006-07-01 23:40 · Score: 0

Re: XFS and JFS have been used in enterprise environments far longer than EXT2 (not to mention EXT3) has been in existence.

I'm the original person you are quoting. I was talking specifically about Linux systems, but if we are going to take things out of context.. lets dance.

I'd be willing to bet that ext3 has more deployed hours in the field than any of the competitors. While the JFS/XFS filesystems may have been deployed before the ext2 filesystems, there would be more hours racked up in total for ext3.

I guess one could make the same argument for NTFS, although being closed source I can't say I've bothered to look into how it handles in bad conditions, or even how the recovery tools work.

Why only 48 bits? by The+Wicked+Priest · 2006-07-01 02:41 · Score: 2, Insightful

Why not go all the way to 64 bits now, and thereby avoid further changes for the foreseeable future? In one of the messages linked from the article, it's suggested that 1024 PB, obscene as it sounds, may only be good enough for another decade.

I guess we'll be on to ext5 or 6 by then, though.

--
Share and Enjoy: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

Re:Why only 48 bits? by r00t · 2006-07-01 02:59 · Score: 4, Interesting

With a block size of 32 kB (64 kB is expected to be supported soonish) the 48-bit numbers will take you 1 byte over the maximum file size that apps can support. There is no UNIX-like OS that lets an app handle files bigger than 2**63.
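The "1 byte over" claim is easy to check, assuming (as the comment does) that applications see file offsets as signed 64-bit values. The constant and function names are hypothetical, chosen only for this sketch.

```python
# Assumption: applications use signed 64-bit file offsets, so the largest
# representable offset is 2**63 - 1.
OFF_T_MAX = 2**63 - 1

def addressable_bytes(block_number_bits: int, block_bytes: int) -> int:
    """Bytes covered by block numbers of the given width at the given block size."""
    return (1 << block_number_bits) * block_bytes

with_32k_blocks = addressable_bytes(48, 32 * 1024)
with_64k_blocks = addressable_bytes(48, 64 * 1024)

print(with_32k_blocks - OFF_T_MAX)   # how far 48 bits x 32 KiB overshoots
print(with_64k_blocks == 2**64)      # 64 KiB blocks reach the full 64-bit range
```

So with 32 kB blocks, 48-bit block numbers already cover everything an application could address under that assumption.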

We'll need to adjust other things if filesystems ever get so huge. The whole design probably needs a rethink, but we can't do it now. We don't know what the future holds in terms of seek times, transfer rates, sector sizes, etc.
Re:Why only 48 bits? by kimvette · 2006-07-01 09:53 · Score: 1

Why stop at 64 bits? It may seem excessive now, but so did the 2GB limitation of the original IDE implementation (even though IDE included large-disk addressability in the original design the common implementation did not include the full feature set) when "huge" hard drives were just 80MB, and each time it has been extended we've hit the limit, with 32GB limits and 137GB limits. I realize there has to be some decision about where to draw the line but why not make it TWICE as wide as we can imagine technology can possibly be pushed, which will push any need for extension/amendment of the specification out to several decades away? Chances are we WILL hit any limit, and far sooner than anticipated, so make the limit exponentially higher. OS support may not be there to actually use all of that space, but software is a heck of a lot easier to change than embedded controllers, which are usually not so upgradable. Some flashable controllers have been extended by third-party independent hackers to support 48-bit addressing but most required software workarounds in the OS. Remember having to pass HDD geometry on to kernels in your lilo config, and the concern of the kernel's having to be within the first 1024 cylinders of space back in the day?

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:Why only 48 bits? by Anonymous Coward · 2006-07-01 17:47 · Score: 0

Yeah, you never know when you might need to fit the whole Internet into a single file.

Seriously, you could store a stereo HD video of your entire life in a single file without running out of space. 640 million billion K should be enough for anybody!

dom

Yes but... by mkw87 · 2006-07-01 02:45 · Score: 0

will it run linux? Oh darn it, wrong thread.

--
Arguing with an engineer is like wrestling a pig in mud. Soon, you realize the pig is dirty, and he likes it.

128 bits? by turgid · 2006-07-01 03:04 · Score: 2, Funny

"128 bits should be enough for anyone." - Scott G. McNealy (retired).

/me ducks.

--
Stick Men

Re:128 bits? by maxwell+demon · 2006-07-01 04:24 · Score: 2, Funny

Well, let's see what you can address with 128 bits. If we assume byte-addressing, it's enough for 2^128=3.4*10^38 bytes, or 2.7*10^39 bits.

Now let's assume we want to store every bit in a single carbon atom. Carbon has a molar mass of 12 g/mol, and 1 mol is about 6.022*10^23 atoms. So 2.7*10^39 bits would translate to 4.5*10^15 mol, or 5.4*10^16 g, which is 54 gigatonnes of carbon.
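For anyone who wants to check the numbers, here is the same back-of-the-envelope calculation as a short script. It uses the comment's own assumptions (byte addressing, one bit per carbon atom, 12 g/mol); the variable names are just for illustration.

```python
AVOGADRO = 6.022e23           # atoms per mole
CARBON_MOLAR_MASS_G = 12.0    # grams per mole
GRAMS_PER_GIGATONNE = 1e15    # 1 Gt = 1e9 tonnes = 1e15 g

addressable_bytes = 2**128            # byte addressing with 128-bit addresses
bits = addressable_bytes * 8          # one carbon atom per bit

moles = bits / AVOGADRO
grams = moles * CARBON_MOLAR_MASS_G
gigatonnes = grams / GRAMS_PER_GIGATONNE

print(f"{bits:.2e} bits, {moles:.2e} mol, {gigatonnes:.1f} Gt of carbon")
```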

I doubt hard drives will get larger than that any time soon :-)

--
The Tao of math: The numbers you can count are not the real numbers.
Re:128 bits? by turgid · 2006-07-01 07:05 · Score: 1

Calm down dude, it's OK, I used to work for Sun. I heard all about ZFS before they started writing the code.

--
Stick Men

Pattern by Eudial · 2006-07-01 03:09 · Score: 4, Funny

Ext2...Ext3...Ext4

Wait... I think I can detect a pattern. The next number has to be Ext7½!

--
GAAH! MY PRINTER IS ON FIRE!!! PUT IT OUT! PUT IT OUT!

Re:Pattern by Anonymous Coward · 2006-07-01 03:33 · Score: 0

Microsoft plans to supersede all of these with a superior version, Ext2007. A beta release is expected early 2010...
Re:Pattern by ConceptJunkie · 2006-07-01 07:17 · Score: 1

No, the next one will be ext2008.

Then extxp.

--
You are in a maze of twisty little passages, all alike.

What about performance? by Tamerz · 2006-07-01 03:12 · Score: 1

I may be blind but I can't find any info on that. Is it simply going to allow larger file systems? Or will there be performance increases as well?

But... by Ulrich+Hobelmann · 2006-07-01 03:15 · Score: 1

will it support the Hurd?

Re:But... by kimvette · 2006-07-01 09:13 · Score: 1

Didn't you mean to say the obligatory "but will it run Linux?"

However in this case we need to flip it:

"Will it run Vista, and will it come out before Duke Nukem Forever?"

Oh, and for a momentary instance of reason: is it GPL-compatible? If so then I'm sure that it will support Hurd, or the reverse, Hurd will support it.

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50

fsck quality by r00t · 2006-07-01 03:27 · Score: 5, Informative

Nobody has a fsck that can compare to e2fsck (ext2/ext3/etc.) for quality.

The e2fsck program has a huge test suite that it must pass before a release. A set of corrupted filesystems must be correctly repaired to be bit-for-bit identical to the desired result.

A typical fsck has a good chance of crashing (SIGSEGV, the "segmentation violation") when the going gets tough.

While FreeBSD's UFS developers were messing around with sync writes to avoid testing a fsck that would often crash, the ext2 developers ran full async and wrote a damn fine fsck to put things back in order. Now you can choose from three different levels of journalling, and you still get the ass-kicking fsck program.

There basically is no fsck for XFS, Reiserfs, or Reiser4. JFS doesn't have much AFAIK, and ZFS is a newborn.

What are you going to do when your fancy filesystem gets trashed? I hope you keep excellent backups, very recent and tested to be readable.

Re:fsck quality by Anonymous Coward · 2006-07-01 04:04 · Score: 1, Informative

The fsck.reiser4 utility is actually *very* good. It basically implements reiser4 in userspace with very extensive checking and it must not crash on any crazy input. It also has a vast test-suite and never fails (short of hardware error).

I have been using it a lot as I am developing some reiser4 plugins and little toys around the reiser4 code, and am very impressed with the code.
Re:fsck quality by MrHanky · 2006-07-01 04:33 · Score: 1

I've had e2fsck crash with a segfault, fairly recently, on a Mac. This was with Debian unstable, so I downloaded a static build of e2fsck from the web to try a more conservative solution, and that would also crash. It didn't cause that many problems, because I could still read the FS, backed it up somewhere else, and recreated the FS. Reiserfsck had no problem fixing the other FSes (I have had loads of FS related problems with this machine, probably because it writes gibberish to the disk when the power is cut. It's a laptop, but the battery is flat). Now that it's running OS X, though, the whole system partition becomes unreadable and unfixable quite often. HFS+ isn't very reliable, even with journaling.

Sorry, just had to get some OS X bashing in there, it's so ridiculously overestimated on this site. My only point was that e2fsck can crash.
Re:fsck quality by rs232 · 2006-07-01 05:19 · Score: 1

"There basically is no fsck for XFS .."

XFS provides journaling for file system metadata ..In the event of a system crash .. Recovery is performed automatically at file system mount time ..

--
davecb5620@gmail.com
Re:fsck quality by ComputerSlicer23 · 2006-07-01 05:52 · Score: 1

Clearly you've never hit a bug in hardware or software. As someone who has had fsck turn up errors on both reiserfs and ext3 while running with Journals, I can assure you that a really good fsck is a wonderful thing. Bugs in hardware or software can reduce the "journaling" to nothing. Journaling is really about applying database ACID technology to filesystems in the event of a crash. It's also pretty speedy.
Kirby
Re:fsck quality by rs232 · 2006-07-01 06:43 · Score: 1

xfs_check checks whether an XFS filesystem is consistent

--
davecb5620@gmail.com
Re:fsck quality by Anonymous Coward · 2006-07-01 09:18 · Score: 0

Fat lot of good that does you. With data=journal set, EXT3 checks whether or not the DATA is consistent.
Re:fsck quality by r00t · 2006-07-01 09:40 · Score: 1

Wow, it's been a damn long time since I've heard of such a thing. It would be nice if you'd have contacted tytso@mit.edu to ensure that your filesystem gets into the test suite.

The fact that HFS+ is so unreliable is a bit worrying. While lower reliability is to be expected, failures should still be rare. Perhaps your hardware has some minor (or not so minor) memory problems.
Re:fsck quality by MrHanky · 2006-07-01 10:35 · Score: 1

I think the problems are caused by the disk cache being corrupted before it's properly flushed. This is an old laptop, and it probably doesn't expect 16 MB of cache. It usually gets uptimes of several weeks, so I don't think it's the system memory.

Unfortunately, I no longer have Linux installed on it, and I rebuilt the filesystem anyway.
Re:fsck quality by TarpaKungs · 2006-07-01 11:35 · Score: 1

There basically is no fsck for XFS

Really. What are xfs_check and xfs_repair for then?

--
Why can't women be like Hedy Lamarr - beautiful, talented and inventors of frequency-hopping spread-spectrum techn
Re:fsck quality by r00t · 2006-07-01 12:00 · Score: 1

You'll find out what they are when you find yourself wanting a fsck program.

Hint: they barely do anything, if you are lucky.
Re:fsck quality by TarpaKungs · 2006-07-01 22:48 · Score: 2, Funny

Nonsense. xfs_repair runs over half a dozen checks on metadata. I've done it and watched it and it is thorough. The only problem is running it on 32 bit architectures with a 13TB filesystem, when it runs out of memory address space at around phase 7. That's curable with an Opteron.

--
Why can't women be like Hedy Lamarr - beautiful, talented and inventors of frequency-hopping spread-spectrum techn
Re:fsck quality by hansreiser · 2006-07-02 05:29 · Score: 2, Interesting

ext2fsck has a history of plenty of problems, just like everyone. I get reports from users swearing they will never again use ext*. Ted Tso goes walking around FUD'ing everyone else's fsck. He does this because ext* performance is poor, so there is not much else to do but FUD. Some users suspect that high performance is a little sinful, so this works on some.

All of the major filesystems have a decent fsck, and all of them are by now stable to the point that you should worry about your hardware and backups failing, not your FS. The only qualifier on that is that ZFS is new, and I hope no one will view that as my FUDing.
Re:fsck quality by r00t · 2006-07-02 06:20 · Score: 1

A filesystem with most data structures at fixed locations is inherently more fixable.
Re:fsck quality by Frumious+Wombat · 2006-07-02 07:19 · Score: 1

To be entirely honest, I don't remember fscking the JFS (G30 -> SP2), or XFS (PowerIndigo2 -> Origin2000, Linux) file systems anywhere nearly as much as I've had to EXT2/EXT3, and frankly, have been in much less danger of losing data on those than I have on an ext3 partition on an adaptec SCSI RAID controller about two years ago. Yes, the performance is nice, but in terms of where I've lost data, it's been on ext2/3, and not on JFS/XFS. I've run XFS on Linux as well, and not had any noticeable issues with it. Maybe this is luck, maybe the drive controller/scsi drives weren't as compatible as they should have been, maybe a terabyte filesystem was a bad idea circa 2003, maybe I shouldn't have cut in front of that Wiccan in undergrad, but I've had many fewer "Oh S*!" moments with the filesystems that came down from the big-iron OS's.

--
the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken

Yes. by r00t · 2006-07-01 03:31 · Score: 2, Informative

The new data structures take up less space. They are thus faster to write and faster to read. They also seem to make delayed allocation easier.

Re:Yes but - Limes Compu.... something by Anonymous Coward · 2006-07-01 03:37 · Score: 0

[see Seth Lloyd, "Ultimate physical limits to computation." Nature 406, 1047-1054 (2000)].

Now in which book (though admittedly sci-fi) did I read that weight before 2000, it had a similar concept named Limes compu... (my Latin fails me). There it was the other way around though, it was faster to compute "something" on a given location than to compute it "on the next node" and transfer it over, i.e. throwing more hardware at it wouldn't help and the ultimately best computer (also an AI) was the size of half a cubic metre.

and 640 K... by a_greer2005 · 2006-07-01 03:38 · Score: 1

...will be enough for anyone...right?

Every time I hear someone say "there is no way we would ever use that much data", I laugh out loud! HD cameras are coming, bandwidth is getting faster and cheaper (DSL is like $12 here in Indiana) and let's face it, people want to save EVERYTHING...whether this is good or bad is a different topic, but the fact is, if you give people the storage, they will use it...Remember when you asked yourself "How will I ever fill this 500MB HDD?" I do...

Re:and 640 K... by G+Morgan · 2006-07-01 04:43 · Score: 1

It's not so much that it's enough for anyone just that there's pretty much a natural limit on this. You'd have to find QM to be wrong to fill it to its maximum.
Re:and 640 K... by 0racle · 2006-07-01 04:54 · Score: 1

I'm reporting greer2005 to Homeland Security for his terrorist plot to boil the oceans.

--
"I use a Mac because I'm just better than you are."
Re:and 640 K... by kimvette · 2006-07-01 09:19 · Score: 1

For what it's worth, I'd suspect that if such amounts of storage were to become available, the current administration here in America could fill it up.

In addition to knowing who my friends are, whom I call, and that I'm interested in aviation and fast cars, they'll be able to track which brand of mouthwash I buy (and where I buy it), who my dentist is, how much I've spent on my teeth, which brand of toilet tissue I buy (because only terrorists buy Scotts, right?), how many times I've read 1984 over the years, which really bad sci-fi movies I buy (yes I bought Battlefield Earth, a great poorly-written b-movie experience), and that I've watched Lord of the G-Strings and Lord of the Rings back to back on OnDemand cable, not to mention cataloguing every post I've made here on slashdot.

Build the storage, and SOMEBODY will fill it up, probably the government (not necessarily just the current administration by the way) with tracking every inane detail of our lives in the name of "combating terrorism" and "increasing safety/security."

--
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Re:and 640 K... by Hast · 2006-07-01 12:54 · Score: 1

The number of particles in the Universe is estimated at around 10^89 (high estimates). This is about 2^296 (89 * log2(10) is roughly 295.7).
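As a quick check of the power-of-two conversion: log2(10^89) is 89 * log2(10), which comes out a little under 296. The 10^89 figure is the estimate quoted in the comment, not something this sketch verifies.

```python
import math

PARTICLE_ESTIMATE_EXPONENT = 89   # 10^89 particles, the estimate quoted above

# log2(10^89) = 89 * log2(10)
bits_to_count_particles = PARTICLE_ESTIMATE_EXPONENT * math.log2(10)

# How many times the address width would have to grow beyond 128 bits.
ratio_to_128 = bits_to_count_particles / 128

print(f"10^89 is about 2^{bits_to_count_particles:.1f}")
print(f"that is {ratio_to_128:.2f}x the width of a 128-bit address")
```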

That should put 2^128 in perspective when it comes to addressing. Also see the posts about how much energy it would take to store this amount of data.

Really there comes a time when "... enough for anyone" really is enough for anyone. Unless you're building Deep Thought or similar computers.
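
For anyone who does want the number, here is a quick sketch of the conversion, taking the 10^89 estimate above at face value. Since 10^n equals 2^(n * log2(10)), the exponent comes out nearer 296 than 270; the point about 2^128 stands either way.

```python
import math

# Estimated particle count quoted above (a high estimate, taken as given).
PARTICLES_EXP10 = 89

# 10**n == 2**(n * log2(10)), so the base-2 exponent is:
exp2 = PARTICLES_EXP10 * math.log2(10)
print(f"10^{PARTICLES_EXP10} is about 2^{exp2:.1f}")

# How much larger is that than a 128-bit address space?
ratio_exp2 = exp2 - 128
print(f"that is about 2^{ratio_exp2:.1f} times larger than 2^128")
```

So a 128-bit address space is still far smaller than the particle count, but it is already well past anything a storage system could physically back.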
Re:and 640 K... by heinousjay · 2006-07-01 13:32 · Score: 1

For what it's worth, I'd suspect that if such amounts of storage were to become available, the current administration here in America could fill it up.

They wouldn't fill it up if they individually tracked every cell in every human's body.

Actually, I could be wrong about that, I didn't do the math. It just sounded like a pithy retort in my mind.

--
Slashdot - where whining about luck is the new way to make the world you want.
Re:and 640 K... by heinousjay · 2006-07-01 13:44 · Score: 1

So then I went and did the math. I rounded my initial numbers up and my answer down, because I'm a conservative. I made the assumption of 200 trillion cells in the human body, and 7 billion people on earth. That would allow for 220 terabits of information to be stored for each cell.

If the current administration did that, I'd applaud the technical achievement.

And now I've taken this entirely too far. Just remember, it does one good to keep in mind that really big numbers don't fit into human consciousness too well.
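
The arithmetic can be redone in a few lines. This is only a sketch using the same assumed inputs (200 trillion cells per body, 7 billion people) and treating 2^128 as the number of addressable units; whether a unit is a bit or a byte depends on the filesystem and is left open here.

```python
CELLS_PER_BODY = 200 * 10**12   # assumed, rounded up as in the post
PEOPLE = 7 * 10**9              # assumed, rounded up as in the post
ADDRESSABLE = 2**128            # units addressable with 128 bits

total_cells = CELLS_PER_BODY * PEOPLE
per_cell = ADDRESSABLE // total_cells

print(f"total cells: {total_cells:.2e}")
print(f"units per cell: {per_cell:.2e}")
```

With these inputs the quotient lands in the low hundreds of trillions of units per cell, in line with the rounded-down figure above.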

--
Slashdot - where whining about luck is the new way to make the world you want.
Re:and 640 K... by Nutria · 2006-07-01 15:02 · Score: 1

the current administration here in America could fill it up.

Blah blah blah I hate Bush blah blah Bush is Evil blah blah blah I have no independent thoughts blah blah blah.

Build the storage, and SOMEBODY will fill it up, probably the government (not necessarily just the current administration by the way) with tracking every inane detail of our lives

Man, where have you been the past 10 years? Credit card companies, banks, insurance companies, big retailers (Wal-Mart, Blockbuster, NetFlix, Amazon, etc, etc, etc), toll road authorities, airlines, they all track everything you do.

The government doesn't have to track you, they just ask the private sector to turn over their (the private sector's) detailed data.

--
"I don't know, therefore Aliens" Wafflebox1

Well, how does a Honda Civic ... by tetromino · 2006-07-01 03:40 · Score: 2, Insightful

compare to a Liebherr T282? These are two projects with vastly different goals. Ext4 is basically Ext3 with better performance and a much larger maximum capacity; it's still a typical traditional Unix filesystem, a safe default choice for desktops and small servers. ZFS is an exotic beast with a totally ridiculous maximum capacity and tons of advanced features that do not exist in any other Unix filesystem, but are only useful for Big Iron.

Re:Well, how does a Honda Civic ... by Anonymous Coward · 2006-07-01 04:59 · Score: 0

Since Ext3 is just Ext2 with journaling, we can apply transitivity:

Ext4 is just Ext2 with journaling and support for more/bigger files. Ext2 is a pretty run-of-the-mill filesystem. Since most Linux distributions run filesystem checks every 20 boots anyway, Ext2 and Ext4 are entirely interchangeable for desktop users.
Re:Well, how does a Honda Civic ... by DavidS · 2006-07-01 05:13 · Score: 3, Informative

This is simply not true. ZFS is not just for big iron. Its strongest feature is perhaps the melding of the volume manager and RAID into one single unit, which greatly simplifies administration. Not to mention other nice features, either new or greatly simplified from their past versions, such as pooling, dynamic striping, CoW, instant snapshots and cloning, fault tolerance, etc.

I'd suggest reading through these links before spreading more misinformation:

http://unixconsult.org/zfs_vs_lvm.html - ZFS vs. Linux Raid vs. Linux LVM vs. Linux LVM + Raid

http://uadmin.blogspot.com/2006/05/why-zfs-for-home.html - Why ZFS for home

dks
Re:Well, how does a Honda Civic ... by tetromino · 2006-07-01 05:29 · Score: 1

ZFS is not just for big iron. Its strongest feature is perhaps the melding of the volume manager and RAID into one single unit, which greatly simplifies administration. Not to mention other nice features, either new or greatly simplified from their past versions, such as pooling, dynamic striping, CoW, instant snapshots and cloning, fault tolerance, etc.

Yes, ZFS has awesome volume-management features. If you have a big fileserver with a dozen drives, ZFS is a godsend. However, considering that most laptops, desktops, and small servers have only 1-2 hard drives, ZFS is total overkill for the average user. Which was precisely my point.
Re:Well, how does a Honda Civic ... by DavidS · 2006-07-01 06:04 · Score: 4, Informative

This is true, but let's look at the case of 1-2 drives:

Assuming we still want mirroring or volume management on our two drives:
The overhead is still greater for SVM or for Linux md and Sistina LVM. Both require more administration knowledge, time, and commands to accomplish the same tasks that ZFS can do in a couple of commands. (Yes, I'm aware that mdadm helps the process a *bit*, but it's still obtuse.) Anyone who has set up either knows how annoying anything is with either choice (having to micromanage partitions, etc.).

The biggest thing for ZFS in a ``small'' 1-2 drive usage case is, in my opinion, the pooling: ZFS doesn't require one to set volume sizes in advance. Since everything pulls out of a common pool, the size of volumes can grow or shrink accordingly. (Affected by free pool space or volume quotas.) So, that means that one can just create their volumes, and not have to worry about making them the wrong size.

I'd also argue that fault tolerance is important anywhere, large or small.

Another thing is on-disk, low-overhead compression that can be enabled just by toggling one filesystem parameter, live. For a lot of things that people store, this compression would save a lot of space.

They really put a lot of thought into ZFS. It scales amazingly well, from small to large. I'm not really doing it justice explaining it here, so I'd encourage you to look at the documentation with an open mind before just writing it off as an ``enterprise only'' thing.

dks
(I have no affiliation with Sun in any way.)
Re:Well, how does a Honda Civic ... by Anonymous Coward · 2006-07-01 09:31 · Score: 1, Interesting

"ZFS is an exotic beast with a totally ridiculous maximum capacity and tons of advanced of features that do not exist in any other Unix filesystem, but are only useful for Big Iron."

Actually, except for its highly advanced algorithms, ZFS code is very small and simple, and on top of that, ZFS is really nice in small desktop deployments, where its "big iron" features give it the ability to detect and automatically correct garbage being delivered by that cheap SATA drive.

In fact, having been ported (compiles, doesn't yet run) to Linux and in the process of being ported to OS X and FreeBSD, ZFS is on a pretty good track to becoming ubiquitous... which would be the exact opposite of exotic.
Re:Well, how does a Honda Civic ... by Anonymous Coward · 2006-07-01 10:23 · Score: 0

You forgot to mention that at present, ZFS cannot be used for the root filesystem which for 1-2 disk systems definitely limits its usefulness.
Re:Well, how does a Honda Civic ... by DavidS · 2006-07-01 15:31 · Score: 1

That is correct. Support for that is in beta now.

Meanwhile, instead of giving the entire disk to the pool, you can just give a portion of it.

Mirror the bootable root and system with SVM, and then assign all of the rest to the ZFS pool and go from there.

dks

Linux and other Unix FSes by digitalhermit · 2006-07-01 04:27 · Score: 3, Insightful

I'm as big a Linux fan as anyone, but one glaring thing that it needs is some better filesystem tools. Don't get me wrong -- they've come a long way in the last couple years -- but compared to something like AIX it still has a little ways to go. Here's one feature that causes a challenge: Linux filesystems and the underlying logical volume layer are largely decoupled. You have an immense amount of flexibility but as a consequence, the filesystem and volume layers don't always communicate as well. For example, the AIX JFS2 tools allow you to dynamically grow/shrink filesystems. This functionality exists in Linux for some filesystems (EXT3, ReiserFS) but the procedure varies depending on how the filesystem is constructed. And at this point, I'm not fully convinced of its stability as I've recently (three months ago) lost an entire disk after a dynamic resize on an LVM-backed EXT3 partition. I have yet to reproduce the failure but it occurred with a 95% full /home and a kernel compile going full tilt.

But I'm amazed at how quickly these features are being integrated. There's functionality in Linux that allows me to easily create file-backed volumes, remote volumes, SAN LUNs, etc.. The "resize in a single command" is not fully there yet, but within 6 months I'd expect it to be.

Re:Linux and other Unix FSes by Homology · 2006-07-01 05:55 · Score: 3, Insightful

>I'm as big a Linux fan as anyone, but one glaring thing that it needs is some better filesystem tools.

I'm pretty certain that Linux would have better filesystem tools if the developers could resist adding a new filesystem every few months.
Re:Linux and other Unix FSes by bzipitidoo · 2006-07-02 18:10 · Score: 1

Speaking of glaring absences, will there be an Undelete tool? Not every environment is a GUI with a trash can.

--
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Re:Linux and other Unix FSes by geminidomino · 2006-07-03 21:08 · Score: 1

If you need an undelete, you shouldn't be deleting files!

I kid. This is coming from a guy who learned that cleaning filesystems on 3 hours sleep is a bad idea. And why one should use sudo...

rm -r /tmp/usr and rm -r /tmp /usr look so similar....
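
The difference is one space, but the shell sees a different argument list. A small sketch using Python's shlex to show how each line is split (nothing here is executed, and the paths are just the ones from above):

```python
import shlex

safe = shlex.split("rm -r /tmp/usr")
oops = shlex.split("rm -r /tmp /usr")

print(safe)  # one path argument
print(oops)  # two path arguments

# Paths each command would act on (everything after the options).
targets_safe = [a for a in safe[1:] if not a.startswith("-")]
targets_oops = [a for a in oops[1:] if not a.startswith("-")]
print(targets_safe, targets_oops)
```

The second form hands rm two separate targets, /tmp and /usr, which is exactly the accident described.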

My take on current filesystems by Anonymous Coward · 2006-07-01 05:04 · Score: 0

Just my bit on current filesystems

I have a laptop which currently has a corrupted filesystem. It's only a "little" corrupted in that it's just some metadata. Basically there's a file that when I delete it, it still thinks it's there, if I try to do anything to it it says "no such file or directory".
This system is running reiserfs. Now to be fair I bash my filesystems VERY hard. My laptop crashes a minimum of once a week. I use Gentoo so I have a lot of small files that get updated daily (which is why I was using reiser). reiserfsck is unable to repair this problem. Apparently the only way is to do a full tree rebuild, which is a fairly scary proposition. I've never had a problem with JFS. I have had problems with ext3, but only if I never fsck'd, which is not its intended usage pattern.

Now, in tests JFS performs as well or better except with very large directories, and usually with less CPU load than reiser. I would use JFS on any system where performance mattered, but if you want stability it's all about ext3. Sometimes I don't care if it takes an extra second to read a file, as long as it frigging works.

I have friends who run XFS, and its crash performance is abysmal. XFS was designed for SGI servers, which have a different failure pattern than PCs. Primarily, XFS deals very poorly with power deaths. As a result XFS ZEROES things that it gets confused about. It doesn't even just ignore them, it intentionally zeroes them. That is not the right thing to do pretty much ever. On top of that the locking uses a wrapper layer around the Linux locking primitives because it wasn't so much ported as wrapped in a layer of code that pretends to be IRIX. This mapping from one set of locking primitives to another is not perfect. As any real systems hacker knows, the EXACT implementation of locking DOES matter, and can mean the difference between deadlocks and not. E.g. POSIX mutexes != basic yield spinlock; one will deadlock in cases where the other won't (even on a uniprocessor system). Basically the people I know who run XFS do a filesystem rebuild every couple months.

I have not yet toyed with reiser4, though another friend who runs that also only has a filesystem crash once every couple of months (Yeah... ONLY). I consider it to be about as stable as XFS. In short, I think ext4 is kind of silly; good tree-based stuff is the right way to go, not more extensions of ancient concepts. But practically speaking no one else is up to the job of a stable filesystem just yet, so for now we NEED ext4.

Re:My take on current filesystems by PenGun · 2006-07-01 05:16 · Score: 0

I've run xfs for years on my media storage partitions and I've never had a problem. Crashes, power failures, etc... nothing. It is a journaling filesystem and works so well I'm thinking of running my whole system on it next install.

PenGun
Do What Now ??? ... Standards and Practices !
Re:My take on current filesystems by waferhead · 2006-07-01 07:47 · Score: 3, Insightful

"I consider it to be about as stable as XFS."

I have had my /video and /home partitions on XFS for... WAY too long, several years, same drives.
(I just keep adding on)

I lose power a lot where I live (glitches) and XFS has been utterly bullet proof.

(This filesystem has been through 3 motherboards, several Linux distros (1 mb dead/2 upgrades), 2 cases, and so on)

If Reiser4 is about as stable as XFS, I'll gladly switch everything over tomorrow on my MythTV box.
Re:My take on current filesystems by Anonymous Coward · 2006-07-01 09:27 · Score: 0

I can do you a few better. I've had my entire linux lab (4 machines at home) running entirely on XFS (sans /boot) since it has been around and I haven't had a single problem.

I also worked on a settop box, we chose XFS because it seemed to be the most reliable. I think there are around 50,000 units in the field using XFS. To my knowledge there have been no FS based issues, a fair number of harddisk hardware issues but the filesystem has been a rock.

Of the modern Linux filesystems, reiserfs (3) has been the only one on which I have seen multiple data loss failures: 2 on stock Suse installs and then many during power cycle testing, which made us pick XFS for the settop box mentioned.
Re:My take on current filesystems by Chatz · 2006-07-01 12:06 · Score: 1

For a better understanding of what you are referring to as XFS zeroing:
http://oss.sgi.com/projects/xfs/faq.html#nulls

To see how XFS can now be configured to reduce corruption due to power failure, see:
http://oss.sgi.com/projects/xfs/faq.html#wcache

--
There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
Re:My take on current filesystems by Tracy+Reed · 2006-07-01 12:46 · Score: 1

I think a lot of people misunderstand ReiserFS and filesystems in general. ReiserFS (3 and 4) acknowledges the fact that cpu is very fast and disk IO is slow. If you can do anything at all in cpu as far as calculations or optimizations to avoid having to make disk accesses it is a win. This is why ReiserFS takes more cpu. Overall it should be faster. It also assumes that your hardware is reliable. If your hardware is bogus you are going to have problems with any fs but particularly ReiserFS. The on disk and in-memory data structures are much more complicated than ext2/3/4. All designed to provide better performance. If you have a memory problem or disk controller problem or really any hardware problem at all you are in deep shit. Want good performance and data integrity? Use quality hardware and implement redundancy!

Journalled filesystems like ReiserFS easily handle power-out problems, accidental reboots, etc. These are not data corruption issues. But once some bogus piece of hardware starts causing random bits to be scribbled to the disk all bets are off. I don't even see the lack of an fsck program as a problem. If you ever get to the point where you need to do an fsck you really should just restore from backup. When I hear these stories about how people lost all of their data because their filesystem "crashed" I have two reactions: 1. Skepticism that they didn't have bogus hardware or didn't somehow screw themselves up. It is extremely rare that anyone can actually prove it was a bug in the fs that burned them. 2. Total lack of sympathy because they didn't have a backup.

Here's what I do:

I value my data so I spent an extra $100 to get another 250G disk and I mirror. $100 is DIRT CHEAP insurance against hard drive related failures. Disks are so cheap and big there is no excuse for not mirroring important data. Plus you get a bonus on read performance. If I offered you $100 to let me delete 250G of data from your machine right now would you let me? Then your data is worth more than $100 also and worthy of a mirrored disk. But a mirrored disk is not a backup. You need backups too.

I have Bacula set up to run every night. It makes a backup of my data to an external USB2 attached 80G drive. I don't back up all of my data as there is some stuff I really don't care about. But all of my email, source code, and vacation photos etc get backed up every night. I probably have 30G of data I really give a care about. I have two of these drives. I do a full backup once a month and incrementals every night after. At the end of the month I take the drive over to my storage unit (or a friend's house would do, or even my desk at work) and swap it with a second drive which I have stashed there.

I think I paid around $80 for each of the external drives plus $100 for the extra disk for the mirror. So I have a really great, fast, reliable backup solution for $260 plus some time to set it up. Is it worth it? HELL YES! While writing this I just thought to do a test restore of some data. It worked. Yeay! My backup is solid and there if I need it.

If any one of you offered me...say, $1000 to come over to my house in San Diego right now to boot your own super-destructo CD which did a military grade erase of my HDs I would let you. RIGHT NOW. I have the data backed up. I figure my time to do the restore is worth $1k to me. And I'll have everything back up in 24 hours or less. If you can't do the same right now your data better not be important to you because that's how disasters happen: Completely unannounced.

Remember kids: If it wasn't backed up it wasn't important!

What they really should do is by Anonymous Coward · 2006-07-01 05:10 · Score: 1, Interesting

In ext4 they should get rid of some legacy stuff to foster development and usage of new technologies. The users of legacy technologies could still use ext3 and it would be very nice for ext4 users. I'm talking mostly about dropping support for the old style octal file access permissions system and bolting the ACL system as the default and enabling the metadata features by default.

The fact that nothing ever pressures the distribution builders into using anything new has led to majorly slowed-down development of Linux.

How Boring by Anonymous Coward · 2006-07-01 05:59 · Score: 0

Darn. I read EXT4 may be coming soon and got my hopes up. I have always disliked EXT3 because it was essentially just EXT2 with journalizing tacked on with no performance advantages whatsoever. I've had much better results both performance-wise and even stability-wise with ReiserFS versus EXT3 (both tested over several hardware crashes and the ReiserFS filesystem remained undamaged while the EXT3 became badly enough damaged to prevent the operating system from booting eventually. Perhaps it would be more accurate to say that I did not so much test as that I used EXT3 when I installed, crashed a few times causing problems each time, was unable to even boot at all the last time, then reinstalled with ReiserFS instead and despite a few crashes since before I solved the hardware problem it remained undamaged.) The fact is, it has been said many times in the past that EXT3 was basically just a quick-fix for the problem of lack of journalization. Unfortunately, by the sound of things, in the same way that EXT3 is essentially EXT2 + journalizing, it would appear that EXT4 will just be EXT3 + insanely huge filesystem support, which is a great quick-fix for those with uber RAID arrays filled with 500GB hard drives, though those of us who can't even afford one 500GB hard drive will find that to be no more helpful than EXT3 was since ReiserFS supports a filesystem of up to 16 terabytes (which means you'll need 32 500GB hard drives -- well, in a few more years you'll actually be able to have 16 terabytes without a gigantic RAID array, but, for the moment you're still very unlikely to hit 16 terabytes in any kind of rush. Oh, and I think the size limit has to do with the paging inherent in the CPU, which may mean a CPU supporting larger paging than the 4K they say was standard among Intel CPUs at that time should therefore support a larger filesystem.)
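
The 16-terabyte figure can be reproduced with simple arithmetic. The sketch below assumes 32-bit block numbers and 4K blocks (an assumption that fits the 4K remark above, not something taken from the ReiserFS documentation), and counts drives in decimal gigabytes the way vendors do.

```python
BLOCK_SIZE = 4 * 1024        # assumed 4K block, in bytes
BLOCK_NUMBER_BITS = 32       # assumed width of a block number

max_bytes = (2**BLOCK_NUMBER_BITS) * BLOCK_SIZE
print(max_bytes == 2**44)            # True
print(max_bytes / 2**40, "TiB")      # 16.0 TiB

DRIVE_BYTES = 500 * 10**9            # a "500GB" drive, decimal
drives_decimal = 16 * 10**12 // DRIVE_BYTES     # 16 TB read as decimal
drives_binary = -(-max_bytes // DRIVE_BYTES)    # ceiling division for 16 TiB
print(drives_decimal, drives_binary)
```

Read as decimal terabytes the limit is 32 drives, as stated; read as 2^44 bytes it takes a few more, since each "500GB" drive holds less than 500 binary gigabytes.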

Oh well. I can always hope that EXT5 will be what I really want to see -- a complete rework of the filesystem implementing all the advantages seen in filesystems like ReiserFS with the support that the well known EXT standards enjoy. I'm sure EXT5 will just be a quick fix for tiered storage or something. In the meantime, some other filesystem will end up doing it better for those who are patient.

What about directories? by Roadkills-R-Us · 2006-07-01 06:22 · Score: 1

Everyone sweats out the file and FS size limits, but it's amazing to me that Linux's most popular filesystem still limits you to under 32K directories at one level in a directory. Does ext4 address this? Why not?

I realize this is irrelevant for most people, but for some of us it's crucial.

Re:What about directories? by Nutria · 2006-07-01 15:18 · Score: 1

Linux's most popular filesystem still limits you to under 32K directories at one level in a directory.

Something's wrong with your data layout if you need to put 32,767 directories at a single level.
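
One common way to stay under a per-directory limit is to hash names into a couple of fan-out levels. A minimal sketch, assuming a two-level layout keyed on a SHA-256 digest; the function name `sharded_path`, the root path, and the layout itself are illustrative, not anything ext3 or ext4 provides:

```python
import hashlib
from pathlib import PurePosixPath

def sharded_path(root: str, name: str, levels: int = 2) -> PurePosixPath:
    """Place `name` under `levels` hashed subdirectories of `root`.

    Each level uses two hex digits, so no directory created by the
    sharding itself holds more than 256 subdirectories.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    parts = [digest[2 * i : 2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *parts, name)

p = sharded_path("/srv/data", "customer-123456")
print(p)
```

With two levels there are 65,536 leaf buckets, so even a few million entries leave each bucket far below the limit, at the cost of a less browsable tree.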

--
"I don't know, therefore Aliens" Wafflebox1

design by m874t232 · 2006-07-01 09:19 · Score: 1

I'm not so sure that that's a reasonable analogy.

ext2 and ext3 are very high performance file systems that have no trouble moving large amounts of data. ext4 appears to be a market-driven extension of ext3, in which users, in effect, pay for the minimum number of changes necessary to get the job done.

ZFS, on the other hand, is a typical Sun design, in which their kernel engineers throw in every feature they can think of and Sun is marketing the hell out of it. But a lot of features also means a lot of features that can be misconfigured, that can have bugs, and that can cause unexpected performance bottlenecks.

Even if the ZFS feature set is the right one, it's far from clear that putting them into the file system layer is the right place to put them.

So, at this point, ZFS may end up being more Edsel than Liebherr T282.

that is not fsck by r00t · 2006-07-01 09:35 · Score: 1

Suppose you have a little accident. You whack the hard drive as it is writing, or a cosmic ray hits the controller chip. A few weeks later, you discover that your filesystem is an inconsistent mess. What will you do?

Re:that is not fsck by spyowl · 2006-07-01 18:26 · Score: 1

When I used ReiserFS, rebuilding the tree allowed me to get to my data every time (bad IBM deathstar drive). When something similar happened on ext3, fsck kept segfaulting. So, the arbitrary description of "good" fsck was no help in this case and 3 other unrelated cases when I used ext2 and ext3. In my experience ReiserFS has been a lot more stable filesystem, even in the case of hardware problems.
Re:that is not fsck by Anonymous Coward · 2006-07-02 14:51 · Score: 0

Reply here, whining for your yelp!

If you mount it as ext2... by Ayanami+Rei · 2006-07-01 10:59 · Score: 1

...then you don't need the journal. The journal is only of any concern when you don't cleanly unmount. That's it.
ext2 won't mount unless the filesystem is marked clean, so you would have already suffered a fsck scan anyway, as opposed to a fast journal resync if it was ext3.
BTW, ext3 just "starts from the beginning" at each mount. There's nothing to keep in sync.

Yeah, ext3 is great. I've recovered from _very bad_ situations involving hardware that might not have been possible with any other FS.

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

Re:If you mount it as ext2... by spyowl · 2006-07-01 17:17 · Score: 1

Yeah, ext3 is great. I've recovered from _very bad_ situations involving hardware that might not have been possible with any other FS.

Funny, I have the same experience with ReiserFS. I had a powerful desktop system that used a "Deathstar" as a main hard drive using all ReiserFS partitions. Before I knew anything about those drives I would have my PC lock up every few days, usually with the HD light on. I would need to do a hard reboot which would invoke reiserfsck on boot and on occasion I'd have to manually run rebuild tree to repair my partitions. This went on for several months until I replaced the HD with a different brand that got rid of that problem altogether. I never lost a single file during this time.

On the other hand, I've given ext3 a chance 3 times now. Each time I ended up with a partition that could not be repaired by fsck; the last time I gave it a try, the FS worked fine for a few months until the first time it didn't unmount cleanly - fsck would simply segfault at a particular spot. Yeah, maybe a hardware problem but I couldn't easily get to my data again.

At least in my experience, not only is ReiserFS faster than ext* filesystems for most operations, it is also a lot more stable and reliable in protecting my data - even with bad hardware. I have no reason whatsoever to try ext* filesystem or recommend it to anyone.
Re:If you mount it as ext2... by geminidomino · 2006-07-03 20:45 · Score: 1

Slightly OT:

fsck would simply segfault at a particular spot.

I recently ran into this problem. There is apparently a bug in fsck where certain combinations of flags on an inode will segfault it.

Downloading the latest e2fsprogs got me around it (though the data was still hosed... That's what I get for being too lazy to reboot back into Linux to make a new partition after playing Oblivion)

OT: Metadata on the file system? by KWTm · 2006-07-01 11:09 · Score: 1

Your comment on filesystem tools led me to think about one particular tool I'd love to have: I would like to know whether metadata --more specifically, my user comments on the file-- would be a component of the proposed ext4.

As an example of when I would like to annotate files: sometimes I download a file --let's say it's a program for my Palm, called "VP2.pdb". Now, that filename could mean just about anything; let's say it was some image viewer named "ViewPicture II", so I would like to rename it "ViewPicture2.pdb".

On the other hand, if someone has some web page pointing to "a cool Palm program that lets you see images", with a link pointing to "VP2.pdb", I want to realize that this is a file that I've already downloaded before. It's not that easy if, say, it was among a bunch of programs you had compared last year, but then put on the back burner until now. I might very well download "VP2.pdb", not realizing that it was the same as "ViewPicture2.pdb".

You can think of other circumstances where you might not want to change the name of a file and yet have some way to store some comments on it.

You could try commenting the file itself, which would be easy if it were a text file, but hard if it were some delicate binary format. You could try writing up a "notes" file in that directory, but what if you copy the file itself but not the accompanying "notes" file?

Right now I compromise by appending to the filename: "mv VP.pdb VP-ImageViewer_fromJoeBlowsWebsite.pdb". When I try to download another copy, Firefox won't ask if I want to overwrite, but if I type in Save As: "VP..." it will try to guess: Do you want to type "VP-ImageViewer_fromJoeBlowsWebsite.pdb"? At which point I will realize that I've downloaded it before.

But it would be great to have some sort of all-purpose metadata field, preferably variable length, to tag onto the files. It would be like the EXIF content in digital camera JPEGs that store the date, exposure, etc. without disturbing the image itself.

Is such a system available on any of the current file systems, such as ReiserFS (which I use now) or ext3? If it were, for example, on XFS or JFS, I might be tempted to switch over. Perhaps somewhere someone has written such an addition to the filesystem? I'm thinking EncFS: if someone can make an OTF encryption system for individual files, someone ought to be able to make some annotation filesystem.

Anyway, if the Ext4 standard hasn't been solidified yet, I would love to have this added in.
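
For what it's worth, Linux extended attributes are the usual mechanism for this kind of free-form annotation. The sketch below uses Python's os.setxattr and os.getxattr, which exist on Linux only; the attribute name user.comment and the helper names are made up for illustration, and whether the call succeeds depends on the filesystem and its mount options, so the code treats failure as "not supported here".

```python
import os
import tempfile

ATTR = "user.comment"  # hypothetical attribute name in the user namespace

def annotate(path: str, text: str) -> bool:
    """Try to attach a comment to a file; return False if unsupported."""
    if not hasattr(os, "setxattr"):
        return False  # platform without os.setxattr (not Linux)
    try:
        os.setxattr(path, ATTR, text.encode("utf-8"))
        return True
    except OSError:
        return False  # filesystem or mount options refuse user xattrs

def read_annotation(path: str):
    """Return the stored comment, or None if there is none."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, ATTR).decode("utf-8")
    except OSError:
        return None

with tempfile.NamedTemporaryFile(suffix=".pdb") as f:
    stored = annotate(f.name, "ViewPicture II, image viewer for Palm")
    comment = read_annotation(f.name)
    print(stored, comment)
```

The comment travels with the file only as long as the tools that copy or archive it preserve extended attributes, which many do not by default, so this is closer to the EXIF idea than a full solution.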

--
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]

Re:OT: Metadata on the file system? by The+Cisco+Kid · 2006-07-02 02:04 · Score: 1

A far simpler solution to the specific problem you mention:

[root@flathat example]# ls -l
total 4
-rw-r--r-- 1 root root 2692 Jul 2 10:03 VP2.pdb
[root@flathat example]# ln -s VP2.pdb ViewPicture2.pdb
[root@flathat example]# ls -l
total 4
lrwxrwxrwx 1 root root 7 Jul 2 10:03 ViewPicture2.pdb -> VP2.pdb
-rw-r--r-- 1 root root 2692 Jul 2 10:03 VP2.pdb
Re:OT: Metadata on the file system? by asretfroodle · 2006-07-02 17:14 · Score: 1

It's a good simple solution for the file renaming problem, but doesn't cover his desire to attach arbitrary annotations.

Something like the summary tab on the Windows file properties dialog seems close to what he's after.

Re:hi by Anonymous Coward · 2006-07-01 14:52 · Score: 0

CONGRATULATIONS!!!

But none of your links actually lead to gay porn. I'm disappointed. :(

How Lustre's ClusterFS works by Nutria · 2006-07-01 15:11 · Score: 1

You run it across the aggregate of file stores making up the cluster filesystem

Does that mean that the "filesystem" is broken into chunks and spread across all the nodes in the cluster?

--
"I don't know, therefore Aliens" Wafflebox1

Your sig. by Ulric · 2006-07-01 19:17 · Score: 1

Microsoft didn't invent Visual Basic, they bought it.

Repeat after me... R E I S E R by CypherOz · 2006-07-01 19:47 · Score: 1

Reiser Rulz. Say no more.

--
You want a signature? You can't handle a signature!!

Re:Repeat after me... R E I S E R by Anonymous Coward · 2006-07-02 15:29 · Score: 0

no more

A real O/S filesystem needs defrag! by ArtStone · 2006-07-02 00:41 · Score: 2, Interesting

The main described change / advantage in this proposed ext4 is that a file's allocation is tracked via "extents" (a specified number of contiguous 2k blocks) rather than a chain of inode pointers (with up to 3 levels of indirection).
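
To make the difference concrete, here is a toy sketch (not the ext4 on-disk format): the same file described as a flat list of block numbers and as a list of (start, length) extents. The function name `to_extents` and the block numbers are made up for illustration.

```python
def to_extents(blocks):
    """Collapse a list of block numbers into (start, length) runs."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)   # extend the current run
        else:
            extents.append((b, 1))              # start a new run
    return extents

contiguous = list(range(1000, 1256))            # 256 blocks in one run
fragmented = [10, 11, 12, 50, 51, 90, 200, 201]

print(to_extents(contiguous))   # [(1000, 256)]
print(to_extents(fragmented))   # [(10, 3), (50, 2), (90, 1), (200, 2)]
```

A contiguous file needs one entry however long it is, while a fragmented one needs an entry per run, so the bookkeeping saving shrinks as files get scattered.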

This is based not only on the need for a larger maximum file system, but a recognition that there is significant performance advantage to reducing read/write head movement and initiating large reads from consecutive blocks that can take advantage of the high transfer rates of today's drives. (this assumes that the OS filesystem doesn't attempt/require that the entire disk drive be cached in RAM to get decent performance)

Except for "write once" files, over time this will cause files to become physically spread over the disk and the performance benefit is reduced, unless a process periodically consolidates the blocks back into a contiguous series of blocks (ignoring for the moment that on today's disk drives, blocks may be "spared" into place that are not really physically consecutive, but just logically appear to be)...

One of the "proofs" that *nix is superior to other O/Ss has been the absence of a need to "Defrag" the file system.

A commenter on the article also raises the question of why the "right" solution isn't to increase the 2k block size limit rather than rework the internals of the block pointers, and got the response that since the Linux kernel manages memory in 2k blocks, it is a nightmare in the kernel to support larger I/O transfers (although others here seem to indicate this is one of the solutions people have implemented)

Isn't "extents" a concept contained in NTFS? Has anyone looked into the patent implications of these proposed changes?

--
Final 2006 "Proof of Global Warming" US Hurricane Count -> 0

Re:A real O/S filesystem needs defrag! by Anonymous Coward · 2006-07-02 21:49 · Score: 0

Except for "write once" files, over time this will cause files to become physically spread over the disk and the performance benefit is reduced

Unless a number of blocks are reserved for future growth of the file, so that sets of blocks are multiples of the read-ahead. In this way, each read-ahead will be read without intermittent seeking. I think this is the reasoning used in ext2.
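A toy simulation of that idea, for what it's worth. This is not how ext2 or ext3 actually allocate blocks; it is just two files appending alternately from one shared "next free block" counter, with a made-up `reserve` parameter for how many blocks each file grabs at a time.

```python
def interleaved_appends(n_rounds, reserve=1):
    """Two files each append one block per round. Whenever a file runs
    out of reserved blocks it reserves `reserve` more from a shared
    bump allocator. Returns the physical block list of each file."""
    next_free = 0
    layout = {"a": [], "b": []}
    reserved = {"a": [], "b": []}
    for _ in range(n_rounds):
        for name in ("a", "b"):
            if not reserved[name]:
                reserved[name] = list(range(next_free, next_free + reserve))
                next_free += reserve
            layout[name].append(reserved[name].pop(0))
    return layout


def count_fragments(blocks):
    """Number of contiguous runs in a list of physical block numbers."""
    if not blocks:
        return 0
    return 1 + sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)


if __name__ == "__main__":
    for reserve in (1, 4, 8):
        layout = interleaved_appends(8, reserve=reserve)
        print(reserve, layout["a"], count_fragments(layout["a"]))
```

With no reservation the two files interleave block by block; with a reservation as large as the read-ahead, each read-ahead window stays contiguous. The cost is that reserved blocks sit idle if the file never grows into them.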

can i upgrade from ext3 to ext4? by t35t0r · 2006-07-02 03:47 · Score: 1

will i be able to upgrade from ext3 to ext4?

Re:can i upgrade from ext3 to ext4? by Slashcrap · 2006-07-02 20:58 · Score: 1

will i be able to upgrade from ext3 to ext4?

No, absolutely not. In fact you won't even be able to move your old files across. You will need to recreate all your documents from scratch on your new ext4 fs. Have fun.

Add Access Control Lists!! by Myria · 2006-07-02 05:47 · Score: 1

If they're going to make an ext4, why not add access control lists and extended attributes, which have been sorely needed for some time?

melissa

--
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager

Re:Add Access Control Lists!! by Anonymous Coward · 2006-07-02 11:57 · Score: 0

ext2 and ext3 already have ACLs (it's a mount time option) as well as extended attributes (not sure if it's a mount time option or what, never tried using them).

Re:Add Access Control Lists!! by demon · 2006-07-06 03:56 · Score: 1

You mean like the ones you get when you mount an ext2 (or ext3) filesystem with the 'acl' option? Most distribution kernels already turn on the feature, and it's been in the mainline kernel since 2.6.0 (and I believe it was incorporated sometime late in 2.4.x as well).
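If you want to check whether a filesystem was mounted that way, something like the following sketch works on text in the /proc/mounts format (device, mount point, type, comma-separated options, two numbers). The sample line and the function name are made up for illustration, and a kernel that enables ACLs by default may not list the option explicitly.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style
    text, or None if the mount point is not listed. If it appears more
    than once, the last entry wins."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            result = fields[3].split(",")
    return result


if __name__ == "__main__":
    # Hypothetical sample; on a real system, read /proc/mounts instead.
    sample = (
        "/dev/sda1 / ext3 rw,acl,user_xattr 0 0\n"
        "/dev/sda2 /home ext3 rw 0 0\n"
    )
    for point in ("/", "/home"):
        opts = mount_options(sample, point)
        print(point, "acl" in opts)
```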

--

Sam: "That was needlessly cryptic."
Max: "I'd be peeing my pants if I wore any!"

fsck crash? by RT+Alec · 2006-07-02 07:08 · Score: 1

"While FreeBSD's UFS developers were messing around with sync writes to avoid testing a fsck that would often crash..."

I can't recall fsck ever crashing, and I have been running FreeBSD systems since 2.1 (1995). "Kick ass" fsck sounds scary-- like it was designed for really fscked up drives. Wouldn't it be better to never, ever have really damaged file systems? For the vast majority of uses, stability should trump performance.

As far as what FreeBSD developers were messing around with, here is a good read from 2001:
Matt Dillon interview

VFS needs fixing first by sneakerfish · 2006-07-02 07:15 · Score: 1

Can we fix the VFS first? As one of the linked articles says, all filesystems are equal, but ext3 is the first among equals. Anyone who has tried running NFS over ReiserFS can attest to that. The VFS layer does not treat everyone equally. Although I am happy to see progress with the ext series of filesystems, I would like to see better support for other filesystems first.

Another issue is that distributions don't support all the features available in ext3. Did you know that ext3 supports indexed directories? This would help in situations like mail servers, where there are many, many files in a single directory, if only distributions used the proper mount options. Extended attributes and ACLs will be the most sought-after features over the next few years, I think (think BFS and the nascent WinFS). Ext3 supports these, but alas these features are not enabled by default by the major distributions. I guess it is too difficult for them to support, or they figure we aren't ready for such advanced features.

My last gripe has to do with the features they are adding to ext3 to make ext4. Most of the features listed seem to center around large file support and other features necessary for enterprise-size data. I'm all for managing this class of data on Linux, but do we need to do it in ext? There is already XFS, JFS, maybe even ReiserFS for applications like this. Can we keep ext3 clean and pure for core Linux support? The majority of files in a basic install are small, read often, and written to once in a while. Keeping ext3 optimal for basic necessities while allowing enterprise users to get their work done via access to enterprise filesystems like XFS seems like the best of both worlds to me.

Anyhow, we filesystem snobs are very lucky to have all these choices in Linux. Tuning your applications from the filesystem up with SW RAID, LVM, and various filesystem options can net quite a performance boost. The BSD distributions don't have these choices, although they have GEOM, Vinum, and FFS (the grandfather of all UNIX filesystems including the ext series) with soft updates, which are fine options. And where is this all-knowing ZFS for Linux?

Will Ext4 be state of the art? by wysiwia · 2006-07-02 22:16 · Score: 1

Is Ext4 able to do integrity checks on the fly during ordinary use, making it possible to get rid of the startup/access limit checks?

Is Ext4 able to correct minor discrepancies on the fly, as long as the involved blocks/nodes aren't accessed?

Does Ext4 have a log of major discrepancies which may be corrected in an unmounted state without performing full checks first?

Is Ext4 fail-safe (against power loss) after a certain amount of time (less than 30 sec) of no access? In other words, does a power failure have no effect on any block/node once the last access is older than this time limit?

Can Ext4 be used cross-platform, e.g. in a multi boot environment or virtual server with different systems?

IMO these are the requirements which a state-of-the-art file system should meet these days. Creating and naming a new file system only makes sense if these requirements are fulfilled.

O. Wyss

--
See http://wyoguide.sf.net/papers/Cross-platform.html

ROFL!!!!111one by Anonymous Coward · 2006-07-04 00:16 · Score: 0

Hi,

Outstanding job!!! Though I thought Slashdotters were far from being gais. My assumption was that only m$$ lusers were the maricas. When reading the Not So Short Guide To Latex (something like that) the author has a reference stating that "REAL MEN USE *NIX OSES", kewl :-)

Aside from the great joke, I am shocked that such a group as GNAA even exists. It's almost brutal, and sounds like a self-loathing, self-flagellating sort of lifestyle, I think because they use the N* word. I don't think there is any group called White Trash g* (I don't even like to write that word) Association, WTGA. Anyways, this is way off topic.

Real funny, bye :-))))

Looks like you made it!!! by Anonymous Coward · 2006-07-04 00:48 · Score: 0

"Second, you need to succeed in posting a GNAA First Post on slashdot.org, a popular "news for trolls" website."

Gathering from all the instructions in your post, it looks like you've successfully become a member of GNAA...

Really? by Ayanami+Rei · 2006-07-05 12:37 · Score: 1

Could you recover from having the wrong superblock on your filesystem?
That's right. My SCSI enclosure somehow managed to write the wrong superblock across two LUNs (swapped). On reboot a fsck occurred and proceeded to fuck everything up.
Using some perl and header files for the superblock and inode formats, I was able to revert the changes and repair the damage.
ext2 is simple enough that I did it and it wasn't too difficult. I don't know how much luck I'd have low-level manipulating reiserfs (I guess you have to be in the situation to go through it, otherwise you wouldn't bother).
But yeah, since then I've felt more than confident leaving everything as ext3 since it has such wide use and a predictable behavior (at least to me).
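For anyone curious what poking at the superblock looks like, here is a minimal sketch in Python rather than the perl used above. It is not the actual repair script, only a reader for a few fields, assuming the standard ext2 layout: the superblock starts at byte 1024, with the inode and block counts at offsets 0 and 4, the log2 block size at offset 24 and the 0xEF53 magic at offset 56, all little-endian. The function names are made up.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1 KiB into the device
EXT2_MAGIC = 0xEF53        # s_magic value shared by ext2 and ext3


def parse_superblock(image):
    """Pull a few fields out of the primary superblock of a raw image
    (a bytes-like object holding at least the first 2 KiB)."""
    sb = bytes(image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024])
    if len(sb) < 58:
        raise ValueError("image too short to contain a superblock")
    inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (magic,) = struct.unpack_from("<H", sb, 56)
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "block_size": 1024 << log_block_size,
        "magic": magic,
    }


def looks_like_ext2(image):
    """True if the magic number at the expected offset matches."""
    try:
        return parse_superblock(image)["magic"] == EXT2_MAGIC
    except ValueError:
        return False
```

To try it on a real filesystem, read the first 2048 bytes of the device or image file (read-only!) and pass them in. Comparing the parsed counts against what each LUN ought to hold is one way a swapped superblock could be spotted.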

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

182 comments