Why Redhat Choose ext3 For 7.2
mz001b writes "There is an interesting article from RH posted on LinuxToday discussing why they chose ext3 over the other available journaling filesystems (ReiserFS, xfs, jfs,...) for RH 7.2"
I've worked with ext2, ext3, and ReiserFS extensively, and I can say I've had vastly different results than what many people have _read and repeated_ here. Ext2 is a nice filesystem, assuming you don't have to worry about an unclean shutdown. I can't count the number of times I've lost a filesystem entirely because it was ACTUALLY doing something when the power was lost, or just 2.4's bad VM sending the machine into oblivion and the filesystem with it. Ext3 was nice when I used it once or twice, until I turned DMA on for the disk, at which point it started corrupting itself quite nicely (not a hardware issue, trust me). I would hope this is fixed by now, but I always found it to be a nice feature. ReiserFS, by comparison, has never failed me. I've used it extensively on production machines under 2.2 and 2.4, and been using knfsd since 2.4.6 was released (damn ext2 hooks in the code, completely ridiculous). Obviously, you should find what suits your needs best, but some of the flaming and outright incorrect claims I've seen recently are just ridiculous. See what works for you, not just what RedHat tells you. I remember when Linux was about choice, not about RedHat telling me that I shouldn't use a certain filesystem on my machine and not giving me the CHOICE of doing so.
Interested in open source engine management for your Subaru?
why did they choose to use ext3 ?
Of course it's the migration path. Users can choose to install ext3 and later, if they want to, they can go back to ext2. Forward and backward compatibility makes ext3 a much friendlier journaling filesystem for businesses. Some intranet servers cannot risk a backup and restore and just hope the new filesystem comes up working all right. Of course there are better journaling filesystems out there. But the choice to use ext3 is a good one since it's mature, stable, and easier to administrate and use. "Easier to administrate and use" are the keywords here. Any kernel out there can read an ext3 partition without extra modules. So it definitely plays well with others. Is there any other journaling filesystem that can say this?
The same could be said for ext2, really... but the idea is that after an "unclean" shutdown you don't have to wait as long for a disk check... at least that's what I look to either reiserFS or EXT3 for... When you start looking at really large disk arrays - ext2 fsck takes a helluva long time.
Yes, backups and a UPS are always necessary for mission critical stuff... but this adds another layer of 'help'.
BlackNova Traders
I like ReiserFS, but I have had it fuck up on a machine before. I'm not bitter against it, and I'm not saying that it's not good - but there is still work to be done on it. Ext3 has a lot working for it, and that's why Red Hat is using it in 7.2. Read the article for more details.
Every once in a while I like to masturbate a new word into my vocabulary, even if I don't know what it means.
You are missing a little from your math. Alan Cox is not relevant in this case.
Stephen Tweedie is. He is one of the top filesystem ext[23] hackers and is employed by Redhat. RedHat runs the mailing list for ext2 and ext3 stuff.
But mostly, ext3 allows new filesystems to be employed over old existing ones without a backup and re-creation of the file system. This means ext3 will be deployed (in the US) 10 times more than any other journaled file system.
As for speed, I think the ext[23] file systems, which are already fast, are going to catch up with the addition of an inode hash from Daniel Phillips. Or with his Tux2 file system which is in development. But really, unless you use directories with a large number of small files, ext2 and ReiserFS are not much different for speed.
Having an EASY upgrade path is the way. I also suspect Linus will add ext3 to the mainline kernels in another 2-3 kernel iterations, since the ext3 hackers are quite used to the appropriate methods for getting new code included.
... since Alan Cox (@redhat.com) had so many arguments over the Linux kernel Mailing List with Hans Reiser.
This thread is a good example.
I'm impressed that you were able to write that long paragraph in the three minutes since the article was posted, let alone read the message linked-to. But, I think that in your hurry, you missed a few key things, so I thought I'd quote them here for your benefit and the benefit of anyone else in too much of a rush:
Red Hat is just telling you what they think works -- not taking away any of your choices. They even ship the reiserfs tools. Perhaps you've fallen to the whole "Red Hat is too popular to be cool" thing?
I've been interested in upgrading the FS on the machines I manage here in the office, give or take about 15 servers. The fact of the matter is that it is no small job bringing down a production machine to change its filesystem. So, it sits with an unjournaled ext2 fs. Which is where it would sit, potentially forever until it left the production scope. The ability to upgrade the FS to ext3 without even a reboot, AND maintaining the security of being able to roll back those changes are more than enough to convince me that this is the best way to go.
If I push to have the systems upgraded, say to ReiserFS, and something goes wrong, I'm just plain f**ked. It's that simple. This offers me the ability to upgrade with a fraction of the risk, which, considering Red Hat's duties to its customers, I think is the perfect decision.
Aaron
AaronCameron.net
So I read the article, and all of those reasons could easily apply to any of the above filesystems. Never mind that all of them are more mature and more stable than ext3. The only technical argument for ext3 is the upgrade path: ext3 is ext2 with a journal. But the real reason might be that RH can speed adoption (and by the bazaar model, improvement) of ext3, developed at RedHat, this way.
Although my machine is currently running Mandrake 7.2 (I can't vouch for 8.0 & greater, but I'd assume it is the same), Mandrake gives you the option of what file systems you would like to install. You even have more choices than what you listed above...
ext2, ext3, XFS, Reiser, UFS, HFS (Mac), etc. I'd guess that there are 40 different filesystems for you to choose from...
Doh!
Reiser performs better than ext2 mainly on two points:
ext2 uses a linear search algorithm to index directories while Reiser uses a hashtable. This makes handling of large (10000+ files) directories far more efficient. No more need for /home/h/he/hensema.
Reiserfs also packs together the 'tails' of files, meaning that multiple endings of files can occupy the same disk block. This saves space (less slack). The classic example where this works very well is a news spool, containing hundreds of thousands of files sized typically around 4 KB.
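To make the first point concrete, here is a rough sketch of why a linear directory scan hurts at 10000+ entries. This is a toy model only: the entry names and inode numbers are made up, and a plain Python dict stands in for whatever hashed index the filesystem really keeps on disk.

```python
# Toy model only: real on-disk directory formats are far more involved.

def linear_lookup(entries, name):
    """Scan (name, inode) pairs front to back; return (inode, comparisons made)."""
    comparisons = 0
    for entry_name, inode in entries:
        comparisons += 1
        if entry_name == name:
            return inode, comparisons
    return None, comparisons


def build_index(entries):
    """Build a name -> inode hash index over the same entries."""
    return {entry_name: inode for entry_name, inode in entries}


# Hypothetical directory with 10,000 files (names and inode numbers are invented).
entries = [(f"file{i:05d}", 1000 + i) for i in range(10000)]
index = build_index(entries)

inode, comparisons = linear_lookup(entries, "file09999")
print(inode, comparisons)    # the last entry costs one comparison per entry
print(index["file09999"])    # the hashed index finds it without walking the list
```

The linear version does work proportional to the directory size on every lookup, which is why people resort to tricks like /home/h/he/hensema to keep directories small.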
I'm not sure whether the special Reiserfs API has been implemented which can allocate files without names but using their hash-index. This may speed up processes like squid, which have to store vast numbers of files but don't care about their names. Cutting out the directory layer completely is a very nice solution.
This is your sig. There are thousands more, but this one is yours.
The article at LinuxToday isn't about RedHat preferring ext3 over other journaling filesystems. It's merely an explanation of why they decided to include ext3 in the new RedHat 7.2.
The only comparison made is between ext3 and ext2 where they explain the advantages of a journaling system.
Linux has never crashed on me without a hardware problem causing it (not an exaggeration), but that doesn't mean we haven't had plenty of hardware problems, and each time there was a failure, the fsck would take 30-45 minutes. My first thought was ext3, but... heh. It was always grayed out in the kernel config menus. Not a good sign. ReiserFS on the other hand was immediately available.
Of course, you don't trust your data to something without being damn thorough about it, so I did a bunch of tests on staging servers (which went great) and I spent a lot of time reading Hans Reiser, who impressed me considerably as a smart person with a lot of good ideas. We made the move this spring and have had zero problems with the filesystem during normal operations. Zero. It's blazing fast on our tests, it appears to scale beautifully, and if I go down, I have no wait time anymore coming back up.
Of course, I keep up with the kernel changes and upgrade when I see updates relevant to the filesystem.
It's not a perfect package, but nearly. Its consistency checker/repair tool (reiserfsck) is not finished (as its messages vigorously warn). Now, remember, this is not the same thing as e2fsck. You are not using it in the same role, its purpose is much more specialized (disaster recovery), so the significance is different. Still; we came to use it during several of the many times high-speed SCSI chomped on our asses and corrupted data. We have backups, of course, but I wanted to see what the tool was capable of. In several cases it was able to successfully rebuild the filesystem, very slowly, with --rebuilddb, but in several other cases, the tool would dump core, which, if you were one of those fools without a backup, would leave you stranded.
Even in this, however, I was reassured; the maintainer of the tool answers emails quickly and was eager to try to troubleshoot the problem. I thus have no doubt that it will quickly mature into something quite good. It's just not there at this moment.
On the whole I would say I'm extremely happy with ReiserFS; we've punished it here pretty brutally and it's passed every test. I don't have any experience with ext3, but anecdotally I'm told it's less mature. Still, I have nothing against it. I can only comment that I hope Redhat's upgrade process from 7.1 to 7.2 will at least take reiserfs into account, instead of breaking the way it did from 7.0 to 7.1.
We're on the road to Tycho.
I'm surprised that more people haven't said anything about XFS. I've been using it for a while now at home and on a production fileserver at work and haven't experienced any problems. The only thing at all that has been a worry is the fact that Grub cannot yet read XFS, so you have to create a small boot drive at the beginning. At least with XFS, the filesystem has already been designed and tested for years by SGI, and the only matter was porting it to Linux. From what I've seen with ReiserFS, they are still trying to decide on features and on how it is going to go about doing things. That's fine and all, but I don't want to end up having to backup and restore my filesystem a few times as they decide to implement a new "everything and the kitchen sink" feature. If I'm doing something for file integrity and security, I'd rather have something that I know has been working for years now in a high performance environment. Just so this won't be considered offtopic, I would say that I can see why ext3 would be preferred by Redhat over Reiser (with the in-house development, and the easier migration), and hey, it will probably be "good enough" for most people (and certainly some kind of journaling is better than plain ext2), so hey, good for Redhat, and good for their users. I'll continue using XFS, but that's what's nice about choice anyway, right?
http://cambuca.ldhs.cetuc.puc-rio.br/ has an install disk for RH & ReiserFS; I expect you could use it for recovery.
These make nice emergency disks, including ssh:
http://www.lnx-bbc.org/ The LNX-BBC is a mini Linux-distribution, small enough to fit on a CD-ROM that has been cut, pressed, or molded to the size and shape of a business card.
Plato seems wrong to me today
I use Partition Magic on a regular basis to manage my partitions, resizing and moving them around as needed. (I know, it's commercial software, but it's one of the more useful pieces of commercial software out there, especially if you like to change things around a lot on your systems.)
PM supports ext2 but not any of the newer exotic journaling file systems like ReiserFS or xfs.
The fact that ext3 is compatible with ext2, and can be converted back and forth, is a welcome feature for those who use PM to manage their partitions.
I'm anything but a filesystem expert, but from what I know, a journaling filesystem doesn't magically protect your files against damage. If power fails while writing to your disk, even journaling filesystems won't be able to complete the write later; the data simply isn't stored anywhere.
What the journaling system can do is have a very short disk check time during reboot, because it doesn't have to scan the complete disk after a crash.
Yes, I admit I skipped over some of the technicalities, but my point still remains. Alan has already written the kernel patch. He works for RedHat.
Stephen Tweedie and Ted Ts'o wrote ext3, mostly. Andrew Morton led its port to the 2.4 kernel, and maintains patches for ext3 to be applied to the 2.4 kernels.
Alan Cox merges those submitted patches into his ac series, which ultimately leads to Linus adding them to the "stock" kernels.
Um, how can ext[23] catch up on speed with the addition of Tux2 filesystem?
In many ways Tux2 is an add-on to ext2 to add
1) A hash function which is analogous to a B-tree for directory searches. This is currently a big speed hit for ext2 for directories with lots of small files.
2) Atomic updating of file system writes. This will make the file system power-button resistant without adding a journal. This is a feature already present in FFS + Soft Updates in FreeBSD.
The atomic updating algorithm will make the file system faster than any journalled file system, ceteris paribus. But Tux2 will also be an add-on to existing ext[23] file systems, and will inherit most of its code base from them. Similarly, the upgrade path will not require a backup and re-creation of the file system. At least, this is according to statements made by its coder, Phillips.
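For anyone wondering how you get "power-button resistant" without a journal, the general idea of an atomic update can be sketched like this. To be clear, this is not Phillips's actual algorithm or code, just a toy illustration of the write-new-copy-then-switch-pointer approach; the `ToyDisk` class and its fields are invented for the example.

```python
class ToyDisk:
    """Toy model: blocks are never overwritten in place; only the root pointer moves."""

    def __init__(self):
        self.blocks = {0: {"files": {}}}  # block number -> contents
        self.root = 0                     # the single pointer that is switched atomically
        self.next_free = 1

    def current(self):
        """Return the file table the root pointer currently refers to."""
        return self.blocks[self.root]["files"]

    def update(self, name, data, crash_before_switch=False):
        # 1. Write a modified copy of the tree into a free block.
        new_tree = {"files": dict(self.current())}
        new_tree["files"][name] = data
        new_block = self.next_free
        self.blocks[new_block] = new_tree
        self.next_free += 1
        if crash_before_switch:
            return  # simulated power loss: root still points at the old, intact tree
        # 2. Switch the root pointer in one step.
        self.root = new_block


disk = ToyDisk()
disk.update("a", "hello")
disk.update("b", "world", crash_before_switch=True)
print(disk.current())  # {'a': 'hello'}
```

After the simulated crash the old tree is still fully consistent, so there is nothing to replay and nothing to fsck; the half-finished update is simply never seen.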
BTW, people can get more information about XFS, or download patches or kernel source from the SGI Linux XFS site. CVS is also available.
Thank you. That was entertaining.
=)
and they want to define what Linux is. That's not so hard to understand.
Yes, I'm a Linux user, yes, I have it pre-loaded onto computers I buy for work and will sing its virtues all day long, but RedHat is still a company and does have to make money. If they've tested the filesystems and one type works better for what Redhat needs in a filesystem, it doesn't take a brain surgeon to figure out which filesystem they're going to use.
The good news is that you can have a dozen filesystems on your hard drive and mount each and every one of them and not have to worry about which one is what when it comes time to go to that directory.
DanH
Cav Pilot's Reference Page
UNIX - Not just for Vestal Virgins anymore
Here's my reasoning:
* Any non-trivial choice in the computer world has its pros and cons.
* Magically, Michael Johnson only finds pros in adopting ext3.
Conclusion: Michael Johnson is not fully confronting the issue or being fully objective and upfront.
(Please browse at -1 to read this comment.)
ext2 - hmmm, too cold.
ReiserFS - mmmm... too hot.
ext3 - mmmmm.... Ah! Just right!
Both at home, and at work, running various versions of the 2.2 kernel plus patches, stock Mandrake installs, 2.4 kernel, etc. we have, over time, experienced data loss using reiserfs. This has not been limited to one machine, or one configuration, or one version of the filesystem (though, admittedly, we have been unwilling to try any newer versions in the last six months or so ... let someone else take the pain for a while).
... at least for the nonce. I concur with another poster who pointed out that SGI's XFS has been well tested and stable since 1994 ... any issues are porting issues, not design or internal issues, which IMHO is quite important when looking for a manageable and stable alternative.
I say this not to knock reiser per se (I am quite happy it made it into the kernel tree, although I won't be completely happy until ext3, jfs, and xfs are all in the kernel tree as well so that they can compete with one another on quality and features rather than merely convenience), but to point out that all is not necessarily sunny, nor is it unequivocally the "leader" when it comes to Linux journalling filesystems. I feel it is important to counterbalance some of these overly sunny depictions of experimental filesystems being used in serious environments with a little real-world, personal experience to the contrary.
I haven't yet tried ext3 or jfs, but have used various incarnations of xfs and must say that I have developed a preference for it over the last few weeks. That having been said, I still make use of ext2 filesystems in production environments and will use less tried linux filesystems (previously reiser, now ext3 and xfs) in development/test environments only.
These filesystems are fun and exciting, but they are not perfect, and in the case of reiser some rather serious (hopefully now fixed, but what's next?) flaws have gotten played down a little more than they should have. Remember, back up early, back up often, and be conservative in using any of them in anything other than a test situation (you can, once its tested to your satisfaction, but be cautious).
Back up early and back up often.
The Future of Human Evolution: Autonomy
=WHY CARE WHICH???=
Once you've formatted the partition, and loaded the appropriate FS driver, you don't =NEED= to care what the underlying filesystem is. There is NO DIFFERENCE!
From the perspective of Red Hat, or any other distribution, the difference in effort of supporting one journalling FS or a hundred is negligible. A menu is a menu is a menu. The number of items on it is irrelevant.
From the perspective of the user, it doesn't make a difference, either. Oh, it's a pull-down menu! Wow! Never seen one of those before, I wonder how it works. Duh! Give even novice users the wits to ignore things they don't understand, and expert users the freedom to tweak things they DO understand. Further, since ALL filing systems behave in much the same way, from the user's standpoint, it doesn't make a damn bit of difference whether someone "messes up" on this or not.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
I did not see any mention in the article of whether other journaling file systems would be available on Red Hat 7.2 as part of the installation/upgrade procedure.
;-)
I have been using ReiserFS for about 18 months now and greatly appreciate it on my 20 gig hard drive on my laptop. Never had a single corruption problem, and always rebooting the machine quickly (specially with VMware crashing the system
Some systems at Ximian use XFS extensively as well. When you have too many files in a directory (for example Gnus spools) ext2/ext3 won't cut it due to the slow operations on directories with lots of files.
Miguel.
Hmm, this will be interesting. My machine has had a few unexpected shutdowns, and I have had to run fsck on it. With a 30 gig drive it actually was faster than with my old system. Of course it helps that I have a super fast computer these days (1.2 GHz with a WD 7200 RPM drive). I'd imagine that soon I can have my dream of an instant on / instant off machine ;-).
Only 'flamers' flame!
RedHat has been VERY good about not radically changing their platform between point releases so I'm not surprised to see this incremental filesystem improvement.
I would, however, be surprised to see them skip XFS, JFS or ReiserFS in their 8.0 release. It would make sense for them to add that capability at that time (and would allow the implementations to mature that much more).
-- James
James
$ ls
.
..
script
$ chmod 700 script
$
File Not Found
Yet, ls still finds the file... This is a phantom problem I have only seen on ReiserFS and appears to be some sort of metadata corruption. It is really irritating because you have to create another file. To the credit of ReiserFS, the problem has not occurred since I upgraded to SuSE 7.1.
LedgerSMB: Open source Accounting/ERP
They should change it to
;)
Why Redhat Choo-Choo-Choose ext3 for 7.2
and replace the Red Hat logo with a picture of a train
From hell's heart I fstab at /dev/hdc
A filesystem like reiser/ext3/xfs is only designed to guarantee the internal consistency of the filesystem and metadata after a power failure. No guarantees are made about the data itself.
Contrary to popular misunderstanding, using a journaled filesystem does not mean you can start using the power switch before doing a proper shutdown -- try this and you will lose data on any filesystem. It's just that on the journaled filesystems, the risk of further filesystem corruption as a result of power failure is drastically reduced and there is no long delay to analyze the entire filesystem for consistency when powering up again.
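A toy sketch of the recovery step may help show why this is so. It assumes a very simplified metadata-only journal made of begin/write/commit records; the record layout and the keys are invented for illustration and do not match any real filesystem's on-disk format.

```python
def replay(journal, metadata):
    """Apply writes from committed transactions only; incomplete ones are ignored."""
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _txid, key, value = rec
            metadata[key] = value
    return metadata


# Hypothetical metadata as it sits on disk at the moment of the crash.
metadata = {"dir:/home": ["a.txt"], "inode:12": {"size": 100}}

journal = [
    ("begin", 1),
    ("write", 1, "dir:/home", ["a.txt", "b.txt"]),
    ("write", 1, "inode:13", {"size": 0}),
    ("commit", 1),
    ("begin", 2),
    ("write", 2, "inode:12", {"size": 5000}),
    # power lost here: transaction 2 never reached its commit record
]

recovered = replay(journal, dict(metadata))
print(recovered)
```

Recovery only walks the journal, so its cost depends on the journal length rather than the size of the disk, which is where the short boot time comes from. Transaction 2 is thrown away entirely: the structure stays consistent, but whatever was being written at that moment is gone.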
STOP . AMERICA . NOW
Allusion is made in the article to the future use of NVRAM for the journal. Now, I know how much a performance win this would be, but doesn't NetApp hold a patent on that?
Can someone with more patent-searching skills than myself verify? Thanks!
(Sorry if I be a little short-sentenced. I just wrote a whole story then Mozilla went nuts so now I am doing it again.)
Two things: First, with 2.4 we were `promised' journaling and devfs. Both are still marked experimental, and of journaling, only ReiserFS is included as an appetizer, but the subsystem is still heavily in development. Some smaller things that were supposed to be improved at 2.4 are also still marked experimental. My guess is that most people -like me- are still using ext2 and device nodes, silently but eagerly waiting until journaling and devfs (and these other smaller things) get marked `stable' (by the proper authorities ;-), and that, as a result, journaling and devfs will really become mainstream when 2.5 is in good sight. So while 2.4 was supposed to bring us these two big features, in reality, well, it doesn't. Yes, I know, it provides the basis, is being worked on, can be obtained by patches etc. etc., but that doesn't practically make it much different from 2.2, because as I said, my guess is that most people still aren't encouraged to take the step to a journaling filesystem.
Second: think GCC-2.96 (IIRC). RedHat has the power to shape the Free Software market a little bit the way they like it. With the inclusion of the compiler marked as GCC-2.96 they have practically released a GCC version without involving the GCC team. When RedHat issues a kernel that does ext3 (not just as an option, but as a default feature), I guess at least some of the results are the same as with the GCC-2.96 case. Although maybe this time not `faced with the facts' (that RedHat issued GCC-2.96), but merely `by popular demand' (from other distros that want to use journaling by now), there will be some pressure on other distros and the kernel developers to get journaling in.
Hmm. Maybe I'm really exaggerating the case. And do keep in mind that I'm not mad that I don't `get what I'm promised' or something like that. It just makes me nervous that I can't find ext3 anywhere in my fresh kernel sources (2.4.7; debian testing doesn't have 2.4.8 yet but I don't think the differences are that big wrt journaling and `marked experimental' stuff AFAIK from the changelogs) while the ext3 patches for the 2.2 series _are_ in the distro. And I really can use that stable VM of 2.4; earlier on the GIMP crashed my box, now it just crashes itself when loading huge things. I do get complete keyboard blocks once in a while, but no thrashing anymore, and hey, that's what the reset button was built for, right?
Which brings us back to journaling.... Oh well
"We can confirm that Debian does *not* ship the version with the trojan horse. Our version predates it." [CA-2002-28]
Not 100% on topic, but somewhat related: The second beta was released yesterday. You can get it at ftp://ftp.redhat.com/pub/redhat/linux/beta/roswell/.
This message is provided under the terms outlined at http://www.bero.org/terms.html
Isn't "to administrate" to issue lots and lots of reports?
I think we've pushed this "anyone can grow up to be president" thing too far.
It seems to me that hard disks are now large enough that there needs to be another option (make it easy enough so that everyone can try it):
At install time, optionally (check box, just like ext3) set up a few partitions, each a test version. Set each one to a boot partition with a different file system. At boot time, allow the user to pick which one to boot from. Set up a few dummy ones, too, so that new file system types can be added, and a utility to copy a boot partition to the new one, customizing partition mapping tables (which is why the scripting is needed).
I tried doing something sort of like this by hand once, but I had a lot of trouble getting the partition mapping tables straight. In fact I considered myself lucky to be able to straighten things back out without a reinstall.
The names of the boot partition and the dummies need to be shuffled around as you move from one to the other, but the device addresses stay fixed. This is fine, but when some options are used the device addresses get replaced by the current name of the partition... well, it got a bit messy.
When Redhat 7.1 was released, someone here on slashdot wondered why reiser was not part of the distro. A redhat employee posted that under certain conditions with heavy file i/o tests, the filesystem became corrupted. I am sure the bug is not very common but on a corporate server it should not be installed. The article has been slashdotted so I have no idea what it said about reiser or the other filesystems but I assume that the other filesystems were not fully tested enough with the linux kernel so were not included.
Also remember that redhat is made and used for business-oriented servers and workstations where stability and reliability matter far more than cutting edge technology. I would not want to bet my job on a new technology until it's very mature. Redhat needs to take it slow due to their market. You can always use a more bleeding edge distro like mandrake. Also reiser is available on the Redhat Powertools cd with the deluxe or server edition of RH 7.1 if you just have to have it.
PS: Could someone posting anonymously post the story so I could read it. Thanks
http://saveie6.com/
I suspect performance characteristics will vary with how and what you use it for.
An example: When I did benchmarks with PostgreSQL 7.1.1/2 a couple of months ago, XFS and ext3 were of similar speed while ReiserFS was rather slow. OTOH, I'd expect ReiserFS to handle cases with lots of files in a directory faster than ext2. Also, the tailmerging in ReiserFS should save space (especially in situations with many small files), but is somewhat risky (there's been a few bugs in that part of the code) and will cost performance.
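To put a rough number on the space saving from tail merging, here is some back-of-the-envelope arithmetic. It is an idealized sketch: the 4096-byte block size and the file sizes are assumptions, and it ignores any per-tail bookkeeping overhead a real filesystem would have.

```python
import math

BLOCK_SIZE = 4096  # bytes; an assumed block size


def blocks_without_packing(sizes, block_size=BLOCK_SIZE):
    """Every file is rounded up to a whole number of blocks."""
    return sum(math.ceil(size / block_size) for size in sizes)


def blocks_with_packing(sizes, block_size=BLOCK_SIZE):
    """Full blocks stay per file; the leftover tails share blocks (idealized)."""
    full_blocks = sum(size // block_size for size in sizes)
    tail_bytes = sum(size % block_size for size in sizes)
    return full_blocks + math.ceil(tail_bytes / block_size)


# Hypothetical spool: 100,000 files of 4,196 bytes each
# (one full block plus a 100-byte tail).
sizes = [4196] * 100_000
print(blocks_without_packing(sizes))  # 200000
print(blocks_with_packing(sizes))     # 102442
```

In this made-up case nearly half the blocks are slack without packing. The flip side is the one already mentioned: packing and unpacking tails is extra work on every small write, so the space is paid for in performance.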
One of the big benefits of ext3 is that the filesystem isn't new, it's proven solid and has been around for a long time. Adding a journal layer isn't too risky, and it has been in testing (with good results) for a long time.
The only resolution I could come up with was to recreate a new file. It was limited to our ReiserFS / partition, and was most common with the rc.d scripts. This was a major headache and I decided that the journal did give some advantages but at that time (SuSE 7.0) was not ready for prime time, as it added additional ways things could go wrong.
This problem had a habit of preventing certain important services from starting because the metadata for the startup scripts would become corrupt. It was an issue for us, and it has not made us very favorable towards ReiserFS. In fact it prevented further migration.
If it don't run properly on Linux, I don't run it.
If you will recall, Red Hat was flamed extensively for releasing glibc when they did.
Benchmark results are not #1 on the list of important features in a server-class OS.
Red Hat is not claiming that "reiser goes corrupt", but you would have known that, had you read the article.
I have seen the future, and it is inconvenient.
Is there still a two gig max file size limitation? I'm assuming so, since this seems to mainly be ext2fs with journaling tacked onto it.
Bero has indeed claimed that Reiser experiences corruption at high loads, and it is you who have not done sufficient research. Please pay attention.
However, if the likes of Oracle have stress-tested Reiser and pronounced it good enough, Red Hat's protestations fall upon deaf ears. I myself would have preferred to see XFS or JFS, and we both know that bugs have never stopped Red Hat from making a production release before.
The next time you are insulting, try to be more literate.
The linked article (also known as, "the subject under discussion") makes no such statement. As for previous statements, if Red Hat discovers flaws in reiserfs, it is important that they say so (as they have done in the past with both reiserfs and ext3). What Oracle chooses to do has absolutely no bearing on the validity of Red Hat's tests or on whether reiserfs has bugs or not.
I am not trying to be insulting. This is Slashdot. By posting you give an implicit license for others to post replies which disagree with you.