Merits Of The Different Journaling Filesystems?

Posted by Cliff on Friday September 29, 2000 @11:58PM from the advantages-and-disadvantages dept.

a2800276 asks: "The story that XFS has gone beta raised some questions in my mind. There are now four journaling filesystems available under various OSS licenses and being actively developed for Linux, there being (in estimated order of maturity): SuSE/Namesys's reiserfs, SGI's XFS, IBM's JFS and Tweedie/Redhat's ext3fs. Avoiding the obvious question of why can't the effort going into four different projects be channeled into one, I think a discussion of the particular merits of the different fs's would be interesting."

Um, it's deja vu all over again... by Holgate · 2000-09-29 19:18 · Score: 4

Haven't we had this discussion umpteen times before? Such as... two days ago? There's even a link to a discussion on the four competing filesystems. Sheesh. Flogging the dead horse.
Re:ReiserFS by Jerry · 2000-09-29 20:52 · Score: 4

I was ASKED to install a Linux system at work last week! (I've been
preaching Linux for 3 years - patience pays off!) They gave me an
old P100 with 71MB of usable RAM and two HDs.
I decided to use SuSE 6.4 BECAUSE it had ReiserFS.
The graphic install really impressed the Win techies standing
around watching because it was easy enough that even they
could do it, and is pretty eye candy. KDE really impressed them
too.
Thirty minutes later the second HD, a 4.3 BigFoot, died.
I had /home on it, and since I was logged in as root, it didn't matter.
The dead drive was smoothly disconnected from the system.
Since I needed to power down to replace the HD I decided
to test out the ReiserFS. I reached over and pressed the reset
button. A collective "gasp!" rose from the assembled techies.
Thirty seconds later I had the KDE graphical login prompt.
No corruption, no losses. It's like having a UPS attached.
I didn't notice any increase in speed of file accessing, but it
was fast at rebooting. It's been up 18 days now, which is
also impressing the techies in our M$ shop. They are still
afraid of Linux though. I think it is because they may feel
that they may have to retrain, losing any employment
advantages they may have accumulated. They are right.

--
Running with Linux for over 20 years!
Re:Try *BSD Soft Updates by miguel · 2000-09-29 22:52 · Score: 4

McKusick's Soft Updates also has a nice feature: unlike the journaling file systems, it does not have the burden of writing blocks to a logging device. So a soft-updates enabled kernel runs at the speed of traditional asynchronous file systems (i.e., default ext2) while providing a very good level of reliability (it is not a synchronous file system, so it runs at a very enjoyable speed).

You can boot a Soft Updates file system without fscking it; the file system will be in a functional state. The only problem is that you might start to lose free blocks that are believed to be busy. So every 100 or 200 crashes you might want to run fsck to free those 100 blocks.

I agree with you regarding the ext2 file system when running in async mode: when there is a lot of activity on the disk, and a lot of changes to the file system, crashing an ext2 file system will lose a considerable amount of data. ext2 fsck will not be able to recover your file system properly (it has happened to me a couple of times already).

For non-SoftUpdates kernels and non-Journaling kernels, if you are running a system with sensitive information, I suggest turning on synchronous access for the file system (add the sync option to its mount options).

The sad part here is that the BSDs have traditionally been optimized for the synchronous case, so they run at acceptable speeds. Linux ext2fs has never been optimized for this case, so in practice it is very slow.

I am using ReiserFS on my laptop, but on a server, if I had to choose, I would run SoftUpdates for BSD kernels and ReiserFS for Linux kernels.

Miguel.
ext3 more real than JFS by crow · 2000-09-29 19:22 · Score: 4

When I was at the Ottawa Linux Symposium, there were talks on XFS, JFS, and ext3fs. It seemed clear that XFS was near beta, so the recent announcement was no surprise. Ext3fs also sounded near beta. Ext3 takes the simple approach of adding journaling to ext2 in such a way that as long as you unmount cleanly (so there's no need to play the log back), you can take an ext3 partition and mount it as an ext2 partition. From the talk, it sounded pretty much ready.
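
For anyone wondering what "playing the log back" means in practice, here is a minimal toy sketch in Python. It is an illustration of generic write-ahead journaling, not ext3's actual on-disk format or code; the class and method names (`ToyJournaledDisk`, `log_transaction`, `replay`) are made up for the example.

```python
class ToyJournaledDisk:
    """Toy model of a write-ahead journal (hypothetical, not ext3's real layout)."""

    def __init__(self):
        self.blocks = {}   # block number -> data at its "home" location
        self.journal = []  # records of the form (kind, txn_id, block_no, data)

    def log_transaction(self, txn_id, writes, commit=True):
        # Changes go to the journal first; home locations are not touched yet.
        for block_no, data in writes.items():
            self.journal.append(("write", txn_id, block_no, data))
        # A crash before the commit record is modelled with commit=False.
        if commit:
            self.journal.append(("commit", txn_id, None, None))

    def replay(self):
        # Only transactions with a commit record are applied; the rest are dropped.
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for kind, txn_id, block_no, data in self.journal:
            if kind == "write" and txn_id in committed:
                self.blocks[block_no] = data
        self.journal.clear()


disk = ToyJournaledDisk()
disk.log_transaction(1, {10: b"inode", 11: b"dir entry"})
disk.log_transaction(2, {12: b"half-written"}, commit=False)  # crash mid-transaction
disk.replay()
```

Once the journal is empty (as after a clean unmount), everything lives at its home location, which is roughly why a reader that knows nothing about the journal can still make sense of the blocks.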

JFS was another story. My take on the talk was that people who attended it learned one important thing: JFS is the journaling file system to ignore. The Linux port comes from OS/2, instead of directly from AIX. It lacks such things as support for mixed case filenames. The answers to most of the questions were, "We hadn't thought of that," or, "We'll have to look into that." If JFS didn't have the "me-too" ego of IBM behind it, the developers would have realized that they were better off working on one of the other file systems.
ReiserFS is just fine by me by da3dAlus · 2000-09-29 20:06 · Score: 4

I tried to ask this question a few months ago, but with no luck getting it posted I did some research on my own. I wanted to make a 60GB file server that would give me some insurance on my data. I was close to using the IBM JFS, but kept hearing about ReiserFS and gave it a try. (Heck, sourceforge uses ReiserFS on their servers, so it's good enough for mine.) Anyway, after a little more reading, I realized that ReiserFS doesn't just add journaling to a partition, it also restructures the filesystem into B-trees which can enhance access speeds, and it also adds a bit of encryption to the filesystem since it uses a hashing algorithm to sort the files.
In my opinion, you just get more. I also found the installation and recompile fairly easy to do. I've been using ReiserFS for the past 3 months with absolutely no problems.

--

Sometimes I doubt your commitment to Sparkle Motion.
1. Re:ReiserFS is just fine by me by Inoshiro · 2000-09-29 23:42 · Score: 4
  
  I think you're confused about what a hash is. A hash is a one-way function. A cryptographic hash is a one-way function with an incredibly low probability of collision. Example: passwords are stored as cryptographic hashes, allowing the system to verify your password without having a copy of it handy :)
  
  Encryption is a two-way function requiring a key (3DES, blowfish, IDEA), or a pair of matched one-way functions (RSA, DSA). Don't confuse this with a cryptographic hash which is strictly one way. And don't confuse hashes with any form of encryption.
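
A rough sketch of the distinction in Python, using only the standard library. The first half uses a hash purely as a sort key for file names (CRC32 here is a stand-in I picked for illustration, not whatever hash ReiserFS really uses); the second half uses a salted cryptographic hash for password checking. The function names and the sample password are assumptions for the example.

```python
import hashlib
import hmac
import os
import zlib


def name_key(name: str) -> int:
    # Non-cryptographic hash used only to order directory entries.
    # The names themselves stay stored in the clear; nothing is hidden.
    return zlib.crc32(name.encode("utf-8"))


def hash_password(password: str, salt: bytes) -> bytes:
    # One-way derivation: the stored digest cannot be turned back into the password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(candidate: str, salt: bytes, stored: bytes) -> bool:
    # Verification re-hashes the candidate and compares digests in constant time.
    return hmac.compare_digest(hash_password(candidate, salt), stored)


names = ["notes.txt", "kernel.tar.gz", "README"]
entries = sorted(names, key=name_key)

salt = os.urandom(16)
stored = hash_password("hunter2", salt)
```

Neither half involves a key that could decrypt anything, which is the point: hashing file names for lookup gives you ordering, not secrecy.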
  --
  
  Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Re:Journalling is dead. Long live phase trees! by Daniel+Phillips · 2000-09-30 05:37 · Score: 4

tux2fs probably *will* take more memory (substantially more?) than ext2 or a journaling filesystem, but with the amount of memory that most systems have available for file cache, I doubt that is a problem.

I've analyzed that question and I think tux2 will only use a little more cache memory, not a lot more, and it could even be less - see below. Tux2 uses per-block copy-on-write, and when the old version of a block won't be used any more (the normal case) that means you can just change the disk block number in the buffer - no extra memory used at all. The only time extra memory is used is when a file block is written over and over again, every 10th of a second or so - then you will sometimes get two copies of it in memory at the same time. The first copy will disappear as soon as it finishes being transferred to disk. This kind of writing pattern is rare with normal data but is common with metadata. Fortunately metadata is about 0.1% of the total in a filesystem. My guess is you won't notice any extra load on the buffer cache.
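
To make the per-block copy-on-write idea concrete, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions (a flat logical-to-physical map instead of a real tree, no freeing of old blocks, no buffer cache), and every name in it (`ToyCowStore`, `commit_phase`, and so on) is hypothetical rather than taken from the Tux2 code.

```python
class ToyCowStore:
    """Toy copy-on-write block store (hypothetical, not Tux2's implementation)."""

    def __init__(self):
        self.disk = {}       # physical block number -> data
        self.committed = {}  # logical block -> physical block, last consistent state
        self.working = {}    # logical block -> physical block, phase in progress
        self.next_free = 0

    def _alloc(self):
        n = self.next_free
        self.next_free += 1
        return n

    def write(self, logical, data):
        phys = self.working.get(logical)
        # Never overwrite a block that the committed state still points at:
        # redirect the write to a fresh physical block instead.
        if phys is None or phys == self.committed.get(logical):
            phys = self._alloc()
            self.working[logical] = phys
        self.disk[phys] = data

    def commit_phase(self):
        # The whole phase becomes visible in one switch.
        self.committed = dict(self.working)

    def crash(self):
        # Uncommitted remappings are simply forgotten.
        self.working = dict(self.committed)

    def read_committed(self, logical):
        return self.disk[self.committed[logical]]


store = ToyCowStore()
store.write("a", b"v1")
store.commit_phase()
store.write("a", b"v2")        # lands in a new physical block
before_crash = store.read_committed("a")
store.crash()
after_crash = store.read_committed("a")
```

In this model a rewrite inside the same phase reuses the already-redirected block, which loosely mirrors the "just change the disk block number" case described above.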

In fact, I think Tux2 will take a load off the buffer/page cache because it doesn't let dirty data hang around a long time - it starts writing to disk a fraction of a second after you start writing to a file. My plan is to have Tux2 shorten its phase length under heavy memory pressure, so the space needed for dirty buffers will drop down to just 100-200K, and you'll still get good performance.

Cache memory for reading under Tux2 is the same as Ext2 and most other filesystems.
--

Have you got your LWN subscription yet?
A brief summary by Fluffy+the+Cat · 2000-09-29 19:09 · Score: 5

XFS is optimised for dealing with streaming media, and so deals well with high IO and large files.

JFS has been around for years under AIX. It's a well proven general purpose journalling filesystem.

ReiserFS is the best established of the Linux journalling filesystems. It has several fairly innovative features and is more efficient than ext2 in terms of space utilisation. People are using it as their primary filesystem now, although it's still in development.

EXT3 is (unsurprisingly) a development of EXT2. It lacks most of the pretty features of the other journalled filesystems, but has the significant advantage that you can turn EXT2 partitions into EXT3 (and vice versa) without any trouble at all.
Try *BSD Soft Updates by redelm · 2000-09-29 19:59 · Score: 5

For similar crash protection, you might want to try out McKusick's "Soft Updates" that appear in *BSD systems. Essentially, they are ordered disk writes that make sure data gets on the disk before metadata is altered. They go through the buffering system, so performance isn't bad.
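
The ordering idea can be sketched with a toy buffer cache in Python. This is a simplified illustration, not how the BSD code works: it assumes dependencies never form a cycle (which the real implementation has to handle), and the names `ToyOrderedCache`, `write`, `flush` and `sync` are invented for the example.

```python
class ToyOrderedCache:
    """Toy buffer cache that flushes blocks in dependency order (hypothetical)."""

    def __init__(self):
        self.dirty = {}       # block name -> data still sitting in the cache
        self.depends_on = {}  # block name -> blocks that must reach disk first
        self.disk_log = []    # order in which blocks were actually written

    def write(self, block, data, after=()):
        # Writes are buffered; 'after' records what must be on disk beforehand.
        self.dirty[block] = data
        self.depends_on.setdefault(block, set()).update(after)

    def flush(self, block):
        if block not in self.dirty:
            return
        # Push out prerequisites first (assumes no cyclic dependencies).
        for dep in sorted(self.depends_on.get(block, ())):
            self.flush(dep)
        self.disk_log.append(block)
        del self.dirty[block]
        self.depends_on.pop(block, None)

    def sync(self):
        for block in list(self.dirty):
            self.flush(block)


cache = ToyOrderedCache()
cache.write("inode-7", b"points at data-42", after=["data-42"])
cache.write("data-42", b"file contents")
cache.sync()
```

Even though the inode was dirtied first, the data block reaches the disk before it, so a crash in between never leaves metadata pointing at garbage.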

As an experiment, I pulled the plug towards the end of 5 FreeBSD kernel compiles (SMP `make -j 4`). In all cases, the fsck upon restart was minor, just freeing inodes. In four of the cases, `make` just picked up where it left off, and finished the kernel compile, losing only ~40 seconds work. In one case, a `make clean` had to be done because something was incomplete.

Don't try this on Linux! The ext2 fsck is horrible after a powerfail, and I've lost superblocks and had to re-install :( .
Journalling is dead. Long live phase trees! by teraflop+user · 2000-09-29 19:33 · Score: 5

OK, that overstates the case a little, journalling is still better when transactions have to complete as early as possible. But for most purposes Daniel Phillips' 'Tux2' phase-tree filesystem looks as though it may well be superior to journalling - it provides the same guarantees of a consistent filesystem as data+metadata journalling, without the performance hit.
It is also proof that open source software does not just 'chase tail lights' - the work is substantially innovative.
Phillips is also implementing tailmerging (a feature from ReiserFS to efficiently store small files) for ext2/ext3/tux2.
For more details, check his web pages here, and the linux-fsdevel mailing list.
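
As a back-of-the-envelope illustration of why tail merging helps with small files, here is a short Python estimate. It is an idealized sketch: it assumes a 4096-byte block size and that tails pack together perfectly with no bookkeeping overhead, neither of which comes from the posts above, and the function names are made up.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_without_tail_merging(sizes, block_size=BLOCK_SIZE):
    # Each file is rounded up to a whole number of blocks.
    return sum(-(-size // block_size) for size in sizes)


def blocks_with_tail_merging(sizes, block_size=BLOCK_SIZE):
    # Full blocks stay per-file; the leftover tails share blocks.
    full = sum(size // block_size for size in sizes)
    tail_bytes = sum(size % block_size for size in sizes)
    return full + -(-tail_bytes // block_size)


sizes = [100, 5000, 300]  # file sizes in bytes
plain = blocks_without_tail_merging(sizes)
merged = blocks_with_tail_merging(sizes)
```

With these three files the plain layout needs 4 blocks and the merged one needs 2, since the three tails (100, 904 and 300 bytes) fit in a single shared block.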