Benchmarking Linux Filesystems Part II
Anonymous Coward writes "Linux Gazette has a new filesystem benchmarking article, this time using the 2.6 kernel and showing ReiserFS v4. The second round of benchmarks include both the metrics from the first filesystem benchmark and the second in two matrices." From the article: "Instead of a Western Digital 250GB and Promise ATA/100 controller, I am now using a Seagate 400GB and Maxtor ATA/133 Promise controller. The physical machine remains the same, there is an additional 664MB of swap and I am now running Debian Etch. In the previous article, I was running Slackware 9.1 with custom compiled filesystem utilities. I've added a small section in the beginning that shows the filesystem creation and mount time, I've also added a graph showing these new benchmarks." We reported on the original benchmarks in the first half of last year.
An interesting analysis in every aspect, and it's fine and dandy for the person who uses 400 GB drives and a ATA controller on a 500MHz computer but I'd like to see how the filesystems compare on a bigass RAID system run by a Power5 server, or a few Itaniums that usually have with a few hundred connected users. Something a bit more "entreprise" - where the choice of a filesystem is a bit more critical than a small server or a home PC.
It is widely known that Reiser filesystems are heavy on CPU usage 4 more than 3. These benchmarks seem to show a CPU bound IO situation as opposed to an IO bound IO situation. As an earlier comment pointed out, the hardware used in this test was a 500mhz CPU. My slowest computer is a 1000mhz system, which is usually IO limited, not CPU limited. I'd be interested to see these same benchmarks run on real hardware, or some more complex benchmarks (random RW, DB load, etc.). The hardware used for this test would be suitable for a fileserver, but not much else. In that situation, E2, E3 or XFS are probably the right choices as it points out. What about desktop loads, enterprise loads, or something more interesting?
--Brandon
Here's what's missing. They forgot to tell you how well the drive performed after being used for 1 year, and having constantly moved data from one place to another, and constantly deleting and creating new data. It would have been a better test if the drive was about 75% full, with data from 2 years of use, and then the same tests were performed.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
I'll leave aside the fact that all the other benchmarks I've seen are very favourable to Reiser4 and this is very unfavourable, and concentrate on the discrepancies.
Reiser4 is the slowest in searching, creating and removing. It performs a lot better when tarring and untarring, which indicates that reading and writing is much better than other filesystems. However, when you get to copying and creating large files, it loses again.
Why the discrepancy? These benchmarks contradict others, but don't make sense when taken alone. I'm inclined to believe the other benchmarks.
I'm no expert by any means, but I think the idea behind the ReiserFS is breaking down the FS paradigm from the file level to the line level.
There is the classic example from the Reiser website. If your password file gets hacked, you have to ditch the whole file if you're using traditional file systems. You only know whether or not the file's been changed. However, with the Reiser system, it can tell you *what line*, and thus which user/password, was changed.
That's just a taste of where you can go with the ReiserFS. There are other things coming down the pipe; check out the reiser website for a better idea of the new features that ReiserFS promises.
Computers are useless. They can only give you answers.
-- Pablo Picasso
Reiser is not designed for slow CPUs. AFAIK, a key part of the design was the Hans Reiser realised that CPUs were vastly underused. IO resources were maxed out and CPUs were sitting idle. So he found ways to use the CPU to make more efficient use of the IO resources. So this benchmark on a 500Mhz machine will of course show Reiser in a bad light, and moving lower down to a 266Mhz will make it even worse.
For a decent benchmark of how filesystems work on modern hardware: use modern hardware.
Please help publicise swpat.org - the software patents wiki
Reiser4 now defaults to journalling everything - file data as well as metadata. If they left it like that, then no wonder it's slower - but it's the best choice if data integrity is important.
I am trolling
What am I saying? I want to know how efficent these filesystems are in packing the data on the HD.
I'm glad that you get more data out of Reiser v4, JFS, and XFS at formatting time, but my feeling is that Reiser v4 (once profiled, tweaked and refined for speed and space) will pack data tighter than anyone else. Meanwhile, I'm looking for something like ext3 that packs better.
--
# Canmephians for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.net";
I'm *sick* of reading filesystem benchmarks of people who doesn't even care about even reading the documentation of the filesystems they compare
OK, so ext3 is not the fastest filesystem on earth. But it has some default options which makes it suck even more than it usually do, and those options are *documented* in Documentation/filesystem/ext3.txt
* Ext3 does a sync() every 5 seconds. This is because ext3 developers are paranoid about your data and prefers to care about your data than win on benchmarks. Syncing every 5 seconds ensures you don't lose more than 5 seconds of work but it hurts on benchmarks. Other filesystems don't do it, if you are doing a FAIR comparison override the default with the "commit" mount option
* ext3's default journaling mode is slower than those from XFS, JFS or reiserfs, because it's safer. When ext3 is going to write some metadata to the journal, it takes care of writting to the disk the data associated to that metadata. XFS and JFS journaling modes do *not* care about this, neither they should, journaling was designed to keep filesystem integrity intact, not data, ext3 does it as an "extra", and it's slower because of that. But if you want to do a fair comparison, you should use the "data=writeback" mount option, which makes ext3 behave like xfs and jfs WRT to journaling. Reiserfs default journaling mode is like XFS/JFS, but you can make it behave like the ext3 default option with "data=ordered"
ext3 is not going to beat the other by using those mount options, but it won't suck so much, and the comparison will be more fair. And remember: ext3 tradeoffs data integrity for speed. There's nothing wrong with XFS and JFS, but _I_ use ext3.
Of course JFS won, since it was designed to be as simple as possible ... it's originating from OS/2, afterall. On such a machine as used in this test, this is a huge advantage.
You can't fsck an xfs mounted filesystem, even if it's mounted read-only. If your root fs gets damaged and you need to fsck it, you need to boot from a rescue CD. If it's a server in a remote location, you're shit outta luck.
ext3 and reiser at least let you fsck read-only mounted filesystems.
I brought up this problem to xfs developers and their response was "well, it's not a problem on SGIs so we're not going to fix it". Nice.
XFS is a nice filesystem, I like it. Not enough to use in production, but I like it. Personally I use reiserfs3.6 on many production servers, and have never seen a problem. I am experimenting with 4 at home.
I have a strong warning if you are considering XFS. If you don't have a GOOD power backup (UPS), then don't use it. XFS caches very agressively for writes in RAM. You lose power, you lose that data.
XFS was designed with datacenters with good power backups in place, not home users. So chose carefully.
-- Note: If you don't agree with me, don't bother replying. I won't read it.
While I basically agree with you, 500MHz is not four times slower than 2 GHz. However in this case it is probably worse, since a 500MHz PIII implies a slow 100MHz side bus, slow 33MHz PCI bus, slow PC100 memory. A terrible system for doing benchmarks in 2006! It is completely unrepresentative of anything.
Actually, I am getting angrier as I write this. It was just wrong to publish an article using such an outdated system. People worried about high FS performance are not going to be using anything like that.
Any time I've lost a drive to data corruption it was formatted Reiser, every attempt at using Reiser eventually resulted in massive data corruption. This was various hardware and distros. I don't know about the newest version but trice bitten forever XFS for me.
I've seen those benchmarks before, and last time I saw them, I decided Reiser was for me. I've been using it ever since.
Based on the new benchmarks, I'm serious considering moving to JFS. It seems to be much faster for my typical desktop usage.
I'm curious what hash function he was using with ReiserV4, though. TEA produces a more responsive filesystem, for me, than R5. Reiser defaults to R5, so I'm guessing he was using that, but I'd like to see the difference that TEA produces.