Slashdot Mirror


Benchmarking Linux Filesystems In New 2.6 Kernel

An anonymous reader writes "KernelTrap has an interesting article about a recent benchmark conducted to compare five journaling filesystems available with the current 2.6.0-test2 Linux development kernel. The tests were conducted with a very simple shell script, mainly timing how long it takes to copy, tar, and remove directories. Looks like reiser4 is the fastest filesystem at the expense of consuming much more CPU, with ext3 trailing a ways behind."

56 comments

  1. fs issues by Tirel · · Score: 4, Insightful

    Speed is nice, but I think the more important question is: how stable is it?

    1. Re:fs issues by BrokenHalo · · Score: 5, Informative
      I don't know about 2.6 yet, but Reiserfs hasn't let me down so far.

      For a long time I resisted moving away from ext2, as I didn't want the journalling overhead and didn't mind the occasional e2fsck. After I got a few power outages I changed my mind :-) but I haven't been disappointed with Reiserfs' speed.

    2. Re:fs issues by Anonymous Coward · · Score: 0, Troll

      Where I work, we had a Linux PDC with software raid1 and reiserfs for the data partition.
      We had fs corruption and the kernel would freeze when accessing certain directories of the reiserfs partition.
      Of course fsck.reiserfs --rebuild-tree didn't work.
      I consequently banned reiserfs from all our servers.

    3. Re:fs issues by WTFmonkey · · Score: 1

      How did you do that?

      <voice class="preacher">I ABJURE THEE, reiserfs! Get thee gone, and never darken my doorstep again! Away with you! The power of Christ compels you! The power of Christ compels you!</voice>

    4. Re:fs issues by metalmaniac1759 · · Score: 1

      ReiserFS sucks in stability (at least the earlier versions did).
      There were times when the power used to suddenly go off and when I rebooted there were parts of my files which had been swapped!! Really pissed me off and I switched to ext3 since then...

      Nandz.

    5. Re:fs issues by Anonymous Coward · · Score: 0

      Easy, we fired everyone who used ReiserFS. Only took one or two and the rest smartened up real quick.

    6. Re:fs issues by Anonymous Coward · · Score: 0

      Have you ever been disappointed by its lack of a functioning, stable "fsck"?

      I have.

  2. karma for Anonymous Coward by Anonymous Coward · · Score: 5, Informative

    The first number is the time, in seconds, to complete the test (lower
    is better). The second number is the CPU use percentage (lower is better).

    reiser4 171.28s, 30%CPU (1.0000x time; 1.0x CPU)
    reiserfs 302.53s, 16%CPU (1.7663x time; 0.53x CPU)
    ext3 319.71s, 11%CPU (1.8666x time; 0.36x CPU)
    xfs 429.79s, 13%CPU (2.5093x time; 0.43x CPU)
    jfs 470.88s, 6%CPU (2.7492x time; 0.20x CPU)

    What's interesting:
    * ext3's syncs tended to take the longest (10 seconds), except that JFS took a whopping 38.18s on its final sync
    * xfs used more CPU than ext3 but was slower than ext3
    * reiser4 had highest throughput and most CPU usage
    * jfs had lowest throughput and least CPU usage
    * total performance of course depends on how IO or CPU bound your task is

    1. Re:karma for Anonymous Coward by dustman · · Score: 4, Interesting

      I think another interesting metric to look at would be cpu time used. If one of them took 90% CPU for 10 seconds, that would be a big winner imo.

      reiser4 171.28s @ 30% CPU = 51.384s CPU
      reiserfs 302.53s @ 16% CPU = 48.4048s CPU
      ext3 319.71s @ 11% CPU = 35.1681s CPU
      xfs 429.79s @ 13% CPU = 55.8727s CPU
      jfs 470.88s @ 6% CPU = 28.2528s CPU
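
      The arithmetic above can be reproduced with a short script. This is only a sketch: the wall-clock times and CPU percentages are the figures quoted in this thread, and the names RESULTS and cpu_seconds are made up for illustration.

```python
# Wall-clock seconds and CPU percentage per filesystem, as quoted in this thread.
RESULTS = {
    "reiser4": (171.28, 30),
    "reiserfs": (302.53, 16),
    "ext3": (319.71, 11),
    "xfs": (429.79, 13),
    "jfs": (470.88, 6),
}


def cpu_seconds(wall_seconds, cpu_percent):
    """Approximate CPU time consumed: wall-clock time times CPU fraction."""
    return wall_seconds * cpu_percent / 100.0


if __name__ == "__main__":
    # Sort by CPU seconds so the cheapest filesystem (in CPU terms) comes first.
    ranked = sorted(RESULTS.items(), key=lambda kv: cpu_seconds(*kv[1]))
    for name, (wall, pct) in ranked:
        print(f"{name:9s} {wall:7.2f}s @ {pct:2d}% CPU = {cpu_seconds(wall, pct):7.4f}s CPU")
```

      Sorting by CPU seconds rather than by percentage puts jfs and ext3 ahead of reiser4, which is the point of looking at this metric.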

    2. Re:karma for Anonymous Coward by Ed+Avis · · Score: 1

      CPU is certainly a good metric to look at. But over the long term, CPUs speed up a lot faster than disks speed up, so using more CPU to save a little disk access is usually a good tradeoff.

      Many systems, however, do not run many entirely CPU-bound processes. If the CPU would otherwise be idle, it doesn't matter (apart from power consumption in a laptop) whether the usage by the filesystem is 10% or 90%. Only if there are other things wanting to run does it make a difference.

      An interesting benchmark would be to run something that has a mixture of CPU-bound and IO-bound work; perhaps 'find . -type f | xargs -P 4 lzop'. lzop is a fast compression program and -P 4 tells xargs to run up to four of them at once, so the CPU will be kept busy. Given a large number of small files to compress in this way, which filesystem is fastest?
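
      A rough way to try this kind of mixed CPU and IO load without lzop is sketched below in Python. It is only an illustration under assumptions: zlib at level 1 stands in for lzop, a pool of four threads stands in for 'xargs -P 4', and the names compress_file and compress_tree are invented here.

```python
import os
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_file(path):
    """Read one file, compress it in memory, and write the result to <path>.z."""
    with open(path, "rb") as src:
        data = src.read()
    with open(path + ".z", "wb") as dst:
        dst.write(zlib.compress(data, 1))  # level 1 favours speed over ratio
    return len(data)


def compress_tree(root, workers=4):
    """Compress every file under root with up to `workers` jobs at once.

    Returns (number of files, total input bytes, elapsed wall-clock seconds).
    """
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
        if not name.endswith(".z")  # skip output from an earlier run
    ]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(compress_file, paths))
    return len(paths), sum(sizes), time.perf_counter() - start
```

      Running compress_tree on the same tree of small files on each filesystem, from a cold cache, would give one wall-clock number per filesystem to compare.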

      --
      -- Ed Avis ed@membled.com
  3. Speed is not of the essence by mnmn · · Score: 5, Informative

    There are other things about filesystems that we sysadmins need to know about. Which is the most stable and crashproof filesystem? I've suspected it to be XFS, from which I recovered data after doing dd if=/dev/zero of=/dev/hda1 count=2048576

    Also, which filesystem would require the least syncing, or none at all? BeFS?

    In server environments with stripped 15K Cheetah SCSI drives, you'd worry more about stability than speed.

    --
    "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    1. Re:Speed is not of the essence by spokes · · Score: 1
      > I recovered data [from XFS] after doing dd if=/dev/zero of=/dev/hda1 count=2048576
      Cool. How?
    2. Re:Speed is not of the essence by Molina+the+Bofh · · Score: 3, Funny
      Pick one of the three answers:

      With the new and improved Magnetic Forensics Tunnelling Electronic Microscope, used by the FBI

      He had a backup

      XFS takes so long to sync that by the time it would start to sync, he turned his computer off

      --
      Roses are #FF0000, Violets are #0000FF, find / -name '*base*' |xargs chown -R us && mv zig greatjustice
    3. Re:Speed is not of the essence by mnmn · · Score: 3, Interesting

      I did a dd to dump the whole filesystem to a file on a larger filesystem for recovery purposes. The XFS tools that came with RedHat 7.1 seemed to crash, so I did a small Slackware install and went from there. xfs_restore and other tools (one sounding like xfs_sync or something) found my third superblock, which I used to reconstruct the other superblocks. The root and early directories were gone, and the files I needed were in the root directory, but xfs_restore linked them all to numbered directories and files. A few greps later I was in a directory that had those files all intact. I recovered around 80% of all the files, and 100% of the ones I was seeking. Just because I succeeded in one major recovery op from an XFS filesystem, I feel more confident with it than with ext3 or reiser. I haven't really compared them.

      --
      "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    4. Re:Speed is not of the essence by skinfitz · · Score: 1

      > In server environments with stripped 15K Cheetah SCSI drives, you'd worry more about stability than speed.

      I think if I had stripped drives I'd be more worried about what sort of sicko strips a hard drive, and is he still hiding in the server room?

    5. Re:Speed is not of the essence by Bernie · · Score: 2, Interesting
      > In server environments with stripped [sic] 15K Cheetah SCSI drives, you'd worry more about stability than speed.


      I'm not convinced. ext3 under 2.4 on SMP can be cripplingly slow entirely for software reasons--lock_kernel() being the biggest culprit.

      Having said that, my new big fileserver is going to be ext3 on 2.6 (eventually!); the data volumes being on a 12-way RAID-5 set and the journals on a RAID-1 pair. It seems to perform adequately with dir_index and sparse_super.

      The main thing that swings it for me is the brilliant e2fsck.

  4. Incomplete comparison? by Futurepower(R) · · Score: 4, Interesting


    From the article:
    "reiser4 171.28s, 30%CPU (1.0000x time; 1.0x CPU)
    reiserfs 302.53s, 16%CPU (1.7663x time; 0.53x CPU)

    What's interesting:
    * reiser4 had highest throughput and most CPU usage"


    The comparison seems incomplete to me. Reiser4 took about half the time, with twice the CPU usage.

    Total work done by the CPU = Percent * Time.

    Reiser4 did the work in half the time, but the total work was roughly equal. Actually, ReiserFS was more efficient considering total CPU cycles.

    1. Re:Incomplete comparison? by Anonymous Coward · · Score: 0

      However, these numbers would change pretty drastically based on the relative speeds of the cpu vs harddrive/controller. As processor speeds continue to increase faster than harddrive speeds, if possible, you'd like to spend more cpu cycles if it allows you to access the harddrive less.

    2. Re:Incomplete comparison? by makapuf · · Score: 5, Insightful

      Except that if you have a mostly idle CPU and your task is more & more waiting for the disk to complete, you don't care about 11 or 30 % if the other 89 or 70 % of the CPU are idle.

      Comparing CPU cycles needed is NOT a fair benchmark, unless your task is CPU _AND_ IO Bound. (if it's not io bound, take whatever fs you have, it doesn't matter.)

      The benchmark was right in giving results as TWO parameters: CPU used and time spent. (It might be interesting to see how it depends on the drive type or the CPU arch, though.)

    3. Re:Incomplete comparison? by p3d0 · · Score: 3, Informative
      I disagree. Consider:
      • FS1: 1sec CPU time, 1sec IO time. CPU usage=50%.
      • FS2: 2sec CPU time, 6sec IO time. CPU usage=25%.
      FS1 is undoubtedly better. It consumes less CPU and IO time. Yet noobs would complain about the high CPU usage of FS1. The truth is, CPU usage is higher only because IO is more efficient.

      I can think of no corresponding pathological case in which CPU load would be a more appropriate predictor than CPU time.
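
      To make the FS1/FS2 comparison concrete, here is a small sketch. It assumes CPU work and IO waits never overlap (as the example above implicitly does), and the function name cpu_usage_percent is invented for illustration.

```python
def cpu_usage_percent(cpu_s, io_s):
    """CPU usage as a benchmark would report it: CPU time over total elapsed time.

    Assumes CPU work and I/O waits do not overlap.
    """
    return 100.0 * cpu_s / (cpu_s + io_s)


fs1 = {"cpu": 1.0, "io": 1.0}
fs2 = {"cpu": 2.0, "io": 6.0}

for name, fs in (("FS1", fs1), ("FS2", fs2)):
    total = fs["cpu"] + fs["io"]
    print(f"{name}: {total:.0f}s total, {fs['cpu']:.0f}s CPU, "
          f"{cpu_usage_percent(fs['cpu'], fs['io']):.0f}% CPU usage")
# FS1: 2s total, 1s CPU, 50% CPU usage
# FS2: 8s total, 2s CPU, 25% CPU usage
```

      FS1 reports double the CPU percentage while using half the CPU seconds and a quarter of the wall time.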

      > if you have a mostly idle CPU and your task is more & more waiting for the disk to complete, you don't care about 11 or 30 % if the other 89 or 70 % of the CPU are idle.
      Exactly. That's why CPU load doesn't matter.
      > Comparing CPU cycles needed is NOT a fair benchmark, unless your task is CPU _AND_ IO Bound
      No, CPU cycles matter for CPU-bound tasks. I think that's pretty self-evident.
      > (if it's not io bound, take whatever fs you have, it doesn't matter.)
      Wrong. In a compute-bound task, CPU time adds directly to the total execution time of the task, so you should choose the FS with the lower total CPU time.
      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    4. Re:Incomplete comparison? by Yaztromo · · Score: 1
      > Exactly. That's why CPU load doesn't matter.

      I don't completely agree. Sure, CPU load doesn't matter in a single-tasking system, but if I'm running a multi-tasking system I don't want it to noticeably "hiccup" each and every time some background task has to do some disk I/O. The system should _not_ bog down just because Mozilla is cleaning out its cache -- it should still be responsive.

      Obviously, priority levels can have a big effect on this sort of thing -- but in a multitasking environment where you are utilizing most of the CPU with various tasks, having high CPU usage for I/O can be detrimental. If it's high enough, the system will start to feel like Windows 3.1, where everything stopped until I/O was complete.

      Yaz.

    5. Re:Incomplete comparison? by p3d0 · · Score: 1

      Ok, that's true. So it's not totally irrelevant. But I'd certainly consider CPU time first.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  5. Another thought... by BrokenHalo · · Score: 2, Interesting

    It seems to me that these benchmarks would probably have been more meaningful (and useful) if they had run the same tests against 2.4.21. Anybody got any thoughts on this? I know there are low-latency features and so forth with the new kernel, but I would be curious to know if that has any real impact on disk I/O.

  6. Kerneltrap on new server.. by molo · · Score: 3, Informative

    FYI, kerneltrap just moved to a new server (this week!). It used to be run on Jeremy's DSL line, which was why it would get shot to shit whenever it got slashdotted.

    Now it's in a colo on new (fast) hardware (paid for by users' donations), and upgraded to Drupal 4.2 (faster here too).

    So slashdot away!

    -molo

    --
    Using your sig line to advertise for friends is lame.
  7. Blah... This was on lkml... by j.e.hahn · · Score: 5, Informative

    And a number of people complained that it wasn't a great benchmark. Hans Reiser admitted it was just a quickie, and I forget who it was that said it, but ext3 has some performance enhancements that are on the cusp of merging into Linus' tree from the -mm kernels.

    Wait until 2.6 is out folks. These numbers are still open for mass fluctuation.

    1. Re:Blah... This was on lkml... by Webmonger · · Score: 1

      No, this is a newer set of benchmarks.

  8. Stability will be fixed by r6144 · · Score: 1

    Stability bugs will be fixed if the performance is good enough, but speed problems can be hard to fix without altering the most basic things about a filesystem.

    1. Re:Stability will be fixed by T-Ranger · · Score: 3, Interesting
      Speed problems can be solved by throwing hardware at the problem. Faster disks, more RAM, more servers. A filesystem not designed with stability in mind won't be stable regardless of the hardware.

      Stability should be at the top of the list when designing a filesystem. If it's a question of stability or speed, stability should win.

    2. Re:Stability will be fixed by be-fan · · Score: 1

      Which is why Reiser4 is so great. While it may yet have some bugs (understandable for a brand-new FS running on a development-series kernel!) it was designed with robustness in mind. It does data-journaling in addition to metadata-journaling, so your file data is protected as well. And it does all this while being really fast!

      --
      A deep unwavering belief is a sure sign you're missing something...
  9. unBuggyness vs. Robustness by r6144 · · Score: 1

    I think by "stability" you mean "robustness", so that data loss can be avoided in a power outage or other out-of-spec situations (things supposed to be fixed with data journaling, atomic transactions, etc.), while I mean "lack of bugs". It is true that robustness is a problem that should be figured out at design time and it can be more important than speed, but I think the original poster means unbugginess by stability. Anyway, reiser4 and ext3 (in ordered or data journaling mode) are robust enough, but the former is new and buggy.

  10. grandparent insightful by mikeee · · Score: 1

    Right, but they gave CPU % used, rather than the number of cycles used during the test - Reiser's CPU numbers looked higher *because* it was fast in terms of wall time.

  11. Oracle posted some stats by Stone316 · · Score: 5, Informative

    With respect to filesystems and database performance. EXT3 came out on top.

    --
    "Thanks to the remote control I have the attention span of a gerbil."
    1. Re:Oracle posted some stats by nyteroot · · Score: 2, Informative

      In that article, 4 filesystems are used: ext2, ext3, ReiserFS, and JFS. However, Reiser4, which is the clear leader in these benchmarks, was not tested.

      --
      Ratio of replies to old sig content : replies to actual post content > 0.5. Sig changed.
    2. Re:Oracle posted some stats by Gherald · · Score: 1

      Reiser4 wasn't available back in Oracle 8i days.

      Those tests are very outdated.

  12. CPU Usage?! by metalmaniac1759 · · Score: 4, Funny

    30%-40% CPU Usage - whoa. What happens to poor me - with a PIII 550 MHz, 128 MB SDRAM and KDE running all the time.

    I switched from RedHat to Gentoo and KDE stopped crawling. God knows what will happen with ReiserFS4.

    Nandz.

    1. Re:CPU Usage?! by p3d0 · · Score: 1
      The CPU usage of ReiserFS is higher partially because it spends less time on IO. For IO-bound tasks, even your lowly machine would see an improvement.

      Besides, your P3 550 would have a slower disk.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    2. Re:CPU Usage?! by Anonymous Coward · · Score: 0

      ReiserFS is even better to use on older systems. I switched to ReiserFSv3 from ext3 on my 333 MHz comp and I saw a significant improvement. It definitely makes up for the slower disks. I can't wait for Reiser4.

  13. Stupid question by Hard_Code · · Score: 2, Insightful

    Why do we need fast file systems? Or rather, why should we spend so much effort on performance differences of such small magnitude (take out reiser4 and its high cpu usage)? The only things I can think of are: swapping memory - this will really slow the system down, because disk is orders of magnitude slower than main memory; and databases that absolutely require high performance. But both of these typically *already* use custom file systems (or raw partitions) tailored to their exact needs. As a normal user (somebody who is not constantly copying Mozilla source trees around), why should this matter? I can't remember the last time I thought "boy, this file access was slow, I wish I had a faster file system".

    --

    It's 10 PM. Do you know if you're un-American?
    1. Re:Stupid question by anthonyrcalgary · · Score: 1

      Something I haven't seen anyone mention...

      Could ReiserFS be used as the custom file system for a database? With its extensibility and metadata, it's suitable as the basis of at least some databases.

      > As a normal user (somebody who is not constantly copying Mozilla source trees around), why should this matter? I can't remember the last time I thought "boy, this file access was slow, I wish I had a faster file system".

      Well... a journaling file system is useful for data integrity, which is useful to almost everyone, but if most journaling filesystems are a lot slower than stuff like ext2, then the performance is a big deal because users get integrity without a big speed drop.

      The only cases I can think of where it wouldn't be beneficial are systems like laptops where you want the hard drive to spin down whenever possible, and cases where you want secure file deletion. Overwriting your file a hundred times won't make it unrecoverable if all those writes get journaled somewhere else.

      --
      When someone might yell at me, it has to be OpenBSD.
    2. Re:Stupid question by aszaidi · · Score: 1

      > Why do we need fast file systems?

      Try streaming a high quality video or two from your drive. An efficient file system will really make it go smoothly, leaving most of the CPU free for the decoder.

    3. Re:Stupid question by Arandir · · Score: 2, Informative

      When you're streaming a file, you're streaming a single file. All file systems will be roughly equivalent at this point. After all, once the harddrive has seeked to the starting sector, odds are very high that the data will be contiguous.

      Where you want the efficiency is when you have to deal with a large number of files. For example, I use UFS on FreeBSD. Before I used softupdates, untarring the ports tree severely bogged down the system, because there were tens of thousands of very small files. But after I turned on softupdates, the same operation was extremely fast. But moving a single large file around (tarball of my homedir) didn't make any difference if softupdates was on or off.

      --
      A Government Is a Body of People, Usually Notably Ungoverned
    4. Re:Stupid question by pair-a-noyd · · Score: 1

      I work with video files that are average size of 5 gigabytes per file, some as large as 11 gigabytes each.

      Also there are people that deal with huge scanned images (as I have in the past) that are many hundreds of megabytes per image.

      Speed is critical.

    5. Re:Stupid question by Anonymous Coward · · Score: 0

      > I can't remember the last time I thought "boy, this file access was slow, I wish I had a faster file system".

      You don't run KDE, do you? ;-)

  14. quick question by perlchild · · Score: 3, Interesting

    Quick question: what mount options were used on each? Wouldn't ext3 appear slower than the others if it was data-journaling? (I read the article a bit fast, but I didn't see any "how we tested".) Also, wouldn't a good test take at least two hours on each filesystem?

    1. Re:quick question by Anonymous Coward · · Score: 0

      FS performance is very important for daily work, though most people are focused on CPU speed. How long does it take for your favorite browser to start? How long does it take for your OS to boot? Improved I/O scheduling or data structures can make a big difference. My biggest complaint about notebook computers? Slow disk I/O. Cheers.

    2. Re:quick question by perlchild · · Score: 2, Informative

      Some of that, however, has nothing to do with FS design: browser start speed has DNS components, thread starting, tuning. OS booting also has lots more factors than just the filesystem. On one of my systems, using XFS, even with a dirty start and fsck of five filesystems, the part before the fsck is only 1% of the start of the machine. Why? DNS resolution by the daemons I need started near the end of boot.

  15. Because people use Linux at work too. by tdyson · · Score: 2, Informative

    I care about performance because I have a few hundred thousand files on servers that 100 employees need to access. Caching will only get you so far when you have a lot of people going after a lot of files throughout the day. Since we have the choice of which fs to use, it is nice to have more info to pick between them.

    There is more to this world than home users and databases.

    1. Re:Because people use Linux at work too. by Arandir · · Score: 1

      So business people want fast file systems. Game geeks want to dump XFree86.

      If you give everyone what they want, you'll end up with a system that can do everything, but nothing, at the same time.

      --
      A Government Is a Body of People, Usually Notably Ungoverned
  16. One word: Maildir by Gothmolly · · Score: 2, Informative

    If I have 2500 4k files in a directory, and my filesystem uses a stupid algorithm to store that information, it might take 2 or 3 times longer to retrieve a list of files than it does on a decent filesystem. When you've got a webmail frontend to your Maildir, and your webmail code has to grope through all directories, all files, even small differences are HUGE. Clever FS design will win the day.
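
    A crude way to measure the listing cost described above is sketched here. The numbers (2500 files of 4k) come from this post; the function name time_listing is invented, and because the files are freshly written the listing will mostly hit the cache, so treat it as a starting point rather than a real benchmark.

```python
import os
import tempfile
import time


def time_listing(num_files=2500, file_size=4096):
    """Create num_files small files in a scratch directory and time one full listing.

    Returns (entries seen, seconds taken by os.listdir). The scratch directory
    lives on whatever filesystem backs the system temp dir, so the result
    describes that filesystem only.
    """
    with tempfile.TemporaryDirectory() as maildir:
        payload = b"\0" * file_size
        for i in range(num_files):
            with open(os.path.join(maildir, f"msg{i:05d}"), "wb") as fh:
                fh.write(payload)
        start = time.perf_counter()
        entries = os.listdir(maildir)
        return len(entries), time.perf_counter() - start
```

    Pointing the temp dir at different mounted filesystems (for example through the TMPDIR environment variable) would let the same call be compared across them.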

    --
    I want to delete my account but Slashdot doesn't allow it.
  17. How much RAM? by Anonymous Coward · · Score: 0
    Ext2 uses little RAM on my old P133 (32 MB RAM, 8 GB).

    How much RAM does reiser4 use?

    open4free

  18. record locking by Anonymous Coward · · Score: 0

    Unix traditionally doesn't support record locking for a portion of a file. For portability, this is best handled at the application (db engine) level.

    As far as using the fs itself as a rudimentary hash, FreeDB does exactly that: the compact disk ID (along with some other stuff) forms the lookup key, which is used as the filename. The Postfix MTA also uses fs-based hashing, with the first two characters of the message queue ID being used also as a subdirectory and sub-subdirectory name, to help speed up directory searches.
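
    The directory-hashing idea can be sketched in a few lines. This illustrates the general technique only, not Postfix's actual code or exact on-disk layout; the function name hashed_path and the example queue ID are hypothetical.

```python
import os


def hashed_path(base_dir, key, depth=2):
    """Spread entries over nested subdirectories named after the key's leading characters.

    With depth=2, key "A1B2C3" maps to base_dir/A/1/A1B2C3.
    """
    if len(key) < depth:
        raise ValueError("key is shorter than the requested hash depth")
    # Each of the first `depth` characters becomes one directory level.
    return os.path.join(base_dir, *key[:depth], key)
```

    Each directory then holds only a fraction of the entries, which keeps lookups short on filesystems that search directories linearly.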

  19. JFS's figures by Anonymous Coward · · Score: 1, Informative

    One thing not mentioned in the kerneltrap discussion is the fact that JFS uses lazy allocation, and a complete dissociation of inum and block number. Under JFS, data space on disk isn't allocated until it's ready to be flushed from the buffer, which accounts for the 38 seconds of sync time at the end of the tests.

  20. Some XFS performance tweaks by Simon+Kongshoj · · Score: 2, Informative

    Greetings gentlebeings,

    Since I recently reinstalled my Debian system, I decided to put some effort into implementing my filesystems right. I decided on XFS for various reasons, mainly that it has always been rock solid for me (I've had some problems with ReiserFS causing heavy data corruption -- it's a long time ago and they've undoubtedly improved the system since, but still I prefer XFS since I've never even once had a problem with it, and know nobody who has), and that its good large-file performance is more useful to me than Reiser's kickass small-file performance. Also EAs and ACLs are neat. What sucks about XFS is poor small-file performance and abysmal delete performance.

    Anyways, I made a few bonnie++ runs and messed around with some of the many mkfs and mount options of XFS. In the end, my tweaked XFS filesystems beat ext3 (mode=ordered) for delete performance, which was a substantial improvement over a standard XFS mount.

    I made a writeup about the whole procedure at Everything2. Go slashdot the poor bastards. Warning: The language is tailored to the fact that not all E2 users are geek hardcore.

    --
    Six sick .sigs, the Number of the Beast!
  21. Improvement by Evolution by Jeppe+Salvesen · · Score: 1

    By having several file systems competing, the developers feel the stakes are higher and work harder to be the best. We also see a bit of differentiation, in that the different file systems typically are good at different things. In my experience, XFS is great for largish mysql databases, while ReiserFS does an excellent job of handling many small files (like your /tmp).

    So, as long as Linus only allows stable filesystems in the stable kernels, I'm all for file system innovations making their way into the kernel! Not everyone needs 'em, but some do!

    --

    Stop the brainwash