Benchmarking Linux Filesystems Part II

Very interesting article... by toofast · 2006-01-06 05:36 · Score: 4, Interesting

An interesting analysis in every aspect, and it's fine and dandy for the person who uses 400 GB drives and an ATA controller on a 500MHz computer, but I'd like to see how the filesystems compare on a bigass RAID system run by a Power5 server, or a few Itaniums that usually have a few hundred connected users. Something a bit more "entreprise" - where the choice of a filesystem is a bit more critical than a small server or a home PC.

Re:Very interesting article... by CastrTroy · 2006-01-06 05:41 · Score: 2, Insightful

I'd like to see how they perform on a 12 GB Disk on a P2 266. You really start to see the differences when working on older hardware.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Very interesting article... by 110010001000 · 2006-01-06 05:42 · Score: 0

Toofast, if you (or anyone) would like to donate to me such a setup I would be happy to reperform the benchmarks and share them with the community. I was only able to test with the equipment I had on hand. Just drop me an email.

Thanks
Re:Very interesting article... by chrismcdirty · 2006-01-06 05:45 · Score: 1

Reiser4 kills old disks, supposedly. I (mistakenly) used it on my (at the time) year old laptop, and after about 4-6 months I kept getting something like "drive seek complete" errors. It got to the point where it wouldn't boot because of the errors. So I had to reinstall everything, and no data could be saved since reiser4 hated my drive. Been running Reiser3 ever since, and I haven't had any problems.... yet.

--
It's like sex, except I'm having it!
Re:Very interesting article... by cli_man · 2006-01-06 06:11 · Score: 0

I had 3 Squid cache servers that I ran off pretty old, cheap hardware using Reiser, and I found the best way to work it was to have 4 or 5 drives for the cache for best performance. Reiser worked great, but on the older equipment they had trouble with concurrent open files etc. By adding more drives I could have a lower end PIII handling 2,000 - 3,000 open files at a time, pretty impressive I would say.

--
The nice thing about Windows is - It does not just crash, it displays a dialog box and lets you press 'OK' first. Reg
Re:Very interesting article... by Captain+Segfault · 2006-01-06 06:20 · Score: 3, Interesting

It is completely absurd for a filesystem to kill a disk. If you were getting those errors (with the "drive ready" and "seek complete" bits being set being most common) it *strongly* suggests that either your disk is broken or it is improperly powered.

If you're actually using that disk, still, have a look at it with smartctl. In particular, run "smartctl -t long" on it, and have a look at the results. If it doesn't pass that, don't even think of trusting it with your data.
Re:Very interesting article... by chrismcdirty · 2006-01-06 06:39 · Score: 2, Interesting

'Kill' was a little strong for how I meant to use it. What I really meant to say (and can't now find any data backing me up) is that Reiser4 deals with the disk so intensely that it uncovers flaws and errors that other filesystems may (A) never find, or (B) live with.

I'd look through the Namesys page, but it's large and the TOC didn't reveal any warnings.. or I wasn't looking hard enough.

--
It's like sex, except I'm having it!
Re:Very interesting article... by flaming-opus · 2006-01-06 07:18 · Score: 1

You raise an excellent point, though the only way to get ahold of enough hardware to make that test interesting is to get the system vendor to provide the hardware, in which case you often have limited ability to publish any results they don't like. (Been there, didn't publish that)

Furthermore, once you get into that high-end of a system, you're generally not all that interested in "general purpose" benchmarks. I have a lot of experience benchmarking filesystems on high-end systems. (15GBytes/s and so on) In those cases you're benchmarking everything: the application, the filesystem, the filesystem settings, the operating system, the OS settings, multipathing drivers, san environments, raid controllers, down to even the disk drives in the raids. It's hard to isolate the filesystem from this mess, except in the performance of the particular application.

In a sense, generic benchmarks only make sense on small servers and workstations, as you run a diverse set of applications, and have a limited set of hardware, that changes only modestly with time (though 500mhz is getting pretty antique there dude). Benchmarking a dual 2.4 ghz dell slab with a mirrored pair of 10k scsi drives might be a little more useful, as there are a LOT of those out there running linux. Benchmark mail-serving, web-serving, file-serving. Since these are the sweet-spot for linux servers, benchmarking these things would probably be most instructive to the broadest group of people. The microbenchmarks Mr. Piszcz runs are a little too workstation-like for my tastes. I don't consider workstation disk performance to be all that important, at least compared to server tasks.
Re:Very interesting article... by Karol+Trojanowski · 2006-01-06 08:15 · Score: 1

I would like to see someone do this test on a "QuantumOptical 6.1 GHz AtomChip Ultra Portable Wireless Tablet-notebook"
Re:Very interesting article... by Anonymous Coward · 2006-01-06 08:47 · Score: 0

'Something a bit more "entreprise"' --- Keep in mind that those "enterprise" servers tend to be busy doing other things, serving web pages, running a mail system or database, providing file services, compiling, etc, so you do not want a file system that requires a lot of CPU, especially if you use software RAID. Think about what happens in the case of software raid combined with software compression during backup of a big filesystem. You didn't want to use that server while it is backing up, did you?
Re:Very interesting article... by budgenator · 2006-01-06 10:09 · Score: 1

A journaling file system for squid caches seems like overkill; what's the rationale for Reiser, with a historic slowness with smallish files, over ext2? Isn't that kinda like doing back-ups on /tmp?

--
Apocalypse Cancelled, Sorry, No Ticket Refunds
Re:Very interesting article... by cli_man · 2006-01-06 12:41 · Score: 0

In playing around with different filesystems Reiser seemed to work the best, for the amount of files it seemed to be the most stable (1,000,000 files per drive). Also I was using Reiser V3 which I have heard is not as processor intensive.

--
The nice thing about Windows is - It does not just crash, it displays a dialog box and lets you press 'OK' first. Reg

pretty dry, except... by samyool · 2006-01-06 05:40 · Score: 1, Funny

wow, what a dry article.

However, scroll to the bottom. More latin translations than you can shake a stick at, including my personal favorite:

I have a terrible hangover.
Crapulam terriblem habeo.

-S

Need to be careful... by Conor+Turton · 2006-01-06 05:42 · Score: 3, Insightful

One thing this does show is that you need to be very careful to match the filesystem type to the main tasks the PC is going to be used for. Personally, there's no real clear winner as all have major gains or deficiencies in some areas. One very interesting point was the vast difference in the amount of available space after a partition and format between the different filesystems.

--
Conor "You're not married,you haven't got a girlfriend and you've never seen Star Trek? Good Lord!" - Patrick Stewart

Re:Need to be careful... by Raphael · 2006-01-06 06:19 · Score: 4, Insightful

One very interesting point was the vast difference in the amount of available space after a partition and format between the different filesystems.

Unfortunately, that graph is rather misleading. The ext2 and ext3 filesystems keep some percentage of the disk space as "reserved" and only root can write to this reserved area. This is useful if the disk contains /var or other directories containing log files, mail queues and other stuff. Even if a normal user has filled the disk to 100%, it is still possible for some processes owned by root to store some files until an administrator can fix the problem. On the other hand, if your filesystem contains only /home or other directories in which users are not competing for disk space with processes owned by root, then it does not make much sense to have a lot of disk space reserved for root. That is why you should think about how the filesystem is going to be used when you create it, and set the amount of reserved space accordingly.

The default behavior for both ext2 and ext3 is to reserve 5% of the disk space for root. You can see it in the section Creating the Filesystems from the article:
4883860 blocks (5.00%) reserved for the super user
You can change this behavior with the -m option, specifying the percentage of the disk space that is reserved. The article did not mention how the filesystem was supposed to be used if it had been used in production. However, I would guess that the option -m 0 or maybe -m 1 could have been used in this case. This would have provided a fair comparison and suddenly you would have seen all filesystems in the same range (close to 373GB available), except maybe for Reiser3.
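
As a rough illustration of how much space the -m setting controls, here is a minimal sketch of the arithmetic. The reserved block count is taken from the mke2fs line quoted above; the 4 KiB block size is an assumption (the quote does not state it), although it is consistent with a 400 GB drive. The function name is made up for this example.

```python
# Figures from the quoted mke2fs line; the block size is an assumption.
RESERVED_BLOCKS = 4_883_860   # "4883860 blocks (5.00%) reserved for the super user"
RESERVED_PERCENT = 5
BLOCK_SIZE = 4096             # assumed, not stated in the quote

GIB = 1024 ** 3


def reserved_gib(total_blocks: int, percent: float, block_size: int = BLOCK_SIZE) -> float:
    """Space held back for root, in GiB, for a given -m percentage."""
    return total_blocks * (percent / 100) * block_size / GIB


# Approximate total block count implied by "5.00% == 4883860 blocks".
total_blocks = RESERVED_BLOCKS * 100 // RESERVED_PERCENT

for percent in (5, 1, 0):
    print(f"-m {percent}: {reserved_gib(total_blocks, percent):.2f} GiB reserved")
```

Under these assumptions the default 5% holds back a little under 19 GiB, which would account for most of the gap between ext2/ext3 and the other filesystems in the available-space graph.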

--
-Raphaël
Re:Need to be careful... by flaming-opus · 2006-01-06 07:27 · Score: 2, Informative

except you don't want to do this. As disks approach full, the contiguous stretches of free-space approach length zero, due to fragmentation. This is true on all filesystems. The result of this is that space allocation on a 98% full disk is much much slower than on a 2% full disk. With disks as cheap as they are, one shouldn't be sitting around with 95% full disks. If that's the case, there are work-flow/administration issues that need to be worked out, rather than unlocking that last little bit of space.

As I recall, the default on xfs for irix was to reserve the top 10% for root only.
Re:Need to be careful... by Anonymous Coward · 2006-01-06 08:48 · Score: 0

Of course you can use tune2fs later (at any moment, really) to modify this root-reserved percentage. For really large drives (which are more and more common), where most may be your /home partition (I have some partitions of over 160GB) the extra few percent may be enough to park that extra load of photos from your 1GB SDcard, rip that audio cd into, or whatever, until you find the time to burn more stuff to dvd or sort out what you can throw away... man tune2fs for more info.
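
Before changing the reserved percentage, it can help to see how much free space is currently out of reach of normal users. The following is a small sketch using Python's os.statvfs; the function name is hypothetical, and the result reflects the current state of the filesystem rather than the configured percentage (on a nearly full disk it can be smaller than the configured reserve).

```python
import os


def root_reserved_bytes(path: str = "/") -> int:
    """Free bytes on the filesystem holding `path` that only root may use."""
    st = os.statvfs(path)
    # f_bfree counts all free blocks; f_bavail counts the free blocks
    # available to unprivileged users. Both are in units of f_frsize.
    return (st.f_bfree - st.f_bavail) * st.f_frsize


if __name__ == "__main__":
    mib = root_reserved_bytes("/") / 1024 ** 2
    print(f"{mib:.1f} MiB of free space on / is reserved for root")
```

On filesystems without a root reserve the two counters are normally equal and the function returns 0.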
Re:Need to be careful... by Anonymous Coward · 2006-01-06 09:01 · Score: 0

Hmm no. For a user, a graceful performance degradation is much better than a sudden disk full error at 95% utilization.

Re:First Prime Factorization Post by __aaclcg7560 · 2006-01-06 05:50 · Score: 1

Is "First Prime Factorization" what you do when you have too much time on your hands? Personally, I would put my time to better use by playing Quake 4 and blowing zombies to kibbles 'n' bits on a fast hard drive subsystem. :P

Hardware mismatch by lostlogic · 2006-01-06 05:52 · Score: 5, Interesting

It is widely known that Reiser filesystems are heavy on CPU usage, 4 more than 3. These benchmarks seem to show a CPU bound IO situation as opposed to an IO bound IO situation. As an earlier comment pointed out, the hardware used in this test was a 500mhz CPU. My slowest computer is a 1000mhz system, which is usually IO limited, not CPU limited. I'd be interested to see these same benchmarks run on real hardware, or some more complex benchmarks (random RW, DB load, etc.). The hardware used for this test would be suitable for a fileserver, but not much else. In that situation, E2, E3 or XFS are probably the right choices as it points out. What about desktop loads, enterprise loads, or something more interesting?

--
--Brandon

Re:Hardware mismatch by Hextreme · 2006-01-06 06:00 · Score: 3, Informative

This was definitely an issue in testing here. The wide range of "winning" filesystems for the different tests clearly indicates the bottleneck is somewhere other than the disk. In most modern systems, this isn't an issue.

From TFA: ReiserFS takes a VERY long time to mount the filesystem. I included this test because I found it actually takes minutes to hours mounting a ReiserFS filesystem on a large RAID volume.

Looks like this guy makes a habit out of using systems with 500MHz CPUs... my dual 3GHz xeon box mounts a 1.2TB raid5 array formatted with ReiserFS in about 33 seconds, give or take a couple seconds.
Re:Hardware mismatch by Anonymous Coward · 2006-01-06 06:19 · Score: 0

It's a filesystem.
It shouldn't take a massive machine to mount a drive.
A dual 3ghz is massive for serving out files.
If all you are doing is using samba or netatalk to serve files even 500mhz is overkill.
Re:Hardware mismatch by Clover_Kicker · 2006-01-06 06:35 · Score: 2, Insightful

> If all you are doing is using samba or netatalk to serve files
> even 500mhz is overkill.

Not for ReiserV4 :)

Seriously though, there's nothing wrong with designing a new filesystem to take advantage of modern CPU horsepower as long as everyone understands the system requirements.
Re:Hardware mismatch by Bronster · 2006-01-06 16:41 · Score: 1

ReiserFS takes so long to load (yes, many minutes on our 2TB volumes) because it loads the entire filesystem bitmap into memory at mount.
http://marc.theaimsgroup.com/?t=112068507200001&r=1&w=2
Hopefully the patch for making this a mount option will make its way into 2.6.16 along with another couple of patches that we (FastMail.FM) add for other reiser issues. It's still the best filesystem by far for large Cyrus installations though.
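
To get a feel for why loading the whole bitmap at mount could take minutes, here is a back-of-the-envelope sketch. It rests on several assumptions that are not in the comment above: 4 KiB blocks, one bitmap block per block_size * 8 filesystem blocks, bitmap blocks spread across the disk so that each one costs a separate seek, and a hypothetical 10 ms per read. The function names are made up for this example.

```python
TIB = 1024 ** 4


def bitmap_blocks(volume_bytes: int, block_size: int = 4096) -> int:
    """Bitmap blocks needed if each one tracks block_size * 8 blocks."""
    blocks = volume_bytes // block_size
    blocks_per_bitmap = block_size * 8        # one bit per filesystem block
    return -(-blocks // blocks_per_bitmap)    # ceiling division


def mount_read_seconds(volume_bytes: int, seconds_per_read: float = 0.010) -> float:
    """Rough time to read every bitmap block, one seek each (assumed 10 ms)."""
    return bitmap_blocks(volume_bytes) * seconds_per_read


for size_tib in (0.5, 1, 2):
    volume = int(size_tib * TIB)
    print(f"{size_tib} TiB: {bitmap_blocks(volume)} bitmap blocks, "
          f"~{mount_read_seconds(volume):.0f} s at 10 ms per read")
```

The point of the sketch is only that the cost grows linearly with volume size and is dominated by seeks rather than CPU, so a faster processor alone would not be expected to remove it.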
Re:Hardware mismatch by saurik · 2006-01-07 02:09 · Score: 1

33 seconds?!? OH MY GOD. I have a Dual 2.8GHz Xeon machine (so rather similar) with a 0.75TB (66% as large) over RAID5, formatted to ext3. It takes me about _0_ seconds to mount it. My entire computer's bootup sequence from beginning to load the kernel until I get a login prompt is only 22 seconds. It takes you 33 _seconds_ just to _mount_ a drive?!?

Here's what's missing by CastrTroy · 2006-01-06 05:54 · Score: 5, Interesting

Here's what's missing. They forgot to tell you how well the drive performed after being used for 1 year, and having constantly moved data from one place to another, and constantly deleting and creating new data. It would have been a better test if the drive was about 75% full, with data from 2 years of use, and then the same tests were performed.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.

Re:Here's what's missing by phoenix.bam! · 2006-01-06 05:58 · Score: 1

Good idea. You should get right on that. Don't forget to keep accurate logs as well as make us pretty graphs to show us how well each filesystem performs?

Thanks.
Re:Here's what's missing by ebrandsberg · 2006-01-06 06:49 · Score: 1

Worse--it doesn't say what mount parameters are used, or if any tuning was done. You can change the performance characteristics significantly if you tune the parameters of the mount. I suspect that reiser4 was in a failsafe mode for data integrity, while the others were doing a bit more caching.
Re:Here's what's missing by drinkypoo · 2006-01-06 06:52 · Score: 1

Given that filesystem creation was shown, we can probably safely assume that no tuning was done, and that if he had specified mount options, he probably would have showed us those, too... though that last part is in a bit more question.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Here's what's missing by flaming-opus · 2006-01-06 07:34 · Score: 2, Informative

You're absolutely correct, as free-space fragmentation can play a HUGE role in the speed of space allocation. Of course, this plays no role at all in stat, rename, remove, readdir, operations, or any reads or any writes to existing parts of files.

Since the benchmarks presented are so rudimentary anyway, this is maybe not the first thing to worry about.
Re:Here's what's missing by smartdreamer · 2006-01-06 12:05 · Score: 1

Fragmentation means longer seek times (proportional to angular distance on the same platter) and the disk can't take advantage of read ahead caching.
Re:Here's what's missing by j3110 · 2006-01-06 19:34 · Score: 1

Exactly...

The Hans Reiser mentality at the moment seems to be make it go fast for his simple tests. There's nothing wrong with that, you have to start somewhere.

I used reiser 4 and reiser 3 on production systems for about a year with random updates/appends/deletes. After about a year, the reiserfs systems slow down. I switched it to JFS, and have had no issues.

I think it's the tails... they get stored off on their own, and as files get increased or whatever, those blocks with the tails have to get shuffled around somehow, which probably interferes with the continuity of the metadata. I recommend JFS for anyone that doesn't reinstall their system every year, or runs any kind of database server.

--
Karma Clown

SATA? by ruiner13 · 2006-01-06 05:55 · Score: 1

How about some SATA benchmarks? PATA is good, but I suspect things will be much improved with SATA and NCQ. Does anyone have any links?

--

today is spelling optional day.

Re:SATA? by MarcQuadra · 2006-01-06 06:14 · Score: 2, Informative

IIRC NCQ isn't 100% fully-baked on Linux yet, so even NCQ-capable controllers and drives won't take advantage of it yet. I just upgraded my home file server with NCQ-capable gear and I don't think it's using it yet, even though I'm running the latest kernel.

There are patches for libATA that enable NCQ, but they're not in the mainline yet.

The only thing worse than testing without the new technologies would be testing with half-baked implementations of them. Let's wait until NCQ is done before we try testing with it.

--
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
Re:SATA? by undeadly · 2006-01-06 06:22 · Score: 1

How about some SATA benchmarks? PATA is good, but I suspect things will be much improved with SATA and NCQ. Does anyone have any links?
Most won't notice any speed difference when moving from PATA to SATA. On PATA you typically have two hard disks on the same controller, but that hurts performance when using both disks at the same time. With SATA this is not a problem, assuming you have enough SATA connections available. NCQ may reduce desktop performance, and is most useful for server-like environments. For more info, search Storage Review.
Re:SATA? by Anonymous Coward · 2006-01-06 07:28 · Score: 0

Not trying to troll here, but one of the main arguments for Linux/Unix is that the open-sourcedness makes it possible to more rapidly add features. NCQ has been around for at least a year now, how is it that Linux doesn't fully support it yet? I'm guessing drive makers aren't holding back the technology forcing developers to reverse engineer it. What's up?
Re:SATA? by bani · 2006-01-06 07:33 · Score: 1

Because few vendors and drive manufacturers support NCQ. Getting a combo which correctly supports both is tricky. It has nothing to do with linux, it's all about the hardware itself.
Re:SATA? by MarcQuadra · 2006-01-07 09:40 · Score: 1

Well, there are HUGE changes going on in the Linux disk-access area: libATA is abstracting virtually ALL the storage functions into an in-kernel library that will be more modular and allow drivers to be developed more rapidly.

Getting NCQ working isn't easy work, there was absolutely no functionality in the kernel for it before, and adding it the right way means waiting for libATA to be fully-baked and adding it there.

--
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails

I would agree by jd · 2006-01-06 05:56 · Score: 2, Informative

From a brief examination of the benchmarks, I'd say the following would seem to hold up:

JFS: Great for software development, as it allows rapid file and directory reads, writes, creates and deletes
XFS: Seems to work best with much more stable content. Creating and mounting the partition is also fast, and the FS overhead seemed low. Should be good for static databases, particularly if you're going to use a network filing system to access the drive, say using a SAN.
Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.
Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Re:I would agree by lawpoop · 2006-01-06 06:16 · Score: 4, Interesting

I'm no expert by any means, but I think the idea behind the ReiserFS is breaking down the FS paradigm from the file level to the line level.

There is the classic example from the Reiser website. If your password file gets hacked, you have to ditch the whole file if you're using traditional file systems. You only know whether or not the file's been changed. However, with the Reiser system, it can tell you *what line*, and thus which user/password, was changed.

That's just a taste of where you can go with the ReiserFS. There are other things coming down the pipe; check out the reiser website for a better idea of the new features that ReiserFS promises.

--
Computers are useless. They can only give you answers.
-- Pablo Picasso
Re:I would agree by Anonymous Coward · 2006-01-06 06:26 · Score: 5, Insightful

Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.

Huh? Sorry, did you read the same graphs or are you just trolling?

This article shows that ext2 and ext3 are close to the top performer in most tests and do not have many "worst-case scenarios" (unlike, e.g. Reiser3 and Reiser4).

If there is anything that you can conclude after reading this study, it is that ext3 is a reasonably good default choice for a filesystem.
Re:I would agree by david.given · 2006-01-06 06:37 · Score: 1, Redundant

Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.
I think the ReiserFS mount times in the benchmark are misleading. From my experience, mkreiserfs creates an extremely basic file system; the first time you mount it, the file system driver itself will do a lot of heavy housekeeping, which takes ages. Subsequent mounts are much faster.
In fact, I find the whole benchmark a bit dubious. A lot of the operations will vary wildly in speed depending on how much data is currently in the buffer cache or not. This means that performing the benchmarks in a different order is going to vastly change the results... couldn't he at least put a 'sync' in every now and again?
Re:I would agree by m50d · 2006-01-06 06:43 · Score: 2, Interesting

Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.
Reiser4 now defaults to journalling everything - file data as well as metadata. If they left it like that, then no wonder it's slower - but it's the best choice if data integrity is important.

--
I am trolling
Re:I would agree by JonXP · 2006-01-06 06:44 · Score: 1

From TFA:

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.

--
Still IMing in the stone age?
Re:I would agree by david.given · 2006-01-06 06:48 · Score: 1

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.
D'oh!
But what's the sleep in aid of? It'll achieve precisely nothing --- the sync will block until all I/O is complete.
Re:I would agree by Tet · 2006-01-06 06:57 · Score: 1

Reiser4 now defaults to journalling everything - file data as well as metadata. If they left it like that, then no wonder it's slower - but it's the best choice if data integrity is important.
Best choice for you, perhaps. If data integrity is important, then reiserfs is the last place I'd be looking. I'd be going with ext3 with data journalling enabled.

--
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Re:I would agree by m50d · 2006-01-06 07:00 · Score: 1

I meant best choice of those tested. I'd certainly like to see benchmarking of reiser4 against ext3 with data journaling.

--
I am trolling
Re:I would agree by smoker2 · 2006-01-06 07:03 · Score: 2, Insightful

Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.
How the hell did you come up with that opinion ?
Ext3 came 1st or 2nd in 24 out of the 40 tests done. If you were producing an OS for general purpose computing, would you use a specialist fs or the best performing general purpose one ?
You seem to have good words for JFS and XFS though, and XFS had only 13 1st or 2nd places !
How do you work out that Ext3 is "mediocre" from those figures ?
(you sound like you run debian)
Re:I would agree by pegr · 2006-01-06 07:04 · Score: 1

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.

D'oh!

But what's the sleep in aid of? It'll achieve precisely nothing --- the sync will block until all I/O is complete.

Maybe it's to flush from the internal drive cache to the platters? Just because the OS says the data is flushed doesn't mean the data is flushed...
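
For anyone wanting to reproduce the sync-then-sleep step, the following is a minimal sketch of a timing harness. It is not the article's script: the function name, the choice to include the sync inside the timed region, and the default 10 second pause are all assumptions for illustration. os.sync is available on Unix in Python 3.3 and later; whether the drive's own write cache has been emptied when it returns is exactly the open question raised above.

```python
import os
import time
from typing import Callable


def timed_run(workload: Callable[[], None], settle_seconds: float = 10.0) -> float:
    """Run workload, flush dirty pages, and return the elapsed seconds.

    The sync is inside the timed region so that buffered writes are charged
    to the test that produced them; the sleep afterwards is not timed.
    """
    start = time.perf_counter()
    workload()
    os.sync()                     # ask the kernel to write out dirty buffers
    elapsed = time.perf_counter() - start
    time.sleep(settle_seconds)    # idle gap before the next test (not timed)
    return elapsed


if __name__ == "__main__":
    # Hypothetical workload: does nothing, so this mostly times the sync.
    print(f"{timed_run(lambda: None, settle_seconds=0):.3f} s")
```

Keeping the pause outside the measurement means it cannot change a test's own result; it can only affect what state the next test starts from.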
Re:I would agree by KiloByte · 2006-01-06 07:32 · Score: 1

Please, don't use the words "ReiserFS" and "data integrity" in the same sentence.

Reiser eats filesystems like popcorn. I have used it for around a couple of months on two boxes, and in both cases every file bigger than around 4KB went to hell; in one case on the whole filesystem and in a big subtree in the other. I'll be damned if I ever give it another try, especially considering that other FSes trump it speedwise as well.

Why? ReiserFS has an order of magnitude more code than ext3, and more than twice as much as the biggest contender; also, that code has seen little scrutiny. Quoting the words of Hans Reiser: "I personally think filesystems should be rewritten from scratch every 5 years". How exactly does he expect to have it bug free?

Generally, your choices are:
JFS -- small files, lots of creations/deletions
XFS -- throughput for large files (video, etc)
ext3 -- simple, reliable

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:I would agree by m50d · 2006-01-06 07:46 · Score: 1

Reiser eats filesystems like popcorn. I have used it for around a couple of months on two boxes, and in both cases every file bigger than around 4KB went to hell; in one case on the whole filesystem and in a big subtree in the other.
Well, that's the opposite of my experience. When I got fed up with fsck times with ext2, I tried ext3 only to have it unreadably corrupted within a few months. Since then I've used reiser on every system I have, with no problems (including the same disk that was trashed by ext3).
I'll be damned if I ever give it another try, especially considering that other FSes trump it speedwise as well.
Not true on the whole, especially for small files. This benchmark is meaningless in relation to a modern system - drives like that and then a 500mhz processor?

--
I am trolling
Re:I would agree by diegocgteleline.es · 2006-01-06 07:49 · Score: 2, Insightful

Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid

It's amazing that such commentaries are moderated interesting these days. So, uh, fedora developers are stupid and you're smarter than them? Please take a look at this commentary to understand why such decisions aren't so simple. You can tune your car's engine and it'll be faster, right? But why doesn't everybody tune their engines?

Let me quote a ext3 paper: "The ext2 and ext3 filesystems on Linux are used by a very large number of users. This is due to its reputation of dependability, robustness, backwards and forwards compatibility, rather than that of being the state of the art in filesystem technology."
Re:I would agree by stedo · 2006-01-06 07:57 · Score: 1

One other important thing about XFS though, it's really bad in cases of power failure. Running ext3 or Reiser, a machine can almost definitely survive a power failure, but if your system crashes with XFS there's a good chance you'll be left with an unmountable disk.
Re:I would agree by Shelled · 2006-01-06 08:13 · Score: 2, Interesting

"Reiser4 ..... it's the best choice if data integrity is important."

Any time I've lost a drive to data corruption it was formatted Reiser; every attempt at using Reiser eventually resulted in massive data corruption. This was various hardware and distros. I don't know about the newest version, but thrice bitten, forever XFS for me.
Re:I would agree by joeljkp · 2006-01-06 08:23 · Score: 1

I have to second the GP. I've used ext3 and Reiser3 alternately on my laptop, and my Reiser3 experiences have usually ended in disaster. From a refusal to mount to Reiser Windows programs not copying stuff over correctly, I've learned to stick to ext3 if you want to be safe.

--
WeRelate.org - wiki-based genealogy
Re:I would agree by Anonymous Coward · 2006-01-06 09:16 · Score: 0

what is best for a "dynamic" (lots of inserts and updates) database?
Re:I would agree by wahwah · 2006-01-06 09:22 · Score: 1

Why not calculate a simple regression? I know that it doesn't follow the way you normally calculate statistics. But let's say one task for one system is one observation. Then you would get something like this:
Call:
lm(formula = secs ~ fs, data = F)

Residuals:
    Min      1Q  Median      3Q     Max
-21.554 -14.636  -7.982  14.958  97.796

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.13381    4.30162   3.983 0.000117 ***
fsEXT3      -0.02476    6.08341  -0.004 0.996759
fsJFS       -3.39286    6.08341  -0.558 0.578072
fsREISERv3   1.04524    6.08341   0.172 0.863870
fsREISERv4   4.49048    6.08341   0.738 0.461863
fsXFS       -0.07143    6.08341  -0.012 0.990651
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.71 on 120 degrees of freedom
Multiple R-Squared: 0.01424, Adjusted R-squared: -0.02683
F-statistic: 0.3467 on 5 and 120 DF, p-value: 0.8835
So the "Estimate" column shows each filesystem's mean time relative to the baseline, which is ext2 (the intercept, 17.13 seconds). Ext3 with -0.02 is practically identical to ext2. JFS with -3.39 is the fastest. XFS with -0.07 is just a little bit faster than EXT3.

This "analysis" assumes that each task is equally important, which is not true. So I think the best test would be to log all the disk activity, so it would look like:
task  | size | time
------+------+------
read  | 100  | 0.001
write | 200  | 0.02
...   | ...  | ...
The normal disk activity would get logged. Then one would just spend one day with each filesystem. Then one would make a regression that would show a real (or at least close to real) estimation of each filesystem performance.
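
The regression idea above can be sketched without a statistics package, because with a single categorical predictor a least-squares fit reduces to group means. The sketch below assumes treatment (dummy) coding with one filesystem chosen as the baseline, which is R's default; the observations are hypothetical placeholders, not the article's measurements, and the function name is made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (filesystem, task, seconds). Not the article's data.
observations = [
    ("EXT2", "untar", 12.0), ("EXT2", "copy", 20.0),
    ("EXT3", "untar", 13.0), ("EXT3", "copy", 21.0),
    ("JFS", "untar", 9.0), ("JFS", "copy", 17.0),
]


def dummy_coded_fit(rows, baseline):
    """Return (intercept, {fs: coefficient}) for secs ~ fs with treatment coding.

    The intercept is the baseline filesystem's mean time, and each
    coefficient is another filesystem's mean minus the baseline mean.
    """
    by_fs = defaultdict(list)
    for fs, _task, secs in rows:
        by_fs[fs].append(secs)
    intercept = mean(by_fs[baseline])
    coefs = {fs: mean(times) - intercept
             for fs, times in by_fs.items() if fs != baseline}
    return intercept, coefs


intercept, coefs = dummy_coded_fit(observations, baseline="EXT2")
print(f"baseline mean: {intercept:.2f} s")
for fs, delta in sorted(coefs.items()):
    print(f"{fs}: {delta:+.2f} s relative to baseline")
```

Weighting each row by how often the task occurs in a real activity log, as suggested above, would turn the plain means into weighted means; that step is left out here to keep the sketch short.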
Re:I would agree by m50d · 2006-01-06 10:14 · Score: 1

I lost a drive to ext3 (my first ever journalled FS), switched to reiser and have never looked back. No corruption in 3 years of having various numbers of systems. But whatever works for you (I've always avoided XFS because I like to put my bootloader in the superblock, less trouble when dual-booting with windows).

--
I am trolling
Re:I would agree by m50d · 2006-01-06 10:16 · Score: 1

Well, as I said, not my experiences of them. But I agree windows reiserfs programs suck.

--
I am trolling
Re:I would agree by jd · 2006-01-06 12:28 · Score: 1

Databases are interesting, in that files tend to creep in size and are only occasionally packed down. These are read/write operations; only initial setup will involve any creates, and there are probably no destroys at all. Files will not be looked for, and the file datestamp is unimportant.

The closest parallel to most of these operations is in the copying of a tarball, and then TARring up the Linux kernel. (Untarring creates files, and as I've said, creating files is not the bulk of a database's work.)

JFS does best, according to these benchmarks, with ext2/ext3 following close behind. The other filesystems aren't worth a damn for database work.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:I would agree by Wolfrider · 2006-01-06 17:59 · Score: 1

This might be a better command to use:

# Flush i/o buffs on dev
blockdev -v --flushbufs /dev/hda

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
Re:I would agree by f0rt0r · 2006-01-07 07:09 · Score: 1

Actually you can use XFS during the initial installation of Fedora Core 3 and 4 ( I haven't tried it with Fedora Core 1/2 ). At the installation menu, at the "boot" prompt, type "linux xfs", and when you get to the disk setup screen, you can now specify XFS format for partitions.

Enjoy!

--
I can't afford a sig!

Meaningless? by bombshelter13 · 2006-01-06 05:57 · Score: 0

So he's benchmarked two different file systems on two almost completely different hardware setups (different drives, different raid controllers, different amounts of RAM) and produced completely meaningless results? This is news how?

Warning by c0dedude · 2006-01-06 05:57 · Score: 2, Informative

Remember, fastest!=best. Some filesystems cannot shrink. Some cannot change size at all. If you're doing anything with LVM or RAID, generally ext3 is the way to go. If you're just formatting a disk and using it without anything on top of it, these FS's may be for you. Then again, ext3 looks damn good in the tests as it stands. XFS looks like the clear loser.

--
Since when has this country used intellectual elite as a pejorative term?

Re:Warning by Dionysus · 2006-01-06 06:16 · Score: 1

How often do you shrink a volume, anyways? I think fastest when it comes to reading the data you expect to have on a given volume is much more important than whether the system can shrink or not. Don't know about jfs, but both xfs and reiserfs let you grow the filesystem. If you are using LVM and RAID, both Reiserfs and XFS are great.

--
Je ne parle pas francais.
Re:Warning by lividdr · 2006-01-06 06:23 · Score: 1

Problem is that EXT2/EXT3 don't do online resizing. I see there are kernel and e2fsprogs patches to support it, but I keep seeing 'Make sure you have a very good backup' in the notes. Reiserfs, at least, does online grow very nicely.

Sucks to find out your ext3 /usr is a pinch too small for the new OOo 2.0 build you just did and have to kill off just about everything (or reboot into single) just to unmount /usr.

--
Give a man a beer and he wastes an hour. Teach a man to brew and he wastes a lifetime.
Re:Warning by drinkypoo · 2006-01-06 07:01 · Score: 3, Insightful

XFS does things that ext? and Reiser can't do. Reiser does things other FSes don't do as well. It's a true 64-bit filesystem and it supports insanely large filesystems, up to 9 million terabytes in 64 bit mode (with a 64 bit kernel.) It even provides realtime support, although I guess that's still beta in linux? It can be defragged and even dumped while live. It has insanely quick crash recovery. And of course, it does other stuff too; check the project page. XFS may not be the fastest filesystem - it may even be the slowest - but it's got features no other filesystem has. If you need them, XFS is the winner. Hell, if you just trust XFS more than you trust other filesystems, it's the winner. (Sorry, but I wasn't sleeping when reiser was eating everyone's data, and ext3 handles corruption much more poorly than any of the other Journaled options.)

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Warning by bani · 2006-01-06 07:39 · Score: 2, Interesting

You can't fsck an xfs mounted filesystem, even if it's mounted read-only. If your root fs gets damaged and you need to fsck it, you need to boot from a rescue CD. If it's a server in a remote location, you're shit outta luck.

ext3 and reiser at least let you fsck read-only mounted filesystems.

I brought up this problem to xfs developers and their response was "well, it's not a problem on SGIs so we're not going to fix it". Nice.
Re:Warning by rubycodez · 2006-01-06 07:44 · Score: 1

not SOL with the right server grade hardware, just upload the CD ISO image to the remote management board on the server and boot from it.
Re:Warning by bani · 2006-01-06 07:52 · Score: 1

so you have to buy special management hardware just to support xfs, that no other filesystem requires. nice.
Re:Warning by bored · 2006-01-06 08:27 · Score: 1

Reiser does things other FSes don't do as well. It's a true 64-bit filesystem and it supports insanely large filesystems, up to 9 million terabytes in 64 bit mode (with a 64 bit kernel.)

I don't know about the setup being tested, but when I ran Reiser on a very large (many TB) file system I discovered it gets slower the larger the filesystem; after a while it's simply too slow to use. So while it may "support" large file systems, I'm betting no one has plugged it into a 50TB file system to see if it really works.
Re:Warning by Anonymous Coward · 2006-01-06 08:29 · Score: 0

Some of us realize that our boot partition doesn't need to be journaled.
Re:Warning by quarkscat · 2006-01-06 08:57 · Score: 1

Amen to that!

XFS is a mature, stable, and very versatile filesystem. This FS shines best when used with fast disks and battery-backed caching RAID controllers. I am using it quite successfully with Slackware 10.1, and cheap IDE RAID controllers for homebrew NAS, as well as a PostgreSQL server. SGI was very generous in releasing XFS source and dedicating resources to the OSS community. The CXFS, or Cluster XFS version of this filesystem would rock Linux if/when it becomes available.
Re:Warning by drinkypoo · 2006-01-06 09:20 · Score: 1

Sorry, I was talking about XFS there. I realized that my sentence did not mean what I meant it to only after submitting it (and I did preview it first... Yay for human error.)

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Warning by drinkypoo · 2006-01-06 09:23 · Score: 1

Or you could netboot it. I hope you've got remote console access to these remote servers...

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Warning by dhasenan · 2006-01-06 09:34 · Score: 1

Or you could use LILO or GRUB, have a backup Linux on a separate hard drive, modify a conf file, and reboot. No special hardware or software necessary. That's still a downtime of as much as ten minutes, but it's on an unreliable server anyway. Or else it's due to routine maintenance, so you can plan for the downtime.

Still, ten minutes just to fsck the damn thing....
Re:Warning by rubycodez · 2006-01-06 10:09 · Score: 1

not special hardware, such things are built into "real" servers as standard equipment, e.g. HP's iLO boards. Even using a cheap $40 eBay-bought PC as a server, there are still plenty of options if trouble needing fsck hits the root partition; good to have a backup or emergency boot partition anyway regardless of filesystem used. After all, I've had older reiserfs and ext filesystems that couldn't be fixed at all (have to give this v4 stuff a whirl sometime)
Re:Warning by bani · 2006-01-06 10:20 · Score: 1

still doesn't excuse xfs from being able to do a very basic function that just about every other linux filesystem on the entire planet is capable of.

it's a very glaring omission, and the xfs developers' attitude toward the problem ("it's not a problem on SGIs") is not the least bit reassuring.
Re:Warning by Anonymous Coward · 2006-01-06 10:28 · Score: 0

Since when has this country used intellectual elite as a pejorative term?

Since the American Revolution. Read Richard Hofstadter's book, Anti-Intellectualism in American Life.
Re:Warning by bored · 2006-01-06 10:43 · Score: 1

Chuckle, yah we are using XFS too, but there are limitations in some versions of the linux kernel at assorted places that keep it from getting very large too. There are a number of patches (some of which, from what I understand, have made it into the kernel recently) that allow it to get larger than anything we have here. Problem temporarily solved.
Re:Warning by mandolin · 2006-01-06 11:50 · Score: 1

The boot partition doesn't need to be, but the root partition sure as heck should be.
Re:Warning by rubycodez · 2006-01-07 02:47 · Score: 1

you can force a repair of a mounted read-only XFS (including root) partition. It's just an unsafe practice.

from the man page xfs_repair(8):

-d Repair dangerously. Allow xfs_repair to repair an XFS filesystem mounted read only. This is typically done on a root filesystem from single user mode, immediately followed by a reboot.

But the right way to do it (whether running XFS or reiserfs or ext2/3 or whatever) is to have the machine already set up with an alternate root partition.

The SGI developers you spoke with probably were hinting that piss-poor admin practices shouldn't be considered.
Re:Warning by alexq · 2006-01-07 03:36 · Score: 1

hmm... could you explain why it is that you think ext3 is best for RAID?
Re:Warning by bani · 2006-01-08 13:40 · Score: 1

The SGI developers you spoke with probably were hinting that piss-poor admin practices shouldn't be considered.

No, they just flatly stated that XFS didn't need this functionality:
1) because XFS "never gets corrupted"
2) because SGIs have a PROM monitor

Basically they were hinting that XFS was a filesystem primarily for SGI hardware and Irix, not for Linux. Eg if SGIs dont need it then you dont need it either.

I assume they added -d after being told over and over that they were full of shit.

FWIW I have alternate boot partitions setup with RIP. Strangely enough the machines running xfs were the only ones I ever needed to use it on -- they were the only machines which ate themselves for breakfast.

Something's up by Anonymous Coward · 2006-01-06 05:59 · Score: 1, Interesting

I'll leave aside the fact that all the other benchmarks I've seen are very favourable to Reiser4 and this is very unfavourable, and concentrate on the discrepancies.

Reiser4 is the slowest in searching, creating and removing. It performs a lot better when tarring and untarring, which indicates that reading and writing is much better than other filesystems. However, when you get to copying and creating large files, it loses again.

Why the discrepancy? These benchmarks contradict others, but don't make sense when taken alone. I'm inclined to believe the other benchmarks.

Re:Something's up by 0xABADC0DA · 2006-01-06 08:07 · Score: 1

Having actually *used* Reiser4 for a while I can give anecdotal evidence backing this benchmark showing very slow times (although the reality here is probably that it is updating the access time whereas the others weren't, or doing data journaling, something like that). I used r4 with gentoo, so LOTS of tiny files from portage, and vmware so lots of LARGE files that get their contents modified and occasionally expanded, and many emerge world so LOTS of opportunities to scatter data all over the disk.

At first and for a while reiser4 was blazing fast. I'd untar some source code and it would be done writing thousands of files before the return key was even released. But then it started getting *really* slow. More and more frequently the fs would, I presume, rebalance its trees causing all IO to lock up for 10+ seconds. This was particularly annoying because doing ":wq" in vi would often cause this (I suppose due to a sync/fsync operation). Whatever reiser4 was doing in these lockups, it must not have succeeded since it kept happening. All around, the performance just tanked. So if this guy's benchmarks recreated the conditions that cause reiser4 to asplode I can completely believe the results. I've since reformatted with ext3 and it's fairly fast, but mostly the benefit is that it is predictable and consistent.

Also I was using JFS once on a data-partition and was copying things around to redo the partitioning. I assume this is linux's fault, but the whole memory would fill up with unwritten data and then performance would tank almost as if the copies I was doing were reading 4k, then writing 4k (the source and dest were on different IDE chains so you'd think it would at least overlap these...). So I tried doing "while :; do sync; done" and amazingly this made the speed *much faster* (like 30mb/s I would expect instead of like 1mb/s after a while). Only problem is half-way through the jfs filesystem being copied to just gave up the ghost. No power fluctuation or loss, nothing. The files were just not there and I got error messages. Reboot and it couldn't even mount or fsck the jfs partition. Just another anecdote fyi... other than that jfs seems to rock hardcore.

how to lie with statistics by Clover_Kicker · 2006-01-06 06:00 · Score: 4, Insightful

I love the CPU utilization graph for "touch 10,000 files".

A quick glance shows ReiserV4 as much more CPU intensive, you have to look at the scale to realize it only used 0.3% more CPU.
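
To put a number on how much a truncated axis can exaggerate a difference, here is a small Python sketch. The utilization figures and the axis start are hypothetical, chosen only so that the two values differ by 0.3 points; they are not read off the article's graph, and `apparent_ratio` is an illustrative helper name.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights for values a and b on an axis starting at axis_start."""
    return (a - axis_start) / (b - axis_start)

# Hypothetical CPU utilization figures (percent) that differ by 0.3 points.
reiser4, other = 99.8, 99.5

# Axis starting at zero: the bars look almost identical.
print(f"from 0:    {apparent_ratio(reiser4, other):.3f}x")

# Axis starting just below the smaller value: one bar towers over the other.
print(f"truncated: {apparent_ratio(reiser4, other, axis_start=99.4):.1f}x")
```

The underlying numbers are the same in both cases; only the baseline of the drawing changes.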

Re:how to lie with statistics by j0ebaker · 2006-01-06 06:52 · Score: 1

ReiserFS is much more CPU intensive when it comes to writing small files. Remember how the partition is broken up into smaller clusters (a size of 4k comes to mind but this is customizable). Well other filesystems use each of those chunks as the smallest amount of space which can be allocated to a file. Well ReiserFS uses disk space much more efficiently by packing more than one of those files into a cluster.

So after touching 10,000 files it would have been interesting to analyze how much space on the disk was used by each of the competitors of ReiserFS.
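
The space argument can be sketched with some rough arithmetic in Python. This is a deliberately simplified model, not how ReiserFS actually lays out data: it assumes a 4k block size, ignores all metadata overhead, and uses a hypothetical workload of small files. The function names are made up for the illustration.

```python
import math

BLOCK_SIZE = 4096  # bytes; the 4k figure mentioned above, assumed here

def space_whole_blocks(file_sizes, block_size=BLOCK_SIZE):
    """Bytes consumed when every file occupies a whole number of blocks."""
    return sum(math.ceil(size / block_size) * block_size for size in file_sizes)

def space_with_packed_tails(file_sizes, block_size=BLOCK_SIZE):
    """Bytes consumed in a simplified model where file tails share blocks.

    Full blocks are stored as usual; the leftover tails of all files are
    packed together. Real tail packing has overhead this model ignores.
    """
    full_blocks = sum(size // block_size for size in file_sizes)
    tail_bytes = sum(size % block_size for size in file_sizes)
    tail_blocks = math.ceil(tail_bytes / block_size)
    return (full_blocks + tail_blocks) * block_size

# Hypothetical workload: 10,000 small mail messages of 1,000 bytes each.
sizes = [1000] * 10_000
print("whole blocks:", space_whole_blocks(sizes), "bytes")
print("packed tails:", space_with_packed_tails(sizes), "bytes")
```

In this toy case packing uses roughly a quarter of the space, which is the benefit being traded against the extra CPU work.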

I like to use ReiserFS on IMAP mail servers where the sizes of individual messages can be very small.

On the other hand I've seen huge beowulf clusters where client machines touch files over nfs where the cpu load on the fileserver went ballistic precisely because ReiserFS was trying to make such efficient use of the space.
Re:how to lie with statistics by Cyno · 2006-01-06 07:15 · Score: 0, Troll

Yeah, right, they're lying to you. Those evil anti-reiser zealots. They only provided 40 graphs of various benchmarks, and only 80% of those scale up from 0. It's so obvious how biased these results are against ReiserV4. It's so full of lies I bet Microsoft funded this benchmark.
Re:how to lie with statistics by fossa · 2006-01-06 08:37 · Score: 1

I haven't used Reiser much, but I thought this "tail packing" or whatever they call it was optional?
Re:how to lie with statistics by Clover_Kicker · 2006-01-06 12:27 · Score: 1

Take a pill, son.

That graph was misleading, hard to say if it was malice or incompetence.
Re:how to lie with statistics by Cyno · 2006-01-08 10:26 · Score: 1

Yes, graphs can be misleading.

That's why they often include labeled axes and a legend.

I remember once in high school I was in this class and my teacher was explaining all about this. She would draw several graphs on the overhead projector and have us plot out various functions and show us how to label them and everything. It was a lot of fun. You should try it sometime. I suggest you sit down with some graph paper and just play around, plot out a few functions and rescale them. Maybe even pull out the geometry book and try some rotation and translation.

To think this was either malice or incompetence is assuming the graph was misleading. However, I had no problem reading and understanding the graph and I appreciated the rescaling. Perhaps you shouldn't be taking so many pills there, pop. It's obvious from these benchmarks this system was CPU bound and reiserfs is not the fastest filesystem in every benchmark when you're running on old CPU bound hardware.

But to think they are intentionally lying with statistics, that's absurd. Hence my original sarcastic comment. My apologies for backing it up with logic.
Re:how to lie with statistics by Clover_Kicker · 2006-01-09 01:22 · Score: 1

If you weren't so busy basking in your own genius, you might have heard of the classic How to lie with statistics.
Re:how to lie with statistics by Cyno · 2006-01-09 09:40 · Score: 1

You actually read that? Hehe

Yeah, I heard of that classic, but I never read it because I already know how to lie with statistics.. isn't it obvious?

Anyway, did he lie with these statistics?
Is that what you are trying to say?
Does that make you feel bad or something?

Get over it..
Re:how to lie with statistics by Clover_Kicker · 2006-01-09 15:04 · Score: 1

> Get over it..

You're the one who keeps trying to have the last word.

no reason to switch by Anonymous Coward · 2006-01-06 06:01 · Score: 1, Informative

Actually, what I take from this is there's no need to switch from a safe, standard EXT3 FS which is the default of many distros.

I use the same machine! by denverradiosucks · 2006-01-06 06:04 · Score: 1

[quote]
COMPUTER: Dell Optiplex GX1
CPU: Pentium III 500MHZ
RAM: 768MB
SWAP: 2200MB
CONTROLLER: Maxtor Promise ATA/133 TX2 - IN PCI SLOT #1
DRIVES USED: 1] Seagate 400GB ATA/100 8MB CACHE 7200RPM
2] Maxtor 61.4GB ATA/66 2MB CACHE 5400RPM
DRIVE TESTED: The Seagate 400GB.
[/quote]

It's comforting to know I'm not the only one still using one of these! Those are almost the exact same specs as my linux server!

Re:I use the same machine! by Anonymous Coward · 2006-01-06 06:42 · Score: 0

Me too: I have one of these exact machines but with only 512Mb RAM and 1Gb of swap space that hosts several ATA/100 HDDs grouped with LVM and formatted to a single 0.5Tb XFS volume. The OS - GNU/Linux - boots and runs from an ultra160 18Gb Seagate drive.

Until yesterday when a drunken slip of the finger entered init 0 instead of 1 [doh!] it had happily been running for ~200 days, providing rock solid and high speed file serving of mostly large files [hence XFS].

I've just obtained another almost identical machine [GX110, Pentium3 @ 1GHz] which will be outfitted almost identically except for x2 power in just about everything that counts: CPU is twice as fast, RAM will be 1Gb and it will be stuffed with 1Tb of cheapo ATA disks. An ultra320 SCSI for the OS will finish the job.

Luckily I get these boxes free from work as they are retired, and just have to spring for the SCSI kit and extra RAM normally: nonetheless, if anyone is looking to build themselves a great value for money fileserver I would certainly recommend one of these old GX* boxes as a base. They're miles better than the Dell crap I have to look after these days [Optiplex GX280, for example].
Re:I use the same machine! by Procyon101 · 2006-01-06 07:43 · Score: 1

Yep. My primary Apache and Subversion server is a 500Mhz AMD K6II with 300Gigs of HD. Got it sans hard drive for free, along with 2 other machines for helping a friend move.

Heck, my main dev box is a dual 1 Ghz PIII coppermine with 1 Gig of ram, and the only issue I have with it is when it chokes on extremely large eclipse projects. I'm sure it can't play games worth a damn, but I don't need it to. I think I sunk a total of about $30 for this monstrosity.

Thinking about it, my entire business' network of about 10 machines cost a cash outlay of about $400, and most of that is for the gigabit switch and the archiving HD.

somewhat worthless by aachrisg · 2006-01-06 06:06 · Score: 5, Insightful

His benchmark data is ruined by using a grossly unrealistic piece of hardware - modern fast hard disks coupled with a CPU which is absurdly slower than anything you can buy.

Re:somewhat worthless by cli_man · 2006-01-06 06:17 · Score: 0

I am always surprised when I go into companies to see the old equipment they are running. It is not unusual to find a machine in the 500 MHz range with an IDE RAID controller to hold a few hundred gigs of space. Not everyone has the budget for the newest and greatest and not everyone has to have the processing speed. Many people are running fileservers that store a ton of info but don't actually process anything.

--
The nice thing about Windows is - It does not just crash, it displays a dialog box and lets you press 'OK' first. Reg
Re:somewhat worthless by Josh · 2006-01-06 06:26 · Score: 1

Other people have mentioned that it is not uncommon to use slower CPUs for fileservers since more CPU is often overkill. But even for workstations, one situation that should be of interest to many people is compiling one or more large source trees - there CPU usage of the filesystem is very relevant.
Re:somewhat worthless by Coplan · 2006-01-06 09:56 · Score: 1

I agree with you. I think it might have been better to show two hardware setups...a lesser machine and a more modern machine. However: I think it is important to know how these filesystems will run on a slower machine. One of the major appeals that Linux has is that you can compile your kernel to run on slower machines. It would not be entirely unlikely that someone has linux installed on their slower machine. In my personal setup, my lesser powered machine is the linux box. It is a firewall, e-mail and caching name server. These tests are very useful in picking the filesystem. I have a slow machine, it rarely needs to write 1gb files, and frequently needs to write small files in large quantities. So...while it would've been nice to see a more modern machine - it wouldn't be right to take the slower machine entirely out of the picture.

agreed: please rescale CPU utilization graphs by Anonymous Coward · 2006-01-06 06:08 · Score: 0

I have to agree with you, I got fooled looking at these charts. It would be a lot more helpful to me from a practical standpoint (i.e. which filesystem to choose) if all CPU graphs were scaled from 0 to 100. That would help me understand which differences were important, and which were irrelevant.

Sample size by rongage · 2006-01-06 06:13 · Score: 2, Insightful

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

At the very least, you run multiple times and average the results to give statistically meaningful numbers. I can't think of ANY time where a sample size of 1 was meaningful for anything.

What would be really interesting is to come up with a reasonable UCL and LCL for each test, and then calculate out a cpK for each test. It's one thing to say "I got these results one time", it's something much more impressive to say "I can achieve this result +-10%".

Of course, if a particular benchmark can't even hit a cpK of 1, then maybe there is room for improvement in the coding of the driver.

For those of you who haven't done much with statistics, cpK is a measure of "capability" in a machine or process. It shows how repeatable the measured process is. A higher number indicates that you have a highly targeted, low deviation process whereas a low number (1 or less) indicates that your process is incapable of repeatability and/or accuracy.
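
For anyone who wants to try this on their own repeated runs, here is a minimal Python sketch of the usual Cpk formula: the distance from the mean to the nearer limit, divided by three standard deviations. The timings and the limits are hypothetical; the limits in particular are an acceptance window the tester would have to choose, not something the benchmark provides.

```python
from statistics import mean, stdev

def cpk(samples, lower_limit, upper_limit):
    """Process capability index for a list of repeated measurements.

    Distance from the sample mean to the nearer limit, expressed in units
    of three sample standard deviations.
    """
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation (n - 1 denominator)
    return min(upper_limit - mu, mu - lower_limit) / (3 * sigma)

# Hypothetical repeated timings (seconds) for one benchmark.
runs = [9.0, 10.0, 11.0, 10.0, 10.0]

# Hypothetical acceptance window of +/-10% around a 10-second target.
print(f"Cpk = {cpk(runs, lower_limit=9.0, upper_limit=11.0):.2f}")
```

With only three runs per test, as in the article, the standard deviation estimate would be very rough, so a figure like this is only meaningful with many more repetitions.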

--
Ron Gage - Westland, MI

Re:Sample size by bubulubugoth · 2006-01-06 06:52 · Score: 1

No, you didn't read TFA correctly.

They made 3 samples of each test, and put the average.

The author states this at the beginning of TFA, where he was explaining the methodology and the test cases

--
Â_Â
Re:Sample size by Atzanteol · 2006-01-06 07:04 · Score: 1

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

No you're not, and no he didn't. FTFA:

NOTE5: All tests were run 3 times and the average was taken, if any tests were questionable, they were re-run and checked with the previous average for consistency.

--
"Ignorance more frequently begets confidence than does knowledge"

- Charles Darwin
Re:Sample size by hackstraw · 2006-01-06 08:27 · Score: 1

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

At the very least, you run multiple times and average the results to give statistically meaningful numbers. I can't think of ANY time where a sample size of 1 was meaningful for anything.

Why isn't there a -10 Wrong moderation option?

From the weak FA:

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed. NOTE2: Each file system was tested on a cleanly made file System. NOTE3: All file systems were created using default options. NOTE4: All tests were performed with the cron daemon killed and with 1 user logged in. NOTE5: All tests were run 3 times and the average was taken, if any tests were questionable, they were re-run and checked with the previous average for consistency.
Re:Sample size by NereusRen · 2006-01-06 08:51 · Score: 1

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

From TFA: "NOTE5: All tests were run 3 times and the average was taken, if any tests were questionable, they were re-run and checked with the previous average for consistency."
Re:Sample size by toofast · 2006-01-06 09:08 · Score: 1

A sample size of 1 is adequate when you only have one server - and you run the benchmark on the actual server you're going to be using.

Having a benchmark use a sample of 1000 different units and tell you that ReiserFS 4 is the best is useless if it performs worst on your specific hardware platform.
Re:Sample size by Anonymous Coward · 2006-01-06 11:56 · Score: 0

Sample size of 1: Your wife is a nasty bitch in bed

PS: If you want me to increase the sample size, please send me your sister's info.
Re:Sample size by timeOday · 2006-01-06 12:12 · Score: 1

I can't think of ANY time where a sample size of 1 was meaningful for anything.
What are the sources of variance you think should be addressed by averaging?

It would be nice if... by bhirsch · 2006-01-06 06:15 · Score: 4, Insightful

There were some current (recent 2.6 kernel with XFS, JFS, possibly Reiser4, etc) benchmarks done on highend servers (or at least something with drives a few steps up from the CompUSA weekly special), especially if anyone wants to see Linux succeed in the enterprise.

Normalized results by dtfinch · 2006-01-06 06:15 · Score: 3, Informative

Based on the geometric mean of all the benchmark times for each filesystem, which effectively weights all benchmarks equally:
JFS won
EXT2 and EXT3 took 17% longer than JFS
XFS took 29% longer than JFS
Reiser3 took 38% longer than JFS
Reiser4 took 52% longer than JFS

Now, 1.52 seconds is not a whole lot longer to wait than 1 second. With any luck we'll see a post from Hans explaining why Reiser4 took longer, or what sacrifices were made to make the others faster, if there are any.
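
For anyone wanting to reproduce this kind of summary, a short Python sketch of the calculation follows. The `times` values are invented for illustration and are not the article's measurements; `statistics.geometric_mean` needs Python 3.8 or later. The geometric mean weights benchmarks equally in the sense that doubling any single benchmark's time multiplies the score by the same factor, no matter how long that benchmark runs in absolute terms.

```python
from statistics import geometric_mean

# Hypothetical per-benchmark times in seconds (not the article's data).
times = {
    "JFS": [1.0, 4.0, 16.0],
    "EXT3": [2.0, 4.0, 16.0],
    "XFS": [2.0, 8.0, 16.0],
}

def normalized_scores(results):
    """Geometric mean of each filesystem's times, relative to the fastest one."""
    means = {fs: geometric_mean(values) for fs, values in results.items()}
    best = min(means.values())
    return {fs: gm / best for fs, gm in means.items()}

scores = normalized_scores(times)
for fs, ratio in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{fs}: {(ratio - 1) * 100:.0f}% longer than the fastest")
```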

Re:Normalized results by phoenix.bam! · 2006-01-06 06:44 · Score: 5, Insightful

Reiser uses much more CPU for file system tasks. ReiserFS is a modern filesystem meant to run on modern machines. This machine is only 500 MHz and therefore Reiser performs poorly. Had this machine been a 2 GHz (standard now, 4x faster than the test machine), or even a 1 GHz (outdated and 2x as fast) machine, Reiser would have performed much better.

If you want to use parts from 1997 to build a computer, Reiser is not for you. 500 MHz is at least 8 year old technology if I remember correctly.
Re:Normalized results by Nimey · 2006-01-06 07:08 · Score: 1

About six years. 500 MHz processors came out in late 1999.

--
Hail Eris, full of mischief...

E pluribus sanguinem
Re:Normalized results by Anonymous Coward · 2006-01-06 07:16 · Score: 0

Early 1999.
http://www.aceshardware.com/read_news.jsp?id=55000489
Re:Normalized results by Anonymous Coward · 2006-01-06 07:18 · Score: 0, Insightful

Don't use that software garbage excuse of "there's more cpu lets use it always cause we can".

That's why stock dell's and HP's are so much god damn slower than a much worse specced machine.

If that's the concept for reiser, I can only guess a large portion of the linux population is retarded.
Re:Normalized results by Westley · 2006-01-06 07:27 · Score: 3, Insightful

It's one thing to say "Let's use more CPU because we can."

It's another to say "Let's use more CPU (which is usually relatively idle) in order to improve the normal bottleneck, which is IO."

I don't see what's wrong with that at all. Of course, it's no good if you've got a machine which doesn't represent the "normal" current situation, any more than using a graphics card for "acceleration" makes sense if the graphics card in question is 10 years old but you're using a fast new CPU.

Jon
Re:Normalized results by Anonymous Coward · 2006-01-06 07:36 · Score: 0

anal nitpicks, the two of you.
Re:Normalized results by dtfinch · 2006-01-06 07:43 · Score: 1

I like it when my system can run at 0-5% cpu usage during even the most intense disk activity, with minimal I/O wait. Disk activity should not be cpu intensive. Reiser4 might win on a faster processor, but any filesystem that takes more than a few % cpu smells of possibly poor scalability, which might pin the CPU to 100% on certain loads or configurations even on a modern system. Maybe there's a bunch of O(n) list operations going on in there that could be made O(log n) or O(1), or maybe it's doing a lot of poll waiting. There's no reason I can imagine for a filesystem to need a lot of CPU.
Re:Normalized results by cecom · 2006-01-06 07:46 · Score: 2, Interesting

While I basically agree with you, 500MHz is not four times slower than 2 GHz. However in this case it is probably worse, since a 500MHz PIII implies a slow 100MHz side bus, slow 33MHz PCI bus, slow PC100 memory. A terrible system for doing benchmarks in 2006! It is completely unrepresentative of anything.
Actually, I am getting angrier as I write this. It was just wrong to publish an article using such an outdated system. People worried about high FS performance are not going to be using anything like that.
Re:Normalized results by KiloByte · 2006-01-06 07:46 · Score: 1

If you want to use parts from 1997 to build a computer, Reiser is not for you.

If you want to use your CPU for things other than handling the filesystem, Reiser is not for you. If you know that you have enough RAM to hold currently used files, Reiser is not for you. If you want a filesystem that is good at quickly creating/deleting a lot of small files (compiling, etc: JFS), ReiserFS is not for you. If you want a good linear throughput (video processing: XFS), ReiserFS is not for you. If you want something light-weight for a virtual machine (ext{2,3}) or for something that must be really stable (ext3), ReiserFS is not for you. If you want something that is GPL-compatible (_not_ mkreiserfs which has an advertising clause attached), ReiserFS is not for you.

Add abysmal data integrity on top of that, and the image gets pretty clear.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Normalized results by Procyon101 · 2006-01-06 07:50 · Score: 1

It depends on the purpose of the machine. If the FS is utilizing CPU and RAM to build an L1 Cache, align writes, do simple defragmentation during idle, etc, it might chew up quite a bit more CPU than a more conventional FS, which would be really bad if you're using the machine as a PHP server or something... but if you are running the machine as a simple file server, say a remote /usr partition for your network, as you mention, a conventional file system will use at most only 5% of the proc, which means 95% of your machine is IO bound and lying idle during the most stressed times. A FS that utilizes more of that proc to take load off the disk is going to be wonderful for a machine that doesn't do anything but play with its disk.
Re:Normalized results by Arandir · 2006-01-06 08:14 · Score: 1

So what? Doing the same benchmark with faster CPUs will make ReiserFS look faster, but doing the same benchmark with a faster IO system will make all the others catch up again.

It's bad enough when video games call you an idiot because you don't have last week's video card, but it's crossing the line when file system authors call you an idiot because you don't have last week's CPU.

If the CPU ain't broke, don't throw it away!

--
A Government Is a Body of People, Usually Notably Ungoverned
Re:Normalized results by LordMyren · 2006-01-06 08:33 · Score: 1

Forgive me sir, but I seriously do not think so.

Celeron 300A was Spring/Summer of '98. I can't imagine it took Intel a full 18 months to go from 450 MHz on the cheap end to 500 MHz on the fast end.

Those were the days. I still have a couple old Alpha heatsinks with what must've been an AMAZING 50 mm fan, weighing almost a whole third of a pound of solid aluminum. HA, fantastic. Two of the BP6's I built are still alive and kicking today. Dual powered celerons, whoo.

Compare that to my XP-120 (http://www.thermalright.com/a_page/main_product_xp120.htm)... hilarious. What a different age.
Re:Normalized results by Anonymous Coward · 2006-01-06 08:37 · Score: 0

I don't know what video games you have been playing but none of mine have ever called me an idiot. Video games want the best because they can do the most with it. Same thing with Reiser.

The reason Reiser looks so bad on a slow processor is that Reiser is designed around the idea that the filesystem can increase performance by utilizing CPU to help increase actual throughput to the I/O system, relieving that bottleneck. As CPU speed increases, so will the performance of Reiser, while the other filesystems are entirely tied to I/O speeds. Reiser in this case is not as fast, but there is a point along the increasing graph of CPU speed vs performance where Reiser would become the best performing (in more situations, I'm not saying overall).

If I/O speed were to increase, Reiser would also benefit. The point is, Reiser was not poorly designed and does not use a lot of CPU because the coders were lazy. Reiser was specifically designed around using a powerful CPU to be able to handle more data.

I'm not saying throw away your old CPUs, just don't run Reiser on them. But newer CPUs can run reiser and the user will benefit.
Re:Normalized results by LordMyren · 2006-01-06 08:40 · Score: 1

Reiser's stated goal is to increase throughput through additional CPU usage. If you're not using the CPU for anything, since, say, you're blocking for I/O to go through, it makes sense to spend some CPU if you can get I/O done faster.

On the other hand, if you're running a database, yes, you need all the CPU you have. The question then becomes how much I/O you get per unit of CPU, but the situation is very complex; it's not necessarily some easily reducible linear system. It could be that you get, for example, twice the I/O at ten times the CPU usage, but this doesn't imply you'll only ever get 1/5 the I/O per unit of CPU. Again, these systems don't have to be linear... scalability has many dimensions to it.

But yes, clearly the goal of ReiserFS is to serve a desktop platform where desktop apps are often going to be awaiting data to proceed.
Re:Normalized results by timeOday · 2006-01-06 12:21 · Score: 1

It's another to say "Let's use more CPU (which is usually relatively idle) in order to improve the normal bottleneck, which is IO."
It's easy to go overboard with conventional wisdom, including this.
For instance, a lot of stuff now is XML, which is text based. Parsing text files takes significant CPU power, so you don't want your filesystem using it up. In my experience, simply reading in a big text file containing floating point numbers, on a 1.6 GHz Pentium-M, I found that by far the bottleneck was CPU, NOT disk speed. I switched to a lazy approach where the entire file is read at startup, but only searched for newlines. Then individual lines are parsed only as needed. Startup time was cut by over 90%.
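A minimal Python sketch of the lazy approach described above. The class name and the one-number-per-line file layout are assumptions made for illustration; this is not the poster's actual code.

```python
class LazyFloatLines:
    """Read the data once, index the newlines, and parse lines only on demand."""

    def __init__(self, data: bytes):
        self._data = data
        # Startup cost: one scan for newline offsets, no number parsing at all.
        self._starts = [0]
        pos = data.find(b"\n")
        while pos != -1:
            self._starts.append(pos + 1)
            pos = data.find(b"\n", pos + 1)
        # Drop the empty "line" that follows a trailing newline (or empty input).
        if self._starts[-1] >= len(data):
            self._starts.pop()
        self._cache = {}  # line index -> parsed float

    def __len__(self):
        return len(self._starts)

    def __getitem__(self, i):
        if not 0 <= i < len(self._starts):
            raise IndexError(i)
        if i not in self._cache:
            start = self._starts[i]
            end = self._starts[i + 1] if i + 1 < len(self._starts) else len(self._data)
            # The expensive text-to-float conversion happens here, once per line used.
            self._cache[i] = float(self._data[start:end].decode("ascii"))
        return self._cache[i]


lines = LazyFloatLines(b"1.5\n2.25\n-3e2\n")
print(len(lines), lines[1])
```

With a real file the constructor would be fed something like `open(path, "rb").read()`. Lines that are never requested are never converted, which is where the startup saving comes from.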
Re:Normalized results by level_headed_midwest · 2006-01-06 16:08 · Score: 1

I remember getting a Compaq with an AMD K6-2 at 500MHz in late 1999, but it certainly wasn't the fastest chip then. I remember that was a PIII-700 or better.

--
Just "gittin-r-done," day after day.
Re:Normalized results by Josh · 2006-01-06 17:27 · Score: 1

It doesn't make sense to give equal weight to the time required for file system creation as for the other types of tasks unless you do it roughly as often (which is highly unlikely).
Re:Normalized results by dtfinch · 2006-01-06 21:04 · Score: 1

It beats doing a straight average of all the benchmarks, where some take 50 seconds and others take 0.03 seconds. I was in a hurry.
Re:Normalized results by Anonymous Coward · 2006-01-07 00:08 · Score: 0

Parsing text files is generally fairly fast, but XML is one of the most complicated text formats, and thus slowest to parse.

While parsing the text file containing floating point numbers, have you checked whether the bottleneck is actually parsing the numbers or allocating the storage for them? If you're reading them into separate objects, you're probably spending most of your time allocating those objects...
Re:Normalized results by Josh · 2006-01-07 06:08 · Score: 1

My point isn't to pass out criticism, but rather to encourage decision-makers (I'm using this term literally, not as a euphemism for people controlling corporate purse strings) to read the individual benchmarks and see how they apply to their typical usage patterns. In particular, spending a high percentage of time creating the initial filesystem would be a very unusual pattern of usage, and therefore that test skews the analysis against ext? in the geometric average.
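To make the averaging question in this sub-thread concrete, here is a small Python sketch comparing a straight average with a geometric mean of normalized times. The timings are made-up placeholders, not figures from the article.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean log: each benchmark counts by its ratio, not its raw size.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize(times, baseline):
    """Express each time as a ratio to the baseline filesystem's time."""
    return [t / b for t, b in zip(times, baseline)]


# Hypothetical times in seconds for two benchmarks: one long, one very short.
fs_a = [50.0, 0.03]
fs_b = [25.0, 0.06]

ratios = normalize(fs_b, fs_a)  # fs_b relative to fs_a, per benchmark
print(arithmetic_mean(fs_a), arithmetic_mean(fs_b))
print(geometric_mean(ratios))
```

In the straight average the 50-second test drowns out the 0.03-second one, so fs_b looks about twice as fast. The geometric mean of the ratios treats "half the time on one test, double on the other" as a tie. It still gives every test the same weight, though, so a rarely run task such as filesystem creation counts as much as everyday operations, which is the objection raised above.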

Of course Reiser4 was slow by Anonymous Coward · 2006-01-06 06:17 · Score: 1, Insightful

Everyone knows Reiser4 uses a lot of CPU, and these guys run the test on a 500MHz machine!!

Re:Of course Reiser4 was slow by Arandir · 2006-01-06 08:17 · Score: 1

So? Who the hell modded this insightful? "Redundant" maybe, "whiny" most likely, but not "insightful".

--
A Government Is a Body of People, Usually Notably Ungoverned

I think trying on a P2 266 is a bad idea by H4x0r+Jim+Duggan · 2006-01-06 06:18 · Score: 5, Interesting

Reiser is not designed for slow CPUs. AFAIK, a key part of the design was that Hans Reiser realised that CPUs were vastly underused. IO resources were maxed out and CPUs were sitting idle. So he found ways to use the CPU to make more efficient use of the IO resources. So this benchmark on a 500Mhz machine will of course show Reiser in a bad light, and moving lower down to a 266Mhz will make it even worse.

For a decent benchmark of how filesystems work on modern hardware: use modern hardware.

--
Please help publicise swpat.org - the software patents wiki

Re:I think trying on a P2 266 is a bad idea by Anonymous Coward · 2006-01-06 07:03 · Score: 0

I'm fairly certain using more CPU in most cases is RETARDED.
Re:I think trying on a P2 266 is a bad idea by Trifthen · 2006-01-06 07:06 · Score: 1

What this says to me, is to never use Reiser on a DB machine. Sure, the disk churn is much more prevalent on such a beast, but the CPU(s) aren't exactly sitting around idle, either.

It actually sounds like Reiser would do really well as a disk controller in a dedicated drive array. I wonder if anyone has put embedded Linux on such a device, to act as a Reiser RAID controller...

--
Read: Rabbit Rue - Free serial nove
Re:I think trying on a P2 266 is a bad idea by wavq · 2006-01-06 07:07 · Score: 1

I used to work for a company where 10000 files in a single directory is considered
*small*. Try ten or a hundred times that many, and watch EXT(2|3) come to a grinding
halt. Reiser (v3) happily obliged with these kinds of loads.

Any test can be made to highlight the good/bad given "properly" (ahem) chosen parameters.
Re:I think trying on a P2 266 is a bad idea by StarHeart · 2006-01-06 07:19 · Score: 2, Informative

I am pretty sure that ext3 fixed that with htree indexing. Htree has been around for a while.

--
Havoc Penington, the bane of my Linux desktop.
Re:I think trying on a P2 266 is a bad idea by H4x0r+Jim+Duggan · 2006-01-06 07:52 · Score: 1

Consider the purpose of the computer when choosing the filesystem? Yep, can't argue with that.
One other point I thought of just after posting my first comment is that CPU power is growing far faster than IO resources are growing, so changes in the technological environment are causing CPU-using filesystems to be increasingly a good idea.

And here's a benchmark which backs up what I was suggesting. It shows Reiser4 as being the fastest, and the most CPU-using, of the 5 main journaling filesystems. And given that CPU cycles are becoming increasingly numerous and cheap, that's probably ok for most uses.

--
Please help publicise swpat.org - the software patents wiki
Re:I think trying on a P2 266 is a bad idea by captain_craptacular · 2006-01-06 08:12 · Score: 3, Insightful

So this benchmark on a 500Mhz machine will of course show Reiser in a bad light, and moving lower down to a 266Mhz will make it even worse.

If you look at the charts, the "editing" doesn't help either. For example one cpu usage chart showed a range starting @ 92% and ending @ 94%. The Reiser4 bar was 3x as long as the next bar, but guess what, it was using something like .7% (ie 93.7% as opposed to 93%) more CPU. If the scale hadn't been jacked up you wouldn't have been able to spot the difference at all, but the way they chose to present the data, it looked like a total smackdown.

--
They who would give up an essential liberty for temporary security, deserve neither liberty nor security
Re:I think trying on a P2 266 is a bad idea by molnarcs · 2006-01-06 08:35 · Score: 1

But the difference is HUGE - while utilizing more CPU power, reiser4 underperforms every filesystem discussed. Are you saying the situation would be reversed when running on a better CPU? How much better are we talking here? I mean, if the difference had been small, I could expect some improvement from moving to more modern hardware, but there is a really big gap between ext2/ext3 (+XFS) performance and Reiser4 - and 500MHz is not that slow, especially when doing nothing else but copying files! Let's assume that reiser4 is a safe bet from every single point of view except this (it isn't - for production servers no sane person would choose something as experimental as reiser4). Then we have to make sure that there are always spare CPU cycles available (let's say 30% is always IDLE) for the filesystem? How can you guarantee that on a busy webserver or database? No, I don't think reiser4 looks good at all, no matter how much CPU power you throw at it - it can make results somewhat better, but for now, it seems reiser4 is not a safe bet at all.
Re:I think trying on a P2 266 is a bad idea by Anonymous Coward · 2006-01-06 09:04 · Score: 0

Ahh... a classic from ``How to lie with statistics'' book :-)
Re:I think trying on a P2 266 is a bad idea by Trifthen · 2006-01-06 09:21 · Score: 1

Wow, that benchmark is 2 years old. Hans mentions in those posts that Reiser-4 is experiencing a reduction in CPU as it evolves and "cruft" is removed, but it doesn't seem that different after the two-years elapsed. It's also important to know if the CPU was highly taxed in the benchmarks posted today concerning Reiser4: if it never redlined, the results mean the filesystem was adequately supplied with CPU. I mean, 30% on a P4-1.5Ghz would be about 100% on a 500Mhz system, but if the 500Mhz system never topped 60%, there was still CPU to spare, and it wasn't the cause of Reiser's poor performance this time.

Looking at those graphs, it's fairly obvious all of the filesystems were being starved of CPU. Many of the results were being capped off at 90+%, which is not proving anything about the capability of the filesystems themselves. In order to adequately test a variable, it needs to be isolated; this did not happen here.

--
Read: Rabbit Rue - Free serial nove

Uhm, whats with the chart? by FunkyELF · 2006-01-06 06:19 · Score: 1

I'm looking at the all test times chart and it seems to misrepresent the time taken to cat a 1GB file to /dev/null: http://linuxgazette.net/122/misc/piszcz/group002/image018.png In the last set of data points it shows REISERv3 as the 4th best, but http://linuxgazette.net/122/misc/piszcz/group002/image017.png is showing it as the clear loser. Also, the data at the bottom of the article confirms it. WTF?? I call shenanagins (sp?) ~ELF

Re:Uhm, whats with the chart? by Jason+Hood · 2006-01-06 07:26 · Score: 1

I noticed this as well. I would recommend people completely disregard this test as it's very impractical and inaccurate.

--
Are you intolerant of intolerant people?
Re:Uhm, whats with the chart? by timotten · 2006-01-06 09:49 · Score: 1

I noticed the same thing. A few other inconsistencies:

* In image018, XFS is clearly the performance loser in all tests. But in the other charts, we see a more divided picture -- with XFS, Reiser4, and Reiser3 each taking the "performance loser" position in a few tests.

* It's not just a matter of labeling or confusing datapoints -- the datapoints for the last test are entirely different in image017 and image018. Note that, in image017, the times go as high as ~38 sec. In image018, they go as high as ~140sec.

These kinds of inconsistencies make one wonder about the credibility of the results.

IDE Drives Cause other Overheads by j0ebaker · 2006-01-06 06:20 · Score: 4, Insightful

It would be interesting to see the results of the same tests running against a SCSI drive system where there is less IO overhead to see if the results differ.
There are other considerations here as well. What about the I/O elevator's tuning options.
Yes, I'd much rather see this test occur against a SCSI drive or better yet against a RAM drive for pure software performance.

Cheers fellow slashdoters!
-Joe Baker

Re:IDE Drives Cause other Overheads by oglueck · 2006-01-06 06:29 · Score: 1

The IO scheduler should not matter as they are only important when multiple processes access the disk.
Re:IDE Drives Cause other Overheads by j0ebaker · 2006-01-06 06:38 · Score: 1

Wouldn't a single journaling filesystem transaction be considered three independent writes?
I've also learned that the first part of a hard drive is the fastest. I trust that this user used the same partitioning scheme for each test to be fair. If I'd known the first part of the hard drive was faster my laptop's swap partition would be the first partition on the drive instead of the last.
Re:IDE Drives Cause other Overheads by oglueck · 2006-01-06 06:50 · Score: 2, Informative

Wouldn't a single journaling filesystem transaction be considered three independent writes?

No. A single transaction comes from a single thread. So the IO scheduler has no freedom here. It consists of these operations:

1. write redo log
2. write
3. clear redo log

They must occur in exactly this order. There are flush operations involved as well but I am not an expert here.
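The three steps above can be sketched as a toy redo log in Python. This is an in-memory illustration only: the `ToyJournal` class and its `crash_after` parameter are hypothetical, and the flushes, checksums, and batching a real filesystem needs are left out.

```python
class ToyJournal:
    def __init__(self):
        self.disk = {}      # stands in for the on-disk data blocks
        self.redo_log = []  # stands in for the on-disk journal

    def write(self, block, value, crash_after=None):
        """Apply one transaction; crash_after=N simulates power loss after step N."""
        # Step 1: record the intended change in the redo log.
        self.redo_log.append((block, value))
        if crash_after == 1:
            return
        # Step 2: perform the real write.
        self.disk[block] = value
        if crash_after == 2:
            return
        # Step 3: clear the redo log entry now that the write is in place.
        self.redo_log.remove((block, value))

    def recover(self):
        """After a crash, replay whatever is still sitting in the redo log."""
        for block, value in self.redo_log:
            self.disk[block] = value
        self.redo_log.clear()


journal = ToyJournal()
journal.write("block 7", "new contents", crash_after=1)  # crash before the real write
journal.recover()
print(journal.disk)
```

The ordering matters because the log entry has to exist before the real write starts. If the real write came first, a crash in the middle would leave nothing to replay.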
Re:IDE Drives Cause other Overheads by j0ebaker · 2006-01-06 09:35 · Score: 1

As you mention, there are caches involved which must be periodically flushed. I think of the I/O scheduler as a broom that sweeps back and forth across the widest (from beginning to end) parts of the drive which are actively being written to. These days the I/O scheduler queues writes and reads until the broom makes it past that point on the drive. Some arbitration happens when an I/O request has waited too long, and then exception handling moves the broom to service the request out of the normal sweep cycle. My description could be oversimplified. With SCSI systems, am I correct in assuming that the drive's controller takes care of the I/O scheduling and not the computer's CPU?
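The broom picture corresponds roughly to the classic elevator (SCAN) ordering. A minimal Python sketch, assuming requests are plain block numbers and leaving out the waited-too-long handling; it is an illustration, not the kernel's actual scheduler code:

```python
def elevator_order(head, pending):
    """Order pending block numbers for one upward sweep followed by the return sweep."""
    ahead = sorted(b for b in pending if b >= head)                  # upward sweep
    behind = sorted((b for b in pending if b < head), reverse=True)  # on the way back
    return ahead + behind


def total_seek(head, order):
    """Sum of head movements needed to visit the blocks in the given order."""
    distance = 0
    for block in order:
        distance += abs(block - head)
        head = block
    return distance


requests = [98, 183, 37, 122, 14, 124, 65, 67]  # arrival order, made-up block numbers
swept = elevator_order(53, requests)
print(swept)
print(total_seek(53, requests), total_seek(53, swept))
```

For this made-up queue, serving requests in arrival order moves the head 640 blocks in total, while the single sweep needs 299.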

These tests really need to be done in more than a single thread, do you agree? Somehow we need to generate predictable, multi-threaded traffic to gain a closer picture of how these different filesystems compare.

Can you think of other ways that the IDE processing might affect one filesystem type more or less than others?

-Joe
Re:IDE Drives Cause other Overheads by sjames · 2006-01-06 11:53 · Score: 1

I can see where you're going there, but seek times are really important to a filesystem's performance, so running in a RAM drive won't give very realistic figures. I do think testing against SCSI would be interesting. Also with a fast CPU to see how that affects Reiser's times.

How is ext3 mediocre? by mrcparker · 2006-01-06 06:20 · Score: 1

It seemed to be either first or second at most of the benchmarks. I really don't consider that mediocre.

I was pretty surprised by ext3's performance. I also read the article.

Re:How is ext3 mediocre? by Anonymous Coward · 2006-01-06 06:40 · Score: 0

I must agree with the above. From the tests performed, I was very surprised and impressed with ext2 and ext3, both of which I had moved away from for the 'newer breed' filesystems, assuming, falsely, that newer is generally better. (Why make something that is not better than what already exists? And if you can study what already exists, you can improve upon it.) However, the tests performed are not necessarily conclusive, as others have stated (more enterprise setup, etc).

Also, these tests don't include tests like file storage efficiency, (sectors used, etc) stability, longevity, etc, etc.

Still, all in all the results are interesting.
Re:How is ext3 mediocre? by budgenator · 2006-01-06 10:29 · Score: 1

I use XFS to spite SCO, but I agree, and was impressed with EXT3's general performance in areas that I typically use. You know, if they did things like making sure that the partitions being used most are centered on the disk, like we did in the old days, it would probably make as much difference.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds

Part III by renrutal · 2006-01-06 06:31 · Score: 1, Funny

Part III of the test should feature the filesystem behaviour during a Slashdot Effect.

I must say the filesystem they're trying in the current effect is really failing. No pages served booh!

wrong conclusions by penguin-collective · 2006-01-06 06:31 · Score: 1

What I take away from these benchmarks is that Ext3 is still the most reasonable choice: mature, well supported, and good overall performance.

JFS, XFS, and ReiserFS are small players with a fraction of the user community and a fraction of the tools and support; their performance would have to be astounding in comparison to Ext3 to even consider them, but it isn't.

Unfortunately, benchmark-happy people like you, people who optimize for the wrong thing, are far too frequent in this industry.

Re:wrong conclusions by HidingMyName · 2006-01-06 07:02 · Score: 1

I think XFS supports cached block access (using DMAPI, if I recall). This helps with low level access, so that the dump utility can operate on a live file system (although write activity during the dump could still cause inconsistency if a snapshot is not used). Ext2FS/Ext3FS don't have such support as far as I know.
Re:wrong conclusions by jabuzz · 2006-01-06 11:04 · Score: 1

Though that would be fairly silly. XFS is the only one of these filesystems to support freezing, which along with LVM snapshots is the reason I use XFS.

Forget SATA by WindBourne · 2006-01-06 06:35 · Score: 1

I want to know the SANTA benchmark. How did he travel all over the world and when will he not be able to handle anymore?

--
I prefer the "u" in honour as it seems to be missing these days.

Ok, I'll be blunt. by Anonymous Coward · 2006-01-06 06:36 · Score: 0

Aside from specific needs like constant-speed streaming (XFS), it's a processor thing:

Do you have lots of CPU time available?

Yes: ReiserFS will extract the most of your computer.

No: Don't use ReiserFS; maybe JFS or something...

Just my personal opinion, not endorsed by anyone here.

Outdated hardware... by tetabiate · 2006-01-06 06:39 · Score: 3, Informative

Anyway, how is the average user supposed to be concerned by these results?
In my daily work I manage hundreds of GB's of data and have hardly seen a significative difference between XFS, JFS and ReiserFS v.3 on relatively modern hardware (Tyan S2882 Pro motherboard, two Opteron 244 processors, 4 GB RAM and two 250-GB SATA HD's) running OpenSuSE 10. I put the most important data on a XFS partition but also have a small ReiserFS partition which can be read from Windows.

-- Help us to save our cousins the great apes, do not use cell phones.

Re:Outdated hardware... by Gramie2 · 2006-01-06 06:53 · Score: 1

"significative" is a perfectly cromulent word.
Re:Outdated hardware... by tetabiate · 2006-01-06 08:50 · Score: 1

Ces drôles de Frenchies!

-- Help us to save our cousins the great apes, do not use cell phones.

a couple of comments as AC by Anonymous Coward · 2006-01-06 06:41 · Score: 0

1) the physical machine is the same? but you've just said you've replaced the HD and the HD controller!

2) I notice in small print at the bottom what I believe to be the case too, after looking at the overall figures. XFS seems to be the best performer overall in terms of CPU load and speed of file system for day to day tasks. okay, it loses big time on a few items. I'd never realised how painfully crap reiserFS is for many many files....and yet it's constantly been 'bigged up' as the choice to make for MAILDIR systems. why?? who would do something like use reiserfs for a mail server?

let's clarify things a little by Karaman · 2006-01-06 06:46 · Score: 0

ext3 without -j (journal mounting option) is no more than an ext2 partition, and -j with the default 2 to 4 second commit is just pushing your luck with AC blackouts. As for xfs, it repairs itself without external tools :)

p.s. and just for the record: I have so far used ext3, ext2 and xfs partition types for / and data partitions, and to tell the truth I don't like ext2 at all because I get errors too often. Unlike it, xfs has never failed me except once, when I had to mount a partition readonly to repair it (still debugging this case). ext3 without -j is just the same as ext2, although my brain didn't realize it until a repair of the partition (brain still intact) proved fruitless. ext3 with -j was slower than my grandma.

--
sex is better than war!

Sample size == 3 by Anonymous Coward · 2006-01-06 06:47 · Score: 0

NOTE5: All tests were run 3 times and the average was taken,
if any tests were questionable, they were re-run and
checked with the previous average for consistency.

Bad graphs to prove a point by Anonymous Coward · 2006-01-06 06:49 · Score: 2, Informative

The total free space graph is a poor statistical representation.

It starts at 345GB and goes to 375GB on the y scale. This makes the difference between 355 and 370 look like a 50% difference rather than a 5.7% increase.

He does it again in "make 10,000 directories": 99.5% is not double the CPU use of 97%.
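A small Python sketch of why a truncated axis exaggerates differences. The 97% and 99.5% values are the ones quoted above; the axis start of 94.5% is a made-up value chosen for illustration, since the chart's real baseline isn't given here.

```python
def apparent_ratio(a, b, axis_min):
    """Ratio of drawn bar lengths when the axis starts at axis_min instead of 0."""
    return (b - axis_min) / (a - axis_min)


def actual_ratio(a, b):
    """Ratio of the underlying values themselves."""
    return b / a


# CPU usage figures from the comment; axis_min is a hypothetical chart baseline.
print(apparent_ratio(97.0, 99.5, axis_min=94.5))
print(actual_ratio(97.0, 99.5))
```

With that baseline the second bar is drawn exactly twice as long as the first, while the underlying values differ by under 3%.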

Re:Bad graphs to prove a point by Anonymous Coward · 2006-01-06 08:24 · Score: 0

The percentage doesn't look all that important to me, considering the difference in available space is big enough to store the entire contents of my hard drive.

And for god's sake, someone fix this BS or at the very least tell me how damn long I'm supposed to wait instead of picking random numbers out of thin air.
Slashdot requires you to wait between each successful posting of a comment to allow everyone a fair chance at posting a comment.
It's been 37 minutes since you last successfully posted a comment
Re:Bad graphs to prove a point by Hal_Porter · 2006-01-06 13:18 · Score: 1

You wouldn't have these problems if you didn't try to follow up to your own posts Mr Coward.

If that in fact is your real name.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;

That's what I call ironic! by aconkling · 2006-01-06 06:49 · Score: 1, Offtopic

Are those graphs really created in MS Excel?

Nice stats... but wrong... by strredwolf · 2006-01-06 06:56 · Score: 3, Interesting

You know, I was looking at all these stats from this roundup... and while I'm glad they have one nice stat (how much the FS itself takes, the rest for space), I'm not happy that there is no "We've loaded it up, let's see how much is left" statistic.

What am I saying? I want to know how efficient these filesystems are in packing the data on the HD.

I know Reiser v3 has "tail packing", which takes small files and ends of files that stick out past a block boundary and packs them inside "sub-blocks" to save space. ext2/3 is stuck at the block boundary (even though you can adjust the size of these blocks).
I don't know if ext2/3 has been enhanced to pack small files in inode data.
JFS and XFS do not have a tail-packing feature, and are also stuck at (adjustable) block boundaries.
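The block-boundary cost described above can be estimated with a few lines of Python. This is a back-of-the-envelope sketch: the file sizes are invented, the 4096-byte block size is an assumption, and the "tail packed" figure is an idealized lower bound that ignores metadata, not a model of how ReiserFS actually lays out data.

```python
import math


def allocated_bytes(sizes, block_size):
    """Space used when every file is rounded up to whole blocks."""
    return sum(math.ceil(s / block_size) * block_size for s in sizes)


def tail_packed_bytes(sizes, block_size):
    """Idealized bound: tails share blocks, so only the grand total is rounded up."""
    return math.ceil(sum(sizes) / block_size) * block_size


# Hypothetical mix: 1000 tiny files plus 10 files slightly over one block.
sizes = [100] * 1000 + [5000] * 10
print(allocated_bytes(sizes, 4096))
print(tail_packed_bytes(sizes, 4096))
```

For this mix, whole-block allocation uses about 4.2 MB to hold 150,000 bytes of data, and the idealized packed layout uses about 150 KB. A "load it up and see what's left" benchmark would measure something between those two extremes for each filesystem.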

I'm glad that you get more data out of Reiser v4, JFS, and XFS at formatting time, but my feeling is that Reiser v4 (once profiled, tweaked and refined for speed and space) will pack data tighter than anyone else. Meanwhile, I'm looking for something like ext3 that packs better.

--
# Canmephians for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.net";

Interesting? How about a DECENT one? by diegocgteleline.es · 2006-01-06 07:01 · Score: 5, Interesting

I'm *sick* of reading filesystem benchmarks from people who don't even care about reading the documentation of the filesystems they compare.

OK, so ext3 is not the fastest filesystem on earth. But it has some default options which make it suck even more than it usually does, and those options are *documented* in Documentation/filesystems/ext3.txt

* Ext3 does a sync() every 5 seconds. This is because ext3 developers are paranoid about your data and prefer to care about your data rather than win on benchmarks. Syncing every 5 seconds ensures you don't lose more than 5 seconds of work, but it hurts on benchmarks. Other filesystems don't do it; if you are doing a FAIR comparison, override the default with the "commit" mount option.

* ext3's default journaling mode is slower than those of XFS, JFS or reiserfs, because it's safer. When ext3 is going to write some metadata to the journal, it takes care of writing to the disk the data associated with that metadata. XFS and JFS journaling modes do *not* care about this, nor should they: journaling was designed to keep filesystem integrity intact, not data. ext3 does it as an "extra", and it's slower because of that. But if you want to do a fair comparison, you should use the "data=writeback" mount option, which makes ext3 behave like xfs and jfs WRT journaling. Reiserfs's default journaling mode is like XFS/JFS, but you can make it behave like the ext3 default option with "data=ordered".

ext3 is not going to beat the others by using those mount options, but it won't suck so much, and the comparison will be more fair. And remember: ext3 trades speed for data integrity. There's nothing wrong with XFS and JFS, but _I_ use ext3.
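For anyone rerunning the tests, the mount options named above could be assembled like this. A hedged Python sketch that only builds the command lines and does not run them: the device /dev/hdb1 and mount point /mnt/test are placeholders, and commit=30 is an arbitrary interval picked for illustration. Check the kernel documentation for your version before relying on any of it.

```python
def mount_command(fstype, device, mountpoint, options):
    """Build (but do not run) a mount command line as an argument list."""
    return ["mount", "-t", fstype, "-o", ",".join(options), device, mountpoint]


# Placeholder device and mount point; commit=30 is an arbitrary example value.
ext3_relaxed = mount_command("ext3", "/dev/hdb1", "/mnt/test",
                             ["data=writeback", "commit=30"])

# The reverse adjustment: make reiserfs order its data writes like ext3's default.
reiserfs_ordered = mount_command("reiserfs", "/dev/hdb1", "/mnt/test",
                                 ["data=ordered"])

print(" ".join(ext3_relaxed))
print(" ".join(reiserfs_ordered))
```

Either adjustment puts the filesystems on the same journaling footing, so a benchmark should pick one and state which it used.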

Re:Interesting? How about a DECENT one? by molnarcs · 2006-01-06 08:07 · Score: 1

Hmmm.... I don't understand your outburst: "OK, so ext3 is not the fastest filesystem on earth" - what? I looked at the benchmarks, and it appears that ext2/3 wins every single test that matters to me: find dirs, files, untar, tar, copy tarballs(s), the kernel source tree (ext2/3 now outperforms reiser3). I understand your points, and everything you wrote is true, but there is no need to defend ext3, for it performs admirably (did you RTFA?).
Re:Interesting? How about a DECENT one? by cecom · 2006-01-06 08:09 · Score: 2, Interesting

Maybe I am misinterpreting the data somehow, but from a quick look at the article EXT2/3 is performing quite well.

touch files - slowest
find files - fastest
remove files - fastest
make directories - slowest
find directories - second best
remove directories - best
copy tarball to cur disk - middle of the pack
copy tarball to other disk - middle of the pack
untar kernel - fastest
tar kernel - second best
remove kernel sources - fastest
copy tarball - fastest
create 1GB file - fastest
copy 1GB file - fastest
split 100MB - fastest
copy kernel sources - fastest
cat to /dev/null - middle of the pack

Based on this, I'd say ext2/3 is doing exceedingly well overall (if not the best!). So, what am I missing ?
Re:Interesting? How about a DECENT one? by Anonymous Coward · 2006-01-06 08:50 · Score: 0

"journaling was designed to keep filesystem integrity intact, not data"

No, journaling was designed to keep DATA integrity intact and to allow rollback to a previous point in time. Remember, journaling has been around for a long, long time and on many types of systems. Unix specific metadata was not an abstract design goal, just a system specific implementation detail.
Re:Interesting? How about a DECENT one? by diegocgteleline.es · 2006-01-06 09:01 · Score: 1

it appears that ext2/3 wins every single test that matters to me: find dirs, files, untar, tar, copy tarballs(s), the kernel source tree (ext2/3 now outperforms reiser3)

It still loses on others. benchmarking it properly can change things...
Re:Interesting? How about a DECENT one? by xenocide2 · 2006-01-06 11:02 · Score: 1

If they've missed the obvious ones on ext3, what does this say about reiserFS?

--
I Browse at +4 Flamebait
Open Source Sysadmin
Re:Interesting? How about a DECENT one? by HiThere · 2006-01-06 11:28 · Score: 1

What it probably says is that all the systems were tested with the default options. And, despite the gp, that's a perfectly reasonable choice, though they should be explicit about it.

E.g., what isn't mentioned in the gp is that Reiser performs better on a swarm of small files than on honking big ones. Or that Reiser is more space efficient with the typical small files. So there is a bias in the selection of tests, too.

P.S.: I always choose ext3. I used Reiser once several years ago and found that the advantages weren't, for me, significant. And at that time the file recovery utilities weren't there. I presume that they now exist, but since the advantages weren't significant for me, I haven't checked. I haven't experimented with anything else but ext2, and I find that the advantages of ext3 over ext2 ARE significant. (Faster mounts, better error recovery, etc.)

OTOH, I have a small system. 2 240G disks ... which means that they are large enough that the small files won't eat them alive, and I don't currently have any huge files. (The closest I have is an old backup tarball that I'm planning to discard. I think I'll replace it by a copy of the /home directory.) OTOH, I am thinking of installing a DVD writer to use as a backup device...only I've heard some very bad things about DVD durability. But when I had a tape backup system, I noticed that I didn't have the discipline to use it regularly, and keep the backups current, with a DVD I could copy the entire /home directory (perhaps I'd need to tar it), but if it isn't stable, can I trust it? But a backing hard disk could be destroyed by the same power burst that took down the main disk.... Yiyiyiyi....

Still, none of this appears to help choose which file system to use.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Interesting? How about a DECENT one? by TCM · 2006-01-06 11:52 · Score: 1

* Ext3 does a sync() every 5 seconds. This is because ext3 developers are paranoid about your data and prefer to care about your data rather than win on benchmarks. Syncing every 5 seconds ensures you don't lose more than 5 seconds of work, but it hurts on benchmarks. Other filesystems don't do it; if you are doing a FAIR comparison, override the default with the "commit" mount option.

Is it just me or does this sound like a ridiculously ugly approach to data integrity?

--
Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
Re:Interesting? How about a DECENT one? by diegocgteleline.es · 2006-01-06 12:03 · Score: 1

Is it just me or does this sound like a ridiculously ugly approach to data integrity?

It must be just you. By sync'ing your data every 5 seconds they ensure that you can't lose more than 5 seconds of work. Other filesystems try to avoid syncing as hard as possible and you could lose 10, 20, 30 seconds of work...
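
For anyone who wants to check which commit interval a mounted filesystem is actually using before comparing numbers, here is a minimal Python sketch. It is an illustration, not something from the article: it assumes the usual /proc/mounts line layout (device, mount point, type, options, ...) and that an overridden interval shows up as commit=N in the options. The function name and the fallback of 5 seconds (taken from the quote above) are assumptions.

```python
def commit_interval(mount_line, default=5):
    """Return the commit interval in seconds from one /proc/mounts-style line.

    Falls back to `default` when no commit=N option is present.
    """
    fields = mount_line.split()
    if len(fields) < 4:
        raise ValueError("not a mount line: %r" % mount_line)
    # The fourth field is the comma-separated option list.
    for opt in fields[3].split(","):
        if opt.startswith("commit="):
            return int(opt.split("=", 1)[1])
    return default
```

Feeding it each line of /proc/mounts would show whether a benchmark machine was left at the default or mounted with something like -o commit=30.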
Re:Interesting? How about a DECENT one? by yourlord · 2006-01-06 12:46 · Score: 1

It depends on what you're testing. If your goal was to test the filesystem performance based on the default behavior of the filesystem then the test was fine.

If you're going to performance tune one of the filesystems, then you need to performance tune all of them before comparing.
Re:Interesting? How about a DECENT one? by kesuki · 2006-01-06 13:04 · Score: 1

only I've heard some very bad things about DVD durability.

only problems i've had with DVDs are crappy write method drives that wound up making many coasters... i've since gotten a more reliable drive, have burned over 150 DVDs and haven't lost a single one.

apparently though DVD (and CD) media are susceptible to fungal growth that 'eats' the dye. I've never stored my media in a 'cool dark place' (it's always been sitting out where i can find it off the spindles) so ambient light appears to be adequate to 'prevent' data loss via fungal blooms.

--
https://www.gnu.org/philosophy/free-sw.html
Re:Interesting? How about a DECENT one? by sydb · 2006-01-08 09:19 · Score: 1

Actually, just looking at performance figures alone is meaningless anyway. I can give you the fastest filesystem in the world, but it might not hold your data! You have to look at what you want from a filesystem - its limits, reliability, integrity and THEN its speed. As you say it is valid to test for performance of a default configuration but whether something is "valid" or not doesn't say much about how *useful* it is to most people.

--
Yours Sincerely, Michael.

Thanks! by Anonymous Coward · 2006-01-06 07:03 · Score: 0

It's missing the most critical stuff from the tests, but I guess those things are hard to measure without manually creating a hardware failure.

I'm glad for this information, though. It affirms my choice of Ext-3 as the best all-around filesystem for my Linux servers and workstations. It's not the fastest, certainly not the slowest, but it's well-supported with utilities, and standard in every bootdisk kernel.

Poor benchmark writeup. MS Excel graphs? by Srdjant · 2006-01-06 07:04 · Score: 2, Informative

What's with the Microsoft Excel style graphs? They're not very precise or professional-looking.
You would have thought the author would use something better like gnuplot?

The author's opinion "Personally, I still choose XFS for filesystem performance and scalability."
is largely irrelevant here and sounds like bias, although the author acknowledges this.

There is no discussion of the results. The text between the graphs only mentions superficially
what is obvious to anyone looking at the graphs.

Seems a far cry from the very nicely done BSD and Linux benchmark at http://bulk.fefe.de/scalability/

Old Comps by Zaurus · 2006-01-06 07:06 · Score: 1

> 500mhz is at least 8 year old technology if I remember correctly.

Close. When I bought a computer in Jan 1998 (8 years ago this month), the fastest processor available from Dell/Gateway was a PII-300. I'd say it's more 6-7 year-old tech.

JFS ... by Pegasus · 2006-01-06 07:11 · Score: 2, Interesting

Of course JFS won, since it was designed to be as simple as possible ... it originates from OS/2, after all. On such a machine as used in this test, this is a huge advantage.

How is ext3 mediocre? Default limitations is how by WoodstockJeff · 2006-01-06 07:12 · Score: 1

I have personally had to deal with the results of forgetting to change from EXT3 to something else when setting up one of our servers. Took a year, but one of the database files reached that magical compiled-in limit of 4GB... Fortunately, I caught it shortly after it happened, and was able to rearrange things to keep the server from getting too far out of sync with the rest of the cluster.

EXT3 has a lot going for it, but the default compile options (at least the ones used by several of the popular packagers) make it incompatible with large files. "Large files" is, of course, a relative term, but more than a few people deal with 4+GB files nowadays, like DVD ISOs, so it's not just billion-record databases that blow up EXT2/3.
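
A quick way to find out whether a given mount will accept a file past some size, before a database discovers it for you, is to try creating a sparse file of that size. The Python sketch below is only an illustration under assumptions: the function name is made up, and it relies on the filesystem supporting sparse files so that the probe does not actually consume the space.

```python
import os
import tempfile


def can_hold_file_of_size(directory, size_bytes):
    """Try to create a sparse file of size_bytes (>= 1) in directory.

    Returns True if the file ends up with the requested size, False if the
    kernel or filesystem refuses (for example with EFBIG).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.seek(size_bytes - 1)   # jump to the last byte without writing data
            f.write(b"\0")           # one real byte fixes the file length
        return os.path.getsize(path) == size_bytes
    except (OSError, OverflowError):
        return False
    finally:
        os.remove(path)              # always clean up the probe file
```

Calling it with something like 5 * 1024**3 on the data directory would show whether a 4GB ceiling applies there.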

If I wanted small files, I'd have used FAT32! :)

I would disagree with you by Anonymous Coward · 2006-01-06 07:14 · Score: 0

ext2/3 are actually the clear speed winners.

Almost the only useful part of the article is the table "File Benchmark II Data". Scrolling through all the bar graphs is a waste of time. (Unfortunately, the first few pieces of data appear only in bar graphs, and not in the table.) The "All Test Times" plot would be very useful, except half the points don't have a corresponding label on the x-axis!

And what's important for the kind of evaluation you're doing are only the real world benchmarks like "UnTAR Kernel 2.6.14.4 Tarball", "Copy 2.6.14.4 Kernel Source Tree", "Mount Filesystem", and so on. ext2/3 are the fastest in almost all of these benchmarks. Then comes XFS and JFS, then ReiserFS, and finally Reiser4, which appears to be something of a dog so far as these benchmarks are concerned.

Obviously benchmarks don't tell the whole story. There are advanced features for Reiser4, ReiserFS, XFS, and JFS which could easily be more important for a user than these small differences in speed. And although ext2 is apparently fast for real world use, it is not journalled, which should disqualify it for most users.

The more fundamental benchmarks like "Make 10,000 Directories" are not useful for choosing which filesystem to use. ext2/3 stink on ice when making large numbers of directories, but it's not obvious what fraction of time users spend making directories. Plus, most directories that are created will eventually be removed, at which point ext2/3 win back most of the speed that they lost. What the fundamental benchmarks are good for is figuring out why a particular filesystem is unusually fast or slow at some benchmark, and so how that filesystem could be tuned or otherwise improved.

Anyway, it really looks like ext3 is a decent choice for the general user. I think it's bad to mandate the use of ext2/ext3 at install time, but that's a separate issue.

For instance, servers. by Anonymous Coward · 2006-01-06 07:15 · Score: 0

If you are thinking about a file server, it's very important to match the file system to the type of network protocol (NFS, SMB, whatever).

Ext3 and NFS work ok, except that writing a bunch of small files (as in untarring to an NFS mounted dir) is as slow as, well, swimming in cold tar (:P. An untar that takes less than a second to a local ext3 filesystem can take over a minute to the same file system mounted via NFS. Yes, that's run from the same fast server. It's a bad interaction, having to do with how NFS and ext3 deal with write commits. Even with all the NFS tweaks in place, it's still a couple of orders of magnitude slower.

Similarly, if your Samba server uses a non-Windows authorization backend, such as NIS or MIT Kerberos/LDAP, your filesystem really doesn't matter -- it's going to be slower than needed due to the overhead of translating the protocols.

A web or other server that does primarily read, read, read all day, should avoid ReiserFS, because if you get slashdotted you want your CPU as free as possible. That's not the only factor, but the point is that a lightweight, non-journalled FS can be your friend if you aren't writing to disk much.

Re:For instance, servers. by budgenator · 2006-01-06 10:41 · Score: 1

There's a lot to be said for mounting most partitions read-only, only /tmp, /var, /home need to be R-W most of the time.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds

A bit ridiculous... by Anonymous Coward · 2006-01-06 07:19 · Score: 0

Well. Wow. A benchmark that runs for between 0.03 and 0.07 seconds sure is quite precise, free from random variations and stuff. I bet most of the time was taken by spawning the "touch" process 10k times anyway.

About the "free space" issue, some filesystems count the space used by their internal data structures, some don't; so for instance reiser has less free space after formatting but that doesn't mean you can put less data on it...
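
To look at the raw numbers behind a "free space" figure, a small Python sketch (my own illustration, not from the article) can print what the kernel reports through statvfs(): total size, free space, and space available to unprivileged users. The function name space_report is hypothetical. How each filesystem accounts for its own metadata in these numbers varies, so the output is a starting point, not a measure of how much data will fit.

```python
import os


def space_report(path):
    """Return total, free and available bytes for the filesystem holding path."""
    st = os.statvfs(path)
    unit = st.f_frsize               # fragment size: the unit the counts are in
    return {
        "total": st.f_blocks * unit,
        "free": st.f_bfree * unit,        # free blocks, including root-reserved
        "available": st.f_bavail * unit,  # free blocks for unprivileged users
    }
```

Running it on freshly formatted partitions of the same size would show how much each filesystem reports as gone before any data is written.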

Also the results are pretty useless. I'm not really interested in knowing how long it will take to sequentially access a small (10K) number of files which are cached in RAM anyway, so it's CPU-limited and not disk-limited. I'm more interested in what happens when the dataset is larger than RAM, how intelligently stuff is cached, etc. These make a lot of difference in the way the computer feels, between sluggish and responsive, but there is really no benchmark for that...

All I know is that reiser4 is the only filesystem that made my crap slow laptop harddrive at least usable. That's a good enough benchmark for me...

Slow processors, compiling by Stunning+Tard · 2006-01-06 07:21 · Score: 1

As you and others have pointed out, running the benchmarks with a slower CPU is useful.
So I'd agree that these tests aren't worthless, but they're only a start.

Also useful would be running these tests with a faster CPU to see how things change. The CPU might be a bottleneck in some cases, so it would be interesting to see how the picture changes. The CPU utilization went to 100% on many of his tests.

You could also try some tests with a filesystem mounted in memory to see where seek time becomes a bottleneck. Because you can't be too sure if flash drives might overtake harddrives for price AND speed. Some people use flash drives regardless of the cost.

These tests are also application independent, which limits their usefulness a little. When somebody benchmarks a new 3d video card they'll start with 3dMark. But then they'll continue on and test with actual games.

So I'd like to see some practical benchmarks. Compiling something large is a great start. Then try various database loads. Some workstation or home pc desktop apps. Games. Some of the tests done by the folks at StorageReview.com might be relevant too.

Flash / SWF by fire-eyes · 2006-01-06 07:22 · Score: 0, Redundant

Editors: Please don't post links to such garbage ridden pages like this. I got at least three or four prompts in konqueror 3.5 to save a .swf file or cancel.

--
-- Note: If you don't agree with me, don't bother replying. I won't read it.

those FS are junk by Anonymous Coward · 2006-01-06 07:24 · Score: 0

no NTFS, no fucking care.

iozone, other comments by homebrewmike · 2006-01-06 07:26 · Score: 1

Reasonable test plan (clear and reproducible), but not much theory. (Why was he running the untar? Is there a special feature of a file system optimized for tar? Would have been nice to see what he was trying to measure.)

Also, it would have been nice to see a run of iozone against the competing filesystems. Iozone (or bonnie, for that matter) is a pretty standard benchmark, and should have been included.

Finally, these are basically synthetic tests - it would have been a little more useful to see something like database access tests. One thing that quite a few benchmarkers ignore is the impact of the unit under test on the overall system: that sort of thing is more readily discernible with an application test.

Re:iozone, other comments by ananke · 2006-01-06 13:09 · Score: 1

I haven't tested reiserfs4, but recently I've been trying to figure out which iSCSI initiator is best for my purposes. I've used iozone for the benchmarks, and the results are at http://staff.vbi.vt.edu/dom/iscsi/testing

Pretty graphs would help, and eventually I'll get them finished.

--
--- d'oh
Re:iozone, other comments by Anonymous Coward · 2006-01-06 14:39 · Score: 0

In the case of creating the tarball he's looking at how the filesystem handles reading data from lots of small files and writing to one big file. Conversely, the untar tests how it deals with reading from one big file and writing lots of small files. It's pretty clear that some filesystems are engineered to perform well on small files or large linear datasets, and this tests how well a filesystem performs when it has to do both simultaneously.

Ghost by Anonymous Coward · 2006-01-06 07:27 · Score: 0

I use EXT3 because it is supported by Norton Ghost 2003.
Does anyone know if newer Ghost software supports one of the newer filesystems?

Dan

Re:How is ext3 mediocre? Default limitations is ho by Isaac-Lew · 2006-01-06 07:29 · Score: 1

I believe that the current maximum filesize limit for ext3 is 2 terabytes (possibly more by now).

ZFS? by Jerrry · 2006-01-06 07:30 · Score: 1

Is anyone working on porting Sun's ZFS to Linux? Now that's a really cool filesystem.

Re:ZFS? by Directrix1 · 2006-01-06 08:16 · Score: 1

It just seems like a mixture of LVM, RAID, and any journaled filesystem. Except ZFS is patent crippled, so I doubt it'll ever happen.

--
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
Re:ZFS? by Jerrry · 2006-01-06 09:21 · Score: 1

I know there are a few efforts to port it in the *BSD world, so it seems they're not too worried about any patents.

The thing that really makes ZFS cool is its simple, but powerful administrative interface. It literally takes me two minutes to set up a complex system in ZFS that takes over an hour to do the equivalent in Linux.
Re:ZFS? by Directrix1 · 2006-01-06 13:37 · Score: 1

Really? I've never really thought doing stuff like that in Linux was that complicated. Care to post some examples of equivalent commands in both (I haven't used ZFS, I just watched one of their demo videos, and read their site)?

--
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
Re:ZFS? by beep999 · 2006-01-06 16:08 · Score: 1

Oh, zfs has more cool features than that. One is that it is copy on write, so it's faster and potentially safer than any journalled filesystem. It also has automatic volume identification. Basically, you can build a large, multi-volume filesystem on a machine, pull all the disks out, insert them into another machine in random order, then just tell zfs that it has some new disks and it will find the filesystem and mount it up, no mess, no fuss. The copy on write allows for low-overhead snapshots for clean backups or versioning.

It also has excellent checksum protection on all data blocks, so it can identify failing disk drives early and in some cases heal itself from the problem. Since the LVM support is built in at the filesystem level, if a mirror is broken and then the same disk re-attached, it only has to rebuild changed data, not the whole device as a low-level LVM would.

Lastly, it has very low overhead filesystem creation and resizing. The Sun guys I was talking to said their normal use pattern for zfs is to give each user on a system their own filesystem. They've built systems with tens of thousands of mounted filesystems and no performance issues.

Available sample size by Anonymous Coward · 2006-01-06 07:31 · Score: 0

I can't think of ANY time where a sample size of 1 was meaningful for anything.

I bumped into my neighbor yesterday and she complained about the size of your sample not being meaningful, too.

Re:How is ext3 mediocre? Default limitations is ho by StarHeart · 2006-01-06 07:35 · Score: 1

I'm not sure what distributions you are using, but the biggest users of ext3 are RHEL and Fedora, and they have had large file support for years.

--
Havoc Penington, the bane of my Linux desktop.

benchmarks that take less than 1/10 of a second by hansreiser · 2006-01-06 07:40 · Score: 4, Insightful

If someone does not know that filesystem benchmarks that take less than a tenth of a second are meaningless, it makes you wonder if they made errors in other aspects as well. These results are not consistent with the results that we have had. I bet he did not make an effort to ensure that you had to read the disk for these benchmarks, that he did not copy his file set from the same fs as he was measuring (makes a HUGE difference to performance and it is the mistake every beginner makes), etc. You'll note that the way he makes his graphs makes 1% differences look huge, etc.

Re:benchmarks that take less than 1/10 of a second by dtfinch · 2006-01-06 08:13 · Score: 1

If I exclude the "less than 1 second" benchmarks, the reiser filesystems are a little better off, but still in last place. If I additionally exclude the "Remove 10,000 Directories" benchmark, reiser 4 and 3 move up to 2nd and 3rd place, and EXT2 and 3 move into last. JFS seems to win 1st no matter how I work the numbers.
Re:benchmarks that take less than 1/10 of a second by Professor+Calculus · 2006-01-06 11:01 · Score: 1

What if you exclude all of the tests which have >60% CPU use as well? Regardless of how long a test took to run, it cannot be considered to be I/O bound unless the cache hit rate is low. The biggest flaw in these tests is that he only did a 'sync' in between each test, instead of a 'umount'; all of the cache-bound results are virtually meaningless.
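
A full umount/mount cycle is the thorough way to get a cold cache. As a lighter-weight illustration of the cached versus uncached difference, the Python sketch below asks the kernel to evict one file's pages with posix_fadvise() before timing a read. This is an assumption-laden sketch, not what the article did: the hint is advisory, it does nothing for dirty pages (sync first), it leaves directory and inode caches alone, and the function name is made up.

```python
import os
import time


def timed_read(path, drop_cache_first=False, chunk=1 << 20):
    """Read path in full and return (seconds, bytes_read)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if drop_cache_first:
            # Advisory only: ask the kernel to drop this file's cached pages.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        start = time.perf_counter()
        total = 0
        while True:
            data = os.read(fd, chunk)
            if not data:             # empty read means end of file
                break
            total += len(data)
        return time.perf_counter() - start, total
    finally:
        os.close(fd)
```

Comparing timed_read(path, drop_cache_first=True) with an immediate second timed_read(path) on a large file gives a rough feel for how much of a result is disk and how much is RAM.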

As another poster pointed out, I'd love to see a set of benchmarks using CFQ v3 as well as the anticipatory I/O scheduler...

XFS - UPS = Disaster by fire-eyes · 2006-01-06 07:41 · Score: 2, Interesting

XFS is a nice filesystem, I like it. Not enough to use in production, but I like it. Personally I use reiserfs3.6 on many production servers, and have never seen a problem. I am experimenting with 4 at home.

I have a strong warning if you are considering XFS. If you don't have a GOOD power backup (UPS), then don't use it. XFS caches very aggressively for writes in RAM. You lose power, you lose that data.

XFS was designed with datacenters with good power backups in place, not home users. So choose carefully.

--
-- Note: If you don't agree with me, don't bother replying. I won't read it.

Re:XFS - UPS = Disaster by ananke · 2006-01-06 13:03 · Score: 2, Interesting

Recently I've been doing some benchmarks to test iSCSI initiators on Linux. So far [until 2.6.15], XFS is the only filesystem that got damaged after some kernel panics. On 2.6.15 I've damaged JFS almost every time I got a kernel panic, very frightening.

Anyway, for anybody interested, the results are at: http://staff.vbi.vt.edu/dom/iscsi/testing

--
--- d'oh

i dont trust this by carlosGames · 2006-01-06 07:57 · Score: 1

The first impression is that JFS is the winner, but I have found some errors in the comments about the graphs. The CPU is slow (I'm not asking for a benchmark on an Opteron or something like that either), so it may have directly influenced the benchmark results. Another problem, as commented before, is that the benchmark is (apparently) based on the result of a single run, so the results may be completely different if a filesystem benchmark part III is done. It would be a great idea to publish the scripts that were used for this benchmark, so they could be used to run more benchmarks on different hardware, many times over.
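
As a sketch of what "more than one run" could look like, the Python below repeats any benchmark callable several times and reports the mean and sample standard deviation. It is only an illustration: the function name and the default of 5 runs are arbitrary choices, and it is not the script used in the article.

```python
import statistics
import time


def repeat_benchmark(fn, runs=5):
    """Call fn() `runs` times; return (mean_seconds, stdev_seconds, samples)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    # A standard deviation needs at least two samples.
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev, samples
```

If the standard deviation is as large as the gap between two filesystems, the ranking between them probably means little.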

Re:Very interesting article... NOT! by hackstraw · 2006-01-06 08:06 · Score: 5, Insightful

I would rather see these benchmarks on a computer less than 5 years old. I would also appreciate an open source version of the tests so they could be reproduced. For ease of reading, I think the article should be on a separate page on the site as well.

I've got a screaming Dell 1.6 GHz P4 to test with, and here are my results for a couple of tests. It only has ext3 and whatever cheap hard drive came with the box. I'm not sure if DMA is enabled or if I've done any hdparm tunings, but I'm not sure about their test system either:

my touch 10,000 files: 24.314 seconds theirs 48.25

I used a shell script that called /usr/bin/touch

Now if I use a Perl open() call, I get 8.887 seconds
Now with a cheesy C program that uses fopen() and fclose() I get 4.639 seconds

my make 10,000 directories: 56.832 seconds theirs 49.87

that is a shell script

If I use Perl, I get 35.171 seconds
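
The gap between the shell, Perl and C numbers above suggests that much of the time goes to starting processes rather than to the filesystem. A rough Python sketch of the same comparison follows. It is an illustration only: the function names are made up, it assumes a touch binary on the PATH, and it will not reproduce the timings quoted here.

```python
import os
import subprocess
import tempfile
import time


def create_files_in_process(directory, count):
    """Create `count` empty files with open(); return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, "f%d" % i), "w"):
            pass
    return time.perf_counter() - start


def create_files_by_spawning_touch(directory, count):
    """Create `count` empty files by running touch once per file."""
    start = time.perf_counter()
    for i in range(count):
        subprocess.run(["touch", os.path.join(directory, "f%d" % i)], check=True)
    return time.perf_counter() - start
```

Running both against fresh directories on the same filesystem, for example ones made with tempfile.mkdtemp(), separates the per-process overhead from the cost of the file creation itself.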

The /dev/zero stuff is completely bogus. No indication of the blocksize that was used.

The copy kernel stuff to and from a different slower disk with an unknown filesystem on it is useless.

The split tests are not indicative of anything in real life, and they took on the order of 60 to 130 seconds to perform on their 500MHz system, with most being in the 130 second range. I got 16.547 seconds.

I do not see how any relevant information can be obtained from this article. I'm disappointed in the Linux Gazette and Slashdot for printing this information.

Old Shitty Machine, Shitty Results by LordMyren · 2006-01-06 08:09 · Score: 2, Insightful

<blink> Test is flawed! </blink>

Check out the CPU utilizations; reiserfs is pegged at 100% CPU utilization for ~8 tests. For a FS which describes itself as willing to use more CPU in order to achieve better I/O than the competition, running the benches on an antiquated 700 MHz machine is simply not fair.

OTOH, untarring and tarring are notably NOT CPU limited, and still pretty lackluster in Reiser's case. Disappointing, very disappointing. I was extremely impressed in the ext's; I simply had no idea how consistently well performing they were.

I'd also like to see FreeBSD's UFS w/ and w/o softupdates benched.

Myren

Re:Old Shitty Machine, Shitty Results by Anonymous Coward · 2006-01-06 08:58 · Score: 0

The aggregate results appear to be off by one.

The image 016.png shows JFS in the middle; for some reason the final points did
not plot to the end of the graph, possibly due to the way I graphed it. I needed
to make sure the graph sizes were below a certain pixel allocation (600x400 I
believe), and when I did the graphs and specified them to show a test for each
point on the graph, they exceeded that size. If you recall, in the last test MANY
people complained (rightfully so) that the fonts looked horrible (and they did),
because originally I had generated the graphs very large and scaled them down.

All of the tests are correct and the raw numbers are at the bottom for anyone to
re-generate the composite graph if necessary; the composite graph(s) appear to
be the only issue. I had made sure I fixed the one issue with the last benchmarks,
that was the 1000, 1024, 2048 split, where it did not graph correctly to 8192 bytes;
unfortunately I missed it by mistake in the composite graph.

I am sorry for the error.

Justin.
Re:Old Shitty Machine, Shitty Results by JPyun · 2006-01-06 09:44 · Score: 1

Actually, it's even worse. They used a 500 MHz test machine.
Re:Old Shitty Machine, Shitty Results by Aragorn379 · 2006-01-06 14:32 · Score: 1

> I was extremely impressed in the ext's; I simply had no idea how consistently well performing they were.

I'm not so sure about consistent performance. I've run into the extreme slowdown caused by many files in the same directory many times and it can be pretty painful. I'm not sure if this has been addressed at all, but from my understanding it is a limitation of the way the file system data is stored on disk, with extra levels of indirection being required. With Reiser, I can have directories with tens of thousands of files with no problems.

OT: Sig by LordMyren · 2006-01-06 08:13 · Score: 1

What are your grievances with Mr. Pennington? Just curious, thanks.

Myren

Re:OT: Sig by StarHeart · 2006-01-06 16:05 · Score: 1

He was an architect of the philosophy behind the dumbing down of Gnome, and strongly advocates it.

--
Havoc Penington, the bane of my Linux desktop.

Crappy benchmarks. by bored · 2006-01-06 08:32 · Score: 0

I have a few systems here at work that have many TB of storage and bandwidth requirements >1G/sec. It would be nice to see someone actually test this kind of setup with the Linux file systems. From what I've personally seen, XFS is just about the only legitimate solution for our application (it's not exactly optimal either; it's getting its butt kicked by NTFS). This is because Linux simply uses too much CPU to maintain any reasonable throughput (aka >500Mb/sec), and the larger file systems tend to either get so slow as to be unusable, or they simply don't work.

Re:Crappy benchmarks. by Anonymous Coward · 2006-01-06 10:58 · Score: 0

google for "ext3 extents mballoc"

Re:How is ext3 mediocre? Default limitations is ho by WoodstockJeff · 2006-01-06 08:55 · Score: 1

Yes, the theoretical maximum is comfortably large... But, that limit is not what is compiled in by RH, Mandrake, and a few others I've tried. I haven't tried RedHat's "Enterprise" kernel, but most are compiled with a 4GB limit on file size.

How much better are we talking here? by H4x0r+Jim+Duggan · 2006-01-06 08:58 · Score: 1

> How much better are we talking here?

See my second post in this thread, I give a link to a benchmark where Reiser is twice as fast as its nearest competitor.

--
Please help publicise swpat.org - the software patents wiki

Re:How much better are we talking here? by molnarcs · 2006-01-06 09:55 · Score: 1

That benchmark is from 2003... since then, it appears that ext2/3 (and xfs) have improved a lot (the new benchmark mentions that). Anyway, thanks for the link - I'm still not convinced about the capabilities of reiser4 though, neither as a stable/reliable fs, nor as something that outperforms the competition. I don't see how much more CPU can help it ... after all, it did use more CPU to perform worse than the others, and 500 MHz is not much, but it should be enough, especially for a test where nothing is done but copying files.

excep by LWATCDR · 2006-01-06 09:01 · Score: 1

All the tests were done on a 500 MHz PIII machine.
Not exactly what I would call state of the art. The test results seem valid for a home server that you built out of leftover parts but not for much else.
Did he compile the FSs himself? If so, what optimizations did he use with the compiler?
I don't get the importance of deleting thousands of directories. Do you do that all that often? Why would you?
What was the point of the test? What environment were they trying to test for?
Desktop?
Home server?
Small office?
Mail server?
What about data integrity?

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.

Slow filesystem? by charlesnw · 2006-01-06 09:05 · Score: 2, Funny

Wow. They must have a super slow file system if they are just now getting the results of the tests and the first half of the article ran in the first half of last year! :) This is supposed to be funny by the way.

--
Charles Wyble System Engineer

Re:How is ext3 mediocre? Default limitations is ho by ISayWeOnlyToBePolite · 2006-01-06 09:13 · Score: 1

Debian Woody has LFS for ext3 (although I believe it was a backport and 2.4 only) and that was released on 19th of July 2002; what distro are you referring to?

Conclusion... by Anonymous Coward · 2006-01-06 09:17 · Score: 0

I ran a few standard deviation checks on the statistics, and the winner of the benchmark would be... JFS, believe it or not. ReiserFS is the clear loser here. Then again, Reiser would benefit from having more up-to-date hardware. Not totally unexpectedly, Ext2/3 is average. So is XFS.

Yep by Anonymous Coward · 2006-01-06 09:17 · Score: 0

"Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts."

Huh? Sorry, did you read the same graphs or are you just trolling?

Yep. It almost seems as if he thought the longer bars meant it was better. *Sigh*

From what I can see, the one outstanding statistic is that ReiserFSv4 still needs a lot of work if it's going to catch up with v3 (if you're into that kind of file system..)

It really looks to me that EXT3 is THE standard if performance in the areas that really matter are what you're looking for. It is way ahead of the pack at finding files and making directories. And, it keeps up with EXT2 in almost every other area which is impressive considering the added complexity.

I really don't think you should touch something like ReiserFS with a barge pole unless you're paranoid about losing data.

Put it this way, if you want to run a super computer and still keep integrity, you can do that with EXT3. All you have to do is copy data every now and then and you will do this MUCH faster than with ReiserFS's integrity measures. In reality, you will have failsafes, you will be able to resolve a power failure downtime easily. You'll do it better with ReiserFS but the cost in performance is just too high.

Better to do it manually with EXT3 and have a Ferrari in terms of performance. Actually, JFS and XFS look excellent too.

I can't see any reason to not use EXT2/3. For me, my main use for a good file system is all round performance. You can ALWAYS build in your own integrity measures easily. The dilemma with ReiserFS is that you use its integrity measures, but that makes IT slow.

I think a Windows server would beat the heck out of a ReiserFS based Linux server in terms of performance. Why bother?

You can bet google are not using ReiserFS on their servers!!

The two composite graphs are now fixed. by Anonymous Coward · 2006-01-06 10:06 · Score: 0

http://users.adelphia.net/~apiszcz/piszcz.html

You can view them here for now until Linux Gazette gets a chance to fix them!

Thanks,

Justin.

Re:The two composite graphs are now fixed. by Anonymous Coward · 2006-01-06 11:13 · Score: 0

Next time I am going to get some people to review it more thoroughly, test 015 has also been fixed to reflect the data. Anyone recommend good software in Linux for bar graphs and composite graphs?

Thanks,

Justin.

Statistics and lies. by chris_sawtell · 2006-01-06 11:19 · Score: 1

In my practical experience this article is a load of flushable nonsense. Just what do the /. editors think they are doing giving it space? I have run ext[23], Reiser[34], and xfs, all of them pretty extensively. For me the most important criterion is whether my data is safe. Everything else is of secondary importance. Yet this is not even mentioned in the article. The other issue is whether a file access intensive process bogs down the machine so much that it is unusable for tasks other than the one accessing files. Yet both of these most important facets of file system performance are totally ignored by the article. My experience is that Reiser4 wins hands down on both these vital features. The data recovery abilities of the fsck.reiser4 utility appear to be nothing less than 'magic'.

Question for Linus: Why are you not letting the Reiser4 drivers into the mainline kernel? Imnsho you are doing the Linux user community something more than a mere disservice by not allowing everyone simple access to these excellent file system drivers.

Moving from ReiserV4 to JFS by Anonymous Coward · 2006-01-06 11:23 · Score: 1, Interesting

I've seen those benchmarks before, and last time I saw them, I decided Reiser was for me. I've been using it ever since.

Based on the new benchmarks, I'm seriously considering moving to JFS. It seems to be much faster for my typical desktop usage.

I'm curious what hash function he was using with ReiserV4, though. TEA produces a more responsive filesystem, for me, than R5. Reiser defaults to R5, so I'm guessing he was using that, but I'd like to see the difference that TEA produces.

Hmmm. by jd · 2006-01-06 12:07 · Score: 1

No, I'd argue that the reason most people use ext2 and ext3 is that these are the filesystems that you get to use when you install the system. It's like arguing that most people use Windows because it's better - no, it's because it's there.

Secondly, you don't need to give every option to everyone. Red Hat's installer already lets you pick whether you're installing on a desktop, a server, or a custom system - so that automatically tells you which filing systems are likely to be wanted.

(eg: If you are installing for a desktop, you don't - probably - want a filing system geared for high-end servers. Likewise, a server box won't want a desktop-optimized filesystem. Custom installs should be exactly that, allowing you to custom-pick whatever you damn well like for the filesystem.)

So, uh, fedora developers are stupid and you're smarter than them?

That's easy. Yes.

The entire release cycle methodology is flawed. Development needs to be split into alpha and beta, where alpha is the latest release and beta is the set of RPMs that will co-habit the hard drive. When RPMs are built, a complete dependency map should be constructed and compiled.

Since "development" basically means "it'll compile, but it's not tested", you could cross-compile development trees for EVERY architecture Linux supports, on the grounds that you don't give a damn if unsupported binaries will actually run, but if they do, you've increased interest in that distribution and are in a position to expand and support other architectures if the interest turns out to be there. Costs nothing, but potentially earns lots.

It is obnoxiously difficult to get 3rd party RPMs into even the extras branch. Many programs have multiple configurations possible, but are often compiled with random, unexplained ones that make no obvious sense. RPMs that are present are sometimes ancient (HDF5 is at 1.6.5 with no szip support, in the extras, but the current version is 1.7.52 and szip is at 2.0. ATLAS - a very important maths library - is at 3.6.0, but the current "recommended" release is 3.7.11! The version of LAM is ancient and should be replaced with OpenMPI anyway.)

Yes, I would regard the Fedora developers as too slow, too entrenched and not interested in producing the optimal distribution, only the best one they can produce at minimal effort. To me, that is wholly unacceptable. Towards the end of the last ice age, you could understand people conserving effort. That is no longer the most efficient way to get things done.

(If anyone out there cares to provide some disk space, I'd be more than happy to show how Red Hat could be done better, with greater versatility, yet with fewer headaches for the novices.)

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

CVS/SVN by jd · 2006-01-06 12:21 · Score: 1

If you're a programmer using a distributed project manager, then creating large numbers of directories (but not removing them) is quite likely going to be a significant operation. For the same reason, computers acting as FTP mirrors will find that an important statistic.

PERL CPAN users will likely also be familiar with the notion of massive numbers of directories being created. Programs that create workspaces in the /tmp directory, on the assumption that the system will clean it up later, are also part of the create-only plague.

So if you are in any of these categories, ext2/ext3 should NOT be used for your workspace partition or the partition on which the /tmp directory resides.

There will be other cases, but those are the clearest to me.
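
For anyone who wants to check this on their own partition, here is a rough sketch of the create-only case, not the article's method. The function name `time_mkdirs` and the directory count are made up for illustration; point the temporary directory at the filesystem you actually want to measure.

```python
import os
import tempfile
import time


def time_mkdirs(root, count):
    """Create `count` empty directories under `root`; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        os.mkdir(os.path.join(root, f"d{i:07d}"))
    return time.perf_counter() - start


if __name__ == "__main__":
    # TemporaryDirectory(dir=...) can be pointed at a mount on the
    # filesystem under test; the default temp location is only a placeholder.
    with tempfile.TemporaryDirectory() as root:
        elapsed = time_mkdirs(root, 2000)
        print(f"created 2000 directories in {elapsed:.3f}s")
```

Running the same script against a scratch partition formatted with each filesystem gives a comparable number, though caching and mount options will affect it.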

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

These statistics prove! by halr9000 · 2006-01-06 12:33 · Score: 1

That if you like to delete files, EXT2 or 3 is the way to go!

Also, studies show that Reiser v4 and v3 were obviously swapped, because normally subsequent versions are supposed to IMPROVE performance.

Speed is the most unimportant aspect... by Anonymous Coward · 2006-01-06 14:13 · Score: 1, Informative

Speed is the most unimportant aspect you can think of...

First off:
People pissing and moaning that it's an old 500MHz machine should realise that in the real world your CPU is very rarely completely dedicated to managing files.

That is, when you're running other processes it's likely that you'd only have '500MHz' worth of performance left for a single process's file system access after the kernel is finished scheduling all the other processes and threads.

So stop bitching. It's as good as any benchmark.

Secondly if you value your data at all you should be running Ext3.

It's not so much that XFS sucks or JFS sucks.. it's that YOUR HARDWARE SUCKS.

Yes. That PC sitting in front of you with the nice big SATA drive and nforce chipsets is a hunk of shit. All PCs are like that and it's a reality you have to live with with PC-class hardware. Even on the server.

It may be fast. But fast ain't everything.

XFS and JFS are designed for a different class of hardware.

These are machines that are built specifically for a task, with the operating system modified/designed to suit that specific hardware and that specific task. These have nice big hard drive caches with battery backups (just for the hard drive's cache), and they have big capacitors in the power supplies with all sorts of redundant ways to monitor different aspects of hardware. They are designed to be used with a nice UPS and redundant power supplies.

If the power goes out to the machine and the UPS fails, there is enough time on the machine to abort processes and make sure that the file system and data on the system are in a consistent state in the second it takes for the hardware to finally fail. And the hardware fails in specific orders to avoid data corruption.

This shit is expensive. This is the 'high end unix iron' stuff that people talk about. This isn't your Dell dual CPU crapbox with Windows 2003 thrown on it. The only way to get close to this level of reliability with PC hardware is to use Linux clustering with multiple redundant redundancies and failover network file system support and such... and even then there are limitations.

This is what XFS and JFS are designed for. Even low-end versions of AIX and IRIX-using hardware had special features to assist the file system in protecting itself.

Ext3 on the other hand is designed specifically to work with your crappy PC hardware. That's its purpose. That's what it is designed for, and that is why 'enterprise' style Linux distros like Red Hat use it almost exclusively; that's why they helped create it.

When your PC hardware loses power it craps out randomly. Your CPU could still send data to your hard drive while the delicate memory is busy flipping out and sending random garbage down all the channels on its bus. There is no intelligent way for the OS to handle power failures and hardware failures because the hardware has no intelligent way to handle this stuff.

That's why Ext3 still has fsck. XFS, for instance, has the ability to journal your directory system.. but not data. Ever noticed that? Ext3 supports multiple journalling features including full data journalling.
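
The journalling mode is picked with the ext3 `data=` mount option (`journal`, `ordered` or `writeback`). As a rough sketch, something like the following could report which mode each ext3 mount lists, given text in the `/proc/mounts` layout. The function name `ext3_data_modes` and the device names in the sample are made up for illustration, and a mount that does not list `data=` explicitly is reported as `None` rather than guessed.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= option, or None if not listed."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes


if __name__ == "__main__":
    # Hypothetical sample; on a real box, read the text from /proc/mounts.
    sample = (
        "/dev/hda1 / ext3 rw,data=ordered 0 0\n"
        "/dev/hda2 /home ext3 rw,data=journal 0 0\n"
        "/dev/hda3 /scratch xfs rw 0 0\n"
    )
    print(ext3_data_modes(sample))
```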

That's also why ext3 is tied into linux clustering with things like Lustre and GFS.

That's why you, in my opinion, should use Ext3.

It may not be as cool as ReiserFS, but if your data matters then use Ext3 AND backups.

It looks like all file ops have gotten 30% slower? by Burz · 2006-01-06 16:17 · Score: 1

Can that be right?

It would be a shame if 2.6 came with that kind of a performance penalty.

Also, I don't think I can consider a benchmark on such an old system to be representative. The relationship in timing between the filesystem and the spinning platters themselves is bound to yield quite different results under a 2x or 3x faster CPU.

ext3 wins the file removal race by MadBrassMan · 2006-01-06 17:31 · Score: 1

If you are ready to throw your computer out the window, I recommend trying the slightly less destructive # rm -rf / which can be just as satisfying. Sometimes you just need to start fresh, and ext3 can get you there in seconds.

What about filesystems > 4 TB? by djoslin · 2006-01-06 17:59 · Score: 1

We store images. Lots of images. Customer images. We recently bumped into the EXT3 4TB filesystem limit.

Redhat will only support GFS in addition to EXT3 and EXT2. I know we can carve up a >4tb volume into several smaller filesystems, but the nature of our storage architecture is that larger volumes are more efficient. I have seen very little mention of GFS and we strongly desire to maintain vendor (RH) support on these production systems.

Comments?

Re:Very interesting article... NOT! by NotBorg · 2006-01-06 18:10 · Score: 1

I think your post would have been more "insightful" in the context of the discussion if you had actually posted some scores from different file systems. Basically all I gathered was that you ran some benchmarks and, on your machine, they came out different from the article's.

--
I want this account deleted.

Re:not Linus but anyway by Anonymous Coward · 2006-01-06 21:52 · Score: 0

> Question for Linus: Why are you not letting the Reiser4 drivers into the mainline kernel?

This has been flamed to a crisp several times over on LKML. My guess is that it's because they let Hans and his ego do the negotiation. Flame ensues, technical issues don't get solved, no merge.

Re:not Linus but anyway by chris_sawtell · 2006-01-06 22:54 · Score: 1

O I C. Pity. Shame when personalities get in the way.
Not that I need to care 'cos I run the -mm kernel tree, which seems to go pretty well for me.

Re:What about filesystems > 4 TB? by sad_ · 2006-01-07 03:36 · Score: 1

hmm, don't know. it always sounds bad to me to have such large filesystems. are all files in there without subdirectories? perhaps you can move the subdirs off to other volumes. if you care to shell out large amounts of cash for support etc. you could go with veritas fs/vm.
in that case the OS will be supported by RH and your fs by veritas.

--
On a long enough timeline, the survival rate for everyone drops to zero.

Fragmentation is ignored. by omry_y · 2006-01-07 08:06 · Score: 1

Note that a real file system's life cycle is not:
format, create some files, delete some files, create some directories, delete some directories, format.
In the real world, a file system can last for years, and file fragmentation can have a very serious effect on performance.
I think a real test would be to create a random sequence of file-related actions (create dirs/files, delete dirs/files, move, rename), mix it very well, and run it on several file systems.
This will create fragmentation, and will show how well each fs handles it.
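
A minimal sketch of that kind of mixed workload, assuming a seeded random generator so the identical sequence can be replayed on each file system. The name `run_workload`, the operation mix and the file sizes are arbitrary choices for illustration, not anything the article used.

```python
import os
import random
import shutil
import tempfile


def run_workload(root, steps, seed=0):
    """Apply a seeded random mix of file operations under `root`.

    Returns a dict counting how many times each operation actually ran.
    """
    rng = random.Random(seed)
    dirs = [root]   # directories that currently exist
    files = []      # files that currently exist
    counts = {"mkdir": 0, "create": 0, "delete": 0, "rename": 0, "rmdir": 0}

    for step in range(steps):
        op = rng.choice(list(counts))
        if op == "mkdir":
            path = os.path.join(rng.choice(dirs), f"dir{step}")
            os.mkdir(path)
            dirs.append(path)
        elif op == "create":
            path = os.path.join(rng.choice(dirs), f"file{step}")
            with open(path, "wb") as fh:
                fh.write(os.urandom(rng.randint(1, 64 * 1024)))
            files.append(path)
        elif op == "delete" and files:
            os.remove(files.pop(rng.randrange(len(files))))
        elif op == "rename" and files:
            i = rng.randrange(len(files))
            new = os.path.join(rng.choice(dirs), f"moved{step}")
            os.rename(files[i], new)
            files[i] = new
        elif op == "rmdir" and len(dirs) > 1:
            victim = dirs.pop(rng.randrange(1, len(dirs)))
            shutil.rmtree(victim)
            # Forget everything that lived under the removed directory.
            prefix = victim + os.sep
            dirs = [d for d in dirs if not d.startswith(prefix)]
            files = [f for f in files if not f.startswith(prefix)]
        else:
            continue  # operation not possible right now; skip it
        counts[op] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(run_workload(root, 1000, seed=42))
```

Wrapping the call in a timer and running it with the same seed on a scratch partition of each filesystem would give comparable numbers. It would take a far longer run than this to age a filesystem the way years of real use do.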

--
Omry.

Hans Reiser to Speak at SCALE 4x by irabinovitch · 2006-01-08 17:31 · Score: 1

Hans Reiser will be giving a presentation on the Reiser4 File System at SCALE 4x.

Reiser4 article by Anonymous Coward · 2006-01-13 06:26 · Score: 0

The so-called review was interesting in that it is good from time to time to raise one's head and make sure you're not living in a dream world or on outdated info or presumptions.

This article made me re-evaluate my use of R4.

Beyond that it was a serious waste of time. So-called reviews that take on the air of scientific evaluation but don't stand up to five minutes of critical analysis abound on the internet.

What is surprising is that this even got linked by slashdot.

dec(slashdot.cred,5)

Think before you link. ;)

255 comments