Ask Slashdot: NFS on Free OSes Substandard?
Yet another fearless member of Clan
Anonymous Coward wrote in with this intriguing
issue: "I am trying to convince
my company to move off of Digital Unix and
Sun OS to either FreeBSD or Linux as our
primary server platform. The main argument
I am getting is the NFS client performance
on these free OSes is much worse than that
of Solaris or DU. Can anyone give any
recent data on relative NFS performance on
these platforms?"
I just ran a series of benchmarks/simulations for NFS on FreeBSD 3.0 and Linux 2.2.1. FreeBSD NFS performance was very poor compared to Linux; this result was counter-intuitive because most of the other tests I ran for disk and network performance showed FreeBSD slightly besting Linux.
Linux showed a 14% NFS performance penalty over local disk access, and FreeBSD showed a 55% NFS performance penalty.
If anyone cares about the test specs, send me email at mka@ieee.org
NFSv3 support was finished for Linux 2.2 last week. Check for the patches in the kernel-list archives.
http://www.linuxhq.com/
NFS performance has increased /greatly/ with the new kernel-based NFS implementation.
It is unfortunately not NFSv3 yet, though a fair amount of the features have been implemented. And there are still a few lingering bugs from what I've heard, though I personally have yet to run into a problem with it.
The raw performance I've seen is around 40% faster on a single client, which is corroborated by Red Hat's experience. They also claim over 400% improvement in multi-client environments.
Probably the most important consideration is WHAT ARE YOUR NEEDS?
If you have extremely large traffic requirements (read: a large number of clients or large files) or if you absolutely need NFSv3 compliance (for 64-bit file sizes and offsets, etc), then don't use Linux or *BSD.
If, on the other hand, you are handling a couple dozen clients with low-to-middling NFS requirements, save yourself a boatload of money and use a Redhat server.
But even if the free Unices don't make sense for your NFS servers, by all means recommend them for other tasks - mail server, web server, database server, router, etc, etc.
Some weeks ago there was a huge thread on freebsd-hackers about NFS and the implementation in FreeBSD which has "slight" problems currently. JKH estimated the time necessary to fix these problems in months. It was even suggested to fund one or two developers to take care of the NFS "thingies."
Then, there is Linux. The 2.0 kernel suffers from the userland-only nfsd implementation, which has a real impact on speed, especially on fat pipes (>100MBit/sec). The interaction between userland and kernel demands many context switches and a lot of data copying between the two areas, decreasing the overall speed and increasing the server load.
Linux 2.1/2.2 uses a kernel nfs implementation which is currently under heavy development and as such overall reliability cannot be foreseen. It still suffers from problems with using bigger read/write blocksizes. But HJLu (I think he is working on that) and the other contributors are doing a great job, so this area will improve over time.
If you want to run a big network with many clients (300+), you should currently go with a commercial OS such as Solaris (I don't know anything about HP-UX's NFS performance/reliability) and run it on vendor hardware (yes, I'm conservative). At the current stage of the open source NFS implementations, suggesting open source software for running a huge network would only discredit open source and yourself as an open source advocate. You can easily go with Linux or FreeBSD if you want to build a rather small network (I have a client with ~70 networked stations depending on a FreeBSD 3.0 server) and don't need a really scalable solution.
We have a small development network of about 8 client machines (linux & sparc boxes) and a linux box for an nfs server. These are some of the times I have collected while I was following the optimization section of the linux NFS howto:
One of the sparc clients has a line like this in its /etc/vfstab file:
linuxServer:/opt/stuff - /opt/stuff nfs - yes rsize=1024,wsize=1024,rw
A linux client has the following line in its /etc/fstab file:
linuxServer:/opt/stuff /opt/stuff nfs rsize=4096,wsize=4096,hard,intr,suid 0 0
I use the following commands to write and read a file, respectively (see the NFS howto):
(1) time dd if=/dev/zero of=/opt/stuff/testfile bs=16k count=4096
(2) time dd if=/opt/stuff/testfile of=/dev/null bs=16k
This is a typical (I say typical because I'm substituting the average times)
output from (1) on the sparc client:
4096+0 records in
4096+0 records out
real 0m20.90s
user 0m0.24s
sys 0m2.49s
And for (2):
4096+0 records in
4096+0 records out
real 0m0.69s
user 0m0.04s
sys 0m0.62s
For the linux client, (1):
4096+0 records in
4096+0 records out
0.01user 2.05system 0:21.00elapsed 9%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (89major+15minor)pagefaults 0swaps
and (2):
4096+0 records in
4096+0 records out
0.00user 1.49system 0:36.13elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (96major+15minor)pagefaults 0swaps
For (2), the sparc is significantly faster.
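To put those timings on a common scale, here is a small sketch that converts each run into a transfer rate. It assumes dd's bs=16k means 16 x 1024 bytes, so both commands move 64 MiB; the helper name `throughput_mib_s` is made up for this example.

```python
# Convert the dd timings quoted above into MiB/s.
BLOCK_SIZE = 16 * 1024   # bs=16k (assumed to mean 16 * 1024 bytes)
COUNT = 4096             # count=4096, so 64 MiB in total


def throughput_mib_s(elapsed_seconds, block_size=BLOCK_SIZE, count=COUNT):
    """Return the transfer rate in MiB/s for one dd run."""
    total_mib = block_size * count / (1024 * 1024)
    return total_mib / elapsed_seconds


# Elapsed ("real") times copied from the runs above, in seconds.
runs = {
    "sparc write (1)": 20.90,
    "sparc read  (2)": 0.69,
    "linux write (1)": 21.00,
    "linux read  (2)": 36.13,
}

for name, seconds in runs.items():
    print(f"{name}: {throughput_mib_s(seconds):6.2f} MiB/s")
```

The sparc read works out to roughly 93 MiB/s, which probably means the client's cache answered rather than the server, so that comparison deserves some caution.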
If I change the rsize and wsize of the sparc client to 4096 each, the sparc
client will crash nfsd on the linux server.
We are just a small group of developers who happen to be linux enthusiasts, so we configured this setup ourselves. In short, we don't claim to be masters at configuring unix networks.
The sparc client is an Ultra 5, the linux client is a 450MHz HP Vectra (p.o.s.), and the linux server is a 333MHz Dell Dimension.
That's kinda true (at least a few years ago) because Solaris NFS wanted 8k packets, while Linux does 1k by default. No hard numbers for a performance change, but increasing rsize and wsize to 8192 should help out a lot.
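A rough way to see why the block size matters: each read or write request carries at most rsize/wsize bytes, so a smaller size means more round trips for the same file. The sketch below only counts requests for a hypothetical 64 MiB transfer (the file size and the function name are assumptions for illustration); it says nothing about the latency of any particular network or server.

```python
def request_count(file_bytes, block_bytes):
    """Number of read/write requests needed to move file_bytes
    when each request carries at most block_bytes."""
    return (file_bytes + block_bytes - 1) // block_bytes  # integer ceiling


FILE_BYTES = 64 * 1024 * 1024  # hypothetical 64 MiB transfer

for size in (1024, 4096, 8192):
    print(f"rsize/wsize={size}: {request_count(FILE_BYTES, size)} requests")
```

Going from 1k to 8k cuts the number of requests by a factor of eight for the same data, which is where most of the expected gain would come from.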
Posted by The King of the Potato People:
:) It's okay, that's the standard 'I'm new here, I don't know how anything works, let's use Linux!!' opinion that most newbies who don't know Solaris have. Maybe I've just seen this too many times.. I hate it when people get hired and want to change OPERATING SYSTEMS on the servers because they have a personal preference..
If it works, why do you want to change it? Surely your loyalty to Linux/FreeBSD isn't clouding your judgement now, is it?
Why reinvent the wheel?
ash
Doesn't the hardware itself play a large role in the NFS server? I can't see an NFS server needing massive CPU power, but I can draw some lines to memory I/O bandwidth, SCSI systems, and network interface devices. When any hardware component is "weak" it could potentially affect the performance by some percentage, right?
So, comparing a Sparc w/ Solaris to an x86 w/ Linux/FreeBSD just makes me think you're actually comparing a lot more than just OSes, and I would want to know the detailed specs on the systems being compared.
Or can someone somehow prove to me that the software is the over-riding influencing factor, and the hardware doesn't matter?
I have to agree with others who say the NFS implementation in Solaris is the one that others should be measured against. Rock solid, in my many years of adminning Solaris and SunOS.
Recent versions of HP-UX seem to have borrowed a lot of Sun technology (an update to 10.20 gave NFSv3, Sun's autofs, and ONC+ in one fell swoop, and 11.00 incorporates all these also). I work at an all HP shop right now, and I have to say it works OK, though we don't make heavy use of NFS (no shared home dirs, SW builds on NFS filesystems, etc.), just some light data sharing.
As for FreeBSD, I only have a 3.0-CURRENT box current as of Jan. or so, and a 2.2.8 box, so I don't have firsthand experience, but reading the mailing lists, significant progress has been made on general NFS stability and functionality (e.g. NFS over TCP). I don't have a Linux box, so I can't comment there.
Mike.
Of course, people are working on fixing this...
Host your own websites, anywhere!
i have a P5-100 and an ultra 10 sitting next to each other on my desk at work. i use NFS to cart crap back and forth between them and they're on their own ports on a 100base-T switch. with 2.0.x and 2.2.x and the latest userland nfsd (whatever the latest RH 5.2 update was) i got 2 MB/sec pretty consistently going both ways. i didn't tweak read and write block sizes; they're whatever the out-of-the-box default was. now the P5-100 is running RH 6.0, kernel 2.2.5, and knfsd 1.2.2 (also not tweaked) and i get 3 MB/sec going both ways which is faster than i usually get via ftp. if i had a faster cpu and disk on the linux end it would probably be even better. it also seems like knfsd is a lot more responsive for automounting and grabbing lots of small files, but i don't have any numbers to back that up. as a comparison, i get 5-5.5 MB/sec when moving files between two ultra 10's running solaris 2.6 (seagate cheetah drives on both ends).
linux's forte really is as a desktop unix (my friggin' P5-100 is _so_ much more responsive under X on the console than the ultra it isn't funny) and i think even the performance hit of the userland nfsd is outweighed by the performance gains in other respects (mostly X and file caching). it's also true that linux comes with a lot more software prepackaged whereas with most commercial unices you have to spend a week digging up and compiling such basic stuff as perl, python, or even bash.
until knfsd shakes out a bit more i probably wouldn't want to use linux as a really hardcore NFS server (multiple hundreds of clients, heavy load, etc.), but in my experience it's fine in more modest environments and as an NFS client with HP or Sun NFS servers.
tim
hiding in shadows / i hear you coming closer / you will explode soon -- a quake haiku
It is not "just FUD" ... I'd like to be able
to post performance numbers, say from a specific
number of operations between a Netapp filer and
a sun, lintel, linux-alpha, dec-alpha, and winnt
box, but I don't have the extra .25 million for
the netapp... But please don't dismiss this
report as FUD... We love linux to pieces but the
NFS performance has been a showstopper for us.
Tuning it isn't the solution; NFS writes are slow.
No fud here -- it's a certainty, not uncertainty,
no doubt about it at all.
-fb Everything not expressly forbidden is now mandatory.
(damned enter key, excuse me...)
It seems like most of the responses lean
toward "not using NFS" or "using something
else (CODA, AFS)" but apparently we are still
lacking in the NFS department.
Unfortunately for linux in at least one place
I know, this is terrible. What if your shop wants
to use Network Appliance for your storage solution?
All of a sudden you have an argument against linux
based strictly on technical merit -- NFS performance.
Not good. (When you get a Netapp filer talking coda,
call me!) Even dyed-in-the-wool linux advocates in my
company are forced to bite the bullet and use other
platforms because linux isn't up to the task for NFS
with lots of writes.
NFS itself is not to blame, after all Digital Unix
performs well in this context. Even a commercial
NFS implementation would be okay for a solution.
Poor NFS performance has been a problem with linux
for too many years now. I keep waiting for the
problems to magically go away, but I guess they
aren't going to. I've studied filesystems but still don't think I can fix this...
-fb Everything not expressly forbidden is now mandatory.
I have had more problems with Linux NFS than anything else on the system. Sometimes it just fails to work, both client and server side, for no apparent reason. All the correct rpc.* daemons are running, and the exports file is correct. Other times the same setup works properly.
Also try mounting a Linux export from a Sun system and watch the Sun complain.
This is with the userspace NFS server, I haven't tried the new kernelspace one yet.
Why is it necessary to migrate from Sun and Digital anyway?
Of all the comments I've ever posted, this is definately one of them
Linux NFS is just not in good shape, IMHO. I would recommend against using it in any critical situation... but then again, I'd recommend against using NFS in general... =)
Since I don't yet have access to Transarc's AFS client for Linux 2.2, I am using NFS to mount AFS off another machine that is capable of AFS. Working out authentication was a pain (luckily somebody did most of the work for me) and it still isn't trustworthy. I log in remotely to machines that are really on AFS if I have to do anything more than just a quick edit...
but among the odd things i've noticed: files I don't have permission to just don't show up. This is really annoying when, say, I accidentally open a file with a stupid mode and it suddenly disappears (spent a good while debugging programs tonight before i thought to log into a remote machine and voila... there was my file, mode 000). I don't know for sure, though, if this is Linux's fault or NFS in general...
anyway, to sum up, I'd say stick with something else for NFS for now (I like Slowlaris... it was my first UNIX)... Linux still has a ways to go, methinks...
You don't need very complex test setups to measure those differences--simply read and write a bunch of big files with "dd". If you want to do it simultaneously from several clients, there are some simple, free tools that let you execute the same command on multiple systems in parallel.
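As a rough illustration of that kind of test, the sketch below does in Python what the dd runs elsewhere in this thread do: write a file of zeros in fixed-size blocks, read it back, and report the rates. The function names are made up for this example, and the path you pass in (say, a file on your NFS mount) is up to you. Unlike plain dd, the write test calls fsync before stopping the clock, which is a choice of this sketch and will make write numbers look slower than a bare dd.

```python
import os
import time


def write_test(path, block_size=16 * 1024, count=4096):
    """Write count blocks of zeros to path; return (bytes, seconds)."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # flush to stable storage before stopping the clock
    return block_size * count, time.perf_counter() - start


def read_test(path, block_size=16 * 1024):
    """Read path to the end in block_size chunks; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def report(label, nbytes, seconds):
    """Print one result line in MiB/s."""
    print(f"{label}: {nbytes / (1024 * 1024) / seconds:.2f} MiB/s")
```

Call `write_test` and then `read_test` on a file under the NFS mount and pass each result to `report`. A read straight after a write may be served from the client's cache, so use a file bigger than RAM or remount first if you care about the read number.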
This is documented right at the top of the nfs man page, and makes a world of difference. My group at work has a very similar situation to yours (most shares served by Digital Unix but adding more Linux boxes every day), and NFS was definitely a problem until we fixed this.
Div.
--
But my grandest creation, as history will tell,
Was Firefrorefiddle, the Fiend of the Fell.
Linux of 1.2 and prior era had lousy NFS.
:-)
2.0 Linux had a reasonable, but not brilliant client - certainly slower than the better commercial unices. It didn't do locking (at all), but was pretty stable (we had some occasional problems on SPARC, but none on intel). If you didn't need locking then it worked fine (the other performance benefits of Linux outweighed the NFS degradation).
2.2 Linux is meant to have much better performance and locking is getting there... but it's currently flaky. However I haven't used it seriously yet so ignore me on post 2.0 NFS
While there are some obscure bugs in FreeBSD's implementation of NFS client under very high loads (many of which are fixed in FreeBSD-current or will be fixed by pending changes to FreeBSD-current), I believe it to have extremely good performance. The attribute cache which was introduced in FreeBSD-3.x reduces network and server load significantly.
I am biased since I work on the FreeBSD kernel and in the past have been involved with fixing and optimising the NFS client but I also use NFSv3 on a regular basis and see excellent performance with 100baseTX (I haven't measured performance for about a year but I seem to remember multiple Mb/sec write performance).
I can't comment on Linux performance since I have only tested RedHat 5.2 (which has terrible NFS performance, IMHO) and I believe that many improvements have been made in the 2.2.x kernel series.
My network was originally set up with an SGI Challenge S box as an NFS server. The client machines were a combination of PCs running Linux or Solaris, and a couple of SGI Indys. With this setup, there were few to no problems with NFS.
However, I moved a whole bunch of stuff over to a Linux server (some home directories) and I got hit with problems on the client side hard.
The Linux clients talk to the Linux NFS server fine, but clients like the Indy take a real dislike to it. Even forcing the Indy to use NFSv2 and trying static mounts over automount/autofs, any process on the Indy that tries to use NFS to the Linux server just hangs.
No matter what I've tried, I can't seem to fix the problem. I've tried the Linux kernel implementation of nfs and the userspace versions. Same problem.
My advice: stick with commercial versions of NFS for the time being, i.e. those that come with Solaris and IRIX (especially since NFS comes packaged with 6.5 :)
-Frysco!
Here are 2 benchmarks I've had to do recently.
They aren't a complete test of NFS performance
by any means but illustrate that a Network
Appliance (F720) can match local disk performance
and trounce a SUN at NFS server performance.
I would add that though I use linux as an NFS
server at home for my 4 machine network with
few problems, my empirical experience is that
it is not ready for high-load mission-critical
NFS server applications.
My opinion is that a Network Appliance's
clever write caching technology reduces the
NFSv2 write penalty dramatically and increases
linux client usability. We run 100+ fast
linux clients and 20+ Suns and I know that this
setup is best served ($$$ and performance)
by a Netapp rather than ANY UNIX solution.
This first one shows raw sustained NFS
performance by using dd to read or write a
100MByte file over a switched full duplex
100MBit ethernet between a P-II 333 (3COM 905)
and Redhat 5.2 and a NetApp720.
In summary I can achieve approximately:
33Mbits/sec NFS write performance to the network appliance and
62MBits/sec NFS read performance from the network appliance.
XXX.8x8.com 28: time dd if=/dev/zero of=/mnt/ianb/testfile bs=16k count=6250
6250+0 records in
6250+0 records out
0.050u 3.740s 0:25.24 15.0% 0+0k 0+0io 96pf+0w
XXX.8x8.com 29: time dd if=/dev/zero of=/mnt/ianb/testfile bs=16k count=6250
6250+0 records in
6250+0 records out
0.040u 3.840s 0:23.13 16.7% 0+0k 0+0io 95pf+0w
XXX.8x8.com 30: time dd if=/mnt/ianb/testfile of=/dev/null bs=16k
6250+0 records in
6250+0 records out
0.040u 2.690s 0:14.60 18.6% 0+0k 0+0io 99pf+0w
XXX.8x8.com 31: time dd if=/mnt/ianb/testfile of=/dev/null bs=16k
6250+0 records in
6250+0 records out
0.030u 3.150s 0:12.58 25.2% 0+0k 0+0io 114pf+0w
XXX.8x8.com 32: ls -al /mnt/ianb/testfile
-rw-r--r-- 1 ianb users 102400000 Mar 11 10:29 /mnt/ianb/testfile
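For anyone who wants to check the summary figures against the raw output, here is a small sketch that pulls the elapsed time out of each csh-style `time` line and turns it into Mbit/s. It assumes the third field is the wall-clock time in [h:]m:ss form and that a megabit is 10**6 bits; the function names are made up for this example.

```python
def elapsed_seconds(time_line):
    """Wall-clock seconds from a csh-style `time` line such as
    '0.050u 3.740s 0:25.24 15.0% ...' (third field, [h:]m:ss)."""
    field = time_line.split()[2]
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def mbit_per_s(nbytes, seconds):
    """Transfer rate in megabits (10**6 bits) per second."""
    return nbytes * 8 / seconds / 1e6


FILE_BYTES = 102400000  # file size reported by ls above

# `time` lines copied from the four dd runs above.
runs = [
    ("write", "0.050u 3.740s 0:25.24 15.0% 0+0k 0+0io 96pf+0w"),
    ("write", "0.040u 3.840s 0:23.13 16.7% 0+0k 0+0io 95pf+0w"),
    ("read", "0.040u 2.690s 0:14.60 18.6% 0+0k 0+0io 99pf+0w"),
    ("read", "0.030u 3.150s 0:12.58 25.2% 0+0k 0+0io 114pf+0w"),
]

for kind, line in runs:
    rate = mbit_per_s(FILE_BYTES, elapsed_seconds(line))
    print(f"{kind}: {rate:.1f} Mbit/s")
```

The individual runs come out at about 32 to 35 Mbit/s for writes and 56 to 65 Mbit/s for reads, which brackets the approximate 33 and 62 quoted in the summary.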
This 2nd benchmark is a GCC compilation of
8000 lines of C in many files. It illustrates
both the dramatic differences between local
disk/netappNFS/solarisNFS and the benefits
of using gcc and make options to improve
efficiency by running parallel compiles
(which use idle CPU that is lost whilst the
OS waits for the remote RPCs to complete in an
unparallelized compile) and by using
interprocess communication instead of files in
/tmp to communicate between different
compile stages. Conclusion: using a 100Mbit
network, compile performance approaches local
disk performance.
NADS BUILD (GCC) ON SUN NFS DISK (10Mb/s net)
6.480u 1.460s 0:38.36 20.6% 0+0k 0+0io 12596pf+0w
NADS BUILD (GCC) ON SUN NFS DISK (100Mb/s net)
6.480u 1.110s 0:29.69 25.5% 0+0k 0+0io 12596pf+0w
NADS BUILD (GCC) ON SUN NFS DISK (10Mb/s net)
WITH GCC -PIPE (NO /tmp files) AND MAKE -j 5
6.890u 1.770s 0:33.17 26.1% 0+0k 0+0io 12597pf+0w
NADS BUILD (GCC) ON SUN NFS DISK (100Mb/s net)
WITH GCC -PIPE (NO /tmp files) AND MAKE -j 5
6.660u 1.280s 0:25.22 31.4% 0+0k 0+0io 11736pf+0w
NADS BUILD (GCC) ON NETAPP NFS DISK (10Mb/s net)
6.650u 1.230s 0:17.67 44.5% 0+0k 0+0io 12596pf+0w
NADS BUILD (GCC) ON NETAPP NFS DISK (100Mb/s net)
6.490u 1.250s 0:09.35 82.7% 0+0k 0+0io 12596pf+0w
NADS BUILD (GCC) ON NETAPP NFS DISK (10Mb/s net)
WITH GCC -PIPE (NO /tmp files) AND MAKE -j 5
6.940u 1.430s 0:13.93 60.0% 0+0k 0+0io 11741pf+0w
NADS BUILD (GCC) ON NETAPP NFS DISK (100Mb/s net)
WITH GCC -PIPE (NO /tmp files) AND MAKE -j 5
6.830u 1.140s 0:08.77 90.8% 0+0k 0+0io 11736pf+0w
NADS BUILD (GCC) ON LOCAL DISK
5.730u 1.020s 0:11.31 59.6% 0+0k 0+0io 11069pf+0w
NADS BUILD (GCC) ON LOCAL DISK
WITH GCC -PIPE (NO /tmp files)
5.700u 1.040s 0:09.71 69.4% 0+0k 0+0io 10271pf+0w
NADS BUILD (GCC) ON LOCAL DISK
WITH GCC -PIPE (NO /tmp files) AND MAKE -j 5
6.000u 1.100s 0:08.19 86.6% 0+0k 0+0io 10280pf+0w
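To make the build numbers easier to compare, this sketch expresses the plain (no -pipe, no -j) NFS builds as a multiple of the plain local-disk build. The elapsed times are copied from the output above; the dictionary labels and function name are made up for this example, and each figure is a single run, so small differences should not be over-read.

```python
# Elapsed time of the plain build on local disk, in seconds (0:11.31 above).
LOCAL_PLAIN = 11.31

# Elapsed times of the plain NFS builds, in seconds, from the runs above.
builds = {
    "Sun NFS, 10 Mbit": 38.36,
    "Sun NFS, 100 Mbit": 29.69,
    "NetApp NFS, 10 Mbit": 17.67,
    "NetApp NFS, 100 Mbit": 9.35,
}


def relative_to_local(seconds, baseline=LOCAL_PLAIN):
    """Build time as a multiple of the local-disk build time."""
    return seconds / baseline


for name, seconds in builds.items():
    print(f"{name}: {relative_to_local(seconds):.2f}x local")
```

On these numbers the Sun-served builds take roughly 2.6 to 3.4 times as long as local disk, while the NetApp over 100 Mbit comes in slightly under the local time.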
Are these performance penalty measurements derived from NFS read operations, write operations, or a mixture of the two? If NFS writes were involved in the measurement, there is a reason for the relatively poor measured performance of FreeBSD.
According to RFC1094, NFSv2 servers must commit data to nonvolatile storage before acknowledging that a block write request has completed successfully. Given the relatively small size of NFS blocks (8K or less for NFSv2), forcing a separate write/sync/acknowledge cycle for each block that is written can result in relatively poor NFS write performance.
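A back-of-the-envelope sketch of why that hurts: if a client has only one write outstanding at a time and every block must wait for its own commit on the server, throughput cannot exceed block size divided by commit time. The 10 ms commit time below is purely an assumed figure for illustration, not a measurement, and real clients that keep several writes in flight will do better than this bound.

```python
def sync_write_ceiling(block_bytes, commit_seconds):
    """Upper bound on bytes/s when each block waits for its own commit
    and only one write is outstanding at a time."""
    return block_bytes / commit_seconds


COMMIT_SECONDS = 0.010  # assumed per-block commit time (hypothetical)

for block in (1024, 4096, 8192):
    rate = sync_write_ceiling(block, COMMIT_SECONDS)
    print(f"{block:5d}-byte blocks: at most {rate / 1024:.0f} KiB/s")
```

Under that assumption even 8K blocks top out around 800 KiB/s, which is why skipping or batching the per-block sync makes such a visible difference.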
There are several solutions to this problem. Linux (and some commercial UNIX variants) acknowledge NFS write operations without forcing a disk write and sync for each individual block. This is obviously somewhat more dangerous, but yields much better NFS write performance. Some vendors offer hardware add-ins (such as the PrestoServ board) that provide nonvolatile storage that can be written to faster than a disk.
NFS version 3 has much better support for asynchronous writes without resorting to such hacks. Hopefully, Linux NFS3 will be usable soon.
If you are willing to live with the risk of data not necessarily being written to disk on the server when clients think that it has been, you can force FreeBSD to acknowledge NFS writes asynchronously using the command
/sbin/sysctl -w vfs.nfs.async=1
Retrying the benchmark after configuring FreeBSD to act more like Linux may yield different results.