High Performance Network Applications

Sigh. Another Flawed "Test" by Anonymous Coward · 2001-06-15 02:46 · Score: 2

The flaw here was that the tests relied on 'simple C++ programs' to 'evenly' benchmark the different OSs. The problem is, in the real world, this is not how serious large-scale web applications are written and the sorts of results that this study comes up with are effectively meaningless. Show me a transaction server (or object broker). Show me how the systems scale with thousands of simultaneous users. Show me web performance based on code that people are actually likely to write in real life, not the TCP/IP equivalent of "hello, world" and you may have something that may be of interest outside of the context of an assignment for an undergraduate CS course in networking.

Real world please by Anonymous Coward · 2001-06-15 02:52 · Score: 2

I'm just sick and tired of these so-called "studies" which proclaim that they are, once and for all, going to end some religious battle. These studies do nothing for professionals or the industry, so why do people still bother?

As any professional will tell you, "it depends". Performance always depends on your needs, capabilities, money, skills, software, and hardware. Someone claiming that there is a simple answer by running some simple tests is just trying to either (1) sell consulting services, or (2) sell advertising space.

And nothing else.

So someone, tell me, please oh please, why I should pay attention to salesmen who claim to hold "answers". And tell me which CIOs really bite at these numbers. This is just for hit generation. Page views. These are not for me. They are not for the community. They are not for making good decisions.

Are they real experts? by Anonymous Coward · 2001-06-15 04:52 · Score: 4

They say:

> At Lyris Technologies, we write high-performance, cross-platform,
> email-based server applications. Better application performance is
> a competitive advantage, so we spend a great deal of time tuning all
> aspects of an application's performance profile (software, hardware,
> and operating system). Our customers frequently ask us which operating
> system is best for running our software. Or, if they have already chosen
> an OS, they ask how to make their system run our applications faster.
> Additionally, we run a hosting (outsourcing) division and want to reduce
> our hardware cost while providing the best performance for our hosting
> customers.

What crap! They're claiming to be experts! Ha!
They just don't know how to tune Solaris or FreeBSD properly.
The results would be completely different if they had tuned them well.

Solaris Tuning Guide.

1) Apply latest recommended patches from http://sunsolve.sun.com
2) Add the following to the end of /etc/system:

* Raise TCP connection hash table size
set tcp:tcp_conn_hash_size=262144
* Increase various kernel buffers
set maxusers=2048
* Set hard limit on file descriptors
set rlim_fd_max=1024
* Set soft limit on file descriptors
set rlim_fd_cur=1024
* Increase directory name lookup cache
set ncsize=100000
* Should be the same as setting above
set ufs_ninode=100000
* Enable priority paging
set priority_paging=1

(These settings are based on information taken from:
http://docs.iplanet.com/docs/manuals/messaging/nms415/patch1/TuningGuide.html )

3) The following should be at the bottom of /etc/init.d/inetinit:

# TCP stack tuning
# default is 7200000
ndd -set /dev/tcp tcp_keepalive_interval 30000
# default is 240000
# change to "tcp_close_wait_interval" on Solaris 2.6
ndd -set /dev/tcp tcp_time_wait_interval 15000
# default is 128
ndd -set /dev/tcp tcp_conn_req_max_q 1024
# default is 1024
ndd -set /dev/tcp tcp_conn_req_max_q0 1024
# default is 8192
ndd -set /dev/tcp tcp_xmit_hiwat 32768
# default is 8192
ndd -set /dev/tcp tcp_recv_hiwat 32768

4) Speed up filesystem access under Solaris 2.7 and later.
Add logging to filesystem mount options in /etc/vfstab, like this:

/dev/dsk/c0t1d0s7 /dev/rdsk/c0t1d0s7 /opt ufs 2 yes logging,noatime

I have added noatime - this is another setting that might help
on a very busy filesystem, though not as much as logging.

FreeBSD Tuning Guide

Recompile the kernel with an increased MAXUSERS (256 is a good
starting point) and NMBCLUSTERS (I use 10000; check netstat -m
under load to find a number that is good for you).
You might want to play with "options HZ=1000".

Add this to /etc/sysctl.conf:

kern.maxfiles=65536
kern.maxfilesperproc=32768
net.inet.tcp.delayed_ack=0
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535
net.inet.tcp.sendspace=65535
net.inet.tcp.recvspace=65535

Turn on softupdates on all filesystems
using tunefs -n enable (noatime might help as well).

Vadim Mikhailov

Re:This benchmark is baloney by dentin · 2001-06-15 03:34 · Score: 2

So you're saying that if you want good performance from Linux, you just code it normally - but if you want good performance from windows, you have to use all the platform dependent nonportable operating system extensions.

It might not be a valid benchmark, but perhaps there is a point to be learned from it after all...

-dentin

--
Alter Aeon Multiclass MUD - http://www.alteraeon.com

Re:This benchmark is baloney by Zapman · 2001-06-15 03:18 · Score: 2

You said:

> 3. They only tuned the Linux, FreeBSD and Solaris setups -- they should have tuned Win2k server as well.

Well, that's not a fair assertion. They made exactly one modification to each Unix kernel: they changed the number of file handles, setting each to 65536. IIRC, Windows 2000 doesn't need this tweak due to its internal way of record keeping.
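For anyone who wants to see the per-process side of that file handle limit, here is a minimal sketch using Python's Unix-only `resource` module. The function name `raise_fd_soft_limit` and the 65536 default are only illustrative, and this is not the kernel-wide setting the article's testers changed; it just shows a process raising its own soft limit as far as the hard limit allows.

```python
import resource


def raise_fd_soft_limit(wanted=65536):
    """Try to raise this process's soft file descriptor limit (hypothetical helper)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= wanted:
        return soft, hard  # already high enough, nothing to do
    # An unprivileged process may raise its soft limit only up to the hard limit.
    target = wanted if hard == unlimited else min(wanted, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        pass  # some platforms cap the value further; keep the old limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Going beyond the hard limit needs root or a kernel-level change (the /etc/system and sysctl settings elsewhere in this thread), which is the part that differs per OS.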

The greatest problem with benchmarks is deciding what tweaking to do. Out-of-the-box tests fail because "Any competent admin will use tweak foo", and tweaked tests fail because "tweak foo on os1 is vastly more potent than tweak bar on os3" (think of the first Mindcraft test).

--
Zapman

Re:No one will ever be happy by edhall · 2001-06-15 06:39 · Score: 3

You're absolutely right. Their "benchmark" is perfectly valid, for their product running on a naively tuned operating system. But only a neophyte would put an out-of-the-box OS -- whether Linux, Solaris, Windows, or BSD -- into production as a high-performance network server. All the complaining boils down to two things:

1. The article asserts that performance of a single bulk-email program is a valid way to rank the four OSes, and
2. it ignores the system tuning that a competent system administrator would have performed.

The FreeBSD folks are especially upset because the article states that the OS was logging resource failures but the testers still didn't perform any tuning. That's an amazing level of incompetence to display in a magazine which is supposed to inform system administrators.

Now do you see what all the noise is about?

-Ed

Re:This benchmark is baloney by edhall · 2001-06-15 03:12 · Score: 5

Agreed -- it's been a long time since I've seen a "benchmark" as poor as this one. But I don't think Windows was treated any more poorly than the other OSes. It wasn't a fair test of any of them.

The "tuning" for the Unix systems consisted in bumping up the maximum number of file descriptors. That's it. The FreeBSD system in particular was left completely mistuned and clearly running out of socket resources -- they report that it was logging errors but seem entirely ignorant of what those errors were (beyond their being load-related) and how to correct them.

Polling is hardly the best system interface for multiplexing TCP connections on either Windows or FreeBSD. As you mention, completion ports are best for Windows. Kqueue is best for FreeBSD. It just happens that polling is used in the crappy commercial SPAM program they "benchmarked". (All the OSes support scatter/gather, BTW, so you can't claim Windows was treated unfairly by its omission.)

None of the systems were tested in a way that shows their actual capabilities. The article is just a thinly disguised commercial for a (barely-)cross-platform "bulk email" product.

-Ed

StreamModule architecture best... by Omnifarious · 2001-06-15 04:01 · Score: 2

The architecture they say performs the fastest, One-thread-many-tasks (asynchronous), is exactly the one encouraged and supported by my StreamModule system. I knew that things worked out this way, but I'm quite surprised to find such clear agreement by a third party. This idea doesn't really seem to crop up in many places.

--
Need a Python, C++, Unix, Linux develop

Re:This benchmark is baloney by Omnifarious · 2001-06-15 04:29 · Score: 2

> Nice! So in other words, they used straight BSD sockets for their implementation - which is NOT the way to get performance from Windows. You need to use:
>
> 1. Asynchronous, Event based socket handling.
> 2. Completion ports.
> 3. Scatter/Gather buffering.
>
> Polling is lousy no matter what way you do it. You'll lose most of your performance spent going round a small loop.

You're an idiot. They're using the 'poll' system call. If you bothered to read anything, you'd realize that 'poll' is the way to do asynchronous event based I/O under Unix. It's close to what 'WaitForMultipleObjects' does under NT.

They may use the sockets API, but as far as I know, that's the way to do TCP/IP under Windows. There are a few special calls to get NT 'handles' for your sockets so you can then do WaitForMultipleObjects based event based I/O handling. I'm betting this is exactly what they did.

As for scatter gather buffering, that depends a lot on your internal application architecture. I would agree that, in general, it's a good idea. I don't think their code would do scatter gather under Unix, and not under NT. Scatter gather is implemented nearly identically under both platforms.
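To make "scatter gather" concrete, here is a small sketch on the Unix side using Python's `socket.sendmsg` and `socket.recvmsg_into` (both are Unix-only in the standard library). The function name and the sample buffers are made up for illustration; the point is only that one call can send from, or receive into, several separate buffers.

```python
import socket


def gather_send_scatter_recv():
    """Send two buffers with one call, receive into two buffers with one call."""
    a, b = socket.socketpair()
    try:
        header = b"HELO "
        body = b"example.org\r\n"
        # Gather: one system call sends several separate buffers in order.
        sent = a.sendmsg([header, body])
        # Scatter: one system call fills several separate buffers in order.
        buf1 = bytearray(len(header))
        buf2 = bytearray(len(body))
        nbytes, _ancdata, _flags, _addr = b.recvmsg_into([buf1, buf2])
        return sent, nbytes, bytes(buf1), bytes(buf2)
    finally:
        a.close()
        b.close()
```

The application never has to copy the header and body into one contiguous buffer first, which is the whole appeal.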

Your comment shows a great deal of ignorance. It's a travesty that you were moderated to +5. *sigh*

--
Need a Python, C++, Unix, Linux develop

Re:This benchmark is baloney by Omnifarious · 2001-06-15 04:42 · Score: 3

You misunderstand 'poll' completely. poll asks the OS to suspend your process until one of the indicated events happens, then you get to go respond to it. It's essentially the same thing.

Say, for example, that you're dumping data into a socket. Under Unix, you set the socket to non-blocking and write until write returns EAGAIN as an error, which is how the OS tells you that the socket buffer is full. Then you put the ability to write to that socket on the list of OS events you're interested in. Then, you go do whatever else it is you have to do. After you get done servicing everything you can service, you call poll and it blocks your process (possibly running others) until one of the indicated events happens and there's something else to service. Same basic paradigm.
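The sequence above can be sketched in a few lines of Python. This is only an illustration under some assumptions: it uses a local `socketpair` instead of a real network peer, the `selectors` module instead of calling poll directly (on most Unix systems it picks epoll, kqueue or poll underneath), and the function name `fill_then_wait` is made up.

```python
import selectors
import socket


def fill_then_wait(payload=b"x" * 65536):
    """Write until the kernel buffer is full, then wait for writability."""
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    sel = selectors.DefaultSelector()
    sent = 0
    try:
        # Step 1: write until the OS reports the socket buffer is full.
        while True:
            try:
                sent += writer.send(payload)
            except BlockingIOError:  # EAGAIN / EWOULDBLOCK
                break
        # Step 2: add "this socket is writable" to the events we care about.
        sel.register(writer, selectors.EVENT_WRITE)
        # Step 3: something else happens; here the peer drains the data.
        drained = 0
        while drained < sent:
            drained += len(reader.recv(65536))
        # Step 4: block until one of the registered events is ready.
        ready = sel.select(timeout=5)
        return sent, len(ready)
    finally:
        sel.close()
        writer.close()
        reader.close()
```

There is no busy loop anywhere: the process sleeps inside `select()` until the kernel has something to report.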

--
Need a Python, C++, Unix, Linux develop

Re:This benchmark is baloney by Omnifarious · 2001-06-15 04:48 · Score: 3

Also, VirtualAlloc there sounds an awful lot like 'mmap'. Again, same basic idea, and Microsoft does it completely differently.
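For comparison, this is roughly what page-granular allocation through mmap looks like, sketched with Python's `mmap` module. It only shows an anonymous mapping being created and used; it does not model VirtualAlloc's separate reserve and commit steps, and the helper name `anonymous_pages` is made up.

```python
import mmap


def anonymous_pages(n_pages=4):
    """Map whole pages from the VM system instead of carving them from the heap."""
    size = n_pages * mmap.PAGESIZE
    # fileno=-1 asks for an anonymous mapping that is not backed by a file.
    region = mmap.mmap(-1, size)
    try:
        region[0:5] = b"hello"
        # Untouched bytes of a fresh anonymous mapping read back as zero.
        return len(region), bytes(region[0:5]), region[size - 1]
    finally:
        region.close()
```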

I know a fair amount about the insides of NT, and most design choices they made that are different than Unix's are worse.

Here are just two:

1. A FIFO VM?!?!? How stupid can you get? LRU is much better, and while not really achievable, you can come closer than FIFO using a mark & sweep-like system (or perhaps there are better algorithms today).
2. WaitForMultipleObjects, you mean, every single semaphore and mutex call is an OS call now? No 10-20 cycle mutex grabs when there's no contention?

--
Need a Python, C++, Unix, Linux develop

Re:FreeBSD performance by MO! · 2001-06-15 04:11 · Score: 2

Additionally, they used an Intel EtherExpressPro 10/100 card (fxp driver). My understanding from the FreeBSD mailing lists is that this driver is being completely rewritten to eliminate significant performance issues in the FreeBSD 4.x versions. I suspect that even network performance would be noticeably different had they used hardware with optimized drivers across all platforms.

To say an OS's network or disk performance is poor, without considering the drivers used for your hardware, is kinda irresponsible.

It's clear, as your comment shows as well, they did not make any effort to properly tune and configure the overall system for each OS tested.

--
I AM, therefore I THINK!

Linux and ECN by Royster · 2001-06-15 03:36 · Score: 3

> I'm sure Linux will talk just fine to Linux, but other platforms might not be tuned the same. (2.4 kernels were having trouble because of this recently. Linux implemented some feature that lots of routers didn't, and performance was hosed sometimes.)

You don't seem to understand ECN. ECN is now (as of June 12) an internet standard. It will improve the performance of the Internet by allowing ECN-aware stacks to note congestion and respond appropriately instead of waiting for packets to fail to be acked and then backing off on the transmission speed. (Ever got a 'stalled' message loading a /. page? ECN is supposed to help avoid that.)

Buggy routers responded incorrectly to ECN packets by terminating the connection. It appears as if the other computer isn't even on the net. Cisco has released bug fixes to correct this bug. They have not been applied by all of the admins.

Yes, Linux 2.4 shipped with ECN enabled. The distribution packagers generally (all?) included a command in the start-up scripts to disable the feature.
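If you want to see what your own box is doing, the Linux setting lives in the `net.ipv4.tcp_ecn` sysctl, and the start-up script workaround amounts to writing 0 into it. A minimal sketch for reading it follows; the helper name `read_tcp_ecn` is made up, and on a non-Linux system the file simply won't exist.

```python
from pathlib import Path

# Linux exposes the ECN switch here; other systems will not have this file.
ECN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_ecn")


def read_tcp_ecn(path=ECN_SYSCTL):
    """Return the current tcp_ecn value as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
```

A value of 0 means the stack will not negotiate ECN at all.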

Because TCP/IP is a standard, there should not be performance differences between stacks where a stack performs better speaking to another stack of the same design. TCP/IP should be completely interoperable.

--
I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i

Re:Linux and ECN by Ayende+Rahien · 2001-06-15 03:54 · Score: 2

> Because TCP/IP is a standard, there should not be performance differences between stacks where a stack performs better speaking to another stack of the same design. TCP/IP should be completely interoperable.

TCP/IP is indeed interoperable, but some things will be faster on one system and some on another, because of the different implementations of TCP/IP.

--
Two witches watched two watches.
Which witch watched which watch?

Emulating IO Completion Ports by cpeterso · 2001-06-15 06:09 · Score: 2

An IO Completion Port is just a thread pool blocked on a counted semaphore to call select() or WaitForMultipleObjects(). If you set the initial semaphore count to the number of processors, then the OS scheduler should efficiently pin each thread processing a select/WFMO event to its own processor.
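A rough sketch of that idea, with heavy caveats: a plain queue stands in for the select()/WFMO call, the "work" is a placeholder, nothing here pins threads to processors, and real completion ports are managed by the kernel rather than by user code. All names (`run_pool`, `gate`, `pending`) are made up for illustration.

```python
import queue
import threading


def run_pool(events, n_threads=8, concurrency=2):
    """Many pool threads, but a counted semaphore lets only a few run at once."""
    gate = threading.Semaphore(concurrency)  # count would be the CPU count
    pending = queue.Queue()
    for ev in events:
        pending.put(ev)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}
    results = []

    def worker():
        while True:
            with gate:  # at most `concurrency` threads get past this point
                try:
                    ev = pending.get_nowait()  # stand-in for select()/WFMO
                except queue.Empty:
                    return
                with lock:
                    state["active"] += 1
                    state["peak"] = max(state["peak"], state["active"])
                handled = ev * 2  # placeholder for real event handling
                with lock:
                    results.append(handled)
                    state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results), state["peak"]
```

The peak number of simultaneously active handlers never exceeds the semaphore count, no matter how many threads sit in the pool.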

--
cpeterso

Re:I'd be suspicious too. by Arandir · 2001-06-15 12:18 · Score: 2

They used all three of these systems "out-of-the-box". So look at what these systems are pre-tuned out of the box for: Redhat Linux for speed and FreeBSD for stability (and Windows Y2K for media benchmarking). Think about it.

--
A Government Is a Body of People, Usually Notably Ungoverned

Re:Benchmarks by mindstrm · 2001-06-15 06:37 · Score: 2

So, based on using 1 of the dozen or so filesystems linux supports, you determine it's crap?

Try reiserfs....
I bet that 18 gigs takes *forever* to fsck if you reboot...

Well.. by mindstrm · 2001-06-15 06:39 · Score: 2

crappy benchmark, to say the least.

If the question is 'which network stack is fastest' or 'which is better under high load', there are ways to sort that out.
There are so many questions that can be asked...

And any of the systems tested are capable of blindingly fast network operations if the programmer takes into account the best way to do things on that particular machine.
Compiling the same code on 4 machines and testing the output is more of a compiler/library benchmark than a system benchmark.

Re:This benchmark is baloney by AT · 2001-06-15 03:30 · Score: 4

While your point that this benchmark is somewhat flawed is correct, you also point out a large problem with Windows:

You are forced to use proprietary MS-only extensions rather than straight, standardized POSIX calls to achieve the best performance. That means you have to suffer proprietary lock-in if you want to code high performance network applications for Windows.

I think this is deliberate: there is no reason why calls like malloc, creat, mmap, poll, whatever, couldn't have been tuned to get similar performance to the Windows specific VirtualAlloc, CreateFile, etc. Microsoft wants you to trade off portability for speed.

Re:This benchmark is baloney by Lazaru5 · 2001-06-15 10:23 · Score: 2

It wasn't tuning per se, just the raising of maxfiles because Unix defaults to lower settings. They point out that it wasn't necessary under Windows, presumably because no equivalent limit applies there.

--
My comments and opinions completely reflect those of anyone and anything I am remotely associated with.

I think you missed the sarcasm. by Lazaru5 · 2001-06-15 11:22 · Score: 2

If there were a Geek Speek generator on the net similar to the Mission Statement generators, that's what it would sound like.

How embarrassed you must be.

--
My comments and opinions completely reflect those of anyone and anything I am remotely associated with.

Re:FreeBSD performance by Lazaru5 · 2001-06-15 11:34 · Score: 2

That doesn't mean that if_fxp is a poor driver currently. Everything can be improved, however, which is what the mii rewrite is doing. if_fxp is already an excellent driver and is the best card/driver combo under FreeBSD (and probably most OSes).

--
My comments and opinions completely reflect those of anyone and anything I am remotely associated with.

Re:wake up boys by Polo · 2001-06-15 06:25 · Score: 2

Ok, so that's what it looks like. However, I did an awful lot of benchmarking at my last job to get our performance up on our hardware. So what I benchmarked was our software. It was a modified version of the apache webserver. So I had extensive results from the use of our product, and virtually NO results for SQL servers, spreadsheets, 3d-games or email applications.

I just think these guys did their job (optimizing their software), and ended up publishing their benchmark results to enlighten other people. I wish I had gathered up everything and put it out there.

Basically we tested a version of apache on BSDI 4.01, redhat linux 6.2 and solaris 7. The systems were compaq 1850r p2 450x2 boxen. BSDI needed a LOT of tweaks, but ended up being the most efficient. Solaris was pretty stable, but a little slower. Linux was about the same performance as BSDI... sometimes. Sometimes it would flake out at high loads. I'm sure it's much better now, especially with tux.

Slashdot non-biased? by garver · 2001-06-15 04:14 · Score: 2

In reading the top-moderated comments, one thought came to mind: Slashdot readers, who are accused of being rabid Linux supporters, are bashing a benchmark that came out pro-Linux.

Kudos to the Slashdot community for being objective, despite your theoretical biases.

Re:This benchmark is baloney by spectecjr · 2001-06-15 03:43 · Score: 2

> So you're saying that if you want good performance from Linux, you just code it normally - but if you want good performance from windows, you have to use all the platform dependent nonportable operating system extensions.

If that were the case for Linux, the Tux guys wouldn't be trying to put an http daemon in the kernel. They'd just keep it in user-mode and 'just code it normally'

Simon

--
Coming soon - pyrogyra

Re:This benchmark is baloney by spectecjr · 2001-06-15 03:53 · Score: 4

> I think this is deliberate: there is no reason why calls like malloc, creat, mmap, poll, whatever, couldn't have been tuned to get similar performance to the Windows specific VirtualAlloc, CreateFile, etc.

... apart from the fact that they expose different paradigms entirely?

Malloc - heap based allocation
VirtualAlloc - allocates entire pages from the VMM. Allows you to reserve or commit pages when and as you need them.

fopen - opens a file handle
CreateFile - Allows you to open a file handle, specifying buffers to use, etc etc etc.

poll - you sit there waiting and doing nothing most of the time because you're asking all your connections "are we there yet?"
CompletionPorts - the OS comes back to you when it's done, and tells you that it's finished. You can now use those spare cycles doing something else - like another 1000 network connections.

Simon

--
Coming soon - pyrogyra

This benchmark is baloney by spectecjr · 2001-06-15 02:46 · Score: 5

I'm sorry, but I can't see how this is a valid benchmark.

"As a real-world test, we measured how quickly email could be sent using our MailEngine software. MailEngine is an email delivery server, ships on all the tested platforms (plus on Solaris for Sparc), and uses an asynchronous architecture (with non-blocking TCP/IP using
the poll() system call). So that email was not actually delivered to our 200,000-member test list, we ran MailEngine in test mode. In this mode, MailEngine performs all the steps of sending mail, but sends the RSET command instead of the DATA command at the last moment. The SMTP connection is then QUIT, and no email is delivered to the recipient. Our workload consisted of a single message being delivered to 200,000 distinct email addresses spread across 9113 domains. Because the same message was queued in memory for every recipient, disk I/O was not a significant factor. We slowly raised the number of simultaneous connections to see how the increased load altered performance."

Nice! So in other words, they used straight BSD sockets for their
implementation - which is NOT the way to get performance from Windows. You
need to use:

1. Asynchronous, Event based socket handling.
2. Completion ports.
3. Scatter/Gather buffering.

Polling is lousy no matter what way you do it. You'll lose most of your
performance spent going round a small loop.

Similarly you can infer that they used straight malloc() for their memory
handling, and most likely file handling - again very lousy
performance-wise on windows compared to the alternatives, such as
VirtualAlloc, CreateFile(), scatter-gather file handling and more.

As for the second test, we can guess (from their comments) that they're
using straight C++/C file operations under windows instead of tuning them to
the architecture, so of course performance is going to be lousy -- they're
benchmarking Microsoft's C runtime implementation, nothing more, nothing
less.

Also note that:
1. They don't provide details of which compiler they're using.

2. They don't provide details of the actual benchmark code for test 2.

3. They only tuned the Linux, FreeBSD and Solaris setups -- they should have
tuned Win2k server as well.

Sheesh. Talk about a crappy way to benchmark.

Simon

--
Coming soon - pyrogyra

Re:This benchmark is baloney by be-fan · 2001-06-15 04:10 · Score: 2

One can look at it another way. The majority of developers use Win32! Why can't Linux get with the program?
If you're benchmarking an OS, you use whatever is fastest on the OS. And since most Win2K server programs WILL use whatever is fastest on Win2K, the benchmark can be valid as a real-world test (assuming all other factors are correct, of course!)

>>>>>
For the English-impaired, I'm not advocating that Linux switch to Win32. I'm simply stating that what is "normal" is in the eye of the beholder.

--
A deep unwavering belief is a sure sign you're missing something...

Re:This benchmark is baloney by ostiguy · 2001-06-15 07:41 · Score: 2

Actually, for 2k, they could have turned off 8.3 filename creation via a registry entry. On NTFS with tens of thousands of files, the 8.3 name creation can be a drag. On an email server, you should be able to make the registry change, even though it breaks DOS compatibility. Since one of the tests specifically did create 10k files in one dir, the tweak might help.

ostiguy

Re:This benchmark is baloney by teg · 2001-06-15 05:30 · Score: 2

There is a difference between good performance (standard use) and exceptional performance, like the one Tux gets.

Re:This benchmark is baloney by crucini · 2001-06-15 04:25 · Score: 2

> You need to use:...1. Asynchronous, Event based socket handling. ... Polling is lousy no matter what way you do it. You'll lose most of your performance spent going round a small loop.

Please type 'man poll'. You'll find that poll(2) is asynchronous and event based. Nothing to do with cycling in a tight loop. Which doesn't detract much from your point that the benchmarkers showed no signs of understanding or adapting to the Windows OS.

Target of this benchmark by Restil · 2001-06-15 04:37 · Score: 3

Anyone else notice the heavy concentration in that article on the efficiency of mailing out large numbers of email messages? Now, I'm certain there are many MANY legitimate reasons why someone would have a "test list" of 200,000 email addresses, it's just that I can't seem to think of any at the moment.

-Restil

--
Play with my webcams and lights here

Re:Win2k could have been faster, but at what price by throx · 2001-06-15 09:35 · Score: 2

In case you didn't notice, NT is not Unix, never has been Unix and never will be Unix. There are so many design differences in the underlying system that it is hard to believe you are even suggesting it's a good idea to use a single code base.

I've seen plenty of code that has a single source for Unix and NT, but NONE of it is high performance and most of it behaves very strangely on NT when you compare it to a properly written NT service or application.

If you are writing high performance code then you are almost certainly writing for a particular system and have to write the code for that system. Writing for NT is different to writing for Unix (I prefer writing for NT personally but that's another issue) and trying to say that it's lock-in is just stating the obvious.

By your argument Linux is deliberately encouraging the use of non-portable code through applications like Tux which only work on Linux boxes and not Windows, or even other Unixes.

It's just daft.

--