High-Performance Programming Techniques on Linux

Disclaimer by Slashdot's+Attorney · 2002-06-26 11:36 · Score: 1, Funny

Slashdot makes no guarantees as to the accuracy of the above information. All data provided are for entertainment purposes only, and any damage to hardware as a result of following the above advice is the sole responsibility of the user.

--
Slashdot's Attorney

Re:Disclaimer by Anonymous Coward · 2002-06-26 13:54 · Score: 0

Thank you for the warning, kind troll! Your effort is much appreciated and could save many a person from disaster. Keep up the good work, I expect to see more from you!

This doesn't seem too interesting. by Lenolium · 2002-06-26 12:51 · Score: 2, Informative

Linux, always known for its fast context switches, is faster than Windows, which is pretty slow at context switches as far as OSes go, when running a minimal-load multi-threaded application. Well, it's not quite a shocker, but glad to know that we are still ahead.

"high performance"??? by dirtydamo · 2002-06-26 13:30 · Score: 3, Insightful

That's the most ludicrous statement. How can one call a normal send/recv loop high performance socket code?

I, for one, remain totally unconvinced by this article (at least the guy who wrote it admits he doesn't know anything about Windows). How can one possibly compare "high performance" I/O on Windows without using overlapped I/O, and possibly even completion ports?

Re:"high performance"??? by Paranoid · 2002-06-26 17:15 · Score: 2

Likewise, how can one possibly compare "high performance" I/O on Linux without using O_NONBLOCK and SIGIO, and possibly even POSIX AIO? =)
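
For anyone who hasn't used the non-blocking style mentioned above, here is a minimal sketch of what O_NONBLOCK changes. It is written in Python purely for brevity and is not taken from the article; the helper names `try_recv` and `demo` are made up for illustration. A local socket pair stands in for a real network connection.

```python
import socket


def try_recv(sock, max_bytes=4096):
    """Read from a non-blocking socket; return None if nothing is ready."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK: the call would have blocked, so it returns at once.
        return None


def demo():
    # A connected pair of local sockets stands in for a client and a server.
    sender, receiver = socket.socketpair()
    try:
        receiver.setblocking(False)   # sets O_NONBLOCK on the descriptor
        first = try_recv(receiver)    # nothing has been sent yet
        sender.sendall(b"ping")
        second = try_recv(receiver)   # data is queued now
        return first, second
    finally:
        sender.close()
        receiver.close()
```

A blocking recv() would simply hang on the first call; the non-blocking one hands control back so the program can do other work and retry later, typically after select/poll or a SIGIO notification says the descriptor is ready.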

I believe the point was trying to compare apples to apples, which is why the same API was used (to the extent possible) on both sides of the pond.

Perhaps the article had a misleading title. On the other hand, don't all benchmarks have the term 'high-performance' in them somewhere?

I actually liked these articles (even though I saw at least one of them before, here) - it seemed a good test of basic functionality, and as you rightly pointed out, the API they used really is basic. It did a far better job of comparing apples with apples than most comparisons, rather than shooting for some abstract (and uncomparable) "high-level" API, without even indicating how much of a benefit such an interface has over the base-level.

--
Paranoid
Bwaahahahahaa.

Why I don't do Windows by PD · 2002-06-26 14:47 · Score: 2

From the article about pipes:

The number 24 in the first executable line of code above was determined experimentally. I found no mention of it anywhere in the Platform SDK. If it is not present, the program doesn't work. Apparently, the pipe facility requires a 24-byte header on each write to the pipe.

If this were Linux, we'd be able to know what that 24 bytes was.

--
If tits were wings it'd be flying around.

Re:Why I don't do Windows by Anonymous Coward · 2002-06-27 09:59 · Score: 0

Are we sure that the pipe isn't simply set up for word-sized data and he tried to write byte-sized data to it?

Re:Why I don't do Windows by Anonymous Coward · 2002-06-27 12:29 · Score: 0

read it again.

24 BYTE header, not bits.

Benchmark bullshit and no knowledge of Windows by Twylite · 2002-06-26 19:07 · Score: 5, Informative

This article is a typical case of benchmark bullshit. The author has taken a deliberately Unix-centric view of computing, and ignored design and implementation concepts that are normal for Windows-based systems.

In the synchronisation article (the /. poster missed the link for that one) only Mutexes, Semaphores and Critical Sections are evaluated. It is well known that mutex performance on Windows is poor compared to *nix, but that is mitigated by a number of benefits in the Windows threading model.

Here's a brief intro, to show why they CAN'T be compared:
- Windows has processes and threads as first-class citizens, and they have fair (multi-level round-robin) scheduling. Mutexes, semaphores and critical sections are the primary locks, but there are also atomic check-and-increment functions as well as events/signals (long-lasting flags). Every object (mutex, sem, section, event, thread, process, file, socket, etc.) in Windows can be waited on, and you can wait on any number and combination of objects at once, in either an AND or OR configuration, e.g. wait for a mutex AND an async socket IO; or wait for a semaphore OR a thread to end OR an event.
- Linux's options are far more limited - to achieve the same results you have to use a different architecture (not that this is necessarily a bad thing); on the other hand Linux's primitives and context switching are faster than the Windows equivalents. Linux has kernel-scheduled processes, userland threads (kernel threads are available), a fair but not deterministic scheduler, mutexes, semaphores and condition variables.
- A condition variable is similar to an event, but is instantaneous - if no thread is waiting on the condition variable, nothing happens. An event stays set until it releases a thread (auto-reset events) or until explicitly reset (manual-reset events). A condition variable is one of the few time-waitable objects in Linux (all objects are time-waitable in Windows; mutexes and semaphores are not time-waitable in Linux).
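
The event versus condition variable difference described above can be sketched in a few lines. This uses Python's `threading.Event` and `threading.Condition` as stand-ins (an assumption for illustration: they are not the Win32 or pthreads objects themselves, though `Event` behaves roughly like a manual-reset event). The function names are hypothetical.

```python
import threading


def event_remembers_signal():
    """Signal an event before anyone waits; a later wait still succeeds."""
    ev = threading.Event()
    ev.set()                        # no thread is waiting yet
    return ev.wait(timeout=0.1)     # True: the flag stays set until cleared


def condition_forgets_signal():
    """Notify a condition before anyone waits; a later wait times out."""
    cond = threading.Condition()
    with cond:
        cond.notify()               # no waiter, so the notification is lost
    with cond:
        return cond.wait(timeout=0.1)   # False: nothing left to wake us
```

This is why code built on condition variables always pairs them with a predicate checked under the mutex, while event-based code can get away with "set it and forget it".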

The comparative power of Windows' threading and synchronisation model may not be obvious to long-time Unix programmers, but consider the wider range of architectural possibility when you can wait (with a timeout) on any combination of any objects in the system.

In the socket article, the author compares the BSD socket API on Linux with the WSASocket API on Windows, which is meant primarily for asynchronous operation. Despite claiming techniques for "high performance" sockets, he fails to mention /dev/poll, POSIX AIO, or Windows' IoCompletion Ports. POSIX AIO can be reasonably compared to Windows' async socket/file support, but it is impossible to make a valid comparison between /dev/poll (or kqueue, etc.) and IoCompletion Ports because they require significantly different architectures to function at peak efficiency.

On to processes and threads. CreateProcess() has the combined functionality of fork() and exec(), so the article starts off on the wrong foot. It also supports security attributes, so the equivalent Linux example should have had a larger function starting with fork(), then dropping permissions in the child and exec()ing another binary.
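
To make the fork()/exec() split concrete, here is a rough sketch of the Unix side, written in Python for brevity rather than taken from the article. The `spawn` helper is hypothetical; the comment in the child marks the point where permissions would be dropped in the larger example suggested above.

```python
import os
import sys


def spawn(argv):
    """fork() a child, exec() argv in it, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process. Anything done here (dropping privileges, closing
        # descriptors, changing directory) happens before the new binary runs.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)   # only reached if exec() failed
    # Parent process: wait for the child and decode its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print(spawn([sys.executable, "-c", "raise SystemExit(3)"]))
```

CreateProcess() does both steps in one call, so timing it against a bare fork() measures different amounts of work.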

The author incorrectly asserts that Linux threads are scheduled by the CPU - he is using the pthreads library, which is userland threading. pthreads is also far from "fair"; Windows uses a multi-level round-robin algorithm, which makes thread scheduling very deterministic; pthreads is far more prone to thread starvation in a system where processing cascades between threads, e.g. an input thread, processing thread and output thread, which use mutex-protected queues to communicate. This is an excellent architecture for Windows, but performs poorly by comparison on *nix because a sudden heavy load will see the input thread scheduled more often than other threads, until its load dies down, at which point the processing thread will get the load, and so on - throughput stays much the same as on a Windows system, but latency near-triples.
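
The input/processing/output architecture described above looks roughly like this. It is a sketch only: Python's `queue.Queue` plays the part of the mutex-protected queue, the doubling step is a placeholder for real work, and all names are invented for the example. It shows the structure, not the scheduling behaviour under load.

```python
import queue
import threading

STOP = object()   # sentinel that tells the next stage to shut down


def double(item):
    """Placeholder for the real processing step."""
    return item * 2


def run_pipeline(items):
    raw = queue.Queue()         # input thread -> processing thread
    processed = queue.Queue()   # processing thread -> output thread
    results = []

    def input_stage():
        for item in items:
            raw.put(item)
        raw.put(STOP)

    def processing_stage():
        while True:
            item = raw.get()
            if item is STOP:
                processed.put(STOP)
                return
            processed.put(double(item))

    def output_stage():
        while True:
            item = processed.get()
            if item is STOP:
                return
            results.append(item)

    threads = [threading.Thread(target=stage)
               for stage in (input_stage, processing_stage, output_stage)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With one thread per stage and FIFO queues, output order matches input order; how evenly the three threads get CPU time under a burst of input is exactly the scheduler question being argued here.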

Benchmarking thread creation is a load of crap. Few seriously high-performance servers use a thread-per-connection architecture anymore; and at the very least they use thread pools.
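
For comparison with thread-per-connection, a thread pool looks something like the sketch below (Python's standard `ThreadPoolExecutor`; the `handle` and `serve` names and the integer "connections" are made up for illustration). The point is that a fixed number of threads is created once and reused, so thread creation cost stops mattering.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(conn_id):
    """Pretend to service one connection; report which thread did the work."""
    return conn_id, threading.current_thread().name


def serve(connection_ids, workers=4):
    # At most `workers` threads exist no matter how many connections arrive.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, connection_ids))
```

Calling `serve(range(20), workers=4)` services twenty "connections" while never using more than four distinct worker threads.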

The entire article is unfair to both sides: on Windows, threads are first-class citizens; on Linux you are more likely to use multiple processes for stability and performance.

I've already covered everything necessary to dispute the bullshit in the Scheduling article.

Conclusion: this is an excellent case of "don't believe the FUD". You can't compare apples and apples when some of the apples are growing on an orange tree. The only way to achieve a meaningful comparison of these platforms is to construct applications with equivalent functions, but designed and implemented for the target platform.

--
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net

Re:Benchmark bullshit and no knowledge of Windows by ianezz · 2002-06-26 21:18 · Score: 5, Informative

The author incorrectly asserts that Linux threads are scheduled by the CPU - he is using the pthreads library, which is userland threading.

Uh?

Last time I checked, "pthread" is just an API, and on Linux you have at least two implementations of that:

- linuxthreads (kernel-based, uses the clone() system call, definitely scheduled by the kernel), which is the one shipped with GNU libc (the one normally used, and the one used by the author of the article, btw).
- GNU Pth (completely userland).

IBM is also working to implement an M:N threading implementation with a pthread API, partially kernel-based and partially in userland.
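
One quick way to see whether threads are kernel-scheduled is to ask each one for its kernel-level thread ID. The sketch below assumes a current Linux system and Python 3.8 or later (for `threading.get_native_id()`), so it reflects today's threading library rather than the 2002-era linuxthreads; the `native_ids` helper is hypothetical.

```python
import threading


def native_ids(count=3):
    """Start `count` threads and collect the kernel thread ID of each."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all have reported,
    # so the kernel cannot hand a finished thread's ID to a later one.
    barrier = threading.Barrier(count)

    def worker():
        with lock:
            ids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids
```

Each worker reports a different ID, distinct from the main thread's, which is what you would expect when every thread is a separate schedulable entity in the kernel. A purely userland package such as GNU Pth would have no per-thread kernel ID to report.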

Re:Benchmark bullshit and no knowledge of Windows by sigwinch · 2002-06-27 08:41 · Score: 2

Every object (mutex, sem, section, event, thread, process, file, socket, etc) in Windows can be waited on, and you can wait on any number and combination of objects at once, in either an AND or OR configuration. e.g. wait for a mutex AND an async socket IO; or wait for a semaphore OR a thread to end OR an event.

Not serial ports--they take a different API. (Last I heard; I may be misinformed.)

Linux has kernel scheduled processes, userland threads (kernel threads are available)...

As somebody else points out, Linux kernel threads do exist and are usually used. More importantly, the Linux kernel multiprogramming model makes no distinction between threads and processes. A thread is simply a process that shares memory with another process. Linux thread creation and switching are very fast; forking a new process is only a little more expensive than starting a new thread.

A condition variable is one of the few time-waitable objects in Linux (all objects are time-waitable in Windows; mutexes and semaphores are not time-waitable in Linux).

However, when your pipes are fast you don't *need* a tasteless profusion of inter-context communication, and Linux pipes are time-waitable (using the conventional I/O waiting API: select, poll, /dev/poll). The only thing Linux lacks is the ability to wait for several conditions to become true using a single system call. (You can do AND with blocking read(2), but you can't wait on anything else at the same time.)

Benchmarking thread creation is a load of crap. Few seriously high-performance servers use a thread-per-connection architecture anymore; and at the very least they use thread pools.

If your threads suck, you are constrained to use a thread pool. If your threads are good, you can use whatever is appropriate for the job. (There are many small jobs that thread-per-connection will handle just fine provided your OS isn't raping you for it.)
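
On the point about pipes being time-waitable: a minimal sketch of waiting on two pipes at once with a timeout, using select(). It is in Python for brevity, and `wait_any` and `demo` are made-up names. Note that this is an OR wait: it returns as soon as any one descriptor is readable.

```python
import os
import select


def wait_any(fds, timeout):
    """Block until at least one descriptor is readable or the timeout expires."""
    readable, _, _ = select.select(fds, [], [], timeout)
    return readable


def demo():
    r1, w1 = os.pipe()
    r2, w2 = os.pipe()
    try:
        nothing = wait_any([r1, r2], 0.1)   # both pipes empty: times out
        os.write(w2, b"x")
        ready = wait_any([r1, r2], 0.1)     # only the second pipe has data
        return nothing, ready == [r2]
    finally:
        for fd in (r1, w1, r2, w2):
            os.close(fd)
```

There is no equivalent single call that waits until all of the descriptors are ready, which is the missing AND case mentioned above.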

--
--
Kuro5hin.org: where the good times never end. ;-)

Very valuable information. by Anonymous Coward · 2002-06-26 20:11 · Score: 0

For those of us who write cross-platform code, we need to know these numbers in order to see if we want to try to get by with the most common code possible, or do we spend the extra time on the Windows platform and write some sort of non-portable crap to work around the fact that Windows POSIX support truly, truly sucks.

And don't look for this to change anytime soon. It is in MS's best interest to force people to use their APIs instead of open APIs developed with 35 years of experience as to what works and what doesn't.

mod parent up.. by johnfoobar · 2002-06-26 20:58 · Score: 2, Interesting

in fairness UNIX (or at least linux and the BSDs) is comparatively weak when it comes to multi-threading and lots of the slashdot zealots (sue me) could really benefit from actually sitting down with a copy of Inside Windows 2000 rather than just mouthing off about microsoft being evil and windows being crap.

multi-threading is why, for example aolserver can do with one process what apache needs a bunch of processes to do. (though i digress, aolserver only has to run tcl interps, where apache is much more versatile.)

meanwhile, both FreeBSD and NetBSD are trying to get SMP and scheduler activations into their kernels. this would improve their support for multi-threading substantially. there's a paper which explains this better than i ever could.

Re:mod parent up.. by Anonymous Coward · 2002-06-30 05:02 · Score: 0

multi-threading is why, for example aolserver [aolserver.com] can do with one process what apache needs a bunch of processes to do.

You're behind the times. Apache 2 has been out for a while now and it supports the same threading model as aolserver, as well as a bunch of others.

He doesn't know what he is doing. by benhaha · 2002-06-26 23:22 · Score: 3, Informative

The validity of the exercise is compromised by his assumption that multiple processes, as opposed to multiple threads, were the best choice for whatever his benchmark is supposed to model, and that if they are, RPC, COM or shared memory are not more appropriate to the IPC task. Windows has many ways of doing IPC and concurrent tasking, and most applications use IPC methods other than pipes. This failure of choice is an important reason why such like-for-like benchmarks are of little value.

In short, these "high-performance techniques" are high-performance on Linux only, the way he does it. On Windows, other methods, not available on Linux, are more used.
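
As an illustration of shared memory as an alternative to pipes for IPC, here is a small sketch of the Unix flavour of the idea (not the Windows mechanisms referred to above). It assumes a Unix system, where Python's anonymous `mmap` is shared with forked children by default; the function name is hypothetical.

```python
import mmap
import os
import struct


def child_writes_parent_reads(value):
    """Pass one 64-bit integer from a forked child to its parent via shared memory."""
    shared = mmap.mmap(-1, 8)   # anonymous mapping, MAP_SHARED by default on Unix
    pid = os.fork()
    if pid == 0:
        # Child: write straight into the shared page, then exit immediately.
        try:
            struct.pack_into("q", shared, 0, value)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)          # parent: wait so the write is complete
    (result,) = struct.unpack_from("q", shared, 0)
    shared.close()
    return result
```

No data is copied through the kernel here, unlike a pipe write followed by a read, which is why a benchmark restricted to pipes says little about how fast IPC can be on either platform.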

--
NO ID: BEING FREE MEANS NOT HAVING TO PROVE IT
