Apache 2.0 doesn't actually do thread creation
very frequently. The thread creation cost occurs
mostly at startup. So the limiting factors for
threaded Apache performance on Linux are mainly:
- The speed with which the kernel can
  schedule and context-switch among threads.
  For some recent data on this, see
  http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=103228014211983
  The O(1) scheduler patch for 2.4 seems to help
  here.
- Memory usage per thread
- Concurrency limitations of the Apache code
  itself.
  This has been improving gradually with
  successive 2.0 releases, as the remaining
  global locks are removed or optimized.
- General robustness of the thread
  implementation.
  The current (2.4) Linux threading implementation
  doesn't work well with debuggers.
At first glance, it looks like the NPTL could
be a win for threaded Apache on Linux, as it offers
some solutions for the first and last of these
issues.
This will certainly not win me friends in the "everything should use threads because it's easier to do linear programming than to build a session reentrant state machine" camp, but...
It's hard to overstate the importance of the ease-of-programming issue. The current Apache 2 MPMs (concurrency plug-ins) use threads rather than state machines because a pure state machine model would make life very, very difficult for third-party module developers. The main reason for the current multithreaded model is that it offers a reasonable compromise between performance and ease of development.
If Apache were a closed
system, it probably would be a big event loop
already. But Apache is really a
platform as much as it is a
product, and state machines don't allow
for easy module development.
Note that, because the concurrency model is a
plug-in in Apache 2.0, it's possible to drop in
a state-machine replacement for the current
threaded design. But it wouldn't be usable
with third-party modules that make assumptions
about being able to do blocking I/O.
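To illustrate the ease-of-programming gap, here is a small Python sketch (not Apache code; the names handle_blocking and HandlerStateMachine and the two-line "protocol" are made up for the example). Both versions do the same thing, but the blocking one reads top to bottom, while the state machine has to remember where it was between events.

```python
def handle_blocking(read_line, write):
    """Linear style: each read_line() call is assumed to block until input arrives."""
    name = read_line()
    write("hello " + name)
    reply = read_line()
    write("bye " + reply)


class HandlerStateMachine:
    """Event-loop style: the loop calls on_line() whenever a line is ready."""

    def __init__(self, write):
        self.write = write
        self.state = "WAIT_NAME"  # explicit record of "where we are"

    def on_line(self, line):
        if self.state == "WAIT_NAME":
            self.write("hello " + line)
            self.state = "WAIT_REPLY"
        elif self.state == "WAIT_REPLY":
            self.write("bye " + line)
            self.state = "DONE"
```

A module written in the first style can't simply be dropped into the second: every point where it blocks would have to become a named state.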
Among the httpd-2.0 developers, there's been
some discussion of moving to a hybrid thread/async
design: run most of the module callback hooks
in a thread pool so that modules can use blocking
code, but complete the network writes in a
separate thread that runs a big event loop,
so that we don't need to keep a huge number of
worker threads around just to wait for network
writes to complete. (See, for example, http://marc.theaimsgroup.com/?t=103083818700002&r=1&w=2)
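A rough Python sketch of that hybrid idea, under some loud assumptions: this is not httpd code, and WriteLoop, handle_request, and serve are hypothetical names invented for the illustration. Handlers run in a thread pool and may block; once a response is built, the worker hands the socket to one thread that completes all writes with non-blocking I/O.

```python
import queue
import selectors
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


class WriteLoop(threading.Thread):
    """One thread that finishes every pending network write."""

    def __init__(self):
        super().__init__(daemon=True)
        self.selector = selectors.DefaultSelector()
        self.handoff = queue.Queue()
        # Socket pair used only to wake the loop when new work arrives.
        self.wake_r, self.wake_w = socket.socketpair()
        self.wake_r.setblocking(False)
        self.selector.register(self.wake_r, selectors.EVENT_READ)

    def submit(self, conn, data):
        """Called by workers: queue the response and return immediately."""
        self.handoff.put((conn, data))
        self.wake_w.send(b"x")

    def run(self):
        while True:
            for key, _ in self.selector.select():
                if key.fileobj is self.wake_r:
                    self._accept_handoffs()
                else:
                    self._write_some(key.fileobj, key.data)

    def _accept_handoffs(self):
        self.wake_r.recv(4096)  # drain wake-up bytes
        while True:
            try:
                conn, data = self.handoff.get_nowait()
            except queue.Empty:
                return
            conn.setblocking(False)
            self.selector.register(conn, selectors.EVENT_WRITE, bytearray(data))

    def _write_some(self, conn, pending):
        try:
            sent = conn.send(pending)
            del pending[:sent]
        except BlockingIOError:
            return  # not writable after all; wait for the next event
        except OSError:
            pending.clear()  # peer went away; drop the rest
        if not pending:
            self.selector.unregister(conn)
            conn.close()


def handle_request(path):
    """Stand-in for a module hook; free to use ordinary blocking code."""
    return b"HTTP/1.0 200 OK\r\n\r\n" + ("hello from " + path).encode()


def serve(conn, path, workers, writer):
    """Run the handler on a worker thread, then pass the write to the loop."""
    return workers.submit(lambda: writer.submit(conn, handle_request(path)))
```

The worker is free again as soon as submit() returns, so a slow client ties up a registered socket in the event loop rather than a whole thread.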
Threading does provide a real improvement in
swap usage, though. On systems like Solaris
that reserve swap space equal to the parent
process's size upon fork, the 100 processes
at 15MB will require 1.5GB of memory+swap.
Copy-on-write will indeed keep you from having
to use that much physical memory, but the
swap space requirement is still a problem
with 1.3 (and with 2.0 using the prefork MPM).
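The arithmetic, as a tiny Python sketch. The function name is made up, and the threaded layout of 4 processes is purely a hypothetical comparison point; per-thread stack space is not modeled here.

```python
def reserved_swap_mb(processes, process_mb):
    """Worst case when each fork reserves swap equal to the full process size."""
    return processes * process_mb


# Prefork: 100 single-threaded children at 15MB each.
prefork_mb = reserved_swap_mb(100, 15)

# Hypothetical threaded layout: the same 100 workers as 4 processes
# of 25 threads each (thread stacks ignored in this estimate).
threaded_mb = reserved_swap_mb(4, 15)

print(prefork_mb, threaded_mb)
```

The reservation scales with the number of processes, not the number of workers, which is where the threaded MPMs come out ahead.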