Slashdot Mirror


Should Servers be Mono-Process or Multithreaded?

An anonymous reader wonders: "How would you design the fastest possible Linux-based server application today? A few years ago, the thinking was that multi-threading was not the way to go — instead, high-performance servers used an event-driven, mono-process model (consider lighttpd and haproxy). However, things have changed. Today CPUs have dual cores, and over the next few years this is only likely to increase. Also, the 2.6 Linux kernel has made multi-threading much more efficient. So I'm wondering, does Slashdot think that modern high performance server software should be designed to be multi-threaded, or does it still make more sense to use an event driven, mono-process architecture, despite the advances in the Linux 2.6 threading and the arrival of multi-core CPUs?"

6 of 96 comments

  1. Debugging by Spazmania · · Score: 2, Interesting

    Debugging a preforked C program like Apache 1.3 is still much easier than debugging a monolithic process with non-blocking I/O like Squid.
    Debugging a monolithic C process like Squid is still much easier than debugging multithreaded software like Microsoft Windows.

    This isn't likely to change regardless of how many processors you throw in the machine.

    The rules change when you move to something like Java where cross-thread contamination of the data structures is relatively difficult. But then if you're talking about Java then why are you seriously considering hardware performance issues?

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    1. Re:Debugging by Anonymous Coward · · Score: 1, Interesting

      You are right... for a C/C++ based program. For Erlang (or even Oz) I would disagree. Ejabberd and Tsung are good examples of what Erlang is capable of in terms of simultaneous users and performance. Check it out.

  2. Re:the answer is yes... by Anonymous Coward · · Score: 1, Interesting

    What is the server for? Is it serving a database to 27 hojillion simultaneous users? Serving a static intranet to 5? Filtering mail for AOL.com?

    Bingo.

    I recently replaced a server for a client. His previous one was just too slow, and his response was to keep replacing it with a new one with a faster CPU and more RAM. His result? Only marginal increases in speed.

    I profiled the machine as he was using it, noticed that the bottleneck was disk access, and replaced the IDE drives he was using with a caching SCSI RAID. The result? A 20x speed improvement (yes, 20x: operations that used to take 8 to 10 minutes now ran in under 20 seconds).

  3. This hits home. What I did. What should I do? by kingradar · · Score: 4, Interesting

    What should I do?

    This debate hits home with me. I wrote a server daemon to handle the SMTP and POP protocols, and when I first started out I had to make a choice. The choice I made back then was to use a threaded model. The way it works is I spawn X threads which all make blocking calls to accept(). Each thread returns from accept() only once the kernel has assigned it a new connection. For performance I spawn the threads ahead of time. This architecture was a mistake. The issue is that I have to spawn a separate pool of threads to listen on port 25 (SMTP), port 110 (POP), port 465 (SMTP over SSL) and port 995 (POP over SSL). With this model I could end up with extra threads listening on port 25 when I need more threads listening and processing connections on port 465. This problem leads me to overcompensate by spawning _extra_ threads just in case. Of course this strategy wastes resources, as the extra threads eat memory without benefit.
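
The pre-spawned pool described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the poster's actual code; `handle`, `start_accept_pool` and the greeting text are hypothetical names chosen for the example.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Hypothetical stand-in for the SMTP/POP session logic: send a greeting."""
    with conn:
        conn.sendall(b"220 example ESMTP ready\r\n")


def start_accept_pool(listener: socket.socket, n_threads: int) -> list:
    """Pre-spawn n_threads workers that all block in accept() on one listening socket."""

    def worker() -> None:
        while True:
            try:
                # Blocks until the kernel hands this thread a new connection.
                conn, _addr = listener.accept()
            except OSError:
                return  # accept() failed (for example, the listener was closed)
            handle(conn)

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(n_threads)]
    for t in threads:
        t.start()
    return threads
```

Each port needs its own call to `start_accept_pool`, which is exactly where the imbalance comes from: threads parked on the port 25 listener cannot help when port 465 is the busy one.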

    To address the SEGFAULT issue, i.e. one rogue thread taking the whole system down, I also fork multiple processes. In my case I fork 12 processes with 128 threads each. If one process gets killed by a SEGFAULT, the remaining processes continue to work. When I first launched the system, and it faced a torrent of email... 100K+ messages a day, I would have about one process die every 24 hours. With careful debugging work, I've gotten the code stable enough now that I haven't lost a process in about 9 months.

    My theory when I first wrote this code was to leave scheduling to the kernel. I figured that if a thread was blocked waiting for IO data, the kernel wouldn't schedule time slices for it. This meant those extra threads sat in the background waiting, but not using CPU time. I am starting to wonder whether this is a good theory. I am considering switching to a different model (more on that in a second), but am not sure which one is best. By the way, the reason each process has so many threads is for DB connection pooling. Each process gets 8 DB connections which are shared between the 128 threads. Each process also has its own copy of the antivirus database. I know it's possible, but trying to share DB connections and data between processes is much more difficult.
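
For readers unfamiliar with the pooling arrangement mentioned above (8 connections shared by 128 threads), here is one way it could look. It is a sketch under assumptions: `ConnectionPool` is a hypothetical class, and the "connections" can be any objects, since no real database library is involved.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: threads block in acquire() until a connection is free."""

    def __init__(self, connections):
        self._free = queue.Queue()
        for conn in connections:
            self._free.put(conn)

    @contextmanager
    def acquire(self):
        conn = self._free.get()  # blocks while every connection is checked out
        try:
            yield conn
        finally:
            self._free.put(conn)  # always hand the connection back, even on error
```

A worker thread would wrap each query in `with pool.acquire() as conn:`. Threads waiting for a free connection are blocked inside the queue, so, as with the blocked accept() calls, they should not be consuming CPU time while they wait.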

    I plan to refactor this code soon, and have been struggling with what to do. I am curious to hear the thoughts of others?

    The current plan is to move to a model where I spawn a single thread for each port. When these listening threads have a new connection, they dump the socket handle and the protocol into a buffer. I would then also spawn a pool of worker threads which read the incoming connections out of the buffer. Using semaphores and reflection these worker threads would pick up incoming connections and feed them to the right function depending on the protocol. I think this model would work much better than what I have now, but is this the best option?
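
The planned design (one listener thread per port feeding a shared buffer that a single worker pool drains) might be sketched like this. Two substitutions are made for brevity and should be read as assumptions: a `queue.Queue` stands in for the hand-rolled buffer and semaphores, and a plain dictionary lookup stands in for reflection. All function names are hypothetical.

```python
import queue
import socket
import threading


# Hypothetical protocol handlers; real ones would run the SMTP or POP dialogue.
def handle_smtp(conn: socket.socket) -> None:
    conn.sendall(b"220 SMTP ready\r\n")


def handle_pop(conn: socket.socket) -> None:
    conn.sendall(b"+OK POP3 ready\r\n")


HANDLERS = {"smtp": handle_smtp, "pop": handle_pop}


def listen_loop(listener: socket.socket, protocol: str, pending: queue.Queue) -> None:
    """One thread per port: accept, enqueue (socket, protocol), and do nothing else."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:
            return  # accept() failed (for example, the listener was closed)
        pending.put((conn, protocol))


def work_loop(pending: queue.Queue) -> None:
    """Worker: take the next connection, whatever port it arrived on, and dispatch it."""
    while True:
        item = pending.get()
        if item is None:  # sentinel value: stop this worker
            return
        conn, protocol = item
        with conn:
            HANDLERS[protocol](conn)


def start_server(listeners: dict, n_workers: int) -> queue.Queue:
    """listeners maps a protocol name to a bound, listening socket."""
    pending = queue.Queue()
    for protocol, listener in listeners.items():
        threading.Thread(
            target=listen_loop, args=(listener, protocol, pending), daemon=True
        ).start()
    for _ in range(n_workers):
        threading.Thread(target=work_loop, args=(pending,), daemon=True).start()
    return pending
```

Because every worker can serve every protocol, there is no longer a per-port pool to size, which removes the reason for spawning spare threads "just in case".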

    The other option is to create a system where I spawn only 8 worker threads (or some similar number). This pool of 8 threads then uses epoll() to find out which sockets need attention and address them accordingly. The problem with this model is that if an incomplete message is received, the thread couldn't process all the way into the output stage. Instead the data would need to be stored until the message sending was complete. Let me give an example: the thread might get "RCPT TO: " the first time it checked a socket. The thread stores this incomplete message. Then the second time around another thread picks up "example@example.com". The thread assembles the message into "RCPT TO: example@example.com" and then processes the entire command accordingly.
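
The reassembly step in that example can be isolated into a small per-connection buffer. The sketch below assumes commands end in CRLF, as SMTP and POP3 commands do; `LineAssembler` is a hypothetical name, and the epoll loop around it is left out.

```python
class LineAssembler:
    """Per-connection buffer: collects fragments, returns only complete commands."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add newly read bytes; return every complete CRLF-terminated line so far."""
        self._buf += data
        lines = []
        while True:
            end = self._buf.find(b"\r\n")
            if end < 0:
                break  # the remainder is an incomplete command; keep it for next time
            lines.append(bytes(self._buf[:end]))
            del self._buf[: end + 2]  # drop the line and its CRLF
        return lines
```

Feeding `b"RCPT TO: "` returns an empty list, and a later `b"example@example.com\r\n"` returns the single assembled command. With several threads sharing one epoll set, something also has to keep two threads from touching the same buffer at once; registering sockets with EPOLLONESHOT is one option.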

    Does this model work better? Keep in mind that when DB calls need to be made, the MySQL library won't work the same way. A slow database server could hang all 8 worker threads, effectively killing the model. There are also SPF, SPAM and Virus libraries. Any one of them could tie up a thread for an extended period, thereby killing this model. What does everyone else think? Am I not thinking about an event processing model correctly? Or is it that this type of daemon is better served by the one-thread-per-connection model?
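
One common answer to the blocking-library problem is a hybrid: keep the few event threads for socket I/O and push anything that can block onto a separate, larger pool. The sketch below is an assumption-laden illustration, not a recommendation from the thread; `slow_lookup` is a hypothetical stand-in for a MySQL query or virus scan, and the pool size of 32 is arbitrary.

```python
import concurrent.futures
import time


def slow_lookup(recipient: str) -> bool:
    """Hypothetical stand-in for a blocking library call (DB query, SPF check, scan)."""
    time.sleep(0.2)
    return recipient.endswith("@example.com")


# Separate pool for blocking work, so the event threads never wait on it.
blocking_pool = concurrent.futures.ThreadPoolExecutor(max_workers=32)


def check_recipient(recipient: str, on_done) -> concurrent.futures.Future:
    """Called from an event thread: submits the lookup and returns immediately.

    on_done(result) is invoked once the lookup finishes, normally on the
    pool thread that ran it.
    """
    future = blocking_pool.submit(slow_lookup, recipient)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future
```

The callback would typically queue the reply for the connection and re-arm its socket in the epoll set. A slow database then ties up pool threads rather than the 8 event threads, at the cost of bringing back some of the per-thread memory the event model was meant to save.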

  4. Whatever fits the reqs by ziggyboy · · Score: 3, Interesting

    I just happen to be in the middle of the design process of a server for a large telco in Australia. We have decided to use both select() and threads in handling client connections. Clients of the same class/type will each be handled by a thread. Each approach has its uses, pros and cons, but if you intend to spawn a new thread for each client, that's not a very good idea. Pre-created threads would be OK, though. Better than pre-forked processes.

    My only complaint with POSIX threads is they do not have a "generic" join function that grabs *any* threads that have exited.
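
The usual workaround for the missing "join any" is to have each thread announce its own exit on a shared queue, so the reaper waits on the queue and then joins whichever thread shows up. The sketch below shows the idea in Python rather than with pthreads; `ThreadGroup`, `spawn` and `join_any` are hypothetical names.

```python
import queue
import threading


class ThreadGroup:
    """Tracks spawned threads so a caller can wait for whichever exits first."""

    def __init__(self) -> None:
        self._exited = queue.Queue()

    def spawn(self, target, *args) -> threading.Thread:
        def run() -> None:
            try:
                target(*args)
            finally:
                # Announce the exit, even if target raised an exception.
                self._exited.put(threading.current_thread())

        thread = threading.Thread(target=run)
        thread.start()
        return thread

    def join_any(self, timeout=None) -> threading.Thread:
        """Block until some thread has announced its exit, then reap and return it."""
        thread = self._exited.get(timeout=timeout)  # raises queue.Empty on timeout
        thread.join()
        return thread
```

The same pattern carries over to C: the thread function pushes its own ID onto a mutex-protected list and signals a condition variable just before returning, and the reaper calls pthread_join() on the IDs it pulls off.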

  5. Re:outdated info by Phillup · · Score: 2, Interesting
    Also... the opening statement and its bias toward Unix "for obvious reasons" doesn't add to its credibility.

    Eh?

    Here is what I see:
    The discussion centers around Unix-like operating systems, as that's my personal area of interest, but Windows is also covered a bit.


    Perhaps the bias, and lack of credibility, is elsewhere...
    --

    --Phillip

    Can you say BIRTH TAX