Slashdot Mirror


Running 100,000 Parallel Threads

An anonymous reader writes "This story explains how the latest Linux development kernel is now able to start and stop over 100,000 threads in parallel in only 2 seconds (about 14 minutes 58 seconds faster than with earlier Linux kernels)! Much of this impressive work is thanks to Ingo Molnar, author of the O(1) scheduler recently merged with the 2.5 Linux development kernel."

11 of 387 comments

  1. Re:Posix thread... by Wolfier · · Score: 5, Informative

    Your answer:

    http://www.cs.wustl.edu/~schmidt/ACE.html

    This is so far the best library I have used for pthread programming. Powerful, easy to use, and encapsulates message passing really well...

  2. Re:Not 100,000 threads in parallel, just 50. by vvikram · · Score: 5, Informative


    Yeah right. And modded to "Informative"? Slashdot moderators are the _pits_.

    Read Ingo's reply to Linus. They _did_ start
    one test serially and another in _parallel_. In short, he says that it's possible.

    vv

  3. Re:Not 100,000 threads in parallel, just 50. by kinnunen · · Score: 5, Informative
    Read Ingo's posts too:
    actually, that was Ulrich's other test, which tests the serial starting of 100,000 threads. the test i did started up 100,000 concurrent threads which shot up the load-average to a couple of thousands. [the default timeslice the parent has is enough to start more than 50,000 parallel threads a pop or so.]
    And another one:
    Anton tested 1 million concurrent threads on one of his bigger PowerPC boxes, which started up in around 30 seconds. I think he saw a load average of around 200 thousand. [ie. the runqueue was probably a few hundred thousand entries long at times.]
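
    The difference between the two tests Ingo describes (serial start versus truly concurrent threads) can be sketched roughly as follows. This is only an illustration: Python's threading module stands in for native kernel threads, the thread count is tiny compared with the 100,000 discussed, and the name start_threads is hypothetical, not taken from either benchmark.

```python
import threading
import time


def start_threads(n, concurrent):
    """Start n threads and return (elapsed_seconds, peak_live_threads).

    concurrent=False: each thread is joined before the next one starts.
    concurrent=True:  every thread stays alive until all n have started.
    """
    lock = threading.Lock()
    # Assumption: a barrier is used only to force all threads to coexist.
    barrier = threading.Barrier(n) if concurrent else None
    live = 0
    peak = 0

    def worker():
        nonlocal live, peak
        with lock:
            live += 1
            peak = max(peak, live)
        if barrier is not None:
            barrier.wait()  # block until all n threads have been counted
        with lock:
            live -= 1

    begin = time.perf_counter()
    pending = []
    for _ in range(n):
        t = threading.Thread(target=worker)
        t.start()
        if concurrent:
            pending.append(t)
        else:
            t.join()  # serial test: finish this thread before the next
    for t in pending:
        t.join()
    return time.perf_counter() - begin, peak


if __name__ == "__main__":
    N = 200  # assumption: small illustrative count
    for mode in (False, True):
        elapsed, peak = start_threads(N, concurrent=mode)
        label = "concurrent" if mode else "serial"
        print(f"{label}: {elapsed:.4f}s, peak live threads = {peak}")
```

    In the serial case at most one worker exists at any moment, so the scheduler never sees a long runqueue; in the concurrent case all of them exist at once, which is the situation that drives the load average up.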
  4. Re:I'm only a humble C programmer, but.... by bm_luethke · · Score: 5, Informative

    Probably none. On the other hand, in the field I work in (high performance computing) this will be a great help. Currently we are running a 500,000 processor simulation on a four node cluster, and both startup and running are a pain. Remember, one of the great things about Linux is some of the neat/useful applications being run on it (human genome, nuclear simulations, fluid simulations). Windows is a toy and geared toward "normal" users (read: very few threads, not processor intensive). Linux is more of a workhorse (many threads, computationally expensive, and high uptimes). While there are exceptions to this, look at advances such as this in that light. And finally, just because you won't use it compiling a kernel doesn't mean it's not needed.

    --
    ------- Sorry about the spelling, I suffer from two problems. Dyslexia makes it difficult to spell well, lazy makes it
  5. Re:Not 100,000 threads in parallel, just 50. by the_quark · · Score: 5, Informative

    No, seriously. Process creation under Linux was time-similar to thread creation on other OSs. That's because Linux was as fast at creating *a process* as other OSs are at creating *a thread*. IIRC, threading was initially implemented in Linux from the process-creation methods, so it was similar in speed (the main advantage in Linux from threads was the shared memory space if your application wanted that sort of thing). That's why Apache 2.0 is bringing NT performance more in line with Apache 1.3 performance on Linux: NT's threading speed is a lot closer to Linux's forking speed. Again, I'd like to underscore I'm not an expert on this, and it's possible I'm mistaken about relative benchmarks (is NT w/Apache 2.0 a little faster than Linux w/Apache 1.3? Could be...) but I'm very confident of the basic underlying point, that Linux process creation is essentially comparable to other OSs' thread creation, perhaps even faster.

    See, for example, http://www.linux.cu/pipermail/linux-prog/2001-February/000027.html, just one of the first Google links that popped up when I went looking for proof that I'm not on crack: "Linux newcomers often are unaware of the substantial differences between Linux and other operating systems. To implement concurrency, they use multithreading exclusively, mistakenly assuming as high an overhead associated with Linux multiprocessing as on other platforms." In fact, knowing how fast Linux's process creation is relative to other systems' thread creation makes this even more impressive in my mind. This isn't just a bug fix; much like with process creation before it, Linux is doing something fundamentally better than its counterparts.

    Don't forget: Just because this is /. doesn't mean I'm just a Windows-hating troll. I try to make sure all my Windows-hating-troll-posts are at least backed up by facts. ;)

  6. Re:Alternative headline by Dahan · · Score: 4, Informative
    Gigantic performance problem in Linux code fixed after several years of "many eyes" scanning over it.

    Uh, why did that get moderated as a troll? Oh, right, Linux is absolutely perfect, and anyone who says otherwise must be a troll.

    Come on, Linux's scheduler has long been known to have performance problems once you have a lot of processes/threads... for example, read this paper [text version] (appropriately subtitled "How I Learned to Love the Alpha and Hate the Scheduler"):

    0.8.1 Create a fixed priority scheduler.
    Currently, the Linux scheduler is very different than the traditional Unix schedulers. Although the Linux scheduler is very efficient when only several processes are running, it is not scalable. In order to match the performance of *BSD and other Unices, another scheduling algorithm must be used.
    Moderators, don't be Slashbots, moderating according to the groupthink. Educate yourselves, and you'll be better moderators, and better people.
  7. Re:Windows comparison by Courageous · · Score: 4, Informative

    Every thread uses a minimum of *1 PAGE* of reserve memory for its stack, which is 64K. However, you have to go out of your way to use less than 1 megabyte of reserve memory. Since only 2GB of reserve memory (addressable memory) is available to user applications, this would fit your 2000 thread figure like a glove.

    C//
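
    The arithmetic behind that figure can be checked with a short sketch. The sizes are the ones quoted in the comment above (2GB of user address space, a 1 megabyte default stack reservation, a 64K minimum); the function name is hypothetical, and the result is only an upper bound, since it assumes nothing else occupies the address space.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

USER_ADDRESS_SPACE = 2 * GIB  # figure quoted in the comment
DEFAULT_STACK_RESERVE = 1 * MIB  # figure quoted in the comment
MINIMUM_STACK_RESERVE = 64 * KIB  # figure quoted in the comment


def max_threads(address_space, stack_reserve):
    """Upper bound on threads if stacks were the only thing reserved."""
    return address_space // stack_reserve


print(max_threads(USER_ADDRESS_SPACE, DEFAULT_STACK_RESERVE))  # 2048
print(max_threads(USER_ADDRESS_SPACE, MINIMUM_STACK_RESERVE))  # 32768
```

    With the default reservation the ceiling is 2048 threads, which lines up with the roughly 2000 threads mentioned; code, heap, and libraries take their share of the address space too, so the practical limit is somewhat lower.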

  8. Re:Not 100,000 threads in parallel, just 50. by brianpane · · Score: 4, Informative
    Apache 2.0 doesn't actually do thread creation very frequently. The thread creation cost occurs mostly at startup. So the limiting factors for threaded Apache performance on Linux are mainly:
    • The speed with which the kernel can schedule and context-switch among threads
      For some recent data on this, see http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=103228014211983. The O(1) scheduler patch for 2.4 seems to help here.
    • Memory usage per thread
    • Concurrency limitations of the Apache code itself
      This has been improving gradually with successive 2.0 releases, as the remaining global locks are removed or optimized.
    • General robustness of the thread implementation
      The current (2.4) Linux threading implementation doesn't work well with debuggers.
    At first glance, it looks like the NPTL could be a win for threaded Apache on Linux, as it offers some solutions for the first and last of these issues.
  9. Re:nice, but... by Magnus+Reftel · · Score: 4, Informative

    According to a mail from Ingo Molnar halfway down the linked article, M:N threading doesn't really solve the real problem - it's good at switching back and forth between running threads, but the real reason for having very large numbers of threads (be they kernel or user space threads) to begin with is to do IO, and for that, there is no real advantage to user space threads.

    More info on the 1:1 vs M:N issue can be read in the white paper

    --
    print "Yet another p{erl,ython} hacker\n",
  10. Re:Not 100,000 threads in parallel, just 50. by Karellen · · Score: 5, Informative

    It's not process/thread _creation_ times that make the difference, it's the process/thread _context_switch_ times that really mount up, which is where Linux shines.

    And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.

    K.

    --
    Why doesn't the gene pool have a life guard?
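
    One common way to get a feel for switch cost is a ping-pong test: two threads bounce a single byte back and forth over pipes, so each round trip forces the two to hand control to each other. The sketch below is an assumption-laden illustration, not a benchmark from the thread: the name ping_pong is hypothetical, the numbers include pipe and interpreter overhead, and on a multi-core machine the two threads may sit on different CPUs, in which case it measures wake-up latency rather than a pure context switch.

```python
import os
import threading
import time


def ping_pong(round_trips):
    """Bounce one byte between two threads; return seconds per round trip."""
    to_echo_r, to_echo_w = os.pipe()
    to_main_r, to_main_w = os.pipe()

    def echo():
        # Blocks in read() until the main thread writes, then answers.
        for _ in range(round_trips):
            byte = os.read(to_echo_r, 1)
            os.write(to_main_w, byte)

    partner = threading.Thread(target=echo)
    partner.start()

    begin = time.perf_counter()
    for _ in range(round_trips):
        os.write(to_echo_w, b"x")
        os.read(to_main_r, 1)  # blocks until the echo thread replies
    elapsed = time.perf_counter() - begin

    partner.join()
    for fd in (to_echo_r, to_echo_w, to_main_r, to_main_w):
        os.close(fd)
    return elapsed / round_trips


if __name__ == "__main__":
    per_trip = ping_pong(10_000)
    print(f"{per_trip * 1e6:.1f} microseconds per round trip")
```

    Running the same test with two processes instead of two threads is the usual way to compare process and thread switch cost on one system.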
  11. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 5, Informative

    KDE actively discourages threads. Perhaps that will change now. Likewise servers, such as apache, will speed up.

    I'm not so sure about that.

    A threaded model doesn't necessarily offer advantages -- Apache's multiprocess model is really just as good on platforms without serious performance penalties on fork(), and Boa (which neither forks nor threads) is much, much faster than either Apache mode (though of course on SMP systems multiple instances must be run to use all the available CPUs).

    Indeed, unless SMP is being taken advantage of, a well-written single-threaded application will always be faster than an equivalent multithreaded application. Such an application has less overhead and is able to jump between its "subprocesses" only when needed -- and without the latencies incurred by letting the OS handle said scheduling. Back in the Real World, I still write threaded code -- but because writing unthreaded code (in the problem spaces where threads are useful) is harder, not because it's faster.
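
    As a rough illustration of the single-threaded style described above, the sketch below serves several connections from one thread by waiting for readiness events and touching a connection only when it has data. It is not how Boa is implemented; it uses Python's selectors module with local socket pairs standing in for real clients, and the names serve_all and demo are hypothetical.

```python
import selectors
import socket


def serve_all(pairs):
    """Answer one message per connection from a single thread."""
    sel = selectors.DefaultSelector()
    for server_side, _client_side in pairs:
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)

    remaining = len(pairs)
    while remaining:
        # Sleep until at least one connection is readable, then handle
        # only those connections; no per-connection thread is needed.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            remaining -= 1
    sel.close()


def demo(n=5):
    """Send n small requests over local socket pairs; return the replies."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (_server_side, client_side) in enumerate(pairs):
        client_side.sendall(f"request {i}".encode())
    serve_all(pairs)
    replies = [client.recv(1024).decode() for _server, client in pairs]
    for server_side, client_side in pairs:
        server_side.close()
        client_side.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

    The bookkeeping that a threaded program leaves to the OS (which connection to work on next) is done by hand here, which is the "harder to write" part of the trade-off.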