Running 100,000 Parallel Threads

Posted by michael on Saturday September 21, 2002 @05:24PM from the when-one-isn't-enough dept.

An anonymous reader writes "This story explains how the latest Linux development kernel is now able to start and stop over 100,000 threads in parallel in only 2 seconds (about 14 minutes 58 seconds faster than with earlier Linux kernels)! Much of this impressive work is thanks to Ingo Molnar, author of the O(1) scheduler recently merged with the 2.5 Linux development kernel."

18 of 387 comments

I'm only a humble C programmer, but.... by cdrobbins · 2002-09-21 17:34 · Score: 4, Interesting

And this is great news, and, indeed, impressive. But my question is, what (if any) change is this going to make to my daily use of Linux (for gcc, reading Slashdot, and that's about it...)? Am I going to notice any performance differences?
Parallelism by inkfox · 2002-09-21 17:36 · Score: 5, Interesting

This is very cool; but does it scale to multiple CPU systems? More and more, SMP, split-bus and multi-core architectures are going to be taking over. If this holds up in those environments, Linux may actually have a leg up on some of the dedicated task heavyweights.

--
Says the RIAA: When you EQ, you're stealing bass!
1. Re:Parallelism by Anonymous Coward · 2002-09-21 18:53 · Score: 2, Interesting
  
  I believe it said in the article/discussion that they were using a dual p4 for testing. That would imply that scaling isn't a problem
  
  Many algorithms work great for one extra processor but fail miserably with more.
  In most cases, you can just busy wait on a semaphore with two CPUs and never notice the hit. With 8, 32 or 512 CPUs, you're going to throw away most of your processing time.
Windows by jeffbru · 2002-09-21 18:08 · Score: 3, Interesting

Just out of curiosity, how does the benchmark in Windows compare?

--
- Jeff Brubaker
Re:Windows comparison by pVoid · 2002-09-21 18:30 · Score: 3, Interesting

Interestingly enough, either Windows has a quota, or there's some sort of memory leak or something...

The max I can create in a process is 2031 threads, and that takes 700ms.

It's odd, 'cause I can create more if I run several processes. It doesn't look like the kernel is choking on thread creation...

Will investigate more.
This isn't for everyone, though by Anonymous Coward · 2002-09-21 18:42 · Score: 1, Interesting

There was a patch for an O(1) scheduler a while back. What this means is that it takes the same amount of time to select what runs next, no matter how much is running. But you won't notice an improvement unless you have about 200 processes running at the same time. This may be good for servers and the like, but it's a lot slower if you have few processes running. Keep this in mind...
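
For anyone wondering what "same amount of time to select what runs next" can look like, here is a rough Python sketch. It is not the kernel's code: the class name, the method names and the 140-level figure are assumptions made up for illustration. The idea shown is one FIFO per priority level plus a bitmap of non-empty levels, so picking the next task means finding the lowest set bit instead of walking every runnable task.

```python
from collections import deque

NUM_PRIORITIES = 140  # assumed for illustration; lower index = higher priority


class RunQueue:
    """Toy run queue (hypothetical): one FIFO per priority plus a bitmap."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set while queues[p] holds at least one task

    def enqueue(self, task, priority):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return the next task to run, or None if nothing is runnable."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit: the highest-priority non-empty level.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        task = queue.popleft()
        if not queue:
            # Level drained, so clear its bit.
            self.bitmap &= ~(1 << priority)
        return task


rq = RunQueue()
rq.enqueue("editor", 120)
rq.enqueue("audio", 10)
rq.enqueue("compiler", 120)
print(rq.pick_next())  # audio
```

The work in pick_next() is bounded by the fixed number of priority levels, not by how many tasks are queued, which is the property the O(1) name refers to.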
nice, but... by g4dget · 2002-09-21 19:00 · Score: 4, Interesting

It's nice that the Linux kernel can handle that many threads. But user level threads generally are even more lightweight, and high performance implementations like those on Solaris provide both user level and kernel level threads and map the former onto the latter. Is Linux going to get something similar? Is Sun perhaps donating their implementation? Or are these new kernel threads so lightweight and quick that they are competitive with Solaris on their own, without the mess and complication of adding user level threads?
How will this affect Mozilla, OpenOffice... by 3770 · 2002-09-21 19:01 · Score: 4, Interesting

How will this change affect Mozilla, the Sun JVM and OpenOffice, for instance?

While it is probably true that most applications will take some time to start using the new threading model, some larger applications could support it fairly soon.

Can we expect these applications to be adapted to the new threading model some time soon, and how will it affect performance?

--
The Internet is full. Go Away!!!
Hooray for fixing the dynamic linking problem! by Foresto · 2002-09-21 19:57 · Score: 2, Interesting

It looks like speed isn't the only improvement they've made with this library. From the notes:
"- libpthread should now be much more resistant to linking problems: even if the application doesn't list libpthread as a direct dependency functions which are extended by libpthread should work correctly."
This ought to be a big help for those of us who write plug-in modules for servers like Apache 1.x and PHP. The existing thread library doesn't work properly unless the program executable explicitly links to it, which means that my shared libraries can't take advantage of standard thread management such as pthread_atfork().
Does this help Apache 2.x by mustprotectdata · 2002-09-21 20:16 · Score: 2, Interesting

Given that Apache 2.x can utilise threads as well as processes, does this mean that you can configure a large web server with, say "MaxSpareThreads 1000000" so that you can cope when you're slashdotted ;-)?
Re:Real World Example by flux · 2002-09-21 20:46 · Score: 2, Interesting

What is the fundamental reason select/poll should be that much faster anyway? Well, you save the context-switch times if you can handle many clients in a tick. But on the other hand, it does affect the way you need to design the code, and doing some stuff that never stalls without threads might be tricky.

Just imagine a situation where a thread might need to calculate something, or initialize a big array. Now, if it's run under a select loop, you need to do that in parts to avoid starving the server. With threads, you just do the trick and don't care about the rest of the world, which keeps serving the clients no matter how long you stay in the function.
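
To make the "do that in parts" point concrete, here is a small Python sketch. Everything in it is hypothetical: there are no real sockets and no actual select() call, just a single-threaded loop that alternates between one slice of a long job and one waiting client. The function names and the squared-index fill are made up for the example.

```python
def init_big_array(size, chunk):
    """Fill an array in slices, yielding between slices (hypothetical job)."""
    data = [0] * size
    for start in range(0, size, chunk):
        for i in range(start, min(start + chunk, size)):
            data[i] = i * i
        yield  # hand control back to the loop so clients are not starved
    return data


def event_loop(pending_clients, job):
    """Single-threaded loop: one slice of the long job, then one client, per tick."""
    log = []
    result = None
    job_done = False
    while pending_clients or not job_done:
        if not job_done:
            try:
                next(job)
                log.append("slice")
            except StopIteration as stop:
                # The generator's return value carries the finished array.
                result = stop.value
                job_done = True
        if pending_clients:
            log.append("served " + pending_clients.pop(0))
    return log, result


log, result = event_loop(["a", "b"], init_big_array(6, 2))
print(log)     # ['slice', 'served a', 'slice', 'served b', 'slice']
print(result)  # [0, 1, 4, 9, 16, 25]
```

With one thread per client, init_big_array() could be an ordinary function that runs to completion, and the slicing would go away. That is the design cost being described here.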
the linux c10k problem - solved? And Java? by Anonymous Coward · 2002-09-21 22:47 · Score: 1, Interesting

Does this solve the C10K problem, now that I can start a thread for every socket? See the C10K problem.
And does this mean that Java will start to really scale on Linux?
Re:Posix thread... by Anonymous Coward · 2002-09-21 22:54 · Score: 1, Interesting

>Are these completely independent, and competing, projects? Can these two groups work together and complement each other?

It is exactly this library that Red Hat is hoping to stop. IBM's is a good library, but it is heavy. It will also complicate threads by moving part of the scheduling into user space. That is not necessarily a bad thing, just bigger and more complicated. Red Hat's should be thinner than the current pthreads lib. One thing that I have hated about pthreads is its use of signals for some control. Now, if I read this correctly, Red Hat has moved all the control into the kernel, which means the kernel handles it all.
Re:POSIX compliance ahead? by inode_buddha · 2002-09-22 01:30 · Score: 2, Interesting

Nobody ever said that Linux-specific behavior is POSIX-compliant. Last I heard, POSIX is not about the specifics of any given UNIX-compatible or class of system. Rather, it attempts to be the abstraction and distillation of that class of systems, as codified by The Open Group. Please correct me if I am wrong in this idea. Linux simply "aims to be..." POSIX-compliant, as promulgated by the LSB, the FHS, et al.

That all said, I totally agree with you -- especially regarding cancellation points, fork(), and documentation.

Please bear in mind that much of this behavior will be inherited from whatever libc it is compiled against. IMO, this simply shows the power of C, nothing else.

The above scenario simply points out the differences between OpenGroup/POSIX and GNU/FSF... if things like that "bug" you (no pun intended, seriously), then perhaps you should recompile with whatever "-- posixly-correct" options you have available.

And yes, I have a copy of the SUSv3 spec right here, in fact.

--
C|N>K
Multithreaded core files on Linux? by truth_revealed · 2002-09-22 05:11 · Score: 2, Interesting

I can't seem to find any info on whether Linux still produces one core file per thread or just one core file per process (as does Solaris). Has `gdb' been enhanced to handle multithreaded programs (or multithreaded core files) on Linux? If I have a thousand threads, I sure don't want 1000 core files in the event of a crash. Is there a way around this?
This may not even make it INTO 2.5.x... by Wolfrider · 2002-09-22 10:12 · Score: 2, Interesting

See here ( http://lwn.net/Articles/9632/ )
and here ( http://lwn.net/Articles/10248/ )

--Linus is being pigheaded about this patch, wanting to "keep the code simple" instead of implementing Ingo's **fast** + Fixed solution.

To quote LWN:
[ So it's fast - though a few extra features have been requested. But this patch has stirred up a bit of a debate. Rather than put in a complicated new PID allocator, it is asked, why not just make the maximum PID be very large? Then, in theory, the quadratic part of get_pid() will never run so the performance problems go away, and the code stays simpler. Linus prefers this approach, as do a number of other developers; he has put a simple patch along these lines into his pre-2.5.37 BitKeeper tree.

Ingo disagrees, pointing out that any reasonable maximum PID size can be exceeded eventually. He would rather fix the problem than try to hide it behind a large process ID space. In the absence of real-world examples that show people being bitten by get_pid()'s behavior in a larger PID space, though, Linus appears unlikely to accept any more complicated fix.
]
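
To show why a crowded PID space can go quadratic while a sparse one does not, here is a toy Python sketch. It is not the kernel's get_pid(): the function name, the arguments and the comparison counter are all made up for illustration, and the only assumption is a naive allocator that checks each candidate PID against every existing task.

```python
def allocate_pid(in_use, last_pid, pid_max):
    """Return (pid, comparisons) from a naive scan (hypothetical allocator).

    Each candidate PID is checked against every PID in in_use, so rejecting
    many candidates in a row costs roughly tasks x candidates comparisons.
    """
    comparisons = 0
    candidate = last_pid
    for _ in range(pid_max - 1):
        # Advance to the next PID, wrapping back to 1 at the top of the range.
        candidate = candidate + 1 if candidate + 1 < pid_max else 1
        taken = False
        for pid in in_use:
            comparisons += 1
            if pid == candidate:
                taken = True
                break
        if not taken:
            return candidate, comparisons
    raise RuntimeError("PID space exhausted")


# Crowded, small PID space: several candidates are rejected before one fits.
print(allocate_pid([1, 2, 3, 4], 0, 8))      # (5, 14)
# Large PID space where the next candidate is free: a single pass suffices.
print(allocate_pid([1, 2, 3, 4], 4, 32768))  # (5, 4)
```

The second call is roughly the "make the maximum PID very large" argument: the expensive path is still there, it just rarely runs. The first call is the case Ingo is worried about.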

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
Linus didn't think much of O1 scheduler by jelle · 2002-09-22 10:22 · Score: 3, Interesting

I remember that Linus made a remark that he thought that the O(1) scheduler wouldn't impact Linux much at all, and that its development would not be a biggie for Linux, downplaying the importance of what it can achieve. Go Ingo for keeping at it!

--
--- Hindsight is 20/20, but walking backwards is not the answer.
Re:Not 100,000 threads in parallel, just 50. by himi · 2002-09-22 14:14 · Score: 4, Interesting

The latency issues that cause mp3 skipping under heavy load in Linux have nothing at all to do with context switching, and everything to do with /scheduling/ latency: how long it takes for a process that has work to do to actually get control of the cpu. Context switching has /nothing/ to do with that.

The low latency patches go through the kernel breaking up areas where spinlocks are held for long periods of time. That's what causes massive scheduling latency in the kernel.

Context switching under Linux /is/ extremely fast - it's actually been measured (a lot), and it's something the kernel developers pay a lot of attention to and optimise very carefully. They literally count cpu cycles in these code paths. Context switching time is a serious performance limiter in many areas, so getting it right is important, and it's something that Linux does /very/ well.

Go do some real research before you accuse someone who's right of karma whoring bullshit.

himi

--

My very own DeCSS mirror.