Slashdot Mirror


Running 100,000 Parallel Threads

An anonymous reader writes "This story explains how the latest Linux development kernel is now able to start and stop over 100,000 threads in parallel in only 2 seconds (about 14 minutes 58 seconds faster than with earlier Linux kernels)! Much of this impressive work is thanks to Ingo Molnar, author of the O(1) scheduler recently merged with the 2.5 Linux development kernel."

387 comments

  1. Posix thread... by alexandre · · Score: 1

    I frequently hear people bitching about the pthread lib and how f*cked up it is... is this going to change the way we use threads too? :-)

    1. Re:Posix thread... by Wolfier · · Score: 5, Informative

      Your answer:

      http://www.cs.wustl.edu/~schmidt/ACE.html

      This is so far the best library I have used for pthread programming. Powerful, easy to use, and encapsulates message passing really well...

    2. Re:Posix thread... by Anonymous Coward · · Score: 0

      I'm very curious about what, if any, relationship this project has with the Next Generation POSIX Threads project.

      Are these completely independent, and competing, projects? Can these two groups work together and complement each other?

      --
      Topher

    3. Re:Posix thread... by Anonymous Coward · · Score: 0

      ZThreads is another great thread library for portable MT code

    4. Re:Posix thread... by Anonymous Coward · · Score: 1, Interesting

      >Are these completely independent, and competing, projects? Can these two groups work together and complement each other?

      It is exactly this library that Redhat is hoping to stop. IBM's is a good library, but it is heavy. It will also complicate threads by moving part of the scheduling into user space. That is not necessarily a bad thing, just bigger and more complicated. Redhat's should be thinner than the current pthreads lib. One thing that I have hated about pthreads is its use of signals for some control. Now, if I read this correctly, Redhat has moved all the control into the kernel, which means the kernel handles it all.

    5. Re:Posix thread... by smittyoneeach · · Score: 2

      So maybe there is a heavyweight library for some applications, and a lighter weight one for common use.
      Probably you do the light one, and include it in the heavy when required.
      Ah, the one-size-fits-all thought process...

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    6. Re:Posix thread... by DrunkenPenguin · · Score: 2, Funny

      ..actually.

      Your answer:

      http://www.linux.ncsu.edu/lug/lectures/rpm-pres/mgp00033.html

      This is so true to all of us ;)

  2. Hold this thread while I walk away by DoctorHibbert · · Score: 3, Funny

    The linux song

    --
    Arbitrary sig
  3. How long before by Anonymous Coward · · Score: 0

    M$ is suddenly also capable of doing the same...

    1. Re:How long before by Anonymous Coward · · Score: 0

      What, you mean the same way they discovered they mysteriously had the exact same faults in IE that were found in the GPL/MPL'ed Mozilla?

    2. Re:How long before by Anonymous Coward · · Score: 0

      That would almost make sense were it not for the fact that IE had certificate chain of authority functionality for years before Mozilla was a wet dream in Jamie Zawinski's eyes you dumb fuck.

    3. Re:How long before by _Knots · · Score: 2

      Be careful who you call a dumb fuck. Netscape had a functional browser long before IE3, arguably the first usable version of IE. And it would not surprise me if Netscape 1 predated IE 1, though I can't say I know that for sure.

      Speeding The Net is an excellent book about Netscape vs Microsoft, in case anybody cares (it's been a long while since I read it, thus why my date memory is rusty).

      --
      Anarchy$ dd if=/dev/random of=~/.signature bs=120 count=1
    4. Re:How long before by Anonymous Coward · · Score: 0

      Ha ha... you act like Netscape ripped off IE, not the other way around!! Some people are so stupid, they have only the foggiest, ZDNet idea of history...

    5. Re:How long before by Anonymous Coward · · Score: 0

      Ha ha... you act like Mozilla and Netscape are the same thing!

    6. Re:How long before by Anonymous Coward · · Score: 0

      Yea, Netscape was around long before IE. In fact,
      the early versions of IE were from a company called
      spyglass. Their code was bought by MS because they
      needed a code base to enter the browser wars. (The
      in-house created browsers as MS *sucked*).

    7. Re:How long before by mumkin · · Score: 1

      Netscape is the direct descendant of NCSA Mosaic, the Ur browser. Frankly, I don't remember what the big deal about Netscape 1.0 was, relative to Mosaic, but there was much hype. Maybe something really hardcore, like introducing background colors?

    8. Re:How long before by MyAss · · Score: 1

      Actually I think the two big features were the "stop" button and the ability to open more than one connection at a time. (So it would load the images much faster)

      --

      They misunderestimated me. -- George W. Bush
    9. Re:How long before by Zeinfeld · · Score: 2
      Netscape is the direct descendant of NCSA Mosaic, the Ur browser. Frankly, I don't remember what the big deal about Netscape 1.0 was, relative to Mosaic, but there was much hype. Maybe something really hardcore, like introducing background colors?

      Wrong in every respect.

      First Mosaic was not the 'Ur browser'. Tim's NextStep browser was. Mosaic was browser number 15 or so. The significant things about Mosaic were that 1) it actually compiled without having to hack the code yourself or mess with 6 different support packages like tkwww and 2) it was the first X-Windows browser that did not look really amateur.

      Second, Netscape does not contain any code from Mosaic, although it was written by the same main author - Eric Bina. NCSA sold the commercial rights to Mosaic to Spyglass.

      Third IE was originally based on the Spyglass code, so if any browser is 'the direct descendant' it would be IE. Go look at the 'about' box on IE, although the original Mosaic actually had more lines of CERN code than NCSA code which were never acknowledged.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
  4. 100,000 Linux threads by Anonymous Coward · · Score: 5, Funny
    1. Re:100,000 Linux threads by notanatheist · · Score: 2, Funny

      are those M$ employees looking for code?

    2. Re:100,000 Linux threads by Anonymous Coward · · Score: 0

      nah.... M$ employees would be on their hands and knees stealing from the nests......

    3. Re:100,000 Linux threads by Citizen+of+Earth · · Score: 2

      this image springs to mind

      Is that red splatter on the ground the remains of Bill Gates?

    4. Re:100,000 Linux threads by Anonymous Coward · · Score: 0

      ...or perhaps bandwidth.

    5. Re:100,000 Linux threads by Anonymous Coward · · Score: 0

      (obligatory Futurama reference)

      Oh my god! It's like Hong Kong!

    6. Re:100,000 Linux threads by Anonymous Coward · · Score: 0

      What's that reddish-brown stuff that's smeared all over the place?

  5. Re:1000,000 threads?? by Anonymous Coward · · Score: 0, Troll

    frits lysdexic boewflu pr0st!

  6. Win ME Kicks that sorry statistic!!!! by SlimFastForYou · · Score: 4, Funny

    It takes two seconds to start 100,000 threads???? Piff! With my ME computer, It doesn't matter how many parallel threads I am running... I can stop them all instantly by simply attempting to use my computer :P.

    1. Re:Win ME Kicks that sorry statistic!!!! by CoolVibe · · Score: 4, Funny
      Pff... I can start a million threads on my FreeBSD box and stop them all in an instant...

      ...by hitting the reset button.

    2. Re:Win ME Kicks that sorry statistic!!!! by Anonymous Coward · · Score: 0

      It's funny because it's true.

    3. Re:Win ME Kicks that sorry statistic!!!! by grmoc · · Score: 1

      I betcha your BSD box doesn't have enough memory/address space to start up 1 million threads... (since I'm guessing its running on x86 hardware...)...

  7. Threads by Anonymous Coward · · Score: 0

    Wow, Slashdot may have had over 100,000 threads too. But then it took more than five years.

  8. I'm only a humble C programmer, but.... by cdrobbins · · Score: 4, Interesting

    And this is great news, and, indeed, impressive. But my question is, what (if any) change is this going to make to my daily use of linux (for gcc, reading slashdot, and that's about it...) Am I going to notice any performance differences?

    1. Re:I'm only a humble C programmer, but.... by SlimFastForYou · · Score: 5, Funny

      Just wait until Spyware For Linux(TM) comes out... With Bonzai Buddy For Linux(TM), Real Center For Linux(TM), XMMS Agent(TM), Linux Messenger(TM), Linux Update(TM), and FindFast for OpenOffice.org(TM). Then you will know why 100,000 parallel threads in two seconds is a good thing :P.

    2. Re:I'm only a humble C programmer, but.... by cdrobbins · · Score: 1, Offtopic

      My above comment was moderated as a troll, and yeah, maybe it sounded like that. But it's a serious question. I'd like to know what benefits us normal users will see.

    3. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      why are you posting at zero?

    4. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      well, I will give you a point for your first comment, but take one for the other 2 offtopic ones

    5. Re:I'm only a humble C programmer, but.... by mattdm · · Score: 3, Insightful

      Java likes to run many threads very cavalierly, so it's likely to help there somewhat.

    6. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0, Offtopic

      And this is a perfect example of moderation abuse on slashdot. I swear to all things holy, give a geek just a little power and it goes straight to their head! You stupid fucking mods! The guy's post was not a troll, it was an honest question seeking an honest answer. Someone with half a brain mod the guy up. He is new here, asked a simple question and one of your power mad mods has modded him down for it and labeled him a troll. What a great start to slashdot he has received. This place is so full of abuse between those who post just to flame people, mods who go overboard and think they are special in some way or more worthy of choosing what is seen, and the zealot hordes that when someone new here posts a valid question maybe you should give them a chance. This is ridiculous...

      For cdrobbins: you are new here and thus probably not compulsive in your viewing of slashdot. I recommend leaving before you are addicted. You have now seen first hand how unfair the mod system is and suffered from the abuse of an idiot with mod points. I would not blame you for never coming back here. Maybe some day, there will be no mod system, we can only hope.

    7. Re:I'm only a humble C programmer, but.... by bm_luethke · · Score: 5, Informative

      Probably none. On the other hand, in the field I work in (high performance computing) this will be a great help. Currently we are running a 500,000 processor simulation on a four node cluster; both startup and running are a pain. Remember, one of the great things about linux is some of the neat/useful applications being run on it (human genome, nuclear simulations, fluid simulations). Windows is a toy and geared toward "normal" users (read: very few threads, not processor intensive). Linux is more of a workhorse (many threads, computationally expensive, and high uptimes). While there are exceptions to this, look at advances such as this in that light. And finally, just because you won't use it compiling a kernel doesn't mean it's not needed.

      --
      ------- Sorry about the spelling, I suffer from two problems. Dyslexia makes it difficult to spell well, lazy makes it
    8. Re:I'm only a humble C programmer, but.... by Citizen+of+Earth · · Score: 2

      But my question is, what (if any) change is this going to make to my daily use of linux... Am I going to notice any performance differences?

      My question is why does the multithreading in Mozilla suck so badly on Linux and will this help it?

    9. Re:I'm only a humble C programmer, but.... by Jester99 · · Score: 1, Offtopic

      And why wouldn't you want a mod system?

      If you want to view what everybody posts, just set your default viewing threshold to -1. Simple as that. I found that scanning at +4 typically lets me get a good sense of things if I'm short on time. If I've got more time to spend, then I view at a lower threshold, like +2. If there's an interesting looking thread, then I'll view that whole thread.

      However, I simply don't have the time to cut through all the noise to the signals on my own. Without the moderation system, I would just not be able to read comments manageably, at all. And that's just the truth.

      The mod system does do a decent job of improving the S:N ratio, on balance.

    10. Re:I'm only a humble C programmer, but.... by Subcarrier · · Score: 2

      But my question is, what (if any) change is this going to make to my daily use of linux...?

      Well, for one thing, you're now going to have to start typing a helluva lot faster. The machine is not going to slow you down. ;-)

      In truth, this is great news for those running servers but you probably won't notice much of a difference on a desktop, barring a few really thread heavy applications. UML (User Mode Linux) is one notorious example.

      --
      "I have opinions of my own, strong opinions, but I don't always agree with them." -- George H. W. Bush
    11. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0, Offtopic

      You know, if you were to come off of your +4 high horse and delve in the trenches of 0 for a while you would possibly see that you are wrong. The very post that you replied to will never make it to +4, just as your reply probably will not either. Yet here we are, having a discussion that must be somewhat relevant to both of us. I, the anonymous coward and you the logged in user. Moderation abuse is out of control and has been for a long time. The post from cdrobbins just highlighted it and I, who does browse at -1 or 0 am simply tired of seeing posts abused by mods whose only redeeming quality seems to be positive karma and the length of time they have been here. Think about it, these are the very people who karma whore to get their posts moderated, then one day become mods themselves. The system is corrupt, and broken. It only rewards group think! In short, stay around long enough and post with karma in mind and you will one day be eligible for mod status also. Yet the one true way to get it is to post the same type of groupthink drivel the hordes feed on. The karma system is nothing more than a vicious cycle, a breeding ground for mods who will continue to reinforce the groupthink status quo of slashdot. How about a system that scores posts based on the amount of replies they receive? Novel idea huh? It seems like posts relevant to the conversation at hand would garner the most replies, and more than likely accurately reflect what the hordes wish to discuss. The whole thing could be automated easily, the only problem being people who reply to their own posts to boost the score. I am sure the brain trust of /. can figure out a way around that, and no junior taco groupthink mod enforcers would be required. Ack.

    12. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Is this the bible of karma whores?

    13. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Because Mozilla is written by a bunch of windoze coders who wouldn't know how to program Unix/Linux even if Linus Torvalds told them in words of two syllables or less. Mozilla is a bloaty C++ windows program kludged to run on Linux - they REIMPLEMENTED COM, FOR GOD'S SAKE!

    14. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Imagine a beowulf cluster of bibles for karma whores.

    15. Re:I'm only a humble C programmer, but.... by bbtom · · Score: 1

      Even worse than that, the infernal Comet Cursor...

      --
      catch (HumourFailureException e) { e.user.send("You, sir, are a humourless idiot."); }
    16. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Have to disagree with you here. I think that this will encourage more threads to come about in our code. KDE actively discourages threads. Perhaps that will change now. Likewise servers, such as apache, will speed up.
      The real problem is that this may be a while in coming. How many of you are running 2.5 for production? if any, you are foolish.

    17. Re:I'm only a humble C programmer, but.... by Chops · · Score: 2

      The performance improvement won't mean much, but the POSIXization of the thread library might make a difference. Linux's thread support has up till now been pretty kludgy (signal handlers per-thread instead of per-process, wrong coredumps, etc.), and that made things like debugging threaded programs difficult; you may have run into this with gdb or whatever. Now that the Right Things have been coded in all over the map (kernel/libc/gcc/etc), we can drop the kludge and start doing it right.

    18. Re:I'm only a humble C programmer, but.... by cheekyboy · · Score: 0

      but its all software... big deal

      linux is less special, its just different.

      --
      Liberty freedom are no1, not dicks in suits.
    19. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 5, Informative

      KDE actively discourages threads. Perhaps that will change now. Likewise servers, such as apache, will speed up.

      I'm not so sure about that.

      A threaded model doesn't necessarily offer advantages -- Apache's multiprocess model is really just as good on platforms without serious performance penalties on fork(), and Boa (which neither forks nor threads) is much, much faster than either Apache mode (though of course on SMP systems multiple instances must be run to use all the available CPUs).

      Indeed, unless SMP is being taken advantage of, a well-written single-threaded application will always be faster than an equivalent multithreaded application. Such an application has less overhead and is able to jump between its "subprocesses" only when needed -- and without the latencies involved by letting the OS handle said scheduling. Back in the Real World, I still write threaded code -- but because writing unthreaded code (in the problem spaces where threads are useful) is harder, not because it's faster.
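
A minimal sketch of the single-threaded approach described above, in which the application itself decides when to jump between its "subprocesses". The names `worker` and `run_all` are hypothetical, and Python generators stand in for whatever state machines a real server would use; this is an illustration under those assumptions, not anyone's actual server design.

```python
from collections import deque

def worker(name, steps, log):
    """A hypothetical 'subprocess': does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # explicit switch point chosen by the application, not the OS

def run_all(tasks):
    """Round-robin over generator tasks in a single thread until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished; do not requeue it
        ready.append(task)      # still has work; put it at the back

log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Switching here costs one function call and no kernel involvement, which is the overhead argument being made; the price is that every task has to be written to yield voluntarily.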

    20. Re:I'm only a humble C programmer, but.... by Mark+J+Tilford · · Score: 1

      On a single CPU computer, yes. What about on an SMP computer?

      --
      -----------
      100% pure freak
    21. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 1

      On a SMP box, you'll generally get best performance if your application can run a standalone, unsynchronized, entirely separate process on each CPU.

    22. Re:I'm only a humble C programmer, but.... by joib · · Score: 2

      But then again, in the Real World (TM), different processes/threads often need to communicate with each other (for ex. scientific applications), or save memory by sharing stuff like script interpreters, db connections etc. (for eg. web servers).

    23. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 3, Insightful

      Yes, it still (like everything) depends on your application.

      That said, though, sharing (and putting locks around) your DB connections or script interpreters is an easy way to lose performance and introduce potential deadlocks (or other hard-to-track, hard-to-reproduce bugs due to bad shared state) as opposed to having each process able to operate completely independently from the others. Shared state is a Good Thing when it's genuinely needed -- but should be avoided when it's not.

      I'm not saying -- and I've never tried to say -- that threading is worthless; I just object to people who take the position that making an application multithreaded will necessarily make it faster.

    24. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      I swear to God, if you replace "Mozilla" with "KDE/Qt" in that rant, then it fits just as well. Well fuck-de-doo... howza bout dat.

    25. Re:I'm only a humble C programmer, but.... by ajs · · Score: 2

      You are correct. To state that in more specific terms, threads are a "big hammer" that can be applied to the need to manage multiple resources at once. In my experience (on general purpose hardware) you can always optimize that resource management better in a single process than the kernel can by performing lightweight context switching.

    26. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Yes, you can optimize against latency better than the kernel but at a price. The usual way of 'managing' these resources in a single thread is by polling. Polling takes an enormous amount of CPU resources. With higher speed processors, you are burning huge amounts of cycles that other processes could be using while you busy-wait.

    27. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Depends... it wasn't so long ago that the compiler technology on Windows boxes routinely gave us 2X the performance as a Linux box using the same hardware (box dual booted) on the same C codes we used (yes, on parallel codes too as I worked in high-performance/parallel computing as well). From my experience, Unix/Linux had the benefit that most of the people in the field were already familiar with it and that it does remote shells and such with ease - not necessarily because it was the highest performance. In any case, Linux had its own problems in massively large systems (for instance, we had no ends of troubles using remote shells trying to connect over 64 machines together on our cluster of 325 machines both in startup times and timeouts while waiting on all the connections to form - we had to write our own stuff to get around it and eventually could use our hardware like we wanted to).

    28. Re:I'm only a humble C programmer, but.... by clubin · · Score: 1

      "How about a system that scores posts based on the amount of replies they receive?"

      So, you want "flamebait" to receive the most karma? It's the insightful comments that receive the least replies, as they would most likely just "yeah, you're right" posts (save for the few that are further insightful, with thought inspired by the original). There is no sensible, reply-count-based system of rating.

      However, given your taste in posts, maybe you should join kuro5hin, where one might say that posts that get people talking (and thus thinking, hopefully) are the most desired.

    29. Re:I'm only a humble C programmer, but.... by Fjord · · Score: 2

      I'd have agreed with you yesterday, but these improvements could change that. Something to consider. Kind of like the time (back in 98, I think) I realized my application ran faster in floating point than in fixed point, changes to the infrastructure can change the way you approach problems.

      --
      -no broken link
    30. Re:I'm only a humble C programmer, but.... by diablovision · · Score: 2, Insightful

      "Indeed, unless SMP is being taken advantage of, a well-written single-threaded application will always be faster than an equivalent multithreaded application."

      Two words: Blocking IO. You are correct that multithreading imposes an inevitable overhead on CPU intensive tasks running on a single processor machine, but most applications are not processor bound. The fact is that almost all applications that do anything besides scientific work have large portions of their execution times used by blocking on IO. Multiple threads allow the time spent waiting on IO in one thread to be spent doing something else "useful" in another thread--provided your OS supports native threads (if not, one thread can block an entire process).

      "...but because writing unthreaded code (in the problem spaces where threads are useful) is harder, not because it's faster."

      Isn't this almost a tautology? Restating: "In the areas where threads make things easier, it is easier to use threads than to not use threads."

      --
      120 characters isn't enough to explain it.
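
The blocking-IO point can be sketched roughly as follows. `time.sleep` is used as a stand-in for a blocking read (an assumption of this demo, not real IO), and the function names are hypothetical. With native threads, the waits overlap instead of adding up.

```python
import threading
import time

def fake_blocking_io(delay):
    """Stand-in for a blocking read: the calling thread just waits."""
    time.sleep(delay)

def run_serial(count, delay):
    """Perform the waits one after another in a single thread."""
    start = time.perf_counter()
    for _ in range(count):
        fake_blocking_io(delay)
    return time.perf_counter() - start

def run_threaded(count, delay):
    """Perform the same waits in parallel, one thread per request."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_blocking_io, args=(delay,))
               for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serial = run_serial(4, 0.1)
threaded = run_threaded(4, 0.1)
print(f"serial: {serial:.2f}s, threaded: {threaded:.2f}s")
```

This only shows that blocked time overlaps; it says nothing about CPU-bound work, where the single-threaded version would not lose.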
    31. Re:I'm only a humble C programmer, but.... by pclminion · · Score: 2
      KDE actively discourages threads. Perhaps that will change now.

      I'm only guessing, but the reason KDE discourages threads is probably because it's a real bitch to write a truly thread-safe library, and they don't want to fuck with it.

      In other words, it probably isn't because of performance.

      If there are any core KDE developers reading, please correct me if I'm wrong.

    32. Re:I'm only a humble C programmer, but.... by Jay+L · · Score: 2

      Two words: Blocking IO.

      Right. If you are going to use a single-threaded process, you must use non-blocking I/O.

    33. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      Since you appear to be confused, poll (2) doesn't spin, it just blocks and lets the processor do other things.
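
A small sketch of the distinction being drawn: waiting in a readiness call blocks in the kernel rather than spinning. It uses Python's `selectors` module (which wraps poll/epoll/select depending on platform) instead of calling poll(2) directly, and the timer thread acting as a slow peer is an assumption of the demo.

```python
import selectors
import socket
import threading
import time

def wait_for_byte(delay=0.2):
    """Block in the selector until a byte arrives; report wall and CPU time."""
    reader, writer = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(reader, selectors.EVENT_READ)
    # A timer thread plays the role of a slow peer (hypothetical setup).
    timer = threading.Timer(delay, writer.send, args=(b"x",))
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    timer.start()
    events = sel.select(timeout=5)   # sleeps in the kernel until readable
    data = reader.recv(1) if events else b""
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    timer.join()
    sel.close()
    reader.close()
    writer.close()
    return data, wall, cpu

data, wall, cpu = wait_for_byte()
print(f"got {data!r} after {wall:.2f}s wall, {cpu:.3f}s CPU")
```

If the wait were a busy loop, the CPU time would track the wall time; with a blocking wait it should stay far below it.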

    34. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 2

      Two words: Blocking IO.

      Three words: Non-blocking IO. The APIs exist (such as the POSIX AIO standard) -- they're just rarely used.

      Isn't this almost a tautology? Restating: "In the areas where threads make things easier, it is easier to use threads than to not use threads."

      "Useful" and "easier" aren't quite the same.

      There are places where threads are useful -- because the application demands it -- and places where threads are easier -- because the programmer doesn't have the time, will, knowledge or need to define a nonthreaded solution even if such a solution would be more efficient.
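
As a rough illustration of non-blocking IO in general: the sketch below uses a non-blocking socket (O_NONBLOCK style), which is not the POSIX AIO interface mentioned above, since Python's standard library does not expose that API. The helper name `try_recv` is hypothetical.

```python
import select
import socket

def try_recv(sock, size=1024):
    """Return available bytes, or None if the read would have blocked."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None  # nothing ready yet; the caller can go do other work

left, right = socket.socketpair()
left.setblocking(False)

first = try_recv(left)             # nothing has been sent yet
right.sendall(b"hello")
select.select([left], [], [], 5)   # wait until the data is actually readable
second = try_recv(left)
print(first, second)

left.close()
right.close()
```

The single-threaded program never stalls inside `recv`; it is told "not yet" and keeps control, which is what makes the approach workable without threads.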

    35. Re:I'm only a humble C programmer, but.... by ajs · · Score: 2

      changes to the infrastructure can change the way you approach problems.

      Yes, certainly. However, thread management has more headaches than can be hidden by the kernel and libraries easily. At least some of that overhead evidences in your userspace program. In many cases when people use threads, they're really just not thinking about their application. That's fine if performance is not a strong concern, but when it is (as evidenced by recent work in the Web server arena), threads should just be a tool in your box, along with many other techniques.

    36. Re:I'm only a humble C programmer, but.... by JohnsonJohnson · · Score: 1

      There is one significant caveat to your assertion that a single threaded application will always outperform a multithreaded version. "a well-written single-threaded application that never makes a blocking system call (such as I/O) while there is useful work to be done will always be faster than an equivalent multithreaded application", emphasis mine. Since a large number of useful applications: DB's, mail clients, browsers, processor simulators, web servers etc. have exactly this kind of behaviour it is very desirable.

      To cover my own ass, I will add one more caveat. This performance advantage only shows up in throughput on a balanced workload. It is trivial to create "tests" which cripple a multithreaded application by finding a code path which causes all threads to block on a shared resource (ie force a DB to write to tables that have little disk/memory locality simultaneously) which a single threaded app composed to deal with that situation will show higher performance in. Single threaded apps using that architecture generally fail in more balanced (you can argue whether that's real world or not) situations.

    37. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 2

      a well-written single-threaded application that never makes a blocking system call (such as I/O) while there is useful work to be done will always be faster than an equivalent multithreaded application

      Since when did I suggest the use of blocking I/O? Any single-threaded application of this sort needs to use non-blocking I/O as a matter of course. See POSIX AIO for one non-blocking I/O API.

      Finally... you mention that error states are possible in threaded applications, and can be exercised by appropriately written code ("tests" which cripple a multithreaded application). Personally, having a number of almost impossible to find deadlocks hidden away in my system until some end-user chances upon one makes me very, very nervous.

    38. Re:I'm only a humble C programmer, but.... by JohnsonJohnson · · Score: 1

      Since when did I suggest the use of blocking I/O? Any single-threaded application of this sort needs to use non-blocking I/O as a matter of course. See POSIX AIO for one non-blocking I/O API.

      Use of nonblocking I/O is in general almost as difficult as writing good multithreaded code. In effect the developer has to reimplement the logic of a multithreaded application in user space and wrap it around the nonblocking I/O calls. For applications such as a log file writer of course using nonblocking I/O without extra logic to check the results is fine though. Consider the case of a plant controller though, where multiple sensors could be triggering events for multiple controllers asynchronously. In that case a multithreaded design may make for simpler code than writing (and optimizing) what is in effect a thread scheduler for a single threaded application. In short, unless one does not read the results of a previous write, there is no such thing as truly nonblocking I/O. You will always have to check a semaphore to determine whether it is safe to read (or have some subsystem below implement the logic).

      Finally... you mention that error states are possible in threaded applications, and can be exercised by appropriately written code ("tests" which cripple a multithreaded application). Personally, having a number of almost impossible to find deadlocks hidden away in my system until some end-user chances upon one makes me very, very nervous.

      Now you are taking liberties with my statements. I did not say multithreaded code necessarily contains an error state. I was thinking of the case of a TPC like test. One common architecture for a multithreaded application is to have a pool of threads available which can perform all tasks available to the system. If the "benchmark" causes all the available threads to be contending for a single resource (like a write to the same record) then the application may be unavailable for future requests thus reducing throughput. A single threaded application will generally batch requests, so the multiple writes will be condensed to a single write request and the throughput will be higher. Nothing prevents a multithreaded application from doing the same thing, but you are correct the overhead in communication required in a multithreaded application would probably prevent it from equalling a single threaded app's performance on such a test. Deadlocks are a challenge because the most popular current languages have poor support for writing multithreaded applications (compare the relatively unpopular Ada95 to C/C++ for example). Deadlocks are not a necessity anymore than unstructured assembly code, but just as higher level languages encourage better structured code, newer languages encourage writing of reentrant, nonlocking code (Java for example strikes a reasonable middle ground between Ada and C as far as multiple threads go).
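
The batching idea mentioned above (multiple writes to the same record condensed into a single write request) might look something like this. The function name and the `(key, value)` request shape are assumptions made for illustration; a real database would also have to respect ordering and transaction rules that this sketch ignores.

```python
def coalesce_writes(requests):
    """Collapse queued (key, value) writes so each key is written once.

    Later writes to the same key overwrite earlier ones, so only the
    final value per key survives. Keys keep the order of first arrival.
    """
    latest = {}
    for key, value in requests:
        latest[key] = value
    return list(latest.items())

# Four queued requests, three of them aimed at the same record.
queued = [("r1", 1), ("r2", 5), ("r1", 2), ("r1", 3)]
batched = coalesce_writes(queued)
print(batched)
```

A single-threaded server sees the whole queue at once and can do this for free; a thread pool would need extra coordination between threads to reach the same result.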

    39. Re:I'm only a humble C programmer, but.... by tomstdenis · · Score: 1

      Not as easy as you think. Most socket stuff is blocking to some degree and even still the "timed" stuff is platform dependent. At least with pthreads [which is nicely portable] you can work with trivial socket code in multiple threads.

      Also things like web servers don't work nicely as single threaded applications. If you have ever written one [I have, http://tom.iahu.ca] it's very simple in a threaded model.

      Tom

      --
      Someday, I'll have a real sig.
    40. Re:I'm only a humble C programmer, but.... by Jay+L · · Score: 2

      Not as easy as you think.

      Well, I think it's not very easy, so it IS as easy as I think. It might not be as easy as someone else thinks, though. But it gets easier with practice and good libraries.

      At AOL, nearly all of our servers were single-threaded, based on a standard kernel of state-managing, event-calling support functions. We had non-blocking sockets, non-blocking database I/O, non-blocking DNS, really everything with the possible exception of local disk I/O (which is rarely necessary on a production server, and which can be solved with a cluster of "worker" processes doing the actual I/O).

      I worked on the mail system, so that's the part I know best. The difference in performance between sendmail (which forks multiple processes) and our own mail server (which ran single-threaded) was nothing short of astounding. Our days-late delivery problems disappeared almost instantly. This was on Suns and HPs; granted, we're now talking about forked processes rather than threads, and I don't know how the Sun and HP schedulers compare to Linux's, but the main point is that it's possible to write such a complex app single-threaded. I think the only significant thread-based app at AOL is AOLServer, which was developed independently.

      Writing single-threaded servers certainly takes skill and experience. Given the state-management problem, I had assumed that it was also more error-prone than writing threaded code, but from what I am learning about thread-safety, that may not be the case. But, no matter how little overhead you have doing a context switch, you have even less without it. At some performance level, that matters.

    41. Re:I'm only a humble C programmer, but.... by cduffy · · Score: 1

      Use of nonblocking I/O is in general almost as difficult as writing good multithreaded code.

      Absolutely. As I said, writing good singlethreaded code is often harder than writing multithreaded code for a similar function. However, it's also faster. My entire argument here is that threading is unwise as a performance-driven decision, not that threading is a poor idea on any other grounds (except for the deadlock-related potshots, which I'm willing to withdraw).

      In that case a multithreaded design may make for simpler code than writing (and optimizing) what is in effect a thread scheduler for a single threaded application.

      Absolutely -- I never intended to imply that a good single-threaded solution would be simplest of those options available. Note, however, that libraries implementing such a scheduler are already available, so one need not implement it ground-up. See Cheap Threads for one example.
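
      For a feel of what such a user-space scheduler does, here is a minimal round-robin sketch built on Python generators. It is not how Cheap Threads itself is implemented (that is an assumption-free statement only about this sketch); worker and run_round_robin are illustrative names.

```python
from collections import deque

def worker(name, steps, log):
    """A task that yields control back to the scheduler after each step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point; no kernel involvement

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

      Each task runs until it chooses to yield, so there is no preemption and no locking, but one task that never yields stalls everything.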

    42. Re:I'm only a humble C programmer, but.... by tomstdenis · · Score: 1

      Yeah note how you are writing code to work on one platform. Try writing a POSIX compatible web server with sockets then watch as it doesn't work on half the platforms because "non-blocking" is something they don't support.

      I still can't imagine how anyone could think it's a better solution. For example, consider a web server. When I submit a request with POST data I send a content-length. Now instead of just doing multiple recv()'s until you get it all you're going to have some global structure which will patch the data as select() determines it's available.

      It's *possible* to not use threads and get the same job done, it's just not as conceptually simple. For example, in the threaded case you'd have one function to receive all of the POST data. In the non-threaded model you'd have two functions. One function which calls select() waiting for activity and another which delegates the handler for each active socket.
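
      A rough sketch of the non-threaded model described here, in Python. It assumes a single connection, a socket pair in place of a real client, and a deliberately naive header parser; on_readable and the state dictionary are hypothetical names, not anything from a real server.

```python
import selectors
import socket

def on_readable(conn, state):
    """Consume whatever bytes are available; return the body once complete."""
    chunk = conn.recv(4096)
    if not chunk:
        raise ConnectionError("peer closed before the body was complete")
    state["buf"] += chunk
    if state["length"] is None and b"\r\n\r\n" in state["buf"]:
        # Headers are complete: remember Content-Length, keep only body bytes.
        head, state["buf"] = state["buf"].split(b"\r\n\r\n", 1)
        for line in head.split(b"\r\n"):
            if line.lower().startswith(b"content-length:"):
                state["length"] = int(line.split(b":", 1)[1].strip())
    if state["length"] is not None and len(state["buf"]) >= state["length"]:
        return state["buf"][: state["length"]]
    return None

client, server = socket.socketpair()
server.setblocking(False)
sel = selectors.DefaultSelector()
# Per-socket state travels with the registration instead of living on a stack.
sel.register(server, selectors.EVENT_READ, {"buf": b"", "length": None})

# The request arrives in two pieces, as it might over a slow link.
pieces = [b"POST /form HTTP/1.0\r\nContent-Length: 11\r\n\r\nhello", b" world"]
body = None
for piece in pieces:
    client.sendall(piece)
    for key, _ in sel.select(timeout=1.0):
        body = on_readable(key.fileobj, key.data) or body

sel.unregister(server)
sel.close()
client.close()
server.close()
```

      The handler has to return after every partial read and pick up from saved state on the next call, which is exactly the bookkeeping a blocking recv() loop in its own thread would hide.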

      On multi-threaded apps thread-swapping amounts to saving the registers and loading new ones. It's not entirely a difficult task [so to speak, hahaha punny]. Unless you have like a couple 1000 active threads you're not going to notice.

      Tom

      --
      Someday, I'll have a real sig.
    43. Re:I'm only a humble C programmer, but.... by Anonymous Coward · · Score: 0

      It depends on what you are doing. I can easily think of a peer-to-peer application with 50,000 connections (50,000 ports active with connections to other boxes across the internet). With one thread per port doing I/O on a local database, another thread per port handling I/O on the network (both incoming and outgoing), and another thread acting as a control for the other two, you can have 150,000 threads running on one app. This is good for peer to peer and cluster applications -- and is one example, there are more.

    44. Re:I'm only a humble C programmer, but.... by pthisis · · Score: 2

      Not as easy as you think

      But far easier than multithreaded programming.

      Threads are a way of saying "screw protected memory". They should be used only when you don't want memory protection within your application. Almost always, using threads is the wrong choice; multiple processes and/or a state machine with non-blocking I/O (depending on the problem) will accomplish the same ends as efficiently* and are much easier to implement**.

      *Remember that processes (COEs which don't share memory) are nearly as fast as threads in Linux, and faster in some cases. On other OSes (Irix, Solaris, Windows), processes are inefficient and threads are implemented more efficiently. That's a horrible hack to make up for ridiculously heavyweight processes; it's true that a small number of things can be optimized in a thread implementation (setting up VM mappings), but the actual speed implications of that are negligible in real-life programs. I'd be extremely surprised if you could find even one program exhibiting a measurable speed difference in Linux attributable to the scheduling, creation, and destruction properties of threads.

      **Threaded solutions often seem straightforward. The devil is in the details, though; locking, synchronization, and debugging issues tend to bite hard, and in the end I've never dealt with a problem where threading was a win over multiprocs and/or a state machine. The advantage of multiprocesses is not only in keeping memory protection; it also forces you to be explicit about what's shared and how that is communicated (and greatly simplifies debugging). Resulting designs tend to be much clearer and easier to make correct and maintain.
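
      As an illustration of "being explicit about what's shared", here is a small sketch assuming a POSIX system (it relies on fork() and pipes). The helper run_in_child and the counter are made up for the example.

```python
import os

def run_in_child(func):
    """Run func() in a forked child and return the bytes it reports back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: has its own copy of memory; only the pipe reaches the parent.
        os.close(read_fd)
        try:
            os.write(write_fd, func())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        result = pipe.read()  # returns at EOF, i.e. once the child has exited
    os.waitpid(pid, 0)
    return result

counter = {"hits": 0}

def bump_and_report():
    counter["hits"] += 1  # changes the child's copy only
    return str(counter["hits"]).encode()

reported = run_in_child(bump_and_report)
```

      The child's increment never leaks into the parent; the only thing that crosses the boundary is what was deliberately written to the pipe.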

      Sumner

      --
      rage, rage against the dying of the light
    45. Re:I'm only a humble C programmer, but.... by tomstdenis · · Score: 1

      I think you are writing as a person who has never had to use either. Threads can really be life savers when used correctly. Sure you have to implement locking but that's what pthread_mutex is for.
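
      A minimal sketch of that kind of locking, using Python's threading.Lock as a stand-in for pthread_mutex (an assumption made to keep the example short; the Counter class and hammer function are illustrative only):

```python
import threading

class Counter:
    """A shared counter whose updates are serialized by a mutex."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

tally = Counter()
threads = [threading.Thread(target=hammer, args=(tally, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

      With the lock held around every update, the four threads always end on the same total.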

      On low-mem devices making full copies of the process to spawn copies is just insane.

      And on windows the Thread implementation is *intentional* not accidental. The idea is that people using threads will take advantage of the speed increase.

      Tom

      --
      Someday, I'll have a real sig.
    46. Re:I'm only a humble C programmer, but.... by pthisis · · Score: 3, Insightful

      I think you are writing as a person who has never had to use either.

      I have written a dynamic content server that over the past 2 years has served over 6 billion requests, with 5 9's of uptime. I've written several realtime instrument control applications. I've written a distributed text mining application that does index-assisted regex searches of 1/2 terabyte of data in

      Threads can really be life savers when used correctly. Sure you have to implement locking but that's what pthread_mutex is for.

      On low-mem devices making full copies of the process to spawn copies is just insane.

      1) Look up COW and memory sharing.
      2) I never said "use only processes". A combination of processes and event loops is the way to go 99% of the time. There are some corner cases where threads are useful, but they tend to be abused by people who think "threads are good" without considering the alternatives nor the ramifications of that choice.

      And on windows the Thread implementation is *intentional* not accidental. The idea is that people using threads will take advantage of the speed increase.

      It's not a speed increase. Thread switching and thread creation on Windows are slower than process creation and process switching on Linux. On a par, but slower. Process creation on Windows is laughably slow, though, and process switching is substantially slower than thread switching.

      It's not that Windows figured out how to make their threads go fast, it's that their processes were dog-slow and they had to create an entirely separate execution primitive to get any sort of reasonable concurrency. Linux did things the right way by making them both fast, and now allows you to choose between the two for _design_ reasons (do I want to share memory?) rather than artificial implementation reasons.

      You'll find a lot of knowledgeable people (Larry McVoy, former SGI kernel architect) who echo the same belief: use threads sparingly. Use as many threads as you have CPUs, and use processes instead if that makes more sense. Use more threads than that only if you're intimately familiar with the alternatives and know why they don't work, because while a state machine with non-blocking I/O may seem hard at first glance it'll almost certainly turn out to be easier to implement correctly, easier to debug, faster, and easier to maintain.

      Sumner

      --
      rage, rage against the dying of the light
    47. Re:I'm only a humble C programmer, but.... by Marijuana+al-Shehi · · Score: 1

      What? Do you have any idea how much time (and how much RAM) it would take to launch 100,000 threads in Java? Do you even know how it is done? This may help:

      Userspace threads:Java::Kernel space threads:C on Linux.
      Nice way to use cavalierly in a sentence though.
      --
      "I think all foreigners should stop interfering in the internal affairs of Iraq"
      -- Paul Wolfowitz, 7/21/2003
  9. Excellent by Anonymous Coward · · Score: 0

    Fantastic job, well done. Every little step counts, one day linux will be the primary OS on everybody's desktop!!!

  10. If you want to destroy my boxen. . . by endeitzslash · · Score: 3, Funny

    Launch 100,000 threads while I walk away. . .

    OK I'll shut up now.

    1. Re:If you want to destroy my boxen. . . by Wayfare · · Score: 1

      *as* I walk away.

      Sorry - had to.

  11. Parallelism by inkfox · · Score: 5, Interesting

    This is very cool; but does it scale to multiple CPU systems? More and more, SMP, split-bus and multi-core architectures are going to be taking over. If this holds up in those environments, Linux may actually have a leg up on some of the dedicated task heavyweights.

    --
    Says the RIAA: When you EQ, you're stealing bass!
    1. Re:Parallelism by Anonymous Coward · · Score: 0

      I believe it said in the article/discussion that they were using a dual p4 for testing. That would imply that scaling isn't a problem.

    2. Re:Parallelism by Anonymous Coward · · Score: 2, Interesting
      I believe it said in the article/discussion that they were using a dual p4 for testing. That would imply that scaling isn't a problem

      Many algorithms work great for one extra processor but fail miserably with more.

      In most cases, you can just busy wait on a semaphore with two CPUs and never notice the hit. 8, 32 or 512 CPUs and you're going to throw away most of your processing time.
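
      The difference between spinning and sleeping on a lock can be sketched like this. It is a toy in Python, assuming a threading.Event in place of a real semaphore; spin_until, block_until, and release_later are illustrative names.

```python
import threading
import time

def spin_until(flag):
    """Busy-wait: burns CPU polling the flag; returns how many polls it took."""
    polls = 0
    while not flag.is_set():
        polls += 1
    return polls

def block_until(flag):
    """Blocking wait: the thread sleeps until another thread sets the flag."""
    return flag.wait(timeout=5.0)

def release_later(flag, delay):
    time.sleep(delay)
    flag.set()

spin_flag = threading.Event()
threading.Thread(target=release_later, args=(spin_flag, 0.05)).start()
polls = spin_until(spin_flag)  # every poll here is wasted CPU time

block_flag = threading.Event()
threading.Thread(target=release_later, args=(block_flag, 0.05)).start()
woke = block_until(block_flag)  # the waiter uses no CPU while it sleeps
```

      One spinner is barely noticeable; the same loop running on dozens of CPUs at once is where the processing time goes.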

    3. Re:Parallelism by _Knots · · Score: 2

      I've been following LKML, not that I can contribute much, but still. Most of the scheduling work, if memory serves, is tested on large-way boxes (the number 32 leaps to mind).

      You are encouraged to read the list for yourself because it's early in the morning and my brain might be playing tricks on me.

      --Knots;

      --
      Anarchy$ dd if=/dev/random of=~/.signature bs=120 count=1
    4. Re:Parallelism by Anonymous Coward · · Score: 0

      Maybe not, that it succeeded in starting up 100,000 threads in 2 seconds is a nice thing. But I doubt that any of the threads does any computation. The only thing that happened is that the thread was added to scheduler tables.

      All we see here is that the O(1) knows when to ignore
      a thread and give another thread time, or it simply
      prioritizes creation of threads.

      Then, there is no better scheduler used than on FreeBSD. You got to admit the feeling is best there than the rubber one with Linux (must be all that penguinism)

    5. Re:Parallelism by deKernel · · Score: 1

      My question to this is, I didn't think that the P4 could do SMP?

    6. Re:Parallelism by cheese_wallet · · Score: 2

      My question to this is, I didn't think that the P4 could do SMP?

      The p4 xeons can.

    7. Re:Parallelism by Anonymous Coward · · Score: 0

      Which is why you don't busy-wait on locking mechanisms....

    8. Re:Parallelism by Anonymous Coward · · Score: 0

      What? FreeBSD only supports userspace threading. The scheduler will never be under any kind of pressure.

      Do you even know what you're talking about?

    9. Re:Parallelism by Anonymous Coward · · Score: 0

      SMT: Xeons do it, P4 doesn't yet (officially).

      SMP: P4 does it. as does Xeon.

      -k

    10. Re:Parallelism by NerveGas · · Score: 2

      Yeah, I wish that SMP was taking over. We're not really that much farther in that regard than we were 5 or 6 years ago, with the Pentium Pro. In fact, in some regards, we're WORSE off: Since the PPro, all Intel chips had SMP capabilities, even the "non-SMP capable" celerons. However, now the norm is for their chips to NOT have SMP capability.

      Yes, the Itanium and upcoming SledgeHammer are going to change things. But we've been hearing that for a decade. We'll see if things REALLY change or not.

      steve

      --
      Oh, you're not stuck, you're just unable to let go of the onion rings.
  12. Just try to... by Anonymous Coward · · Score: 0

    Imagine a Beowulf cluster of such Linux developer kernels!

  13. ...from the Build-the-Playroom Dept.... by Anonymous Coward · · Score: 0

    "This story explains how the latest Linux development kernel is now able to start and stop over 100,000 threads in parallel in only 2 seconds..."

    I didn't know the Linux kernel was a mother-to-be.... ;-)

  14. IE/Mozilla by teasea · · Score: 1

    Got a link for that?

  15. Great news! by zensonic · · Score: 2, Funny

    So now I'm able to open up 100.000 pr0n pictures in just 2 sec. Ubercool ;-)

    --
    Thomas S. Iversen
    1. Re:Great news! by Bishop923 · · Score: 2

      Only problem would be getting an HDD to transfer at 25.6 GB/s (assuming each pic was 500k)
      Now THAT would be impressive. :-)

    2. Re:Great news! by Anonymous Coward · · Score: 0

      500k what kind of porn do you look at?

    3. Re:Great news! by Anonymous Coward · · Score: 0

      I'm guessing BBW.

    4. Re:Great news! by Anonymous Coward · · Score: 0

      How about one of these?

      Well, I guess actually getting one of these would still be impressive, unless you are, of course, filthy rich.

    5. Re:Great news! by damiam · · Score: 1

      No problem, just steal Google's RAM array.

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
  16. I know a faster way... by ndogg · · Score: 0, Redundant

    It's called "pulling the power cable."

    --
    // file: mice.h
    #include "frickin_lasers.h"
  17. Not 100,000 threads in parallel, just 50. by Tronster · · Score: 1, Informative
    The title and description is misleading. From the comments further down in the article, Linus points out that only 50 threads at a time were running in parallel:

    From: Linus Torvalds
    Subject: Re: [ANNOUNCE] Native POSIX Thread Library 0.1
    Date: Fri, 20 Sep 2002 06:01:47 +0000 (UTC)

    Rik van Riel wrote:

    >I agree, it's pretty silly. But still, I was curious how they
    >managed to achieve it ;)

    You didn't read the post carefully.

    They started and waited for 100,000 threads.

    They did not have them all running at the same time. I think the
    original post said something like "up to 50 at a time".

    Basically, the benchmark was how _fast_ thread creation is, not how many
    you can run at the same time. 100k threads at once is crazy, but you can
    do it now on 64-bit architectures if you really want to.

    Linus
    1. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0

      lol the title and the description were misleading!

      *gasp*

    2. Re:Not 100,000 threads in parallel, just 50. by vvikram · · Score: 5, Informative


      Yeah right. And modded to "Informative"? Slashdot moderators are the _pits_.

      Read ingo's reply to Linus. They _did_ start
      one test serially and also _parallelly_ . In short he says that it's possible.

      vv

    3. Re:Not 100,000 threads in parallel, just 50. by the_quark · · Score: 2

      This could be huge for things like webservers, though, which spend a lot of their time kicking off new (logical) processes. As I understand it, on Linux, a big part of the reason Apache 2.0 hasn't taken off (aside from lack of availability of major packages) is that Apache 2.0's main win is in threading support. Under Linux, thread creation hasn't been much faster than process creation, because process creation was so dang fast.

      So, am I right in thinking this means threading (and hence Apache 2.0) will be a big win for Linux web servers, now?

    4. Re:Not 100,000 threads in parallel, just 50. by mikec · · Score: 2

      A later post pointed out that Linus was wrong. They actually did both tests: one test created and destroyed threads as fast as possible; the other created 100K threads first and then killed them all.

    5. Re:Not 100,000 threads in parallel, just 50. by DoctorHibbert · · Score: 2, Insightful

      True, however the feat is still quite impressive. By making the creation and destruction of threads cheaper, it frees developers from having to worry so much about the overall system impact when spawning threads.

      For instance, because of the expense many applications use thread pools, which are simply a bunch of idle threads that sit around doing nothing, waiting for work to do. These idle threads still take up system resources even though they're not actually using CPU. Not to mention the extra work the developers have to do to make the thread pools work for their applications.
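
      For readers who have not met the pattern, a thread pool looks roughly like this. The sketch uses Python's standard ThreadPoolExecutor; the handle function and the pool size of 4 are arbitrary choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handle(request_id):
    """Pretend to serve a request; report which thread ran it."""
    return request_id * 2, threading.current_thread().name

# At most four long-lived workers serve all twenty requests between them.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="pool") as pool:
    results = list(pool.map(handle, range(20)))

answers = [value for value, _ in results]
workers_used = {name for _, name in results}
```

      Twenty requests are served without ever creating more than four threads, at the cost of keeping those threads (and the queueing code) around while idle. If creating a thread were nearly free, spawning one per request would be a simpler design.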

      --
      Arbitrary sig
    6. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0
      Here's a link that talks about this thread creation further {{google cache, in case it's /.ed}}. It seems that 100,000 simultaneous threads is not going to happen on your desktop linux box anytime soon, but this guy says 50,000 started *and running* in less than a second is not too far away.

      Keep up the good work, guys!

    7. Re:Not 100,000 threads in parallel, just 50. by grytpype · · Score: 2
      I'm afraid YOU didn't read the article very carefully, Ingo replied as follows to Linus's post:
      actually, that was Ulrich's other test, which tests the serial starting of 100,000 threads. the test i did started up 100,000 concurrent threads which shot up the load-average to a couple of thousands. [the default timeslice the parent has is enough to start more than 50,000 parallel threads a pop or so.]
      --

      - Have a picture

    8. Re:Not 100,000 threads in parallel, just 50. by kinnunen · · Score: 5, Informative
      Read Ingo's posts too:
      actually, that was Ulrich's other test, which tests the serial starting of 100,000 threads. the test i did started up 100,000 concurrent threads which shot up the load-average to a couple of thousands. [the default timeslice the parent has is enough to start more than 50,000 parallel threads a pop or so.]
      And another one:
      Anton tested 1 million concurrent threads on one of his bigger PowerPC boxes, which started up in around 30 seconds. I think he saw a load average of around 200 thousand. [ie. the runqueue was probably a few hundred thousand entries long at times.]
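
      The distinction between the two tests (serial start versus all threads alive at once) can be sketched as below. This is not the benchmark Ingo or Ulrich ran; it is a small Python illustration with 200 threads, and start_serially and start_concurrently are made-up names.

```python
import threading

def start_serially(n):
    """Start one thread, wait for it to finish, then start the next."""
    done = []
    for i in range(n):
        t = threading.Thread(target=done.append, args=(i,))
        t.start()
        t.join()  # never more than one worker alive at a time
    return len(done)

def start_concurrently(n):
    """Start n threads that all stay alive until released together."""
    release = threading.Event()
    done = []

    def body(i):
        release.wait()  # park here so every thread exists simultaneously
        done.append(i)

    threads = [threading.Thread(target=body, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    alive_at_peak = sum(t.is_alive() for t in threads)
    release.set()
    for t in threads:
        t.join()
    return alive_at_peak, len(done)

serial_done = start_serially(200)
peak, concurrent_done = start_concurrently(200)
```

      Both functions create the same number of threads; only the second one ever has them all in existence at the same moment, which is what stresses the scheduler's run queue.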
    9. Re:Not 100,000 threads in parallel, just 50. by ergo98 · · Score: 3, Insightful

      Under Linux, thread creation hasn't been much faster than process creation, because process creation was so dang fast.

      That's called "making lemonade out of lemons". Clearly this test has shown that thread creation in Linux was horribly broken, not the flip side that process creation was so wonderfully good.

    10. Re:Not 100,000 threads in parallel, just 50. by the_quark · · Score: 5, Informative

      No, seriously. Process creation under Linux was time-similar to thread creation on other OSs. That's because Linux was as fast at creating *a process* as other OSs are at creating *a thread*. IIRC, threading was initially implemented in Linux from the process-creation methods, so it was similar in speed (the main advantage in Linux from threads was the shared memory space if your application wanted that sort of thing). That's why Apache 2.0 is bringing NT performance more in line with Linux 1.3 performance: NT's threading speed is a lot closer to Linux's forking speed. Again, I'd like to underscore I'm not an expert on this, and it's possible I'm mistaken about relative benchmarks (is NT w/Apache 2.0 a little faster than Linux w/Apache 1.3? Could be...) but I'm very confident of the basic underlying point, that Linux process creation is essentially comparable to other OSs' thread creation, perhaps even faster.

      See, for example, http://www.linux.cu/pipermail/linux-prog/2001-February/000027.html, just one of the first Google links that popped up when I went looking for proof that I'm not on crack: "Linux newcomers often are unaware of the substantial differences between Linux and other operating systems. To implement concurrency, they use multithreading exclusively, mistakenly assuming as high an overhead associated with Linux multiprocessing as on other platforms." In fact, knowing how fast Linux's process creation is relative to other systems' thread creation makes this even more impressive in my mind. This isn't just a bug fix; much like with process creation before it, Linux is doing something fundamentally better than its counterparts.

      Don't forget: Just because this is /. doesn't mean I'm just a Windows-hating troll. I try to make sure all my Windows-hating-troll-posts are at least backed up by facts. ;)

    11. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0

      It better be more than 50 at a time, even nachos can do that :)

    12. Re:Not 100,000 threads in parallel, just 50. by brianpane · · Score: 4, Informative
      Apache 2.0 doesn't actually do thread creation very frequently. The thread creation cost occurs mostly at startup. So the limiting factors for threaded Apache performance on Linux are mainly:
      • The speed with which the kernel can schedule and context-switch among threads
        For some recent data on this, see http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=103228014211983. The O(1) scheduler patch for 2.4 seems to help here.
      • Memory usage per thread
      • Concurrency limitations of the Apache code itself
        This has been improving gradually with successive 2.0 releases, as the remaining global locks are removed or optimized.
      • General robustness of the thread implementation
        The current (2.4) Linux threading implementation doesn't work well with debuggers.
      At first glance, it looks like the NPTL could be a win for threaded Apache on Linux, as it offers some solutions for the first and last of these issues.
    13. Re:Not 100,000 threads in parallel, just 50. by kinnunen · · Score: 1

      This is just the latest example why slashdot needs a -1 bullshit/misinformation moderation option. The post isn't a troll, it's not really a flame[bait] either and it certainly isn't off-topic. That leaves -1 overrated. The -1 is fine, but it doesn't change the moderation reason shown with the comment. So if you moderate +5 informative post with -1 overrated the next idiot moderator will just see the +4 informative and think "hey this is a good post, it really should be +5 informative".

    14. Re:Not 100,000 threads in parallel, just 50. by Karellen · · Score: 5, Informative

      It's not process/thread _creation_ times that make the difference, it's the process/thread _context_switch_ times that really mount up, which is where Linux shines.

      And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.

      K.

      --
      Why doesn't the gene pool have a life guard?
    15. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0

      I'd love to see you do that on any windows "kernel"

      kthx~~

    16. Re:Not 100,000 threads in parallel, just 50. by Wdomburg · · Score: 2

      > The title and description is misleading. From the
      > comments further down in the article, Linus
      > points out that only 50 threads at a time were
      > running in parallel:

      And the next comment down is from Ingo:

      actually, that was Ulrich's other test, which
      tests the serial starting of 100,000 threads.

      the test i did started up 100,000 concurrent
      threads which shot up the load-average to a
      couple of thousands. [the default timeslice the
      parent has is enough to start more than 50,000
      parallel threads a pop or so.]

      So, yes, they did manage 100,000 threads running in parallel.

      Matt

    17. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0

      parallelly? is that a word?

    18. Re:Not 100,000 threads in parallel, just 50. by Sivar · · Score: 2

      Linus was... Wrong?!

      Whoa, that's going to completely shatter the world view of many Slashdotters.

      --
      Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
    19. Re:Not 100,000 threads in parallel, just 50. by ProtonMotiveForce · · Score: 0

      Maybe that's because NT offers so much more of a low level object security model, and so many more features at that level?

      You can gripe about NT at the gestalt OS level all you want (I don't, but I almost understand the mentality of those who do), but you have to admit that it's got a decent low-level architecture. Everything is an object and it's all got a complex set of controls, e.g. ACLs. It's probably just not geared to high speed context switches.

    20. Re:Not 100,000 threads in parallel, just 50. by Anonymous Coward · · Score: 0

      Actually, he is wrong a lot. What makes him kind of special is if someone points it out with a solid argument, he will admit he was wrong.

    21. Re:Not 100,000 threads in parallel, just 50. by sagei · · Score: 2

      No, seriously. Process creation under Linux was time-similar to thread creation on other OSs. That's because Linux was as fast at creating *a process* as other OSs are at creating *a thread*. IIRC, threading was initially implemented in Linux from the process-creation methods, so it was similar in speed

      It was and still is implemented by the process creation methods. Threads were (and still are) the same as processes in Linux (to the kernel, anyhow). All process creation is done by do_fork(), which accepts clone() flags that specify what to share between the parent and the child. "Threads" (as opposed to normal processes) just happen to share a few things: address space, signal handlers, open files, etc.
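
      The practical effect of sharing (or not sharing) the address space can be seen from user space. The sketch below assumes a POSIX system and uses Python's threading and os.fork rather than calling clone() directly; the shared dictionary and overwrite function are illustrative only.

```python
import os
import threading

shared = {"value": "parent"}

def overwrite(tag):
    shared["value"] = tag

# A thread shares the parent's address space, so its write is visible.
t = threading.Thread(target=overwrite, args=("thread",))
t.start()
t.join()
after_thread = shared["value"]

# A forked child gets its own (copy-on-write) view; its write stays private.
pid = os.fork()
if pid == 0:
    overwrite("child")
    os._exit(0)
os.waitpid(pid, 0)
after_fork = shared["value"]
```

      Same creation path underneath, different sharing: the thread's write sticks in the parent, the child's write does not.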

      But yah, process creation in Linux is sick. Hold your head high.

      --

      Robert Love

    22. Re:Not 100,000 threads in parallel, just 50. by AJWM · · Score: 2

      You're quite right, and in fact this predates Linux and NT -- Unix was always good at process creation, whereas VMS process startup was very heavy on overhead.

      It's not surprising that Linux (modelled on Unix) and NT (originally modelled on VMS) show similar characteristics. It's the reason that many Unix applications tend to be written as a bunch of cooperating processes, whereas NT apps are monolithic monsters with lots of threads.

      Unfortunately, thanks to a generation of CS students having learned bad habits on Windows, we're starting to see a lot of Linux apps written as monolithic monsters. (Of course there are few old Unix apps out there like that too, perhaps some old mainframe mentality leaking through.) There are advantages to cooperating/communicating processes vs the monolithic multithreaded approach: it's easier to test the components separately, it's easier to reuse the components to make different systems, and a bug in one place won't necessarily clobber the whole thing.

      --
      -- Alastair
    23. Re:Not 100,000 threads in parallel, just 50. by gregorio · · Score: 1

      And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.

      And why do we need a low-latency patch just to listen to a mp3 and move windows without having to listen a lot of noise (when you move a window)?
      What you just said is just plain BS, hoping for karma, if Linux had a good context switching performance, we wouldn't need a low latency patch.

    24. Re:Not 100,000 threads in parallel, just 50. by himi · · Score: 4, Interesting

      The latency issues that cause mp3 skipping under heavy load in Linux have nothing at all to do with context switching, and everything to do with /scheduling/ latency: how long it takes for a process that has work to do to actually get control of the cpu. Context switching has /nothing/ to do with that.

      The low latency patches go through the kernel breaking up areas where spinlocks are held for long periods of time. That's what causes massive scheduling latency in the kernel.

      Context switching under Linux /is/ extremely fast - it's actually been measured (a lot), and it's something the kernel developers pay a lot of attention to and optimise very carefully. They literally count cpu cycles in these code paths. Context switching time is a serious performance limiter in many areas, so getting it right is important, and it's something that Linux does /very/ well.

      Go do some real research before you accuse someone who's right of karma whoring bullshit.

      himi

      --

      My very own DeCSS mirror.
    25. Re:Not 100,000 threads in parallel, just 50. by napir · · Score: 1

      The actual time spent in the kernel during a context switch seems like it would be only a small part of the penalty you'd pay during a (process-changing) context switch. Switching from one process to another means moving to a different address space, which means that most of the stuff in your caches is going to be trash. The number of cache misses you're going to suffer when the process first starts running again seems like it would be the largest performance problem. This is the main argument for using threads instead of separate processes, not how long they take to create.

    26. Re:Not 100,000 threads in parallel, just 50. by himi · · Score: 2

      Yes, that's a performance issue, but it's not a /latency/ issue - the new process is running, and from there on in the latencies are only a few hundred cycles rather than measurable in microseconds. Until the next time the process enters the kernel, or page faults, or whatever. As far as latency goes, context switching is of minimal importance unless you're worried by latencies on the order of less than a microsecond (depending on hardware and the like, of course).

      The argument that threads trash cache less than full processes seems fairly bogus to me - the cache trashing will be much more dependent on the size of the working sets of all the running processes, and there's nothing to say that a thread will have a smaller working set than a process. The text segment will be shared, yes, but it's the same with multiple instances of /any/ process, because the program text will be mmapped read only, allowing the memory to be shared, and thus kept in cache. The TLB flush needed would be an added cost, but unless the cache really is being trashed completely by your program it'll be reloaded straight from cache, and that shouldn't be more than a few hundred cycles (I think - don't quote me on that).

      In any case, the real performance comparison isn't between multiple processes versus multiple threads, it's between a multithreaded implementation and a single-threaded one. In /that/ comparison threads come last, simply because they /have/ those kinds of cache interactions and so forth, where a single-threaded version won't. They also have overhead due to locking, greater debugging difficulties, and other added complexities. On the other hand, though, you can't make use of more than one processor without having multiple processes, whether they're threads or full processes . . .

      I think the biggest thing making threads attractive to people is the fact that a threaded approach will often make things simpler to think about in the design stage. You can make all the independent threads of control in your design /real/ threads of control in the implementation. That comes at a cost, though . . .

      Personally, I like the quote from Alan Cox that I've seen in a few people's .sig: "Threads are for people who can't program state machines". It's more complex than that, but it does seem to capture a lot of what motivates threaded designs.

      himi

      --

      My very own DeCSS mirror.
    27. Re:Not 100,000 threads in parallel, just 50. by pthisis · · Score: 2

      And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.

      Last time I benchmarked, which was a long time ago (NT 3.51 days), Linux process switch times were 5x faster than NT thread-switch times on the same hardware. Linux thread-switch times were on a par with process-switch times, NT thread-switch times were about 20x faster than NT process-switch times.

      I'd expect all those numbers to have changed, though.

      Sumner

      --
      rage, rage against the dying of the light
    28. Re:Not 100,000 threads in parallel, just 50. by Karellen · · Score: 2

      _Need_ the low latency patch? We don't.

      karellen $ uname -a
      Linux foo 2.4.17 #1 Sat Jul 13 12:21:18 GMT 2002 i686 unknown
      karellen $ cat /proc/cpuinfo | grep -E "model|cpu"
      cpu family : 6
      model : 3
      model name : AMD Duron(tm) Processor
      cpu MHz : 757.485
      cpuid level : 1
      karellen $ cat /proc/meminfo | grep MemTotal
      MemTotal: 126732 kB
      karellen $

      So, I'm running 2.4.17 on an AMD 750 with 128MB of RAM. You'll have to take my word that that's a stock 2.4.17, with no patches, but I'm playing a list of .ogg files with xmms, while ripping and ogging a CD in the background, with Mozilla running, and grabbing a mozilla window and moving it around the desktop (with opaque window moving switched on) really quickly for 20 seconds results in - no skipping.

      Yeah, reducing latency will be nice, but as far as I can tell, it's not actually needed for anything to do with the `user experience' at the moment.

      Don't know what you've got running in the background, but it must be pretty hefty.

      K.

      --
      Why doesn't the gene pool have a life guard?
  18. Re:It's cool! by Anonymous Coward · · Score: 0

    warning goatse link! don't click!

  19. Best Quote: by Fallen+Kell · · Score: 1

    Why so many threads? "Because we can :)"

    --
    We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
  20. How does this compare? by Anonymous Coward · · Score: 0


    How does this compare to other OSes, such as Solaris, NT, OSX, etc?

  21. Sounds cool, but all I could think of... by Geek+Tragedy · · Score: 5, Funny

    "Hello, my name is Ingo Molnar. You killed -9 my process: prepare to die."

    Sorry, had to :P

    1. Re:Sounds cool, but all I could think of... by w4r3z_d00d · · Score: 0

      wtf?

    2. Re:Sounds cool, but all I could think of... by unsinged+int · · Score: 5, Funny

      I think it's more commonly this:

      "Hello, my name is Ingo Molnar. You kill -9 my parent process. Prepare to vi."

    3. Re:Sounds cool, but all I could think of... by Anonymous Coward · · Score: 0

      I read this like 15 times before I realized you guys were trying to say, "My name is Inigo Montoya." And for those not in the know, it's from The Princess Bride.

    4. Re:Sounds cool, but all I could think of... by Sycle · · Score: 1

      Ingo Molnar is credited in the summary (try the top of the page) as being the major contributor to the project.

      It's nice you got half of the reference, anyway.

    5. Re:Sounds cool, but all I could think of... by rweir · · Score: 1

      Talking of vi and patches...

      No one's as manly as Al Viro

  22. NOOO!!!!! by Monkelectric · · Score: 3, Funny

    At school (before I graduated so long ago) we would "fork bomb" the compute servers [ while(1) do { fork(); } ] in an attempt to extend deadlines or simply be assholes :)

    --

    Religion is a gateway psychosis. -- Dave Foley

    1. Re:NOOO!!!!! by powerlord · · Score: 2

      Hehehe I had a classmate do that accidentally.

      One of our final projects was to implement our own shell. This would of course necessitate a fork() command... he hadn't checked conditions quite right and managed to use up all the resources for his account. Fortunately someone had set the Ultrix (Unix on VAX) system up with a little intelligence. He only bombed his own account and had to get the Prof. to go in and kill the out-of-control shell :)

      I on the other hand merely got half-baked tokenizing. Great teacher (pity they disbanded the Comp-Sci department around us).

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
    2. Re:NOOO!!!!! by ez76 · · Score: 5, Funny

      I am replying pre-emptively to dissuade the AC's who would otherwise reply to you and point out that your post should not have been modded funny because this innovation would not prevent fork() bombing because it involves spawning threads and not processes.

      I am further replying pre-emptively to dissuade the AC's who would otherwise reply to me and point out my egregious abuse of run-on sentences.

      I am further replying pre-emptively to dissuade the AC's who would otherwise reply to me and point out my egregious abuse of +1 bonus.

      I am further replying pre-emptively to dissuade the AC's who would mod this post down as off-topic because they do not get the parallel allusion to fork-bombing.

    3. Re:NOOO!!!!! by AJWM · · Score: 3, Funny

      I did something like that back in my school days on a dual-CPU Burroughs B6700, but with a twist: Each process forked itself twice, then waited. When it received a signal about a child process being killed, it spawned two more. I had a sleep of a few seconds or so in there so it didn't grow too fast.

      The fun part of that was when the system operators saw the processes replicating like crazy and started to kill them, that made it worse.

      Another fun trick with that machine was to set up a circularly-linked list and invoke the LLLU (linked list lookup) instruction on it...

      (Yeah, stupid things to do. At least I only did them during relatively quiet times.)

      --
      -- Alastair
    4. Re:NOOO!!!!! by stephanruby · · Score: 1
      At school (before I graduated so long ago) we would "fork bomb" the compute servers [ while(1) do { fork(); } ] in an attempt to extend deadlines or simply be assholes :)

      That explains why some professors wouldn't give us any extensions when the labs' servers were down.

    5. Re:NOOO!!!!! by Anonymous Coward · · Score: 0

      We have a solution to this: If your homework was not
      in by deadline, tough luck. So, fork bombing just
      makes you an asshole.

    6. Re:NOOO!!!!! by inio · · Score: 5, Funny

      Dude, you seriously need to look into writing patents.

    7. Re:NOOO!!!!! by Agent+Orange · · Score: 1

      Heh, I remember defining an infinitely recursive function that kept spawning subshells in shell once. Only _then_ did I discover that the main server didn't have any limits...

      15,000 zombie processes later....*CRASH*

      oops :)

    8. Re:NOOO!!!!! by Anonymous Coward · · Score: 0

      this horse-fucker has no friends and no future

    9. Re:NOOO!!!!! by inode_buddha · · Score: 1

      Fond memories of the Burroughs B6900 at my local college here... 25 years later it was replaced by a Gateway2000 with dual Xeons...

      --
      C|N>K
    10. Re:NOOO!!!!! by Anonymous Coward · · Score: 0

      I have one very important question to ask you. How would an AC mod a post?

    11. Re:NOOO!!!!! by defile · · Score: 2

      For any admins subjected to such clever users, the correct way to handle a forkbomb is to first send STOP to all processes, which will prevent them from replacing the siblings which you kill, then you break out the nine.
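The stop-first, kill-second order matters because a stopped process cannot respawn anything. A minimal Python sketch of that order, assuming a Unix-like system and that you already have the list of offending PIDs (collecting them, for example by user, is left out). The sleeping children in `demo` are harmless stand-ins, not a real fork bomb, and the function names are made up for illustration.

```python
import os
import signal
import subprocess
import sys


def stop_then_kill(pids):
    """Freeze every process first, then kill them all."""
    # Pass 1: SIGSTOP cannot be caught or ignored, so nothing here can
    # fork a replacement once it has been stopped.
    for pid in pids:
        os.kill(pid, signal.SIGSTOP)
    # Pass 2: "break out the nine". SIGKILL also works on stopped processes.
    for pid in pids:
        os.kill(pid, signal.SIGKILL)


def demo(count=3):
    # Stand-ins for the runaway processes: children that just sleep.
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
        for _ in range(count)
    ]
    stop_then_kill([child.pid for child in children])
    # wait() reaps each child; a negative return code is minus the fatal signal.
    return [child.wait() for child in children]


if __name__ == "__main__":
    print(demo())
```

A real fork bomb keeps creating PIDs while you are listing them, so in practice the stop pass may need repeating until the list stops growing.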

    12. Re:NOOO!!!!! by Error27 · · Score: 1

      at the kernel level threads are just processes with shared memory.

  23. Re:I'm back! by Anonymous Coward · · Score: 0

    Not much longer. Microsoft is going to be brought down by the next generation of IIS worms.

  24. Windows by jeffbru · · Score: 3, Interesting

    Just out of curiosity, how does the benchmark in windows compare?

    --
    - Jeff Brubaker
    1. Re:Windows by CoolVibe · · Score: 2, Troll
      Oh, I'll bet Microsoft could rig a system without the graphics, network, most driver subsystems and the GUI stuff to skimp on overhead and winge their way to a higher number of parallel threads in less time.

      Or they could just blatantly pay some other company that does "independent testing" *cough*mindcraft*cough* to lie about it :)

    2. Re:Windows by w4r3z_d00d · · Score: 0

      all i get is teh gaping assholes. wtf am i doing wrong?

      is i tht e lunix i sintalled on my b0x?

    3. Re:Windows by Courageous · · Score: 2, Insightful

      It is *impossible* to even allocate more than about 31,000 threads under windows on 32 bit machines. You simply CAN'T do it. The minimum thread stack size is 1 64KB page. You can only address 2GB of memory on a 32 bit windows OS. Do the math.

      C//
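"Do the math" spelled out as a small Python sketch. It only divides the two figures given in the comment (2GB of addressable space, 64KB minimum stack per thread), so it is an upper bound under those assumptions; anything else living in the address space pushes the real number lower. The function name is illustrative.

```python
GIB = 1024 ** 3
KIB = 1024


def max_threads(address_space_bytes, stack_reserve_bytes):
    """Upper bound on thread count if stacks were the only thing in memory."""
    return address_space_bytes // stack_reserve_bytes


if __name__ == "__main__":
    # 2 GiB of user address space, 64 KiB reserved per thread stack.
    print(max_threads(2 * GIB, 64 * KIB))
```

The ceiling comes out a little above the "about 31,000" quoted, which fits: code, heap and DLLs also need room in the same 2GB.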

    4. Re:Windows by akintayo · · Score: 1

      maybe i am confused, but i thought windows had a switch that would allow the user application to address 3G of memory

      --
      Woe be on to them, all who rise against poor people, shall perish in a the end. Buju Banton
    5. Re:Windows by bmajik · · Score: 2

      Two nitpicks:
      you can address 3GB with the /3GB switch :)
      you can address significantly more with AWE/PAE, but i dont know that you can use that additional memory for thread stacks.

      Just FYI, Yesterday i had SQL server 2k running with 1914 threads ( in AWE mode)

      --
      My opinions are my own, and do not necessarily represent those of my employer.
    6. Re:Windows by Courageous · · Score: 2

      Well, that may be the case. I was making more of a reference to the reserve memory lower limit on a thread's stack size.

      C//

  25. Re:It's cool! by Anonymous Coward · · Score: 0

    No, this is a goatse.cx link.

  26. Real World Example by robinjo · · Score: 2

    I'm building a project where there will be one huge database with up to 200 different companies connected to it pretty much nonstop. 1-10 users from every company depending on the time of the year. 2 threads for every connection.

    200*10*2=4000 threads.

    1. Re:Real World Example by khuber · · Score: 2
      Why would you use 2 threads per connection instead of the more common select() + worker thread pool? That doesn't seem like a scalable design.

      -Kevin

    2. Re:Real World Example by flux · · Score: 2, Interesting

      What is the fundamental reason select/poll should be that much faster anyway? Well, you win the context-switch times, if you can handle many clients in a tick. But on the other hand it does affect the way you need to design the code, and doing some stuff that never stalls without threads might be tricky.

      Just imagine a situation where a thread might need to calculate something, or initialize a big array. Now, if it's run under a select-loop, you need to do that in parts to avoid starving the server. With threads, you just do the trick and don't care about the rest of the world, which keeps serving the clients no matter how long you stay in the function.

    3. Re:Real World Example by khuber · · Score: 2, Informative
      I was talking about a hybrid design, not pure select. Of course you are right about the limitations of pure select. Thread per client starts to bog as the number of simultaneous clients increases.

      It's not practical to serve hundreds/thousands of clients with a thread per client model. A typical machine can't handle the load well because it has limited resources. It will thrash. By having a thread pool you place a limit (throttle if you will) on resource utilization. Most high performance, highly scalable web and app servers use this model or a variant.

      There is another architecture based on event driven state machines aka SPED (single process event driven) that is high performance and single process/single thread in its pure form. The Zeus web server does this.

      -Kevin
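For anyone who hasn't seen the hybrid design, here is a rough Python sketch: one thread waits on every socket via the `selectors` module, and a bounded `ThreadPoolExecutor` does the per-request work. Everything specific is an assumption made for illustration: `handle_request` is a stand-in that just upper-cases the payload, each `recv` is treated as one whole request, and the demo uses `socket.socketpair()` instead of real network clients.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def handle_request(data):
    """Stand-in for real per-request work (hypothetical)."""
    return data.upper()


def reply(conn, data):
    # Runs on a pool thread, so slow work here never stalls the select loop.
    conn.sendall(handle_request(data))


def serve(connections, expected_requests, workers=4):
    """Wait on all sockets in one thread; hand requests to a bounded pool."""
    selector = selectors.DefaultSelector()
    for conn in connections:
        selector.register(conn, selectors.EVENT_READ)

    handled = 0
    # The pool size is the throttle: at most `workers` requests run at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while handled < expected_requests and selector.get_map():
            for key, _events in selector.select(timeout=1.0):
                conn = key.fileobj
                data = conn.recv(4096)
                if not data:  # peer closed the connection
                    selector.unregister(conn)
                    continue
                pool.submit(reply, conn, data)
                handled += 1
    # Leaving the "with" block waits for all submitted replies to finish.
    selector.close()
    return handled


def demo(clients=3):
    pairs = [socket.socketpair() for _ in range(clients)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"hello {i}".encode())
    serve([server for _client, server in pairs], expected_requests=clients)
    replies = [client.recv(4096) for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

The number of threads stays fixed however many connections are registered, which is the resource limit being argued for above.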

    4. Re:Real World Example by saurik · · Score: 1

      Another model is SEDA ("Staged Event-Driven Architecture"). This was mainly examined by the guy who wrote what became Java's new java.nio package.

      Here's a link: http://www.cs.berkeley.edu/~mdw/proj/sandstorm/

      If I remember correctly, SEDA uses a few thread pools to handle clients at different stages of their work.

      From his website:
      "We have built a number of applications to demonstrate the SEDA framework. Haboob is a a high-performance Web server including support for both static and dynamic pages that outperforms both Apache and Flash (which are implemented in C) on a SPECWeb99-like benchmark. Other applications include a Gnutella packet router and Arashi, a Web-based email service similar to Yahoo! Mail."

  27. boxen. . . by Catskul · · Score: 2, Troll

    Could you please refrain from using "boxen". It makes my head hurt

    --

    Im not here now... Im out KILLING pepperoni
    1. Re: boxen. . . by Naikrovek · · Score: 1

      agreed, "boxen" is not a word. "Boxes" is a word.

      Non-words used as words is like, so 1999, man.

    2. Re: boxen. . . by Anonymous Coward · · Score: 0

      pfft..... j00s be jealous coz u ain't got no boxen in yo life

    3. Re: boxen. . . by BlacKat · · Score: 1

      Maybe you need to go visit dictionary.com?

      http://www.dictionary.com/cgi-bin/dict.pl?term=boxen&db=*

      Which has this entry from the jargon file:

      boxen /bok'sn/ (By analogy with VAXen) A fanciful plural of box often encountered in the phrase "Unix boxen", used to describe commodity Unix hardware. The connotation is that any two Unix boxen are interchangeable.

    4. Re: boxen. . . by spongman · · Score: 2
      Non-words used as words is like, so 1999, man.
      And so is using time as an adjective.
    5. Re: boxen. . . by BlacKat · · Score: 1

      How the hell was my post "overrated"?

      It was directly to the point at hand, that the word "boxen" is a valid "word" in use by many people.

      Yeesh.

    6. Re: boxen. . . by bsartist · · Score: 1

      Could you please refrain from using "Im" instead of "I'm?" It makes my head hurt.

      --
      Lost: Sig, white with black letters. No collar. Reward if found!
    7. Re: boxen. . . by heffrey · · Score: 1

      What I want to know is why use boxen rather than boxes?

    8. Re: boxen. . . by BlacKat · · Score: 1

      Got me, tho "boxen" just sounds better to me than "boxes". ;)

      It's also acceptable according to the Jargon File so hey... Boxen it is!

    9. Re: boxen. . . by pthisis · · Score: 2

      What I want to know is why use boxen rather than boxes?
      "boxes" refers to the physical objects (ie the cases and contents thereof)

      "boxen" refers to the notional servers.

      My Linux boxen could be retasked to be FreeBSD boxen, but they'd still be the same boxes.

      AFAIK, "boxen" was derived from "VAXen". And it was never "VAXes", that would brand you as computer-illiterate as quickly as saying "What's the http for that?" or using "PC" to mean "Windows box".

      Sumner

      --
      rage, rage against the dying of the light
  28. Re:It's cool! by Anonymous Coward · · Score: 0

    man, that chick is hot. Hot grits, anoyone?

  29. whoa! by RestiffBard · · Score: 4, Funny

    I have no idea what the hell you're talking about but it certainly sounds impressive. :)

    --
    - /* dead coders leave no comments */
  30. Great by C0D3X · · Score: 3, Funny

    Now we finally have the power to run 99,999 pop up ads when we visit that pr0n site

  31. Re:Windows comparison by pVoid · · Score: 3, Interesting

    Very interestingly enough, either windows has a quota, or some sort of memory leak or something...

    Max I can create in a process is 2031 threads... That being done in 700ms.

    It's odd cause I can create more if I run several processes. It doesn't look like the kernel is choking on thread creation...

    will investigate more.

  32. Possible use by captaineo · · Score: 2

    Normally I am of the "use only as many threads as CPUs" school of thought, but I can think of a reason to use 100,000 threads - imagine a large FTP server, or a multi-homed HTTP server, where you need to provide each connected user with his own set of access privileges or filesystem context. A one-thread-per-connection server may be the easiest way to build security into the system.

    1. Re:Possible use by vsync64 · · Score: 2, Informative

      Except that threads, as far as I am aware, share the same address space. Multiple processes need to arrange to share memory, and therefore are less likely to trample on one another or careen out of control.

      --
      TO BUY A NEW CAR WOULD MAKE YOU SEXUALLY ATTRACTIVE.
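A small Python sketch of the shared-address-space point: every thread sees and mutates the same object without any arrangement, which is exactly why they can trample each other unless access is locked. The counter, the lock and the thread count are illustrative choices, not anything from the article.

```python
import threading


def demo(thread_count=4, increments=1000):
    shared = {"counter": 0}   # one object, visible to every thread
    lock = threading.Lock()   # without this, updates could be lost

    def worker():
        for _ in range(increments):
            with lock:
                shared["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared["counter"]


if __name__ == "__main__":
    print(demo())
```

Separate processes would each get their own `shared` after a fork, so the same code would need explicit shared memory or message passing to produce one total.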
    2. Re:Possible use by jhines · · Score: 2

      Uber monster sim program, with a city full of residents, each run by a thread.

    3. Re:Possible use by bmajik · · Score: 2

      of course the penalty you pay for this is that fork() is expensive, and shared memory is a finite system resource. try the command "ipcs" on a sys-v type box.

      It is also generally the case that switching between processes is more expensive than switching between threads.

      to the parent poster : 1 thread per connection is a pretty naive way to do it, but its got advantages - simplicity. It's a moot point since on a stock OS you'd run out of socket descriptors long before you'd run into a thread-count maximum.

      --
      My opinions are my own, and do not necessarily represent those of my employer.
    4. Re:Possible use by be-fan · · Score: 4, Insightful

      "use only as many threads as CPUs"
      >>>>>>>>
      Then please stay away from my GUI apps. I hate those UNIX grognards that come from that school of thought, then try to code GUI applications with only one thread and end up with apps that can't update the GUI while doing I/O. On my 300 MHz PII, that particular trait made Galeon unusable. It had one rendering thread for all the tabs, so when I was loading a complex page like /. in another tab, whatever tab I was actually reading would freeze up.

      --
      A deep unwavering belief is a sure sign you're missing something...
    5. Re:Possible use by Nevyn · · Score: 1
      of course the penalty you pay for this is that fork() is expensive
      Not on Linux, fork() is about the same speed of thread creation ... by design.
      and shared memory is a finite system resource. try the command "ipcs" on a sys-v type box.
      Wrong again, you can use mmap() for file backed shared memory.
      It is also generally the case that switching between processes is more expensive than switching between threads.
      Somewhat, in that you can keep the TLB etc., but you have the cache synchronisation problems with threads that you don't get with processes (i.e. multiple threads write to the same point in memory, then it bounces around their caches). And of course pretty much everyone gets the locking wrong in threads, so the app either serialises on locks or goes really fast and stops.
      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
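A minimal sketch of file-backed shared memory between forked processes, written in Python for brevity and assuming a Unix-like system. Python's `mmap.mmap` maps with `MAP_SHARED` by default on Unix, so a write made by the child is visible to the parent. The temporary file, the single integer payload and the value 12345 are illustrative choices.

```python
import mmap
import os
import struct
import tempfile


def demo():
    with tempfile.TemporaryFile() as backing:
        # The file must be at least as large as the mapping.
        backing.truncate(mmap.PAGESIZE)
        # On Unix the default flags are MAP_SHARED, so the mapping is
        # shared with any child created by fork().
        shared = mmap.mmap(backing.fileno(), mmap.PAGESIZE)

        pid = os.fork()
        if pid == 0:
            # Child: store an integer in the mapping and exit immediately,
            # skipping the parent's cleanup code.
            struct.pack_into("i", shared, 0, 12345)
            os._exit(0)

        os.waitpid(pid, 0)  # make sure the child's write has happened
        value = struct.unpack_from("i", shared, 0)[0]
        shared.close()
        return value


if __name__ == "__main__":
    print(demo())
```

No System V segment is involved, so nothing shows up in `ipcs`; the waitpid call is doing the job a lock or semaphore would do in a longer-lived program.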
    6. Re:Possible use by captaineo · · Score: 2

      I hate those UNIX grognards that come from that school of thought, then try to code GUI applications with only one thread and end up with apps that can't update the GUI while doing I/O

      Then they are stupid. I/O can be done asynchronously in a single-threaded program; you just need to use non-blocking I/O (sockets) or AIO (disk).

      As for Galeon - the rendering code needs optimization. Rendering in a background thread is NOT going to help. (think: the socket connection to the X server is inherently serialized; if rendering is the bottleneck then the rendering code itself is the problem)
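A rough sketch of the single-threaded, non-blocking style, in Python. The loop asks `select` whether data is ready and returns straight away if not, so the same thread stays free to handle GUI events. The socket pair stands in for a network connection, and `poll_once` is a made-up name for illustration.

```python
import select
import socket


def poll_once(sock, timeout=0.0):
    """Return pending bytes, or None if nothing has arrived yet."""
    readable, _writable, _errors = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing to read: go back to handling UI events
    return sock.recv(4096)


def demo():
    ui_side, io_side = socket.socketpair()
    ui_side.setblocking(False)

    first = poll_once(ui_side)            # no data sent yet
    io_side.sendall(b"page data")
    second = poll_once(ui_side, timeout=1.0)

    ui_side.close()
    io_side.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

A GUI toolkit would normally call something like `poll_once` from its idle or I/O-watch hook rather than from a hand-written loop.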

    7. Re:Possible use by be-fan · · Score: 2

      I/O can be done asynchronously, but that forces an interrupt-driven model, which (IMO) is even more complex than a threaded model. As someone who started programming in the 90's (when threads were commonplace) and started GUI programming by playing around with BeOS, it seems much more natural to me to just have a thread that sleeps unless it's drawing or getting updated information about the window contents. AIO vs threads aside, the aversion to threads causes a major problem with GUI programming. Most current windowing systems tend to encourage multithreaded GUI apps (take a look at Win32, apps there use as many threads as BeOS apps ever did) while none encourage AIO. As a result, programmers who don't like threads end up not using *either* model.

      I don't know the details of the Galeon code, but it doesn't seem to be the connection to the X server that's holding it up. While it's loading a web page in one tab, it doesn't respond to events going on in another tab. Galeon could not possibly be spending all that time rendering (especially since Gecko is supposed to be so fast). What it seems to me is that the parsing and rendering and event handling are all occurring in one thread, so while the browser is parsing and loading one page, it can't render or respond to input events. Breaking the user-interaction into another thread would allow the GUI to respond while the page was loading.

      --
      A deep unwavering belief is a sure sign you're missing something...
    8. Re:Possible use by bmajik · · Score: 2

      maybe on linux - and maybe now. so does this announcement now mean that you can fork 100,000 processes in a matter of seconds as well ?

      (and what linux kernel lets you have 100k simultaneous processes ?)

      and given what i remember about fork, isn't it the case that you memcopy the entire address space of the forking process for each fork() (barring optimizations such as perhaps shared text segments) ?

      are you telling me that pthread_create() on _Every_ platform copies the entire process address space ? i dont think this is the case. a large orchestrated memcpy of course is a perf hit - one that afaik, forking required, and threading does not.

      Note that im not very well versed in how _linux_ threads work - i only know that they've always been " a bit different ".

      Cache synchronization in MT apps is hardly on the same scale as reading/writing to shared memory (or mmap()ed regions?!) If you demonstrate that your forked app works faster via mmap than a single-address space process with multiple threads writing to shared non-thread-local storage, on a platform that has a reasonable threads implementation (not necessarily linux, but i haven't followed linux's threading at all, honestly), i'll be pleasantly surprised.

      fork() has advantages. however, dismissing threading outright by claiming that fork() is equal or superior makes anything you claim dubious.

      re: thread switching vs process switching:
      this argument seems totally ridiculous. i can't possibly fathom how _any_ cache coherency solution that is thread-specific is time-comparative with flushing and reloading the tlb, flushing and reloading _all_ caches, and so on. and cache coherency has been well attacked in designs like the SGI O3k. Are you telling me that a cache-line or two is going to be _less_ efficient than dumping all caches and tlbs ?

      Locking isn't as bad as you claim. And the same fundamental problem(s) exist w.r.t. sharing resources whether you're talking threads or processes.

      --
      My opinions are my own, and do not necessarily represent those of my employer.
    9. Re:Possible use by captaineo · · Score: 2

      Although I disagree that threaded programs are inherently easier to write than state machine programs, I can understand your point of view. It's probably a matter of personal preference and experience. (yes, Win32 and BeOS both strongly encourage a multithreaded style of programming, but I don't think that decision was made purely on technical merit. Multithreading was the "cool" thing to do back when these APIs were invented...)

      If you need an example of AIO in a GUI application - consider Netscape 4 or Mozilla on Mac OS 9. These programs are multithreaded on most operating systems, but since Mac OS 9 has weak support for threads, the Netscape runtime simulates multiple threads using coroutines and AIO. (this is done transparently to code running on top of the runtime, so it still appears to be written in the multithreaded model...)

      I use Mozilla and I do notice that it sometimes becomes unresponsive to input while loading a page. But when I boot into Windows, Internet Explorer has no problem loading the same pages without noticeable delays. (not only are other IE windows not blocked, but the one that's loading isn't blocked for long either - so it can't be IE doing all the work in a background thread). I blame inefficient code in Gecko... Pushing the work into a background thread is only a band-aid; it's like pushing a bubble sort algorithm into a background thread when you could just use quicksort and get the same work done much faster.

    10. Re:Possible use by be-fan · · Score: 2

      It probably is a matter of personal preference. I like multithreading because if you look at each thread separately, each one by itself is linear. With state machines, everything is together, but it's non-linear.

      The thing with AIO and multithreading is that it isn't really faster, but it *seems* faster (to the user). User-interface code takes very little CPU power, but it's extremely time sensitive. Doing UI handling asynchronously prevents any sort of background work (efficient or not) from influencing the speed of the UI.

      --
      A deep unwavering belief is a sure sign you're missing something...
    11. Re:Possible use by vsync64 · · Score: 1
      and given what i remember about fork, isn't it the case that you memcopy the entire address space of the forking process for each fork() (barring optimizations such as perhaps shared text segments) ?

      Ever hear of copy-on-write? This kind of technique is why that's a big deal. The point of that, if I understand correctly, is that while the program thinks its memory space has been copied, it's still in the exact same place until it actually tries to write to that memory. At that point, I imagine the system would still only copy the page being written to.

      This is where Linux memory allocation used to fall down under stress sometimes (I'm pretty sure I saw an article about them fixing it recently). Programs could allocate all sorts of memory, but until they tried to write to it, the memory wouldn't be taken out of free space. If software suddenly starts claiming nonexistent memory it was promised, the system runs out of memory and has to kill things off.

      --
      TO BUY A NEW CAR WOULD MAKE YOU SEXUALLY ATTRACTIVE.
    12. Re:Possible use by imroy · · Score: 1
      ...isn't it the case that you memcopy the entire address space of the forking process for each fork()...

      No. There's this little trick called Copy On Write (COW). It uses the MMU and works like this.

      1. Process fork()'s
      2. Kernel sets up the child process.
      3. Instead of copying the whole memory space over, it simply points the child's pagetable entries at the parent's pages. Both the parent's and child's pages are also marked as read-only (this is the important part).

      Now, when either process goes to write to data in their memory spaces:

      1. An exception is raised because the page is marked as read-only and the CPU jumps into the MM code of the kernel.
      2. The kernel now does the copy for real, but only of that page.
      3. The parent and child now have their own separate copy of this page of data that they can work with.

      Tada! Data is only copied when it's needed. Depending on the program involved, this may be a lot or a little. For programs that will immediately perform an exec() after the fork, vfork() gives even better performance. Under Linux (according to the vfork(2) man page), vfork doesn't even copy the pagetables and it suspends the parent until an exec() is performed. It's kind of a hack, but it saves some work when all you want to do is spawn a new program. Which isn't all that uncommon in the Unix/Linux environment.
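The visible effect of the steps above can be sketched from user space, here in Python on a Unix-like system. After `fork()` the child overwrites a buffer, and the parent's copy stays untouched. This only demonstrates the semantics (each side ends up with a private copy of what it writes); it does not measure how many pages get copied, and an interpreter touches plenty of pages on its own. The buffer contents and the pipe used to report back are illustrative.

```python
import os


def demo():
    data = bytearray(b"parent")
    read_end, write_end = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the page.
        os.close(read_end)
        data[:] = b"child!"
        os.write(write_end, bytes(data))
        os._exit(0)

    # Parent: collect the child's view, then look at our own buffer.
    os.close(write_end)
    child_view = os.read(read_end, 64)
    os.close(read_end)
    os.waitpid(pid, 0)
    return bytes(data), child_view


if __name__ == "__main__":
    print(demo())
```

The parent still sees its original bytes even though the child changed "the same" variable, which is what the read-only marking and page fault buy you.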

    13. Re:Possible use by rossy · · Score: 1

      Hear, hear! I think the fear is that software race conditions will be a problem. I've worked with "LOTS" of GUI apps that can't keep up with my typing speed. In an earlier life I found that the X-windows distribution was limiting the xterm baud rate to 9600 baud! (Upping the rate made stuff scroll by faster in SUNOS). I hate it if I have to wait more than 3 seconds for ANY screen repaint, and if ever the APP can't keep up with my typing speed. Someday I would hope that with 2GHz clock cycles of 500ps, I can get a couple of characters fed into the CPU! I don't think the real problem is processor speed, or CPU/MEMORY bandwidth. In most cases, it is priorities. We all want to be listened to. I think that multi-threading has to be the answer to this problem... a software agent and GUI signalling setup that tells you that the CPU has the info, and is working on it.

      --
      Ross Youngblood
  33. Apache 2.x will fly... by Anonymous Coward · · Score: 0

    3 step plan:
    1. write multithreaded web server
    2. ???
    3. PROFIT!

  34. Gary Kasparov by Raven42rac · · Score: 1

    so this means Gary Kasparov can get beat at chess that much faster now?

    --
    I hate sigs.
  35. Threads? What about processes? by WetCat · · Score: 1

    It's much more interesting to have so many processes,
    not threads, in a UNIX-like system...
    Leave threads for those Window-ers...

    1. Re:Threads? What about processes? by Mr+Z · · Score: 1

      This is Linux. In Linux, the difference between a thread and a process is more-or-less whether the VM space is shared or copy-on-write. That's pretty much it as I understand it.

      Starting 100000 processes, therefore, seems like it should be doable, so long as there aren't so many "dirty" pages in the forked executables that the system thrashes itself to death handling the COW pages. This is where having a 64-bit system and a lot of RAM would help you. (You need the 64-bit system to address the RAM sanely. The whole "highmem" kludge on x86 is just that, a kludge.)
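
      A rough way to see the "shared versus copy-on-write" distinction from user space is sketched below. It is an illustration only: the names counter, bump_in_thread and bump_in_child are invented for this example, and it observes the semantics rather than the page-level mechanism.

```python
import os
import threading

counter = [0]  # lives in the address space of the current process

def bump():
    counter[0] += 1

def bump_in_thread():
    # A thread shares the creator's memory, so its write is visible here.
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter[0]

def bump_in_child():
    # A forked child gets copy-on-write pages: its write lands in a
    # private copy and the parent's value is left untouched.
    pid = os.fork()
    if pid == 0:
        bump()
        os._exit(0)
    os.waitpid(pid, 0)
    return counter[0]

if __name__ == "__main__":
    print("after thread:", bump_in_thread())  # counter went up
    print("after child: ", bump_in_child())   # counter unchanged
```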

      --Joe
    2. Re:Threads? What about processes? by Anonymous Coward · · Score: 0

      If I ever heard any of my coders say anything like that, they'd be on the fast track to the back exit.

      That was a truly ignorant comment.

  36. This isn't for everyone, though by Anonymous Coward · · Score: 1, Interesting

    There was a patch for an O(1) scheduler a while ago. What this means is that it takes the same amount of time to select what runs next, no matter how much is running. But you won't notice an improvement unless you have about 200 processes running at the same time. This may be good for servers and the like, but it's a lot slower if you have few processes running. Keep this in mind...
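
    For readers wondering how "same amount of time no matter how much is running" can work, here is a simplified sketch of the usual idea: one queue per priority plus a bitmap of non-empty queues. This is not the kernel's code; the class name RunQueue and the figure of 140 priority levels are assumptions made for illustration.

```python
from collections import deque

NUM_PRIOS = 140  # assumed number of priority levels, for illustration only

class RunQueue:
    """One FIFO per priority plus a bitmap marking the non-empty ones."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set when queues[p] has a runnable task

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Lowest set bit = best (numerically lowest) non-empty priority.
        # The work here depends on NUM_PRIOS, not on the number of tasks.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[prio]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << prio)
        return task

if __name__ == "__main__":
    rq = RunQueue()
    rq.enqueue("editor", 5)
    rq.enqueue("irq-helper", 1)
    print(rq.pick_next())
```

    Picking the next task never walks the list of runnable tasks, which is why the cost stays flat as the task count grows.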

  37. Threads? by SashaM · · Score: 1

    I thought Linux didn't have real threads, and they were implemented as processes... Am I missing something?

    1. Re:Threads? by Anonymous Coward · · Score: 1, Informative

      A thread is a "light weight process". Threads have many definitions and types.
      In Java's green threads, the JVM has 1 process and many user-space threads.
      In Solaris, there are actually 2 types of threads: a kernel and a user-space thread.

      Linux has had fast process creation and context switching, so Linus chose to implement kernel threads in terms of processes (clone is the creation function). Pthread is simply a POSIX API wrapper for a Linux thread.

      While there are many thread libraries, pthread is the main one now. NGPT (Next Gen Posix Thread from IBM) was being positioned to replace pthread with an M:N implementation. It is heavier than the current pthread, but with the ability to create both kernel and user-space threads.

      Redhat's library may actually be the replacement just due to its simplicity (read this as fast and less buggy).

  38. Re:Alternative headline by Dahan · · Score: 4, Informative
    Gigantic performance problem in Linux code fixed after several years of "many eyes" scanning over it.

    Uh, why did that get moderated as a troll? Oh, right, Linux is absolutely perfect, and anyone who says otherwise must be a troll.

    Come on, Linux's scheduler has long been known to have performance problems once you have a lot of processes/threads... for example, read this paper [text version] (appropriately subtitled "How I Learned to Love the Alpha and Hate the Scheduler"):

    0.8.1 Create a fixed priority scheduler.
    Currently, the Linux scheduler is very different than the traditional Unix schedulers. Although the Linux scheduler is very efficient when only several processes are running, it is not scalable. In order to match the performance of *BSD and other Unices, another scheduling algorithm must be used.
    Moderators, don't be Slashbots, moderating according to the groupthink. Educate yourselves, and you'll be better moderators, and better people.
  39. no more slashdot effect? by at10u8 · · Score: 1

    I suppose this means that sites will want to switch to Linux/Apache in order to avoid being incapacitated when linked by Slashdot?

  40. Re:Windows comparison by Courageous · · Score: 4, Informative

    Every thread uses a minimum of *1 PAGE* of reserve memory for its stack, which is 64K. However, you have to go out of your way to use less than 1 megabyte of reserve memory. Since only 2GB of reserve memory (addressable memory) is available to user applications, this would fit your 2000 thread figure like a glove.
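
    The arithmetic behind that "2000 threads" figure can be checked quickly. The sketch below takes the numbers in this comment at face value (2GB of user address space, 1MB default reserve, 64K minimum) and ignores everything else that lives in the address space, so the results are upper bounds; max_threads is a made-up helper name.

```python
USER_ADDRESS_SPACE = 2 * 1024**3   # 2 GB available to the application
DEFAULT_RESERVE = 1 * 1024**2      # 1 MB of stack reserve per thread
MINIMUM_RESERVE = 64 * 1024        # 64 KB, the minimum quoted above

def max_threads(address_space, reserve_per_thread):
    """Upper bound on threads if stacks were the only thing mapped."""
    return address_space // reserve_per_thread

if __name__ == "__main__":
    print(max_threads(USER_ADDRESS_SPACE, DEFAULT_RESERVE))   # 2048
    print(max_threads(USER_ADDRESS_SPACE, MINIMUM_RESERVE))   # 32768
```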

    C//

  41. nice, but... by g4dget · · Score: 4, Interesting

    It's nice that the Linux kernel can handle that many threads. But user level threads generally are even more lightweight, and high performance implementations like those on Solaris provide both user level and kernel level threads and map the former onto the latter. Is Linux going to get something similar? Is Sun perhaps donating their implementation? Or are these new kernel threads so lightweight and quick that they are competitive with Solaris on their own, without the mess and complication of adding user level threads?

    1. Re:nice, but... by Magnus+Reftel · · Score: 4, Informative

      According to a mail from Ingo Molnar halfway down the linked article, M:N threading doesn't really solve the real problem - it's good at switching back and forth between running threads, but the real reason for having very large amounts of threads (be they kernel or user space threads) to begin with, is to do IO, and for that, there is no real advantage of user space threads.

      More info on the 1:1 vs M:N issue can be read in the white paper

      --
      print "Yet another p{erl,ython} hacker\n",
    2. Re:nice, but... by g4dget · · Score: 2

      Thanks for the pointer. Sounds like they went with 1:1 for a good reason. I always thought of M:N threading as kind of a kludge and not entirely trustworthy anyway (scheduling and I/O become rather iffy).

    3. Re:nice, but... by ProtonMotiveForce · · Score: 0

      Wow. So this must be some kind of amazing innovation, since almost every real operating system maker went with a real thread implementation using both user and kernel threads.

      Ingo Molnar is, I assume, the smartest person in the Universe, and his reasoning instantly redecides the whole issue that many smart people at multiple large OS companies had thought they'd mastered?

      Linux - AIX II. "We know best, we'll redesign Unix from scratch!". In the meantime, as a developer I'll take HP-UX or Solaris anyday.

    4. Re:nice, but... by Anonymous Coward · · Score: 0

      Actually, Solaris 8 offers and Solaris 9 makes default a 1:1 threading model. Sun has finally realized that M:N does not work well in any process where each thread does something significant, like I/O.

      On Sun's larger boxes processor affinity and cache locality are performance problems. 1:1 threading lets processes use more CPUs without special tricks, but processor sets can be used to limit thread migration and resource usage. Blocking on I/O without stopping any threads that are ready to run is much easier in 1:1 mode.
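
      The last point, one thread blocking on I/O while others keep running, can be sketched as follows. This is only a small illustration, assuming CPython on Linux, where each threading.Thread is backed by its own kernel thread; the function name demo is invented here.

```python
import os
import threading

def demo():
    r, w = os.pipe()
    results = {}

    def reader():
        # Blocks inside the kernel until somebody writes to the pipe.
        results["data"] = os.read(r, 16)

    def worker():
        # Independent work that should not be held up by the reader.
        results["sum"] = sum(range(1000))

    t_reader = threading.Thread(target=reader)
    t_worker = threading.Thread(target=worker)
    t_reader.start()
    t_worker.start()

    t_worker.join()                      # completes while reader waits
    still_blocked = t_reader.is_alive()  # nothing written yet, so True

    os.write(w, b"ping")                 # now release the reader
    t_reader.join()
    os.close(r)
    os.close(w)
    return results, still_blocked

if __name__ == "__main__":
    print(demo())
```

      With a 1:1 model the kernel knows about each thread, so parking one in a blocking read() needs no help from a user-space scheduler.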

  42. How will this affect Mozilla, OpenOffice... by 3770 · · Score: 4, Interesting

    How will this change affect Mozilla, the Sun JVM and OpenOffice, for instance.

    While it probably is generally true that it will take some time for most applications to start using the new threading model some larger applications could support it fairly soon.

    Can we expect these applications to be adapted to the new threading model some time soon, and how will it affect performance?

    --
    The Internet is full. Go Away!!!
    1. Re:How will this affect Mozilla, OpenOffice... by madmarcel · · Score: 1

      Sun JVM?? Hmmm...

      That reminds me of a little Java game I wrote for an assignment last year. It was a cheap rip-off of Zelda. Being a lazy sod, I'd set it up such that each monster on the screen had its own unique 'thread' + a thread for the player + a thread for the 2D buffering.
      <<Did anyone mention lamos writing monster applications? ;P >>

      It ran reasonably well, but I did notice that on my triple-boot machine (Win98, linux & winNT - yes, I'm a sicko :) the game ran at very different speeds depending on which OS I used...linux being the slowest of the lot :(
      Not sure it has anything to do with the thread-handling or perhaps the way linux handles graphics or (more likely) the JVM being optimized differently/more/less for each OS :\

      <<Resists temptation to dust off old java game and start hacking>>

    2. Re:How will this affect Mozilla, OpenOffice... by Anonymous Coward · · Score: 0

      The Sun JVM on Linux is really low performance. Linux is cross-compatible with POSIX threads as used on Slowlaris via the pthreads library, but native non-POSIX Linux threads created with the clone() syscall way outperform the inefficient emulation layer. I never bother with a threading library, personally; I just code for the Linux clone() system call. Unfortunately, the Sun JVM on Linux is nearly a straight recompile of the Slowlaris sources, so Java threading on 2.4.x Linux is abysmal.

    3. Re:How will this affect Mozilla, OpenOffice... by egghat · · Score: 1

      Interesting.

      What can one do about it? Use blackdown's JVM for example?

      TIA for your answers.

      Bye egghat.

      --
      -- "As a human being I claim the right to be widely inconsistent", John Peel
    4. Re:How will this affect Mozilla, OpenOffice... by Anonymous Coward · · Score: 0

      I don't think vi will run very much faster.

    5. Re:How will this affect Mozilla, OpenOffice... by alext · · Score: 2

      Threading performance may be poor on Linux, though personally I haven't noticed it; other aspects are fine. I'd say that big Java applications start up about 20% faster on my Linux partition than on my Windows one using Sun JVM 1.3.1_04.

      In fact, Solaris LWP threading has caused me more headaches - it seems that the old N:M thread model can deadlock with native libraries such as the Oracle OCI drivers, theoretically using the alternate 1:1 model fixes this but I haven't yet proved the case to my own satisfaction. Read up on this new model here, or try it by putting /usr/lib/lwp in your LD_LIBRARY_PATH.

    6. Re:How will this affect Mozilla, OpenOffice... by Wesley+Felter · · Score: 2

      There is no new threading model. The thread APIs are the same, so when you install the right kernel and glibc all apps will benefit.

    7. Re:How will this affect Mozilla, OpenOffice... by Anonymous Coward · · Score: 0

      Nothing. Blackdown's JVM is where sun got their linux port. IBM's JVM is marginally better on linux, but it's still basically the same sources, slightly tweaked. Kaffe remains hopeless.

      Native compilation with GCJ will get good performance.

      Just switch to a better language system, I guess - Try Xanalys LispWorks on Linux, it's good, and much more capable than Java. Expensive though.

    8. Re:How will this affect Mozilla, OpenOffice... by psamuels · · Score: 1
      There is no new threading model. The thread APIs are the same, so when you install the right kernel and glibc all apps will benefit.

      True, but note that if they just drop this stuff straight into glibc there could be some application incompatibilities. Ulrich Drepper explicitly mentions this: if your app relies on (read: works around) any of Linux's former deviations from the POSIX thread standard, it will break now that glibc pthreads are POSIX-compliant.

      At the prospect of breaking ABI compatibility (something the glibc developers are loath to do - glibc 2.0 was supposed to be the last non-backward-compatible C library for Linux) - I imagine they will tread rather carefully when it comes to "transparently" replacing Linuxthreads with NPTL.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  43. Great... Now every lamer with no design knowledge by Alex+Belits · · Score: 2

    ...will start writing horrible monsters running hundreds and thousands of threads, and their creations will suffer from all other shortcomings of that decision.

    --
    Contrary to the popular belief, there indeed is no God.
  44. I am appalled. by Anonymous Coward · · Score: 0

    That is just sooo typical. Just when I thought I'd be superl33t and, like, "underground" by trying a more minimalistic approach installing Win 3.11 on my superfly dual P4, you guys go ahead and spoil it all by telling me something about 100.000 concurrent threads on something called "Linux", which at least to me seems to be some kind of binary Viagra? You ruined all my plans for world domination. You guys are no fun.

    But rest assured, I'll try again - with Win95.

  45. In light of recent news.... by Gyorg_Lavode · · Score: 1
    the development kernel will now be nicknamed "Heroin Spider".

    (I'm sorry. I had to do it.)

    --
    I do security
  46. Re:Great-RH strikes again. by Anonymous Coward · · Score: 0

    And the nice thing is that the improvement was brought to us by that company everyone loves to hate, no not microsoft, but Red Hat.

  47. How *I* got kicked out of the computer lab by Naikrovek · · Score: 3, Funny

    I ran this in DOS:

    prompt "Enter Password:"

    No one could figure out that all i did was change the prompt from "$P$G" to that, and everyone was asking what the password was. haha, good old teacher was infinitely frustrated as well! IT WAS BEAUTIFUL.

    I got kicked out for a year (not beautiful).

    1. Re:How *I* got kicked out of the computer lab by kasperd · · Score: 0, Offtopic
      I ran this in DOS: prompt "Enter Password:"

      I remember in highschool on most of our DOS based computers some joker had changed the prompt. Not as sophisticated as yours though, he had just replaced the prompt with a lot of nasty words. Since I didn't find that funny, and I actually would like to use the computers, I did a minor trick to the setup. At the end of AUTOEXEC.BAT I would call another .BAT file instead of the menu. The other .BAT file did three things:
      1. Set the prompt the way I liked it.
      2. Load a 16byte TSR program to disable the beeps.
      3. Run the menu as usual.
      The joker never found out where I had put the new prompt. Of course I was a little worried that some teacher would find out I had messed with the configuration. Even more worried I got when a teacher one day asked if I was the one who had installed the TSR program. Turned out he just wanted to ask for permission to use it on his own computer as well. *Phew*.
      --

      Do you care about the security of your wireless mouse?
    2. Re:How *I* got kicked out of the computer lab by Monkelectric · · Score: 0, Offtopic
      My god, that's the best thing I've ever heard! Mod this guy up! :) (And) lemme tell a couple of stories :)

      Back in junior high I had this little TSR program that would make letters (in text mode) drop off the screen. They would fall until they landed on another letter; if not, they would drop right off the screen ... this was back in the day when you had a dual floppy system (one disk for the OS, one for your applications). We jury-rigged as many OS disks as we could to load 4 copies of the program when it booted ... Long story short, as students started to complain letters were falling off the screen, the teacher flipped and couldn't figure out WTF was going on :D You could see the panic in his weasel face. This was right around the time of the Michelangelo virus, so eventually my buddy and I suggested it was a virus and that we would take the disks home and disinfect them ... we undid our dirty work and laughed about it for the rest of the year :)

      My favorite story though: my high school had *1* computer in the library. It was a dog-slow 286, and it ran some microfiche searching software and some CD-ROM based article. It had this crap-ass DOS menu program that supposedly locked you out of the OS. One day it segfaulted and gave the location of the program (**EXCEPTION IN C:\MENU\WHATEVER.EXE). Ahhhh, a DOS prompt! But what could I do? ... nothing really except run deltree, and that's kind of cruel. So I thought for a while, and well, I knew the location of the EXE ... so I whipped up a little boot disk that zipped up a copy of the menu directory and then rebooted ... I walked up to the computer, put the disc in, toggled the power, and walla, a copy of the menu program :) I had hoped I could glean the password from the datafiles, but no such luck. So I started to think, what COULD I do ... well, long story short, there was this program called goldwave (I think) that would attach a MOD+player to an exe (for suck-ass demo makers) and execute them in parallel. Clearly with a copy of the EXE and goldwave, evil was the only viable option. I picked this GOD AWFUL mod file of some screeching followed by more screeching and drums [those who were around in the day must remember how truly awful mods could be] :) I attached the mod+player to the exe and modified the batch file that executed the menu program to copy a fresh copy of my singing menu each time the menu program was started, but only after the SECOND time the menu was run (this was real important). I made another boot disk that renamed the old exe and copied in my new batch files/mods.

      So I went to the machine, put my disk in, toggled the power again, and it installed my modified menu exe. Well, remember I said it started singing the SECOND time the menu was run? That was so I could be nowhere near the machine when it started singing. Sure enough, the next morning when the machine was turned on for the day, it started caterwauling out of its PC speaker. I wasn't there, but I was told it was quite a spectacle and that the librarians flipped out completely :) They never could prove who did it, but it was pretty obvious I was the one.

      Ok, story time is over, continue your trolling :)

      --

      Religion is a gateway psychosis. -- Dave Foley

    3. Re:How *I* got kicked out of the computer lab by Anonymous Coward · · Score: 0

      I love you.

      Reminds me of the day I changed the strings "Male", "Female" to "Big" and "Small" in a genetics teaching program.

      Sex: Big 63%
      Small 37%

      Okay. I was young. Your hack is way better.

    4. Re:How *I* got kicked out of the computer lab by bunratty · · Score: 0, Offtopic
      I got started with computers in the early 80's on a PDP-11. Once I found a program that converted files from uppercase letters to lowercase letters. The interesting thing is that it had the supervisor bit set so that the program would have supervisor privileges while running. I could destroy most compiled software by "lowercasing" the "uppercase" data in them. Lucky for them I didn't know which files were part of the OS!

      Then there was the time I told someone the admin's password was "zapper". I watched the keys he typed as he entered his password!

      --
      What a fool believes, he sees, no wise man has the power to reason away.
    5. Re:How *I* got kicked out of the computer lab by Anonymous Coward · · Score: 0

      Perhaps you should have spent more time in the library, then you wouldn't make yourself look such a fool by saying "walla".

    6. Re:How *I* got kicked out of the computer lab by Anonymous Coward · · Score: 0

      Perhaps you should stop being an asshole and realize it's just a fucking sound effect?

    7. Re:How *I* got kicked out of the computer lab by Anonymous Coward · · Score: 0

      Er, no, it's not. It's a word. In French. "voila".

  48. Re:keep ignoring qjkx by Anonymous Coward · · Score: 0

    Your days of being ignored are over, dude.

  49. big deal by leomekenkamp · · Score: 4, Funny

    100.000 threads? What nonsense; everybody knows that no computer would ever use more than 640.

    --
    Wenn ist das Nunstueck git und Slotermeyer? Ja! Beiherhund das Oder die Flipperwaldt gersput.
    1. Re:big deal by sg_oneill · · Score: 2

      wry. verry wry. *g* (mod it up)

      --
      Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
    2. Re:big deal by Anonymous Coward · · Score: 0

      I wondered if someone was gonna post that comment. cuz when i thought about posting it myself i figured it would get modded as flamebait or such. Good luck, leomekenkamp.

    3. Re:big deal by leomekenkamp · · Score: 1
      That comment was definitely posted as a joke.

      I even wonder who would consider it flamebait...

      --
      Wenn ist das Nunstueck git und Slotermeyer? Ja! Beiherhund das Oder die Flipperwaldt gersput.
  50. Hooray for fixing the dynamic linking problem! by Foresto · · Score: 2, Interesting
    It looks like speed isn't the only improvement they've made with this library. From the notes:

    " - - libpthread should now be much more resistant to linking problems: even if the application doesn't list libpthread as a direct dependency functions which are extended by libpthread should work correctly."

    This ought to be a big help for those of us who write plug-in modules for servers like Apache 1.x and PHP. The existing thread library doesn't work properly unless the program executable explicitly links to it, which means that my shared libraries can't take advantage of standard thread management such as pthread_atfork().
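
    For anyone unfamiliar with what pthread_atfork() provides: it registers handlers that run just before fork() and just after it, in the parent and in the child. The sketch below uses Python's os.register_at_fork() purely as an analogy, since it offers the same three hook points; it is not the C API, and the names events and fork_once are made up here.

```python
import os

events = []

# Hooks run around every later fork(): "before" in the parent just before
# forking, then one "after" hook on each side of the fork.
os.register_at_fork(
    before=lambda: events.append("prepare"),
    after_in_parent=lambda: events.append("parent"),
    after_in_child=lambda: events.append("child"),
)

def fork_once():
    """Fork once; report the parent's hook log and whether the child's matched."""
    events.clear()
    pid = os.fork()
    if pid == 0:
        # Child: its private copy of the log should read prepare, child.
        os._exit(0 if events == ["prepare", "child"] else 1)
    _, status = os.waitpid(pid, 0)
    child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return list(events), child_ok

if __name__ == "__main__":
    print(fork_once())
```

    A plug-in typically uses hooks like these to take its locks before the fork and release or reinitialise them afterwards, which is why it matters that the hooks work even when only the plug-in links against the thread library.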
  51. Real benefit, only time will tell by Anonymous Coward · · Score: 0
    Can anyone point to any paper which offers any proof that "threads" are of any value in general purpose application development? I suspect not. That would be equivalent to demonstrating that there is a general purpose way to parallelize an algorithm. And if that were the case, threads would likely only be useful in a correctly designed MP environment.

    As far as I can tell, the current use of "threads" mostly boils down to a faster way to fork(). From an algorithmic point of view, not all that interesting.

    In any case, hats off to the Linux developers for filling out the features checklist.

  52. Does this help Apache 2.x by mustprotectdata · · Score: 2, Interesting

    Given that Apache 2.x can utilise threads as well as processes, does this mean that you can configure a large web server with, say "MaxSpareThreads 1000000" so that you can cope when you're slashdotted ;-)?

  53. lwp by Anonymous Coward · · Score: 0

    Sun's user level threads package has been available for years as the Sun lwp library. It is very fast and very portable. Use it with poll() or select() and some state machine glue, and you can implement almost any "threaded" algorithm which you can imagine.
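
    The "select() plus state machine glue" approach can be sketched without any thread library at all. The example below is a minimal illustration using Python's select module; the LineEcho class and run function are invented for this sketch, and each connection's "thread" is just an object that remembers how far it got.

```python
import select
import socket

class LineEcho:
    """Per-connection state machine: collect one line, echo it uppercased."""

    def __init__(self, sock):
        self.sock = sock
        self.buf = b""
        self.done = False

    def on_readable(self):
        chunk = self.sock.recv(4096)
        if not chunk:              # peer closed before sending a full line
            self.done = True
            return
        self.buf += chunk
        if b"\n" in self.buf:
            line, _, self.buf = self.buf.partition(b"\n")
            self.sock.sendall(line.upper() + b"\n")
            self.done = True

def run(machines, timeout=1.0):
    """Single-threaded event loop: wake whichever connection has data."""
    pending = {m.sock: m for m in machines}
    while pending:
        readable, _, _ = select.select(list(pending), [], [], timeout)
        if not readable:           # nothing arrived within the timeout
            break
        for sock in readable:
            machine = pending[sock]
            machine.on_readable()
            if machine.done:
                del pending[sock]

if __name__ == "__main__":
    ours, theirs = socket.socketpair()
    theirs.sendall(b"hello\n")
    run([LineEcho(ours)])
    print(theirs.recv(100))
```

    The cost of this style is that every blocking step has to be turned into explicit state, which is the trade-off the rest of this thread argues about.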

  54. 100000?! by thelexx · · Score: 3, Redundant

    640 should be enough for anybody!

    LEXX

    --
    "Gold still represents the ultimate form of payment in the world." - Alan Greenspan, 1999
    1. Re:100000?! by WaKall · · Score: 0, Offtopic

      Mod parent up :) I never have mod points when I need them.

  55. Re:Hi there! by Anonymous Coward · · Score: 0

    May I have a cheesecake?

  56. Group think by Subcarrier · · Score: 1, Offtopic

    In any large group of people you will find a few idiots, a few luminaries, and a great number of average thinkers. Sometimes the only thing that separates idiots from luminaries is their lack of social grace. Welcome to democracy.

    --
    "I have opinions of my own, strong opinions, but I don't always agree with them." -- George H. W. Bush
  57. but will it know to presoak my wash? by Rooked_One · · Score: 1

    Or perhaps know which part of the banana peel is the good part to smoke? =) GOOD JOB OPEN SOURCE!!! KEEP AT IT!

  58. WIPO, is that you? by Anonymous Coward · · Score: 0

    We've missed you man. Welcome back. Have some nice hot grits.

  59. MOD PARENT UP by Anonymous Coward · · Score: 0

    That was an interesting read.

  60. I've heard somewhere that... by Anonymous Coward · · Score: 0

    640 threads must be enough for everyone.

    1. Re:I've heard somewhere that... by Anonymous Coward · · Score: 0

      yes, the posts above that already made the same joke.

  61. Re:Native threads... man clone() by Anonymous Coward · · Score: 0

    don't forget that everyone can use the very simple and efficient native linux threads using the clone() sys call (see man clone)
    since there is less overhead than using the more complicated Posix API, clone threads will always be faster.

  62. good job by Skal+Tura · · Score: 0

    Good job Ingo Molnar! Fantastic performance rise, but how about stability: is it less or more stable? I hope it's at least the same level of stability.

  63. 100,000 parallel threads by inode_buddha · · Score: 1


    This ought to make RedHat, Dell, IBM, and Oracle very happy, given a few of the newer contracts with large retailers using Oracle's back-end... if you read the article closely you notice that RH takes the claim for sponsoring a bunch of the work involved in developing this.

    --
    C|N>K
  64. Wow! by mnordstr · · Score: 2

    Combine this with Apache2's Multi-threaded or Hybrid MPM and you'll have a heck of a web-server!

  65. the linux c10k problem - solved? And Java? by Anonymous Coward · · Score: 1, Interesting
    Does this solve the c10k problem? As I can start a thread for every socket? See the C10K problem

    And does this mean the Java will start to really scale on linux?

    1. Re:the linux c10k problem - solved? And Java? by Anonymous Coward · · Score: 0

      Right, responding to my own question ;) I forgot that every thread requires real RAM for stack. Which means 100K threads is not that practical and the c10k problem still remains unsolved. At least for the average server.
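
      To put rough numbers on the stack concern, here is the back-of-the-envelope arithmetic. The per-thread stack sizes (8 KiB and 2 MiB) are assumptions picked for illustration, not figures from the article, and total_stack_mib is a made-up helper.

```python
def total_stack_mib(threads, stack_kib):
    """Total stack space in MiB for a given per-thread stack size in KiB."""
    return threads * stack_kib / 1024

if __name__ == "__main__":
    # Assumed sizes: a tiny 8 KiB stack versus a roomy 2 MiB stack.
    print(total_stack_mib(100_000, 8))      # 781.25 MiB
    print(total_stack_mib(100_000, 2048))   # 200000.0 MiB
```

      How much of that is real RAM rather than reserved address space depends on how many stack pages each thread actually touches.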

    2. Re:the linux c10k problem - solved? And Java? by Mr+Z · · Score: 1

      And why, perchance, does it remain unsolved without actual investigation?

      There is an issue of timeslice granularity -- each thread will only get a timeslice occasionally when there are many threads. So, you'll need to crank up HZ a bit and/or service multiple clients per thread.
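
      A quick worked example of the granularity issue, under deliberately crude assumptions: strict round-robin, every thread runnable, and each one using its whole slice. The HZ values of 100 and 1000 are illustrative, and seconds_between_turns is a made-up helper.

```python
def seconds_between_turns(runnable_threads, hz, ticks_per_slice=1):
    """Time for a full round-robin pass if every thread uses its whole slice."""
    return runnable_threads * ticks_per_slice / hz

if __name__ == "__main__":
    print(seconds_between_turns(10_000, hz=100))    # 100.0 seconds
    print(seconds_between_turns(10_000, hz=1000))   # 10.0 seconds
```

      Real threads mostly sleep waiting for I/O rather than burning full slices, so this is a worst case, but it shows why a higher HZ or several clients per thread helps.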

      The main thing, though, is that if you consider that the threads share their VM (that is, their program code and static data), you could probably do 10000 clients without too much heartache. The "COW" space (the amount of data that's private due to "copy on write") is the bounding factor.

      Now, 10000 clients with 100% dynamic content? Ha. Too many dirty pages. 10000 clients with largely static content? Sure -- just sendfile it all. :-)

      --Joe
    3. Re:the linux c10k problem - solved? And Java? by Anonymous Coward · · Score: 0

      You are missing the point baby.

      Did you read the c10k page? The point is listening on sockets, that is what is complicated and does not scale on current implementations.

      In theory we maybe could create 50K threads with very little (couple of kb-s) of stack space and make their only task to listen sockets, passing any incoming events on to "worker" threads. In theory.

      But I figure this discourages batch processing with high loads and takes context switches to heaven, effectively killing the server.

      What we need is real AIO.

    4. Re:the linux c10k problem - solved? And Java? by Mr+Z · · Score: 1

      Yes, I read the page. The person I was responding to was only talking about RAM usage for the stack, so that's all I was talking about.

      --Joe
  66. Re:Alternative headline by himi · · Score: 3, Insightful

    Alternatively, you might want to consider that Linux's scheduler was very nicely tuned for far and away the most common case - where you have only a small number of running processes.

    Likewise, threading support under Linux has been oriented towards what the developers considered sane: a fairly small number of threads. They had good reasons for considering that the right way to do it - for a start, it worked nicely for what they wanted, and it was sufficiently simple that they didn't have to put in lots of complex code. Further, it's almost never a good idea to have a program architecture that requires very large numbers of threads - it generally only shows up in naive code where people simply don't understand the problems it brings. So, as far as the kernel developers were concerned, stupid people hurting themselves wasn't something to put any effort into ameliorating. This has changed recently, as people have started using Linux in areas where this kind of thing /isn't/ insane, and hence these new developments have come along.

    You need to understand the reasoning behind a lot of these decisions before you can start complaining about them. First and foremost, you simply /have/ to realise that the kernel developers care about how people actually use the system, rather than crappy benchmarketing numbers. These developments have come about because people needed them, and they didn't happen earlier because no one had needed them before. Go back and read the last few years of the lkml archives, and /then/ come back and talk about this kind of thing, when you understand /why/.

    himi

    --

    My very own DeCSS mirror.
  67. POSIX compliance ahead? by rkit · · Score: 2, Informative

    Scalability is a good thing, no doubt about that. However, there is another aspect that should be pointed out: the current thread API in linux is quite different from the POSIX specification and somewhat crufty. Just to mention the biggest problems:
    missing cancellation points: testing whether a thread has been cancelled should be done in lots of system calls, but linux pthreads do not support this. Instead, you have to call pthread_testcancel() before and after every such call. A real drag.
    signal handling: linux pthread signal handling is very different from the POSIX specification. However, proper signal handling is crucial for any real world application.
    fork() will not work as expected. This is a real nuisance if you want proper daemon behaviour for your application.
    documentation of linux-specific behaviour is poor. As a result, most of the existing literature on thread programming is pretty useless for linux.
    All these points can be worked around, for sure. Nevertheless, it makes writing portable software a nightmare. Porting threaded software to linux, well ... All in all, linux threads really need much better integration with the standard system API. A lot of applications could profit from multithreading. Just think of GUI responsiveness. Also, using threads makes some programming tasks much easier. No need for asynchronous hostname lookup, for example.
    A solid, well documented, standard conforming threads implementation will make linux a much nicer environment for serious programming than it already is. I am really looking forward to this.
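
    To show what the "test before and after every call" workaround looks like in practice, here is a sketch of the pattern. It is an analogy only: Python has no pthread_testcancel(), so a threading.Event stands in for the cancellation request, and the names Cancelled, check_cancel and worker are invented for this example.

```python
import threading

class Cancelled(Exception):
    """Raised inside the worker when a cancellation request is noticed."""

def check_cancel(flag):
    # Stand-in for pthread_testcancel(): act on a pending request, if any.
    if flag.is_set():
        raise Cancelled()

def worker(flag, steps, log):
    try:
        for step in steps:
            check_cancel(flag)   # manual check before the blocking call
            step()               # imagine read(), accept(), sleep() here
            check_cancel(flag)   # and again afterwards
            log.append("step done")
    except Cancelled:
        log.append("cancelled")  # cleanup handlers would run here

if __name__ == "__main__":
    flag = threading.Event()
    log = []
    # The second step requests cancellation, so the third never runs.
    worker(flag, [lambda: None, flag.set, lambda: None], log)
    print(log)
```

    With real cancellation points inside the system calls, the two manual checks around each call disappear, which is the drag the list above complains about.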

    --
    sig intentionally left blank
    1. Re:POSIX compliance ahead? by inode_buddha · · Score: 2, Interesting

      Nobody ever said that linux-specific behavior is POSIX-compliant. Last I heard, POSIX is not about the specifics of any given UNIX-compatible or class of system. Rather, it attempts to be the abstraction and distillation of that class of systems, as codified by The Open Group. Please correct me if I am wrong in this idea. Linux simply "aims to be..." POSIX-compliant, as promulgated by the LSB, the FHS, et al.

      That all said, I totally agree with you -- especially regarding cancellation points, fork(), and documentation.

      Please bear in mind that much of this behavior will be inherited from whatever libc it is compiled against. IMO, this simply shows the power of C, nothing else.

      The above scenario simply points out the differences between OpenGroup/POSIX and GNU/FSF... if things like that "bug" you (no pun intended, seriously), then perhaps you should recompile with whatever "-- posixly-correct" options you have available.

      And yes, I have a copy of the SUSV3 spec right here, in fact.

      --
      C|N>K
    2. Re:POSIX compliance ahead? by rkit · · Score: 1

      Yeah, I know, linux only aims at POSIX compliance... Despite some things in linux pthreads that are non-optimal, performance is reasonable, and everything runs quite stably, once you have found out how to do it, so my critique may sound unfair. I notice you do not agree with my complaint about signal behaviour, and to be honest, this is a somewhat arcane thing also on other systems :-(
      However, my main point is that there is no real smooth integration of threads in glibc. IMHO this is much more important than large-scale scalability. These guys are working to change that, which is a Very Good Thing(TM). As it is, using threads only makes sense if you really need them, and have time at hand to explore the nifty details that must be correct for a real world application.

      --
      sig intentionally left blank
    3. Re:POSIX compliance ahead? by Nevyn · · Score: 1
      *sigh* feel free to read the data pointed to by /.
      signal handling: linux pthread signal handling is very different from the POSIX specification. However, proper signal handling is crucial for any real world application.
      fork() will not work as expected. This is a real nuisance if you want proper daemon behaviour for your application.

      These were both fixed as part of the same rewrite that helped Linux scale pthreads to hundreds of thousands of threads.

      Not sure about the cancellation points, but the library was fairly largely rewritten ... so I'd check that data too as it may well be out of date.

      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
    4. Re:POSIX compliance ahead? by inode_buddha · · Score: 1

      True, there doesn't seem to be any smooth way for glibc to deal with this... I must have missed your point then. Please don't misunderstand about "fair" criticism; constructive criticism is always useful, and I certainly didn't think your comment was "unfair" -- just realistic. I truly hope that there are clues in this "mini-conversation" that everyone can use.
      It's great that this sort of thing is being dealt with, though I'll probably never use it on my desktop to its fullest extent. I've applied the relevant patches to my plain-vanilla 2.4.19 kernel and rebuilt; on my main workstation (dual P3/1 GB RAM) the difference becomes noticeable (in a good way) under heavy loads but not on a day-to-day basis. I need to find some way to measure things regarding that, if you have any ideas...

      --
      C|N>K
  68. Funny by Anonymous Coward · · Score: 0

    [Ed: long list of requirements deleted]

    Once all these prerequisites are met compiling glibc should be easy.


    Phew!

  69. Ur browser??? by Anonymous Coward · · Score: 0

    Ur browser, naa, it were the World Wide Web who started it all

    http://www.w3.org/People/Berners-Lee/WorldWideWeb.html

    1. Re:Ur browser??? by Anonymous Coward · · Score: 2, Informative

      I can only suppose you don't know what Ur is, maybe because you come from a very different culture...

      Anyway, and I'm really not well qualified to answer this, Ur was an ancient city-state from which a prominent ancestor of the Jewish-Christian-Islamic heritage came (Abraham, if I'm not wrong).

      This city, IIRC already found, was Sumerian (I'm not sure about this), the folks who are said to be the inventors of the wheel, among other neat things.

      So an Ur browser would be the primeval browser, in other words.

      Upon writing a note, one must be sure it will be understood; nonetheless, the "Ur" mention boosted the note level way up. All in all, I think it was great and I'm all for it.

      But explanations as these sometimes become necessary.

  70. Re:Windows comparison by IkeTo · · Score: 1

    Okay, where did you come up with the 64K figure, and also the 1 mega (mebi) byte figure?

    All Intel processors have 4KiB pages. Each Linux thread has two things of its own: its own stack, which can be as small as 1 or 2 pages if the code to run is simple enough, and also its own task_struct, which is 1 page including kernel stack for the thread. So all in all, you need 12KiB for each thread. Multiplying with the 100000 figure you get 1200000KiB or 1.144GiB, which is quite affordable for a 2GiB system.
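
    The arithmetic above can be checked with a short sketch. The page counts are the ones assumed in this comment (two pages of stack plus one page for task_struct and kernel stack); the function name is only illustrative.

```python
PAGE_KIB = 4  # page size on 32-bit Intel, as stated in the comment


def thread_memory_gib(threads, pages_per_thread, page_kib=PAGE_KIB):
    """Total memory in GiB for `threads` threads using `pages_per_thread` pages each."""
    total_kib = threads * pages_per_thread * page_kib
    return total_kib / (1024 * 1024)


# 2 stack pages + 1 page for task_struct and kernel stack (the comment's assumption)
print(f"{thread_memory_gib(100_000, 3):.3f} GiB")  # prints "1.144 GiB"
```

    Changing pages_per_thread shows how quickly the total grows once the per-thread stack is larger than the minimum assumed here.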

  71. NGPT by p3d0 · · Score: 2

    Then, with NGPT (Next-Generation Posix Threads), those 100,000 threads would be in user space and may be even cheaper.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    1. Re:NGPT by Nevyn · · Score: 1

      Benchmarks are showing NPTL to be about 4x faster than the IBM NGPT library (Note that NPTL was written knowing about the IBM work, and knowing how to make it better).

      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
    2. Re:NGPT by benh57 · · Score: 2

      Nope. Apparently NPTL is Four times faster than NGPT.

    3. Re:NGPT by p3d0 · · Score: 1
      Hmm. I was wondering what would happen if they implement m:n on top of NPTL, but it would probably just slow it down and introduce the headaches of two-level scheduling.

      Thanks for the link.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  72. Re:Alternative headline by Anonymous Coward · · Score: 0
    You need to understand the reasoning behind a lot of these decisions before you can start complaining about them. First and foremost, you simply /have/ to realise that the kernel developers care about how people actually use the system, rather than crappy benchmarketing numbers.

    That's a typical Linux zealot answer: "if Linux doesn't implement it properly, you don't need it, and if you need it you're an idiot".

    I don't buy that. If I want to have a server with 10,000 clients, I want to be completely free to implement that with threads... Wasn't Linux supposed to shine on the server side? Likewise, if I want to create a browser with one thread per window, then let me be.

    I don't want to be lectured by fascist arrogant pigs who are too lazy to implement correctly a proper Operating System (whose goal is to adapt to a wide variety of different uses). Fuck you. Have a nice day.

  73. Re:Alternative headline by croftj · · Score: 2, Insightful

    I think we need to pull some old stats out of our ass. This paper is about the 2.2.x kernel. Correct me if I'm wrong, but hasn't there been massive overhauling of the 2.4.x and 2.5.x kernels in the scheduling area?

    I think I'll just slam XP performance based off of NT benchmarks and articles. What the hell, they're both from MS so the argument must be valid.

    Get a grip!

    --
    -- Many men would appreciate a woman's mind more if they could fondle it
  74. Re:Windows comparison by Anonymous Coward · · Score: 1, Informative

    The 1 MB is the default stack size for every Posix thread. It takes some effort to determine what the smallest valid stack size is, since PTHREAD_STACK_MIN doesn't specify enough space for the start_func stack frame. The parent post merely stated that the default is 1MB and you have to work to lower that, and he is correct. If the grand-parent poster was just creating threads without specifying a stack size, he would run out of RAM pretty quickly.
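
    A minimal sketch of asking for a smaller per-thread stack before creating threads. It is written in Python, where threading.stack_size() plays roughly the role that pthread_attr_setstacksize() plays in C. The 256 KiB figure and the function names are assumptions for illustration, not a recommended minimum.

```python
import threading

SMALL_STACK = 256 * 1024  # assumed value; Python itself rejects anything under 32 KiB


def run_with_small_stacks(n_threads, stack_bytes=SMALL_STACK):
    """Start n_threads threads with a reduced stack and return the ids they report."""
    previous = threading.stack_size(stack_bytes)  # applies to threads created from now on
    results = []
    lock = threading.Lock()

    def worker(i):
        with lock:
            results.append(i)

    try:
        threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting (0 means platform default)
    return sorted(results)


print(run_with_small_stacks(4))  # [0, 1, 2, 3]
```

    The point is the same as in the comment: the stack size has to be requested explicitly, otherwise every thread reserves the platform default.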

    I wouldn't want to guess where the 64k figure came from.

  75. Wait, there's more by Quixote · · Score: 2
    From the ensuing discussion on the list:
    Ingo:...Anton tested 1 million concurrent threads on one of his bigger PowerPC boxes, which started up in around 30 seconds. I think he saw a load average of around 200 thousand. [ie. the runqueue was probably a few hundred thousand entries long at times.]
    Wow.. this is pretty good. The ability to spawn & run 1 million concurrent threads should keep even the most demanding users happy for a few years...

    OTOH, I hope this post doesn't become the butt of jokes a few months from now ("and you thought 1 million was a lot! Ha! My Palm 5000XL does more than that!")...

  76. Re:Great... Now every lamer with no design knowled by Anonymous Coward · · Score: 0

    Yes, of course! Everybody knows that the only reason "every lamer with no design knowledge" doesn't do that now is because they take into account the shortcomings of running 50,000 parallel threads.

    This crap gets a +2?

  77. Re:Alternative headline by Anonymous Coward · · Score: 0

    You need to understand the reasoning behind a lot of these decisions before you can start complaining about them. First and foremost, you simply /have/ to realise that the kernel developers care about how people actually use the system, rather than crappy benchmarketing numbers.

    That's a typical Linux zealot answer: "if Linux doesn't implement it properly, you don't need it, and if you need it you're an idiot".

    I don't buy that. If I want to have a server with 10,000 clients, I want to be completely free to implement that with threads... Wasn't Linux supposed to shine on the server side? Likewise, if I want to create a browser with one thread per window, then let me be.

  78. Re:Alternative headline by Anonymous Coward · · Score: 0

    The goal of linux isn't to be the perfect OS for every single task. If you think that I have a bridge to sell you.

    The goal of linux is to be the best it can be.

  79. Moderate down what you don't understand by magellan · · Score: 0, Offtopic

    That's what's happening here. It is much like ignoring somebody asking a question you do not know the answer to.

    Today's development environments do not do a great job of automatically parallelizing code, and there are very few outstanding multithreading programmers in this world. Thank goodness one of them is working on this for Linux. The improvement in threading support is critical for Linux to scale in shared memory, multi-processor environments (SMP, NUMA), but will only be important for certain applications. The typical Slashdot reader uses a dual processor Linux system at best, where excellent multithreading is not necessary.

  80. Now that's impressive by Jasa · · Score: 1
    I was beginning to think this site's standards were getting low:-
    • Portable MP3 Players (Done before)
    • Net shooting guns (1960s James Bond movies)
    • Build your own sub woofer (My friend built a 500mm (1'8") X 500mm X 1000mm(3'4") Sub Woofer in 1988 and then put it in his Ford Transit)
    • Tiny Linux boxen (Seen 1000s of these and *BSD boxen as well)
    But 100,000 Threads on Linux, now that's impressive. Too bad it won't make one iota of a difference to most of us who use Linux for just reading /.

    --
    -Jasa -- Linux - The SOURCE will be with you, ALWAYS
  81. Re:Alternative headline by himi · · Score: 1

    If you want to do stupid things with your programs, that's fine by the kernel developers. Just don't expect /them/ to bend over backwards to make /your/ stupid design work as well as you want it to. That's your problem, and no one else's.

    himi

    --

    My very own DeCSS mirror.
  82. 2.5 - 2.6? by Dunkirk · · Score: 1

    Since I absolutely suck at getting kernels from source to work correctly (I never get everything in there that I need I guess), the question is: When does all this great stuff reach production? (To then be pre-packaged by RedHat, et. al.)

    --
    Acts 17:28, "For in Him we live, and move, and have our being."
    1. Re:2.5 - 2.6? by psamuels · · Score: 1
      When does all this great stuff reach production? (To then be pre-packaged by RedHat, et. al.)

      2.6 is supposed to feature-freeze on All Hallows' Eve, and there is enough buzz about this that it may actually happen on schedule. Expect a few months of shakedown before 2.6.0 gold, then (if they're smart) Red Hat will pound on it for a couple months more, minimum, before selling it to the people to whom they promise a stable OS.

      Then again, I guess it's possible that Red Hat, being the sponsor for this work, will backport it to their 2.4-based kernels and package it up for the enterprise. (*shudder* - Doesn't "enterprise" mean "business"? Why does it always, in the computing world, imply "big business"?)

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  83. Faster app start times? by Pope+Raymond+Lama · · Score: 1


    (about 14 minutes 58 seconds faster than with earlier Linux kernels)
    Ok, it is a genuine and serious question I have:
    Were these extra 15 minutes responsible for the painfully
    long start times of apps like Mozilla and OpenOffice?

    If so, as soon as I upgrade my
    distro, I will boot it into 2.5.

    --
    -><- no .sig is good sig.
    1. Re:Faster app start times? by Mr+Z · · Score: 1

      It'll have nothing to do with it. Mozilla is at most a handful of threads.

      The bulk of the startup time for those apps is in dealing with all the dynamic linking and in faulting in all the code pages from disk. There's also a fair bit of time in loading non-code resources from the various files the app keeps. (eg. all the XUL and chrome that defines the user interface in Mozilla.)

      One way possibly to speed up loading large apps is to statically link 'em (icky!) and then do "cat app > /dev/null" before starting it so it's in the disk cache first. That'll help a little. It won't solve the data-file issue, but then you could cat all those to /dev/null prior to app startup also.

      The reason catting a file to /dev/null gives a speedup is that the cat operation is linear. Thus, it gets good utilization from the disk. (assuming your filesystem isn't heavily fragmented, that is.) When an app relies on page faults to bring its code in from disk, the faults may be pseudorandomly spread throughout the file. Further, they may be timewise further apart, so the various I/O clustering algos (eg. elevator seek) won't cluster the reads.
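
      The "cat to /dev/null" trick described above amounts to one sequential read of the whole file with the data thrown away. A minimal sketch, with a hypothetical function name and a throwaway temp file standing in for the application binary:

```python
import os
import tempfile


def warm_page_cache(path, chunk_size=1 << 20):
    """Read `path` front to back and discard the data, like `cat path > /dev/null`."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # sequential reads, so the disk sees a linear access pattern
            if not chunk:
                break
            total += len(chunk)
    return total  # number of bytes pulled through the cache


# Demo on a temporary file; a real use would pass the binary and its data files instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 3_000_000)
print(warm_page_cache(tmp.name))  # 3000000
os.remove(tmp.name)
```

      Whether the pages are still cached when the app actually starts depends on memory pressure, so this is a best-effort warm-up only.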

      That said, you're still stuck with the CPU time the app takes to initialize and configure itself internally. Starting a statically linked Mozilla or OpenOffice from a RAM disk would still take a non-trivial amount of time.

      --Joe
    2. Re:Faster app start times? by psamuels · · Score: 1
      One way possibly to speed up loading large apps is to statically link 'em (icky!) and then do "cat app > /dev/null" before starting it so it's in the disk cache first.

      ELF runtime linking is inefficient and does unpleasant things to your COW pages. I haven't investigated this in full, but Jakub Jelinek of Red Hat has written a program to "pre-link" your ELF libraries and executables, so that you can mmap, sanity check, and go without any runtime relocation.

      Not a new idea, actually - IRIX, I believe, solves the same problem in pretty much the same way. Digital Unix attempts to avoid it by maintaining a cache of "allocated" address ranges and letting the compile-time linker read and update said cache, in the hopes that no libraries' addresses will clash at runtime and thus no relocations would need to be made. But then, DU had the luxury of a 64-bit address space, so they could afford to "waste" it as it will never run out. (Please no jokes about 640k of address space being enough for anybody - except from people who actually understand how huge a 64-bit address space actually is.)

      That said, you're still stuck with the CPU time the app takes to initialize and configure itself internally.

      That's what the Emacs unexec solution is for!

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
    3. Re:Faster app start times? by Mr+Z · · Score: 1

      ELF runtime linking is inefficient and does unpleasant things to your COW pages. I haven't investigated this in full, but Jakub Jelinek of Red Hat has written a program to "pre-link" your ELF libraries and executables, so that you can mmap, sanity check, and go without any runtime relocation.

      Awesome! Any word on when this will be 'production worthy'? Also, do you know of a website that has any performance numbers or other details?

      --Joe
    4. Re:Faster app start times? by psamuels · · Score: 1
      Any word on when this will be 'production worthy'?

      No, the first I heard of it was as I was poking around a Red Hat ftp archive looking for some RPMs for chasing a bug reported by a RH user (turns out he was using an old version of my package, but I digress...) and I noticed the prelink package sitting there somewhere. It sounded a lot like the IRIX thing - and I've thought off and on over the years that it would be nice for Linux to have that - so I read the README.

      Google for "jakub jelinek prelink" - it'll tell you a lot more than I can.

      Also, do you know of a website that has any performance numbers or other details?

      Thanks to Google we have his original announcement, where he claims to have reduced the startup overhead in Konqueror (before opening its first X window) from 0.510 seconds to 0.011 seconds. Of course this is a best-case scenario, as prelinking will be the most noticeable for big binaries with lots of shared library dependencies.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  84. Not to mention.... by SwedishChef · · Score: 2

    Your egregious use of the word "egregious".

    --
    No one ever had to evacuate a city because the solar panels broke!
  85. Mod this up, please (was: POSIX compliance ahead?) by Observer · · Score: 2

    See subject. A useful 'heads up' post for folks like myself who tend to assume that Linux will follow the general Un*x-family behaviours we're familiar with from the commercially-sold variants.

    And yes, I would of course ;) check this assumption if I were to do some significant implementation for the Linux platform.

  86. Imagine a Beowulf Cluster of this! by SirDaShadow · · Score: 1

    yeah, had to say it, first time I do :)

  87. Image - OMFG!!! by Anonymous Coward · · Score: 0

    Hahahahahahahahahahaha!

    Gawd, I didn't expect that at all LOL. I swear I have tears rolling down my cheeks because I'm laughing so hard!

  88. hmm by luphus · · Score: 1

    My name is Ingo Molnar.
    You kill my father - prepare to die.

    er... sorry about that, I won't do it again :)

    -nwp

    1. Re:hmm by Graymalkin · · Score: 2

      Don't you mean

      My name is Ingo Molnar.
      You kill -15 my parent process - Prepare to die.

      --
      I'm a loner Dottie, a Rebel.
    2. Re:hmm by luphus · · Score: 1

      heh, very nice :)

      -nwp

  89. user-level threads are useless by RelliK · · Score: 2

    User-level threads cannot take advantage of multiple CPUs. True, they are somewhat faster on a single CPU system due to lower overhead, but that's all they are good for.

    --
    ___
    If you think big enough, you'll never have to do it.
    1. Re:user-level threads are useless by ProtonMotiveForce · · Score: 0

      They can if they're mapped to N kernel threads.

  90. ACE is nice for big systems by 0x0d0a · · Score: 2

    ACE is nice for big systems.

    But it's also way overkill for small stuff. It's a whole distributed framework, not a wrapper around pthreads.

  91. Re:Windows comparison by Courageous · · Score: 2

    It's a Windows limit, and it's in the documentation.

    C//

  92. Re:Windows comparison by Courageous · · Score: 2

    The 64K page size is Windows' page size. I can only assume that the poster stating that the Intel hardware page size is 4K is correct. I would suppose this means that a Windows (2K, NT) page of 64K is assembled from 16 hardware pages, then. The Windows page size of 64K is in their documentation. I never paused to think about how this interfaces with hardware pages...

    C//

  93. and by Anonymous Coward · · Score: 0

    that's what the power button is for.

  94. Actually Microsoft stole the name IE by Anonymous Coward · · Score: 0

    Actually Microsoft stole the name Internet Explorer. They were sued, the company eventually
    went bankrupt and microsoft settled out of court.

  95. Will `top' and `ps' be fixed? by truth_revealed · · Score: 2

    Currently in Linux every thread is assigned a distinct process ID, and as such, a process has as many entries in `top' and `ps' as it has threads. This makes it difficult to monitor processes externally, or even see the other processes' information. Has this issue been addressed? (I realize this is a user-space program issue, not a kernel issue).

    1. Re:Will `top' and `ps' be fixed? by psamuels · · Score: 1
      Currently in Linux every thread is assigned a distinct process ID

      Ah, but not any more. With the new NPTL, the getpid() call will return the same number for each thread in a process. I don't actually know how or if this perturbs the /proc data, though I expect it doesn't, meaning that the procps utilities will still have to be rewritten to properly portray the new abstractions.
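
      A small sketch of the behaviour described above: every thread asks for its process ID and for its own kernel-level thread ID. It assumes Python 3.8 or later for threading.get_native_id(); the helper name is hypothetical. On an NPTL-style system the process ID should be identical in all threads while the native IDs differ, whereas the older LinuxThreads library reported a different getpid() value per thread.

```python
import os
import threading


def collect_ids(n_threads=4):
    """Return (pids, native_ids) as reported by n_threads concurrently running threads."""
    pids, tids = [], []
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)  # keep every thread alive until all have reported

    def worker():
        with lock:
            pids.append(os.getpid())
            tids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids, tids


pids, tids = collect_ids()
print("distinct process IDs:", len(set(pids)))
print("distinct thread IDs:", len(set(tids)))
```

      Tools like ps and top need the second kind of ID if they want to show threads individually, which is why the comment expects procps to need changes.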

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  96. Thats why there needs to be more moderators... by hackwrench · · Score: 0, Offtopic

    so that the moderation can't be swayed by bad moderation. As it is, any given post gets affected by at most about 10 moderators.

  97. Multithreaded core files on Linux? by truth_revealed · · Score: 2, Interesting

    I can't seem to find any info on whether Linux core files still produce one core file per thread or just one core file per process (as does Solaris). Has `gdb' been enhanced to handle multithreaded programs (or multithreaded core file) on Linux? If I have a thousand threads - I sure don't want 1000 core files in the event of a crash. Is there a way around this?

    1. Re:Multithreaded core files on Linux? by Anonymous Coward · · Score: 0

      Apparently, the 1:1 thread changes Ingo did automagically buy you proper core dumps and proper signal handling. He's working on world peace next.

  98. just in time by Anonymous Coward · · Score: 0

    Now finally systems can handle the huge demand for all those millions of .NET Web Services out there.

  99. Answer me this Batman! by Anonymous Coward · · Score: 0

    As I am in complete awe of the state of GNU. Where and how may I easily contribute (with focus) the only meaningful contribution I have....money?

    I cannot express in words how the efforts of so many , for so many, resonates with my soul.

    I am always offended by those who wish to call these contributors "communists" when in the course of battle giving one's life is "heroic". So why is giving just a piece of one's life "communistic"?

    I need a place where GNU projects lay out their plans, budgets, and paypal accounts so I can participate in a meaningful way.

    To every contributor I have just two words,
    THANK YOU!

  100. Re:Alternative headline by Dahan · · Score: 2
    Correct me if I'm wrong ...

    Okay, you're wrong. This O(1) scheduler in 2.5.x is the "massive overhauling." (Yes, the patch has been around for a while... but as the article says, it's only recently been merged into 2.5)

  101. If it works like a real Unix OS, thank God! by ProtonMotiveForce · · Score: 0

    Wow, is Linux finally becoming a Real Unix OS? I won't see 50 "processes" when I do a 'ps' on a machine running Java applications, which causes less-than-aware users to claim it's using 500 Megs of memory and 10 processes?

    Thank God, but why so long? HP-UX, Solaris, etc... have had a working, stable M:N implementation forever, and it worked. Linux's whole crap about "Eh, we're smarter than they are we'll use kernel threads but make them faster" was until now a load of crap - hubris, if you ask me.

    Anyway, thank freaking God they finally have it working right.

  102. Re:Hi there! by Anonymous Coward · · Score: 0

    The fact that YOU associate M$ to Microsoft only proves him right.

  103. Re:Windows comparison by be-fan · · Score: 2

    Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) align data in 64KB increments. IIRC, there is also 64KB of virtual slack between memory mapped objects as well.

    --
    A deep unwavering belief is a sure sign you're missing something...
  104. Re:Windows comparison by sagei · · Score: 3, Informative

    Each Linux thread has two things of its own: its own stack, which can be as small as 1 or 2 pages if the code to run is simple enough, and also its own task_struct, which is 1 page including kernel stack for the thread.

    This is not true; the kernel stack is two pages in size, i.e. 8KB on i386.

    Also, in 2.5 (where these tests were done), the task_struct is no longer allocated on the stack. It is allocated off the slab cache, while the thread_info struct is on the stack. The task_struct slab object is another ~1.7KB per task.

    Finally, I do not know what the pthreads default stack size is (user-space? what is that?) but it is certainly larger than one page.
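
    Using the corrected figures from this comment, the kernel-side cost alone can be estimated with a short sketch. User-space stacks are left out because the comment does not give their size; the constant and function names are illustrative.

```python
KERNEL_STACK_KB = 8.0  # two 4 KB pages on i386, per the comment
TASK_STRUCT_KB = 1.7   # approximate slab object size, per the comment


def kernel_overhead_mb(threads):
    """Approximate kernel-side memory in MB (1 MB = 1024 KB) for `threads` tasks."""
    return threads * (KERNEL_STACK_KB + TASK_STRUCT_KB) / 1024


print(f"{kernel_overhead_mb(100_000):.0f} MB")  # prints "947 MB"
```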

    --

    Robert Love

  105. FUCK YOU by Anonymous Coward · · Score: 0

    yeah, had to say it, first time I do :(

  106. These WOW additions must give Linus a hardon by Anonymous Coward · · Score: 0

    I feel sorry for his girlfriend. She must be totally wiped out...then maybe not:)

    1. Re:These WOW additions must give Linus a hardon by Anonymous Coward · · Score: 0

      You mean wife. And she's a blackbelt in karate apparently. I think she can handle it.

  107. Re:Great... Now every lamer with no design knowled by Anonymous Coward · · Score: 0


    Hello.

    The grandparent started at 2 because the author wasn't a cocksucking AC.

    Thanks for playing.

    - El Generoso
    See me at http://goatse.cx/

  108. Java is great in Linux, but when by SHEENmaster · · Score: 0, Offtopic

    will Mac OS Java shape up!?

    Not only is there no version of J2DK1.4 for PPC(in Mac OS X or Debian/PPC), but the Mac OS X version takes a long time to load.

    I reboot to Linux and run Forte over X(11R6) using my server's processor and ram(768mb as opposed to 128mb). I'd use OS X if I could get an X(11R6) server to work, but neither XDarwin nor Fink will last more than a minute before crashing.

    Can this new technology be integrated into OS X or is it part of a static library that the JDK must be built with? Sun is infamous for building against outdated dynamic library, I had to 'ln -s' a library manually when installing the JDK.

    --
    You can't judge a book by the way it wears its hair.
  109. Other similar mischief... by TheLink · · Score: 3, Funny

    Some guys I know copied a Windows error dialog box and set it as a background image for the desktop, centered.

    Imagine the poor victim vainly clicking on the buttons, and getting more and more worried. Said victim actually rebooted the machine to see it reappear, and was not happy when he started to notice the sniggering bunch behind him...

    For example pic:
    http://www.adobe.com/support/techguides/operatingsystem/windows/winerrors.html
    Probably want to replace CCmail with Explorer or something more dear to heart ;).

    I also installed a bluescreen STOP screensaver on April Fool's day on a colleague's PC. Heh, he was shocked enough to actually call another colleague over and make the usual worried mumbles.

    http://www.sysinternals.com/ntw2k/freeware/bluescreensaver.shtml

    Since I had admin privs, I was also tempted to have ad.doubleclick.net and similar dns names to resolve to a private webserver which served out custom banner ads.

    Wonder how users would take it if they see the "Staff Meeting at 2pm banner ad". Or "Company Slogan here". Or "Big boss is watching you!". Or for search result sensitive ads: "Stop downloading mp3s/movies/porn!"

    I could actually justify that as a useful application. It's probably more useful than a doubleclick ad...

    But I'd probably need the 100K parallel thread kernel to serve up all those ad banners :).

    Bwahaha!
    Link.

    --
    1. Re:Other similar mischief... by Anonymous Coward · · Score: 0

      Hah! Cool!
      I'm smirking as I recall (vaguely) the Acorn Archimedes at school. You could password protect directories in the file structure, which the teachers used to protect, er, sensitive material.
      I worked out that the password was a plain text file within the directory, called something like '*password*' so the GUI didn't display it. You could either half-delete a copy of the directory or if you pressed f12 at the right time during bootup it disabled the access control and you could just read the password by using the BASIC prompt! (or is it a shell?)
      Tons of school letterheads and class grades at my disposal, I faked a letter from the principal telling all parents that their children were to be sacrificed to Satan in return for his own immortality, and pinned it onto the notice-board. It stayed there for a whole fortnight.

      Sadly, though, I never did manage to do a Ferris and change my grades.....

  110. Careful there... by Anonymous Coward · · Score: 0
    I was considering your post, that is until I studied the paper you mention myself. Based on this, I only have the following to say:

    • Consider the pace of Free Software development.
    • Consider that the article was based on a study using the 2.2.5 kernel.
    • Consider that that paper was from THREE years ago!
    'Nuff said.
    1. Re:Careful there... by Dahan · · Score: 2
      I have carefully considered your points, and only have the following to say:
      • Consider that the Linux scheduler hasn't changed significantly in those THREE years.
      • Consider Ingo Molnar's post on the subject.
      • Consider providing some evidence for your position, rather than just saying that I'm wrong.
      • Bulleted lists are pretty. I can do that too.
      If you guys have some evidence that the paper I referenced is no longer valid, please post it (or references to it). Don't just tell me "oh, that paper's ancient; things are different now."

      'Cuz up until fairly recently, they weren't.

      P.S. And if anyone wants to compare Windows XP's scheduling performance with NT's, be my guest... I don't think you'll see much of a change. Remember that XP is just NT 5.1, and I haven't heard about any significant performance improvements in NT's scheduler. (The only vaguely scheduler-related change I remember is the addition of "fibers" in NT 4.0 SPsomething (3?))

  111. Finally we have an answer to Sco unixware by Billly+Gates · · Score: 2

    SCO and Solaris can both create threads 10,000 times faster than the current Linux kernels, according to Sun's and SCO's marketing departments. My guess is that this was exaggerated, but it is one of the benefits of the big Unixes. Heavily threaded Linux apps have been rumoured to fly on UnixWare where they would run slower on their own native platform! I guess Linux is maturing in this aspect. Does anyone who knows anything about Unix/Linux threading care to comment? I wonder if this will help Linux in server environments.

  112. Re:Windows comparison by Saint+Stephen · · Score: 2, Informative

    I've created over 200,000 processes on a PIII 550 laptop with 256 MB of RAM running Windows XP. Of course, it took a while (swapping).

    The process is called nothing.exe. Source Code: int WinMain(...) {Sleep(INFINITE);}

    I work at a lab, so I also ran it on a Compaq 8-way with 4-GB of ram. It worked but I don't remember how fast it went.

    However, there is a big gnarly limit in Windows that will limit the number of processes: the amount of memory allocated to virtual desktops or something. We researched it -- look it up. This is why you get limited to a few thousand processes or threads if they all do GUI stuff. The bad thing is that basically any function you call in user32 will register the thread as a GUI thread. It is all explained in the book Inside Windows 2000.

    Not meaning to troll, I'm just going to share a basic fact: Windows threads are expensive, and tens of thousands of threads (read: thread per client) *DOES* suck on Windows. However, this is not the same thing as saying Windows doesn't scale -- you just have to code it differently. (Check out how many threads SQL Server uses when it's processing thousands of clients.) Stuff like I/O completion ports, AWE memory, and scatter/gather I/O is the way that you have to go.

    Just because you *can* create hundreds of thousands of threads, doesn't mean it's a good idea or that your app won't run like shit on a 32-CPU machine!

  113. how does anyone know? by XO · · Score: 1

    i've tried to bring 2.5.37 up on 5 different machines, and they all crash anywhere from "OK, booting the kernel..." (hard lock) to getting all the way down to loading SCSI drivers, and getting "Powering off device 0." and then locking up.

    --
    "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
  114. Re:Windows comparison by pVoid · · Score: 1

    I think I know the limit you are talking about: it's a handle limit in the GDI subsystem.

    As for the 200k processes taking time to launch, that is quite normal, as launching a process is much heavier than just launching a thread.

    The 2k threads I created were created in 700 ms, which is very acceptable in my book.

    And to confirm, yes, creating so many threads ain't the best idea.
    Someone else mentioned thread pools as being a workaround, but only a workaround. I personally think thread pools are actually a way of doing things, and not a workaround for slow thread creation. In fact there are new WinNT APIs for thread pooling.
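
    To make the thread-pool point concrete, here is a minimal sketch. It uses Python's `concurrent.futures.ThreadPoolExecutor` purely as a stand-in (it is not the WinNT pooling API mentioned above), and `handle_client` is a hypothetical job invented for the illustration. The idea is that 2,000 small jobs run on a handful of reused workers instead of 2,000 freshly created threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle_client(client_id):
    """Hypothetical per-client job; returns the name of the worker that ran it."""
    return threading.current_thread().name

def run_with_pool(num_jobs=2000, num_workers=8):
    """Run num_jobs jobs on a fixed-size pool; return the distinct worker names."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return set(pool.map(handle_client, range(num_jobs)))

workers_used = run_with_pool()
print(f"2000 jobs ran on {len(workers_used)} worker thread(s)")
```

    The pool size of 8 is arbitrary; the point is only that the number of threads is set by the pool, not by the number of jobs.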

    yada yada... I don't think anyone will actually ever read this post =)

  115. Solaris Deathnail? by zapatero · · Score: 1


    In the systems programming world, threading, thread scheduling, and signal processing in threads were always considered Linux's primary weakness and the main strength of Solaris, especially for applications running in the telecom space. But with this announcement, I can see Solaris' last technical superiority over Linux crumbling.

    I think Sun will need to quicken its re-invention pace.

    -j

  116. Re:Windows comparison by pVoid · · Score: 1

    Yes.

    The minimum loadable memory section in Windows is 64K. I guess creating a thread creates a new stack on a newly created section boundary.

  117. New locking primitive, "futex" by Animats · · Score: 2

    Hidden in the article was a reference to a new locking primitive, futex. I don't see a manpage on line for it, though. Where is this documented?

    1. Re:New locking primitive, "futex" by Mr+Z · · Score: 1

      A futex is a mutual-exclusion primitive that's based around file and file operations rather than some heavier mechanism. The idea is that the kernel doesn't keep track of all the folks locking the file and so on, with all the extra separate bookkeeping to handle when processes exit and so on. Futexes live mostly in user-space, making them very fast.

      I don't pretend to know a fraction of the details--only that I've heard of them before and what their benefits are, roughly. Read more here.

      --Joe
    2. Re:New locking primitive, "futex" by (startx) · · Score: 2

      There is some very primitive discussion of it here.

    3. Re:New locking primitive, "futex" by Animats · · Score: 2

      If it's becoming a kernel feature, there should be more documentation than some comments on the kernel mailing list. But I'm not finding any.

  118. Ingo Molnar is the smartest person on the planet. by Anonymous Coward · · Score: 0

    I can't believe how insane tux is, Ingo just continues to make Linux what it is. Keep up the good work Ingo.

    -R.Dietrich

  119. This may not even make it INTO 2.5.x... by Wolfrider · · Score: 2, Interesting

    See here ( http://lwn.net/Articles/9632/ )
    and here ( http://lwn.net/Articles/10248/ )

    --Linus is being pigheaded about this patch, wanting to "keep the code simple" instead of implementing Ingo's **fast** + Fixed solution.

    To quote LWN:
    [ So it's fast - though a few extra features have been requested. But this patch has stirred up a bit of a debate. Rather than put in a complicated new PID allocator, it is asked, why not just make the maximum PID be very large? Then, in theory, the quadratic part of get_pid() will never run so the performance problems go away, and the code stays simpler. Linus prefers this approach, as do a number of other developers; he has put a simple patch along these lines into his pre-2.5.37 BitKeeper tree.

    Ingo disagrees, pointing out that any reasonable maximum PID size can be exceeded eventually. He would rather fix the problem than try to hide it behind a large process ID space. In the absence of real-world examples that show people being bitten by get_pid()'s behavior in a larger PID space, though, Linus appears unlikely to accept any more complicated fix.
    ]
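
    For anyone wondering what "the quadratic part of get_pid()" might look like, here is a rough sketch. It is NOT the kernel code or Ingo's patch: `naive_next_pid` and `BitmapPidAllocator` are hypothetical names, and using a bitmap is just one common way to make each check constant-time. The naive version checks every candidate against the whole task list, so the work grows with (candidates tried) x (tasks alive).

```python
def naive_next_pid(last_pid, pids_in_use, max_pid):
    """Try candidates after last_pid, checking each against the whole task list.

    pids_in_use is a list on purpose: every membership test walks it, which
    is where the quadratic behaviour comes from when many PIDs are taken.
    """
    candidate = last_pid
    for _ in range(max_pid):
        candidate = candidate + 1 if candidate < max_pid else 1
        if candidate not in pids_in_use:      # O(number of tasks) per check
            return candidate
    raise RuntimeError("PID space exhausted")


class BitmapPidAllocator:
    """One flag per PID: checking a candidate is O(1) instead of a list walk."""

    def __init__(self, max_pid):
        self.max_pid = max_pid
        self.used = bytearray(max_pid + 1)    # index 0 is never handed out
        self.last_pid = 0

    def alloc(self):
        candidate = self.last_pid
        for _ in range(self.max_pid):
            candidate = candidate + 1 if candidate < self.max_pid else 1
            if not self.used[candidate]:
                self.used[candidate] = 1
                self.last_pid = candidate
                return candidate
        raise RuntimeError("PID space exhausted")

    def free(self, pid):
        self.used[pid] = 0


in_use = [1, 2, 3, 5]
print(naive_next_pid(3, in_use, max_pid=8))   # 4 is the first free PID after 3

alloc = BitmapPidAllocator(max_pid=8)
first_three = [alloc.alloc() for _ in range(3)]
alloc.free(2)
print(first_three, alloc.alloc())
```

    Making the PID space very large (the simpler approach in the quote) would mean the scan rarely has to skip a taken PID; the bitmap keeps each check cheap even when it does.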

    --
    .
    == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    1. Re:This may not even make it INTO 2.5.x... by psamuels · · Score: 1
      --Linus is being pigheaded about this patch, wanting to "keep the code simple" instead of implementing Ingo's **fast** + Fixed solution.

      Linus raised some valid objections, Ingo reworked some of the code, and it's all in 2.5.36. No worries.

      [ So it's fast - though a few extra features have been requested.

      Did you click on the "few extra features" bit? It's the funniest l-k post I've seen in some time. That whole sub-thread (sorry) was pretty good.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  120. Linus didn't think much of O1 scheduler by jelle · · Score: 3, Interesting

    I remember that Linus made a remark that he thought the O(1) scheduler wouldn't impact Linux much at all, and that its development would not be a biggie for Linux, downplaying the importance of what it can achieve. Go Ingo for keeping at it!

    --
    --- Hindsight is 20/20, but walking backwards is not the answer.
  121. Java and Linux by wwi · · Score: 1

    The threads issue needs to be solved, and soon. We are using Java with Linux and get regular hangs. Conversations with IBM's Java support indicate that this is a problem with the Linux kernel, Java thread design, and the underlying thread libraries on Linux. And no, we are not running thousands of threads, just two Java programs on a 2-CPU SMP machine.

    We eagerly await a fix.

  122. What about IPC overhead vs. threads? by Anonymous Coward · · Score: 0

    If you can start that many threads per second then that is one more reason to just use processes instead of bothering with threads. But how much longer does it take process A to tell process B to set variable X to value Y than for thread A to just set B's X to Y?

  123. Re:Windows comparison by red_gnom · · Score: 1

    So, in other words, when it comes to comparing threads, size does matter.

  124. Yay Hungary by ctxspy · · Score: 0

    Just wanted to say yay for hungarians.

  125. Woosh by p3d0 · · Score: 1

    That's the sound of M:N threading whizzing past your head.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  126. Re:Windows comparison by The+Panther! · · Score: 2

    Hardware page size is 4KB, as was noted elsewhere. The key element that I haven't seen mentioned is that Windows' virtual memory system has several ways to 'allocate' memory. There's reserving pages, and there's committing pages. In the case where you tell the OS you want memory, it reserves pages. That is to say, it does not actually take memory from the free physical memory, but instead creates a contiguous address space large enough for your request, but allocates no hardware RAM at those addresses.

    A page is committed when you access (read or write) an address that has no page mapped to it: the access trips a hardware fault, and the VM then searches for a free page and links the two together.

    The end result is, even if Windows does try to create 64k worth of memory segment space for a process, unless it is actually reading or writing to a byte in each 4k chunk, its internal VM will not allocate physical memory for the whole 64k. Furthermore, there's no such advantage or realistic way for the operating system to align anything in memory physically, except in AGP ram. The VM system handles physical pages of memory exclusively, but does not manage AGP-allocated memory (IIRC). In other words, though the OS can align the address space to anything it likes, the OS layer cannot request any physical allocation mapping or alignment. So that comment about aligning memory for processes is quite unlikely.
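
    A rough analogue of the "address space now, physical pages on first touch" behaviour can be poked at with an anonymous memory mapping. This sketch uses Python's `mmap` module as a stand-in; it is not the Win32 reserve/commit API, and whether a given OS really defers the physical allocation is up to that OS. The helper names are made up for the illustration.

```python
import mmap

PAGE = mmap.PAGESIZE            # page size reported by the OS
REGION = 64 * 1024              # the 64K figure from the discussion

def pages_spanned(region_bytes, page_bytes=PAGE):
    """Number of pages needed to back region_bytes of address space."""
    return -(-region_bytes // page_bytes)    # ceiling division

def touch_first_page(region_bytes=REGION):
    """Map an anonymous region and write one byte in its first page only."""
    region = mmap.mmap(-1, region_bytes)     # address space for the whole region
    try:
        region[0] = 1                        # first access to page 0
        return region[0], len(region)
    finally:
        region.close()

value, mapped = touch_first_page()
print(f"mapped {mapped} bytes ({pages_spanned(mapped)} pages), touched 1 page")
```

    With 4 KB pages the 64K region spans 16 pages, of which only the first is ever accessed here.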

    Now, the XBox (which runs a variant of the Win2k kernel) has a bit more control over VM, but it also does not support demand paging, so it cannot swap to the hard disk and give you RAM+HD effective memory. Shame, that. But, as a result, you have an API that allows hardware level allocation control. Still, the OS doesn't take advantage of it, AFAIK. It's for developers.

    --
    Any connection between your reality and mine is purely coincidental.
  127. When does the .5 kernel actually become .6 ? by zaqattack911 · · Score: 1

    In other words, I've read tons of articles about all the fanciness being incorporated into the .5 kernel.

    When is it expected to become stable? How long do I have to wait?

    The more I read about this, the more I feel going with the .4 kernel (or linux at all rather), is a mistake for a serious production server.

  128. Re:Windows comparison by Courageous · · Score: 2

    The end result is, even if Windows does try to create 64k worth of memory segment space for a process, unless it is actually reading or writing to a byte in each 4k chunk, its internal VM will not allocate physical memory for the whole 64k.

    Yes. Quite true. I had a problem a while back on Windows which took me a bit of reading through the documentation (and verifying with some low-level sys calls) to diagnose: I was running out of "reserve memory". Which is to say that, while I had plenty of physical memory left, all the address space had been used up. You can do this very easily by creating thousands of threads on your computer. To get a large number of these threads, you'll have to push the default stack size down to its minimum, 64K. I was a bit dissatisfied with this minimum, but I suppose I'll live with it now (or port to Linux) if I have to, or upgrade to a 64-bit OS if it becomes a practical limit in the future.
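
    The back-of-the-envelope arithmetic looks something like this. The numbers are assumptions, not measurements: a 2 GiB user address space (the usual default on 32-bit Windows) and a 1 MiB default stack reserve, compared against the 64K minimum. `max_threads` is a made-up helper that pretends thread stacks are the only thing using address space, so real limits will be lower.

```python
def max_threads(address_space_bytes, stack_reserve_bytes):
    """Upper bound on threads if stacks were the only users of address space."""
    return address_space_bytes // stack_reserve_bytes

GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024

USER_SPACE = 2 * GIB    # assumption: default user address space, 32-bit Windows

default_limit = max_threads(USER_SPACE, 1 * MIB)    # assumed 1 MiB default reserve
minimum_limit = max_threads(USER_SPACE, 64 * KIB)   # the 64K minimum from above
print(default_limit, minimum_limit)
```

    So under these assumptions the reserve size, not physical RAM, caps the thread count at a couple of thousand by default and at a few tens of thousands with the minimum stack.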

    C//

  129. Re:Windows comparison by pthisis · · Score: 2

    Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) align data in 64KB increments.

    That's boneheaded. Linux supports page sizes up to at least 4MB, but it doesn't align everything on 4MB boundaries on the off chance that you might be using 4MB pages. It uses the appropriate alignments for the page sizes actually in use.

    An OS that has dropped all support for non-Intel hardware citing a portability concern which doesn't exist in portable OSes? As they say in Snatch, "It's spurious, mate. Not genuine."

    Sumner

    --
    rage, rage against the dying of the light
  130. Re:Windows comparison by psamuels · · Score: 1
    I was a bit dissatisfied with this minimum, but I suppose I'll live with it now (or port to Linux) if I have to, or upgrade to a 64-bit OS if it becomes a practical limit in the future.

    One of the nice things about Linux. You don't have to live with any of these 32-bit limitations if your application is big enough to justify 64-bitness. While Microsoft had NT running on Alpha, I understand it was essentially still a 32-bit OS - it was only truly ported to 64 bits when Itanium support was added. Linux, on the other hand, has had true 64-bit implementations running since '94 or '95, so you can be fairly confident that the niggling little 32-bit-isms have mostly been caught by now.

    --
    "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  131. I totally don't get this story. by jefp · · Score: 1

    My thread-creation benchmark can create 100,000 threads in 10 seconds on my 800MHz Linux machine at work. No idea what kernel it's running, but I'm sure it's not recent. Furthermore, my 450MHz home machine running FreeBSD 4.5 runs the same benchmark in only five seconds. WTF?
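
    One possible explanation (a guess, since the benchmark source isn't shown) is the difference between creating threads one after another and having them all alive at once, which is what the story's "in parallel" refers to. The sketch below shows both shapes using Python's `threading` module as a stand-in for native threads; `bench_sequential` and `bench_parallel` are hypothetical names and N is kept tiny on purpose.

```python
import threading
import time

def noop():
    """Thread body that returns immediately."""

def bench_sequential(n):
    """Create and join n threads one at a time; only one extra thread is alive."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=noop)
        t.start()
        t.join()
    return time.perf_counter() - start

def bench_parallel(n):
    """Start n threads that block until released, so all n are alive at once."""
    release = threading.Event()
    threads = [threading.Thread(target=release.wait) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    peak = threading.active_count()    # includes the main thread
    release.set()
    for t in threads:
        t.join()
    return time.perf_counter() - start, peak

N = 200    # deliberately tiny; the story's figure is 100,000
print(f"sequential: {bench_sequential(N):.4f}s")
elapsed, peak = bench_parallel(N)
print(f"parallel:   {elapsed:.4f}s with {peak} threads alive at peak")
```

    The sequential loop never needs more than one stack or one extra PID at a time, so it says little about how the system copes with 100,000 live threads.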

  132. Re:Windows comparison by IkeTo · · Score: 1

    > Finally, I do not know what the
    > pthreads default stack size is
    > (user-space? what is that?) but it is
    > certainly larger than one page.

    Why does it need to be larger than one page? The kernel will trap page faults caused by stack overflow and allocate additional stack anyway.

  133. Re:Windows comparison by Courageous · · Score: 2

    Yes, I know you are right. Amongst other things, I won't be stuck with 64K per thread stack in Linux, and as you say, I could use 64 bit alpha linux. I'm looking forward to Hammer, actually.

    C//

  134. Re:Windows comparison by sagei · · Score: 2

    Why does it need to be larger than one page? The kernel will trap page faults caused by stack overflow and allocate additional stack anyway.

    It does not need to be bigger than one page, it just is. You are right, the stack is expanded via implicit mmap as it grows... but for performance reasons the default stack is usually measured in megabytes, not pages.

    Anything but the simplest of applications would use a page rather quickly. User-space applications are programmed to assume they have any size stack they want. Local variables are huge.

    In short, I was just commenting on the default. It can surely be lowered...
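
    As an illustration of lowering the default, here is a small sketch using Python's `threading.stack_size()`, which sets the stack size for threads created afterwards. This is a stand-in for `pthread_attr_setstacksize()`, not the same call; the 256 KiB figure is arbitrary, and `run_with_stack` and `recurse` are names made up for the example.

```python
import threading

def run_with_stack(size_bytes, func):
    """Run func in a new thread created with a size_bytes stack; return its result."""
    result = []
    previous = threading.stack_size(size_bytes)    # applies to threads created next
    try:
        t = threading.Thread(target=lambda: result.append(func()))
        t.start()
        t.join()
    finally:
        threading.stack_size(previous)             # restore the earlier setting
    return result[0]

def recurse(depth):
    """Use a little stack: one call frame per level."""
    return 0 if depth == 0 else 1 + recurse(depth - 1)

depth_reached = run_with_stack(256 * 1024, lambda: recurse(50))
print(depth_reached)
```

    Python refuses sizes below 32 KiB, and some platforms only accept multiples of the page size, so how low you can go depends on the system.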

    --

    Robert Love

  135. What "c10k problem"? by tlambert · · Score: 2

    I don't understand what the issue is here.

    I was able to run 1,600,000 simultaneous connections with a modified FreeBSD kernel, in June of 2001. Couldn't get much work done, but at about 300 baud per connection, after dividing up a gigabit ethernet link... you shouldn't expect to do much work.

    Without modifications, after a patch to the credential reference counting (since committed to FreeBSD 4.5), as long as a stock kernel is tuned correctly, it can still *easily* handle 100,000 simultaneous connections (16K of window space for each connection = 1.6G of mbufs).
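
    The arithmetic behind those figures can be checked quickly. This assumes "16K" means 16 x 1024 bytes and that a gigabit link carries 10^9 bits per second; the helper names are made up for the example.

```python
def buffer_bytes(connections, window_bytes):
    """Total buffer space if every connection holds one full window."""
    return connections * window_bytes

def per_connection_bps(link_bps, connections):
    """Even share of the link, in bits per second, for each connection."""
    return link_bps / connections

total = buffer_bytes(100_000, 16 * 1024)
share = per_connection_bps(1_000_000_000, 1_600_000)
print(total, round(total / 1e9, 2), share)
```

    That gives roughly 1.6 GB of window space for 100,000 connections, and a few hundred bits per second per connection at 1.6 million, the same order of magnitude as the figure quoted above.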

    -- Terry

  136. So? by tlambert · · Score: 2

    So? Use non-blocking I/O instead. Problem solved.
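
    For readers who haven't seen the pattern, a minimal sketch of what "use non-blocking I/O" can look like: one thread, many sockets, and a readiness loop instead of a thread per connection. It uses Python's `selectors` module and `socket.socketpair()` as a stand-in for accepted client connections; `serve_pairs` is a hypothetical name and the uppercase echo is just a placeholder workload.

```python
import selectors
import socket

def serve_pairs(num_clients=50):
    """Answer one message per client from a single thread using readiness events."""
    sel = selectors.DefaultSelector()
    clients, servers = [], []
    for i in range(num_clients):
        client, server = socket.socketpair()    # stand-in for an accepted connection
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ, data=i)
        clients.append(client)
        servers.append(server)
        client.sendall(f"hello {i}".encode())

    handled = 0
    while handled < num_clients:
        for key, _events in sel.select(timeout=1.0):
            conn = key.fileobj
            payload = conn.recv(1024)           # reported ready, so it will not block
            conn.sendall(payload.upper())
            sel.unregister(conn)
            handled += 1

    replies = [c.recv(1024).decode() for c in clients]
    for sock in clients + servers:
        sock.close()
    sel.close()
    return replies

replies = serve_pairs(10)
print(replies[:3])
```

    No thread is ever parked waiting on a single connection; the selector tells the loop which sockets have data.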

    -- Terry

  137. It is not broken. by 1101z · · Score: 2, Informative

    No, you will see a pid per thread, because that is how the scheduler knows to schedule things. It is also what the getpid() C library call returns from within the program. When they said it is a 1-to-1 mapping, that means that there is a process per thread. Just look when you see all those processes with the same name, and see if they have the exact same memory usage. If they do, it means they are using the same memory and are threads. No matter how you implement threads there has to be more than one process, otherwise when the program blocks for I/O all threads would be blocked.

    --
    One day people will learn the folly of Winbloze, Linux Rules!
    1. Re: It is not broken. by psamuels · · Score: 1
      No, you will see a pid per thread, because that is how the scheduler knows to schedule things.

      Indeed we have not got rid of the concept of one pid per thread. But we do have an additional number, the thread group ID. The tgid corresponds to the pid of the thread group leader, and this is what will be returned by getpid() in the new pthreads. (What we call a pid is, and has always been, actually a thread ID.) This is exposed to /proc via a field in /proc/{tid}/status - one need only hack the procps utilities to make something useful of this.
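
      On a Linux box with the newer threading model you can see the pid/tgid split for yourself. The sketch below is Linux-only and assumes /proc is mounted; it uses Python's `threading.get_native_id()` to get the kernel's per-thread ID and reads the `Tgid` field from that thread's status file. `read_status_field` and `thread_ids` are names made up for the example.

```python
import os
import threading

def read_status_field(tid, field):
    """Return one field from /proc/self/task/<tid>/status (Linux only)."""
    with open(f"/proc/self/task/{tid}/status") as status:
        for line in status:
            name, _, value = line.partition(":")
            if name == field:
                return value.strip()
    raise KeyError(field)

def thread_ids():
    """Collect getpid(), the kernel thread ID and Tgid as seen from a second thread."""
    seen = {}

    def worker():
        tid = threading.get_native_id()
        seen["pid"] = os.getpid()
        seen["tid"] = tid
        seen["tgid"] = int(read_status_field(tid, "Tgid"))

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return seen

ids = thread_ids()
print(ids, "caller's thread id:", threading.get_native_id())
```

      The worker's thread ID differs from the caller's, while getpid() and Tgid agree with each other in both.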

      An interesting question came up on l-k recently: how to maintain (efficiently) the uniqueness of a tgid. What if you are a thread group leader, you spawn off some threads, then you die, and eventually your pid is reused. Then another process could be a thread group leader with the same tgid as your thread group. If I remember correctly, the fix is simple and elegant: if the thread group leader dies before its "siblings", it sits around in a zombie state until they all die, all the while preventing reuse of its pid.

      Just look when you see all those processes with the same name, and see if they have the exact same memory usage. If they do, it means they are using the same memory and are threads.

      Not necessarily. Try the following:

      for x in 1 2 3 4 5; do sleep 60 & disown; done
      ps ux | grep sleep

      Notice how all five copies of 'sleep' have exactly the same name and memory usage, yet they are independent processes. If you really want to see evidence for threads, you need to check their tgids.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  138. Re:Windows comparison by psamuels · · Score: 1
    I'm looking forward to Hammer, actually.

    Aren't we all? (:

    --
    "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  139. Re:This Ingo Molnar ?= the author I. Molnar by haruchai · · Score: 1

    Hey, dork! We've never seen anyone use a redirect link to the goatse.cx site before. Wow, you must be, like, you know, like, rilly brite. Gosh, me wants be smirt lyke ewe.

    --
    Pain is merely failure leaving the body
  140. Re:Windows comparison by IkeTo · · Score: 1

    > It does not need to be bigger than one
    > page, it just is.

    At least that isn't what is suggested by the documentation of LinuxThreads (in Debian testing). In E.5 it says the following, implying that the default stack size is really just 1 page.

    E.5: Does LinuxThreads implement pthread_attr_setstacksize() and pthread_attr_setstackaddr()?
    These optional functions are provided in recent versions of LinuxThreads (0.8 and up). Earlier releases did not provide these optional components of the POSIX standard.

    Even if pthread_attr_setstacksize() and pthread_attr_setstackaddr() are now provided, we still recommend that you do not use them unless you really have strong reasons for doing so. The default stack allocation strategy for LinuxThreads is nearly optimal: stacks start small (4k) and automatically grow on demand to a fairly large limit (2M). Moreover, there is no portable way to estimate the stack requirements of a thread, so setting the stack size yourself makes your program less reliable and non-portable.

  141. Re:Alternative headline by Anonymous Coward · · Score: 0
    If you want to do stupid things with your programs, that's fine by the kernel developers. Just don't expect /them/ to bend over backwards to make /your/ stupid design work as well as you want it to.

    Except these things ARE NOT stupid. In no way.

  142. Ahhhh. So now it can by T.E.D. · · Score: 2

    ...run Ada 83 programs.

  143. Re:Great... Now every lamer with no design knowled by dvdeug · · Score: 2

    But while their threads will be slow, they will be able to handle the text the users are entering; vastly more useful than the most optimized eight-bit character horror you would turn out.

  144. Re:Great... Now every lamer with no design knowled by Alex+Belits · · Score: 2

    Trolling is supposed to be:

    1. Fast! Writing random mild insults almost a week after the original posting isn't as great as making a real-time flamewar immediately after posting.

    2. Accessible to a potential reader. Referring to an obscure recurring theme of my rants made months away from this article (byte-value transparency of protocols vs. Unicode references in RFCs) would require a lot of googling from a potential troll spectator before he is able to appreciate your comment.

    --
    Contrary to the popular belief, there indeed is no God.