Slashdot Mirror


User: Ingo+Molnar

Ingo+Molnar's activity in the archive.

Stories: 0
Comments: 66

Comments · 66

  1. Re:CFS vs. O(1) on Linux Gets Completely Fair Scheduler · Score: 3, Insightful

    (To answer your question: the 20-21 comes from other limits to the task space - right now we are still limited to 32k pids.)

    Yes, you are right, operations on an rbtree of an arbitrary data structure are of course an O(log2(N)) algorithm, no argument about that.

    I know what the mathematical meaning and definition of the big/little ordo/theta notations are (probably better than i should ;), I only wanted to point out the fact that an O(log2(N)) algorithm for most data structures in the kernel (or elsewhere on today's computers) is equivalent to O(1) in practice, especially if N is fundamentally limited to 15 bits like in this case!

    The main purpose of the ordo/theta notations is to be able to talk about and compare the performance (worst-case/best-case/average-case) qualities of algorithms. Sticking to their strict mathematical definition in cases where it departs from their original purpose results in worse software :)

    And talking about big ordo differences between algorithms operating in finite machines still makes sense (naturally): for example, O(sqrt(N)) is not equivalent to O(1) in practice - it can still be very large, even with a pretty limited N. O(N) is also obviously very relevant in practice, even on very limited machines. But the difference between O(log2(N)) and O(1) is insignificant in most cases, and in fact it is deceiving in this case. (as i pointed out with the O(140) example.)
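
    A minimal sketch of the comparison above, assuming N = 2**15 to match the 15-bit limit mentioned in this comment. The helper name `steps` is hypothetical and the numbers are plain arithmetic, not measurements of any scheduler.

```python
import math

def steps(n):
    """Rough step counts for three growth rates at problem size n."""
    return {
        "log2(N)": math.ceil(math.log2(n)),  # logarithmic growth
        "sqrt(N)": math.isqrt(n),            # square-root growth
        "N": n,                              # linear growth
    }

# 2**15 = 32768 corresponds to the 15-bit limit on N discussed above.
print(steps(2**15))
```

    At this N the logarithmic count stays in the teens while the square-root and linear counts are one and three orders of magnitude larger.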

  2. Re:How does it relate to disk IO? on Linux Gets Completely Fair Scheduler · Score: 2, Informative

    Hm, that seems to be more of a VM/IO-scheduling problem than a process scheduling problem.

    Did you have a chance to try Peter Zijlstra's excellent per-bdi patches, as suggested in the bugzilla?

    But in general, CFS ought to improve such workloads too (to a limited degree), in terms of not making any IO starvation worse by adding CPU starvation to the mix :-)

  3. Re:how it's possible? on Linux Gets Completely Fair Scheduler · Score: 5, Informative

    Seriously? What, the kernel switches to a process, the process checks its environment and figures out that the event it was waiting for hasn't happened yet, and goes back to sleep? I can't believe that a project as mature as the Linux kernel would use a scheduler like that.

    No, CFS does not do that, and that would be quite silly to do indeed :-)

    CFS keeps tasks that woke up in the runqueue, and allows them to run immediately in the typical case - just like the old scheduler did.

    Where CFS differs from the old scheduler is mainly the case when there are more tasks runnable than there are CPUs/cores available. In such cases, on any modern multitasking kernel, the scheduler has to decide which task to run, and in what order and weight to run those tasks, with the goal to provide to the user the happy illusion of multiple, snappy applications running at once.

    The old O(1) scheduler decided the "order and weight" of runnable tasks based on a pretty elaborate set of heuristics. The rules are pretty complex, but it mostly boils down to 'sleepers get more CPU time than runners'.

    (sidenote: CFS is an O(1) scheduler too for all practical purposes, with an upper limit of ~15 algorithmic steps worst-case)

    Now those heuristics worked pretty well for 15 years (those sleep-heuristics were always part of Linux scheduling, the O(1) scheduler i wrote inherited them from the original O(N) scheduler), but good is never good enough in the land of Linux ;-)

    How does CFS work? CFS follows an approach similar to Con Kolivas' SD project: a scheduler core that instead of heuristics uses "fair scheduling" to achieve interactivity. Runnable tasks are scheduled in a painstakingly fair way (and that seemingly simple concept alone is pretty hard to achieve in a general purpose kernel).

    The simplest case is when there are only CPU-intense tasks running. For example, if there are 8 CPU-intense tasks running on the CPU, each task gets exactly 12.5% CPU time. If you watch how much CPU time the tasks get it will be 12.5% long-term too, with no deviations, with no skewing caused by other tasks running in between.

    The more complex case is when applications schedule frequently (and that is the case on most desktops and servers), so CFS extends the concept of 'fairness' to sleeping tasks too. CFS accounts not only 'runners', but 'sleepers' too. Tasks that sleep/run frequently are still given their full 'fair share' of the CPU, up to the limit they could have gotten were they not sleeping at all.

    So for example, if you have two tasks on a CPU, one a 100% CPU hog, the other one an application that sleeps/runs 50% of the time - both will get 50% of the CPU in CFS. Under the strict 'runner fairness' approach (which for example SD is following), the 100% CPU hog would get ~66% of CPU time, the sleeper would get ~33% of CPU time.

    To achieve 'sleeper fairness', CFS runs the (ex-)sleeper task sooner, to offset its disadvantage of not hanging around on the CPU all the time. Or in other words: interactive tasks (tasks that sleep often) will get to the CPU with lower latencies. Which is the holy grail of good desktop scheduling :-)
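
    The 50%/50% versus ~66%/~33% example can be reproduced with a toy tick simulation. This is only a sketch under stated assumptions, not the CFS algorithm: the sleeper is modelled as working for 10 ticks and then sleeping for 10 ticks, "runner fairness" is modelled as strict alternation between currently runnable tasks, and "sleeper fairness" is modelled as always running the runnable task that has received the least CPU so far. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(policy, ticks=6000, burst=10, nap=10):
    """Return (hog_share, sleeper_share) of CPU time after `ticks` ticks.

    policy "runner":  alternate strictly between tasks that are runnable now.
    policy "sleeper": run the runnable task that has received the least CPU.
    """
    cpu = {"hog": 0, "sleeper": 0}
    work_left = burst      # ticks left in the sleeper's current burst of work
    nap_left = 0           # ticks of sleep left; 0 means the sleeper is runnable
    last = "sleeper"       # who ran on the previous contended tick
    for _ in range(ticks):
        if nap_left > 0:   # sleeper is asleep, so the hog runs unopposed
            cpu["hog"] += 1
            nap_left -= 1
            if nap_left == 0:
                work_left = burst
            continue
        if policy == "runner":
            pick = "hog" if last == "sleeper" else "sleeper"
        else:
            pick = "sleeper" if cpu["sleeper"] <= cpu["hog"] else "hog"
        last = pick
        cpu[pick] += 1
        if pick == "sleeper":
            work_left -= 1
            if work_left == 0:   # burst finished, go back to sleep
                nap_left = nap
    return cpu["hog"] / ticks, cpu["sleeper"] / ticks

for policy in ("runner", "sleeper"):
    hog, sleeper = simulate(policy)
    print(f"{policy:8s} hog={hog:.0%} sleeper={sleeper:.0%}")
```

    In this model the "runner" policy leaves the hog with roughly two thirds of the CPU, while the "sleeper" policy lets the sleeper catch up after each nap so both end up near one half.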

    (granted, CFS does a whole lot more than that, its patch-impact size is 3 times larger than SD's. CFS is not a single patch but a series of 50 patches, which also modularize kernel scheduling policy implementation (note, it does not modularize the scheduler itself a'la PlugSched), offer "group scheduling" (nifty thing for containers/virtualization and large systems, written by Srivatsa Vaddagiri of IBM), offer precise CPU usage accounting to /proc (used by CPU/task monitoring tools), and much more. We decided to turn Linux scheduling upside down, which gave me the easy excuse^H^H^H opportunity to extend the scheduler's design a bit more ;-)

  4. CFS vs. O(1) on Linux Gets Completely Fair Scheduler · Score: 5, Informative

    (disclaimer, i'm the main author of CFS.)

    I'd like to point out that CFS is O(1) too.

    With current PID limits the worst-case depth of the rbtree is ~15 [and O(15) == O(1), so execution time has a clear upper bound]. Even with a theoretical system that can have 4 million tasks running at once (!), the rbtree depth would have a maximum of ~20-21.

    The "O(1) scheduler" that CFS replaces is O(140) [== O(1)] in theory. (in practice the "number of steps" it takes to schedule is much lower than that, on most platforms.)

    So the new scheduler is O(1) too (with a worst-case "number of steps" of 15, if you happen to have 32 thousand tasks running at once(!)), and the main difference is not in the O(1)-ness but in the behavior of the scheduler.
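
    One way a fixed bound like O(140) can arise is a run queue with one FIFO per priority level plus a bitmap of non-empty levels. The sketch below assumes that layout purely for illustration: the comment itself gives only the 140 figure, and the class and method names are hypothetical, not kernel APIs.

```python
from collections import deque

NUM_PRIOS = 140  # number of priority levels, taken from the O(140) figure above

class PriorityQueues:
    """Toy run queue: one FIFO per priority level plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Pop the oldest task at the lowest-numbered non-empty level, or None."""
        if self.bitmap == 0:
            return None
        # Index of the lowest set bit: bounded by NUM_PRIOS, not by task count.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task

rq = PriorityQueues()
rq.enqueue("a", 120)
rq.enqueue("b", 100)
rq.enqueue("c", 100)
print([rq.pick_next() for _ in range(4)])
```

    The work done in `pick_next` depends on the fixed number of levels, not on how many tasks are queued, which is the sense in which a bound like O(140) collapses to O(1).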

  5. glibc support? on Linux Kernel Goes Real-Time · Score: 2, Interesting

    I have a question about the new mutex features: what glibc version is required to use this stuff? do I need other user space libraries?

    The latest upstream glibc version (2.5) has it:

    * Support for priority inheritance mutexes added by Jakub Jelinek and Ulrich Drepper.
    * Support for priority protected mutexes added by Jakub Jelinek.

    (See: http://sources.redhat.com/ml/libc-alpha/2006-09/msg00065.html )

    No other userspace library is needed.

  6. complexity of the -rt patchset on Linux Kernel Goes Real-Time · Score: 2, Informative

    It's painful reading how that works. It's an achievement. "613 files changed." "It's the most complex kernel feature i ever worked on." But it's one of those things that, for legacy reasons, is much more complex and ugly than it should have been.

    I think you misunderstood my point. The reason why our patchset is so complex and so large is because we want to do it right. The quick-and-ugly shortcut is a lot smaller (and it has been done before), and it brings problems similar to the ones you outlined - but that's not the path we chose.

    Here is an (incomplete) list of kernel features/enhancements split out of -rt and merged upstream so far:

    - the generic interrupt code (genirq)
    - the generic time of day subsystem (GTOD)
    - the hrtimers subsystem
    - the lock validator (lockdep)
    - the generic spinlock code
    - priority inheritance enabled mutexes
    - robust and PI-futexes
    - SRCU
    - the mutex subsystem
    - irq handler prototype simplification (removal of pt_regs)
    - spinlock init cleanups
    - spinlock debugging improvements
    - voluntary preemption feature
    - latency-breaker enhancements

    Note that all those features originated from the -rt effort, but they have justification and use independent of real-time considerations. In other words: they make sense in a general purpose OS anyway. And better yet: our current judgement is that much of the rest is in this category too. So what we did and what we are doing are dozens of seemingly unrelated enhancements to the Linux kernel, which in the end enable hard real-time.

    Is such an approach more complex than a quick-and-dirty hard-realtime kernel feature? Sure it is - but in my opinion this is the only way it can stay maintainable in the long run. And as a happy side-effect we'll get a hard-real-time capable kernel that will run on virtually every piece of hardware on this planet. And since we've got all the time we need and no deadlines to meet, it can and will be done =B-)

  7. Hard-real-time in Linux? on Linux Kernel Goes Real-Time · Score: 3, Informative

    I hate saying things like that. I'm a geek and I'm proud of it, and therefore want Linux to have the maximum flexibility possible. However, poorly maintained code is worse than no code at all, and there just isn't the userbase to keep hard real-time in the kernel at an acceptable standard.

    We have good news for the geek in you: the upstream Linux kernel is already more than 50% of the way to becoming hard-realtime :-)

    The 2.6.19-rc2 kernel already includes the following features/subsystems, which are the precondition of hard-realtime (PREEMPT_RT):

    - the generic interrupt code (genirq)
    - the generic time of day subsystem (GTOD)
    - the hrtimers subsystem
    - the lock validator (lockdep)
    - the generic spinlock code
    - priority inheritance enabled mutexes
    - robust and PI-futexes
    - SRCU
    - the mutex subsystem
    - irq handler prototype simplification (removal of pt_regs)
    - (and more stuff that escapes my mind right now)

    Most of these features were written for and prototyped in the -rt tree and were split out and merged individually. (they all have other uses besides serving a hard-real-time kernel, so their merging was largely uncontroversial)

    Granted, there's still 1.4+ MB of patches pending in our 2.6.18 based -rt tree (such as the core bits of PREEMPT_RT, irq threading, high-res timers, dynticks and more), but roughly the same amount of code has been merged upstream already, and we can now see the end of the tunnel.

    In fact, i'd say that the most controversial ones are already merged and the flamewars are largely over: such as the generic mutex code (which replaced semaphores half a year ago in 2.6.16/17) or the priority inheritance and rt-mutex code (which is now in 2.6.18).

  8. Re:Performance overhead of the -rt patch-set on Linux Kernel Goes Real-Time · Score: 1

    Thanks - i forgot about these follow-up numbers that show even lower overhead for PREEMPT_RT: 0% lmbench overhead when compared to the vanilla kernel.

    As a summary, i'm not all that worried about the performance impact. It is real, it will hit certain workloads more than others, but it's a lot less than what was feared.

    I'd still advise a generic distro against enabling it unconditionally, but a more specialized one can enable it no problem. These are the preemption kernel config options offered by the -rt kernel:

    ( ) No Forced Preemption (Server)
    ( ) Voluntary Kernel Preemption (Desktop)
    ( ) Preemptible Kernel (Low-Latency Desktop)
    (X) Complete Preemption (Real-Time)

    The first 3 config options are upstream already, and - as Thomas' article points out - much of the "foundation" of PREEMPT_RT is upstream already too: generic interrupt code, generic time of day subsystem, hrtimers subsystem, the lock validator, the generic spinlock code, priority inheritance enabled mutexes, robust and PI-futexes, etc. Roughly half of what used to be in -rt is in the 2.6.18/19 kernels already.

    Generic distros that want to scale from small boxes to really large multi-CPU boxes should use "Server" or "Desktop" preemption. Desktop-oriented distros could use "Low-Latency Desktop" too. Carrier-grade and embedded ones could use "Real-Time" preemption.

  9. Is the -rt patchset hard-real-time? on Linux Kernel Goes Real-Time · Score: 2, Informative

    ADEOS - the microkernel used in RTAI - is "hard real-time", as is VxWorks. TimeSys' Linux patches are soft real-time.

    Small correction: if by "TimeSys' Linux patches" you mean the -rt patchset that i'm maintaining (and to which Thomas is a major contributor), and in particular if you mean the CONFIG_PREEMPT_RT kernel feature, then the answer is a clear "no": it's not "soft real-time", it's intended to be "hard real-time" in the same sense as ADEOS/RTAI is.

    The -rt patch-set implements a fully preemptible Linux kernel, which allows a higher-prio event to preempt any lower-prio processing: it can preempt device driver interrupt processing or other "irqs off" critical sections or other normally non-preemptible (for example spin-locked) code within the kernel, immediately. (it does all the necessary hard-real-time things one would expect: it pushes interrupt processing into special kernel threads, it implements priority inheritance for all Linux locking primitives to guarantee processing and to get out of priority inversion scenarios, etc.)

    See more about the technology behind the -rt patchset in Paul McKenney's article on LWN.net, and on Kernel.org's RT Wiki. You can also try out an -rt kernel based Linux distribution yourself, grab a Knoppix-based PREEMPT_RT-kernel live-CD from: here.

  10. Performance overhead of the -rt patch-set on Linux Kernel Goes Real-Time · Score: 5, Informative

    RT has a pretty big speed penalty.

    I can definitely say that unlike some other approaches, the -rt Linux kernel does not introduce a "big speed penalty".

    Under normal desktop loads the overhead is very low. You can try it out yourself, grab a Knoppix-based PREEMPT_RT-kernel live-CD from here: http://debian.tu-bs.de/project/tb10alj/osadl-knoppix.iso.

    From early on, one of our major goals with the -rt patchset (which includes the CONFIG_PREEMPT_RT kernel feature that makes the Linux kernel "completely preemptible") was to make the cost to non-RT tasks as small as possible.

    One year ago, a competing real-time kernel project (RTAI/ipipe - which adds a real-time microkernel 'above' Linux) did a number of performance tests to compare PREEMPT_RT (which has a different, "integrated" real-time design that makes the Linux kernel itself hard-real-time capable) to the RTAI kernel and to the vanilla kernel - to figure out the kind of overhead various real-time kernel design approaches introduce.

    (Please keep in mind that these tests were done by a "competing" project, with the goal to uncover the worst-case overhead of real-time kernels like PREEMPT_RT. So it included highly kernel-intensive workloads that run lmbench while the box is also flood-pinged, has heavy block-IO interrupt traffic, etc. It did not include "easy" workloads like mostly userspace processing, which would have shown no runtime overhead at all. Other than the choice of the "battle terrain" the tests were conducted in a completely fair manner, and the tests were conducted with review and feedback from me and other -rt developers.)

    The results were:

    LMbench running times:

    | Kernel             | plain | IRQ   | ping-f | IRQ+p | IRQ+hd |
    | Vanilla-2.6.12-rc6 | 175 s | 176 s | 185 s  | 187 s | 246 s  |
    | with RT-V0.7.48-25 | 182 s | 184 s | 201 s  | 202 s | 278 s  |

    (Smaller is better. The full test results can be found in this lkml posting.)

    I.e. the overhead of PREEMPT_RT, for highly kernel-intensive lmbench workloads, was 4.0% in the plain run (182 s vs. 175 s). [this was a year ago; we have further reduced this overhead since then.] In fact, for some lmbench sub-benchmarks such as mmap() and fork(), PREEMPT_RT was faster.
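
    The percentage can be re-derived from the table with a few lines of arithmetic. This is a sketch that only copies the running times listed above; the variable and function names are hypothetical. The 4.0% figure corresponds to the "plain" column, and the other columns follow from the same formula.

```python
# LMbench running times in seconds, copied from the table above.
columns = ["plain", "IRQ", "ping-f", "IRQ+p", "IRQ+hd"]
vanilla = [175, 176, 185, 187, 246]
with_rt = [182, 184, 201, 202, 278]

def overhead_percent(base, other):
    """Extra running time of `other` relative to `base`, in percent."""
    return (other - base) / base * 100.0

overheads = {
    name: overhead_percent(v, r)
    for name, v, r in zip(columns, vanilla, with_rt)
}
for name, pct in overheads.items():
    print(f"{name:7s} {pct:5.1f}%")
```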

    (Note that the comparison of PREEMPT_RT vs. I-pipe/RTAI is apples to oranges in terms of design approach and feature-set: PREEMPT_RT is Linux extended with hard-realtime capabilities (i.e. normal Linux tasks get real-time capabilities and guarantees, so it's an "integrated" approach), while ipipe is a 'layered' design with a completely separate real-time-OS domain "on top" of Linux - and that special, isolated domain has to be programmed via special non-Linux APIs. The "integrated" real-time design approach that we took with -rt is a lot more complex and it is a lot harder to achieve.)

    See more about the technology behind the -rt patchset in Paul McKenney's article on LWN.net, and on Kernel.org's RT Wiki.

  11. Re:It's progress, but not everything you need yet on Linux Kernel Goes Real-Time · Score: 5, Informative

    Linux has made major progress in the real-time area. But it still doesn't have everything needed.

    Many drivers are still doing too much work at interrupt level. There are drivers that have been made safe for real time at the millisecond level, but that's not universal. Load a driver with long interrupt lockouts and your system isn't "real time" any more. This is the biggest problem in practice. There are too many drivers still around with long interrupt lockouts.

    That's where my -rt patchset (discussed by Thomas in the article), and in particular the CONFIG_PREEMPT_RT kernel feature helps: it makes all such "interrupt lockout" driver code fully preemptible. Fully, totally, completely, 100% preemptible by a higher-priority task. No compromises.

    For example: the IDE driver becomes preemptible in its totality. The -rt kernel can (and does) preempt an interrupt handler that is right in the middle of issuing a complex series of IO commands to the IDE chipset, and which under the vanilla kernel would result in an "interrupt lockout" for several hundreds of microseconds (or even for milliseconds).

    Another example: the -rt kernel will preempt the keyboard driver right in the middle of sending a set of IO commands to the keyboard controller - at an arbitrary instruction boundary - instead of waiting for the critical section to finish. The kernel will also preempt any network driver (and the TCP/IP stack itself, including softirqs and system-calls), any SCSI or USB driver - no matter how long of an "interrupt lockout" section the vanilla kernel uses.

    Is this hard technologically? Yes, it was very hard to pull this off on a general purpose OS like Linux (the -rt kernel still boots a large distro like Fedora without the user noticing anything) - it's the most complex kernel feature i ever worked on. I think the diffstat of patch-2.6.18-rt5 speaks for itself:

    613 files changed, 22401 insertions(+), 7903 deletions(-)

    How did we achieve it?

    The short answer: it's done via dozens of new kernel features which are integrated into the ~1.4MB -rt patchset :-)

    A longer technical answer can be found in Paul McKenney's excellent article on LWN.net.

    An even longer answer can be found on Kernel.org's RT Wiki, which is a Wiki created around the -rt patchset.

  12. Re: slight exaggeration? not. on Interview With Linux Kernel Guru Ingo Molnar · Score: 5, Informative

    The test i did really involved the creation of 100,000 parallel threads, for a second or so. Obviously they did not do much work, other than go to sleep, but the runqueue length was definitely 100,000.

    The test would be meaningless otherwise - you can create/destroy 100,000 threads in a row on any OS without any problem.

    Furthermore, Anton Blanchard tested _1 million_ parallel threads on one of his big PowerPC boxen, using the new threading code - the test completed in roughly 30 seconds and he got an insane load-average in the hundreds of thousands range - a further proof that the threads were running in parallel.

  13. Re:Porting to Linux? on 986MB/s With BSD And Gigabit Ethernet · Score: 5

    As part of TUX i've implemented zero-copy TCP xmit. It turned out to be a minimal change (barring driver changes), less hassle than we initially thought. Obviously we couldn't have gotten those SPECweb99 numbers without zero-copy TCP xmit.

    One important question is what MTU they used. If it's the 9000 byte MTU jumbo gigabit frames then these BSD numbers don't have much practical relevance (i can saturate 8 gigabit cards with TUX, i.e. 900MB/sec with 9000 byte MTU). If it's the standard 1500 byte MTU then it's nicer. (hm, i just found it, they indeed used 9000 byte MTU...)
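
    A back-of-the-envelope sketch of why the MTU matters: with larger frames the host has to handle far fewer packets per second to fill the same link. The calculation assumes standard Ethernet framing (38 bytes per frame for header, FCS, preamble and inter-frame gap) and full-size frames only. The function and constant names are hypothetical.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # one gigabit Ethernet link

# Per-frame bytes on the wire besides the MTU-sized payload:
# 14 header + 4 FCS + 8 preamble + 12 inter-frame gap.
WIRE_OVERHEAD_BYTES = 38

def frames_per_second(mtu):
    """Full-size frames per second needed to saturate the link at this MTU."""
    return LINK_BITS_PER_SECOND / 8 / (mtu + WIRE_OVERHEAD_BYTES)

for mtu in (1500, 9000):
    print(mtu, round(frames_per_second(mtu)))
```

    Under these assumptions a 9000 byte MTU needs roughly one sixth of the frames per second that a 1500 byte MTU needs on the same link.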

  14. at least five problems with the slides on Benchmarks of *BSD, Linux, and Solaris at LinuxTag · Score: 2

    1) the IO numbers look very suspect, Linux was always capable of saturating IO bandwidth up to 60MB/sec even on modest hardware. This is so basic that they should have stopped the test when they saw how far apart the block IO numbers are. It certainly does not show a lot of experience in tuning Linux.

    2) the 'per character' numbers of bonnie are utterly meaningless. Bonnie is not a smart benchmark tool, but it gets quoted often. The 'per char' numbers simply measure the performance of libc's "getc()" function [and both Linux and FreeBSD use glibc]. So the only effect of the 'per char' measurement is to slow Bonnie down and skew the numbers with additional CPU time. In fact i use a hacked Bonnie that just does the 'block IO' numbers - looking at the 'char IO' numbers is a waste of time.

    3) While i understand the deadline issues, the 2.4-ac series were seriously buggy. It's only 2.4.0-test5 that started behaving properly, VM and IO-wise.

    4) Linux's name is 'Linux', not 'linux' - they got it right with 'FreeBSD', so it's not hard :-)

    5) the apparent IO misconfiguration then reflects in all the 'high load' numbers, so all the slides (except maybe the network numbers) should be redone IMO.

  15. Re:Thanks, Ingo... on Answers From Planet TUX: Ingo Molnar Responds · Score: 3

    use dup() to create an unlimited number of 'aliases' to an open file - then you can do async IO on each of them.
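
    A minimal sketch of the aliasing step, assuming a Unix system. It does not issue real asynchronous IO: it only shows several dup()ed descriptors for one open file, each used for an independent positioned read. The function name `read_through_aliases` is hypothetical.

```python
import os
import tempfile

def read_through_aliases(data=b"0123456789abcdef", aliases=4):
    """Write `data` to a temp file, dup() its descriptor, read one slice per alias."""
    chunk = len(data) // aliases
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        fds = [os.dup(f.fileno()) for _ in range(aliases)]
        try:
            # pread() takes an explicit offset, so the file position that
            # dup()ed descriptors share does not get in the way.
            return [os.pread(fd, chunk, i * chunk) for i, fd in enumerate(fds)]
        finally:
            for fd in fds:
                os.close(fd)

print(read_through_aliases())
```

    Descriptors created by dup() share one file offset, which is why the sketch uses positioned reads rather than plain read() calls.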

  16. Re:Thanks, Ingo... on Answers From Planet TUX: Ingo Molnar Responds · Score: 3

    I'd like to add the fact that the Linux kernel creates a new shared thread in 0.01 milliseconds (10 microseconds) on a 500 MHz PIII. Forking a new process (isolated thread with new page-tables) is about 0.5 milliseconds on the same box, using the latest 2.4 kernel.

  17. Re:disagree about his threads vs process argument on Answers From Planet TUX: Ingo Molnar Responds · Score: 4

    You raise interesting questions which were not explained in the article, and i think under Linux there are easy answers for them:

    1) if you need to share memory then you can do it with isolated processes as well - just use mmap(MAP_SHARED). The point is to have only as much sharing as absolutely necessary - to avoid any unwanted interaction between threads (isolate them as much as possible).

    2) the MMU reload is not an issue on SMP if you have isolated threads on every CPU, because there will be no context switches. You can use shared threads on a per-CPU basis to avoid the TLB overhead. Btw., the Linux kernel avoids the TLB overhead by using 'global TLBs' (on PPro and better x86 CPUs) which survive even context switches between isolated threads. Anyway, the TLB-reload problem is a short-term x86 issue only - modern RISC CPUs and IA64 use context-tagged TLBs which survive context switches.
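
    A minimal sketch of point 1, assuming a Unix system: two otherwise isolated processes share exactly one page through an anonymous MAP_SHARED mapping and nothing else. The function name `share_value_via_mmap` is hypothetical.

```python
import mmap
import os
import struct

def share_value_via_mmap(value=42):
    """Fork a child that writes `value` into a shared page; return what the parent reads."""
    # fileno -1 requests an anonymous mapping; MAP_SHARED keeps it shared across fork().
    buf = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
    pid = os.fork()
    if pid == 0:
        # Child: separate address space, except for the shared page.
        try:
            buf[:4] = struct.pack("i", value)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)  # wait until the child has written and exited
    result = struct.unpack("i", buf[:4])[0]
    buf.close()
    return result

print(share_value_via_mmap())
```

    Everything outside the mapped page stays private to each process, which is the "only as much sharing as absolutely necessary" idea.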

  18. Re:Please help the non-technical among us on Answers From Planet TUX: Ingo Molnar Responds · Score: 3

    yes, almost - ham sandwiches are a similarly important concept which helps in coding the Linux kernel ;-)

  19. Re:Comparing apples to oranges on Are Linux Transactions Slower Than Win2k's? · Score: 1

    Your ServerBench comments - maybe it's a valid result, maybe not. Note that the 2.2 kernel was used, while the SPECweb99 result used the 2.4 kernel.

    Your SPECweb99 comments though make no sense whatsoever, they show basic misunderstandings of SPECweb99. In SPECweb99 the tester specifies a 'requested connections' value, and the server either meets it, or it doesn't. The Windows 2000 server maxed out at 1598, the Linux server at 4200 connections. You can request 1 million connections as well, but the testrun will not be compliant. Please check out this link for more information about SPECweb99. It's a complex benchmark.

  20. Re:what were they doing when changing specs? on Are Linux Transactions Slower Than Win2k's? · Score: 1

    Yep, i can attest to this. I sent an experimental 2.3 kernel at that time to the ZDNet guys (was around April?), and it made the numbers visibly better. (there was still some advantage to Windows in the many-clients case, but not the 400% difference.) What i find strange is that there is no mention of this in the article (maybe the print version includes it, i don't know).

    But yes, ZDNet did talk to Red Hat about tuning this - it was all in a rush and without us having actual access to the system, so all we could do is send a couple of wild guesses, nothing more. Keeping the ServerBench server-side source code closed is a practice that deprives us of giving any meaningful input or tuning suggestions. Having Linux clients (even if only binary and text-only variants) would help immensely.

    Linux performs very well in a transaction-rich environment, even the 2.2 kernels - 2.2 kernels are on par with NT in SAP benchmarks. (and in that case the actual SAP server is used to benchmark the system.) Last year Linux got on the SAP top 20 list; you can find a (German) article about it here. This benchmark used the 2.2.11 kernel.

  21. Re:2.2 kernels used on Are Linux Transactions Slower Than Win2k's? · Score: 1

    well, to be fair, as long as SysV IPC shared memory is used, that should perform just fine in 2.2 as well. It does not have very good swapping properties though - this should be much better in 2.4. But i don't think this particular test swapped at all. I don't think anything could go terribly wrong in a read/write IO server model of this type, i suspect the 2.2 VM problems to be the culprit.

    Having access to the source code of ServerBench would be very helpful though, and right now i cannot test ServerBench because it supports Windows clients only. If anyone from ZDNet listens, i'd love to help port their client to Linux :-)

  22. 2.2 kernels used on Are Linux Transactions Slower Than Win2k's? · Score: 5

    ServerBench is not available in source code, and the testing was done by ZDNet. From what i know about ServerBench it uses a threaded IO model on NT, but a fork/process model on Linux. The Linux 'solution' is coded by ZDNet, with no possibility for us to influence or comment on the design and approach used at all. Even under these circumstances we expect the 2.4 Linux kernel to perform significantly better in ServerBench than 2.2 kernels. The 2.2.1x (and late 2.3.x) kernels had some VM problems, and with increasing VM utilization (more clients) this problem could have been triggered.

    SPECweb99 OTOH is a standardized benchmark with full source-code access (ServerBench is closed binaries), so all SPECweb99 implementation details are visible.

    Nevertheless it's technically possible that ServerBench triggers performance bugs in Linux - we'd love to see the source to fix those bugs ASAP, if they are still present in 2.4.

  23. Re:two words.. on Linux Beats Win2000 In SpecWeb 2000 · Score: 1

    Microsoft is a member of SPEC and thus has a SPECweb99 license and thus also knows about Windows 2000 + IIS scalability issues. Microsoft still brags about earlier Windows 2000 SPECweb99 numbers beating Linux.

    As Microsoft writes: "The Standard Performance Evaluation Corporation (SPEC) is most notably recognized for their Web server benchmarks [...]. SPECWeb 99 is the more recent Web server benchmark from SPEC that does a better job of representing Web sites today. [...]"

    So your theory is that all top PC vendors, which are in a cutthroat race with each other to get the best SPEC results out, somehow conspired to make *ALL* 16 Windows 2000 Advanced Server + IIS submissions in the past year look bad, and all this with the help and under the watching eye of Microsoft? :-)

  24. Re:Interpreting the results: The REAL bubble! on Linux Beats Win2000 In SpecWeb 2000 · Score: 1

    4) IIS SPECweb99 performance clearly suffers if more than one IIS thread per CPU is used.

    5) you claim that Microsoft has no idea how to tune IIS - that is not a very credible claim IMO.

  25. Re:Interpreting the results: The REAL bubble! on Linux Beats Win2000 In SpecWeb 2000 · Score: 1

    I cannot tell why W2K is slow without having seen the source code of IIS and W2K. The TUX numbers are good because TUX uses the SMP-enhancements added to the Linux TCP/IP stack and other kernel subsystems during the 2.3 cycle.