Slashdot Mirror


Improving Linux Kernel Performance

developerWorks writes "The first step in improving Linux performance is quantifying it, but how exactly do you quantify performance for Linux or for comparable systems? In this article, members of the IBM Linux Technology Center share their expertise as they describe how they ran several benchmark tests on the Linux 2.4 and 2.5 kernels late last year. The benchmarks provide coverage for a diverse set of workloads, including Web serving, database, and file serving. In addition, we show the various components of the kernel (disk I/O subsystem, for example) that are stressed by each benchmark."

97 comments

  1. But how many 3dmarks can I get?? by Rooked_One · · Score: 2, Funny

    oh wait, that's not ported to *nix yet....

    1. Re:But how many 3dmarks can I get?? by Anonymous Coward · · Score: 0

      Windows is on sale today at Best Buy for $329. You need to get in your SUV and grab a copy for your kid's machine before the DMCA gets hold of you...

  2. The Problems with Benchmarking like this... by dWhisper · · Score: 5, Insightful

    I'm just curious what they are quantifying performance against. Everything here seemed to be strictly on the network side of things. Are they trying to speed up the kernel's processing of the individual threads for the network applications (file serving, DB, and Web serving), or are they just measuring the efficiency of processing data packets for those services?

    It sounds interesting, but it looks like the tuning is done specifically on the IBM platform, which makes me wonder. Linux already blows any MS product away for these applications, so I'm curious what they are comparing the results to. Did they just take an arbitrary point (processor load) for specific applications, or are they creating a specialized measurement (like SysMarks in Windows) that is only valid in their test suite?

    Anyway, it should be interesting to see where it ends up, eventually.

    1. Re:The Problems with Benchmarking like this... by Anonymous Coward · · Score: 0

      It's all marked against the performance of a cluster of G4 Amigas.

      Both the newest release linux kernel and new Amigas seem to be "nearly there"

    2. Re:The Problems with Benchmarking like this... by Chatz · · Score: 3, Insightful
      I think it is reasonable to do benchmarks against likely real-world applications. It seemed clear to me that they understand that the benchmark may not represent a load anyone will actually encounter, but that is outweighed by the ability of someone else to come along and use the same benchmarks.

      Some scientific/mathematical benchmarks would also be good to see.

      --
      There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
    3. Re:The Problems with Benchmarking like this... by swissmonkey · · Score: 2, Informative

      Linux already blows any MS product away for these applications, so I'm curious what they are comparing the results to.

      You obviously need to look better at how Linux scales on 8P machines and more before making such statements.

      Go to http://www.tpc.org and look at the results for Linux and Windows for 16P systems and more, Linux is non-existent, for a reason.

    4. Re:The Problems with Benchmarking like this... by giel · · Score: 1

      IMHO most optimization and tuning issues are roughly about three things: a static component (e.g. RAM used for caching), a variable component (e.g. RAM used for each request) and a 'panic' type component (e.g. extra work needed for requests when running out of RAM). It's typically these types of differences in behaviour and system load which are interesting to compare. Even with a M$ box.

      --
      giel.y contains 2 shift/reduce conflicts
    5. Re:The Problems with Benchmarking like this... by Anonymous Coward · · Score: 2, Informative

      I looked at www.tpc.org but could not see anything that contradicts the claim that Linux still has better performance than Windows. Even though only a very small selection of Linux setups was tested, those that were tested had excellent performance. The only thing Windows did better was price/performance on a low-end server, which means that if you do not need too much performance, the Windows solution might be the right choice (according to tpc.org, anyway).

      It seems the post was more of a troll than a serious point: since the site had 90% of its tests on Windows servers and 2% on Linux, it cannot be taken too seriously.

      Please correct me with real technical facts, not just some marketing BS about what Windows might be but is not.

    6. Re:The Problems with Benchmarking like this... by dWhisper · · Score: 2, Interesting

      After checking out their list, there are only two test machines running strictly Linux, at least among the non-clustered setups. Beyond that, they are all Win 2k Data Center, .NET Server, IBM AIX and Unix. The ones that are running Linux are running Red Hat Advanced Server, and the list does not specify whether they are optimized.

      Beyond that, they are not using a unified standard as their monitoring system. All of the Win machines use COM+ and the non-Win machines use a variety.

      They also say that most of the best Price/Performance machines are running Windows 2000 Server, or .NET server (betas?). Most Linux admins would argue this, especially given the news article on /. last week that said it is cheaper to run. I wonder how accurate their measures are based on the monitoring tools.

    7. Re:The Problems with Benchmarking like this... by virtual_mps · · Score: 3, Interesting

      Go to http://www.tpc.org and look at the results for Linux and Windows for 16P systems and more, Linux is non-existent, for a reason

      That reason would be the cost of the tests and the fact that most linux hackers don't have pockets as deep as billg's.
    8. Re:The Problems with Benchmarking like this... by nicodaemos · · Score: 4, Informative

      Hmmm .... tpc.org is an interesting organization. It is a non-profit that is funded by memberships from the hardware/software companies whose products it benchmarks.

      According to their website, "Full Members of the TPC participate in all aspects of the TPC's work, including development of benchmark standards and setting strategic direction. Full Membership costs $15,000 per calendar year."

      Wow, a large percentage of the benchmarks are using MS operating systems. Oh look full members get to set benchmark standards. Mmmm, the only pure OS company who is a full member is Microsoft. I wonder what kind of conclusion can be drawn .......

    9. Re:The Problems with Benchmarking like this... by smittyoneeach · · Score: 1
      So, you can:
        Do it yourself, or,
        Trust them, potential interest conflicts and all.
      This is the usual story when these "mine's better" discussions arise.
      For benchmarks, who has a reputation for
        Knowing what they are about, and
        Remaining objective?

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    10. Re:The Problems with Benchmarking like this... by gmack · · Score: 1

      They often compare performance to older kernels or even other departments' patches.

    11. Re:The Problems with Benchmarking like this... by bitflip · · Score: 1

      Yes, poor old IBM doesn't have any money to do this. We should all donate!

    12. Re:The Problems with Benchmarking like this... by kurtkbee · · Score: 1
      You obviously need to look better at how Linux scales on 8P machines and more before making such statements

      If Linux scalability is really an issue beyond 8 processors, then I guess that the SGI Altix 3700 is just vaporware/gare.

      I suggest that you read the following articles that debunk the myth of the 8-processor barrier:

      SGI Busts into Linux with 64-Processor Scalability

      NEC Calls Dibs on Breaking Linux Eight-Processor Limit

      I personally hope that these benchmarks can be run against more recent kernels, with a full description of the optimizations and patches used.
      Considering that SGI is using a [somewhat] standard 2.4.19 kernel to scale this well, I am certain that the results will be much better.

    13. Re:The Problems with Benchmarking like this... by I+Am+The+Owl · · Score: 0, Flamebait
      Linux already blows

      Amen. +5, Insightful.

      --

      --sdem
    14. Re:The Problems with Benchmarking like this... by Anonymous Coward · · Score: 1, Insightful
      Go get a fucking clue. The reason MS has some of the best numbers, is, guess what, that their systems are among the fastest at running TPC-C !

      TPC-C is not a perfect benchmark (in fact all cluster numbers or "cluster in a box" numbers should be disregarded or completely separated from "single DB instance" numbers). Still it takes a lot of work to get good numbers on TPC-C. A lot of that work will benefit normal DB users.

      MS has good numbers because they did that work.
      Oracle also has excellent numbers on Unix and Windows systems. DB2 also.

      Oh, and I don't like MS numbers. When scalability or performance is required I'll recommend Oracle or DB2 over SQL Server any day of the week.

      But to think that the benchmarks are tailored to MS because it is a member? They are as much tailored to MS as they are to Oracle, HP, IBM, Sun (well, maybe not Sun; you'd need major tailoring to make Sun look good on any benchmark :-)).

      You'll see open-source DB vendors join tpc.org when their software reaches the level of performance needed to show decent numbers on _current_ TPC benchmarks (I'm sure TPC-C will be replaced as it becomes increasingly irrelevant). Until then Op-Src zealots will feel the need to spread FUD about tpc.org.

    15. Re:The Problems with Benchmarking like this... by Anonymous Coward · · Score: 0

      This is not true; Sun is there as well.

  3. A useful linux speedup guide by Anonymous Coward · · Score: 1, Funny

    is here

    Some howto's include recompilering the kernal, enabling UDMA, turning off logging and enabling MMX enhancements.

    1. Re:A useful linux speedup guide by Misanthropic_one · · Score: 1

      No wonder M$ apps are so slow and bloated...

      All their programmers are out pretending to be helping *nix admins. :)

    2. Re:A useful linux speedup guide by Gordonjcp · · Score: 1

      I tried it, but I got this slow, ugly 80s throwback operating system that didn't do DMA, had no logging, and had piss-poor hardware support. Then to top it all off, I had to keep phoning this guy up to ask if I could use it.

      "Bollocks to that", I thought, and put Unix back on.

    3. Re:A useful linux speedup guide by Anonymous Coward · · Score: 0

      I tried installing linux, but I got this slow, 70s throwback operating system that didn't do office apps, had no games, and had piss-poor hardware support. Then to top it all off, I had to keep asking elitist zealot arseholes on public message boards to ask how to get it working.

      "Bollocks to that", I thought, and put Windows back on.

    4. Re:A useful linux speedup guide by Dave2+Wickham · · Score: 1

      Amazing. An IP which isn't a goatse link :P.

  4. Actually finding the performance problem? by Chatz · · Score: 5, Interesting
    It would be great to see a follow-up with some examples of how these tools are used to actually track down a performance problem. I have, and I have seen many others, take some performance data and make completely the wrong judgement about what the expected behaviour is, what the bottleneck is, and what to do to fix it.

    I was also surprised to see that they still use some of the old performance monitoring tools, like looking at /proc and other ASCII tools, rather than something like PCP that collects all these statistics together so that you can look at any combination of subsystems on the same time line. Then they could have graphs showing the interaction and load on the disk, CPUs, VM, network etc.
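
    A rough sketch of the "same time line" idea, not how PCP itself works: parse a couple of /proc files and merge them into one timestamped record that could be logged or graphed. The sample contents and the names parse_loadavg, parse_meminfo and snapshot are made up for illustration, and the meminfo parser assumes the "Key: value kB" line format and simply skips anything else.

```python
import time


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    runnable, tasks = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "tasks": int(tasks),
    }


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines; lines in any other shape are skipped."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        try:
            info[key.strip()] = int(parts[0])
        except ValueError:
            continue  # e.g. an old-style header line
    return info


def snapshot(loadavg_text, meminfo_text, now=None):
    """Merge several subsystems into one record sharing a single timestamp."""
    record = {"time": time.time() if now is None else now}
    record.update(parse_loadavg(loadavg_text))
    record["mem_free_kb"] = parse_meminfo(meminfo_text).get("MemFree")
    return record


# Hypothetical sample contents, so the sketch runs on any machine.
SAMPLE_LOADAVG = "0.20 0.18 0.12 1/80 11206\n"
SAMPLE_MEMINFO = "MemTotal:  514000 kB\nMemFree:   120000 kB\n"

if __name__ == "__main__":
    print(snapshot(SAMPLE_LOADAVG, SAMPLE_MEMINFO, now=0.0))
```

    On a Linux box the sample strings could be replaced with the real file contents, and one snapshot appended per interval would give a single time line across subsystems.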

    --
    There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
    1. Re:Actually finding the performance problem? by spongman · · Score: 3, Insightful

      I agree. This article is essentially useless. They're basically saying "hey look, we made it faster, wohoo!" but they completely gloss over the details of how they did it. Where are the cumulative patches against various stock kernels?

    2. Re:Actually finding the performance problem? by Chatz · · Score: 5, Informative
      That's probably a bit harsh; both IBM and SGI have worked pretty hard to get scalability improvements into the Linux kernel. The article does mention some of these things:

      Some of the issues we have addressed that have resulted in improvements include adding O(1) scheduler, SMP scalable timer, tunable priority preemption and soft affinity kernel patches.

      --
      There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
    3. Re:Actually finding the performance problem? by gmack · · Score: 1

      That's because the changes were merged into the 2.5.x kernel series.

      For the list of things IBM has had a hand in lately: there was the above-mentioned O(1) scheduler, lockless PID allocation, faster threads, IRQ load balancing improvements and the retooling of several drivers' SMP locking. That's just what I can remember without actually going through my kernel archives.

  5. Call me incredibly stupid, but.. by Subjective · · Score: 5, Insightful
    Wouldn't we (always) want to improve the Linux kernel performance in comparison to itself?

    Why is what we compare it to the most important issue?

    Sure, we want to see how the Linux kernel is performing, but that's unrelated to increasing its performance: when working on the performance of a single part, people build a test for that part and tweak it.

    No benchmark or comparison is required in this case.

    --
    My other .sig is also this bad
  6. Re:Huh? by Chatz · · Score: 4, Informative
    Which you need to know before interpreting the results...

    I have to disagree, I thought Figure 3 illustrated how important it is to baseline to ensure that you are heading in the right direction with each change you make (although they did not have a uni-processor baseline result).

    It also showed that with the changes in June they are able to get 4 times the performance with another 7 CPUs. Maybe next time they will show how it scales over the number of CPUs you have.
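
    To put a number on that scaling, here is a small sketch. It assumes the 4x figure can be read as speedup over one CPU on an 8-CPU machine, which is only an assumption since, as noted above, there was no uniprocessor baseline. The function names are made up; the second one is the Karp-Flatt metric, an estimate of the serial fraction from a measured speedup.

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear scaling actually achieved."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return speedup / cpus


def karp_flatt(speedup, cpus):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    if cpus < 2:
        raise ValueError("need at least 2 CPUs")
    return (1.0 / speedup - 1.0 / cpus) / (1.0 - 1.0 / cpus)


if __name__ == "__main__":
    # Assumed reading of the comment: 4x throughput on 8 CPUs.
    print(parallel_efficiency(4, 8))   # 0.5
    print(round(karp_flatt(4, 8), 3))  # 0.143
```

    Under that reading the machine delivers half of ideal linear scaling, which is the kind of figure a per-CPU-count graph would make obvious.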

    --
    There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
  7. Use the built-in benchmark tool by Anonymous Coward · · Score: 1, Informative

    >time make clean bzImage modules
    [...]
    real 6m2.519s
    user 5m13.950s
    sys 0m20.080s

    => efficiency: 93.6%
    (2.4.18,xfs,ide)
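
    The post does not say how the "efficiency" figure is defined. A minimal sketch, using the three times quoted above, of two ratios one might compute: CPU utilisation as (user + sys) / real, and 1 - sys/user, which happens to reproduce the 93.6% shown. Treating the latter as the poster's formula is a guess, and to_seconds is a made-up helper name.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` value such as '6m2.519s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unrecognised time value: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


# Times quoted in the post above.
real = to_seconds("6m2.519s")
user = to_seconds("5m13.950s")
sys_time = to_seconds("0m20.080s")

# Share of wall-clock time during which a CPU was busy with the build.
cpu_utilisation = (user + sys_time) / real

# One ratio that reproduces the quoted figure (an assumption, not confirmed).
one_minus_sys_over_user = 1 - sys_time / user

if __name__ == "__main__":
    print(round(cpu_utilisation * 100, 1))          # 92.1
    print(round(one_minus_sys_over_user * 100, 1))  # 93.6
```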

    1. Re:Use the built-in benchmark tool by aardvarkjoe · · Score: 1

      real 1m49.162s
      user 1m35.010s
      sys 0m6.030s

      Hah!

      --

      How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
    2. Re:Use the built-in benchmark tool by Anonymous Coward · · Score: 0

      "time make clean bzImage modules", isn't that something that you start before you go to bed, wake up, kill off the normal cron jobs when the load average reaches 50.00 (on a uniprocessor box), go to school/work, and wait for it to be ready in a day or two? (based upon actual experience compiling 2.4 kernels)

  8. Is there deviation? by rufusdufus · · Score: 4, Insightful

    These benchmarks, like so many you see nowadays, do not include or even mention deviation across benchmark runs. There is no evidence that the tests were run more than once in order to achieve a more statistically accurate view of the benchmark numbers.
    In theory, all benchmarks should come with an average value and an error margin. Without this, the data should not be trusted. It not only implies that the margin of error *might* be over 100%, it indicates the people running the benchmarks don't know what they are doing.

    There are a lot of reasons benchmarks can have errors, one of them being that the benchmark program itself can be broken. How would you know that the numbers returned on some test weren't random if you didn't run it more than once?
    Also, disk drives and networks have latencies which can make a huge difference; those differences can wash out apparent benefits of OS tweaks.
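
    A minimal sketch of reporting an average with an error margin over repeated runs. The run times below are invented for illustration, summarize and clearly_different are made-up names, and the comparison rule (means further apart than twice the combined standard errors) is a crude rule of thumb, not a proper significance test.

```python
import statistics


def summarize(samples):
    """Mean, sample standard deviation and standard error of repeated runs."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate deviation")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return {
        "mean": mean,
        "stdev": stdev,
        "sem": stdev / len(samples) ** 0.5,
        "relative_stdev": stdev / mean,
    }


def clearly_different(before, after):
    """Crude check: do the means differ by more than twice the combined standard errors?"""
    a, b = summarize(before), summarize(after)
    return abs(a["mean"] - b["mean"]) > 2 * (a["sem"] + b["sem"])


# Hypothetical elapsed times in seconds for five runs each.
RUNS_BEFORE = [100.0, 102.0, 98.0, 101.0, 99.0]
RUNS_AFTER = [97.0, 99.0, 95.0, 98.0, 96.0]

if __name__ == "__main__":
    for label, runs in (("before", RUNS_BEFORE), ("after", RUNS_AFTER)):
        s = summarize(runs)
        print("%s: %.1f +/- %.2f s" % (label, s["mean"], s["sem"]))
    print("clearly different:", clearly_different(RUNS_BEFORE, RUNS_AFTER))
```

    A single run per configuration gives no stdev at all, which is exactly the complaint above.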

    1. Re:Is there deviation? by Ramses0 · · Score: 1

      It wasn't some guy in his garage, and the data was presented in graph form (not chart). You can assume that for each data point, the testing methodology was the same, and that the trend-line results were the most important pieces. Besides, they admit up front that they're only trying to improve kernel performance, not guarantee that Apache version $foo.0 can serve 2000 hits per second.

      --Robert

    2. Re:Is there deviation? by Anonymous Coward · · Score: 0

      Oh please. This isn't The Middleware Company we're talking about here. Having worked at a large blue company myself, I can assure you that the test wasn't run by a guy on his laptop, just for the heck of it. And if the results are going to be published, you can be assured that they were reviewed by at least peers and probably managers, as well.

  9. More attention to IO needed by dmeranda · · Score: 5, Interesting

    Why does it seem that all these benchmarks are primarily concerned with CPU performance or network throughput or single-disk reading and writing? For a large category of enterprise applications (which this paper says it is trying to address), I/O performance can usually be the most important part.

    The problem is that the typical PC hardware is just not designed for that. Large proprietary Unix or mainframe systems usually have multiple very high speed buses; a single 32-bit PCI bus is rather low-end in comparison. Now of course this is not Linux's fault; but then again Linux is not just a PC operating system! So I guess my question is: if this is about benchmarking Linux for enterprise use, how about some information about Linux running on enterprise-class hardware rather than souped-up PCs? I'm sure IBM must have a few resources there.

    In particular I'm interested in how the Linux kernel is designed to handle multiple independent I/O buses. Are the I/O schedulers weighed down with locking issues or interrupt contention? Or what about the allocation of memory buffers between faster and slower I/O devices? Or even its support for advisory I/O operations (hinting) that some proprietary OSes provide? What about asynchronous I/O?

    And of course Linux suffers from the general Unix philosophy when it comes to giving I/O the same level of attention as CPU. For instance there are lots of processor use controls, such as process nice levels, processor affinities, real-time schedulers, threading options galore, etc. But how do you say that a given process may only use 30% of the I/O bandwidth on a particular bus? And those are things that mainframes were good at, so how does Linux on mainframes compare?

    1. Re:More attention to IO needed by g4dget · · Score: 4, Informative
      In particular I'm interested in how the Linux kernel is designed to handle multiple independent I/O buses.

      By running multiple kernels. Seriously: the way to get great performance out of PC hardware is to buy lots of it and cluster it. You still end up paying less for more performance than with the high end systems.

    2. Re:More attention to IO needed by virtual_mps · · Score: 4, Informative

      The problem is that the typical PC hardware is just not designed for that. Large proprietary Unix or mainframe systems usually have multiple very high speed buses; a single 32-bit PCI bus is rather low-end in comparison.


      A single 32 bit PCI bus is anemic these days. That's why high-end servers based on ia32 processors include multiple PCI busses, increasingly PCI-X (133MHz, 64bit). Note that servers based on other processors are increasingly moving away from proprietary busses and using the same PCI you'll find in those intel-based systems.
    3. Re:More attention to IO needed by Anonymous Coward · · Score: 0

      > Seriously: the way to get great performance out of PC hardware is to buy lots of it and cluster it.

      You missed the point, in more than one respect. His point was "how is linux doing on higher end hardware," not on low-end PCs.

      There are plenty of problems that don't scale well on clusters. There are problems that are only easily solved on large machines. There are certain problems that need the I/O that large machines will provide. ( http://slashdot.org/article.pl?sid=02/12/04/013252&mode=thread&tid=137 ) If you don't understand that, it shouldn't take too long to find someone to help you understand how 16 Athlon servers with 1 Gb of memory and fast ethernet is not quite in the same league as a mid-range system with 16 CPUs and 16 Gb of memory for certain classes of problems.

      If the linux community doesn't want linux to have its ass handed to it in perpetuity when trying to solve those problems then linux has got to scale with machine size. Of course, if you don't mind that, then Microsoft, HP, Sun, IBM, SGI, ..... will all thank you. And, along the way, linux will gain a poor technical reputation, and will start to be ignored for certain classes of problems, the list of which will likely grow. (I haven't personally noticed any tendency for people to build larger clusters or larger machines to solve simpler problems... have you?)

      Also, you are making the faulty assumption that what you perceive as a high-end feature will never migrate down to low-end PCs. (Exactly how common were: 3Ghz processors, 1 Gigabyte memory, multithreaded CPUs, dual CPUs, 64 bit CPUs, 64bit busses, 533Mhz system busses, 200Gb disks, 128Mb graphics cards, etc. etc. etc. 10 years ago? Oh.)

      I think that you are also forgetting that linux had to grow just to get to the point where it could cluster. That wasn't always the case you know. There was a point linux would just boot and drool over itself. It took real work to get it to talk to the network. Then to cluster. Next to scale.

      Linux has 3 choices: lead, follow, or get pushed out of the way.

    4. Re:More attention to IO needed by Bernie+Fsckinner · · Score: 1

      But what if you are running a big database? Building a database cluster is not exactly simple.

    5. Re:More attention to IO needed by g4dget · · Score: 1
      There are a bunch of commercial products that make building distributed databases fairly easy. IBM promises that with one of their DB2-based products, you basically just plug in a new machine and point it at the master database server.

      Some open source equivalent would sure be nice. But even something homegrown for particular applications isn't too hard; usually, you can pretty easily find an obvious field by which to distribute and balance database content across different servers.
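
      A minimal sketch of that homegrown approach, assuming rows are plain dicts and the servers are just names. The field customer_id, the server names and the functions shard_for and distribute are all hypothetical; this is hash-based placement on one field, not how any DB2 product does it.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key using a hash that is stable across runs."""
    digest = hashlib.sha1(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def distribute(rows, field, servers):
    """Group rows by the server chosen from the value of one field."""
    placement = {server: [] for server in servers}
    for row in rows:
        placement[shard_for(row[field], servers)].append(row)
    return placement


# Hypothetical data and server names.
SERVERS = ["db0", "db1", "db2"]
ROWS = [{"customer_id": n, "total": n * 10} for n in range(12)]

if __name__ == "__main__":
    for server, rows in distribute(ROWS, "customer_id", SERVERS).items():
        print(server, [row["customer_id"] for row in rows])
```

      The catch with a plain modulo is that adding a server moves most keys, so rebalancing is where the real work starts.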

  10. Can Linux become Mozilla? by Anonymous Coward · · Score: 2, Interesting

    I've been reading the comments from some Mozilla people ever since Apple came out with Safari based on KHTML, and it's been suggested that the bloat and delay of Mozilla comes from too many developers. Makes me wonder if Linux will succumb to the same problem.

    1. Re:Can Linux become Mozilla? by snofla · · Score: 1

      Sure, here are the first signs. Hard to believe? Even Bob X. Cringely says "Even today, you can still get to a C: prompt under Windows XP, which means a disk operating system is hiding there no matter what Microsoft wants us to believe." I don't know what he means with this, but it has DOS in it.

      --
      i don't like style guides
    2. Re:Can Linux become Mozilla? by squirmee · · Score: 2, Interesting

      The developer momentum behind Linux is somewhat more diffuse than in Mozilla. There are thousands of device drivers to build and maintain, for instance. Work performed on those device drivers doesn't "bloat" the main kernel, but does drive up the developer count substantially.

      Not to say that featuritis isn't a threat. But ironically, the very "disadvantage" of Linux, its monolithic design that microkernel hackers love to bash, is making it pretty hard to add new features willy-nilly. If we were using the HURD, the kernel would be 900 megs by now... (and Emacs would be a kernel module)

    3. Re:Can Linux become Mozilla? by giel · · Score: 2, Interesting

      I'd say that's a rather strange conclusion. The only thing the 'C:' means there is a non graphic shell to use the OS.

      My cellphone has something called Explorer, very similar to the M$ one. You can browse some kind of filesystem with it. Does that mean it's running Windows? Does that mean there is a disk in it? Download cygwin and then Windows can come up with '/root/ $:'.

      --
      giel.y contains 2 shift/reduce conflicts
    4. Re:Can Linux become Mozilla? by jas79 · · Score: 2, Funny

      If we were using the HURD, the kernel would be 900 megs by now... (and Emacs would be a kernel module)

      Since when is having a choice from 900 megs of kernel modules bad?

    5. Re:Can Linux become Mozilla? by Anonymous Coward · · Score: 0

      Windows XP allows you to operate your disks!?! Wow!! -- what an amazing revelation -- when is Linux going get disk operating system features?

      Both you and Cringley should be sent to the concentration camps for stupid people.

    6. Re:Can Linux become Mozilla? by ostiguy · · Score: 2, Informative

      Just because it is in print does not mean it is true. Cringely is wrong on a lot of things, and this is one of them. MS maintained the drive lettering construct for backwards compatibility purposes, and with each revision, there are fewer and fewer limitations because of it (in 2k, MS introduced the ability to mount drives as folders in existing file systems, a Unix-like feature). In the NT/2k/XP family, DOS programs can only run in a DOS virtual machine (NTVDM), somewhat similar to the Java model.

      ostiguy

  11. For simplicity's sake by r6144 · · Score: 5, Informative
    When running with multiple CPUs, the kernel instances running on these CPUs need regulation when they access shared data. Such regulation is usually implemented with locks. A simple approach is to use a small number of "big" locks (like a lock that makes sure that only one CPU can run actual kernel code). This is very simple and easy to debug, but may cause poor performance because one CPU cannot (for example) do network transfers while another is reading disk, even though this should be allowed in principle. So we should use finer-grained locks. However, as we make locks more and more fine-grained, we have more and more locks, so things get messy and hard to debug, and locking/unlocking overhead goes up, degrading performance on machines with fewer CPUs. Because of this cost, we should make locks finer-grained only when benchmarks show that it actually improves performance significantly.

    Of course this applies to something else, like making transfers zero-copy, too.
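
    The trade-off can be sketched in user space. This is an illustration of the structure only, using Python threads rather than kernel code, and the class and subsystem names are made up. With one big lock, a "net" update has to wait for a "disk" update; with one lock per subsystem they never contend, at the cost of more locks to reason about.

```python
import threading


class CoarseStats:
    """One 'big' lock guards every counter, whatever the subsystem."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counts = {"net": 0, "disk": 0}

    def bump(self, subsystem):
        with self._lock:  # net and disk updates serialize here
            self.counts[subsystem] += 1


class FineStats:
    """One lock per subsystem: more locks to manage, less contention."""

    def __init__(self):
        self._locks = {"net": threading.Lock(), "disk": threading.Lock()}
        self.counts = {"net": 0, "disk": 0}

    def bump(self, subsystem):
        with self._locks[subsystem]:  # only same-subsystem updates contend
            self.counts[subsystem] += 1


def hammer(stats, per_thread=10000):
    """Run two 'net' and two 'disk' workers against the given stats object."""

    def worker(name):
        for _ in range(per_thread):
            stats.bump(name)

    threads = [threading.Thread(target=worker, args=(name,))
               for name in ("net", "disk", "net", "disk")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stats.counts


if __name__ == "__main__":
    print(hammer(CoarseStats()))
    print(hammer(FineStats()))
```

    Both variants give the same counts; which one is faster on a given machine is exactly what has to be measured, and Python's interpreter lock means this sketch says nothing about real kernel numbers.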

    1. Re:For simplicity's sake by the_bean42 · · Score: 1

      Isn't it possible to just #ifdef all the locks or something, so they won't get compiled in for uniprocessor kernels?

    2. Re:For simplicity's sake by Anonymous Coward · · Score: 0

      Listen you ignorant boob. Read what he said! He said (quite clearly, which is rare on slashdot) that the problem is adding locks make you slower than a uniproc machine. Nothing about the locks being in single proc machines. You should really finished middle school before you start reading slashdot.

  12. Benchmark junkies by Anonymous Coward · · Score: 1, Funny

    Benchmark junkies are abound, around and have wet dreams over these articles.

    I am one of them.

    Please, mooore!!!

  13. Usually not necessary by r6144 · · Score: 5, Insightful
    I have installed Linux several times now, on different machines (mostly Red Hat). UDMA settings are almost always right on modern machines. The only exception is an old P166 machine with a very old HD, where the original 2.2 kernel does not support DMA on it, but 2.4 does (transfer rate 5MB/s -> 10MB/s). Fussing with the kernel usually doesn't give much benefit, and is definitely not for newcomers.

    Things actually useful are: disabling unnecessary services on startup (if you don't use atd, don't start it to save start-up time, and in many machines it is unnecessary to detect hardware changes using kudzu upon startup); for machines with multiple HD's, put the swap on the faster HD.

    1. Re:Usually not necessary by modulo · · Score: 1

      Over the last two years on a production database server running RedHat, I found that I needed to recompile the kernel quite a few times to get new hardware support/bugfixes not found (at the time) in the stock RedHat kernel. Support for the Promise controllers on Asus motherboards, for example, tends to lag a few months behind the appearance of the hardware, the first patches, and support in the vanilla kernel release. More recently, the latest RedHat kernel (2.4.18-19.7.x) for Athlon does not enable IO-APIC, because apparently it locks up some laptops. Well, I think it's a good idea for my server, so away I go getting kernel-source.rpm.

      Now, I respect the testing and validation RedHat provides with their kernels, so I use them when I can. Arguably, if I would use more server-oriented hardware it wouldn't be an issue, but my budget is, to put it mildly, modest.

      But you're right in the sense that there is probably little to be gained in saving, say, 50KB in your bzImage by cutting out drivers that you don't use, etc. At least I don't see it subjectively, maybe somebody else can volunteer some benchmarks, but I think the attitude that you can really see the difference by recompiling your own kernel for performance is a holdover from the days when the major distros only compiled for i386 and memory was a whole lot tighter.

      --

      ...but the language is MUMPS, which I will not utter here

  14. Measurement - Lord Kelvin said it best by Anonymous Coward · · Score: 4, Insightful

    "In physical science the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be."

    1. Re:Measurement - Lord Kelvin said it best by bubbha · · Score: 1


      but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind

      This kind of knowledge is called Enlightenment

      --
      I want to be alone with the sandwich
  15. well duh, the article is about benchmarking by DrSkwid · · Score: 1

    members of the IBM Linux Technology Center share their expertise as they describe how they ran several benchmark tests

    Notice not "IBM share Benchmark testing results"

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  16. Sub kernel? by Jedi+Binglebop · · Score: 3, Interesting

    What about kernel developers creating a sub-version of the kernel (used only by those who choose to) that logs and relays information on the performance of that kernel on various users' machines?

    Is this a bad idea? Would it take too many hours of extra work?

    -JB

    --

    "I love deadlines. I love the "whooshing" sound they make as they pass by." - Douglas Adams.

  17. Benchmarking for interactivity by ehack · · Score: 5, Interesting

    I wish there were some interactive workload benchmarks. I know this is history, but when the kernel went 1.2 I found my machine really slow; the benchmarks were better but somehow the usability had gone down. It would be neat to measure the way the mouse tracking feels, the "snap" with which menus open in an application, Netscape getting a page and rendering it, etc. Kernel compilation and numerics are not the main use of a desktop machine these days ...

    On a related note, my Mac Powerbook was really sluggish until I managed to kill some unneeded processes; they weren't really eating up time by themselves, but were somehow impacting system reactivity: The load factor hardly moved but the system became responsive to mouse clicks.
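
    One crude way to put a number on the "snap" described above is to ask for a short sleep over and over and record how late each wakeup arrives. This is only a sketch: the function names, the 5 ms interval and the sample count are all made up for illustration, and it measures timer wakeup lateness in one process, not mouse tracking or menu rendering.

```python
import statistics
import time


def wakeup_overshoot(interval_s=0.005, samples=50):
    """Sleep for interval_s repeatedly; return how late each wakeup was (seconds)."""
    late = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval_s)
        elapsed = time.perf_counter() - start
        # Lateness is the time slept beyond what was requested.
        late.append(elapsed - interval_s)
    return late


def summarize(late):
    """Reduce the samples to a median and a worst case, in milliseconds."""
    ordered = sorted(late)
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "worst_ms": ordered[-1] * 1000,
    }


if __name__ == "__main__":
    print(summarize(wakeup_overshoot()))
```

    Running it once on an idle machine and once with a few background processes going gives two sets of numbers to compare; the worst case is usually more telling than the median for how a desktop feels.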

    --
    This is not a signature.
  18. I/O vs CPU & Modularised Linux by AtomicX · · Score: 5, Interesting

    I agree that I/O is currently a weakness of Linux and that it needs a lot more attention. Linux's ability to make the most of the processor is very good and has already been very well developed. CPUs have advanced so far in the past few years that the CPU is no longer the main bottleneck of the system. I/O technologies have stood pretty much still, with only small advances, so no wastage or inefficiency on the part of the OS is acceptable.

    It is a pity that Linux developers, like Unix developers, have become a little stuck in their ways - hopefully they will do their best to address this in the 2.6 and 3.0 kernels.

    I like the idea of a modularised kernel, where people could use the I/O system that best suited their setup - but this could involve an awful lot of division and arguments, and the number of bugs that would result could be huge. Perhaps Linux itself could automatically adapt the way it works to suit its needs, which would address the problem of Linux's hugely varying performance. Does anyone else have any suggestions or comments on this?

    1. Re:I/O vs CPU & Modularised Linux by Anonymous Coward · · Score: 1, Interesting

      Maybe a change to the IO loading would be a good reason to take Linux Kernel to 3.0?

  19. mod_specweb by e__alf · · Score: 1

    "Some of the issues we have addressed that have resulted in the improvements shown include adding O(1) and read copy update (RCU) dcache kernel patches and adding a new dynamic API mod_specweb module to Apache."

    Uhmm... isn't this considered cheating?

    source code for the patch

    1. Re:mod_specweb by Anonymous Coward · · Score: 0

      Not if you're trying to benchmark the kernel, instead of the webserver.

    2. Re:mod_specweb by Anonymous Coward · · Score: 0

      Ah. I stand corrected. Thank you :-)

  20. Not THAT bloated. by r6144 · · Score: 2, Interesting
    As slow and bloated as mozilla? Probably not. Although the code does look a bit messy and bloated in some places, a bit like sendmail or gcc (i.e., code size and speed may be good, but there is still a lot of code that is not easy to maintain).

    Mozilla uses C++ (and most methods are virtual) and component interfaces like XPCOM. Such things probably enhance developer productivity, but they incur quite a bit of overhead in code size and (less so) in speed.

    It is great that core developers actually care about code size and instruction-level speed (such as the recent syscall patch, or those highly optimized inline functions in headers), and there are many people sending patches to clean up code. Maybe linux won't get as bloated as mozilla after all...

  21. The fastest kernel configuration is... by Anonymous Coward · · Score: 1, Funny

    rm -Rf /

  22. most valuable .comoddity by Anonymous Coward · · Score: 0

    that would be integrity. it can't be bought. it diminishes as greed/fear based .coNTracting eXPands. it has become VERY scarce DOWn here. integrity is the goaled standard/fuel oil of the gnu millennium. see you there.

    tell 'em robbIE.

  23. Now there's a thought, run it through wine. by Anonymous Coward · · Score: 0

    I wouldn't be a slashdot poster if I checked whether it's been done.

  24. Re:to improve linux kernel performance by danoaks15 · · Score: 1

    I agree

  25. I don't mind what/who the source is. by Anonymous Coward · · Score: 0

    If they give full disclosure of their methods and configuration so that anyone can reproduce the results. An example of this not being the case: sysmark/bapco/intel.

  26. contributed kernel improvements by Rebar · · Score: 1

    There is some interesting info at the bottom of this page outlining some improvements Oracle and RedHat have made to the Linux kernel regarding things such as SMP processor affinity and asynchronous I/O. Presumably these are open source changes -- the article doesn't mention them at all.

  27. Something like the contest benchmark? by salimma · · Score: 2, Informative

    The contest benchmark might be what you are looking for. It tests system responsiveness by running kernel compiles under different kinds of load.

    Still based on kernel compiles, granted, but at least it tries to measure responsiveness. Been used heavily to benchmark recent kernels - check Kernel Trap for results.

    The Linux scheduling latency page of Andrew Morton might be useful as well. Alas, kernel patches tend to work on x86 first, before PPC.
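
    The general idea of timing a fixed job with and without competing load can be sketched in a few lines. This is a toy stand-in and not contest itself: the fixed job is a small arithmetic loop rather than a kernel compile, the background load is a couple of busy threads in the same process, and every name and default value here is made up for illustration.

```python
import threading
import time


def fixed_workload(n=200_000):
    """A repeatable CPU-bound job standing in for the kernel compile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_workload(n=200_000):
    """Wall-clock seconds taken by one run of the fixed job."""
    start = time.perf_counter()
    fixed_workload(n)
    return time.perf_counter() - start


def time_under_load(n=200_000, hogs=2):
    """Time the same job while `hogs` busy threads compete for the CPU."""
    stop = threading.Event()

    def hog():
        while not stop.is_set():
            sum(range(1000))

    threads = [threading.Thread(target=hog, daemon=True) for _ in range(hogs)]
    for t in threads:
        t.start()
    try:
        return time_workload(n)
    finally:
        # Always stop the background load, even if the timed run fails.
        stop.set()
        for t in threads:
            t.join()


def load_ratio(n=200_000, hogs=2):
    """Loaded time divided by unloaded time; closer to 1.0 means less disturbance."""
    baseline = time_workload(n)
    loaded = time_under_load(n, hogs)
    return loaded / baseline


if __name__ == "__main__":
    print(f"slowdown under load: {load_ratio():.2f}x")
```

    A real responsiveness test would use separate processes and I/O or memory load as well; threads are used here only to keep the sketch short.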

    --
    Michel
    Fedora Project Contribut
  28. Volanomark by Anonymous Coward · · Score: 0

    It's funny how these people think that running Volanomark in loopback mode stresses the Ethernet driver.

  29. Not very scientific, not very informative by NynexNinja · · Score: 1

    The article lacks substance. Specifically, what did they tune to arrive at the results they claim? None of that basic information is included in the report.

  30. Interesting ideas about performance profiling by Featureless · · Score: 2, Interesting

    The IBM paper is interesting, but beyond doing these straightforward kinds of measurements, I can think of a lot of better approaches to improving kernel and core application performance, based on research I've seen... When I was doing profiling work on supercomputer stuff a few years back I surveyed the tools and found some systems that use really novel approaches which could definitely be adapted to this purpose. I suppose word doesn't really get out about some of this stuff; anyway, take a look and see for yourself:

    S-Check

    S-Check starts with your original source code and points suspected of being bottlenecks. It adds artificial delays at the specific points throughout the parallel code. These delays can be switched ON or OFF. The switched delays generate numerous new versions of the program, with the delays simulating adjustments in code efficiency. S-Check methodically executes the many variants, recording delay settings and corresponding run times. S-Check analyzes the recorded entries against a linear response model using techniques from statistics. The results are a sensitivity analysis from which program problem areas can be identified. This provides a portable, scalable, and generic basis for assaying parallel and network based programs.

    Paradyn
    (overview)

    "...a heuristic, goal-seeking algorithm was coupled with a dynamic instrumentation package to drive an automated, systematic inquiry into the performance of a parallel application."

    The upshot is tools which can instrument a running system on the fly, and use statistical techniques that identify "hot spots" by looking for the amount of "collateral damage" when adding artificial delays to a particular location. You can even go farther, mapping out relationships, etc.
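
    To make the delay-injection idea concrete, here is a minimal sketch on a simulated program. It is not S-Check's or Paradyn's actual implementation: the three code sites, their base costs, the 0.5 s delay and the runtime model are all hypothetical. For a balanced on/off design like this one, the difference between the mean runtime with a delay on and the mean with it off is the same slope a least-squares linear fit would give.

```python
from itertools import product

# Hypothetical base costs (seconds) for three code sites in a toy program.
BASE = {"a": 1.0, "b": 3.0, "c": 2.0}
DELAY = 0.5  # artificial delay added when a site's switch is on


def simulated_runtime(switches):
    """Toy model: sites a and b run in parallel, then c runs serially."""
    cost = {site: BASE[site] + (DELAY if switches[site] else 0.0) for site in BASE}
    return max(cost["a"], cost["b"]) + cost["c"]


def sensitivity():
    """Run every on/off combination and estimate each site's effect on runtime."""
    sites = sorted(BASE)
    runs = []
    for combo in product([False, True], repeat=len(sites)):
        switches = dict(zip(sites, combo))
        runs.append((switches, simulated_runtime(switches)))

    effects = {}
    for site in sites:
        on = [t for s, t in runs if s[site]]
        off = [t for s, t in runs if not s[site]]
        # Main effect: mean runtime with the delay on minus mean with it off.
        effects[site] = sum(on) / len(on) - sum(off) / len(off)
    return effects


if __name__ == "__main__":
    for site, effect in sensitivity().items():
        print(f"site {site}: +{effect:.2f}s per {DELAY}s of injected delay")
```

    In this toy model the delay at site a never shows up in the total, because a is hidden behind the slower parallel site b, while delays at b and c pass straight through. That is the point of the technique: the sites whose delays hurt are the ones worth optimizing.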

    These are approaches that came out of parallel supercomputing, because in that field traditional approaches to benchmarking and profiling are often useless and/or impractical, and the systems (and programming problems) have become so complex that effective hand tuning becomes nearly impossible as well. Of course the kernel isn't so simple either, and these days you have parallelism to boot... I would love to see these techniques solving a wider range of problems.

  31. With apologies to "Friends" by stor · · Score: 1

    And in conclusion, graphs are going up... so I'm happy.

    Cheers
    Stor

    --
    "Yeah well there's a lot of stuff that should be, but isn't"
  32. It is very interesting who wrote the story by mAriuZ · · Score: 1

    Look at the authors' email addresses: two from IBM and one from AMD. It seems AMD is
    looking at Intel boxes?? "The architecture used for the majority of this work is IA-32 (in other words, x86), from one to eight processors. We also study the issues associated with future use of non-uniform memory access (NUMA) IA-32 and NUMA IA-64 architectures."
    Hmm, I am sure the next Hammers could do NUMA; maybe they are trying to make it work better in Linux.

    --
    developer http://flamerobin.org
  33. Linux is dying ... by Anonymous Coward · · Score: 0

    well, it is.

  34. The most successful OpenSource project yet?? by Anonymous Coward · · Score: 0

    Isn't XFree86 and in fact X11 the most successful OpenSource project EVER??

  35. Spelling by vocaro · · Score: 1

    I don't trust any article that calls it a "kernal".

  36. Desktop performance... by oliverthered · · Score: 1

    One thing that's hard to measure is desktop performance.

    I have a crap all-in-one mobo with shared-memory graphics and no DRI support (OK, I needed a PC quick). KDE is super clunky under 2.4, even with the CK performance patchset.

    Under 2.5 the desktop is quick and smooth, applications seem to load a lot faster, and Java applets don't hog the CPU.

    So, if you're running Linux on the desktop and you feel sufficiently competent, start testing 2.5.

    --
    thank God the internet isn't a human right.