Improving Linux Kernel Performance
developerWorks writes "The first step in improving Linux performance is quantifying it, but how exactly do you quantify performance for Linux or for comparable systems? In this article, members of the IBM Linux Technology Center share their expertise as they describe how they ran several benchmark tests on the Linux 2.4 and 2.5 kernels late last year. The benchmarks provide coverage for a diverse set of workloads, including Web serving, database, and file serving. In addition, we show the various components of the kernel (disk I/O subsystem, for example) that are stressed by each benchmark."
I'm just curious what they are quantifying performance against. Everything here seemed to be strictly on the Network side of things. Are they trying to increase the actual Kernal processing of the individual threads for the network applications (File Serving, DB, and Webserving), or are they just measuring the eff. for the processing of data packets for the services.
It sounds interesting, but it looks like the tuning is done specifically on the IBM platform, which makes me wonder. Linux already blows and MS product away for these applications, so I'm curious what they are comparing the results to. Did they just take an arbitrary point (processor load) for specific applications, or are they creating a specialized measurement (like SysMarks in Windows) that is only valid in their test suite.
Anyway, it should be interesting to see where it ends up, eventually.
I agree. This article is essentially useless. They're basically saying "hey look, we made it faster, wohoo!" but they completely gloss over the details of how they did it. Where's the cumulative patches against various stock kernels?
Why is what we compare it to the most important issue?
Sure, we want to see how the Linux kernel is performing, but that's unrelated to increasing it's performance - when working on the performance of a single part, people built a test for that part, and tweaked it.
No benchmark or comparison is required in this case.
My other
These benchmarks, like so many you see nowadays, do not include or even mention deviation across benchmark runs. There is no evidence that the tests were run more than once in order to achieve a more statistically accurate view of the benchmark numbers.
In theory, all benchmarks should come with an average value, and an error margin. Without this, the data should be not be trusted. It not only implies that the margin of error *might* be over 100%, it indicates the people running the bench marks don't know what they are doing.
There are a lot of reasons benchmarks can have errors, one of them being the benchmark program itself can be broken. How would you know that the numbers returned on some test weren't random if you didnt run it more than once?
Also, disk drives and networks have latencies which can make a huge difference; those difference can wash out apparent benefits of OS tweaks.
Things actually useful are: disabling unnecessary services on startup (if you don't use atd, don't start it to save start-up time, and in many machines it is unnecessary to detect hardware changes using kudzu upon startup); for machines with multiple HD's, put the swap on the faster HD.
"In physical science the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be."