5 Years of Linux Kernel Releases Benchmarked
An anonymous reader writes "Phoronix has published benchmarks of the past five years worth of Linux kernel releases, from the Linux 2.6.12 through Linux 2.6.37 (dev) releases. The results from these benchmarks of 26 versions show that, for the most part, new features haven't affected performance."
Considering the efforts going into VM these days and the massive deployments in Fortune 500 companies, the performance of VM based systems is predictable. All the testing with Phoronix Test Suite is repeated until there is less than 3% variance between the results - or the result set is discarded.
Realistically, looking at older kernels on modern hardware is actually a very critical dimension for corporate server environments. There are applications in that space that are deployed and supported only on some old distribution. Being able to achieve and understanding how Red Hat 7.1 will act vs Red Hat 5 is critical for some environments.
I love that Phoronix is willing to take the time to run tests like this. I just wish they'd learn how to run meaningful tests. For instance, why are they testing a bunch of CPU-bound things? Kernel won't affect that unless we're talking about SMP performance. If you want to test the kernel, test how well it handles SMP, network I/O and disk I/O. And bear in mind that disk I/O will be hugely affected by which filesystem is used and its configurable settings.
Another problem with their article is that it tests individual kernels. Most folks don't use a vanilla kernel. They use one provided by their distro, which may have distro-specific patches that address some of the performance problems (or add new ones). What I would have preferred to see is a comparison of different distro releases over the last 5 years, focusing on the most popular ones (say Ubuntu, Fedora and SuSE).
The meaningful tests (and their results) were:
1. GnuPG: avoid 2.6.30 and later.
2. Loopback TCP: avoid 2.6.30 and later.
3. Apache Compilation: avoid 2.6.29 and earlier.
4. Apache static content: avoid 2.6.12, 2.6.25, 2.6.26, then 2.6.30 and later.
5. PostMark: avoid 2.6.29 and earlier.
6. FS-Mark: avoid 2.6.17 and earlier, 2.6.29, then 2.6.33 to 2.6.36.
7. ioZone: unless you're willing to run 2.6.21 or earlier, avoid 2.6.29 and you're fine.
8. Threaded I/O: avoid 2.6.20 and earlier, 2.6.29, then 2.6.33 to 2.6.36.
Based on these results, #1 and #2 seem to be testing the same thing, and tests #3 and #5 seem to be testing the inverse of whatever that thing is. 2.6.29 seems to be especially crappy, performing worse than the kernels immediately before and immediately after it on tests #6, #7 and #8. In terms of recent kernels, tests #6 and #8 suggest a regression in 2.6.33 that has been resolved in 2.6.37.
If it were me, I'd look at either running 2.6.37 (when its released) or fall back to 2.6.32 if my hardware was supported.
Better
Worse
Same