Kernel Benchmarks

Nice graphwork by Anonymous Coward · 2001-05-09 10:08 · Score: 3

http://euclid.nmu.edu/~benchmark/null_call.gif

This shows why computer guys are not scientists. My first year phys chem prof would tear his own arm off and beat you to death with it if you gave him a graph that looked that ugly.

The Excel defaults may be ugly, but you can change them.

GCC optimizations and benchmarking by Anonymous Coward · 2001-05-09 08:47 · Score: 5

One problem with benchmarking is the optimization settings for GCC. GCC is very sensitive to the proper choice of optimizations. Several years ago I did an extensive test of GCC using the Byte benchmark suite. I experimented with the various optimization settings. The most important were the settings of -malign-jumps, -malign-loops, and -malign-functions. These flags each take a numerical argument representing a power of 2 on which the object will be aligned.

Thus "0" indicates byte alignment, "1" word (16 bit) alignment, "2" doubleword (32 bit), "3" quadword (64 bit), and "4" paragraph (128 bit). The other optimization of interest is the "-O" setting. Here arguments can take the value of 0, 1, 2, or higher. Personally, I found that -O2 was not necessarily the best setting, although it seems very common to find it set to that in Makefiles. I found using -O1 and tuning the alignment optimizations by hand provided better results.
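To make the exponent convention concrete, here is a minimal Python sketch of the mapping described above. The helper name alignment_for is made up for illustration. It only restates the arithmetic (2^n bytes) and does not invoke GCC.

```python
# Hypothetical helper: restates the "power of 2" convention from the post.
# It does not call GCC; it only does the arithmetic.
NAMES = {0: "byte", 1: "word", 2: "doubleword", 3: "quadword", 4: "paragraph"}

def alignment_for(exponent):
    """Return (bytes, bits, name) for an alignment exponent such as 0..4."""
    if exponent < 0:
        raise ValueError("alignment exponent must be non-negative")
    size_bytes = 1 << exponent          # 2 ** exponent bytes
    return size_bytes, size_bytes * 8, NAMES.get(exponent, "unnamed")

for n in range(5):
    size_bytes, size_bits, name = alignment_for(n)
    print(f"-malign-loops={n}: {size_bytes} byte(s), {size_bits} bits ({name})")
```

So an argument of 4 asks for 16-byte boundaries, which is a lot of padding for a short loop.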

My findings from benchmarking all the combinations of settings were that for a Cyrix 5x86, optimal alignment values were numerically lower than might be expected. For example, close to optimal settings, as I recall, were:

gcc -O1 -m386 -malign-jumps=1 -malign-functions=1 -malign-loops=1

It wouldn't be a bad starting point for any Intel processor. On modern processors, it is more important to achieve a high cache hit rate, which is thwarted by certain wrong optimizations such as aggressive loop unrolling and excessive alignment. One particular setting to avoid is -m486. It should be avoided for most processors other than a 486, because the 486 alignment requirements are less than optimal (i.e. they tend to over-align) for both its predecessors and descendants. And if you don't need a debugging version of your code, -fomit-frame-pointer is usually useful as it frees up an extra general purpose register.
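The exhaustive search described above could be sketched roughly like this in Python. It only builds the command lines and does not run them. The source file name bench.c, the exponent range 0 to 4, and the -O levels tried are assumptions for illustration, and the -malign-* spellings are copied from the post (they belong to GCC releases of that era).

```python
from itertools import product

# Assumptions for illustration: the source file name and the ranges searched.
SOURCE = "bench.c"            # hypothetical benchmark source
OPT_LEVELS = ["-O1", "-O2"]
EXPONENTS = range(5)          # 0..4, i.e. byte through paragraph alignment

def candidate_commands():
    """Yield one gcc command line (as an argument list) per flag combination."""
    for opt, jumps, loops, funcs in product(OPT_LEVELS, EXPONENTS, EXPONENTS, EXPONENTS):
        yield [
            "gcc", opt, "-m386",
            f"-malign-jumps={jumps}",
            f"-malign-loops={loops}",
            f"-malign-functions={funcs}",
            SOURCE,
        ]

commands = list(candidate_commands())
print(len(commands), "combinations")
print(" ".join(commands[0]))
```

Each command would then be compiled and the resulting binary timed. That part is left out because the timing harness depends on the benchmark and the machine.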

Re:GCC optimizations and benchmarking by The+Famous+Brett+Wat · 2001-05-09 10:19 · Score: 5

...which just goes to prove that optimization is (justifiably, as it happens) much -maligned.

--
proof, n. A demonstration that a conclusion is implied by certain premises and axioms.

Re:Yeah, but... by PD · 2001-05-09 09:05 · Score: 4

That's a lot of work just to print out a negative number on your screen...

--
If tits were wings it'd be flying around.

silly graphs by rangek · 2001-05-09 08:25 · Score: 4

Silly graphs are a pet peeve of mine. I hate it when my students give me graphs like these. Needless gridlines, unlabeled legends, connected dots, and poor statistical analysis.

I hate gridlines; they usually distract from the graph.
What the fuck is "Series 1"? For Christ's sake, take a minute and either delete the needless legend or at least overwrite the stupid defaults to make them meaningful.
Connecting the dots means something. If you plot linux 2.1.1 and linux 2.1.14 and draw a line or some other curve between these points, you are telling me that if I pick up linux 2.1.7 it will lie on that curve. That is not a correct interpretation of this data.
Most of these graphs contain a curve labeled Expon or something (once again, great legend). Why exponential? Why not some polynomial or some other function? What is the error in the fit, or the correlation coefficient(s)? Just tell me something that gives me a reason to believe that this curve means something.
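For what it's worth, the kind of check being asked for in that last point is cheap to do. Here is a rough Python sketch that compares a straight-line fit with an exponential fit (a line through log y) by R^2. The numbers are made-up placeholders, not data from the study, and the function names are hypothetical. Note that the two R^2 values are computed in different spaces (y versus log y), so treat the comparison as a sanity check rather than a proper model selection.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination for an ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

def compare_fits(xs, ys):
    """R^2 of a straight-line fit versus an exponential fit (line through log y)."""
    linear = r_squared(xs, ys)
    exponential = r_squared(xs, [math.log(y) for y in ys])
    return linear, exponential

# Made-up placeholder numbers, NOT data from the study.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.0, 8.0, 16.0, 32.0]
linear, exponential = compare_fits(xs, ys)
print(f"linear R^2 = {linear:.3f}, exponential R^2 = {exponential:.3f}")
```

Printing a number like this next to the curve would at least give the reader some reason to believe the trend line.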

I also find it ironic that they used MS Excel (which they don't say they did, but it sure looks like it)...

This benchmark was not that useful by cartman · 2001-05-09 08:11 · Score: 4

First, the university benchmarking team simply ran lmbench (a free, popular, old kernel benchmarking utility) on a variety of kernels. Claiming that:

Three students and a professor from Northern Michigan University spent the semester benchmarking a bunch of Linux kernels

...somewhat exaggerates this accomplishment.

Second, no data were presented on the main areas of the kernel that were improved. How is SMP performance in kernel space? Did the finer grained locks help? How is the performance from the threaded IP stack? Does it prevent IO blocking?

THAT kind of information would have been interesting. They tested only things that the kernel has done forever.

Re:This benchmark was not that useful by Baki · 2001-05-09 14:11 · Score: 3

Another thing making this benchmark useless is that it only tests Linux performance under no-load conditions (i.e. the benchmark is the only thing that runs). It doesn't tell us anything about scalability and keeping up performance under heavy load.
And that is exactly the point that Linux is often criticized for, compared to competitors (Solaris, FreeBSD): it may perform well under no- or light-load conditions, but it doesn't scale well. It would have been interesting to check whether this criticism is still valid for the 2.4 kernels.

Re:Quite limited really by Black+Parrot · 2001-05-09 13:01 · Score: 3

> I definitely noticed a jump in performance between 2.2.16 and 2.4.0 so they must be missing something here.

I use a "real world" benchmark (which of course might be completely irrelevant to you, however relevant it happens to be to me).

Here are some recent observations regarding this specific benchmark, ranked in order of effect:

Changing BIOS memory setting from CAS 2 to CAS 3 : 3.7% speedup.
Changing to a different brand motherboard, and matching the original's BIOS settings as well as possible : 2.1% speedup.
Upgrading 2.4.3 to 2.4.4 : 1.1% speedup.
Running under kernel compiled as "Athlon" rather than "i686" : no substantial difference.

Moreover, although I have not had time to test it, a well-informed friend tells me that using certain recent versions of gcc rather than certain older ones can give a whopping 30% slowdown, even using the same flags for compilation. (N.B. - He did not say "gcc is getting worse with time". He merely remarked re two specific versions, whose numbers escape me at the moment.)

If performance tuning is your forte, then clearly you've got your work cut out for you.

--
Sheesh, evil *and* a jerk. -- Jade

Re:Yeah, but... by the+eric+conspiracy · 2001-05-09 10:40 · Score: 3

Over three years it's still positive.

Re:Yeah, but... by the+eric+conspiracy · 2001-05-09 08:49 · Score: 4

Every evening I run a disk/memory intensive program that does a 3 year analysis of the US stock market. When moving from 2.2.x to 2.4.x I obtained a run time decrease from 270 to 190 seconds. This to me was a VERY impressive upgrade. The same code running on Win2000 takes 1300 seconds to run.
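As a quick worked example of what those run times amount to, here is a small Python sketch. The three timings are the ones quoted above. The helper names are made up for illustration.

```python
# Run times quoted in the post, in seconds.
linux_22 = 270
linux_24 = 190
win2000 = 1300

def speedup(before, after):
    """How many times faster 'after' is than 'before'."""
    return before / after

def percent_reduction(before, after):
    """Run-time reduction as a percentage of the original run time."""
    return 100.0 * (before - after) / before

print(f"2.2.x -> 2.4.x: {speedup(linux_22, linux_24):.2f}x faster, "
      f"{percent_reduction(linux_22, linux_24):.1f}% less run time")
print(f"Win2000 vs 2.4.x: {speedup(win2000, linux_24):.2f}x")
```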

Re:Yeah, but... by the+eric+conspiracy · 2001-05-09 10:36 · Score: 4

It's the same code running on the same box - a dual P2 400 with 0.5 GB of RAM. No ifdefs. Programs are invoked from the command line. Relatively small results datasets are saved to files. Because of the size of the input dataset and the crappy indexes, the main performance determinant is the efficiency of disk I/O and the buffering thereof.

For this application the 2.4 kernel kicks butt up and down the street all day. YMMV.

Another study: by MrClean · 2001-05-09 12:09 · Score: 3

Another, more extensive Linux evolution study is at:
http://plg.uwaterloo.ca/~migod/papers/icsm00.pdf

Re:Devices by CJ+Hooknose · 2001-05-09 09:47 · Score: 3

What I wish is that hardware manufacturers would just use one standard interface, then only one driver for each device would be necessary. Impossible you say? Look at current modems, old sound cards (all sound blaster compatible), NE2000 network cards (I won't buy any other kinds) ATAPI CD-Roms....

Yeah, right. The problem with this approach is that it leads to unnecessarily narrow definitions of functionality, and can prevent hardware manufacturers from doing things cheaper. Not only that, but the examples you chose are kind of screwy. "Current modems" without a qualifier implies the N+1 varieties of WinModems out there, which all do things differently. Many old sound cards did things their own way and had a small DOS TSR that provided SB compatibility in software. The floppy, IDE, and ATAPI command sets, as well as the RS232 serial-port standards, are published and standardized, but these are properly communications protocols between devices, not the devices themselves. The PCI and ISA busses are, again, more like protocols to allow devices to communicate rather than devices themselves. I don't see too many non-PCI, non-ISA devices that plug into the insides of an x86.

Non-x86 hardware platforms have it easier; one vendor like Apple/Sun/IBM says, "This is the list of hardware that works on our platform," and you use it. The multitude of hardware vendors for x86 boards and devices has led to a large amount of conflicting standards and weird, proprietary hardware. (If a vendor can save $0.10 per unit on a device by leaving out hardware functions which can be replicated by a kludged binary driver, they will. Think WinModems.) This approach has also made x86 hardware cheaper than the alternatives.

Simply put, things will change and change quickly in hardware. Standards are a good idea, but they quickly become lowest-common-denominator, think "VGA".

--
Give a monkey a brain and he'll swear he's the center of the universe.

Re:Quite limited really by norton_I · 2001-05-09 12:22 · Score: 3

2.4.0 has a dramatically improved mm system, most of the benefits of which don't show up on these tests, yet make a world of difference in real life.

Re:We'll beat Microsoft yet! by lizrd · 2001-05-09 23:57 · Score: 3

Don't compare apples to oranges.

I've always wondered why people say that. I can make several valid comparisons between apples and oranges:

Oranges have a thicker skin than apples
Apples grow better in northern regions than oranges
Apples make a better pie than oranges
Orange juice is thicker than apple juice
Oranges have larger seeds than apples

I could continue on like this for some time and I don't think that I would ever get around to mentioning either Linux or Win2k whilst comparing apples and oranges (Though, I might get around to mentioning OSX and British cell phone users if I were to keep at it long enough)

--
I don't want free as in beer. I just want free beer.

Pretty sloppy presentation. by cananian · 2001-05-09 09:29 · Score: 3

This was really a pretty sloppy writeup. The "performance note" from Linus was linked a page too early, there were no convenient navigation links, and far too little effort was spent identifying the sources of the performance improvements found. In addition, "capabilities" are blamed for what was really the result of a debugging-printk excess, and at one point at least "kernel 2.1.92" was blamed (a convenient culprit) when looking at the graph it is obvious that kernel 2.1.*32* was the outlier.

I'm not impressed.

--
[ /. is too noisy already -- who needs a .sig? ]

Re:We'll beat Microsoft yet! by joto · 2001-05-09 12:34 · Score: 5

So when will line count surpass Windows 2000?

Depending on point of view, that has already happened long ago...

To make the comparison meaningful, you have to get systems of somewhat equal capacity. The linux kernel by itself is in no way comparable to Windows 2000.

In addition we need various fileutilities, an accelerated X11-server (with Mesa/OpenGL, the video-extension, and antialiasing), one of Gnome/KDE (filemanager, basic desktop utilities, a simple texteditor, something akin to COM (which would be Bonobo or Kparts)), a working web-browser (Mozilla or Konqueror), some userfriendly utilities to replace the control-panel, a user-friendly email-client and newsreader, a simple webserver, basic networking utilities (Samba with a user-friendly network neighborhood browser, telnet, ftp, ping, ...), a good media-player (capable of playing at least wav, mp3, CD's, mpeg, avi, mov and preferably asf and wmf), minicom, a ppp-dialer, and probably quite a few other goodies I've forgotten to mention.

If we put all this into a linux-distribution, I doubt we would do much better than W2k. But to make things even worse, that wouldn't make much of a linux-system. Most linux-users wouldn't be too happy without emacs, gcc with friends, perl, python, tcl/tk, and most of the common command-line utilities (sed, awk, find, etc.), and probably also apache, MySQL or PostgreSQL, gimp, etc.

Line-count? Well, guess what... Linux has become bloatware... Even more than what's produced in Redmond!

The most important 'benchmark' by big.ears · 2001-05-09 09:26 · Score: 3

The most important benchmark they showed was their charts--ugly products of Microsoft Excel. Even though a lot has changed in those 4.5 years, it's still easier to make your charts in Windows.

We'll beat Microsoft yet! by Beowulfto · 2001-05-09 08:20 · Score: 3

Total lines of code have tripled, and are on an exponential growth curve.

So when will line count surpass Windows 2000?

--
There's no point in being grown up if you can't be childish sometimes. -- Dr. Who

Page fault latency: in all of 2.2, or fixed? by rknop · 2001-05-09 08:36 · Score: 3

One thing that I wonder about: that huge performance hit on the page fault latency shown in 2.2.6. Is it still there as of 2.2.19? Did the fix make its way back into the 2.2 series, or is it only fixed as of the later 2.3's and the 2.4 series? 2.2.6 is the only 2.2 in their study, so the study doesn't answer the question.

-Rob

An on-going study would be really useful. by Lethyos · 2001-05-09 09:38 · Score: 3

It would be nice to see updates to the data here as new versions of the kernel are released. For example, some users are not particularly concerned with newer versions of the kernel unless there are significant improvements. Consider this example: you're concerned mostly with performance aspects of the kernel. A new version is released that shows no improvement (or a decrease) in performance. No sense in upgrading immediately (of course, you may be one of those people who actually looks for and reports bugs) and you can wait until you see a downward trend in the graph before taking your time. There are other potential uses for "live" data such as this. I think it'd be nice if these guys would keep maintaining it. :)

--
Why bother.

Quite limited really by Professor+J+Frink · 2001-05-09 08:22 · Score: 4

Where are the results for IDE/SCSI transfer rates/latency?

Where are the results for networking?

I definitely noticed a jump in performance between 2.2.16 and 2.4.0 so they must be missing something here.

They note the large increase in hardware support, but don't seem to realise that this new support and improved support has given Linux much more performance than their benchmarks might show.

Maybe the improvements in X etc have helped but no real performance difference between 2.1.38 and 2.4.0? Put any such machines through real world work and you'll soon spot the difference...

--
"Don't get mad, get a monkey!"
