Slashdot Mirror


Linux Kernel Benchmarking: 2.4 vs. 2.6-test

frooyo pastes from KernelTrap: "Cliff White recently posted some re-AIM multiuser benchmark results comparing the stable 2.4.23-pre5 kernel against the 2.6.0-test5 and 2.6.0-test5-mm4 development kernels. In his conclusion he makes reference to earlier scheduler tests posted by Mark Wong, saying, 'Short summary: we mostly rock.'"

30 of 293 comments

  1. Comment removed by account_deleted · · Score: 5, Funny

    Comment removed based on user account deletion

  2. SMP by Doesn't_Comment_Code · · Score: 5, Interesting

    The SMP code (written by Linux developers by the way) is supposed to be kicked up a notch in the new kernel. That's what I've heard anyway. I'd love to see Linux being the best OS for multiple CPU scaling.

    That will help everyone from the server market, to me when I save up enough for a two processor motherboard.

    --

    Slashdot Syndrome: the sudden, extreme urge to correct someone in order to validate one's self.
    1. Re:SMP by Arker · · Score: 4, Insightful

      I've got nothing against Linux improving at SMP in essence, but there is something very bad going on here it seems to me. Notice that while the new kernel 'kicks ass' on SMP systems, on uniprocessor systems the 2.4 kernel is the one kicking ass. Anyone benchmarked 2.4 against some of the pre-SMP kernels on a uniprocessor machine?

      Face it, the vast majority of users are uniprocessor, and kernel performance is more of an issue on lower-end machines. Improving performance on big multiprocessor boxes is fine by itself, but not when it harms uniprocessor performance. I'm not a kernel hacker, but I've read many people claiming that this would not happen, that the SMP code would not hurt performance on a uniprocessor machine when the kernel is compiled without it, but that's obviously not turning out to be the case. Anecdotal evidence, at least, suggests that this performance degradation has actually been going on for quite some time, at least back to when SMP code first started being added.

      I'm not sure what all the factors here are, so naturally I'm not going to tell you the solution, but it certainly looks like a potential problem that should be discussed. Hopefully someone with more specifics than I have can chime in...

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    2. Re:SMP by GooberToo · · Score: 3, Interesting

      Yes, there are. The scheduler is fully HT aware. It seems that many of the SMP and numa optimizations also apply to HT'ing as well. As such, the developers have been working hard to support it.

      Worth noting, however: it's not uncommon, even for a system that fully supports HT, to see a noteworthy performance drop when HT is enabled. It seems many new systems come with HT disabled in the BIOS for this very reason. Granted, I'm not 100% sure that's not a Windows-specific issue rather than an across-the-board HT issue, but it's something to keep in mind nonetheless.

    3. Re:SMP by blakestah · · Score: 3, Informative

      Notice that while the new kernel 'kicks ass' on SMP systems, on uniprocessor systems the 2.4 kernel is the one kicking ass. Anyone benchmarked 2.4 against some of the pre-SMP kernels on a uniprocessor machine?

      Yeah, they missed an important test - latency for interactive processes. A lot of scheduler work went into improving this, and it makes a huge difference when you have large memory processes working hard.

      This aspect is improved across the board in 2.6, as well as the SMP issues. Sure, the uniprocessor machine may be a little slower, but response latencies in X are a lot better, and this makes more of a difference to users.

    4. Re:SMP by Vellmont · · Score: 3, Insightful

      Well, I don't think you can conclude that the SMP changes to the kernel are what's slowing down the 2.5 Uniprocessor performance vs 2.4 kernels. There are many other changes that took place (low latency and an improved scheduler come to mind) that aren't SMP related.

      Obviously the SMP performance has been improved, and there was a lot of potential for improvements looking at the 8x test. Another way to interpret the results would be to say that the other changes decreased performance across the board on SMP and Uniprocessor systems. The SMP improvements in SMP machines more than made up for this added cost and improved raw performance on SMP machines.

      Hopefully the performance loss on Uniprocessor machines can be decreased or eliminated. Even if it's not, I think you need to remember that raw performance isn't the be-all-end-all thing that's important. 7% is pretty small in the grand scheme of things where processing speed is doubling every 18 months. Responsiveness and better scheduling that doesn't starve processes is more important than a 7% performance decrease IMO, and you don't get that from faster processors.

      --
      AccountKiller
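
      A rough sketch of the arithmetic behind the "7% is pretty small" point above, assuming (as the comment does) that processing speed doubles every 18 months. The function name is made up for illustration; the only inputs are the 7% slowdown and the 18-month doubling period quoted in the comment.

```python
import math

def months_to_recover(slowdown, doubling_months=18.0):
    """Months of hardware speed-up needed to offset a fractional slowdown."""
    # Speed after t months is 2 ** (t / doubling_months); it has to reach
    # 1 / (1 - slowdown) for the slower kernel to match the old throughput.
    return doubling_months * math.log2(1.0 / (1.0 - slowdown))

months = months_to_recover(0.07)
print(f"{months:.1f} months of hardware progress offsets a 7% slowdown")
```

      Under that assumption the loss is made up in roughly two months of hardware improvement, which is the sense in which the comment calls it small.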
  3. GREAT by proj_2501 · · Score: 4, Funny

    now i need another CPU to increase performance!

  4. novel idea. by justin_w_hall · · Score: 4, Funny

    Go figure. An OS that gets faster with each version.

    --

    ---
    "how can the same street intersect with itself? i must be at the nexus of the universe!" - cosmo kramer
    1. Re:novel idea. by stratjakt · · Score: 3, Interesting

      It's only faster if you have 8 CPUs, your single proc desktop box will be slower.

      Which just reaffirms my belief that linux is becoming ever more firmly planted in the server world, and desktop linux is still just a hobby for the most part.

      --
      I don't need no instructions to know how to rock!!!!
    2. Re:novel idea. by GooberToo · · Score: 5, Insightful

      That statement simply is not true. Granted, you can always find some corner case where the workload is going to be slower between releases (2.x or 2.6); however, as a rule of thumb, 2.6 should still be a huge improvement for even uniprocessor users. Best yet, many, many parameters of the kernel and scheduler are tunable, so you can always adapt the kernel to work best for your specific workload needs.

      While it's true that they are working hard to significantly improve Linux for the server room, they have never lost sight of the uniprocessor user. Remember, there is nothing wrong with tuning the kernel for your uniprocessor needs and specific workloads. They just can't do that when they are benchmarking because it would skew the results, invalidating them. They are not only trying to measure how their improvements affect the overall system, but also what makes for sane initial defaults, which are reflective of a general-purpose and broad workload. If you understand what you are doing, there is no reason to believe that you can't greatly improve things for your specific uses and workloads. It's important to keep all of this in mind when talking about these benchmarks. Furthermore, you should fully expect your favorite distro to come with tuning presets which reflect a targeted workload (file/print server, workstation, database, web server, etc.).

      Keep in mind that the benchmark you looked at represents one category of many different types of workloads. So, for that specific workload, it may have been slower; however, that workload may not represent anything you do with your computer. Remember, other types of workloads are significantly faster. One last note: raw throughput is the classic trade-off for lower latency. You give up some throughput to gain responsiveness. If, on a uniprocessor workstation, you only see a 7% drop in performance and latency is greatly reduced, chances are, not only will you never notice the loss in performance, but you'll be praising it for how well it works with your mouse, monitor and keyboard (it feels better and makes you a happier user).

      Just some food for thought.

    3. Re:novel idea. by Karn · · Score: 3, Insightful

      Wrong.

      One benchmark used for Linux kernels is hammering a system while playing an mp3 to see if they can get it to skip. Low latency is mostly a desktop feature, and the 2.6 kernel is going to have much improved latency.

      Other portions of Linux have changed, and may not initially outperform 2.4, but if you think this kernel isn't going to be a dramatic improvement over 2.4 for desktop users and servers, and if you think the kernel developers aren't taking the desktop into consideration, you are mistaken.

      --


      Why do I keep typing pythong?
  5. woo by grub · · Score: 5, Funny


    If you thought SCO was mad over 2.4, just wait until they make up evidence for the 2.6 kernel!

    --
    Trolling is a art,
  6. Rock? by TheLink · · Score: 5, Insightful

    It's only significantly faster if you have 8 processors.

    Whereas it is 7% slower if you have one processor.

    I suppose they'll have a uniprocessor version which runs faster? Lots of people have uniprocessor PCs.

    Hyperthreading doesn't really count.

    1. Re:Rock? by arkanes · · Score: 3, Interesting

      It might very well be slower than 2.4, but "feel" faster. The low latency stuff can improve responsiveness at the expense of performance.

  7. User Experience by the_crowbar · · Score: 5, Informative

    I run 2.4.22 at work and 2.6.0-testX at home. The 2.6.0-test (vanilla) series feels much more responsive, especially in X. I have not done any real benchmarks of my systems, but after working with 2.4 all day, 2.6 seems to fly.

    Just my observation
    -the_crowbar

    --
    Have you read the Moderator Guidelines
  8. Thanks SCO. by EDA+Wizard · · Score: 5, Funny

    Looks like that 1970's UNIX code really increases performance for SMP P-III's.

    Now we can appreciate the foresight that our Unix fathers had when developing Xeon SMP code in the late 1970's.

  9. I'm a bit leery. by devphaeton · · Score: 5, Interesting

    "the general trend in the metric indicates everything has been improving, so I think we rock."

    For some reason, the scheduling seems to get more and more choppy (from what i've noticed) with every iteration of the 2.4.x kernel. Currently i'm on 2.4.22, and while i don't have any specific tests, numbers or statistics, i'm noticing some issues.

    Easiest way to reproduce it is to have the machine do something cpu intensive, such as mkisofs, cdrecord, bzip2 some huge file, cp anything large, installing (via aptitude) or even the "Reading Package Lists...." stage of apt-get update.

    Oftentimes, the machine will become unresponsive for about 3 seconds at a time, then jolt back up to speed, then pause for 3, on and on. Even after the command line returns the prompt, or gkrellm's cpu and proc krells show that everything is all done, i will still see lag in responses from the kb, mouse, or whatnot off and on for about 10-15 seconds.

    I've gone over my kernel config and tweaked a few things here and there but with no change. I can back down to a 2.4.18 kernel and it's not as bad. Going down to a 2.2.x kernel completely solves the problem, but of course will bring its own issues with some of my newer packages (such as gcc) and a few pieces of newer hardware.

    A friend of mine and I have gone over this (on my machine and his) and he experiences a lot of the same issues i do.

    Mind you, i'm not complaining. I'm very grateful to all the developers of the world that i even *have* a linux system to run. But this is something that makes me more excited about the kernel 2.6.x series. I haven't tried one out yet, but from what i've heard and read, it should be awesome. :o)

    --


    do() || do_not(); // try();
    1. Re:I'm a bit leery. by blonde+rser · · Score: 4, Informative

      Since it seems you're running Debian, and all those CPU-intensive operations are also HD-intensive operations, have you checked hdparm -d /dev/hda? I know it is simple, but it is so simple that I forgot to check for about a month. Debian appears to have DMA off by default.

  10. Re:who cares? by Curien · · Score: 3, Insightful

    I agree that those things are issues, but they have nothing to do with Linux (the kernel). This is a new release of /the kernel/. You should only get excited about it if you care what kernel you're running. Most people don't, and they shouldn't (as long as the kernel supports all their hardware).

    --
    It's always a long day... 86400 doesn't fit into a short.
  11. Re:Not to be a n00b... by NtroP · · Score: 5, Funny

    Not to be a n00b, but I can't make too much sense of the benchmark the story linked to

    You actually READ the article?!? Man! You ARE a N00b!
    --
    "terrorism" and "pedophilia" are the root passwords to the Constitution
  12. Linux sorta Scales, but the hardware doesn't... by caveat · · Score: 5, Insightful

    I'd love to see Linux being the best OS for multiple CPU scaling.

    You do need a scalable OS to support lots of processors, of course, but you also need hardware that scales too (clustering doesn't count). Example - SGI is using Linux with NUMAflex on the Altixes to cluster 64-processor system images, but that kind of hardware isn't commodity in any way, and isn't going to be anytime soon.
    Anyway, Linux doesn't scale THAT well...as of 9/2000, SGI was using IRIX for a 1024-processor single-system-image supercomputer; I've heard they can go to 2048 now, but I don't have anything to back that up. Dunno about Solaris, but I imagine it's pretty scalable as well.

    --

    Facts do not cease to exist because they are ignored. - Aldous Huxley
  13. Re:Real world please. by tmasssey · · Score: 3, Insightful
    It's called Lotus Instant Messaging (née Sametime). And companies are using it in the real world.

    Just because you can't see its use outside of a toy doesn't mean nobody else can.

  14. Smoother scheduling in 2.6.0 by Dr.+Zowie · · Score: 4, Informative

    Devphaeton, you hit the nail on the head about 2.6.0. Its main advantage over 2.4.x (for this luser anyway) is the smoother multitasking even on a uniprocessor system. I'm running a tweaked 2.6.0-test5 on my laptop, and jobs that would make 2.4.x unusable are barely detectable (from the standpoint of moving the mouse around, typing up slashdot articles, and the like).

    Of course, the ACPI support and swsusp don't hurt either :-)

  15. SCO Kernels by Schwartzboy · · Score: 5, Funny
    No, no, no! They don't have to "make up" a shred of evidence, you insensitive clod! Bear with me as I walk you through the intensive fact-finding process that will prove beyond a shadow of a doubt that 2.6 does, in fact, have more proprietary SCO stuff in it than any *nix ever has before! Watch as the scene unfolds...

    DARL: So, um, hey. It looks like there's this new "too-pointe-six colonel" out on the market from those Lenn-ucks people. We own all that too, right?

    SUIT: Well, sir, it's like this. Do you remember how the 2.4 kernel had all of those lines of code in it that are ours, even though they showed up in textbooks before most of our stuff existed?

    DARL: Sure, but how does that help us with this new thing?

    SUIT: Think about it. Most operating systems, according to my extensive research during years of never having looked at a computer before, contain the same code that they always did, plus a couple of lines of new comments and an extra variable or two that shows how much you're able to charge users for the new features. Just think about the Windows 95 and 98 thing. Perfect example there.

    DARL: But...my mansion only has 93 windows. Where is this heading?

    SUIT: *blinks* Errr...yeah. Well, it's all the same code, and even if those sneaky Linux commies try to pull a fast one on us and put one of those different codes in there, we can always assert our ownership of these "opened sources" files that I just printed out. I asked this guy, you know, and he said that all of these sources are what's in Linux, and since I printed it on paper and stuff, I figure it must be a textbook. Since we own all the words that show up in textbooks, and this has a lot of words, I think we've found ourselves a new angle here.

    DARL: Smithers, cry havoc and let slip the Lenn-ucks colonel lawsuit monkeys once more!


    I do so hate having to correct you people. *sigh*
    --
    "Linux doesn't exist. Everyone knows Linux is an unlicensed version of Unix"- Kieren O'Shaughnessy
  16. Am I missing something here? by AntiGenX · · Score: 5, Insightful
    If you look at the difference between the outcomes for uniprocessor vs dual, there doesn't seem to be very good scaling.

    linux-2.6.0-test5 - 992.06 - Uni
    linux-2.6.0-test5 - 1017.43 - Dual
    linux-2.6.0-test5 - 5406.68 - Quad

    Does this mean that you only gain 3.49% when adding a 2nd processor? Obviously I don't expect things to scale linearly, but 3%!? Am I missing something here? And then 81.65% for quad? I'm not trolling, I'm looking for someone to explain what I'm missing.

    1. Re:Am I missing something here? by rakarnik · · Score: 3, Informative

      Yes, the number for dual is not 1017, but more like 1545.

      Here are the actual numbers for 2.6.0-test5 and the compute workload:
      1 - 992.06 - 100%
      2 - 1545.03 - 155%
      4 - 5175.28 - 521%

      Now for why the 4 processor case is actually 5 times better than the single CPU case, I do not know enough about the benchmarks to comment.
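
      A minimal sketch of how the percentages above can be reproduced, and of what they imply per CPU. The three scores are the ones quoted in this comment; the function name and the "efficiency" column (speedup divided by CPU count) are additions for illustration.

```python
# Compute-workload scores for 2.6.0-test5, as quoted in the comment above.
scores = {1: 992.06, 2: 1545.03, 4: 5175.28}

def scaling(scores):
    """Return {cpus: (speedup vs. 1 CPU, speedup per CPU)}."""
    base = scores[1]
    rows = {}
    for cpus, score in sorted(scores.items()):
        speedup = score / base
        rows[cpus] = (speedup, speedup / cpus)
    return rows

table = scaling(scores)
for cpus, (speedup, efficiency) in table.items():
    print(f"{cpus} CPU(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

      An efficiency above 100% for the 4-CPU row is what "superlinear" scaling means: each CPU in the quad box appears to do more work than the lone CPU in the uniprocessor box.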

  17. Re:2.4 vs 2.6 by tmasssey · · Score: 3, Informative
    By definition, with the speed of context switches and other overhead the same, a system with "low-latency" switching (switching faster between interactive jobs) will be slower. It switches more often, therefore wasting more cycles with switching overhead.

    Of course, there is the possibility of trimming cycles from the process of switching contexts. Linux, though, already had that pretty low. That's why Linus is so resistant to shared-memory, shared-context threads: the cost of processes is so low that the benefits are small. However, some speed was gained in context switches.

    Overall, though, more switching means slower performance, even though the user feels like the system is faster. It's not faster. It's actually slower. It's just more responsive.

    Confused yet? :)
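
    A back-of-the-envelope sketch of the trade-off described above, assuming a fixed cost per context switch and one switch per timeslice. The function name and the numbers are hypothetical, chosen only to show the direction of the effect; they are not measurements of any kernel.

```python
def switching_overhead(timeslice_us, switch_cost_us):
    """Fraction of CPU time spent switching rather than running jobs."""
    return switch_cost_us / (timeslice_us + switch_cost_us)

# Hypothetical numbers, for illustration only: a 5 microsecond switch cost
# with a long (100 ms) versus a short (1 ms) timeslice.
for timeslice_us in (100_000, 1_000):
    lost = switching_overhead(timeslice_us, 5)
    print(f"timeslice {timeslice_us / 1000:.0f} ms: {lost:.3%} of CPU lost to switching")
```

    With the switch cost held constant, shorter timeslices always lose a larger share of the CPU, even though each interactive job gets to run sooner.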

  18. The kernel isn't everything by skamp · · Score: 3, Informative

    I've tried the new kernel, and I got more responsiveness issues than improvements. But besides that (I might very well have misconfigured something), I'd like to point out that the kernel itself isn't all that matters: the new drivers that accompany it are just as important. I noticed a significant improvement in X's launch time as well as a whopping 250 FPS with glxgears, compared to the 150 FPS I got with my 2.4.22 setup. This is probably due to major improvements that were brought to the drivers for my i830M chipset.

  19. Re:Half rock by GooberToo · · Score: 5, Insightful

    You are correct! The scheduler reacts differently to different workloads. This is why the kernel developers try hard to test their changes under a number of different workloads. To top it off, they attempt to target the benchmarks which behave like real-world workloads rather than contrived and unrealistic workloads. That's not to say that they don't test those too; however, they clearly direct more attention at real-world workloads and the corresponding result sets.

    The 2.6.x series kernels will be a big step up for just about everyone who seriously uses their computer. Significant reliability improvements as well as faster throughput on disks, much, much higher scalability for SMP systems (including hyperthreading, NUMA and even highly loaded uni-systems), and much lower latencies, all at the same time. Granted, there are still some tests which may not be a win-win all the way around; however, almost everything in general is an improvement with hardly any detractors.

    So, saying, "we mostly rock", really is a true statement!

  20. Re:Good Question, Bad Arithmetic by stef49 · · Score: 4, Informative

    A quad-CPU system more than 4 times as fast as a single-CPU one?
    Odd but not impossible.

    For example, if in the single-CPU config the processes are taking a lot of cache misses, then having 4 CPUs (with 4 times the amount of cache) could reduce the number of cache misses and so could make the quad configuration more than 4 times faster.

    The same reasoning could explain why 2 CPUs are not much faster than one: if 2 caches are not large enough and if the processes have very bad locality, then you may get as many cache misses with the dual-CPU system as with the single-CPU system.
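
    A toy model of the first point above, under strong assumptions that are not from the benchmark: the work splits evenly across CPUs, each CPU touches its share of the data uniformly at random, a cache hit costs 1 time unit, and a miss costs an extra `miss_penalty` units. All names and numbers are made up for illustration, and the model does not try to capture the bad-locality case from the second point.

```python
def speedup(cpus, working_set, cache_per_cpu, miss_penalty):
    """Toy speedup when the job is split evenly across `cpus` CPUs."""
    def time_per_access(n):
        share = working_set / n  # data each CPU has to touch
        # Fraction of accesses that fall outside the CPU's own cache.
        miss_rate = max(0.0, 1.0 - cache_per_cpu / share)
        return 1.0 + miss_rate * miss_penalty  # a hit costs 1 unit
    # n CPUs work in parallel, each at its own per-access cost.
    return cpus * time_per_access(1) / time_per_access(cpus)

# Hypothetical sizes: the working set is 4x one CPU's cache.
for n in (1, 2, 4):
    print(f"{n} CPU(s): {speedup(n, 4.0, 1.0, 0.4):.2f}x")
```

    Once each CPU's share fits in its own cache, the misses disappear and the speedup exceeds the CPU count, which is the superlinear effect described above.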