Slashdot Mirror


Optimizations for Source-Based Distributions?

Kenny Mann asks: "I currently run a Linux distribution called Lunar Linux and it is a source based distribution branched from the original Sorcerer GNU Linux. I've done a bit of research on compiler optimizations and such and was wondering what kind of performance is there really to be had for setting these options? I know that the more options the greater chance of unexpected failures, so my next question is what about optimizing your kernel?" Optimization is tricky, and I think the answer to this question is more complex than "yes, optimize" or "no, don't optimize". Rather there might be classes of applications that are safe to optimize and classes of applications that are not. How do those performance hounds out there feel about optimizing the kernel, however?

41 comments

  1. And so I'm not a complete jerk.... by Anonymous Coward · · Score: 3, Informative

    This thread on forums.gentoo.org discusses to death what compilation flags are good to use (mostly gcc 3.x+). Even though you're using a different source based distro, the concepts are all the same.

  2. Should be a database with this info. by stienman · · Score: 4, Interesting

    When a project is given to the open source community, the optimizations that work should be submitted as well. Possible instability due to such optimizations should be noted, and where expedient, should be fixed.

    This is usually taken care of in the form of a makefile. If the author didn't intend any optimizations to be run against the code, they didn't put them in the makefile.

    It would be great, however, if there were a project to find out what instabilities happen due to specific optimizations, and either fix GCC so it can more intelligently tell when to optimize and when not to, or get code developers to adopt safe coding practices which will allow the optimizations to be made without problem.

    -Adam

    1. Re: Should be a database with this info. by OldMiner · · Score: 5, Informative
      to find out what instabilities happen due to specific optimizations, and either fix GCC so it can more intelligently tell when to optimize and when not to

      Having only read the man page for gcc a couple of times, and not even that of gcc 3.0, I can say I'm woefully underqualified to comment on the subject, but I will anyhow. From what I've observed, there is no legal C++ code that doesn't self-modify which gcc can't intelligently optimize without problems. The only time that I have ever seen optimization issues was when inline assembler was involved, which I would hope was already optimized, considering the nature of the beast. Further, many of the optimizations that gcc performs are rather simple things such as loop unrolling, function inlining, delaying popping of the stack until after several function calls, etc.
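
      As a rough illustration of the loop unrolling mentioned above (a minimal sketch, not code from this thread; both function names are made up), the two functions below compute the same sum. The second is unrolled by hand four ways, which is approximately the kind of transformation an option such as -funroll-loops may apply automatically.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one add and one loop branch per element. */
long sum_simple(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by four: one loop branch per four elements.
 * The trailing loop handles the n % 4 leftover elements. */
long sum_unrolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)
        total += a[i];
    return total;
}
```

      Writing the simple form and letting the compiler decide whether to unroll keeps the source readable; the unrolled form is shown only to make the transformation visible.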

      Perhaps one of the most notable optimizations, for the beginner at least, is that one needs to provide -O or gcc will not allocate any variables in a register. It'll be memory, register, operation, and back to memory over and over again. (Or perhaps just a direct memory operation if you're on x86.) Despite my early teachers' insistence that compilers were simply too smart and didn't need such hints, I tested and found that trivial, heavily looped programs often ran 3 times faster when I declared the loop counters as 'register'. The problem was that we were simply using "gcc source.c" to compile our programs. gcc produces very poor code unless at least -O is used.
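
      A minimal sketch of the kind of heavily looped code described here, assuming a made-up function count_multiples (it is not from the original post). The 'register' keyword is only a hint to the compiler, so how much it matters depends on the compiler and on whether -O is given.

```c
#include <assert.h>

/* Counts how many integers in [1, limit] are divisible by k.
 * 'register' asks the compiler to keep i and hits in registers;
 * it is a hint only, and the compiler is free to ignore it. */
unsigned long count_multiples(unsigned long limit, unsigned long k)
{
    register unsigned long i;
    register unsigned long hits = 0;

    assert(k != 0);
    for (i = 1; i <= limit; i++) {
        if (i % k == 0)
            hits++;
    }
    return hits;
}
```

      Building a caller once with plain "gcc source.c" and once with "gcc -O source.c" and timing both is one way to check the claim on a particular compiler; with -O the keyword typically makes little or no difference.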

      But, anyhow, I think the largest issues would be concerned with -m and -f flags which may change default or even standard behavior. For instance, -felide-constructors breaks ANSI C++ compliance, but isn't a bad idea if you create and destroy a large number of expensive objects. (Then again, you shouldn't do that.)

      --
      You like splinters in your crotch? -Jon Caldara
    2. Re: Should be a database with this info. by David+Greene · · Score: 1
      From what I've observed, there is no legal C++ code that doesn't self-modify which gcc can't intelligently optimize without problems.

      Legal C++...well, that's the trick, isn't it? I've seen at least one major failure in a complex project I work on that was exposed by -O2 optimization level using g++-3.2. Whether it's a bug in the code or a bug in the compiler I do not know. Unfortunately, g++ doesn't seem to have the command-line interface to systematically test things.

      Further, many of the optimizations that gcc performs are rather simple things

      :) Nothing is simple when it comes to optimization. Bugs creep in all the time.

    3. Re: Should be a database with this info. by OldMiner · · Score: 2
      From what I've observed, there is no legal C++ code that doesn't self-modify which gcc can't intelligently optimize without problems.
      Legal C++...well, that's the trick, isn't it? I've seen at least one major failure in a complex project I work on that was exposed by -O2 optimization level using g++-3.2.

      Could you clarify on this matter please? I have seen some things break between versions of g++ due to changes in how the STL was implemented. In one case, for instance, I was transparently using pointers as if they were iterators. This caused problems that required a rather large rewrite. But that was entirely my fault. Anyhow, my question is really what specifically caused your problem?

      Whether it's a bug in the code or a bug in the compiler I do not know. Unfortunately, g++ doesn't seem to have the command-line interface to systematically test things.

      Actually, I think -O, -O2, -O3, and -Os are mostly abbreviations for long lists of various -f commands. See the manual for a complete reference. It goes on for pages, however, and isn't exactly clear on which -f's the specific optimization levels enable, but it sheds some light. There is the occasional mention of "this is automatically enabled after -O2". And there's a sentence that reads: "On most machines, the -O option turns on the -fthread-jumps and -fdelayed-branch options, but specific machines may handle it differently."

      --
      You like splinters in your crotch? -Jon Caldara
    4. Re: Should be a database with this info. by OldMiner · · Score: 2
      Actually, I think -O, -O2, -O3, and -Os are mostly abbreviations for long lists of various -f commands.

      Gah, ignorance redux. The same page clearly reads: "Not all of the optimizations performed by GCC have -f options to control them." Too much skimming, not enough scanning.

      --
      You like splinters in your crotch? -Jon Caldara
    5. Re: Should be a database with this info. by David+Greene · · Score: 1

      Yep, that's exactly the phrase I was going to point out to you. :)

    6. Re: Should be a database with this info. by David+Greene · · Score: 1
      Could you clarify on this matter please? ... Anyhow, my question is really what specifically caused your problem?

      Unfortunately, I do not know. Because this is a large project there are several developers working on it and I don't "know" all of the code. I'm fairly confident the code I've written is standard-compliant but I have no idea about the other bits.

      The trouble is there's no easy way to find out who is at fault. I can run through and try every -f option individually but often even that is not enough because optimizations interact with each other and it's usually the case that bugs are exposed only by sequences of optimizations.

  3. IMO by Anonymous Coward · · Score: 4, Insightful

    Latest versions of GLIBC and Kernel and XFree 4.3 are more important than optimisations.

    1. Re:IMO by Anonymous Coward · · Score: 0

      How about latest versions of those WITH optimizations then? would that not be even better?

  4. Optimizations have a huge impact on performance by eggstasy · · Score: 4, Interesting

    I'm not an expert on anything, but a friend of mine teaches at a local university, and when we were at my birthday party he mentioned having compiled an old encryption proggy he made (codigo pro) both under the ancient Turbo C and the most recent version of GCC. I don't know what settings he used, but he said it was like 3 times faster for the same code running on the same platform.
    Of course he made the software ages ago when he knew very little about programming compared to now, and the source was probably lacking every possible optimization that an expert coder would introduce, so YMMV.
    I never mess with settings when compiling programs under Linux, since I'm pretty much a newbie and often have trouble installing things from source.
    However, an expert friend of mine who has a very old PC always goes for an LFSish setup where he compiles everything and tweaks all the settings by hand. He claims it works miracles, and I believe him. I know from my old MS-DOS graphics programming experience that small source and compiler tweaks could be the difference between a professional-looking program and a crappy amateurish app full of flicker.
    I could ramble endlessly about all the optimization success stories from my youth, starting with the classic "DEFINT A-Z" QBasic trick, then progressing through Turbo Pascal compiler tweaks and finally achieving C + ASM goodness, but I shall not bore you any further :)

    1. Re:Optimizations have a huge impact on performance by cheezfreek · · Score: 1
      Of course he made the software ages ago when he knew very little about programming compared to now, and the source was probably lacking every possible optimization that an expert coder would introduce, so YMMV.

      As a compiler developer, I have to say that programmer-driven "optimizations" are (the vast majority of the time) very shortsighted and very local. Sure, you can hand-optimize a function to perform much faster with no (or low) optimization, but most of these optimizations inhibit aggressive interprocedural optimizations that the compiler could do (at high optimization levels) that would increase performance for that function and the areas that call it.

      For a simple example, just think about function pointers in C. Sure, you might do better at low optimization levels if you assign to a function pointer and call through that pointer many times. But at high optimization levels, if you had used a switch or if/else type of structure to determine each call, the compiler could inline the most likely call (yes, reasonable heuristics do exist for most of these circumstances). If this call was to be made 90% of the time, you would get an incredible increase in speed by not going through the function pointer, as the inlined code could be aggressively optimized. Sure, the least likely cases would probably get plenty slower, but they would be incredibly unlikely.
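
      The contrast can be sketched as follows. This is an illustrative example only, with made-up names; whether a given compiler actually inlines the direct calls, or manages to see through the pointer table anyway, depends on the compiler and the optimization level.

```c
#include <assert.h>

enum op { OP_ADD, OP_SUB, OP_MUL };

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }
static int mul(int a, int b) { return a * b; }

/* Style 1: indirect call through a table of function pointers.
 * The callee is selected at run time, which makes inlining harder. */
typedef int (*binop_fn)(int, int);
static const binop_fn op_table[] = { add, sub, mul };

int apply_indirect(enum op which, int a, int b)
{
    assert((unsigned)which <= (unsigned)OP_MUL);
    return op_table[which](a, b);
}

/* Style 2: explicit switch with direct calls. Each callee is named
 * at its call site, so the compiler is free to inline it. */
int apply_switch(enum op which, int a, int b)
{
    switch (which) {
    case OP_ADD: return add(a, b);
    case OP_SUB: return sub(a, b);
    case OP_MUL: return mul(a, b);
    }
    return 0; /* not reached for valid enum values */
}
```

      Both functions return the same results; the difference is only in how much the optimizer can see at each call site.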

      To make a long story short, hand-optimization isn't all it's cracked up to be. Keep the code free of "tricks" and you'll give the compiler's optimizer more opportunities to shine.

    2. Re:Optimizations have a huge impact on performance by chthon · · Score: 1

      Or, as D. Knuth has put it:

      Premature optimisation is the root of all evil.

      When I maintained COBOL code written by other people, I was sometimes appalled at what things some programmers introduce to try to make things run faster.

      My experience in this has mostly been: try to understand the problem at hand first, and then implement it with clearly defined loops and subroutines. This helps maintenance, and the compiler has a much better time figuring out its optimisations.

  5. Check the other guy's boards by Mr.Ned · · Score: 4, Interesting

    Check out other distros' forums and mailing lists. Sorcerer, Gentoo, and many others of these squeeze-everything-out-of-the-compile distros have great resources for this type of question. The Gentoo forums (https://forums.gentoo.org/) have quite a bit of talk about this sort of thing, but are more geared toward the portage system. However, threads about -fomit-frame-pointer and -funroll-loops can be enlightening - do they really give a speed boost, or do they just take up more disk space? Search away :)

    I have seen a thread about optimizing the kernel on the Gentoo forums (guess which distro I spend time with?), but these seem to be much more hassle than they're worth in the long run in terms of segfaults and crashes and the like.

  6. Why the tweak? by OldMiner · · Score: 4, Insightful

    Someone already commented on this somewhere, it could have been FidoNET or Slashdot, so I'll paraphrase. Anyhow, the upshot is that there are two things about Linux.

    1. It's an OS that can be used to run programs which occupy your full time.
    2. It's an OS that can be tweaked endlessly to occupy one's full time.

    The lady or gentleman who finds it more entertaining to tweak the kernel than play Quake is that much geekier and worthy of respect in my book. It's just important that such a person realizes, aside from gaining some small amount of technical knowledge and problem-solving skills, such an act is also little more productive than a fragfest.

    --
    You like splinters in your crotch? -Jon Caldara
    1. Re:Why the tweak? by espresso_now · · Score: 1
      It's just important that such a person realizes, aside from gaining some small amount of technical knowledge and problem-solving skills, such an act is also little more productive than a fragfest.

      That's why I play Quake III while everything is compiling in the background!
      --
      Of course, and I highly suspect it, I may be talking out of my ass. -oqti
    2. Re:Why the tweak? by irix · · Score: 2

      The lady or gentleman who finds it more entertaining to tweak the kernel than play Quake is that much geekier and worthy of respect in my book.

      And when you make it out of University you'll realize that you have to get some work done too. I use Linux on my desktop at work - do you think I have the time to be compiling software from scratch every time I want to install or upgrade something? I use RedHat, because I know everything is tested and will work, and I can get it installed/upgraded quickly, and get on with work that needs to get done.

      I realize that people like source-based distros, and I understand the appeal. However, I have to imagine that a lot of the people who run these distros are university students on the other end of a fat network pipe with a lot of spare time.

      --

      Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
    3. Re:Why the tweak? by ubikkibu · · Score: 1

      > [Kernel tweaking] is also little more productive than a fragfest.

      Ha! Some folks--you, I'd wager--play around with Linux for fun. Some folks use it to get work done. And some folks base their businesses on it. The latter two camps benefit quite obviously from speed optimizations.

      One small example: I run a streaming audio (icecast) server for a public radio station. It easily saturates 2 T1 lines, and everything runs on a PII-166 with 128MB RAM. I couldn't even get the encoder (lame) to run reliably until I began the tedious process of benchmarking and recompiling each of the major components--it didn't have enough CPU time. After a few days of "tweaking," I now have a cheap box that handles 40 mp3 clients around the clock, with many months of uptime.

      On the contrary, tweaking is highly productive.

    4. Re:Why the tweak? by OldMiner · · Score: 1

      >> [Kernel tweaking] is also little more productive than a fragfest.

      >> I couldn't even get the encoder (lame) to run reliably until I began
      >> the tedious process of benchmarking and recompiling each of the major
      >> components--

      And, remind me, what does tweaking lame for a specific purpose have to do with tweaking the kernel?

      Nothing against tweaking a specific application for a specific purpose. Even tweaking the kernel with purpose is a very understandable activity. But, in general, little time is spent in kernel functions. I thought that was, in fact, part of the point of a microkernel.

      --
      You like splinters in your crotch? -Jon Caldara
  7. This is slightly OT, but... by benjamindees · · Score: 2, Interesting

    There should also be some sort of massive Linux bugzilla, with maybe an automated reporting agent like Windows and Netscape have. Applications can be written to call that program when an error occurs, or whenever the user wants to file a report. The data wouldn't even have to be looked at by humans, just be collected to tell what the most common problems are or maybe for a developer to search through later. I suppose the massive bandwidth costs would prohibit a project of this sort, but I still think it would be useful.

    --
    "I assumed blithely that there were no elves out there in the darkness"
    1. Re:This is slightly OT, but... by SpaceLifeForm · · Score: 2

      Sort of like mentioned here?
      Or found here?

      --
      You are being MICROattacked, from various angles, in a SOFT manner.
    2. Re:This is slightly OT, but... by damiam · · Score: 1

      I believe he meant for all of Linux, not just the kernel. Note that many distros (Redhat and Debian at least, don't know about others) have their own bug databases.

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
    3. Re:This is slightly OT, but... by Anonymous Coward · · Score: 0

      Linux IS the kernel. Anything not in the kernel comes from something else.

    4. Re:This is slightly OT, but... by benjamindees · · Score: 2

      Sorry, I wasn't clear. I meant "Linux" in the general, "anything that runs on Linux" sense that most end-users seem to view Linux as.

      To them, even if a bug in KDE causes everything but the command-line to come crashing down, Linux is to blame. They don't have the knowledge (or time) to pinpoint exactly what program or component was to blame, they just know something failed.

      If we could provide an automatic utility with a simple GUI that would ask a couple of questions, then send all sorts of debugging info to a massive database, I think it would be useful.

      I don't know of any distributions that include such a utility, but I believe they would if a general one were available.

      --
      "I assumed blithely that there were no elves out there in the darkness"
  8. Re:Source distros by Otter · · Score: 3, Insightful
    It's like the (apocryphal, I assume) story about Einstein explaining relativity: When a man spends a minute sitting on a hot stove, it seems like an hour. When he spends an hour sitting with a pretty girl, it seems like a minute.

    Staying up all night to recompile KDE seems like a minute. Every millisecond spent waiting for a response after clicking a button seems like an hour.

    See, it's scientific! As a physics genius, surely you should be familiar with the basic implications of general relativity.

  9. General approach by SpaceLifeForm · · Score: 4, Informative
    1. Don't worry about the kernel, it's pretty tight already. Definitely compile it for your proper platform, but I would not waste time on additional tweaks.

    2. Do re-compile your C library however. Most of your applications spend a lot of their time executing code from the C library.

    --
    You are being MICROattacked, from various angles, in a SOFT manner.
  10. Optimization hints. NUT ALERT: be warned and read by jsse · · Score: 5, Interesting

    First, to answer your question about the kernel: the default kernel optimization is good enough, e.g. it uses -Os instead of -O3 because some programs, like the kernel, usually run faster with a smaller memory footprint. You might want to optimize individual modules, though.

    For the rest of the packages (I know you didn't ask, but that doesn't stop me. :), you could try some crazy optimization. The hardest thing to decide is which gcc optimization flags work best for your system. Should you use all optimization flags? Will these flags break your system?

    Inspired by rocklinux, I've tried benchmarking individual optimization flags, i.e. testing each flag and discarding those that don't give your system a performance gain. Of course, the script used in the link above is pretty old and you must modify it for gcc 3.2+. Thanks to the lameass filter I won't post my script here.

    That sounds like a waste of time, but the result is satisfying. The maximum gain I got was as much as 19% compared to plain -O3 optimization. Here are the results:

    vendor_id : GenuineIntel
    model name : Mobile Pentium MMX
    flags : fpu vme de pse tsc msr mce cx8 mmx
    gcc version 3.2 (i586-pc-linux-gnu)
    Result: '-O3 -march=pentium-mmx -fomit-frame-pointer -finline-functions -fcse-follow-jumps -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -fno-cprop-registers -funroll-all-loops -maccumulate-outgoing-args -fschedule-insns'
    Performance gain (compared to -O3 only) ~ 9.9%

    vendor_id : GenuineIntel
    model name : Pentium III (Coppermine)
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
    gcc version 3.2 (i686-pc-linux-gnu)
    Result: '-O3 -march=pentium3 -fomit-frame-pointer -finline-functions -funroll-loops'
    Performance gain (compared to -O3 only) ~ 13.7%

    vendor_id : AuthenticAMD
    model name : AMD Athlon(TM) MP 2000+ (a dual CPU system)
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
    vendor_id : AuthenticAMD
    model name : AMD Athlon(TM) MP 2000+
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
    gcc version 3.2 (i686-pc-linux-gnu)
    Result: '-O3 -march=athlon-mp -fomit-frame-pointer -finline-functions -fforce-mem -s -funroll-loops -frerun-loop-opt -fdelete-null-pointer-checks -fprefetch-loop-arrays -ffast-math -maccumulate-outgoing-args -fschedule-insns'
    Performance gain (compared to -O3 only) ~ 19.6%

    19.6%!! If you ask me, it's worth it to optimize your desktop; but for a server, you'd rather have it running stably than running 19% faster, you can trust me on that. :)

    PS. In the process of testing, I found that some flags are dangerous and should be used with care: -fmove-all-movables, -frename-registers and -malign-double. I suspect that they broke my file-util, which corrupted my entire fs. Just be careful.
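
    The per-flag benchmarking described in this comment needs some fixed workload to time. The script itself was not posted, so the following is only a minimal sketch of what such a workload could look like; the names workload and time_workload and the file name bench.c are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: a multiply-add loop whose result is
 * returned so the optimizer cannot discard the loop as dead code. */
unsigned long workload(unsigned long iterations)
{
    unsigned long acc = 1;
    for (unsigned long i = 0; i < iterations; i++)
        acc = acc * 31 + i;
    return acc;
}

/* Runs the workload once and returns the CPU time used, in seconds.
 * The workload's result is stored through 'out' to keep it live. */
double time_workload(unsigned long iterations, unsigned long *out)
{
    assert(out != NULL);
    clock_t start = clock();
    *out = workload(iterations);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

    One would add a main() that calls time_workload with a large iteration count and prints the result, build the same bench.c once per candidate flag set (for example "gcc -O3 bench.c" against "gcc -O3 -funroll-loops bench.c"), and keep a flag only if the measured time improves over several runs. A synthetic loop like this says little about real applications, so the numbers are a guide at best.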

  11. Re:Source distros by Mr.Ned · · Score: 4, Interesting

    See Kyle Sallee's comparison of source and binary distros at http://sorcerer.wox.org/docs/distro/distro2.html - source distros aren't just about those extra seconds.

    They're about true dependency resolution - if you can configure the make file, chances are you won't hit any snags when running the program, unlike certain other binary package formats I know.

    They're about being up-to-date - what good is that security patch that came out 20 minutes after the bugtraq article if it takes your distro a week to release a binary?

    They're about program selection - someone has to compile it, and like before, if no one has compiled it for your arch and your distro to your liking, you've got to do it or sit there on your heels and wait. Grab the latest version, test it out, submit a bug report, and participate in open source!

    They're about learning - if you've never installed a distro from source, it's an enlightening process that is instructive. Forget Mandrake's Control Center that will configure your X server; forget about Debian's ncurses menu that lets you select modules to add to a precompiled kernel; forget SuSE's YaST that auto-generates your /etc/fstab. Do it yourself! It's the natural thing for a geek to do - take it apart and put it back together.

    They're about knowing what your system is running - unnecessary open ports are security holes waiting to happen. Do you really need telnetd running? Fingerd? Apache? Webmin? No? Don't compile them! Saves you time, space, and makes you more secure. The less extra stuff on your computer, the better. The latest versions of Mandrake and SuSE come on DVDs!!!! I probably have about 800MB of downloaded source between a server and a desktop; nothing compared to a full 9GB DVD or 7-CD set. I don't download what I don't use.

    Source distros aren't for every purpose. No corporate desktop needs Slackware. But for the geeks among us, they are a dream come true.

  12. Does IA-32 support by anthony_dipierro · · Score: 1

    delayed branching? Now that's something that can save a whole lot of time in some kernel code. It can also confuse the hell out of someone looking at assembly code!

    1. Re:Does IA-32 support by jyasskin · · Score: 1

      AFAIK, No. And it probably wouldn't save much time in kernel code. Modern processors juggle instructions around so much anyway, that the overhead of keeping track of the delay slots would cost more than any benefit of having them.
      Also, how many slots would you need? The original 8086 probably only needed 1 (or none, which would explain not having any). The P4 could use like 15, which is impossible for other reasons.

    2. Re:Does IA-32 support by anthony_dipierro · · Score: 1

      Hmm... Yeah, I'm probably thinking from an old-school RISC point of view rather than a modern CISC architecture. Shows you how I know nothing about IA-32. Thanks.

  13. He's a delightfull trol by arcadum · · Score: 1

    I agree with you: everything's relative.

  14. Recent discussion on lkml by heatmzr · · Score: 3, Informative

    gcc optimizations for the kernel were discussed recently on this thread

  15. Re:Source distros by phantomlord · · Score: 2
    I don't use a source distribution, but I do build all my systems from source myself via a LFS type approach. I've easily spent hundreds of hours compiling, watching for new releases, etc... I do gain the benefit of optimizations specific to my hardware/environment/etc... but more importantly, I get to learn how everything works together. I might not be writing all the source myself, but there are compilation errors in some cases and I need to figure out how to fix them, I get to learn how to get things to work exactly how I want, I get to write all the configuration files myself as I install them so I can ensure there aren't any idiotic defaults that are going to bite me later, etc. It also means I don't have to rely on a single specific source for updates. I can run all the latest stuff without any failed dependencies and I get them as soon as they're released rather than having to wait for someone to package them.

    Time isn't simply money. Time is an investment. I'm more than willing to spend time, even lots of it, to further educate myself. Time invested in education is exponentially greater than simply investing it in money because increased education means more money when you invest the time to make it.

    --
    Don't leave your mind so open that your brain falls out. Don't close it so much that you cut off the blood.
  16. Re:Source distros by Garen · · Score: 1

    I think much of the attractiveness isn't necessarily that the user gets a lot more performance out of the system with more aggressive optimizations -- but that the source based distros are easier to configure, administer and upgrade.

    That has definitely been my experience with FreeBSD and Gentoo.

  17. Re: -felide-constructors by sir99 · · Score: 1
    Just a note: -felide-constructors seems to have become the default in gcc 3.2. Judging from the manpages of 2.95 and 3.2, the behavior also changed between the C++ draft standard and the final standard.


    One place where gcc isn't quite standards-compliant is with IEEE floating point math. This isn't a problem usually, but for a library I'm writing, I have to enable -ffloat-store or I'll get slightly wrong results. Intel's C compiler is worse on this front by default though, since you have to enable two or three flags to get standard behavior ;-)
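
    A minimal sketch of the underlying issue, under the assumption that the problem is excess precision in floating-point registers (the usual reason for reaching for -ffloat-store on x87 hardware). The helper names are made up, and this is a hand-written stand-in for the effect of the flag, not the library code referred to above.

```c
/* Rounds x to the declared double format by forcing it through a
 * memory location. This is roughly the effect -ffloat-store has on
 * floating-point variables, applied here to a single value by hand. */
double round_to_double(double x)
{
    volatile double stored = x;
    return stored;
}

/* Returns 1 if a + b, once stored as a double, equals c stored as a
 * double, and 0 otherwise. */
int sum_equals(double a, double b, double c)
{
    return round_to_double(a + b) == round_to_double(c);
}
```

    Without the forced store, a compiler targeting x87 may compare a wider intermediate value of a + b against c, so the same source can give different answers depending on register allocation.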

    --
    The ocean parts and the meteors come down
    Laid out in amber, baby.
  18. Re:Source distros by phre4k · · Score: 1

    I ran gentoo for a while. I don't think it is easy to upgrade.

    The dist-upgrade feature is far simpler than:
    http://www.gentoo.org/doc/en/upgrade-to-gentoo-1.4.xml

    --
    "Nobody really checks their email any more. They just delete their spam"
  19. Re:Source distros by FooBarWidget · · Score: 2

    "They're about true dependency resolution - if you can configure the make file, chances are you won't hit any snags when running the program, unlike certain other binary package formats I know."

    That's what the Autopackage project is about: sane autoconf-like dependency resolution for binary packages.

  20. Newbies often write w/o understanding the meaning by r6144 · · Score: 1

    Sure, if you just write in the "correct" way, the code will perform well even without any tricky optimization. However, although you know what the computer will do to execute your C code, many new programmers do not. They will just use doubles when ints should be used, use pow() when calculating 1 [...] Knowing how to write "clean" code is the most reliable way to give your program acceptable performance. Compiler optimization is comparatively much less important for many people (for GCC, you can't get much faster than -O2 or -Os for most code), and one definitely shouldn't rely on it (like changing a/5.0 to a*0.2) too much.
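
    One reason not to count on the compiler for the a/5.0 to a*0.2 rewrite: the two forms are not always bit-identical, because 0.2 has no exact binary representation, so a compiler that keeps strict floating-point semantics cannot substitute one for the other. The sketch below, with made-up function names and assuming IEEE double arithmetic without -ffast-math, counts how often they disagree.

```c
/* Divides by 5.0, which is exactly representable as a double. */
double div_by_five(double a)
{
    return a / 5.0;
}

/* Multiplies by 0.2, which is not exactly representable, so the
 * result can differ from a / 5.0 in the last bit. */
double mul_by_fifth(double a)
{
    return a * 0.2;
}

/* Counts the integers in [1, n] for which the two forms disagree. */
int count_mismatches(int n)
{
    int mismatches = 0;
    for (int i = 1; i <= n; i++) {
        if (div_by_five((double)i) != mul_by_fifth((double)i))
            mismatches++;
    }
    return mismatches;
}
```

    If the multiplication is wanted for speed and the last-bit difference is acceptable, it has to be written by hand or enabled with a relaxed-math option.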