Genetic Algorithms for GCC Optimization
captain igor writes "For the power users in the house that enjoy taking the time to squeeze every last drop of performance out of their programs, here's an interesting little program I ran across today call Acovea. Out since 2003, Acovea's main function is using genetic algorithms to determine an optimal set off Gcc optimization flags to squeeze the most performance out of a given program. Certainly an interesting concept, definitely worth a look. Some nice results on a P4 and Opteron can be found here "
It's a feature on top of the existing advantages of "compiling into" instead of "installing onto" a system, and a feature that is pretty much esclusive to OSS.
And if you thought that was boring you obviously havn't read my Journal ;-)
Ofcourse it's different if your a developer too, but...
Software should be free as in speech, but if we also get some free beer, all the better.
In the article the author mentioned that the flags toggled by the -O options were unknown.
;)) and the -v flag to see the information.
I know that the listings in the manual are fairly accurate, not perfect though. If you want to know exactly what flags are activated when you compile you use the -Q flag (undocumented, AFAIK
gcc -v -Q file.c
Output:
[...]
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-freg-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-malign-double -mieee-fp -mfp-ret-in-387 -mstack-arg-probe -mcpu=pentium
-march=i386
[...]
One thing I noticed with one of my GCCs is that -fomit-frame-pointer was not activated on -O3, even though the manual says it is...
It does not look like he checked the value of using -mfpmath=sse,387 - only -mfpmath=sse.
I would question just using SSE for floating point if the code is written just using double values - IIRC SSE doesn't like doing doubles very well. Allowing the compiler to use both the 387 mathco registers as well as the SSE registers might be a win here.
The other point about using SSE for floating point would be to use simple floats and see what difference the math options make.
www.eFax.com are spammers
Q: How can you tell you have perhaps gone slightly overboard in making compiler optimization options available?
A: Your users have not just given up trying to reason out what they should do, or even brute forcing every possible combination, they're inventing fucking genetic algorithms to find out what works best.
Basically the act of "calling gcc from the command line" is know officially a murky problem space to attack with pseudo-random hill climbing stuff...
I believe posters are recognized by their sig. So I made one.
Using a GA basically means it randomly tries some flags. While this is nice and automated, I'd rather have the developer understand his/her code and the compiler to the extent that the best flags can be picked deterministically. Besides, the "optimizer" might very well pick flags that work with the test cases, but break the executable at some other place. At worst this might result in a some terribly hard-to-track security vulnerability.
It would be really cool if this technology could somehow be integrated into the Gentoo project.
:)
Of course it would be unreasonable to have the each single ebuild compile and get benchmarked several times on each user's PC, but these genetical algorithms could be used to predetermine the optimal compiler settings for each architecture/ebuild-combination, store this information in a central database and have portage automatically select the optimal compiler setting from that database, each time it compiles an ebuild.
No more figuring out what the best compiler options are: the ebuild maintainers will take on that job for you!
"Oooh, does that mean we get to kick some puffy white mad zionist butt?"
Oh the irony. Your submissions were ignored because they would have been dupes of this article from the 20th April. You were only 3 days late...
#include "coucou.h"
Genetictoo, like Gentoo, but now with Genetic algorithms.
... but I think theres a future in using genetic algorithms in the development of gcc itself. Setup a bunch of Pentium4 HT computers, compile gcc compilers with various optimisations, then have them compile something like apache or the linux kernel, run it and benchmark, and get back to compiling the next gcc. Let the whole system run for months and I wonder if the gcc could do something close to the Intel compiler.
Optimisation features of all processors are documented, but not all compilers know how to use them. Not all compilers know how to arrange code to properly utilize the CPU's pipelines and cache to the max either. This way, one gcc compiler can be genetically optimised, then handed out to the eyeballs out there as alpha/beta for audition before real code is compiled from it. This can probably be done for ALL architectures supported by gcc.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
Why not just brute force this problem, maybe with some distributed computing? GAs are heuristic, and will never be guaranteed to find the global maximum for a given problem. Complicating the issue is the nature of the fitness landscape - series of adjacent vectors in the search space likely do not have smoothly-changing fitnesses, and so a GA is not well-suited to finding the solution.
A code optimizer seems like a natural application for GAs. If you can prove a piece of code's logical equivalence to another's, you can have a code generator produce random versions of the same code (functions, loops, blocks) and then run that as a GA to find the best-performing version. On the other hand, compilation might take ages to run.
...I strongly agree. No other compiler in history has this many knobs and dials. They interact in strange and unpredictable ways. Even worse, the -f (functionality/features) options can't be tested with all the possible -m (machine-specific) ones due to the number of platforms required. So we get bug reports complaining that -ffoo combined with -mbar on the Foonly 3000 causes random explosions killing a busload of nuns, and all we can do is shrug and say, "the -ffoo designers didn't have a Foonly, so just Don't Do That."
One of the things I'm pushing for GCC 4.x is to take an axe to the -f switches. It won't get consensus, of course, but it'll raise interesting discussion.
There's an insult in there somewhere, but I'm not sure what you're trying to attack.
Optimization options have always been a murky problem space in GCC. No other compiler targets as many processors as we do. The Right Thing on one chip will not be the Right Thing on another; that's why we have this mess.
As for what you call "peudo-random hill-climbing stuff," probably with little Dogbert-like waves of your hand, I suggest you take some formal courses in evolutionary algorithms. For solving non-linear discontinuous problem spaces, they are extremely effective, and go well beyond "hill climbing stuff."
One of the original goals of Acovea was to find better combinations of switches for popular platforms, and then make those the default for future major versions. I haven't followed the project as closely as I'd wished, so I can't say whether that's still the goal. Hope so.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
That database should be called "the GCC source tree".
Once a particular set of flags has been shown to be consistently better for a majority of sample real-world code, those settings should become the new defaults for that platform on, say, -O2.
Doing this is slightly tricky, given the current GCC codebase, but that's why God gave us text editors. :-)
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
I have an application which I tried with sse,387 and the speed was slightly worse than sse. But on top of that I got random errors which I tracked down to an if () ... else ... block being exectuted where neither the if nor the else part was taken. A compiler bug for sure.
But it sure is old news for nerds...
If at first you don't succeed, skydiving is not for you
from last November.
poetic that an evolutionary algorithm, analogous to that in nature which evolved the human brain, would be used to optimize compiler options in pareto-optimal search spaces (different platforms).
Brute forcing all permutations is a very silly way to do this, IMHO. Compare an EA solving close to optimally the travelling salesman problem in a short time with the np-completeness of finding the absolute peak. 99% is good enough if 100% takes longer than the age of the universe to figure out for large inputs.
Also, various compiler choices change the binary sizes, cache hit rates, etc, which might lead to solution spaces in which there are many peaks (pareto-optimality). e.g. code size vs memory requirements.
The matrix is coming. Even Indian programmers will be out of jobs soon! haha
I think I'll volunteer to work on this, haven't been more excited about a project in a long time. (that is, after my real work finishes--I'm a gardener--pay is better Lol)
Acovea was linked here in Genetic Algorithms and Compiler Optimizations