Genetic Algorithms and Compiler Optimizations
mfago writes "Scott Robert Ladd has written an enlightening article and accompanying software package that utilizes genetic algorithms to optimize the optimization flags fed to a compiler. Those who have tried to squeeze the last drop of performance
from a code know that it can be very difficult to determine the set of optimization flags beyond -O3 (with gcc for example) that yields the best performance. Lots of background links included in the article."
It turns out that hand tuning will ALWAYS get you the best optimizations, but human nature dictates that a certain portion of optimizations will never be found. Genetic algorithms will find some of these optimizations in a reasonable time, but not all. So the bottom line is, no code can ever be 100% optimized, unless it is compiled for weeks.
This is a very neat application of genetic algoriths! Haven't seen something so interesting since this Tron cycle game.
Why now just write faster code, and use inline assembler when needed, possibly with SIMD and MMX instructions?
Those compilers are helpful, but as stated in article, squeeze the last drop of performance from a code is what they do.
Note to self: get smarter troll to guard door.
It actually seems like a good idea.Some problems I can imagine are optimization flags may have non speed related side-effects?
:) or maybe the base-line is already close to optimal, and so it isnt a hard problem?
However, this seems like a great candidate for GA's - a fitness function (ie speed of execution), and a nice binary vector of atributes (flags).
Interesting that the gains aren't that great it would seem to sugges that the flags don't do much
But Kudos nevertheless. No expert in the GA field, so its possible someones tackled this before, but if they haven't, extra Kudos.
Winton
The concept is good in that it can take a predefined set of available optimizations and "figure out" (via some AI magic) optimal settings. This is good.
But, a more interesting problem might be more of a meta-compiler problem. Sure, we can optimize a set of options that we humans have determined/guessed are most influential in the building of efficient code. But what about running an analysis on the options themselves?
If we want to get real optimization, why not apply some data-mining theory to the different switches available within the compiler to generate the set of option switches selection?
Then, post selection of switches that are most influential in the compilation process, you can run this genetic algorithm to select "best switches" for the program you're compiling.
Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
There are a couple problems you can come into when getting this deep into compiler optimization flags. 1. Flags will optimize certain parts of code well,but may make other parts slower. 2. There is no guarantee the optimization carries over to other computers.
Because you have a single data point from a related problem domain (that does not even use the same solution!) and the single data point does not perform well, the concepts presented in this article are what? Flawed?
I suspect you either do not understand genetic algorithms, or that you do not have experience in a problem domain where GA could have been successfully used.
Just because you don't understand GA doesn't mean that GA isn't applicable to this problem. I recently had a problem to solve similar to the problem outlined in the article (optimizing results with N variables, where N > 5) and the GA solution beat the pants off the brute-force solution.
Note that I'm actually not saying that GA is a panacea; rather, I'm saying that for some problem domains, like the one in the article, it makes a whole lot of sense.
As such, is there any justification for using a GA to do this rather than, say, simulated annealing? I'd would be interested to see a comparison of the two approaches. Especially in light of papers like this one.
anyways think about the overkill of developing such an elegant solution for such a minor problem. i mean those few percentages of performance enhancement...
I'm sure some people would love a few percent improvement in their SPEC results...
Personally, I have an algorithm that I'm running hundreds of millions of times as part of a larger calculation for my PhD research. I'd love a "few percent" improvement simply by changing the compiler flags.
Acovea is curently a research tool in its own right, and not quite ready for application to general problems. Some issues that require modest effort to resolve:
1) Code currently is pentium3/4 centric
2) Optimizes flags for gcc 3.4 or intel's compiler
3) Cannot easily handle code residing in more than one file.
...because they were able to pick out combinations of optimizations that measuring each optimization individually wouldn't have discovered.
Consider the case in huffbench when
-fmove-all-movables, -freduce-all-givs and -ftracer were discovered to result in great improvements when used with the other GA-selected optimizations -- but not when applied directly to -O3.
Sure, these situations may be specific to the test cases -- but that's not to say his software can't be used to evaluate real-world scenarios as well, such that companies doing a final release compile of their code let a GA churn for a few days to determine the best flags to use.
That's a Good Thing, no?
The Green Hills C compiler had all kinds of optimization techniques but they also had a way to measure the assembler output. One method would eliminate all symbols and make huge code that ran inline. The code was fast but used up a lot of disk space. They had many optimization options and dozens of flags that could be set so that the compiler would actually give you what you wanted. Analyzing the assembly code was a great way to learn the ins and outs of their compiler.
Banjo - The more I know about Windoze, the more I love *nix
This gets me thinking about Nat Friedman's GNU-rope (Grope) project. I heard him talk at ALS back in '98 and then the project seemed to completely disappear. Searching on "gnu.rope" leads to a few mailing list postings asking "where'd it go", but no good information about the project.
The basic idea was to reorder the functions in an executable so that locality of reference was maximized and cache hits were increased. The result is less paging and better performance and memory usage.
The really interesting bit is that the optimization is based on the usage of the program being optimized-- that is to say that my Grope-optimized version of Mozilla might be different than yours based on my differences in usage (i.e. perhaps I browse with images turned off, etc).
The tie-in to the article here is that Nat's system, Grope, used simulated annealing to traverse the n-dimensional space of potential function arrangements and profiled the memory paging of the application as a fitness function of the new arrangement. It's not a GA-- but it's functionally similar.
So-- anybody know what happened to Grope? I'd imagine that a community would spring up around it fairly quickly given the relatively high number of "performance zealots" who are busy "tricking out" their Linux boxes by compiling everything from scratch (think Gentoo) optimized for their procesor. Now they can add the "spolier" or "racing stripe" of having exectuables specifically re-ordered for their personal usage patterns! *smirk*
The Attitude Adjuster, I hate me, you can too.
True, but you can still get some improvement with a procedure like this:
In fact, this might get you better results overall than running the genetic algorithm against the whole project. Sometimes the optimal thing is to have one set of optimizer flags for a certain chunk of code and then another set for another chunk.
Having said all that, this article was mainly intended for research purposes. The goal seems to have been to find out more in general about which flags matter and how much than it was about producing a technique for everyday compiling.
In other words I randomly swapped pairs of instructions that could be swapped, if it saw an improvement it kept it and if it saw a degradation it rejected it with a probability depending on how bad it made things.
After hours of running it would sometimes trim a clock cycle or two per loop off a short assembler routine.
Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
What we should be most skeptical about is that Mr. Ladd wanted to study the effects of using a GA on optimization, yet he neglected to run a control sample involving random perturbation of options from the start case (-O1 or -O3), without selection.
If, as I suspect, the search space is disorderly, then random perturbation should, among the same number of test compilations, be able to find a solution comparable to that which his GA finds.