A Review of GCC 4.0
ChaoticCoyote writes "
I've just posted a short review of GCC 4.0, which compares it against GCC 3.4.3 on Opteron and Pentium 4 systems, using LAME, POV-Ray, the Linux kernel, and SciMark2 as benchmarks. My conclusion:
Is GCC 4.0 better than its predecessors? In terms of raw numbers, the answer is a definite "no". I've tried GCC 4.0 on other programs, with similar results to the tests above, and I won't be recompiling my Gentoo systems with GCC 4.0 in the near future. The GCC 3.4 series still has life in it, and the GCC folk have committed to maintaining it. A 3.4.4 update is pending as I write this.
That said, no one should expect a "point-oh-point-oh" release to deliver the full potential of a product, particularly when it comes to a software system with the complexity of GCC. Version 4.0.0 is laying a foundation for the future, and should be seen as a technological step forward with new internal architectures and the addition of Fortran 95. If you compile a great deal of C++, you'll want to investigate GCC 4.0.
Keep an eye on 4.0. Like a baby, we won't really appreciate its value until it's matured a bit.
"
http://www.kdedevelopers.org/node/view/1004
;)
Qt:
             -O0     -O2
gcc 3.3.5    23m40   31m38
gcc 3.4.3    22m47   28m45
gcc 4.0.0    13m16   19m23

KDElibs (with --enable-final)
             -O0     -O2
gcc 3.3.5    14m44   27m28
gcc 3.4.3    14m49   27m03
gcc 4.0.0     9m54   23m30

KDElibs (without --enable-final)
             -O0
gcc 3.3.5    32m56
gcc 3.4.3    32m49
gcc 4.0.0    15m15
I think KDE and Gentoo people will like GCC 4.0.
The reason the Intel compiler generates faster code is that it does auto-vectorisation (i.e., it automatically works out how to transform certain code patterns to take advantage of native vector operations, such as those provided by SSE). The GCC developers have started to implement this in gcc 4.0, but it's a very first iteration that, as far as I know, is still rather limited. I'm not even sure it's enabled by default, even at -O3. Lots of improvements in this area are targeted at gcc 4.1.
The whole point of gcc 4.0.0 is the tree-ssa work. The author of this test didn't seem to notice that this stuff isn't enabled by -O2 or -O3 and has to be enabled by hand. This includes autovectorization (-ftree-vectorize), among other things that may make a difference.
If I were him, I'd repeat the tests with the -ftree options enabled when building with gcc 4.0.0.
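For readers wondering what kind of code the vectorizer is aimed at, here is a minimal sketch of a loop that is a typical candidate. The function name add_arrays and its parameters are made up for illustration, and whether gcc 4.0.0 actually vectorizes this particular loop depends on the target and on flags; something along the lines of "gcc -std=c99 -O2 -ftree-vectorize -msse2" would be the thing to try.

```c
#include <stddef.h>

/*
 * Hypothetical example: element-wise sum of two float arrays.
 * The loop has a simple counted form, no function calls, and no
 * dependence between iterations, which is the shape an
 * auto-vectorizer looks for. "restrict" (C99) promises the compiler
 * that the three arrays do not overlap.
 */
void add_arrays(float *restrict out,
                const float *restrict x,
                const float *restrict y,
                size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = x[i] + y[i];   /* one add per element; SSE can do four at once */
}
```

The result is the same with or without vectorization; only the generated machine code differs, so any benchmark difference comes from how many elements are processed per instruction.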
I think the author of the article misunderstands just what happened with GCC 4.0.
The main improvement in GCC 4.0 is the implementation of Static Single Assignment (SSA) form.
SSA is not an optimization. It is a simplification. If you can assume SSA, then it opens the door to an entire class of optimizations that can help improve your performance without affecting your code's correctness.
That last bit -- optimizing code without affecting correctness -- was a big problem in the days before SSA.
In that regard, SSA is a similar technology to RISC -- it does not speed things up by itself, but it enables speedups for later on.
The lack of SSA is one thing that kept gcc out of the hands of compiler researchers. Now that gcc has it, academia can start hacking away, and the delay you should expect is the time between implementing SSA and implementing all of the optimizations that really will improve code performance.
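To make the SSA idea concrete, here is a rough sketch in plain C rather than in gcc's internal representation. The function names are invented for illustration. The second function is a hand-written imitation of what SSA form looks like: every variable is assigned exactly once, and the conditional expression stands in for the "phi" that merges the two control-flow paths.

```c
/* Ordinary code: the variable x is reassigned several times. */
int original(int a, int b)
{
    int x = a + b;
    x = x * 2;
    if (x > 10)
        x = x - 10;
    return x;
}

/*
 * Hand-written SSA-style equivalent (illustrative only): each name
 * is defined once, so "which value of x is this?" never has to be
 * worked out by the optimizer.
 */
int ssa_style(int a, int b)
{
    const int x1 = a + b;
    const int x2 = x1 * 2;
    /* Plays the role of phi(x2 - 10, x2): picks the value by path taken. */
    const int x3 = (x2 > 10) ? (x2 - 10) : x2;
    return x3;
}
```

Both functions compute the same result. The second form is not faster by itself, which is the point made above: it just makes later analyses, such as constant propagation or dead-code elimination, much simpler to write correctly.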