Intel Compiler Compared To gcc
Screaming Lunatic writes "Here are some benchmarks comparing Intel's compiler and gcc on Linux. Gcc holds it own in a lot of cases. But Intel, not surprisingly, excels on their own hardware. With Intel offering a free (as in beer) non-commercial license for their compiler, how many people are using Intel's compiler on a regular basis?"
Both machines are running SMP kernels because Hyperthreading makes the single P4 processor look like two processors. If you don't run an SMP kernel, you don't get access to the second virtualized CPU.
GStreamer - The only way to stream!
So does this mean tha mozilla compiled with the intel compiler would run comparable to it's windows counterpart?
I would like to see a test with real desktop applications and desktops, ie. gcc GNOME/KDE vs. icc GNOME/KDE. Would these projects see significant performance improvements from the Intel compiler?
In looking at the selection of benchmarks, it seems like they're all based on scientific numeric problems. In other words, they're all very floating-point intensive. I'll admit that I didn't read it all that carefully, but it looked a bit like reporting SPECfp numbers without looking at SPECint.
Also, the benchmarks used are probably much more loop-oriented than much of the real-world code, but that's typical of benchmarks.
What I would find interesting would be to compile glibc, apache, and something like perl or mysql with both sets of compilers and see what difference you can get with some web server benchmarks. Or compile X and some game and see how the frame rate compares between the two compilers. Or compile X and Mozilla, and find some really complicated pages to see what gets rendered the fastest (possibly using some trick to get it to cycle between several such pages 1000 times).
I'm a little ignorant when it comes to this... gcc and linux have always gone hand in hand in my mind.
Could we see versions of linux distributed with intel compiler instead of gcc? Can the intel compiler compile the kernel?
Clue me in!
--noodles
I was just looking into this the other day--since Intel just supplies their own binaries, it wasn't thrilled about my Gentoo install. Sure, I had rpm, and I could use --nodeps to make it shut up during the install, but even then it didn't like my binaries. Maybe they're too new to be RedHat compatible? ...not to mention the silly license file I had to copy to get the thing to even attempt to install. Thanks, Intel, but no thanks. Until you open the source on this one, I see no need to test it out on my ATHLON. ;)
pb Reply or e-mail; don't vaguely moderate.
I am Pentium of Borg. Division is futile. You will be approximated.
a ^= b; b ^= a; a ^= b;
But Intel, not surprisingly, excels on their own hardware.
Do you mean to imply that Intel knows something about the Pentium architecture or instruction set that the authors of gcc don't? Does the code emitted from the Intel compiler use undocumented instructions? Intel's compiler is newer than gcc and wasn't developed with the "many eyes" that have looked at gcc over the years. It looks like Intel's engineers wrote a better compiler, simple as that.
These benchmarks give gcc a black eye, but I doubt Intel was using undocumented secrets of their chip to defeat gcc. Sometimes the open source community has to admit that not every open source project represents the state-of-the-art.
I've been writing some integer only video compression code, and these results don't really bear out what I've been seeing with GCC 3.1 and Intel C++ 6. I'm getting a consistent 15-20% more framerate under Intel, using an Athlon. An Athlon, god alone knows what we'd be looking at if I was daft enough to buy a P4. Admittedly there are some vectorizable loops in there (that are going to be replaced by primitives at some point), but even without those the performance improvement from C6 was consistent and noticeable.
More relevant is how the performance of C7 is markedly worse on the P3 platform than C6. Very disappointing, makes me wonder what they've done.
Dave
I write a blog now, you should be afraid.
You still have to get the license file and install it where necessary, but try an `emerge icc`. It does work (by doing pretty much exactly what you did by hand...). What I really want is the ability to put USE='icc' in make.conf.
Dan
i've been using icc on a realtime computer vision project that i'm working on. intel's compiler ended up giving me an approximately 30% boost over all --- a difference which is not to be sneezed at. in terms of empirical performance data for my application, icc wins hands down.
;-)
that said, icc does a lot of things that really irritate me. for one, it's diagnostic messages with -Wall are, well, 90% crap. note to intel: i don't care about temporaries being used when initializing std::strings from constants --- the compiler should be smart enough to do constructor eliding and quit bothering me. the command line arguments are somewhat cryptic, as are the error messages you get when you don't get the command line just right. the inter procedural optimization is very *very* nice; however, be prepared for *huge* temporary files if you're doing ipo across files (4+mb for a 20k source file adds up very quickly).
this all said, i don't think that i'm going to give up either compiler. gcc tends to be faster on the builds (especially with optimization turned on) and has diagnostics that are more meaningful to me. fortunately, my makefiles know how to deal with both
I wonder why he didn't turn on -fforce-addr under GCC?
Under the versions of GCC that I have used, I've always found that -fforce-addr -fforce-mem gives a slight speed boost when combined with -O3 -fomit-frame-pointer.
Under GCC 3.2, it looks like -fforce-mem is turned on at optimization -O2 and above, but -fforce-addr does not appear to be turned on, and it seems like it may be of some help in pointer heavy code.
I didn't use -fforce-addr because I didn't think of it! ;)
Based on some work suggested by someone in e-mail, I'm going to see if it's possible to write a "test every option combination" script. Given the hundreds of potential options, we're looking at a REALLY BIG test... ;)
In my view, gcc has far too many options and virtually no real documentation about how those options interact, or even what options go with what circumstances. Very messy, and very hard for people to figure out.
,I hope to alleviate that problem, given time and resources.
All about me
My article is a guideline, not a pronouncement. Your mileage is guaranteed to vary.
All about me
Historically, Intel has always been ahead of the competition in terms of code generation; I've used their Windows compiler for years as a replacement for Microsoft's less-than-stellar Visual C++.
On the Pentium III, the gcc and Intel C++ run neck-and-neck for the most part, reflecting the maturity of knowledge about that chip. The Pentium 4 is newer territory, and Intel has a decided edge in know how to get the most out of their hardware.
I have great faith in the gcc development team, and as my article clearly states:
All about me
Aarrrgggghhhhhh! I'd made that point in an earlier incarnation of the article, and it got lost when I rewrote the conclusions. Thanks for bringing this too my attention; I'll restore the lost text.
All about me
All about me
I'm running Intel C++ and Fortran 95 with Debian "unstable" as my distro (though I provide my own kernel), and it's currently using glibc 2.3.1.
Intel has stated on their web site forum that their compilers don't work with the glibc provided with Red Hat 8.0. I don't have an installation of Red Hat here, so I can't verify the problem.
All about me
Why did SPEC dudes bother splitting between SPECint and SPECfp after all?
Because encryption and other heavy number theory doesn't use floating-point.
Because analog modeling of physical systems such as circuits doesn't use integers except as loop counters and pointers.
Because floating-point hardware draws a lot of power, forcing makers of handheld devices to omit the FPU.
Will I retire or break 10K?
My specific issue has to do with code that looks like this:
;-)
class C {
public:
C(const string& s = "some string");
};
icc wants code that looks like this:
class C {
public:
C(const string& s = string("some string"));
};
The only real difference I see between the two is the explicit creation of a temporary. Now, as to why GCC doesn't complain is another issue --- maybe its diagnostics for temporaries aren't turned on with -Wall (perhaps -pedantic fixes that); however, I have this feeling that GCC's constructor elision is the trick here. To be honest, I'm very curious to find out why this happens. As an interesting aside, Stroustrup tackles the issue of overloading operators in a "smart" way so as to avoid unecessary copies.
Personally, I think Java (and whomever it "borrowed" these particular semantics from) got it right. Unfortunately, Java isn't exactly a good language for talking to hardware
On other architectures, gcc lags WAY behind the native compiler from the hardware vendor, look at sun workshop, compaq alpha compiler or sgi mipspro..
And the more code is made that only compiles with gcc, the more performance wastage on these architectures.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
My specific issue has to do with code that looks like this:
// C.hpp
// C.cpp
class C {
public:
C(const string& s = "some string");
};
icc wants code that looks like this:
class C {
public:
C(const string& s = string("some string"));
};
The only real difference I see between the two is the explicit creation of a temporary. Now, as to why GCC doesn't complain is another issue --- maybe its diagnostics for temporaries aren't turned on with -Wall (perhaps -pedantic fixes that); however, I have this feeling that GCC's constructor elision is the trick here. To be honest, I'm very curious to find out why this happens.
Constructor elision trick? The code
const std::string& s = "some string";
implicitly constructs a temporary std::string and binds it to the reference s. I don't know how the compiler could eliminate the construction of the temporary each time the function is called, unless it compiled it to something like the following:
#include <string>
class C {
static const std::string S_DEFAULT;
public:
C(const std::string& s = S_DEFAULT);
};
#include "C.hpp"
#include <iostream>
const std::string C::S_DEFAULT("some string");
C::C(const std::string& s) {
std::cout << "C::C() called with: " << s << std::endl;
}
You may wish to rewrite your code in this manner because it virtually guarantees that the std::string for the default parameter is constructed once and only once. It also provides an added benefit: if the value of the default changes (from, say, "some string" to "some other string"), then only the C class's translation unit needs to be recompiled.
You know, Einstein once said his biggest mistake was naming his theory 'special relativity'
His theory is not that everything is 'relative', it is that if you specify all the variables, everything is precisely understood.
Moreover, it does not apply to anything else. How is it that you use the postulates of the constancy of the speed of light and of simultaniety, can be applied to the speed of ICC over GCC???
Quote from the article:
"Like Einstein, I have to say the answer is relative."
DOH!
Liberty.