Posted by michael from the better-but-more-expensive dept.
deadfx writes "Linux Journal is carrying an article authored by members of the Intel Compiler Lab examining specific optimizations employed by the compiler that allowed it to beat the gcc compiler on several benchmarks."
From what very little I understand, gcc has years of infrastructure focused on multiple machine instruction sets.
I'm pleasantly surprised gcc can do as well as it does considering that it can be built and run on some very unusual and dated pieces of hardware (although, from the recent release notes, it looks like some of the most obscure ones are slipping into oblivion.)
-- "Provided by the management for your protection."
...and why don't YOU look up the meaning of proprietary in a dictionary? SPARC is an open standard. x86 is not.
Stick Men
In fact,
a[x] = b[x] + c[x];
probably compiles into something like (in the C equivalent):
offset_b = x * 4 + &b;
val_b = *offset_b;
offset_c = x * 4 + &c;
val_c = *offset_c;
offset_a = x * 4 + &a;
val_a = val_c + val_b;
*offset_a = val_a;
(Set aside the fact that C automatically scales pointer arithmetic for you. Also ignore for the moment the fact that x86 lets you scale the index by 4 in the load instruction itself -- pretend we are accessing structures of some large or non-power-of-two size.)
Here the computation of x * 4 is redundant: it is done three times, even though we never wrote x * 4 in the original program.
The point is, dumb code doesn't just arise because of dumb programmers, but because of the compilation process. (Also imagine you are calling a macro that computes offsets for you, etc.) Anyway, every compiler implements this level of common sub-expression elimination, even gcc, so don't worry!
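For the curious, here is a rough sketch of the before and after, written out as compilable C. The names add_naive, add_cse and scaled are made up for illustration, and sizeof(int) stands in for the hard-coded 4. A real compiler does this on its intermediate representation rather than on C source, so treat it as a picture of the idea, not of what gcc or icc actually emit.

```c
#include <stddef.h>

/* Naive lowering of a[x] = b[x] + c[x]: the byte offset
 * x * sizeof(int) is recomputed for every array access. */
static void add_naive(int *a, const int *b, const int *c, size_t x)
{
    const char *offset_b = (const char *)b + x * sizeof(int);
    int val_b = *(const int *)offset_b;

    const char *offset_c = (const char *)c + x * sizeof(int);
    int val_c = *(const int *)offset_c;

    char *offset_a = (char *)a + x * sizeof(int);
    *(int *)offset_a = val_c + val_b;
}

/* After common sub-expression elimination: the shared
 * expression is computed once and reused three times. */
static void add_cse(int *a, const int *b, const int *c, size_t x)
{
    size_t scaled = x * sizeof(int); /* the common sub-expression */

    int val_b = *(const int *)((const char *)b + scaled);
    int val_c = *(const int *)((const char *)c + scaled);
    *(int *)((char *)a + scaled) = val_c + val_b;
}
```

Both functions store the same value into a[x] for any in-range x; the second just does one multiply instead of three.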