AMD Alleges Intel Compilers Create Slower AMD Code
edxwelch writes "In AMD's recient anti-trust
lawsuit
AMD have examined the Intel compiler and found that it deliberatly runs code slower when it detects that the processor is an AMD.
"To achieve this, Intel designed the compiler to compile code
along several alternate code paths. ... By design, the
code paths were not created equally. If the program detects a "Genuine Intel" microprocessor,
it executes a fully optimized code path and operates with the maximum efficiency. However,
if the program detects an "Authentic AMD" microprocessor, it executes a different code path
that will degrade the program's performance or cause it to crash.""
...software. Something like this would never be implemented in an open source compiler. With open source, you know exactly what you get... with closed, you get what the owners want you to get, which will help their bottom line.
Meh.
They key point here is that in order to invoke Hanlon's razor, you have to be able to *adequately* explain the issue by assuming stupidity instead of malice, though. In other words, the explanation of the issue by means of stupidity must be reasonable; if you (as an unprejudiced, objective person) can instantly dismiss that explanation because it's obviously untrue, then Hanlon's razor does not apply.
:)
Now ask yourself again: do you believe that it's PURE COINCIDENCE that Intel's compiler produces slow code for its competitor's processors? Code that even a novice Assembler programmer would be able to improve instantly (see the rep movsb vs. rep movsd issue mentioned in a comment above, for example?) Do you believe that, for some funny reason, they never happened to notice? That noone ever complained to them about it? That they're really just an innocent victim here?
Well, do you? If yes, please get back to me, I have a used car to sell you.
quidquid latine dictum sit altum videtur.
Part of AMD's claims is outrageous. Why would AMD expect its competitor, Intel, to write software that supports AMD's own products? We would not expect IBM to modify AIX or any other IBM software package to run on SPARC, which is a poorly designed processor.
1. AMD's claim is that the Intel Compiler produces code that actively detects the AMD CPU, then intentionally runs slower code. That's not the same thing as Intel optimizing their compiler for the Pentium Architecture.
2. If you think the SPARC is a poorly designed processor, you need your head checked.
By restricting the GCC compilers to generating only a simple but fast subset of instructions, we could encourage both AMD and Intel to deprecate and, ultimately, eliminate the more complex x86 instructions. Linux and the bulk of open-source software use the GCC compilers and would provide a critical mass of support for a new streamlined transistor-count-reduced x86 chips. Here, I am thinking, "shockingly reduced in power due to using 1/3 of the transistors."
Wouldn't that make the x86 Platform act like one of those "Poorly Designed RISC Processors" you were just complaining about?
In any case, you won't see much of a transistor reduction. Most of the instructions you're trying to avoid are implemented in MicroCode and add no significant overhead to the chip. What *does* add all the transistors is the 20 stage pipeline, branch prediction, superscalar execution, Out Of Order instructions, etc, etc, etc.
Javascript + Nintendo DSi = DSiCade
Most Linux development is done using GCC , Most of windows with MSVC++. Only true hard-core inner-loop optimising geeks usually use Intel C/C++ compilers. These are people like game devs, crypto developers and HPC programmers.
So yeah, there's a lot of code that doesn't work with Amd64 when compiled with ICC. But how many people build stuff on Amd64 with an Intel compiler ?. (remember this is not valid for stuff compiled on a pentium 4 but running on amd64)Quidquid latine dictum sit, altum videtur
Isn't Prescott 32 stages nowadays? Silly Intel. Gotta have the bigger pipeline, huh?
;-)
Indeed. Only Crays and DSPs used to have pipelines that long. A single jump instruction, and you have to flush the entire pipeline! In super-computing and DSP, you almost never see a jump, so there's no concern. But in Intel's zeal to win the clock rate wars, they maxed out the pipeline to an absolutely ungodly length. And a 2.2 GHz AMD64 *still* outperforms a 3.2 GHz Pentium!
Seems that Intel's P4 design backfired on them. Of course, that may be due to Intel's belief that the Itanium would take the market by storm. They did learn from their iAPX 432 chip by at least producing a method for emulating x86, but they failed to take into account how deeply entrenched the x86 performance crowd was. Now AMD64 is eating Intel's lunch! (Oops!)
And as a person who's designed a simple (can't do too much in 10 weeks) 2-issue out of order machine, let me tell you, that's fun stuff. Really makes you appreciate how insanely complex real processors are. And don't even get me started on their branch prediction...
I hear you. Trying to cram a reasonable chip into an FPGA can be quite a challenge. If MicroCode hadn't been invented, it might not be possible to fit one in so few transistors. At least we can finally stop the CISC vs. RISC debate. The MicroCode designs provide CISC instructions on top of RISC cores just to confuse the heck out of both sides. Next up, writing a VI clone in LISP!
Javascript + Nintendo DSi = DSiCade
It's very simple. If I were a programmer buying Intel compilers (I mostly do administrator work) would I have been reasonably led astray by their advertising to think that what Intel was selling was an X86 compiler that didn't play favorites? There's an enormous class action waiting out there for programmers who thought they were getting something (an honest x86 compiler) but werent and had to deal with user complaints from customers who suffered. There's a similar end user class action just waiting for an enterprising lawyer to set up.
End users and programmers have no interest in supporting Intel processor dominance but were tricked into that by Intel's underhanded dealing.
It's one thing if you don't go out of your way to support an entirely different architecture that you don't make yourself, but it's entirely different to intentionally cripple your program when it finds out it's running on your competitor's hardware that's fully compatible with your own.
who depend on sites like Slashdot to cull the most newsworthy items from the multitude of sites, mail lists, and other sources, it is news.
Congratulations. You're on the gcc mailing list and the rest of us must now bow before your mad news reading skills. You are truly one to behold.
Wow... that's a great example, and you should gather as much evidence of it as you can, especially Intel's responses, and send it to AMD's legal team.
/ eng/compilers/clin/index.htm or here http://www.intelcompiler.com/.
If I were Intel, I would respond with the product description of the Intel compiler, as found here http://www.intel.com/cd/software/products/asmo-na
The product is clearly labeled as a high performance compiler for Intel CPUs. The grandparent used the wrong tool for the job which required a generic compiler.
The way a responsible compiler developer generates code in this situation is to say, "I need feature X, Y and Z for this code path--it will be used if all are present."
The only thing the latest Pentium IVs have on the Athlons is the SSE3 extensions, and that is pretty much irrelevant to most code. "It must be true, Intel told us."
If you switch the Intel compiler so that it does NOT generate multiple-CPU-supporting code, you can generate code that runs on Pentium III, IV, and Athlon just fine. (My overly-cheap ECS Duron board burned out, and who cares about Duron anymore anyway?)
(The magic switches are -xK or -xW, BTW.)
But there is no reason for code targetting a Pentium IV restricted to SSE2 at the highest to NOT run on an Athlon. You can "not support" it, sure, but to DELIBERATELY disable the code when you detect a competitor's chip is... well, anti-competitive.
And this isn't a free compiler. You pay for ICC. (There's a for-free-software download, sure, but most people who really care about this stuff are more than willing to pay, and pay well, for the best compiler optimizations they can get.)
The problem is not that they can't guarantee that the AMD chips work. The problem is they are generating instruction streams that run JUST FINE on AMD chips and their runtime code DISABLES that code path (if you use the recommended or default compiler settings).
It's AMD's job to make sure their implementation of IA32 + whatever extensions is good, not Intel's job to disable code on their chips--except based on the processor capability word. (Whatever "flags" from /proc/cpuinfo is properly called.)
What people seem to be saying is that by patching the binary to force the P4 path, there is a significant performance increase on the Athlon. In other words, even though the P4-optimization is not optimal for the Athlon, it exceeds the performance of the "baseline" path and there is no reason to disable it -- other than to cripple AMD's performance.
yes, ICC generates faster code even on AMD-CPU's than GCC (for example) does. But that's not the point AMD is making. AMD claims that ICC detects whether the CPU is an AMD-CPU. If it is, it generates slower executables, even though there's no real technical reason to do so. It does so merely to put AMD at a disadvantage. Yes, the code might still be faster than GCC-generated code (you could say that it tells quite a bit about GCC as well....), but it's still crippled when compared to Intel-code.
Lesbian Nazi Hookers Abducted by UFOs and Forced Into Weight Loss Programs - -all next week on Town Talk.