Erratum Plagues Quad-Core Opterons, Phenoms
theraindog writes "Errata are not uncommon with new processors, but a problem with the TLB logic in AMD's quad-core Opteron and Phenom processors appears to be quite serious. The erratum is so severe that AMD has issued a 'stop ship' order on all quad-core Opterons. AMD has also blamed this bug for the delay of the 2.4GHz Phenom, despite the fact that the erratum is unrelated to clock speed. A BIOS-based workaround for the issue has been made available to motherboard makers, but it apparently carries a 10-20% performance penalty. What's more disturbing is that AMD knew of the erratum and the potential performance hit associated with fixing it before it launched the Phenom processor. Hardware provided to the press for reviews did not include the fix, conveniently overstating Phenom performance."
Wow, bad times for AMD. They're losing the war against intel, and now have another set back. A 20% performance penalty is simply unacceptable for any processor. The fact that it is for brand new ones makes it an even bigger slap in the face for consumers.
Every CPU maker publishes the errata for their CPUs because system designers/vendors/whatever need to know these things. Every CPU made for the past (insert very long time in the computer world here) has had a big list of errata publicly published. Just got to the Intel or AMD site, for example, and look up the errata on the PPro, P3, P4, Core, Core2, Athlon, Athlon XP, Athlon64, Athlon64 X2, or whatever your favorite CPU happens to be.
The thing is, the CPU is actually broken a bit and AMD has pulled the Barcelona line but are continuing to sell the Phenom(inal Failure) line to customers and, evidently, don't plan to 'fix' the problem later (Intel offered replacements for the Pentium floating point bug after they got dinged on it, for example... I know... I had one and replaced it).
So... if you actually get your hands on (or got your hands on) a Phenom, realize you have a broken CPU and the more you load it, the more likely you'll have stability issues.... and AMD isn't (currently) going to fix it.
AMD has also blamed this bug for the delay of the 2.4GHz Phenom, despite the fact that the erratum is unrelated to clock speed. [Emphasis added.]
Why does the summary claim this? I read through both articles, and AMD says this is a hardware issue across both chip models. Since this is a hardware issue, wouldn't it stand to reason that AMD would hold up a related chip because it's a hardware bug across both chip models and not because it's a clock speed issue? I'm not sure where the "despite" comes into play. I didn't see where the article said that AMD is not delaying a different speed Phenom.
It's not like there aren't problems with Intel's CPUs - just take a look at the problems with the MMU in the Core 2 - but no-one is suggesting Intel is doomed. It would just be better if AMD had admitted this when they first knew about the issue rather than sending out review units that are known to have serious issues.
There are other possibilities that are more likely. For example, perhaps the patched kernel is doing something like loading microcode into the processor. The kernel code would be GPLed but the microcode would not be.
The idea was to gain some cash to sustain operations until a faultless (i.e. no major faults) CPU can be released. Those that bought faulty CPUs will get their CPUs replaced as soon as faultless CPUs are completed. In some sense you can look at AMD's action as taking out a long term loan.
A counter argument to my theory can be that AMD would not risk its reputation to take out a "cash loan" in such a manner. However, the risk of losing reputation is justified if we consider another major factor at play: the holidays. It is less likely that AMD would gain the same (or even close to the same) cash flows if they would have released the CPUs after the holidays.
AMD now has some cash and is able to breath a little bit. When it releases fixed CPUs it will be able to continue where it left off.