Flaws In Intel Processors Quietly Patched
Nom du Keyboard writes "According to this article in The Inquirer and this Microsoft Knowledge Base article, a fix for some significant problems in many of Intel's most recent processors has been quietly released — by whom is not clear. Patches are available on Microsoft's site. Affected processors include Core 2 Duo E4000/E6000, Core 2 Quad Q6600, Core 2 Xtreme X6800, XC6700, and XC6800. Details on just what has been fixed are scanty (it's called a 'reliability update'), however, it's probably more important than either Intel or Microsoft is openly admitting." There is no indication that Apple users are affected.
This patch affects the microcode, which is the set of underlying machine instructions: http://en.wikipedia.org/wiki/Microcode
How could this not affect Intel Macs? They use the same machine instructions that everyone else does!
Two months ago, Intel introduced microcode updates for all systems with an Intel® Core(TM) 2 Duo processor. According to an HP Tech Support Document:
While the implications of the issue are difficult to quantify, any of the following symptoms can occur:
* The system may stop responding to keyboard or mouse input.
* A system operating in a Microsoft Windows environment may generate a blue screen.
* A system operating in a Linux environment may generate a kernel panic.
This was the first I had heard of this; probably a good time to check for BIOS or microcode updates.
The HP link also indicates the nature of the problem, which should not be OS specific:
This Intel microcode update addresses an improper Translation Lookaside Buffer (TLB) invalidation that may result in unpredictable system behavior such as system hangs or incorrect data.
Can You Say Linux? I Knew That You Could.
Yeah, because the processor's documentation page is so hard to find. (Look under "specification update".) For the desktop Core 2 Duo processors, there are 59 pages (PDF) of errata documentation. Updated May 2007...
The real "Libtards" are the Libertarians!
Typically it is only sequences of instructions that would trigger these bugs. In other words, the CPU has to be in a certain state to trigger the bug. Some OSs will never get in that state. The bugs are surely something like this because otherwise crashes would be far more common than we see.
The reason I mention cache handlers is that they are notoriously tricky and have proven buggy before. The Core 2 Duo CPUs need new cache handlers to handle the dual (and more) cores, and thus this area is more likely to be buggy.
Engineering is the art of compromise.
Incorrect. Microcode on Intel processors can be updated live by software. This has been possible for ages. For information on how this can be done in Linux for example, see here.
- Michael T. Babcock (Yes, I blog)
So here's the deal.
Intel processors don't directly execute instructions anymore. They translate x86 into a series of other operations -- an internal code, if you will. Sometimes there are bugs in the code that's generated. Microcode patches address those bugs.
Unless Intel has an update mechanism I'm not aware of, this is a Microcode update, and this is how they are always released.
And for what it's worth, it doesn't patch anything; it loads into the processor at boot. Delete the microcode file or remove the OS and the processor will be just as you bought it.
Just be glad they were smart enough to use such a system where the processor can be updated while running and temporarily, allowing you to revert back to its purchased state.
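Since the update only lives in the processor until power-off, it can be handy to check which revision is actually loaded right now. Here is a minimal Python sketch, assuming a Linux kernel recent enough to report a "microcode" field in /proc/cpuinfo (older kernels may not). The function name and the sample text are made up for illustration.

```python
def microcode_revisions(cpuinfo_text):
    """Map logical processor number -> microcode revision string.

    Assumes the /proc/cpuinfo layout of "key : value" lines, with a
    "processor" line starting each CPU's block.
    """
    revisions = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "microcode" and current is not None:
            revisions[current] = value
    return revisions


# Hypothetical sample; the revision values are invented for illustration.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Hypothetical CPU\n"
    "microcode\t: 0xba\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Hypothetical CPU\n"
    "microcode\t: 0xba\n"
)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # not on Linux: fall back to the sample
    for cpu, rev in sorted(microcode_revisions(text).items()):
        print(f"cpu{cpu}: microcode {rev}")
```

If the revision printed after a cold boot without the OS loader differs from the one printed after the loader has run, that difference is the temporary update described above.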
Everybody publishes errata. AMD's are at: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf (starting on page 12)
A deep unwavering belief is a sure sign you're missing something...
It's actually quite likely. CPU errata tend to affect corner cases. E.g.: the CPU returns wrong data if you read from an I/O port while servicing a TLB miss (or something like that). These bugs tend to be highly timing and sequence dependent, and it's very likely that no two OSs use exactly the same sequence that triggers the bug.
A deep unwavering belief is a sure sign you're missing something...
However, any _compiler_ worth its salt will try to use every bit of microcode it can to optimize for a given architecture or microarchitecture
Actually, compilers try to avoid micro-coded instructions like the plague. On most x86 processors, micro-coded instructions can only issue out of a single issue slot at a fixed rate, and hence their use drastically lowers performance. Modern compilers generally treat the x86 like a RISC with a weird condition register and fancier addressing modes.
A deep unwavering belief is a sure sign you're missing something...
The Linux kernel is not currently affected, though some multi-processor apps with homegrown assembly might be.
The problem is some sort of atomic operation sequence. Somebody let slip a reference to the bug on a mailing list today, without any real details. Probably the details are still under NDA.
E4000 - doesn't exist
E6000 - doesn't exist
Q6600 - k, this one does exist
X6800 - this one exists too
XC6700 - doesn't exist
XC6800 - doesn't exist
Of course, they probably meant E4000 and E6000 series, and maybe they meant QX6700 and QX6800...
I guess it was the Inquirer's fault. But they probably could have just said "all Core 2 Duos, Extremes, and Quads."
You can download the software developers manual for Intel's line of processors, which covers pretty much everything you ever needed to know, lots you probably didn't, and then some.
It's historically been 3 volumes, but these days they have volume 2A, 2B, 3A, 3B, plus there is the optimization reference, and some changes and notes.
Have a blast!
Intel has released an errata document.
For June 2007 it lists 3 new errata:
AH106
A memory access may get a wrong memory type following a #GP due to WRMSR to an MTRR mask.
AH107
PMI while LBR freeze enabled may result in old/out-of-date LBR information
AH5P
VTPR may lead to a system hang
However, the document states that there are no fixes available. So it's probably not what MS/Intel is addressing here.
1) The Pentium FDIV bug produced an incorrect answer in 1 in 9 billion double precision floating point divides. It did not affect integer divides.
2) The answer always contained at least 14 correct significant bits (usually more, but an error in the 15th significant bit was the worst case). This means that single precision calculations were almost invariably correct.
3) Any hack to solve the problem would have been hundreds of times slower than just living with a small error in so few calculations.
4) All games today get by just fine using single precision floats for rendering.
5) It took a guy (Thomas Nicely) with a Ph.D. doing heavy research in computational number theory to find it, yet you found it while working on a game in QuickBasic.
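To put numbers on points 1 and 2, here is a small Python sketch built around the widely circulated FDIV test case, 4195835 / 3145727. Modern hardware cannot reproduce the bug, so the flawed quotient is hard-coded from the commonly reported value; treat that constant and the helper names as my assumptions, not something from the posts above.

```python
import math

# Operands of the widely cited FDIV test case.
X = 4195835.0
Y = 3145727.0

# Quotient commonly reported for flawed Pentiums (hard-coded assumption;
# a correct FPU returns about 1.3338204491).
FLAWED_QUOTIENT = 1.33373906890203759


def residual(quotient, x=X, y=Y):
    """Return x - quotient * y, which is ~0 when the divide was correct."""
    return x - quotient * y


def correct_bits(approx, exact):
    """Estimate how many leading significant bits of approx are right."""
    rel_err = abs(approx - exact) / abs(exact)
    return math.inf if rel_err == 0 else -math.log2(rel_err)


if __name__ == "__main__":
    good = X / Y
    print("residual, correct divide:", residual(good))
    print("residual, flawed divide: ", residual(FLAWED_QUOTIENT))
    print("correct bits in flawed quotient:",
          round(correct_bits(FLAWED_QUOTIENT, good), 2))
```

The flawed residual comes out at roughly 256 instead of 0, and the flawed quotient still agrees with the correct one to about 14 significant bits, which lines up with the worst case described in point 2.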
I think Nicely said it best in his FDIV flaw FAQ: and also:
Education is a better safeguard of liberty than a standing army.
Edward Everett (1794 - 1865)
b) If you're running a Red Hat-derived distro, watch out for updates to the kernel-utils package, which provides microcode_ctl and /etc/firmware/microcode.dat. It might also be worth checking Tigran's site a bit more regularly. I note that his page includes a microcode.dat which is about 7 months newer than that currently provided by CentOS 4.5's kernel-utils package.
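If you want to compare how old two microcode.dat files are without trusting file timestamps, the update header itself carries a revision and a date. A rough Python sketch follows, assuming the file is plain text made of comma-separated hex words with C-style comments, and that the first words follow the update header layout documented in Intel's software developer's manual (header version, update revision, BCD date as mmddyyyy, processor signature). The sample data and function names are hypothetical, and only the first update in a file is decoded.

```python
import re
from datetime import date


def parse_dwords(text):
    """Return every 0x... word in the text as an int, skipping /* */ comments."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    return [int(tok, 16) for tok in re.findall(r"0x[0-9a-fA-F]+", text)]


def bcd(value, digits):
    """Decode a BCD field, e.g. 0x2007 -> 2007."""
    return int(f"{value:0{digits}x}")


def decode_header(dwords):
    """Decode the first four header words of the first update in the list."""
    if len(dwords) < 4:
        raise ValueError("need at least four header words")
    header_version, revision, raw_date, signature = dwords[:4]
    return {
        "header_version": header_version,
        "revision": revision,
        # Date is stored as mmddyyyy in BCD.
        "date": date(bcd(raw_date & 0xFFFF, 4),
                     bcd((raw_date >> 24) & 0xFF, 2),
                     bcd((raw_date >> 16) & 0xFF, 2)),
        "signature": signature,
    }


# Hypothetical sample; these values are invented for illustration.
SAMPLE = """\
/* hypothetical update */
0x00000001, 0x000000ba, 0x06152007, 0x000006f6,
0x00000000, 0x00000001, 0x00000001, 0x00000000,
"""

if __name__ == "__main__":
    hdr = decode_header(parse_dwords(SAMPLE))
    print(f"revision {hdr['revision']:#x}, dated {hdr['date']}, "
          f"signature {hdr['signature']:#x}")
```

Running the same decode over the distro's file and a freshly downloaded one would show whether the download really carries a newer update for your processor signature, though a real file holds many updates back to back and would need a loop over them.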
The microcode needs to be updated every boot. It's volatile and resets when you turn off the system. See http://urbanmyth.org/microcode/
As far as I know, all OSes do this.
Isn't it more plausible that the file names have the word "genuine" in them because, like many patches, they're only available to activated Windows boxes, and that it's just some random bug in the microcode being fixed?
The bug in question is the bug in the TLB that was discovered back in April. Here's HP's page on it. I think that the only reason it's news today is because Microsoft has either just released or re-released a patch to fix the issue on Windows boxes.