Intel's Nehalem EX To Gain Error Correction
angry tapir writes "Intel's eight-core Nehalem EX server processor will include a technology derived from its high-end Itanium chips that helps to reduce data corruption and ensure reliable server performance. The processor will include an error correction feature called MCA Recovery, which will detect and fix errors that could otherwise cause systems to crash — it will be able to detect system errors originating in the CPU or system memory and work with the operating system to correct them." Update: 05/27 19:11 GMT by T : Dave Altavilla suggests also Hot Hardware's coverage of the new chip, which includes quite a bit more information.
Error correction on an x86 chip?
Sweet. Now all those high-end server applications running on x86s that need great uptime can finally join the big boys. [rolls eyes].
I'm just not sure of the utility here -- I RTFA, but I'm still not clear on why Intel would cannibalize Itanium sales (new release delayed again) by offering error correction on Nehalem chips. Is the demand for x86 server chips that high? I thought anyone requiring five nines (or anything close to it) would never consider using x86.
Can someone with more knowledge of the high-end server market please clarify?
"Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
I'm a bit surprised this is only seeing the light now: as we get smaller and faster, the number of errors observed goes up amazingly.
Back in the stone age, Cray computers didn't even have parity memory, partly because they were willing to re-run programs but mostly because errors were unlikely. Cray himself famously said "parity is for farmers".
These days, errors are very common, and I'm literally amazed that x86s don't have better-than-ECC error detection and correction. All the commercial Unix vendors have them.
--dave
davecb@spamcop.net
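For anyone wondering what separates the parity mentioned above from ECC-style correction, here is a minimal sketch in Python. It is a textbook Hamming(7,4) code, chosen only for illustration; it is not what Nehalem EX, MCA Recovery, or any real memory controller implements, and the function names are made up for this example. A single parity bit can tell you that a bit flipped, while the Hamming syndrome also tells you which one, so it can be flipped back.

```python
def parity_bit(bits):
    """Even-parity bit: 1 if the count of 1 bits is odd, else 0."""
    return sum(bits) % 2


def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (illustrative only)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    # Layout by position 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome is the 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1  # flip the offending bit back
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    data = [1, 0, 1, 1]

    # Parity alone: the flip is detected, but nothing says which bit is wrong.
    stored_parity = parity_bit(data)
    damaged = data.copy()
    damaged[2] ^= 1
    print("parity mismatch:", parity_bit(damaged) != stored_parity)

    # Hamming(7,4): the flip is located and corrected.
    word = hamming74_encode(data)
    word[4] ^= 1  # simulate a single-bit upset at position 5
    recovered, position = hamming74_decode(word)
    print("corrected position:", position, "data intact:", recovered == data)
```

Note that this only handles a single flipped bit per codeword; two flips in the same word would be miscorrected, which is why practical ECC schemes add an extra parity bit to at least detect double errors.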
And Nehalem is an all in-order design, so they can scale out to very large numbers of cores or register-and-decoder sets on a single chip. That helps offset the huge bottleneck of trying to go to molasses-slow main memory on every cache miss, by allowing another thread to run. Mind you, I'd want enough cores to host 128 threads in order to at least match the new SPARCs, but that can come along later (;-))
You must be thinking of Atom, because Nehalem is definitely an out-of-order processor and not particularly small either. It does use SMT (and a big instruction window) to hide memory latency (and to keep its 4-wide execution engine busy), but that's having multiple threads running on the same core.
Frankly, while Niagara is a very interesting approach that I think will only become more popular in the future (and Atom is theoretically capable of doing the same thing, though right now it's just embedded stuff), for now there are many server apps where single-thread performance still matters greatly, and for that out-of-order is the way to go (as Intel found out the hard way by trying every trick in the book to make an in-order machine fast enough).
The enemies of Democracy are