Slashdot Mirror


Intel's Nehalem EX To Gain Error Correction

angry tapir writes "Intel's eight-core Nehalem EX server processor will include a technology derived from its high-end Itanium chips that helps to reduce data corruption and ensure reliable server performance. The processor will include an error correction feature called MCA Recovery, which will detect and fix errors that could otherwise cause systems to crash; it will be able to detect system errors originating in the CPU or system memory and work with the operating system to correct them." Update: 05/27 19:11 GMT by T: Dave Altavilla also suggests Hot Hardware's coverage of the new chip, which includes quite a bit more information.

4 of 80 comments

  1. Re:x86 by Lally Singh · · Score: 4, Insightful

    They're not; nobody buys Itanium. They're going after SPARC and POWER. Lots of people are looking at the speed and throughput of modern x86 and noticing the price difference. Especially in this economy.

    And with Ellison in control of SPARC, it's the best way to go.

    --
    Care about electronic freedom? Consider donating to the EFF!
  2. Re:ECC memory replacement? by davecb · · Score: 4, Insightful

    I'm a bit surprised this is only seeing the light of day now: as chips get smaller and faster, the number of errors observed goes up amazingly.

    Back in the stone age, Cray computers didn't even have parity memory, partly because they were willing to re-run programs but mostly because errors were unlikely. Cray himself famously said "parity is for farmers".

    These days, errors are very common, and I'm literally amazed that x86s don't have better-than-ECC error detection and correction. All the commercial Unix vendors have them.

    --dave

    --
    davecb@spamcop.net
  3. Re:x86 by Chris Burke · · Score: 5, Insightful

    Error correction on an x86 chip?

    Sweet. Now all those high-end server applications running on x86s that need great uptime can finally join the big boys. [rolls eyes].

    Is the demand for x86 Server chips that high? I thought anyone requiring 5 nines (or anything close to it) would never consider using x86?

    The story of the server market for the last 10+ years is simple: x86 has been eating everyone else's market share from the bottom up. Commodity pricing > perceived advantages of the proprietary RISC vendors. To the extent that there are real necessary features x86 lacked, it has acquired them as necessary.

    There's been correctable ECC on x86 server chips for years. x86 has long since moved up-market past the point where basic RAS features (like ECC) are mandatory. Intel's Xeon has had these features for a long time. AMD's Barcelona core was the first to have correctable ECC in the L1 caches -- before that, it could detect errors but couldn't fix them.
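
    The difference between detecting and correcting can be sketched in a few lines. This is a toy illustration, not how any particular Xeon or Opteron cache works: a single parity bit notices a flipped bit but cannot say which one, while a Hamming(7,4) code locates the bad bit and flips it back. Real hardware uses wider codes, and the function names here are made up for the example.

```python
def parity_bit(bits):
    """Even parity over a list of 0/1 values: 1 if an odd number of bits are set."""
    return sum(bits) % 2

def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome is the 1-based position that was fixed."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the offending bit back
    return [c[2], c[4], c[5], c[6]], syndrome

data = [1, 0, 1, 1]

# Parity only: the mismatch is noticed, but there is no way to tell which bit flipped.
stored_parity = parity_bit(data)
damaged_data = [1, 0, 0, 1]
print(parity_bit(damaged_data) != stored_parity)  # True: error detected, not located

# Hamming(7,4): the syndrome points at the flipped position, so it can be repaired.
damaged_codeword = hamming74_encode(data)
damaged_codeword[4] ^= 1  # corrupt codeword position 5
print(hamming74_correct(damaged_codeword))  # ([1, 0, 1, 1], 5)
```

    Note that this toy code only handles a single flipped bit per codeword; two flips would be silently "corrected" to the wrong value.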

    Basically, the only new feature here is the ability to notify the OS about uncorrectable errors so that the OS can try to fix the problem by nuking the affected app, reloading a code page from disk, or whatever else is appropriate, so that a system reboot isn't always necessary on uncorrectable errors.
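
    As a rough sketch of that decision (a hypothetical policy, not Intel's MCA Recovery interface or any real kernel's handler; every name below is invented for illustration), the OS picks the least drastic action that still throws away the corrupted data:

```python
from enum import Enum, auto

class PageKind(Enum):
    CLEAN_FILE_BACKED = auto()  # e.g. program code with an intact copy on disk
    PROCESS_PRIVATE = auto()    # data owned by a single application
    KERNEL = auto()             # state the OS itself depends on

class Action(Enum):
    RELOAD_FROM_DISK = auto()
    KILL_PROCESS = auto()
    REBOOT = auto()

def choose_recovery(kind: PageKind) -> Action:
    """Pick the least drastic response to an uncorrectable error (hypothetical policy)."""
    if kind is PageKind.CLEAN_FILE_BACKED:
        return Action.RELOAD_FROM_DISK  # the good copy still exists on disk
    if kind is PageKind.PROCESS_PRIVATE:
        return Action.KILL_PROCESS      # only one application loses its state
    return Action.REBOOT                # nothing safe is left to fall back on

for kind in PageKind:
    print(kind.name, "->", choose_recovery(kind).name)
```

    Without the notification, every case collapses into the last branch.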

    Yeah, this is something the "big boys" already had; fat consolation that will be now that x86 is poised to eat their lunch. Not even Intel themselves could reverse the trend when they tried. They could use features like this to differentiate Itanium all they want; at the end of the day the customer says "yeah that's great, but can you do it in an x86 chip?" This is just them bowing to the demands of the market (in order to make mega $$).

    --

    The enemies of Democracy are
  4. Re:x86 by Anonymous Coward · · Score: 4, Insightful

    x86 is slow and under performing architecture

    So right there you've destroyed your credibility. You couldn't be any more wrong if your name was W. Wrongy Wrongenstein.

    Right now, x86 processors are the highest performance in the world.

    and I am surprise that Intel is bolting error correction on top of it

    Well, that just shows you aren't paying attention to the trends of where x86 is going any more than you've been paying attention to its performance. x86 has been gradually moving up market into higher and higher tiers of servers for well over a decade now.

    The Intel instruction set is so complicated that often times a single bit being flipped means it is still a very much valid opcode which when executed will do something completely different from what you expect it to do.

    And now we see that you don't have much clue about instruction set encoding, either.

    There is literally no commercially viable instruction set for which the above is NOT true. Look at a traditional RISC instruction set with 3 operands and 32 GPRs. Almost half of the bits (15 of them) in every 32-bit ALU instruction for such a processor are register addresses. Flip any of those bits and the register address is still valid -- there are no invalid addresses, so the processor can't tell the difference between the wrong address and the right one. The remainder of the bits in such an instruction are typically instruction format select, opcode select, and miscellaneous control bits. Flip an opcode bit and you'll get the wrong ALU op, more often than not... processor designers leave some room for adding opcodes, but typically not a lot.
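
    The register-field point is easy to check with a toy encoding. The 32-bit layout below is hypothetical (it is not MIPS, SPARC, POWER or any other real ISA); it only assumes three 5-bit register fields, as described above, and counts how many single-bit flips in those fields still decode to valid register numbers.

```python
# Hypothetical 32-bit three-operand format (not any real ISA):
#   [31:26] opcode   [25:21] rd   [20:16] rs1   [15:11] rs2   [10:0] misc
REG_FIELDS = {"rd": 21, "rs1": 16, "rs2": 11}  # field name -> lowest bit position
NUM_GPRS = 32

def encode(opcode, rd, rs1, rs2, misc=0):
    """Pack the fields into one 32-bit instruction word."""
    return (opcode << 26) | (rd << 21) | (rs1 << 16) | (rs2 << 11) | misc

def decode_regs(word):
    """Extract the three 5-bit register numbers from an instruction word."""
    return {name: (word >> shift) & 0x1F for name, shift in REG_FIELDS.items()}

def undetectable_register_flips(word):
    """Count single-bit flips in the register fields that yield valid but different registers."""
    original = decode_regs(word)
    count = 0
    for bit in range(11, 26):  # the 15 register-address bits
        regs = decode_regs(word ^ (1 << bit))
        if regs != original and all(r < NUM_GPRS for r in regs.values()):
            count += 1
    return count

word = encode(opcode=0b000001, rd=3, rs1=4, rs2=5)
print(undetectable_register_flips(word))  # 15
```

    Every one of the 15 flips names a real register, so the decoder has nothing to object to.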

    See, the only way an instruction set can guard against bit flips is not by simplicity (as you implicitly claim), it's by being horribly wasteful. When people design instruction encodings, they look at the width of all the bit fields in each instruction format and use the smallest they can get away with. Instruction sets which aren't efficiently packed aren't any good: they use more memory to store program code, have reduced effective icache size for the same number of bits in silicon, tend to have major clumsiness (such as too-small immediate operand sizes, or too-small relative branch windows), and so forth. Efficient packing always means there are very few invalid bit patterns for each field in the instruction; if you have a lot of invalid patterns you probably could be packing the instruction tighter. Few invalid patterns means that most bit flips still produce a valid instruction.
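
    To put rough numbers on the packing trade-off, here is a small sketch. It assumes, purely for illustration, that a field's valid encodings are the contiguous values 0 to num_valid - 1, and it measures what fraction of single-bit flips turn one valid value into another valid value.

```python
def valid_flip_fraction(num_valid, width):
    """Fraction of single-bit flips of a valid field value that give another valid value.

    Assumes (for illustration only) that the valid encodings are 0 .. num_valid - 1
    inside a field that is `width` bits wide.
    """
    hits = total = 0
    for value in range(num_valid):
        for bit in range(width):
            total += 1
            if (value ^ (1 << bit)) < num_valid:
                hits += 1
    return hits / total

# Tightly packed: 32 registers in a 5-bit field, so no flip is ever caught.
print(valid_flip_fraction(32, 5))  # 1.0

# Wasteful: the same 32 registers in an 8-bit field, so flips in the 3 spare bits are caught.
print(valid_flip_fraction(32, 8))  # 0.625
```

    Even after spending 60% more bits on the field, well over half of the flips still go unnoticed, which is why this kind of protection is left to ECC on the storage rather than to the encoding.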

    This seems to be nothing short of a stopgap measure for not losing more customers to the big iron manufacturers like Sun and IBM who both have their own CPU's that were built with stability in mind.

    Idiot. Intel isn't losing big iron marketshare to IBM and Sun. It's taking big iron marketshare from them. Adding big iron RAS features to x86 is the next step in that trend.

    x86 has moved into areas where it simply is not going to shine as brilliantly as it did on the desktop. The only issue is that moving to a new platform is going to be catastrophic in that too many people rely on it. Apple being able to transition from PowerPC to x86 is quite a feat, but x86 transitioning to the next big thing is going to be impossible without at least backwards compatibility in the form of x86 emulation, and boy is the x86 instruction set fun to emulate!

    1990 called, and it wants its foolish predictions of where x86 cannot go back.

    Much better informed people than you thought, back then, that x86 could never be a workstation or server CPU in any capacity at all. It was just a personal computer processor, and a rather ugly and slow one at that.

    Instead, Intel proved they could make fast x86 processors, and steadily increased x86 presence in the workstation and low end server market throughout the 90s, with an assis