Slashdot Mirror


Microsoft Advice Against Nehalem Xeons Snuffed Out

Eukariote writes "In an article outlining hidden strife in the processor world, Andreas Stiller has reported the scoop that Microsoft advised against the use of Intel Nehalem Xeon (Core i7/i5) processors under Windows Server 2008 R2, but was pressured by Intel to refrain from publishing this advisory. The issue concerns a bug causing spurious interrupts that locks up the Hypervisor of Server 2008. Though there is a hotfix, it is unattractive as it disables power savings and turbo boost states. (The original German-language version of the article is also available.)"

3 of 154 comments

  1. Broken processors by Anonymous Coward · Score: 5, Insightful

    The processors are clearly broken, and anyone who bought them should get a refund or an exchange. End of story.

    1. Re:Broken processors by hattig · Score: 4, Insightful

      It's pretty serious.

      Server requirements of CPUs include virtualisation and power savings (saving power in the data centre is a top priority for companies now).

      This CPU cannot do both at the same time, at least with Windows Server 2008's Hypervisor. Presumably, however, it is being sold with both listed as features. I agree with the OP - the CPUs are broken as sold and advertised.

  2. Isn't it really a bug in Windows Server? by tomhudson · Score: 5, Insightful

    FTFA:

    For the integrated hypervisor of Windows Server 2008 R2, Microsoft has bravely resorted to a timer function that they themselves had classified as unreliable for former processors: the timer of the Advanced Programmable Interrupt Controller (APIC). Unlike, for example, the CPU timer (Time Stamp Counter, TSC) - which by now is comparatively resistant to power-saving, SpeedStep and turbo-boost modes, but is also virtualised by virtual machines - the APIC timer can also trigger interrupts. Unfortunately, right now, the Nehalem has too many of those, so that the hypervisor falters and then stops, returning the message "Clock_Watchdog_Time-out".

    So yes, if you depend on something that generates an interrupt whose code path may be suspended in certain power-saving modes, don't be surprised if it doesn't get serviced promptly. It looks more like a bug in Windows Server.

    Back in the old days, when you issued a CLI instruction, you made sure your routine didn't do too much work before issuing an STI, because that code isn't re-entrant (the data it touches can be modified directly by the hardware, which is why you have to use the "volatile" keyword to make sure that compilers don't "optimize away" any loops, etc.). That's kind of hard to guarantee if you're putting that portion of the hardware to sleep between interrupts. As the article points out, disabling those power-saving modes fixes the problem.
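
The CLI/STI pattern and the role of "volatile" described in the comment above can be sketched in user space. This is only an analogy, not kernel or hypervisor code: it assumes a POSIX system, uses a signal in place of a hardware interrupt, and uses sigprocmask() in place of CLI and STI. The names ticks, on_tick and run_critical_section are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Written by the handler, read by normal code. "volatile" forces the
   compiler to re-read it from memory instead of caching it in a register. */
static volatile sig_atomic_t ticks = 0;

/* Stand-in for an interrupt service routine. */
static void on_tick(int signo)
{
    (void)signo;
    ticks = ticks + 1;
}

/* Hypothetical helper: returns the tick count observed inside the
   critical section and stores the count seen after unmasking in *after. */
int run_critical_section(int *after)
{
    sigset_t block, old;
    int inside;

    signal(SIGUSR1, on_tick);
    ticks = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    sigprocmask(SIG_BLOCK, &block, &old);  /* analogue of CLI */
    raise(SIGUSR1);                        /* "interrupt" arrives but stays pending */
    inside = ticks;                        /* handler has not run yet */
    sigprocmask(SIG_SETMASK, &old, NULL);  /* analogue of STI: pending signal is delivered */

    *after = ticks;
    return inside;
}
```

Calling run_critical_section(&after) from a single-threaded program should show the handler held off while the signal is masked and run once the mask is restored. The longer the masked region, the longer a pending "interrupt" waits, which is the reason for keeping such sections short.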