Linux Shootout: Opteron 150 vs. Xeon 3.6GHz Nocona
danalien writes "Anandtech with their previous review have stirred up a bit of controversy, and they've released their follow-up review where they pit AMD's Opteron 150 vs Intel's Xeon 3.6 Nocona (on linux)."
No message here. Oh, did you know that an Athlon64 3000+ is within 2fps of a P4 3.4 Extreme Edition in Doom 3?
Look up the prices for those two items.
Athlon 64 is the name used for the desktop line, and Opteron is the name used for the server/workstation processors.
I submitted this story an hour or two ago, but thinking about it it will be rejected just like everything else, and then pop up under someone else's name.
so what the hell.
Opteron Exposed: Reverse Engineering AMD K8 Microcode Updates
Summary
This document details the procedure for performing microcode updates on the AMD K8 processors. It also gives background information on the K8 microcode design and provides information on altering the microcode and loading the altered update for those who are interested in microcode hacking.
Source code is included for a simple Linux microcode update driver for those who want to update their K8's microcode without waiting for the motherboard vendor to add it to the BIOS. The latest microcode update blocks are included in the driver.
Background
Modern x86 microprocessors from Intel and AMD contain a feature known as "microcode update", or as the vendors prefer to call it, "BIOS update". Essentially the processor can reconfigure parts of its own hardware to fix bugs ("errata") in the silicon that would normally require a recall.
This is done by loading a block of "patch data" created by the CPU vendor into the processor using special control registers. Microcode updates essentially override hardware features with sequences of the internal RISC-like micro-ops (uops) actually executed by the processor. They can also replace the implementations of microcoded instructions already handled by hard-wired sequences in an on-die microcode ROM.
AMD's U.S. Patent 6438664 ("Microcode patch device and method for patching microcode using match registers and patch routines") goes into substantial detail on this.
Typically microcode update blocks are stored in the BIOS flash ROM and loaded into the processor as the system boots. They can also be loaded by the operating system; for instance, Linux contains a microcode device driver for Intel chips.
AMD recently released a "BIOS fix" to motherboard makers to address Errata 109, in which REP MOVS instructions caused subsequent instructions to be skipped under specific pipeline conditions.
Previously it was not clear if and how AMD even supported microcode updates in the K8 family until this announcement. After analyzing a number of BIOS images, it appears that AMD has secretly used the microcode update facility on several occasions over the past few years, but obviously avoided publicly disclosing that it actually had bugs patchable in this manner.
Early K7 (Athlon) cores initially supported microcode updates as well, until ironically the microcode update mechanism itself was found to be broken and subsequently listed as an errata!
The following sections describe the microcode update procedure, obtained by clean room reverse engineering various vendors' BIOS code. The actual microcode update blocks are embedded in the BIOS image; the most recent updates (created June 2004) have been included in the Linux driver source code attached to this description.
Microcode Update Procedure
The update procedure expects the 64-bit virtual address of the update data, including the 64 byte header, to be in edx:eax:
edx = high 32 bits of 64-bit virtual address
eax = low 32 bits of 64-bit virtual address
ecx = 0xc0010020 (MSR to trigger update)
Execute wrmsr with these register values. If the address and update block data are valid, wrmsr completes successfully. Otherwise, a GP fault is taken.
The microcode does not appear to update MSR 0x8B with the new update signature as it does on Intel processors, despite the fact that some BIOS code I have analyzed does seem to check this field. It is possible the MSR is only updated under certain conditions, for instance when microcode is loaded before initializing the cache controller. Nonetheless, as we shall see below, the processor is clearly doing something internally when it claims to accept an update in this manner.
The update generally takes around 5500 clock
http://slashdot.org/~GuyFawkes/journal
Provided you have a NUMA-aware operating system, that is. The OS needs to know which memory is attached to which processor, since access to memory attached to the same processor on which a thread is running will obviously be faster and lower latency than going across hypertransport to a different processor and waiting for an answer.
There are older dual and quad Opteron vs Xeon reviews around.
Humorously, the also say this:
Now we know that the Nocona is here, and it's getting slaughtered at the Altar of The Opteron.
Belief is the currency of delusion.
This still leaves me wondering why an Opteron 250 (2.4GHz, 1MB L2 cache) seems to so seriously outperform an Athlon 64 3500+ (2.2GHz, 512KB L2 cache).
When people says that the first article was bad, it's because it was really bad: 64-bit binaries for Intel vs. 32-bit binaries for AMD, copy&pasted benchmark results from previous 32-bit benchmarks, tests (PI digit computation) that measured the libc optimization instead of the actual benchmark (when removing the printf() it got about a 10x boost). People on aceshardware forums were posting TSCP scores about 2x what Anandtech got, on the same processor. So the A64 3500+ scores you saw in that article are trash. Forget them.