AMD Takes 25 Percent of Server Market
An anonymous reader writes "AMD has taken 25 percent of the server market for itself, according to a News.com article. This gives them some 21 percent of the entire x86 market, and is an increase from only 16 percent in the second quarter of 2005." From the article: "AMD has been picking away at Intel's server market share for several years based on the superior performance and power consumption of its Opteron processor. But Intel fired back last month with a new Xeon processor based on its Core microarchitecture that appears to be outperforming current Opteron processors on several tasks. Intel is pinning its hopes of resurrecting its market share--and its stock price--on the new Core generation of processors."
Based on the opinion of most IT analysts, 4P servers are actually the "sweet spot" of the market, so I expect Opteron to continue its lead there. AMD's HyperTransport is superior to the FSB once you start getting into multiprocessor servers, and I expect AMD to extend HT to support up to 8 chips (currently 4) in its next-generation chips sometime in 2007.
For more information on AMD, see: wikipedia on AMD
(BTW, EM64T = shameless clone/re-branding of x86-64, which is an open standard created by AMD. A rare case of Intel not succumbing to Not Invented Here syndrome. From here on, I'll lump them both under the name "AMD64".)
FWIW, I have very little hands-on experience (not being a frequent programmer of x86 assembly), but there are two big features of AMD64 that stand out: more registers (which helps compilers especially), and addressing relative to %rip (the 64-bit Instruction Pointer). The former lets you compute more things on-the-fly without reserving stack space for temporary variables, which can cut down on round trips to L2 or main memory -- thus making AMD64 a bit more like a RISC system, while leaving behind the ivory tower "orthogonal" (read: code-bloating) instruction sets that RISC forces on you. The latter lets your code reference constant things like strings (which are generally compiled into the .text section, right alongside the code that uses them) without [PIC] reserving a register for it, or [non-PIC] hardcoding the address. This simplifies the build process a LOT for programmers.
Quick tutorial on PIC:
Let's say I have a function, void hello() { printf("Hello, World!\n"); }. If I compile and link this code normally, I get something that looks like push $0x80484b8; call printf, where 0x80484b8 is a hard-coded address located in the .text section (or else a section for data constants that can be found relative to .text). If you're building an executable, that's fine, since the location of .text will be known at link-time.
However, if you want to bundle your code into a shared library, that won't do at all. Each program that loads your library will load it at a different address, so .text could be anywhere in memory. On a modern system, you can add a fixup so that the dynamic linker patches your code on the fly, but now your "shared" library has one copy in memory per instance, even if it's all instances of the same program. That's worse than a static library! The solution is called PIC, Position Independent Code, and is invoked with -fPIC when using GCC. On x86, it usually looks something like this: call .Lfixup; .Lfixup: pop %ebx. Since x86 provides relative jump/call instructions, you can call to .Lfixup without knowing the absolute address, which pushes %eip on the stack as the return address. After the pop, %ebx now contains the absolute address of the .Lfixup label at runtime, and you can safely access your constants relative to that. (All that fuss just because you can't use %eip directly.)
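To make the base-plus-offset idea concrete, here's a toy model in plain C. It is not real PIC and not what GCC emits; the "image" buffer, MSG_OFFSET, and all the function names are made up for illustration. The base pointer passed to pic_message() plays the role that %ebx plays after the call/pop trick.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy "library image": the message sits at a fixed offset. */
enum { MSG_OFFSET = 16, IMAGE_SIZE = 64 };

/* Lay out the image once, roughly as a linker would. */
void build_image(unsigned char *image) {
    memset(image, 0, IMAGE_SIZE);
    strcpy((char *)image + MSG_OFFSET, "Hello, World!\n");
}

/* "Load" the image at whatever address malloc hands back, standing in
   for the dynamic loader picking a different address per process. */
unsigned char *load_copy(const unsigned char *image) {
    unsigned char *copy = malloc(IMAGE_SIZE);
    if (copy != NULL) {
        memcpy(copy, image, IMAGE_SIZE);
    }
    return copy;
}

/* PIC-style access: only the runtime base plus a fixed offset is used,
   so the same code works no matter where the image was loaded. */
const char *pic_message(const unsigned char *base) {
    return (const char *)base + MSG_OFFSET;
}
```

Load the image twice and pic_message() finds the string in both copies, because nothing in it depends on an absolute address. A pointer computed against the original buffer would still point into the original, not into either copy, which is the non-PIC problem in miniature.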
On the downside, you've now eaten a register (on the already register-starved x86 architecture) and you've blown away most branch predictors, forcing a pipeline stall. Not a biggie if you just do it once in main() or similar, but since this might be a library function, you have to do it each time the function is called, in each function that needs it. Ew. It works, but it's not elegant, and it eats performance very badly if you call a PIC function from within an inner loop, so a lot of programmers just tell their tools to compile the entire program twice: once with PIC, and again without. (That's what all those *.lo files are from GNU libtool.)
AMD64 allows compilers (and assembly writers) to unify PIC and non-PIC code into a single, efficient path. Instead of jumping through hoops to copy %rip to %rbx and locate your constants relative to %rbx, you can just address your constants relative to %rip directly. There's no longer any penalty for using PIC, so compilers can just turn it on by default, saving the world from millions of tiny hassles that add up to one big Ick. It's probably the single most real-world useful thing they could have possibly added to the x86 instruction set.
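If you want to see this for yourself, here's a minimal sketch. The file name (say, greeting.c) and both function names are my own, not from any real library. Assuming GCC on x86-64, compiling with something like gcc -O2 -fPIC -S greeting.c should typically show the constant being addressed as greeting(%rip) in a single lea, with no call/pop dance; exact output varies by compiler version and flags.

```c
#include <stddef.h>

/* A constant that lives in the object file's read-only data. */
static const char greeting[] = "Hello, World!\n";

/* Returns the runtime address of the constant. On x86-64 the compiler
   can form this address relative to %rip instead of burning a register
   on a base pointer. */
const char *greeting_address(void) {
    return greeting;
}

/* Length of the message, not counting the terminating NUL. */
size_t greeting_length(void) {
    return sizeof greeting - 1;
}
```

Compare against the same file built for 32-bit x86 with -m32 -fPIC (if your toolchain has 32-bit support installed) to see the extra setup code the older approach needs.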