Xeons, Opterons Compared in Power Efficiency
Bender writes "The Tech Report has put Intel's 'Woodcrest' and quad-core 'Clovertown' Xeons up against AMD's Socket F Opterons in a range of applications, including widely multithreaded tests from academic fields like computational fluid dynamics and proteomics. They've also attempted to quantify power efficiency in terms of energy use over over time and energy use per task, with some surprising results." From the article: "On the power efficiency front, we found both Xeons and Opterons to be very good in specific ways. The Opteron 2218 is excellent overall in power efficiency, and I can see why AMD issued its challenge. Yes, we were testing the top speed grade of the Xeon 5100 and 5300 series against the Opteron 2218, but the Opteron ended up drawing much less power at idle than the Xeons ... We've learned that multithreaded execution is another recipe for power-efficient performance, and on that front, the Xeons excel. The eight-core Xeon 5355 system managed to render our multithreaded POV-Ray test scene using the least total energy, even though its peak power consumption was rather high, because it finished the job in about half the time that the four-way systems did. Similarly, the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly. "
the Opteron ended up drawing much less power at idle than the Xeons
...
the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly.
So what does this mean for people shopping for servers?
If your servers constantly tick along at nearly 100% CPU use, you might do better going with the Xeon system. If your machines basically sit idle most of the time with an occasional spike for a few seconds when it actually does something, the AMD would save you more on electricity.
Of course, this raises a third possibility - Would running a number of virtual servers on one large Xeon machine waste more energy than it saves, or give a net gain?
You're forgetting the basic formula from Hennessy and Patterson:
Yes, CISC has better work per instruction, except for one glaring issue I'll get to in a moment, but - for various reasons explained throughout H&P - it loses on the other two and thus overall. That's why nobody's making new processors that are CISC internally any more; they just couldn't hit the issue widths and clock speeds are achievable with a RISC core (even if that core has a CISC ISA bolted on the front). What's missing here is that not all work is useful work. As anyone who has accidentally coded an infinite loop knows, executing lots of instructions is not necessarily a good thing. The glaring issue I mentioned earlier is that a lot of the instructions executed on a register-poor architecture like x86 are not doing useful work. Register thrashing means i-cache bandwidth is wasted fetching instructions which are then used to waste d-cache bandwidth, which more than outweighs any advantage from variable-length instructions.
So, you say, wouldn't variable-length instructions on a register-rich processor be the best of both worlds? Not so fast. A regular instruction set makes superscalar execution easier because it means that multiple instructions can be fetched literally at the same time without having to examine the first one to figure out where the second one begins and so on. It also makes deeper pipelines easier because it allows many internal activities (e.g. register allocation, hazard detection) to start after a simple pre-decode stage, in parallel with the remainder of decode. Either way, regular instruction sets allow for more parallelism - and parallelism in some form is the generally the key to CPU performance. If you're willing to give up performance by eschewing most modern processor-design techniques, which might be the case for a deeply embedded system with extreme size and/or power requirements, then variable-width instructions might still be a reasonable choice. In that case you might as well use an older architecture; there are plenty to choose from. For new processor designs, though, variable-width instructions are almost invariably a way to lose.
Slashdot - News for Herds. Stuff that Splatters.