Into the Core - Intel's New Core CPU
Tyler Too writes "Hannibal over at Ars Technica has an in-depth look at Intel's new Core processors. From the article: 'In a time when an increasing number of processors are moving away from out-of-order execution (OOOE, or sometimes just OOO) toward in-order, more VLIW-like designs that rely heavily on multithreading and compiler/coder smarts for their performance, Core is as full-throated an affirmation of the ongoing importance of OOOE as you can get.'"
Ok, so I know I'm going to get a lot of AMD people agreeing with me and a lot of Intel people outright ripping me to shreds. But I'm going to speak my thoughts come hell or high water and you can choose to be a yes-man (or woman) with nothing to add to the conversation or just beat me with a stick.
I believe that AMD had this technology before Intel ever started in on it. Yes, I know it wasn't really commercially available on PCs but it was there. And I would also like to point out a nifty little agreement between IBM and AMD that certainly gives them aid in the development of chips. Let's face it, IBM's got research money coming out of their ears and I'm glad to see AMD benefit from it and vice versa. I think that these two points alone show that AMD has had more time to refine the multicore technology and deliver a superior product.
As a disclaimer, I cannot say I've had the ability to try an Intel dual core but I'm just ever so happy with my AMD processor that I don't see why I should.
There's a nice little chart in the article but I like AMD's explanation along with their PDF a bit better. As you can see, AMD is no longer too concerned with dual core but has moved on to targeting multicore.
Do I want to see Intel evaporate? No way. I want to see these two companies go head to head and drive prices down. You may mistake me for an AMD fanboi but I simply was in agony in high school when Pentium 100s cost an arm and a leg. Then AMD slowly climbed the ranks to be a major competitor with Intel--and thank god for that! Now Intel actually has to price their chips competitively and I never want that to change. I will now support the underdog even if Intel drops below AMD just to ensure stiff competition. You can call me a young idealist about capitalism!
I understand this article also tackles execution types and I must admit I'm not too up to speed on that. It's entirely possible that OOOE could beat out the execution scheme that AMD has going but I wouldn't know enough to comment on it. I remember that there used to be a lot of buzz about IA-64's OOOE processing used on Itanium. But I'm not sure that was too popular among programmers.
The article presents a compelling argument for OOOE. And I think that with a tri-core or higher processor, we could really start to see a big increase in sales using OOOE. Think about it: a lot of IA-64 code comes to a point where the instruction stalls as it waits for data to be computed (in most cases, a branch). If there are enough cores to compute both branches from the conditional (and a third core to evaluate the conditional) then where is the slowdown? This will only break down on a switch-style statement or when several if-thens follow each other in succession.
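To put rough numbers on that breakdown case, here is a back-of-the-envelope sketch. It assumes (my assumption, not anything from the article) that every outcome of every still-unresolved branch gets its own core, so the function name and the model are purely illustrative.

```python
def paths_in_flight(unresolved_branches: int, ways: int = 2) -> int:
    """Paths that must run at once if every outcome of every
    still-unresolved branch is followed eagerly (hypothetical model)."""
    if unresolved_branches < 0 or ways < 2:
        raise ValueError("need unresolved_branches >= 0 and ways >= 2")
    # Each unresolved branch multiplies the number of live paths by `ways`.
    return ways ** unresolved_branches


if __name__ == "__main__":
    for depth in range(1, 5):
        print(depth, "back-to-back if-thens ->", paths_in_flight(depth), "paths")
    print("one 8-way switch ->", paths_in_flight(1, ways=8), "paths")
```

Under that model one if-then needs two paths, but three in a row already need eight, which is why successive conditionals and wide switches are where the idea runs out of cores.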
In any case, it's going to be a while before I switch back to Intel. AMD has won me over for the time being.
My work here is dung.
Dammit if you're gonna quote Family Guy, at least do it properly!
"Brian, there's a message in my Alphabits. It says 'OOO'"
"Peter those are cheerios."
See, it's just not as funny if you forget the Alphabits part.
That would be due to several "lessons learned" as Intel developed Itanium.
1. The instruction overhead due to extra hint bits, etc., means Itanium instructions are much larger than x86 32/64 instructions. With the addition of poor branch performance (read: more wasted instruction bandwidth), the need for large, high-bandwidth caches makes Itanium expensive.
2. The compilers have not caught up. EPIC lacks OOOE, and has poor dynamic branch prediction hardware, so it is at the mercy of the compiler.
Core retains Intel's original insights made with the P6:
1. x86 is hard to decode (takes more silicon), but it takes less bandwidth than other instruction formats. Bandwidth is even more expensive than the cost of more complex decoders, just look how expensive it was for Intel to add full-speed cache to the original Pentium Pro, and how pricey the Itanium is with huge, fast on-chip cache.
2. OOOE + Branch Prediction + internal RISC is king. One reason the original Pentium never performed well is that it could RARELY execute more than one instruction per cycle. Thus, it performed like a fast 486 unless the code was recompiled as Pentium optimized. The P6 was designed to avoid the reliance on compilers to improve performance, as it could optimize code in any condition. Funny, we didn't start seeing Pentium-optimized code on the market until the P6 started taking over.
Core is just a logical extension of this concept. The predictor is more accurate, there are more instruction decoders, more ALUs and SSE units, and more retirement units. The only reason Core seems so groundbreaking is that we didn't see it in small, evolutionary steps.
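For anyone who hasn't seen dynamic branch prediction spelled out, here is a minimal sketch of the textbook 2-bit saturating counter. This is NOT Core's or the P6's actual predictor (those are far more elaborate); the class and function names are made up for illustration.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter: states 0-1 predict not-taken,
    states 2-3 predict taken."""

    def __init__(self, state: int = 2):
        self.state = state  # start at "weakly taken"

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor=None) -> int:
    """Feed a sequence of actual branch outcomes through the predictor
    and count how often the prediction was wrong."""
    if predictor is None:
        predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


if __name__ == "__main__":
    # A loop branch: taken 9 times, then falls through, repeated 3 times.
    loop_branch = ([True] * 9 + [False]) * 3
    print(count_mispredictions(loop_branch), "misses out of", len(loop_branch))
```

The point is that the hardware learns the branch's habit at run time with no help from the compiler, so a loop branch like this one only costs a miss at each loop exit.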
Man is the animal that laughs.
And occasionally whores for Karma.
"Hoisting of loads from an unknown address is now performed more speculatively than it used to be, at the cost of some complexity in the retirement unit."
I think you mean, "hoisting of loads above a /store to/ an unknown address." If you're going to pretend to school little old clueless me about the complexities of memory reordering and retirement then at least learn the difference between a load and a store.
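To make the distinction concrete, here is a toy model of a load being moved ahead of a store whose address isn't known yet. It is only a sketch of the general idea, not how Core's hardware actually does it, and the function names are hypothetical.

```python
def run_in_order(memory, store_addr, store_value, load_addr):
    """Reference behaviour: the store executes first, then the load."""
    mem = dict(memory)
    mem[store_addr] = store_value
    return mem[load_addr]


def run_with_hoisted_load(memory, store_addr, store_value, load_addr):
    """The load executes before the store's address is known.

    Returns (loaded_value, replayed). If the store turns out to hit the
    same address, the speculative value is stale and the load is redone.
    """
    mem = dict(memory)
    speculative = mem[load_addr]      # load hoisted above the store
    mem[store_addr] = store_value     # store address resolves; store executes
    if store_addr == load_addr:       # they aliased: the guess was wrong
        return mem[load_addr], True   # throw the value away and replay the load
    return speculative, False         # no alias: the early value was correct


if __name__ == "__main__":
    memory = {0x10: 1, 0x20: 2}
    print(run_with_hoisted_load(memory, 0x10, 99, 0x20))  # different addresses
    print(run_with_hoisted_load(memory, 0x20, 99, 0x20))  # same address
```

Either way the result matches the in-order version; the only difference is whether the early load was free or had to be replayed, and that replay bookkeeping is the added complexity.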
Senior CPU Editor | Ars Technica | http://arstechnica.com/