Slashdot Mirror


Into the Core - Intel's New Core CPU

Tyler Too writes "Hannibal over at Ars Technica has an in-depth look at Intel's new Core processors. From the article: 'In a time when an increasing number of processors are moving away from out-of-order execution (OOOE, or sometimes just OOO) toward in-order, more VLIW-like designs that rely heavily on multithreading and compiler/coder smarts for their performance, Core is as full-throated an affirmation of the ongoing importance of OOOE as you can get.'"

9 of 178 comments

  1. AMD Vs Intel: Round 8 by eldavojohn · · Score: 5, Informative

    Ok, so I know I'm going to get a lot of AMD people agreeing with me and a lot of Intel people outright ripping me to shreds. But I'm going to speak my thoughts come hell or high water and you can choose to be a yes-man (or woman) with nothing to add to the conversation or just beat me with a stick.

    I believe that AMD had this technology before Intel ever started in on it. Yes, I know it wasn't really commercially available on PCs but it was there. And I would also like to point out a nifty little agreement between IBM and AMD that certainly gives them aid in the development of chips. Let's face it, IBM's got research money coming out of their ears and I'm glad to see AMD benefit off it and vice versa. I think that these two points alone show that AMD has had more time to refine the multicore technology and deliver a superior product.

    As a disclaimer, I cannot say I've had the ability to try an Intel dual core but I'm just ever so happy with my AMD processor that I don't see why I should.

    There's a nice little chart in the article but I like AMD's explanation along with their pdf a bit better. As you can see, AMD is no longer too concerned with dual core but has moved on to targeting multi core.

    Do I want to see Intel evaporate? No way. I want to see these two companies go head to head and drive prices down. You may mistake me for an AMD fanboi but I simply was in agony in high school when Pentium 100s cost an arm and a leg. Then AMD slowly climbed the ranks to be a major competitor with Intel--and thank god for that! Now Intel actually has to price their chips competitively and I never want that to change. I will now support the underdog even if Intel drops below AMD just to ensure stiff competition. You can call me a young idealist about capitalism!

    I understand this article also tackles execution types and I must admit I'm not too up to speed on that. It's entirely possible that OOOE could beat out the execution scheme that AMD has going but I wouldn't know enough to comment on it. I remember that there used to be a lot of buzz about IA-64's OOOE processing used on Itanium. But I'm not sure that was too popular among programmers.

    The article presents a compelling argument for OOOE. And I think that with a tri-core or higher processor, we could really start to see a big increase in sales using OOOE. Think about it, a lot of IA-64 code comes to a point where the instruction stalls as it waits for data to be computed (most cases, a branch). If there are enough cores to compute both branches from the conditional (and third core to evaluate the conditional) then where is the slowdown? This will only break down on a switch style statement or when several if-thens follow each other successively.

    In any case, it's going to be a while before I switch back to Intel. AMD has won me over for the time being.

    --
    My work here is dung.
    1. Re:AMD Vs Intel: Round 8 by archen · · Score: 2, Informative

      If there are enough cores to compute both branches from the conditional

      I don't see how that could really be useful. I mean if you were computing instructions on a one by one basis, then perhaps that would work, but you fill the pipe then find out the prediction is wrong so you go to the other CPU. However, when you look at the bigger picture you realize that you are essentially crippling one CPU by dedicating it to doing something other than actually processing.

      Intel's CPU branch prediction is already known to be better than AMD's. I think the bigger news is that the pipeline will be cut to 14 stages, about half the length of the P4's. If they can keep the processor fed (and perhaps move the memory controller on die like AMD has), then Intel may finally be able to end the spanking session they've been receiving from AMD.

    2. Re:AMD Vs Intel: Round 8 by amjacobs · · Score: 2, Informative
      It's entirely possible that OOOE could beat out the execution scheme that AMD has going but I wouldn't know enough to comment on it. I remember that there used to be a lot of buzz about IA-64's OOOE processing used on Itanium. But I'm not sure that was too popular among programmers.
      There is nothing new with Out of Order Execution. It's been implemented in all the Pentium cores as well as AMD chips from the K6 (I think) on up. In fact, the reason why going to multi-core designs is necessary is because it is difficult to extract any more instruction-level parallelism (ILP) from code using additional hardware techniques. (For instance, some new hardware may increase performance by 3%, but add 10% area to the design) Itanium, which is an in-order processor, shifts the ILP extraction to the compiler.
      The article presents a compelling argument for OOOE. And I think that with a tri-core or higher processor, we could really start to see a big increase in sales using OOOE. Think about it, a lot of IA-64 code comes to a point where the instruction stalls as it waits for data to be computed (most cases, a branch). If there are enough cores to compute both branches from the conditional (and third core to evaluate the conditional) then where is the slowdown? This will only break down on a switch style statement or when several if-thens follow each other successively.
      Processors can already do what you're suggesting. All modern cores from AMD and Intel are super-scalar. This means that there are multiple pipelines running in parallel. If you have two pipelines, you can compute both of the possible results from a branch and discard the incorrect value. BUT, you cut your maximum efficiency by half. (You are using 2x the resources to get 1x the results) You wouldn't want to do the same thing with separate cores for a variety of reasons. Instead, using separate cores for separate threads provides an immediate performance improvement without the need for major code revisions.
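
      To put rough numbers on the "2x the resources to get 1x the results" point, here is a toy model, not a description of any real core. It assumes that a mispredicted branch wastes exactly one path's worth of work before the correct path runs, and the function names are made up for illustration.

```python
def eager_efficiency(paths: int = 2) -> float:
    """Fraction of executed work that is kept when every path of a branch is run.

    Toy assumption: all paths cost the same and only one result is used.
    """
    if paths < 1:
        raise ValueError("paths must be at least 1")
    return 1.0 / paths


def predicted_efficiency(accuracy: float) -> float:
    """Fraction of executed work that is kept when a predictor picks one path.

    Toy assumption: a misprediction wastes one path's worth of work,
    after which the correct path is executed.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    expected_work = 1.0 + (1.0 - accuracy)  # correct path plus expected waste
    return 1.0 / expected_work


if __name__ == "__main__":
    print("eager, both paths:", eager_efficiency(2))
    for acc in (0.5, 0.9, 0.95):
        print("predicted, accuracy", acc, "->", round(predicted_efficiency(acc), 3))
```

      Under these assumptions, running both paths only matches prediction when the predictor is always wrong, which is why the extra pipelines are better spent on independent work.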

      I don't think you'll find anyone who is against using OOE. Ideally, you would want a processor that combines good hardware techniques (OOE, branch prediction, prefetching) with a good compiler/ISA (maybe some sort of VLIW, but I still have my doubts about the compiler feasibility). The key for AMD and Intel is to find the right balance of hardware and software techniques that provide the best overall performance.

    3. Re:AMD Vs Intel: Round 8 by dpilot · · Score: 2, Informative

      >Guess what, Intel doesn't make motherboards either. They contract with Asus or another company to sell their
      >motherboards with the Intel brand on it.

      Two points:
      1. Intel designs chipsets for their CPUs. AMD designed one, a while back, and otherwise relies on third parties.
      2. Intel may well have designed, engineered, and spec'ed the board, regardless of who makes it.

      So this is really a statement that Intel has better control of delivering their CPU capabilities to the end user than AMD, independent of the raw capabilities of those CPUs.

      --
      The living have better things to do than to continue hating the dead.
    4. Re:AMD Vs Intel: Round 8 by ciroknight · · Score: 4, Informative

      I believe that AMD had this technology [wikipedia.org] before Intel ever started in on it.

      No offense, but you lost me right about here. The Athlon 64 and Opteron (and the Clawhammer/Sledgehammer chips as a whole) are fundamentally a whole different direction than the Core Duo. While they're aiming towards the same goals (really damned fast x86 code execution), they get there in two entirely different ways.

      The idea behind the Athlon 64 and Opteron chips was to attack Intel where it would hurt them most, the midrange server section of their business. AMD realized that Intel sells more of these machines, and the maintenance contracts on these machines mean that they're going to keep coming back to you for more of them, even 5 years down the line when your chips are virtually "obsolete". This is broadcast very loudly in their choice to integrate a memory controller onboard their CPUs; in order to upgrade chips with an integrated memory controller, you have to replace the whole board, and managers aren't going to want to do that very often. Your chips are cheaper overall (because they don't have to have external logic to drive the memory controller anymore, and they were cheaper to begin with), but it locks you into AMD as a company, and locks you into that chip (a slam dunk victory for AMD).

      The Intel Core philosophy was something completely different; it was reactionary in the sense that the Pentium 4 and Netburst were sputtering to the end of their performance gains, way earlier than Intel could have predicted. But at the same time, Intel has always been known to make great mobile chips, and the Intel Core Architecture was built on a mobile chip platform. It was the logical choice, even in March 2003 when the Pentium M/Core Architecture first made itself available to the world as Banias. The Athlon 64 didn't even make itself available on the market until April (Opteron) or September (Athlon 64) of that year.

      Better late than never? Yeah, of course. But the point is, the Opteron was meant to be a server chip and take back the market from Intel and is completely succeeding. The Core chips were entirely meant to be Mobile chips, and due to technology trickledown, we're starting to see that Mobile chips are just as much at home in desktop computers.

      And while I know you weren't trying to make yourself out to be a complete and total AMD fanboy in your post, you entirely came off that way, especially without knowledge of the product itself. I don't care particularly for either company, just the fastest chips I can possibly get my hands on, and right now that's the Athlon FX, but in a few months that's going to be Conroe.

      --
      "Victory means exit strategy, and it's important for the President to explain to us what the exit strategy is." G.W.Bush
    5. Re:AMD Vs Intel: Round 8 by Tracy Reed · · Score: 2, Informative

      Those forms of consumer revolt deny the company money. Just like denying a car gasoline this will cause the company to eventually stop functioning. I think the OP made a very good point. There is no point in treating a company like a person. Companies are also completely amoral.

  2. Re:and this is why... by greypilgrim · · Score: 4, Informative

    Dammit if you're gonna quote Family Guy, at least do it properly!

    "Brian, there's a message in my Alphabits. It says 'OOO'"

    "Peter those are cheerios."

    See, it's just not as funny if you forget the Alphabits part.

  3. Re:Core has OOOE? by default luser · · Score: 3, Informative

    That would be due to several "lessons learned" as Intel developed Itanium.

    1. The instruction overhead due to extra hint bits, etc, means Itanium instructions are much larger than x86 32/64 instructions. With the addition of poor branch performance (read: more wasted instruction bandwidth), the need for large, high-bandwidth caches makes Itanium expensive.

    2. The compilers have not caught up. EPIC lacks OOOE, and has poor dynamic branch prediction hardware, so it is at the mercy of the compiler.

    Core retains Intel's original insights made with the P6:

    1. x86 is hard to decode (takes more silicon), but it takes less bandwidth than other instruction formats. Bandwidth is even more expensive than the cost of more complex decoders, just look how expensive it was for Intel to add full-speed cache to the original Pentium Pro, and how pricey the Itanium is with huge, fast on-chip cache.

    2. OOOE + Branch Prediction + internal RISC is king. One reason the original Pentium never performed well is because it could RARELY execute more than one instruction per cycle. Thus, it performed like a fast 486 unless the code was recompiled as Pentium optimized. The P6 was designed to avoid the reliance on compilers to improve performance, as it could optimize code in any condition. Funny, we didn't start seeing Pentium-optimized code on the market until the P6 started taking over.

    Core is just a logical extension of this concept. The predictor is more accurate, there are more instruction decoders, more ALUs and SSE units, and more retirement units. The only reason Core seems so groundbreaking is because we didn't see it in small, evolutionary steps.

    --

    Man is the animal that laughs.
    And occasionally whores for Karma.

  4. Re:Article summary by Hannibal_Ars · · Score: 5, Informative

    "Hoisting of loads from an unknown address is now performed more speculatively than it used to be, at the cost of some complexity in the retirement unit."

    I think you mean, "hoisting of loads above a /store to/ an unknown address." If you're going to pretend to school little old clueless me about the complexities of memory reordering and retirement then at least learn the difference between a load and a store.
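
    For anyone following along, the load/store distinction can be sketched in a few lines. This is a toy model only, not how Core's hardware is built: memory is a plain dict, the function names are hypothetical, and "replay" just means re-reading the value once the store's address is known.

```python
def run_in_order(memory, store_addr, store_val, load_addr):
    """Reference behavior: the store completes before the later load runs."""
    mem = dict(memory)
    mem[store_addr] = store_val
    return mem[load_addr]


def run_with_hoisted_load(memory, store_addr, store_val, load_addr):
    """Hoist the load above a store whose address is not yet known.

    Returns (loaded_value, replayed). The load runs speculatively first;
    once the store address resolves, an alias forces the load to be redone.
    """
    mem = dict(memory)
    speculative = mem[load_addr]   # load issued early, store address unknown
    mem[store_addr] = store_val    # store address resolves, store completes
    if store_addr == load_addr:    # alias: the speculative value is stale
        return mem[load_addr], True
    return speculative, False      # no alias: speculation was safe


if __name__ == "__main__":
    memory = {0x10: 1, 0x20: 2}
    print(run_with_hoisted_load(memory, 0x10, 99, 0x20))  # no alias
    print(run_with_hoisted_load(memory, 0x20, 99, 0x20))  # alias, replayed
```

    The hazard is the store to an unknown address: the load's own address is known, so the only open question is whether the earlier store turns out to hit the same location.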

    --
    Senior CPU Editor | Ars Technica | http://arstechnica.com/