Slashdot Mirror


Architectural Difference Between The P4 And G4

homerJAYsimpson writes: "This article is a great reference of the differences in the architecture of the P4 and the G4. What is nice is that it is not a holy war of who is better but an explanation of why Intel made its choices and uses the G4 as a point of reference. It has just tons of info on uPs, useful for everyone." Not for the techie novice, but it's a well-written piece if you're reasonably technical and want to understand more about two of the most important chips on the market.

8 of 78 comments

  1. The difference... by Anonymous Coward · · Score: 5

    is about 70 degrees F.

  2. Predictor of predictors by nadador · · Score: 3

    I'm interested to see what the influence of Alpha IP will be on Intel core designs. When I took computer architecture at CMU we spent a couple of lectures on why the Alpha was the best thing since sliced bread, as far as microprocessors go.

    One of the big things that the Alpha did that was so cool was the branch predictor, which actually implemented two branch prediction algorithms and then had a predictor that watched them both and picked the one that had recently been the most correct. Some of that kind of deep knowledge of branch prediction and how to avoid having your long pipeline kill performance would be information that Intel could sorely use on the Pentium 4 core, as well as on the Itanic, I mean Inanium, I mean *Itanium*. There we go.
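
    The "predictor of predictors" idea described above can be sketched roughly as follows. This is a toy model, not the Alpha's actual design: the two component predictors (a per-branch 2-bit counter table and a gshare-style global-history table), the table sizes, and all class and function names are assumptions chosen for illustration. A third table of 2-bit counters learns, per branch, which component has been right more often lately.

```python
def bump(counter, up):
    """Move a 2-bit saturating counter (0..3) one step up or down."""
    return min(counter + 1, 3) if up else max(counter - 1, 0)


class Bimodal:
    """2-bit counters indexed by branch address (hypothetical stand-in)."""

    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)  # 0-1: not taken, 2-3: taken

    def predict(self, pc):
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        self.table[i] = bump(self.table[i], taken)


class GShare:
    """2-bit counters indexed by address XOR global branch history."""

    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)
        self.history = 0

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        self.table[i] = bump(self.table[i], taken)
        # Shift the newest outcome into the global history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


class Tournament:
    """Chooser that tracks which component was right more recently."""

    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        self.first = Bimodal(bits)
        self.second = GShare(bits)
        self.chooser = [1] * (1 << bits)  # 0-1: trust first, 2-3: trust second

    def predict(self, pc):
        use_second = self.chooser[pc & self.mask] >= 2
        return self.second.predict(pc) if use_second else self.first.predict(pc)

    def update(self, pc, taken):
        p1, p2 = self.first.predict(pc), self.second.predict(pc)
        if p1 != p2:
            # Only train the chooser when the components disagree.
            i = pc & self.mask
            self.chooser[i] = bump(self.chooser[i], p2 == taken)
        self.first.update(pc, taken)
        self.second.update(pc, taken)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly for a single branch."""
    hits = 0
    for taken in outcomes:
        hits += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return hits / len(outcomes)


# A branch that strictly alternates taken / not taken.
pattern = [i % 2 == 0 for i in range(1000)]
bimodal_acc = accuracy(Bimodal(), 0x40, pattern)
tournament_acc = accuracy(Tournament(), 0x40, pattern)
print(f"bimodal alone: {bimodal_acc:.3f}, tournament: {tournament_acc:.3f}")
```

    On this alternating pattern the plain 2-bit counter keeps flip-flopping while the history-based table learns the pattern, so the chooser drifts toward the history-based component. Real hardware differs in table sizes, indexing, and update timing.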

    Is anyone else surprised that the G4 core seems so vanilla? The difficulty of making a 4 stage pipeline run at upwards of 733 MHz on a .25 or .18 micron process is pretty amazing. I'm impressed. I suppose that the embedded focus at Motorola meant that bells and whistles weren't a high priority, but I wonder what kind of performance improvements G4e will demonstrate with a longer pipeline and all.


    --

    Outside of a dog, a book is a man's best friend. Inside a dog, it's too dark to read.
  3. Clock Speeds by Midnight+Thunder · · Score: 5

    One thing that I got from this article is why we shouldn't be depending too much on clock speeds for comparison, and thus the fact that PPCs aren't yet available at the clock speeds of x86 chips shouldn't really matter. The wide and shallow approach of the PPC certainly means that fewer clock cycles are needed than with the narrow and deep approach of the x86.
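
    To make the "fewer clock cycles" point concrete, here is a rough sketch of the usual back-of-the-envelope relation: run time = instructions / (instructions per cycle x clock rate). The IPC figures and clock rates below are hypothetical, picked only to show the shape of the trade-off; they are not measurements of any real G4 or P4.

```python
def run_time_s(instructions, ipc, clock_hz):
    """Idealized run time: total cycles needed divided by cycles per second."""
    cycles = instructions / ipc
    return cycles / clock_hz


WORK = 1_000_000_000  # hypothetical workload: one billion instructions

# Hypothetical "wide and shallow" chip: more work per cycle, lower clock.
wide_shallow = run_time_s(WORK, ipc=2.0, clock_hz=733e6)

# Hypothetical "narrow and deep" chip: less work per cycle, higher clock.
narrow_deep = run_time_s(WORK, ipc=1.0, clock_hz=1.4e9)

print(f"wide/shallow: {wide_shallow:.3f} s, narrow/deep: {narrow_deep:.3f} s")
```

    With these made-up numbers the lower-clocked chip finishes first, but change the assumed IPC or clock slightly and the result flips, which is why real-world tests are what count.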

    Now I know that the only tests that really matter are the real world tests, simply because at a user level that's the only real place that I'll notice the difference.

    Of course another issue is going to be motherboard differences and how much I/O depends on the processor, but this is another story.

    --
    Jumpstart the tartan drive.
    1. Re:Clock Speeds by CBravo · · Score: 3

      I'll drop the cluebomb again:
      -it is not about processors/instruction sets
      -it is not about MHz

      it is about e.g. compilers, parallelism, shortest path, bandwidth, technology and algorithms. You _then_ work on the rest.
      Processors are only a means to what you want to accomplish. I've seen DSPs overcome a 4x MHz gap just because they had a good architecture. Deep down, information processing (clocked or not) takes time to go through the logic.

      --
      nosig today
  4. Comparing cycle penalty times is meaningless ... by VAXman · · Score: 3

    "The P4's long pipeline means that bubbles take a long time to propagate off the CPU, so a single bubble results in a lot of wasted cycles. When the G4e's shorter pipeline has a bubble, it propagates through to the other end quickly so that fewer cycles are wasted. So a single bubble in the P4's 20 stage pipeline wastes at least 20 clock cycles (more if it's a bubble in one of the longer FPU pipelines), whereas a single bubble in the G4e's 7 stage pipeline wastes at least 7 clock cycles. 20 clock cycles is a lot of wasted work, and even if the P4's clock is running twice as fast as the G4e's it still takes a larger performance hit for each pipeline bubble than the shorter machine."

    What the author apparently fails to grasp is that the only thing which matters is wall clock time. P4 may have a 20 cycle mispredict penalty, higher than G4e's penalty of 7, but it also runs at about triple the clock speed. 20 cycles @ 1.8 GHz is less than 7 cycles @ 600 MHz.
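
    The arithmetic is easy to check. A quick sketch, using only the cycle counts and clock speeds quoted above (the function and variable names are just for illustration):

```python
def penalty_ns(cycles, clock_hz):
    """Wall-clock cost of a pipeline flush, in nanoseconds."""
    return cycles / clock_hz * 1e9


G4E_CYCLES, P4_CYCLES = 7, 20  # penalties quoted in this thread
G4E_CLOCK = 600e6              # 600 MHz, as in the comparison above

g4e = penalty_ns(G4E_CYCLES, G4E_CLOCK)
p4_at_3x = penalty_ns(P4_CYCLES, 3 * G4E_CLOCK)  # 1.8 GHz
p4_at_2x = penalty_ns(P4_CYCLES, 2 * G4E_CLOCK)  # the "twice as fast" case

# Clock ratio at which the two penalties cost the same wall-clock time.
break_even_ratio = P4_CYCLES / G4E_CYCLES

print(f"G4e @ 600 MHz: {g4e:.2f} ns")
print(f"P4  @ 1.8 GHz: {p4_at_3x:.2f} ns")
print(f"P4  @ 1.2 GHz: {p4_at_2x:.2f} ns")
print(f"break-even clock ratio: {break_even_ratio:.2f}x")
```

    So both statements hold under their own assumptions: at only twice the clock the 20-cycle penalty costs more time, and at roughly triple the clock it costs slightly less.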

    This is basically another very pedestrian hate-on-P4 article with very little substance. P4 does have some performance problems (mostly to do with shifts and multiplies) and they're documented in the optimization manual, but this article does nothing to dig any deeper than what a dozen other pedestrian articles have said.

    Also ...

    "Intel was definitely paying attention, and as the Willamette team labored away in Santa Clara they kept MHz foremost in their minds."

    Willamette was designed entirely in Oregon. Santa Clara had nothing to do with it, and has had nothing to do with IA-32 design since P5 (nearly 10 years ago).

  5. Different Chips... by zephc · · Score: 4

    for different uses!

    The G4 is meant to be usable in embedded systems, while the P4 is meant to be usable as a space heater

    =P
    ----

    --
    "I would say that 99 per cent of what my father has written about his own life is false." - L. Ron Hubbard Jr.
  6. Learn to read (and I'll learn not to troll) by WIAKywbfatw · · Score: 3

    "2 of the most important chips on the market"

    Jeez, why do people have such a bad grasp of the English language? Is it really that hard to understand?

    Yes, "two of". As in "not exclusively of". Yes, the Intel Pentium 4 is one of the most important chips out there. And yes, so is the AMD Athlon. But so is the Motorola G4, and so for that matter is the upcoming Intel Itanium.

    Now if the description of the article said "the two most important", I could understand your gripe. But it doesn't. And besides, haven't we already seen dozens of similar comparisons between Intel and AMD processor families?

    --

    "Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
  7. x86 instructions are bytecodes of the future by Waffle+Iron · · Score: 5
    After reading this article I think that history is repeating itself. I've been scoffing at the P4, but now I think that Intel may be laughing in the end.

    If you remember when the Pentium Pro came out, people (including me) dissed it because it was years behind schedule, huge, expensive and hot. Actually, its architecture was just ahead of the process technology curve. With a few tweaks, the same CPU core came to dominate the world with the P-II and the P-III.

    Looking at the radical changes in the P4, including storing only uOPs in the instruction cache and reserving (currently useless) pipeline stages for speed-of-light cross chip delays, they are planning ahead for future realities. We can think of the current P4 as being like the Pentium Pro, just a short-lived beta release.

    The more interesting question is which approach to driving uOPs will win out: P4, Transmeta or Itanium. P4 and Transmeta convert legacy x86 opcodes to an internal wide architecture on-the-fly (P4 in hardware, Transmeta in software); Itanium makes the compiler generate wide-architecture code directly. Note that the original pre-translated instruction format (CISC, RISC, Java bytecodes, whatever) is now largely irrelevant.

    My view is that in the abstract, Transmeta has the best approach, followed by P4 and Itanium last. This is because the software approach is the most flexible and can even be upgraded in the field. In theory, it could detect and store the individual performance characteristics of each program on a user's machine. Granted, they currently focus on low-power, but if they retargeted their technology at high speed, it could be interesting.

    The P4 approach is hardwired, but at least it can adapt to local code characteristics and translate them to the current internal architecture version.

    The Itanium exposes low-level chip details to the compiler, and the decisions are cast in concrete from there on out. It doesn't seem very future-proof to me; if the IA64 architecture changes in the future, today's compiled code will suffer.