Slashdot Mirror


More on the PowerPC 970

functor writes "Ars Technica's Jon Stokes has a treatise up covering the microarchitecture of the high-performance 64-bit PowerPC 970 microprocessor, due to be released by the end of the year. It goes over in detail how this chip is put together and how we can expect it to perform. This is the follow-up to Stokes' article detailing the PPC 970's design philosophy. 'It appears to hold quite a bit of promise in bolstering Apple's currently almost obsolescent product line, and it appears to have been designed explicitly to fulfil Apple's requirements. To say the least, the second half of this year looks to be pretty interesting as Apple's product line promises to become competitive performance-wise with IA-32 and x86-64-based PCs again.'"

7 of 344 comments

  1. Re:No matter how many times I read it... by eweiland · · Score: 4, Informative

    This is still a PPC chip. No changes to programs are necessary for them to run on it. The only change needed is if a software vendor decides to run in 64-bit mode, which many don't have to do. Performance of the new chip does not depend on whether the program runs at 32 or 64 bits. This is not a migration like the move from the 680x0 line of processors to the PPC, which was an overall change in architecture.

  2. Re:Inaccuracy, Part 1 by 11223 · · Score: 4, Informative
    You're completely wrong. The maximum speed of the FSB and whether it supports DDR (or QDR) is determined by the processor, not by the chipset. For the G4e, the maximum known speed at which MaxBus can operate is 167MHz - precisely what Apple uses.

    They can't make the FSB DDR or QDR without appropriate support from the processor, and that's exactly what they haven't been getting from Moto.

  3. Re:competitive, sure... by 11223 · · Score: 4, Informative
    The PPC 970 is great news for Apple, but it is still a bone thrown to them while the x86 PC is feasting on the meat of the Intel and AMD processors.

    As Nethack would say, "Ugh! This meat is tainted!"

    The 970 is fundamentally a 64-bit processor, and its performance must be evaluated in that context. The fact that the 970 will pull off amazing speed in the 32-bit arena only shows how well-designed this processor is.

    Keep in mind, the Hammer is only shipping at 1.4, 1.6, and 1.8 GHz - the same speeds the 970 is targeted at. And the 970 has the advantage of an ISA that was designed from the beginning to do 32 and 64 bit addressing, versus one that's a 64-bit extension of a 32-bit extension of a 16-bit micro with full compatibility to an 8-bit redesign of a 4-bit processor.

  4. AMD is the odd man out by pchown · · Score: 4, Informative

    Interestingly, if you look at the pipeline design, the PowerPC is much closer to Intel than to AMD. The PowerPC pipeline has sixteen stages, the Pentium 4 twenty, and the Athlon ten.

    Presumably the P4 can reach higher clock speeds than the Athlon because there is less work to do at each pipeline stage. On the other hand a longer pipeline increases the probability of a stall, so the work done per clock cycle goes down.

    I'd speculate that the PowerPC ought, therefore, to be able to achieve clock rates approaching but not equalling the P4, since they are both comparatively "over-pipelined". At the same time, the PowerPC ought to deliver slightly more throughput per clock cycle because the pipeline is slightly shorter.

    Meanwhile, the Athlon will be running at a significantly lower clock rate, but delivering comparable throughput.
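The trade-off described above can be put in rough numbers. What follows is a minimal sketch, assuming a toy model in which the only stall is a branch misprediction that costs a full pipeline refill. The pipeline depths are the ones quoted in the comment; the branch fraction, the misprediction rate, and the function names are illustrative assumptions, not measurements of any of these chips.

```python
# Toy model of work per clock versus pipeline depth.
# All constants are illustrative assumptions, not measured values.

BRANCH_FRACTION = 0.20   # assumed share of instructions that are branches
MISPREDICT_RATE = 0.05   # assumed share of branches that are mispredicted

# Pipeline depths as quoted in the comment above.
DEPTHS = {"Athlon": 10, "PowerPC 970": 16, "Pentium 4": 20}


def cycles_per_instruction(depth):
    """Ideal CPI of 1 plus the average misprediction penalty.

    A mispredicted branch is assumed to cost `depth` cycles
    (a full pipeline refill).
    """
    return 1.0 + BRANCH_FRACTION * MISPREDICT_RATE * depth


def clock_for_throughput(depth, target_mips):
    """Clock rate in MHz needed to sustain `target_mips` under this model."""
    return target_mips * cycles_per_instruction(depth)


if __name__ == "__main__":
    for name, depth in DEPTHS.items():
        cpi = cycles_per_instruction(depth)
        mhz = clock_for_throughput(depth, 1000.0)
        print(f"{name:12s} depth={depth:2d} CPI={cpi:.2f} "
              f"clock for 1000 MIPS={mhz:.0f} MHz")
```

Under these assumptions the shallower pipeline needs the lowest clock for the same throughput, which matches the ordering the comment describes. The size of the gap depends entirely on the assumed constants.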

    1. Re:AMD is the odd man out by Lebannen · · Score: 4, Informative

      As well as the depth of the pipeline, I believe the article also says you need to look at the width of the pipeline; it points out the G4+ is wide and shallow, the Pentium 4 is narrow and deep, and the 970 is wide and deep. You will therefore get bubbles in the 970's pipeline, but their effect is minimised and you're far less likely to get stalls.

      Combine this with the more intelligent branch prediction, out-of-order execution etc in the 970, and you're probably looking at a chip which is slightly less efficient clock-for-clock than the G4+, but more efficient than the Pentium 4.

      Integer-performance-wise, it looks like the 970 will be about equal to a Pentium 4 of 25-50% higher clockspeed; FPU-wise, and of course Altivec-wise, it looks like a monster. So it probably won't outperform the current Pentium 4s at a lot of tasks, but will kick it about on other more specialised tasks, which is a big step over the G4+. We're not looking at a Pentium-crusher, but we are looking at something that will be vaguely competitive.
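The integer estimate above translates into clock speeds as follows. This is a minimal sketch that takes the 25-50% range at face value and applies it to the 1.4, 1.6, and 1.8 GHz speeds mentioned earlier in the thread as the 970's targets. The helper name is hypothetical.

```python
# Hypothetical helper: applies the commenter's 25-50% estimate literally.
# The 970 clock speeds are the target speeds mentioned earlier in the thread.

def equivalent_p4_clock_ghz(ppc970_ghz):
    """Return the (low, high) Pentium 4 clock with similar integer performance."""
    return ppc970_ghz * 1.25, ppc970_ghz * 1.50


if __name__ == "__main__":
    for ghz in (1.4, 1.6, 1.8):
        low, high = equivalent_p4_clock_ghz(ghz)
        print(f"970 @ {ghz:.1f} GHz ~ Pentium 4 @ {low:.2f}-{high:.2f} GHz (integer)")
```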

      Just gotta see how well it scales, after that, and whether 64-bit will mean anything for average tasks... and when it actually happens, of course.

      --
      Diplomacy is the art of saying "nice doggie" whilst looking for a rock
  5. Re:Inaccuracy, Part 1 by functor · · Score: 5, Informative

    No, but IA-32 motherboard manufacturers go a good number of steps further. ;) I recommend that you investigate Intel's Placer (E7505) chipset and motherboards based on it (several of Supermicro's offerings, as well as offerings from Tyan and other manufacturers, e.g. the Iwill DPL533 and DP533). These motherboards support 133 MHz QDR system buses (coming to 533 million transfers a second), matched (quite well) with two channels of PC2100 DDR SDRAM. The result is 4.267 GB/s of memory bandwidth that is actually utilizable by the processors, since the memory bandwidth matches the system bus bandwidth. Apple's offering, by contrast, is bottlenecked by the system bus at just 1.333 GB/s, whether you have one processor or two. (And I'm certain that 200 MHz QDR Xeon chipsets are not far off in the future, since Intel in general appears to be headed in that direction.)
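The bandwidth figures in the comment above can be reproduced with a little arithmetic. This is a minimal sketch, assuming every bus and memory channel involved is 64 bits (8 bytes) wide and that the nominal "133 MHz" and "167 MHz" clocks are really 400/3 and 500/3 MHz. The function and variable names are illustrative.

```python
# Peak-bandwidth arithmetic for the buses named in the comment above.
# Assumptions: 64-bit (8-byte) data paths throughout; nominal 133 and
# 167 MHz clocks taken as 400/3 and 500/3 MHz; 1 GB = 10**9 bytes.

BUS_WIDTH_BYTES = 8  # assumed 64-bit data path


def peak_bandwidth_gb_s(clock_mhz, transfers_per_clock, channels=1):
    """Peak bandwidth in GB/s for a bus or set of memory channels."""
    bytes_per_second = (clock_mhz * 1e6 * transfers_per_clock
                        * BUS_WIDTH_BYTES * channels)
    return bytes_per_second / 1e9


# 133 MHz quad-pumped (QDR) system bus: 533 million transfers per second.
xeon_fsb = peak_bandwidth_gb_s(400 / 3, transfers_per_clock=4)

# Two channels of PC2100 DDR SDRAM (133 MHz, two transfers per clock).
dual_pc2100 = peak_bandwidth_gb_s(400 / 3, transfers_per_clock=2, channels=2)

# 167 MHz single-data-rate system bus, as on the G4 machines discussed.
maxbus = peak_bandwidth_gb_s(500 / 3, transfers_per_clock=1)

if __name__ == "__main__":
    print(f"133 MHz QDR system bus: {xeon_fsb:.3f} GB/s")
    print(f"Dual-channel PC2100:    {dual_pc2100:.3f} GB/s")
    print(f"167 MHz SDR system bus: {maxbus:.3f} GB/s")
```

The first two figures come out equal, which is the point the comment makes about memory bandwidth matching system bus bandwidth.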

  6. Re:Kitchen Sink by functor · · Score: 5, Informative

    The Pentium 4 is, in fact, designed to scale to high clock speeds exactly so that it can tolerate lots of pipeline bubbles in flight without ending up stalling for too long.

    A lot of these tricks (high decode bandwidth, multiple instruction queues [really buffers meant for reordering the instruction stream], branch prediction, etc.) are meant to reduce hazards such as pipeline bubbles as far as possible, and the PPC 970 does these hazard-reducing operations rather well, too.

    And, yes, we're now in the post-RISC world where instruction complexity (particularly in the realm of SIMD and streaming/explicit cache manipulation instructions) is growing because simple instructions clearly aren't enough to allow for great throughput increments.

    (Read some of Stokes' older articles in the Ars Technopaedia; I'm sure you'll find them interesting.)