More on the PowerPC 970
functor writes "Ars Technica's Jon Stokes has a treatise up covering the microarchitecture of the high-performance 64-bit PowerPC 970 microprocessor, due to be released by the end of the year, that goes over in detail how this chip is put together, and how we can expect it to perform. This is the follow-up to Stokes' article detailing the PPC 970's design philosophy. 'It appears to hold quite a bit of promise in bolstering Apple's currently almost obsolescent product line, and it appears to have been designed explicitly to fulfil Apple's requirements. To say the least, the second half of this year looks to be pretty interesting as Apple's product line promises to become competitive performance-wise with IA-32 and x86-64-based PCs again.'"
No, but IA-32 motherboard manufacturers go a good number of steps further. ;)
I recommend that you investigate Intel's Placer (E7505) chipset and motherboards based on it (several of Supermicro's offerings, as well as offerings from Tyan and other manufacturers, e.g. the Iwill DPL533 and DP533).
These motherboards support 133 MHz QDR system buses (533 million transfers a second), matched quite well with two channels of PC2100 DDR SDRAM. The resulting 4.267 GB/s of memory bandwidth is actually usable by the processors, since the memory bandwidth matches the system bus bandwidth. Apple's offering, by contrast, is bottlenecked by the system bus at just 1.333 GB/s, whether you have one processor or two.
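For anyone who wants to check those figures, here is a rough back-of-the-envelope sketch in Python. The 64-bit (8-byte) bus widths and the 167 MHz single-pumped Apple bus are my assumptions, chosen because they reproduce the numbers quoted above; the function and variable names are just illustrative.

```python
def peak_bandwidth_gbs(clock_mhz, transfers_per_clock, width_bytes):
    """Theoretical peak bandwidth in GB/s (decimal: 10^9 bytes per second)."""
    return clock_mhz * 1e6 * transfers_per_clock * width_bytes / 1e9

# Nominal "133 MHz" and "167 MHz" clocks are really 133.33 and 166.67 MHz.
BASE_133 = 400 / 3
BASE_167 = 500 / 3

# Xeon system bus: 133 MHz quad-pumped, assumed 8 bytes wide.
xeon_fsb = peak_bandwidth_gbs(BASE_133, 4, 8)

# Two channels of PC2100: 133 MHz double-pumped, assumed 8 bytes wide each.
dual_pc2100 = 2 * peak_bandwidth_gbs(BASE_133, 2, 8)

# Apple's system bus: assumed 167 MHz, one transfer per clock, 8 bytes wide.
apple_bus = peak_bandwidth_gbs(BASE_167, 1, 8)

print(f"Xeon system bus:   {xeon_fsb:.3f} GB/s")
print(f"Dual PC2100 DDR:   {dual_pc2100:.3f} GB/s")
print(f"Apple system bus:  {apple_bus:.3f} GB/s")
```

These are theoretical peaks only; sustained bandwidth in practice will be lower on both platforms.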
(And I'm certain that 200 MHz QDR Xeon chipsets are not far off in the future, since Intel in general appears to be headed in that direction.)
The Pentium 4 is, in fact, designed to scale to high clock speeds exactly so that it can tolerate lots of pipeline bubbles in flight without ending up stalling for too long.
A lot of these tricks (high decode bandwidth, multiple instruction queues [really buffers meant for reordering the instruction stream], branch prediction, etc.) are meant to reduce hazards such as pipeline bubbles as far as possible, and the PPC 970 does these hazard-reducing operations rather well, too.
And, yes, we're now in the post-RISC world where instruction complexity (particularly in the realm of SIMD and streaming/explicit cache manipulation instructions) is growing because simple instructions clearly aren't enough to allow for great throughput increases.
(Read some of Stokes' older articles in the Ars Technopaedia; I'm sure you'll find them interesting.)