ArsTechnica Compares the P4 and G4e: Part II
Deffexor writes "It looks like Hannibal of ArsTechnica fame has posted Part 2 of his comparison between Intel's P4 and the Apple/Motorola G4e. In a nutshell, this second article covers the execution core, the AltiVec unit and SSE2, as well as a myriad of other interesting factoids. An interesting read, if a little technically intense for those of us with less than a CE/EE degree. Have at it boys!"
i personally believe that flexibility of the assembly instructions as well as the number of instructions executed per cycle contribute greatly to the dominant speed (at any given MHz/GHz) of the ppc processor. compare any intel/amd processor to a ppc at the same clock speed, and the ppc will kick its x86 ass.
the high end ppc desktops are topping out around 900MHz, while the p4's are hitting 2GHz. there has to be another explanation besides the complaint that jobs is ignorantly sitting on his thumbs. i think he knows what he's doing.
note: i am not a mac zealot.. i don't even own a mac - only 4 x86 pc's (1 athlon, 2 p133, 1 p120). i simply can appreciate the speed of the ppc.
"I just want to thank my coach Eric a.k.a. Disco for shattering my reality..."
This article is extremely informative and gives you good insight into how these processors are designed, as well as how they compare. I disagree with the poster though: you don't need a CE or EE degree to get the idea of what's going on. I'm a CE and I had classes on this sort of thing, so yes, I could follow all the gritty details, but I think the author did a good job of explaining things so that most people could understand. Also, I thought the author summed things up perfectly, saying:
The preceding discussion should make it clear that the overall design approaches I outlined in the first article can be seen in the execution cores of each processor. The G4e continues its "wide and shallow" approach to performance, counting on instruction-level parallelism to allow it to squeeze the most performance out of code. The P4's "narrow and deep" approach, on the other hand, uses fewer execution units, eschewing ILP and betting instead on increases in clock speed to increase performance.
This is exactly the case. Unfortunately the masses don't understand all of this wide vs narrow stuff, so they go for the higher clock speeds. In reality, Intel is really pulling one over on us, charging more money when all we're getting is a higher clock rate, not a whole lot of performance gain. PPC has proven itself time and time again to be the better processor, but unfortunately it isn't used in very popular machines (mostly Macs), so we don't get to reap the benefits.
On a related note, this article touches on one of the many reasons why the GameCube will run circles around the Xbox. GameCube's processor is a 485 MHz PPC designed specifically for video games, while the Xbox just uses a common Pentium running at 733 MHz.
This all brings up a good question: why haven't Macintosh's or GameCube's marketers come up with a benchmark to put next to the processor speed? Maybe I missed it, but I've never seen a Macintosh commercial saying "comes with a G4 800 MHz, comparable to a P4 1.5 GHz." There might be too many legalities involved to do something like that, but it seems like they need to educate people somehow about the non 1-to-1 relationship between clock speeds of P4s and PPCs.
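To put rough numbers on that non 1-to-1 relationship, here is a back-of-the-envelope sketch. It assumes sustained throughput is roughly average instructions per cycle (IPC) times clock rate. The IPC values are made-up placeholders picked so the arithmetic lands on the example above, not measurements of any real G4 or P4, and the model ignores the fact that the two instruction sets do different amounts of work per instruction.

```python
def throughput_mips(clock_mhz, avg_ipc):
    """Millions of instructions per second = clock (MHz) * instructions per cycle."""
    return clock_mhz * avg_ipc


def equivalent_clock_mhz(clock_mhz, avg_ipc, other_ipc):
    """Clock the other chip would need to match this chip's throughput."""
    return throughput_mips(clock_mhz, avg_ipc) / other_ipc


# Hypothetical IPC values, chosen only for illustration.
WIDE_SHALLOW_IPC = 1.5   # assumed average for a "wide and shallow" design
NARROW_DEEP_IPC = 0.8    # assumed average for a "narrow and deep" design

rating = equivalent_clock_mhz(800, WIDE_SHALLOW_IPC, NARROW_DEEP_IPC)
print(f"800 MHz at IPC {WIDE_SHALLOW_IPC} is roughly {rating:.0f} MHz at IPC {NARROW_DEEP_IPC}")
```

The catch for any marketing department is that real IPC changes from one workload to the next, so a single "equivalent MHz" figure would only hold for whatever benchmark it was derived from.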
~ now you know
Feel free to debunk it. Explain why it's better for developers (and the user experience) to have to work out how to optimise to a new pipeline every couple of years, rather than to squeeze every last drop of speed out of one design before moving on to the next on (e.g.) a five-year cycle.
The whole point of modern processor architectures is for the work of pipeline optimisation to be done by either the processor or the compiler. The goal is less work for the application developer. This coincides with the trend towards higher level programming languages - I can't think of any large major applications architected such that the code needs to be hand optimized at instruction level. Sure, after profiling some parts may be tuned, but applications today are just too large to design in overview with that scope. A huge amount of the transistor budget these days is taken by scheduling logic that performs these optimizations on the fly, but with architectures such as Itanium you're seeing a move towards compile time ordering. But now I'm getting sidetracked...
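For anyone who wants to see what "ordering" means here, below is a toy sketch of greedy list scheduling. It is not how any real CPU or compiler does it: every instruction is assumed to take one cycle, and the instruction names, the dependency table and the schedule function are all hypothetical. Each cycle it issues up to issue_width instructions whose inputs were produced in an earlier cycle.

```python
def schedule(instrs, deps, issue_width):
    """Greedy list scheduling over a dependency table.

    instrs: instruction names in program order.
    deps: maps an instruction to the instructions it must wait for.
    Returns a list of cycles, each a list of the instructions issued.
    Assumes every instruction completes in one cycle (a simplification).
    """
    issued_in = {}           # instruction -> cycle in which it was issued
    pending = list(instrs)
    cycles = []
    while pending:
        cycle = len(cycles)
        # Ready = all dependencies were issued in a strictly earlier cycle.
        ready = [i for i in pending
                 if all(d in issued_in and issued_in[d] < cycle
                        for d in deps.get(i, ()))]
        issued = ready[:issue_width]
        if not issued:
            raise ValueError("nothing can issue: bad width or unsatisfiable dependency")
        for i in issued:
            issued_in[i] = cycle
        pending = [i for i in pending if i not in issued]
        cycles.append(issued)
    return cycles


# Hypothetical six-instruction program computing (x + y) * z.
program = ["load_x", "load_y", "add", "load_z", "mul", "store"]
depends_on = {
    "add": ["load_x", "load_y"],
    "mul": ["add", "load_z"],
    "store": ["mul"],
}

for width in (1, 2, 3):
    print(width, "wide:", len(schedule(program, depends_on, width)), "cycles")
```

Going from one to two issue slots helps because load_z gets hoisted next to the add, but a third slot buys nothing: the add, mul, store chain sets the floor. That is the flip side of the "wide" approach, since the extra units only pay off when the code has independent work to feed them.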
Ever looked at the specification of a Playstation [e-scapegames.co.uk] and wondered how on earth developers got it to do what they had it doing by the end of its lifecycle? This is a time-honoured trend in closed-hardware systems. Look at the last generation of games on the SNES. Look at the scene demos being put out for the Amiga in the mid 90's, essentially a decade-old hardware platform at that point.
Ever wondered why early Playstation 2 games bit the weenie?
All first generation titles do not harness the full capacity of a machine; I think this is your point. But on open systems developers don't have to optimize anything, thanks to Moore's Law. Maybe it doesn't strictly live up to the "small-is-beautiful" aesthetic, but software development is about optimizing results. Time will be spent where the greatest payoff is, and since performance boosts are a natural consequence of progress more resources are devoted to development.
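A rough way to see the trade-off is to ask how long you would have to wait for hardware alone to deliver a given speedup. The sketch below assumes performance doubles at a fixed interval; the 18-month default is an assumption for illustration only (Moore's Law as stated is about transistor counts, not speed), and the function name is made up.

```python
import math


def months_until_speedup(target_speedup, doubling_months=18.0):
    """Months of waiting for hardware alone to give target_speedup,
    assuming performance doubles every doubling_months (an assumption)."""
    if target_speedup <= 0:
        raise ValueError("target_speedup must be positive")
    return doubling_months * math.log2(target_speedup)


for speedup in (1.25, 2, 4):
    print(f"{speedup}x: about {months_until_speedup(speedup):.1f} months")
```

If hand tuning would take longer than that and gain less, the money goes into features instead, which is the payoff argument above.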
It *is* fair to make these comparisons since the G4 and the P4 are the best Motorola and Intel have to offer us, right now. It's irrelevant what the G4 was "aimed at competing with"; what is relevant is what we have in our hands now.
The whole market for motherboard upgrades comes from this situation. Apple does not support motherboards that it did not manufacture, and the OS used to check for the presence of genuine ROMs. So third parties could not build replacement boards. By upgrading only the CPU subsystem, the rest of the motherboard would remain genuine Apple and therefore run the system without problems.
Also remember that Mac hardware tends to be more expensive and last longer than PCs. While the performance boost you win by upgrading only the CPU subsystem is lower, the impact on the workstation is also lower. Changing a motherboard means changing the system and installing new drivers, so basically more maintenance work.
This situation might change with Darwin. Theoretically, nothing prevents some company from producing PPC motherboards, recompiling Darwin for them and then building an installer that installs OS X on top of Darwin. Old machines that Apple does not support can run OS X this way.