IBM PowerPC 970 Architecture
riclewis writes "Hannibal from Ars Technica offers an explanation of some of the internals of the new IBM chip. It's certainly more powerful than anything on the desktop now, but by the time it's released a year from now, it looks to be middle-of-the-pack (which could still be a step up for Apple...) This excitement over the early release of hardware specs kinda reminds me of all the hype surrounding Sony's Emotion Engine when it was introduced a couple years ago. In fact, some are suggesting the PPC 970 chip might be closely related to the PS3's 'Cell' processor..."
The PowerPC 970 triples the length of the PowerPC pipeline
This will give it the same issues the P4 has. Namely a large penalty for branch mispredicts, etc. Instructions per clock will decrease.
OTOH, they should be able to crank the speed!
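To put rough numbers on that tradeoff, here is a minimal sketch of the usual textbook stall model. Every figure in it (base IPC, branch fraction, mispredict rate, penalty cycles, clock speeds) is a made-up assumption for illustration, not a measured value for the 970, the G4, or the P4.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, penalty_cycles):
    """Average instructions per clock once mispredict stalls are counted.

    Works in cycles per instruction: the ideal CPI plus the average
    number of stall cycles each instruction pays for mispredicted branches.
    """
    base_cpi = 1.0 / base_ipc
    stall_cpi = branch_fraction * mispredict_rate * penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


def throughput(clock_ghz, ipc):
    """Billions of instructions per second at a given clock and IPC."""
    return clock_ghz * ipc


# Hypothetical short pipeline: 7-cycle mispredict penalty at 1.0 GHz.
short_ipc = effective_ipc(2.0, 0.20, 0.05, 7)
# Hypothetical pipeline three times as long: 21-cycle penalty at 2.0 GHz.
long_ipc = effective_ipc(2.0, 0.20, 0.05, 21)

print("short pipeline: IPC %.2f, %.2f G instr/s" % (short_ipc, throughput(1.0, short_ipc)))
print("long pipeline:  IPC %.2f, %.2f G instr/s" % (long_ipc, throughput(2.0, long_ipc)))
```

Under these assumed numbers the longer pipeline loses IPC but still comes out ahead once the clock is doubled. Whether that holds for real chips depends entirely on the real mispredict rate and the clock actually reached.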
The law is a weapon of the government, not a protection for the likes of you. Surely you understand that.
> Instructions per clock will decrease.
.09 process shows me that this 970 chip has legs. Another thing... IBM has *always* been conservative about what not-quite-ready chips will do as far as clock, and benchmarks. I expect "Real World" [no relation to Peter Gabriel] performance to be quite good. [although I expect Peter Gabriel's performances to be fantastic =)]
Actually, IPC is *increased* from the current G4. It will now fetch 8 instructions per clock, and retire 5 per clock.
The current G4 IIRC fetches either 3 or 4 per clock. I have no idea how many it can retire at once.
This coupled with a quick move to a
Blocklevel: Practical Information Architecture
I have no idea who you are, Mononoke, but I'd wager $1000 that Hannibal Stokes knows more about chip architecture than you do. The PPC 970 will have a hard fight (both in marketing and benchmarks) against the 4+GHz x86 chips also due a year from now.
p.s. How the heck did that get rated as Insightful? I'm as rabid a Mac addict as any of you, but it's just plain wrong to mod someone up for spouting false evangelism.
Well, I'll try.
Rendering apps like Lightwave, Maya, etc. will benefit from this for several reasons:
The 64bit architecture:
Lightwave [if rewritten to be 64bit] will be able to use bigger numbers, and use more memory. Bigger numbers means that calculations that would involve making a 64bit word out of 2 32bit words [as it currently stands] needn't be done. Being able to address more memory is *always* a good thing.
Really good Floating Point Performance:
3D rendering apps love FP. Bigger/faster/more FP units are a good thing.
Memory Bandwidth:
The 900MHz bus will allow a *huge* amount of memory to be shuttled back and forth from the processor *very* quickly. This means your huge scenes will be rendered faster.
Altivec/Vector Processing unit:
Because the VPU doesn't do double precision FP, it doesn't help in the final rendering [much]. It *will* help in things like realtime previews, where the math is simplified. Imagine *big* previews of scenes in realtime.
Multiprocessing:
This chip is [as implied] MERSI compliant. This means that it is a perfect candidate for multiprocessing, like the current G4.... but the 970 can go many more "ways" than the G4 [the G4 was in an "optimal" multiprocessing stage with 2 procs]. The 970 can go up to 16, IIRC.
This seems like it'll be a winner.
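For the "64bit word out of 2 32bit words" point, a rough sketch of what the extra work looks like. Python integers are arbitrary precision, so the 32-bit registers are simulated with masks; the function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a 64-bit value into (high, low) 32-bit words."""
    return (x >> 32) & MASK32, x & MASK32


def join64(hi, lo):
    """Rebuild a 64-bit value from its two 32-bit words."""
    return (hi << 32) | lo


def add64_with_32bit_words(a_hi, a_lo, b_hi, b_lo):
    """64-bit add done the 32-bit way: add the low words, carry, add the high words."""
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32              # 1 if the low-word add overflowed
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32  # wraps like a real 64-bit register
    return hi, lo


a = 0x00000001FFFFFFFF
b = 0x0000000000000001
hi, lo = add64_with_32bit_words(*split64(a), *split64(b))
print(hex(join64(hi, lo)))
```

On a 64-bit integer unit the same add is a single instruction, with no separate carry step.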
Blocklevel: Practical Information Architecture
Once you move beyond 4.3 billion (2^32), into the realm of 18.4 quintillion (2^64, seven orders of magnitude past a trillion), you can address anything for the foreseeable future (since you can count each year until the heat death of the universe this way, for example).
For vector operations, 64bit words make for some fast math operations, since you can pack more 32-bit integer components into each bus transfer.
For floating point, it means you have greater precision in hardware (allowing things like real physics and shapes to be modelled without noticeable issues caused by subtle number creep). Since most systems use IEEE 754 (64bit double precision floating point), it means a speedup to that software since you're not working with it as 2 32-bit operations.
In terms of storage space, it means you can address more than 2,199,023,255,552 bytes (~2 terabytes) of disk space (assuming a 512-byte sector). This is important for people with big RAID arrays today, and people with ludicrously big Maxtor drives 3-4 years from now.
For RAM, it means you don't have to worry about your server topping out at 4 gigabytes of RAM. It also means that your VM space has no effective limitation for the foreseeable future (very useful for people working on large projects, trying memory-intensive algorithmic approaches to traditionally NP-hard problems, or distributed computing problems).
I'm sure I missed a lot of the benefits even with this list. As you can see, 64-bit is not just a number game. The 64-bit range is 32 binary orders of magnitude (a factor of about 4.3 billion) larger than 2^32, meaning our grandchildren will probably still be using 64bit machines with no limitations being apparent (unlike 16-bit to 32-bit, which only moved from 65k to 4.3 billion in terms of addressable amounts of something).
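The limits discussed in this post can be checked with exact integer arithmetic. A quick sketch; the only assumption beyond the post is treating each address or sector number as a plain unsigned integer of the given width.

```python
import math

ADDRESSABLE_16 = 2 ** 16   # 65,536 (the "65k" of 16-bit machines)
ADDRESSABLE_32 = 2 ** 32   # 4,294,967,296 (about 4.3 billion)
ADDRESSABLE_64 = 2 ** 64   # 18,446,744,073,709,551,616 (about 1.8e19)

SECTOR_BYTES = 512
# Largest disk reachable with 32-bit sector numbers and 512-byte sectors.
DISK_LIMIT_32 = ADDRESSABLE_32 * SECTOR_BYTES   # 2,199,023,255,552 bytes

# How much bigger the 64-bit range is than the 32-bit range.
RATIO_64_TO_32 = ADDRESSABLE_64 // ADDRESSABLE_32
DECIMAL_ORDERS = math.log10(RATIO_64_TO_32)     # a bit under 10 decimal orders

print(f"{ADDRESSABLE_32:,} addresses with 32 bits")
print(f"{ADDRESSABLE_64:,} addresses with 64 bits")
print(f"{DISK_LIMIT_32:,} bytes on a 32-bit sector-addressed disk")
print(f"64-bit range is 2^{RATIO_64_TO_32.bit_length() - 1} times the 32-bit range")
```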
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Actually, the DDR thing is a little misguided. The real reason DDR had no effect was that the 2.1 GB/sec of memory bandwidth was feeding into 1.3 GB/sec of processor bus bandwidth.
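The bottleneck is just a min() over the links in the chain. A tiny sketch using the two figures quoted above; the function name is made up for illustration.

```python
def deliverable_bandwidth(memory_gb_s, bus_gb_s):
    """The processor sees no more than the slowest link can carry."""
    return min(memory_gb_s, bus_gb_s)


MEMORY_GB_S = 2.1   # DDR memory bandwidth quoted above
BUS_GB_S = 1.3      # processor bus bandwidth quoted above

usable = deliverable_bandwidth(MEMORY_GB_S, BUS_GB_S)
wasted_fraction = 1.0 - usable / MEMORY_GB_S

print(f"usable: {usable} GB/sec, unused memory bandwidth: {wasted_fraction:.0%}")
```

By this arithmetic, well over a third of the DDR bandwidth never reaches the CPU, so faster memory alone could not show up in benchmarks.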
A deep unwavering belief is a sure sign you're missing something...
PPC chips can only work on one swing of the computing "cycle", not on the up and down like an Athlon can for example
It's called positive and negative edge triggering. It's not a new technology either. I was dealing with it in the 80's at the discrete logic level.
AGP 2x uses this and 4x uses positive, negative, high and low triggering. Certain UDMA modes make use of this clocking technique also.
Your argument doesn't hold water.
His argument DOES hold water. PPC CPUs DO outperform Intel x86 CPUs by a good margin when compared clock for clock (showing the MHz Myth for what it is). Especially the G4, and boy, when AltiVec can be and is exploited... Wow. There IS more to CPU design than a smaller die and deeper pipelining for higher MHz.
As far as I can tell, Apple seems to be in a position where they have to make the best of what they can get, due to Motorola dropping the ball pretty badly.
I hope IBM comes to their rescue. How ironic.
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
http://www.heise.de/ct/english/02/05/182/
SPEC benchmarks for the G4 processors. (Not a synthetic benchmark issued by Apple, but by an unbiased third party, SPEC)
G4 1 GHz SPECs at 306 integer 187 floating-point
Interestingly, the 1 GHz G4 was almost neck-and-neck with a 1 GHz PIII (http://www.heise.de/ct/english/02/05/182/qpic02.jpg)
http://www.spec.org/osg/cpu2000/results/cpu2000.html
A large archive of SPEC results for many CPUs, including x86.
A few choice results:
1.2 GHz Athlon (Ancient by today's standards) - 443 integer, 387 FP
Athlon XP 1700+ on an Epox EP-8KHA (Happens to be my mobo - Slowest Athlon XP listed for this mobo):
633 integer, 561 FP
Dell Precision Workstation 330, 1.3 GHz P4 - 474 integer, 502 FP (The P4 doesn't seem to be taking too much of a branch misprediction hit here)
So in the case of G4s, while they may be a bit more efficient MHz for MHz (and the P3 vs. G4 benchmarks show that this isn't even necessarily the case), the fact that they're so far behind on the clock speed curve hurts them badly.
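For anyone who wants the MHz-for-MHz comparison as numbers, a quick sketch that divides the scores quoted above by the clock. It only uses the three results whose clock speed is stated in this post; the Athlon XP 1700+ is left out because its real clock isn't given here. Dividing a SPEC score by GHz is a crude normalization, not something SPEC itself defines.

```python
# (clock in GHz, SPEC integer score, SPEC floating-point score), as quoted above
RESULTS = {
    "G4 1.0 GHz": (1.0, 306, 187),
    "Athlon 1.2 GHz": (1.2, 443, 387),
    "P4 1.3 GHz": (1.3, 474, 502),
}


def per_ghz(score, clock_ghz):
    """Score normalized to a 1 GHz clock (crude clock-for-clock comparison)."""
    return score / clock_ghz


NORMALIZED = {
    name: (per_ghz(int_score, clock), per_ghz(fp_score, clock))
    for name, (clock, int_score, fp_score) in RESULTS.items()
}

for name, (int_per_ghz, fp_per_ghz) in NORMALIZED.items():
    print(f"{name:16s} int/GHz {int_per_ghz:6.1f}   fp/GHz {fp_per_ghz:6.1f}")
```

On these particular figures the G4 does not come out ahead per GHz on either integer or FP, though the runs use different compilers and systems, so treat it as a rough indication only.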
If you want to see a good example of MHz not being everything, check out the benchmarks of Alpha systems - The 750 MHz ones chew even 1.2 GHz Athlons for lunch. But don't look at Apple...
Also interesting in the case of the SPEC benchmarks run by Heise - MS C pays a 10-15% performance hit over GCC in the SPEC benchmarks.
retrorocket.o not found, launch anyway?