Inside the PowerPC 970

Posted by CmdrTaco on Wednesday May 14, 2003 @04:57PM from the take-a-closer-look dept.

daveschroeder writes "Jon "Hannibal" Stokes has posted a long-awaited, very detailed analysis of the IBM PowerPC 970 at Ars Technica. Notable quote: 'The 970 was made for Apple'."

In the market for a 64-bit workstation? by idiotnot · 2003-05-14 17:02 · Score: 5, Insightful

Fast forward a few months....hmm...a few options:

Sun: Nice hardware, very expensive, CDE.
AMD: Commodity hardware, cheap, WinXP.
HP: Intel hardware, very expensive, CDE or WinXP.
I think I know what I'd buy.

Of course, the Athlon64/Opteron would get quite a bit of consideration due to my hobbies.

But I think it'd end up being the Mac.
1. Re:In the market for a 64-bit workstation? by Bold+Marauder · 2003-05-14 17:12 · Score: 4, Insightful
  
  AMD, NetBSD in place of XP? ;)
Dual FPUs! by Anonymous Coward · 2003-05-14 17:15 · Score: 4, Insightful

Reading through the article, it's nice to see some real design going into a processor. Looking through Intel's last few chips, they've been upping their clock speed and packing in more cache.

Yeah, yeah, they are hog-tied because you can't easily re-compile the entire Windows platform to use new instruction sets. Linux users, of course, don't have this problem (muhahahah).

Did anyone else catch the bit on the twin FPUs? I'm just imagining what this thing is going to do with vector operations and frequency transforms.

For most of you non-engineers:

-Most 3D vector operations are affine transformations. Using a 4x4 array of floating point numbers you can translate, rotate, and scale. Works beautifully, but it's a lot of calculations.

-The Fast Fourier Transform (FFT) is used a lot in signal processing. It's a floating point monster.
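
To make the 4x4 point concrete, here is a minimal sketch in plain Python (no SIMD, no libraries). The helper names (translation, scaling, rotation_z, matmul, apply) are made up for illustration and are not from the article. Each transformed point costs 16 multiplies for a full 4x4 product, which is why this kind of work leans so hard on floating point hardware.

```python
import math

def identity():
    # 4x4 identity matrix as a list of rows
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def translation(tx, ty, tz):
    # Translation lives in the fourth column
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

def scaling(sx, sy, sz):
    # Scale factors live on the diagonal
    m = identity()
    m[0][0], m[1][1], m[2][2] = sx, sy, sz
    return m

def rotation_z(theta):
    # Counter-clockwise rotation about the z axis, theta in radians
    c, s = math.cos(theta), math.sin(theta)
    m = identity()
    m[0][0], m[0][1] = c, -s
    m[1][0], m[1][1] = s, c
    return m

def matmul(a, b):
    # Compose two transforms; the result applies b first, then a
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def apply(m, point):
    # Treat the 3D point as the homogeneous vector (x, y, z, 1)
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[r][k] * v[k] for k in range(4)) for r in range(3))

# Scale by 2, then shift one unit along x, folded into a single matrix
combined = matmul(translation(1, 0, 0), scaling(2, 2, 2))
moved = apply(combined, (1, 1, 1))
```

The useful property is that any chain of translate/rotate/scale steps folds into one matrix up front, so the per-point cost stays at a single matrix-vector multiply.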
drop AltiVec by g4dget · 2003-05-14 17:24 · Score: 2, Insightful

If the 970 were solely intended as a Linux desktop platform for IBM, they would've preferred to reduce the 970's die size, power consumption, time-to-market, etc. by just leaving out the Altivec unit altogether, instead of shoehorning it into the design the way they did.
This is probably true and rather unfortunate. AltiVec is important for Apple marketing because it lets them claim impressive performance figures without actually needing to push the state of the art in terms of processor design further than Intel. It's also important for a few special-purpose applications (Photoshop filters, etc.).
But the reality of regular high-end computing is that people don't have the time to optimize their software for the latest oddball hardware platform. And even something like a hand-coded vectorized BLAS library doesn't help because most scientific software still doesn't use such libraries.
I think this tradeoff doesn't even work well for Apple. Imagine how much better it would be if Apple could ship systems based on the 970 today, rather than after a few months additional delay due to AltiVec. And every dollar and watt that is shaved off the AltiVec price makes it a much more viable processor for servers and blades, which would get volume up and prices down. Gimmicks like AltiVec cost much more than they are worth, even for Apple.
1. Re:drop AltiVec by Anonymous Coward · 2003-05-14 17:42 · Score: 2, Insightful
  
  A few problems:
  (A) SIMD is really bloody fast if you use it. And Apple does. Heavily. Would you want to rewrite OSX, significantly slowing it down, to create an altivecless version?
  
  (B) Apple has gone through two major transitions: 68k->PPC, Mac OS Kernel->BSD kernel. Another rewrite requiring transition is possible, but over something this small? That seems unlikely. And I bet users would be THRILLED when some apps just stop working.
  
  (C) The other option is to just crack those 128 bit instructions down, just like everything else. But if you're gonna make the chip bigger and uglier to do this, why not just add altivec? The only argument I can think of is that this would get rid of the altivec's extra long pipeline and possibly allow lower clocks/operation for some things.
2. Re:drop AltiVec by jovlinger · 2003-05-14 18:19 · Score: 2, Insightful
  
  Don't forget the oh-so-cool Fastest FFT In The West.
  
  IIRC, it will partially evaluate the code against the known size of the input, and I think also do some data-driven special-casing.
  
  Basically, it beats the pants off standard-library FFTs.
  
  While I'm at it, responding to grandparent:
  
  most scientific software still doesn't use such libraries
  
  Would you care to elaborate? I mean, if you're not writing against known ultra-optimized libraries, what business do you have expecting your software to run fast? That's like compiling with gcc instead of intel's compiler. I would expect that most scientists DO care enough to use the fastest libraries at hand, and put some effort into identifying bottlenecks.
  
  Or perhaps you were implying that most scientific software is too esoteric to benefit from fast linear algebra? From what I recall from my physics courses, it was pretty much ALL linear algebra: vectors / matrices / determinants / eigenvalues.
  
  In summarium, this position confuses me.
3. Re:drop AltiVec by Apotsy · 2003-05-14 18:55 · Score: 4, Insightful
  
  You'd be surprised how much stuff in Mac OS X is AltiVec optimized. Even memcpy gets a boost from it. It's a lot more than just a "gimmick".
  
  --
  Free Hans!
4. Re:drop AltiVec by Ikari+Gendo · 2003-05-15 02:19 · Score: 2, Insightful
  
  SIMD is really bloody fast if you use it. And Apple does. Heavily. Would you want to rewrite OSX, significantly slowing it down, to create an altivecless version? ... Apple has gone through two major transitions: 68k->PPC, Mac OS Kernel->BSD kernel. Another rewrite requiring transition is possible, but over something this small? That seems unlikely. And I bet users would be THRILLED when some apps just stop working.
  
  Er ... OS X doesn't need to be rewritten. It runs on Altivec-less G3s, and probably will continue to do so for a very long time.
  
  --
  FreeBSD - the power to serve.
Re:nope. by iomud · 2003-05-14 17:56 · Score: 2, Insightful

But you don't need altivec for a next gen chip that's so much faster than its predecessor that no one would notice. There is a huge disparity in the performance of current G4s and whatever they're gonna call the 970 based machines.
Re:nope. by bnenning · 2003-05-14 18:31 · Score: 2, Insightful

But there's like three people in the world who actually use altivec.

More than 3 people have ripped music in iTunes. Then there's the tremendous acceleration it provides for encoding DVDs, Final Cut Pro's real-time effects, BLAST, and plenty more. It's not even close to just Photoshop.

--
How to solve most of our problems: 1.Lots of nuclear plants. 2.Cure aging.
Re:nope. by g4dget · 2003-05-14 20:03 · Score: 1, Insightful

If you spend $3800 on a high-end Mac, you get something that runs a small number of hand-optimized commercial programs really fast. That may be useful for some of Apple's core audience.
But if you spend the same $3800 on x86 hardware, you get a small compute cluster that runs a lot of software faster and without AltiVec optimizations. For most scientific applications, as well as most video and audio applications, that's probably a better deal in terms of bang-for-the-buck, but, admittedly, it's probably not something your average Mac user wants to set up.
Re:Is this the G5? by Alan+Partridge · 2003-05-14 21:12 · Score: 2, Insightful

well, G5 was written on a roadmap I once saw describing the Motorola MPC 85xx series - but seeing as how that series has been on sale for about a year now with no sign of a CPU variant suitable for Apple to use, I guess we can forget about the MPC 85xx being used in PowerMacs. Personally, I'd like Apple to adopt the moniker "G64" for their PPC 970 powered machines - that'd stick it to Intel alright, and the idiotic warez kids would stop comparing clock speed and start comparing word length instead of getting on with their lives.

--
That was classic intercourse!
integer FFTs aren't uncommon by Trepidity · 2003-05-15 00:00 · Score: 3, Insightful

You just use fixed-point arithmetic instead of floating-point (i.e. a fixed 32 bits of precision, or 16 bits, or whatever). A simple way of doing this is to make INT_MAX/2 = 1.0, -INT_MAX/2 = -1.0, and everything in between scaled appropriately. (/2 to avoid overflow). Then you implement fixed-point addition, multiplication, division, and subtraction (as is commonly done in hardware DSP chips) and you've got yourself an integer-only FFT.

Some really old C code doing something along these lines is available here.
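
A rough sketch of the scaling scheme described above, in Python for brevity. The names (ONE, to_fixed, fx_mul, and so on) are made up for illustration and this is not the linked C code. It assumes a 32-bit INT_MAX, and it reads the "/2" as headroom: the sum of two values in [-1.0, 1.0] still fits in a 32-bit int. Python integers never overflow, so this only shows the scaling; in C the multiply and divide would need a 64-bit intermediate.

```python
INT_MAX = 2**31 - 1      # assumed 32-bit signed int
ONE = INT_MAX // 2       # fixed-point representation of 1.0

def to_fixed(x):
    # Convert a float in [-1.0, 1.0] to its scaled integer form
    return int(round(x * ONE))

def to_float(a):
    # Convert back to a float, only for inspecting results
    return a / ONE

def fx_add(a, b):
    # Same scale on both sides, so plain integer addition works
    return a + b

def fx_sub(a, b):
    return a - b

def fx_mul(a, b):
    # The raw product carries the scale factor twice; divide one out.
    # Floor division rounds toward negative infinity.
    return (a * b) // ONE

def fx_div(a, b):
    # Pre-scale the numerator so the quotient keeps the scale factor
    return (a * ONE) // b

half = to_fixed(0.5)
quarter = fx_mul(half, half)   # roughly 0.25 in fixed point
```

An FFT butterfly built on these would multiply by precomputed fixed-point twiddle factors with fx_mul and combine with fx_add/fx_sub, so no floating point is needed at run time.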

--
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Re:nope. by TheRaven64 · 2003-05-15 00:13 · Score: 3, Insightful

Using SIMD is intrinsically easier than using a cluster. To use a SIMD instruction set all you need to do is isolate occurrences where you are applying the same operation to multiple units of data and (in the case of AltiVec) call the corresponding vector operation in the standard lib.
Coding for a cluster introduces all kinds of communication and synchronisation headaches, especially since it takes such a long time to communicate between nodes (1ms is a very long time in terms of a CPU).

--
I am TheRaven on Soylent News
yep by Zergwyn · 2003-05-15 01:45 · Score: 2, Insightful

I agree with this totally. A surprisingly large, and ever increasing, number of OS X libraries use altivec, which means that developers using those libraries get some acceleration for free. Altivec is much easier to optimize stuff for than MMX, SSE2, etc.