Slashdot Mirror


Inside the PowerPC 970

daveschroeder writes "Jon "Hannibal" Stokes has posted a long-awaited, very detailed analysis of the IBM PowerPC 970 at Ars Technica. Notable quote: 'The 970 was made for Apple'."

14 of 163 comments

  1. One long read... by Thaidog · · Score: 1, Interesting

    Interesting idea to say that the vector units were "hacked" onto the POWER architecture... and that this is the reason the chip is designed for Apple...

    --

    ||| I still can't believe Parkay's not butter.

    1. Re:One long read... by batboy78 · · Score: 1, Interesting

      I'm more concerned about the release of the PPC 980, the mobile edition of the 970, in a nice 15-inch Al PowerBook. If you are into the rumor mill, you should check out MacBidouille's website. He has some speculations about speed and performance against the current line of P4 processors.

  2. Re:Inaccuracy, Part I by PetWolverine · · Score: 2, Interesting

    Whoa! A duplicate article, this I've seen before. But this is nuts!

    --
    I found the meaning of life the other day, but I had write-only access.
  3. nope. by netsrek · · Score: 3, Interesting

    Try doing audio signal processing or heavy graphics/video work.

    You're pretty thankful for your Altivec then...

    I saw such an insane improvement in Reaktor when it got Altivec enhanced...

    --

    i don't read slashdot anymore.
    1. Re:nope. by g4dget · · Score: 1, Interesting
      Using SIMD is intrinsically easier than using a cluster.

      They are just different. Clustering allows program-level parallelism, which gives you nearly linear scaling of throughput for arbitrary programs with no programming effort. SIMD and vector processors are very specialized and require a lot of effort to use well, and that effort is usually only worth it if you need to lower processing latencies.

      (Note, incidentally, that one of the most important SIMD machines, the Connection Machine, was built as a network of processors.)
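
      A minimal sketch of the "program-level parallelism" point, not a benchmark. The function `process` and the job sizes are hypothetical, and a thread pool stands in for cluster nodes (in CPython, threads will not actually speed up CPU-bound work). What it shows is only the structure: the per-job code is untouched and just the dispatch around it changes, whereas SIMD code would need the inner loop itself rewritten.

```python
from concurrent.futures import ThreadPoolExecutor


def process(job: int) -> int:
    """Stand-in for an arbitrary, unmodified program run on one input."""
    return sum(i * i for i in range(job))


def run_serial(jobs):
    # Baseline: one job after another on a single worker.
    return [process(j) for j in jobs]


def run_farmed(jobs, workers: int = 4):
    # Each worker plays the role of one cluster node. process() is not
    # rewritten at all; only the dispatch around it changes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, jobs))


jobs = [1000, 2000, 3000, 4000]
assert run_farmed(jobs) == run_serial(jobs)
```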

    2. Re:nope. by drunkenbatman · · Score: 2, Interesting

      They had a Mac with a DSP built-in back in the day. The Quadra A/V! It blasted the shit out of most any other computer when it came to Photoshop - rivaling modern computers for some tasks.

      They actually made two: the Quadra 660AV and the Quadra 840AV (there was also a Centris line without the fancy stuff). The 660 used a 25MHz 68040, and the 840 used a 40MHz 68040, and had a separate DSP that you had to write specifically for.

      Apple's thinking was kind of ahead of the curve at the time: they were introducing speech recognition and were going to ditch the modem in favor of the GeoPort (just a jack; it cost about half as much as a modem, and you'd do the same stuff in software using the DSP). After it died (very quickly) there were rumors that they'd be adding a Philips TriMedia card later on, but that was dropped, as they had hopefully learned their lesson from the AV fiasco.

      At no tasks could those computers hold their own with any "modern computer". The 840 held its own for a while, but that was because it had the fastest released 040 chip, there wasn't much PPC software for a while, and the fastest Power Mac (80MHz) released later on emulated 68K code at about the same speed as the 840 ran it natively.

      But honestly (and I did own one and loved it) they were a bad buy. Since you had to code specifically for the DSP, only a few Photoshop filters and one or two 3D programs had any code written for it... maybe a few things here and there, but nothing really of note. The telephony aspect of it sucked bad and never really worked well, so the DSP just sat there hanging out 99.9999999999% of the time. A bit later, Apple themselves said DSPs wouldn't be included in future machines, as they weren't needed given the speed of the PPC when running native code.

      Sorry, just at no time did they rival modern computers- they were cool for what they were, and allowed voice recognition and playback at a time when it was unthinkable... but that's about it.

      drunkenbatman

  4. Re:drop AltiVec by Daleks · · Score: 4, Interesting

    But the reality of regular high-end computing is that people don't have the time to optimize their software for the latest oddball hardware platform. And even something like a hand-coded vectorized BLAS library doesn't help because most scientific software still doesn't use such libraries.

    ATLAS is a BLAS implementation that is tuned for each system that it runs on. The people at MathWorks use this as the underlying BLAS system in MATLAB. Mathematica, Maple, etc. use this as well. There is even a G4/AltiVec optimized version available here. This is the whole point of layered software.
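
    A rough sketch of what "layered software" means here, in plain Python. This is not ATLAS's or BLAS's actual API; every name below (`register`, `dot`, `norm_squared`, the backend labels) is made up for illustration. The idea is that application code calls one stable routine, and a tuned implementation can be swapped in underneath without the caller changing.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Registry of interchangeable implementations of the same routine.
_BACKENDS: Dict[str, Callable[[Vector, Vector], float]] = {}


def register(name: str):
    """Decorator that files an implementation under a backend name."""
    def wrap(fn):
        _BACKENDS[name] = fn
        return fn
    return wrap


@register("reference")
def _dot_reference(x: Vector, y: Vector) -> float:
    # Plain loop: the portable fallback.
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


@register("unrolled4")
def _dot_unrolled4(x: Vector, y: Vector) -> float:
    # Four independent accumulators, loosely mimicking how a tuned
    # kernel might keep four vector lanes busy.
    acc = [0.0, 0.0, 0.0, 0.0]
    n4 = len(x) - len(x) % 4
    for i in range(0, n4, 4):
        for lane in range(4):
            acc[lane] += x[i + lane] * y[i + lane]
    tail = sum(a * b for a, b in zip(x[n4:], y[n4:]))
    return sum(acc) + tail


def dot(x: Vector, y: Vector, backend: str = "reference") -> float:
    """The only routine application code is expected to call."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return _BACKENDS[backend](x, y)


def norm_squared(x: Vector, backend: str = "reference") -> float:
    # "Application" code: written once, unaware of which kernel runs.
    return dot(x, x, backend)
```

    Calling `norm_squared(v)` and `norm_squared(v, backend="unrolled4")` should agree (up to floating-point summation order), which is the property that lets a tuned library replace a reference one.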

  5. Re:drop AltiVec by Wyatt+Earp · · Score: 5, Interesting

    AltiVec is nice for some things.

    My iTunes MP3 ripping speed nearly tripled when I went from a 466 MHz G3 to a 400 MHz G4, due to iTunes being optimized for AltiVec.

    Some Photoshop actions and filters see up to 800% improvements.

    iMovie exports on a 600 MHz G3 iMac take 200-300% longer than on a 400 or 500 MHz G4.

  6. Re:drop AltiVec by GlassHeart · · Score: 4, Interesting
    AltiVec is important for Apple marketing because it lets them claim impressive performance figures without actually needing to push the state of the art in terms of processor design

    Don't confuse "new" with "state of the art". The former is just something that hasn't been done before. The latter is something that yields "impressive performance figures". If Altivec is competitive with Intel, then it is state of the art, by definition, even if it's 20 years old. The CPU cache is a decades old concept, yet CPUs with caches are still state of the art.

    Imagine how much better it would be if Apple could ship systems based on the 970 today, rather than after a few months additional delay due to AltiVec.

    Don't underestimate the cost of software. Your idea is expensive, because it requires software vendors to maintain two different versions of their code. This can lead to buggier or more expensive products, or it can lead to the "abandonment" of the G4 installed base. That could easily be worth the few months for Apple.

  7. Re:In the market for a 64-bit workstation? by Mooncaller · · Score: 5, Interesting

    I did not have much trouble getting GNOME working on an HP B180 with HP-UX 10.20 (compiled with acc, HP-UX's standard compiler). BTW, that is a 180MHz PA-RISC machine. It kicked a 1GHz Pentium-based workstation's butt, even after I put Gentoo on the Intel box (it originally had only Windows NT, shudder). Fast clock rates can't compensate for a moronic architecture in the hands of heavily multitasking users like me.

  8. Turning the FFT into an integer monster. by MickLinux · · Score: 2, Interesting

    In response to your FFT being a floating point monster... in a lot of cases, couldn't you turn it into an integer monster? I've been thinking about this, and it occurs to me that the vector can be decomposed into halves (thus the 2^x units in the FFT), but a vector at angle theta can just as easily be decomposed into two vectors half the length, one at angle phi, and the other at angle (2theta-2phi).

    That holds where phi is any angle. That being the case, it seems to me that you could pick your values of phi to correspond to "perfect" triangles (3-4-5, ~37 degrees, for example), and keep your operations in the integer realm for everything except subtraction of angles.
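
    To make the "perfect triangle" idea concrete, here is a small sketch of just the rotation step, assuming a Pythagorean triple (a, b, c) such as 3-4-5. The function name `rotate_by_triple` is made up for illustration. Because cos = a/c and sin = b/c are rational, rotating an integer vector by atan2(b, a) needs only integer multiplies if the common factor c is carried along rather than divided out. Whether such angles can be made to line up with the angles an FFT actually needs is a separate question that this sketch does not address.

```python
import math


def rotate_by_triple(x: int, y: int, a: int = 4, b: int = 3):
    """Rotate (x, y) by atan2(b, a), returning the result scaled by c.

    (a, b, c) is assumed to be a Pythagorean triple, so cos = a/c and
    sin = b/c are rational and the scaled rotation needs only integers.
    """
    return a * x - b * y, b * x + a * y


x, y = 7, 2
rx, ry = rotate_by_triple(x, y)   # equals 5 * (rotated vector) for 3-4-5
angle_deg = math.degrees(math.atan2(3, 4))   # roughly 36.87 degrees

# The scaled squared length is preserved exactly: |r|^2 == 5^2 * |v|^2.
assert rx * rx + ry * ry == 25 * (x * x + y * y)
```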

    I dunno, I haven't checked this out really thoroughly, and this is therefore probably nonsense. Last time I tried to do anything with the DFT, I thought I had something that blew the FFT away in terms of speed... precisely because I didn't understand the full FFT process, and its beautiful simplicity.

    In reality, I got a very modest improvement over the FFT, not worth the extra code in my opinion.

    My method was very different, involving a redefinition of the DFT matrix-vector combination, and had more work on paper, but fewer multiplications. But what I thought was on the order of (log2 N)^2 multiplications, instead of the DFT's N^2, was really something like 0.87 N log2 N multiplications. The FFT needs N log2 N multiplications.
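
    For a sense of scale, the figures quoted above can be tabulated directly. This only evaluates the formulas as stated in this comment (treated as rough orders of magnitude, not exact operation counts); the function and key names are made up for illustration.

```python
import math


def multiplication_counts(n: int) -> dict:
    """Multiplication estimates for a length-n transform, using the
    figures quoted in the comment (orders of magnitude, not exact)."""
    log2n = math.log2(n)
    return {
        "dft": n * n,                    # direct matrix-vector DFT
        "fft": n * log2n,                # FFT figure as quoted
        "modified": 0.87 * n * log2n,    # the commenter's variant
        "hoped_for": log2n ** 2,         # the mistaken first estimate
    }


for n in (64, 1024, 65536):
    c = multiplication_counts(n)
    print(n, c["dft"], round(c["fft"]), round(c["modified"]), round(c["hoped_for"]))
```

    At every size the variant saves the same 13% over the quoted FFT figure, which is small next to the gap between either of them and the direct DFT.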

    Essentially, when I understood the FFT well, and applied my lessons to it, I ended up showing that not all the multiplications are necessary. Some of the FFT multiplications are dupes just like this article, and there is a system for finding them, also just like this article. (Look for the multiplications posted by Taco.)

    But the fact that I can make such errors means that I could be completely wrong about my supposed integer FFT.

    --
    Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
  9. Nonsense by Karl+Cocknozzle · · Score: 2, Interesting
    I really don't think it's possible for each of 30 people to be aware of all 30 other articles.

    Bullshit. When I worked for the University Daily Paper we had no problem avoiding duplicate stories all over the paper... And we ran FAR MORE THAN 30 STORIES A DAY.

    In my example it was a bunch of drunk/high/rushing out to get laid coward students--Can't professionals who are being paid to do their damn job right do AT LEAST as well as the wasted college kids?
    --
    Who did what now?
  10. Up to marketing, not technology by mariox19 · · Score: 3, Interesting

    For me, the most interesting part of the article is that it treats the pricing of the new machines as the real question. According to the author, the chip will make Apple machines technologically competitive. The question is, will Apple price them to gain market share, or continue to sell to a disappearing niche of luxury computer buyers?

    Maybe Apple's concentration on developing software, and selling that software (rather than giving it away), along with its new business ventures, such as .Mac and the new iTunes online music store, point to a new business model that can afford to cut the margins on hardware.

    If they don't lower the price of their machines -- the top ones, namely -- they will suffer, long-term. I don't think they need to be on par with PCs; I just think they cannot be too much more expensive than PCs.

    --

    quidquid id est, timeo puellas et oscula dantes.

  11. troll? by netsrek · · Score: 2, Interesting

    wtf? hey maclots.... Just cause someone is criticising Altivec doesn't necessarily make them a troll....

    I do agree with you that clustering could be far more useful than it currently is, but as you say, anything that requires low latency is kind of problematic...

    As far as clustering goes, you know you're able to put together a PC processing monster and use VST System Link ?

    Been considering this to add to my TiBook...

    --

    i don't read slashdot anymore.