Slashdot Mirror


PowerPC 970 Running at 2.5 GHz

kuwan writes "IBM has just released a press release that indicates they have the new PowerPC 970 running at 1.8 to 2.5 GHz making it 'the fastest PowerPC so far.' IBM's original estimates were to have the chip running at 1.4 to 1.8 GHz at introduction, so this is very good news for those of us hoping Apple will use this as their next-generation chip."

39 of 593 comments

  1. Let's see some FAB speed scores by MarkRH · · Score: 5, Insightful

    Who cares how fast IBM has this running in the lab--let's see how fast those fab lines are running before we get too excited.

    1. Re:Let's see some FAB speed scores by AresTheImpaler · · Score: 5, Informative

      here is some info i found.. might help:
      SPECint2000
      - 937 @ 1.8 GHz
      SPECfp2000
      - 1051 @ 1.8 GHz
      Dhrystone MIPS
      - 5220 @ 1.8 GHz
      - 2.9 DMIPS / MHz
      Additional Performance
      - Peak scalar GFLOPS = 7.2
      - Peak SIMD GFLOPS = 14.4
      - RC5 : 18M keys/sec
      Unfortunately, at the very bottom it says that some of these are estimates. Here is the link where I got the info: http://www-3.ibm.com/chips/techlib/techlib.nsf/techdocs/A1387A29AC1C2AE087256C5200611780
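
      A quick sanity check on how the quoted figures relate to each other. This is only a sketch: the numbers are the ones listed above, and the function names are invented for illustration.

```python
# Figures quoted in the comment, all for the 1.8 GHz part.
CLOCK_MHZ = 1800
DHRYSTONE_MIPS = 5220
PEAK_SCALAR_GFLOPS = 7.2
PEAK_SIMD_GFLOPS = 14.4


def per_mhz(score, clock_mhz):
    """Normalize a score by clock speed (e.g. DMIPS per MHz)."""
    return score / clock_mhz


def flops_per_cycle(gflops, clock_mhz):
    """Convert a peak GFLOPS figure into floating-point ops per clock cycle."""
    # GFLOPS * 1000 gives MFLOPS; dividing by MHz leaves FLOPs per cycle.
    return gflops * 1000 / clock_mhz


print(round(per_mhz(DHRYSTONE_MIPS, CLOCK_MHZ), 1))
print(round(flops_per_cycle(PEAK_SCALAR_GFLOPS, CLOCK_MHZ), 2))
print(round(flops_per_cycle(PEAK_SIMD_GFLOPS, CLOCK_MHZ), 2))
```

      If the listed peaks are right, they work out to 4 scalar and 8 SIMD floating-point operations per cycle, and the 2.9 DMIPS/MHz line is simply 5220 divided by 1800.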

    2. Re:Let's see some FAB speed scores by Clockwurk · · Score: 5, Informative
    3. Re:Let's see some FAB speed scores by Monokeros · · Score: 5, Informative

      OK, Everyone who wants to understand which processor is fastest should really take a course on processors. Here's the (condensed) deal with the MHz myth:

      All other things being equal, faster clock frequency = faster processor. The trick is in the magic words "all other things being equal". If I have a 1 GHz G4 and overclock it to 1.8GHz it will be faster. That's because the processor is using the exact same process but all the steps in the process suddenly take less time.

      The problem is that no two processor designs are the same. RISC vs CISC isn't even the only consideration. There are cache sizes/locations, number of pipeline stages, number of pipelines, processor component layout, all kinds of crap. And that's just IN the processor. Motherboard designs don't even enter into my discussion.

      PPC and x86 are very different, as well you know if you are a nerd (if you aren't then what are you doing here anyway?). But even processors that run the same instruction set are different enough that clock frequency doesn't necessarily dictate relative processing speed. This is why if you went to tom's hardware when the P4's first came out and looked at the benchmarks, initial P4's were rated as slower than P3's which were running at a SLOWER clock frequency. And I don't think I have to tell you about AMD vs. Intel processors at equal clock speeds.

      The point is that clock frequency is a number that represents something that is actually going on inside your processor. It doesn't always accurately represent speeds relative to other processors, but it's a pretty good heuristic when used wisely. If you're comparing the speed of different P4's you wouldn't be in error if you said "I want a 2.6GHz P4 because it's faster than a 2.2GHz P4". However, you probably would be in error if you said "I want a 2.6GHz P4 because it's faster than a 2.5GHz Power5".
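
      The "all other things being equal" point can be put as a toy model: work per second is roughly instructions per cycle (IPC) times clock rate. The IPC values below are made up purely for illustration, and real chips are nowhere near this simple.

```python
def relative_throughput(ipc, clock_ghz):
    """Toy model: instructions per cycle times billions of cycles per second."""
    return ipc * clock_ghz


# Hypothetical IPC numbers, not measurements of any real chip.
same_design_slow = relative_throughput(ipc=1.0, clock_ghz=2.2)
same_design_fast = relative_throughput(ipc=1.0, clock_ghz=2.6)
other_design = relative_throughput(ipc=1.5, clock_ghz=2.5)

# Same design: the higher clock is faster.
print(same_design_fast > same_design_slow)
# Different design: the lower-clocked chip can still come out ahead.
print(other_design > same_design_fast)
```

      With the same design the IPC term cancels out and clock speed decides it. Across designs it doesn't, which is the whole MHz myth.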

      --
      The Statue of Liberty is America's lawn jockey.
  2. ?!?!?!1 by Spazntwich · · Score: 4, Interesting

    I wonder how they managed to up the clock so dramatically? Is it just SOI and other techniques, or did they lengthen the pipeline significantly?

    If it's just a pipeline lengthening scheme, well, meh, but if they kept the same execution pipeline and are now at 2.5ghz operating range, they're going to kick some ass.

    1. Re:?!?!?!1 by addaon · · Score: 4, Informative

      This is the same 970 as before. No lengthened pipeline, although the 970 has a relatively long pipeline to begin with. And they probably hit 2.5ghz by selective testing... I haven't seen suggestions they can manufacture these chips in quantity yet. Keep in mind that Intel demos ~5GHz chips every few months or so. Even so, it's promising that the design seems to scale up that far without issues and without needing a process change.

      --

      I've had this sig for three days.
    2. Re:?!?!?!1 by binaryDigit · · Score: 5, Insightful

      Funny that you ask. The fact is that it doesn't matter. Remember the so-called "MHz myth"? Well, it definitely exists from a marketing standpoint. IBM could have cranked up the clock rate and achieved 0% performance increase and it wouldn't matter to most people. They just say "oh, Apple has a 2.5ghz processor, that's better than 1.8ghz, oooh, aaaah". This is the same battle that AMD fights. They are spending big bucks trying to remind people that just because that P4 is running at 3ghz, it doesn't mean that it is THAT much faster than a 2.2ghz Athlon.

  3. Motherboards ready for 2.5GHz? by occam · · Score: 4, Interesting

    I just hope Apple has their motherboards ready for 2.5GHz. The original spec of 1.8GHz with 6+GB bus was a little heady compared to Apple's current technology (no thanks to Motorola). I'm hoping they know how to build motherboards with the best of them to take advantage of IBM's new 970 chip. Pushing the envelope from 1.8GHz to 2.5GHz just makes the whole motherboard engineering issue more challenging. Let's hope Apple hardware design is up to the task (and then some).

    1. Re:Motherboards ready for 2.5GHz? by addaon · · Score: 4, Informative

      What's more interesting is that the frontside bus of the 970 was designed to scale with processor speed. So the 1.8GHz was supposed to have a 900MHz (well, presumably 225MHz quad-pumped) FSB, using a multiplier of 2. The 2.5GHz, then, has two options... either drop down a notch to use a multiplier of 3 (getting an 833MHz FSB, which is manageable)... or go full-hog and hit a 1.25GHz FSB. While I suspect that for the 2.5GHz chip the answer is, unfortunately, the former, the question is a bit hazier in the case of a 2GHz part... 1GHz is manageable but impressive, whereas 666MHz simply isn't enough. Of course, they can allow non-simple multipliers and solve the issue, but I do recall that they were planning on supporting only integral multipliers.
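
      The multiplier arithmetic here, as a small sketch. It assumes, as the comment does, that the effective bus speed is the core clock divided by an integer multiplier and that the quoted figure is quad-pumped. The function names are invented for illustration.

```python
def fsb_mhz(core_mhz, multiplier):
    """Effective front-side bus speed for a given core clock and multiplier."""
    return core_mhz / multiplier


def base_clock_mhz(effective_mhz, pumps=4):
    """Underlying bus clock if the effective rate is 'pumps' transfers per cycle."""
    # pumps=4 follows the comment's "presumably quad-pumped" guess.
    return effective_mhz / pumps


for core in (1400, 1800, 2000, 2500):
    for mult in (2, 3):
        bus = fsb_mhz(core, mult)
        print(f"{core} MHz core / {mult} = {bus:.0f} MHz FSB "
              f"({base_clock_mhz(bus):.0f} MHz base)")
```

      Note that 2000 / 3 is 666.67, so the "666MHz" in the comment is that value truncated, not rounded.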

      --

      I've had this sig for three days.
    2. Re:Motherboards ready for 2.5GHz? by addaon · · Score: 4, Interesting

      Eh, us mac users have lived with a slow bus too long to not want a fast one... because it might not be improved significantly for another four years! But yes, you're correct about the multiplier math. I just seem to remember hearing someone from IBM refer to the 1.8GHz part as having a 2x multiplier, and saying the 1.4GHz would have the same multiplier for a 700MHz (175MHz) bus... and I got the impression, quite possibly incorrectly, that the phrase 'simple multiplier' (they didn't say integer multiplier, note) meant a multiple of four, pre-quad-pumping. But again, all I'm going on here is vague phrases and the fact that the 1.4GHz and 1.8GHz parts had such different bus speeds (which makes upgrading even more fun, come to think of it).

      --

      I've had this sig for three days.
  4. More Information by robbyjo · · Score: 5, Informative

    Here you can find more technical details than just the press release.

    Here is the actual spec about the PowerPC 970.

    Ars Technica articles. Apparently, the PPC 970 is just last year's news. The real news is just the cranked-up speed...

    --
    Error 500: Internal sig error
    1. Re:More Information by Anonymous Coward · · Score: 4, Insightful

      One of the most interesting bits of information from the above IBM pages: In addition to its support of new 64-bit solutions, the 970 retains full native support for 32-bit applications. This not only protects 32-bit software investments, but provides these 32-bit applications with the same high-performance levels that it extends to 64-bit uses. This native, nonemulated, 32-bit support is not limited to application code, which runs unmodified. 32-bit operating systems with minor updates can also take advantage of the PowerPC 970's outstanding performance.

  5. Digital Lifestyle by anaesthetica · · Score: 5, Funny

    "It is ideal for very computing intensive applications, for example in the area of simulation like meteorology or geological calculations."

    Along with the rollout of the 970 chip, Apple will introduce two new insanely great iLife Apps: iWeather and iEarth. Now you can calculate weather patterns in your neighborhood and export the results to iMovie! Also, use iEarth's predictive powers in landscaping your front yard, planning your garden, and preventing cracks in your house's foundation.

    Perfect for your digital lifestyle.

    Eat that, Microsoft!

    1. Re:Digital Lifestyle by questionlp · · Score: 5, Funny
      Okay... I've got karma to burn...

      Microsoft, after several delays, releases Hailstorm XP and Terra XP for their latest operating system, Longhorn. The release announcement was done with Steve Ballmer running around the stage at TechEd 2004 screaming, "Call me daddy! I own the Earth!" Later, Bill Gates corrects Ballmer by saying, "Sorry Steve, I own the Earth!" Reports have been coming in that Scott McNealy of Sun, Larry Ellison of Oracle, and Richard Stallman of FSF all huddled up and crying.

      Unfortunately, shortly thereafter, Earth blue-screened and permanently enabled copy-protection on every living person until each person forks over their soul along with $5000 per year for life support.

  6. Hopefully by cosmo7 · · Score: 5, Funny

    ~Perhaps this will lead to some sort of debate regarding the virtues of Macs compared with PCs, something so rarely discussed on SlashDot.

  7. you gotta wonder... by Petrox · · Score: 5, Insightful

    how many people have been holding off (or switching to other platforms) on a new Apple computer purchase for these new chips. I'm sure Apple is chomping at the bit waiting for these chips to be mass produced so that they can get them into Powermacs (and hopefully Powerbooks too), like, yesterday.

    The POWERLite series (which is basically what the 970 is) is a great alternative to x86 for Apple for quite a few years ahead. Not only does IBM have an incentive to keep producing these chips at ever-greater clock speeds (something that Motorola with the G4 doesn't seem to have a great deal of interest in doing) because IBM actually uses these in their Blade servers, but it sets up a nice roadmap for successive generations of chips (the POWER5 is just around the corner, with a Power5Lite a la PowerPC 980 coming shortly thereafter? Such a chip is probably only a year and a half off and, running MacOSX, would rocksock).

    Yum.

    --
    sig my booty, check my website
    1. Re:you gotta wonder... by BWJones · · Score: 4, Insightful

      how many people have been holding off (or switching to other platforms) on a new Apple computer purchase for these new chips. I'm sure Apple is chomping at the bit waiting for these chips to be mass produced so that they can get them into Powermacs (and hopefully Powerbooks too), like, yesterday.

      Well, for scientific users the debate about which platform to use has *significantly* been mitigated by the presence of a true UNIX with OS X allowing for the easy porting and running of code already written for other *nix distros. I personally have replaced three machines including an older Mac, a Windows box and an SGI with a single dual G4 with a sweet Cinema Display.

      Now, could I use more power? Absolutely. Code that is optimized for Altivec is screaming fast. Faster than just about any other platform I have used in fact. However, code not optimized for Altivec gets whomped on by the Wintel platform right now and I would like to see some of the delta in performance go away.

      All of that said, OS X is one impressive OS. The best OS out there for the general audience and for a number of specialized audiences as well. It can only get better and is awaiting fast CPU's with fast bus speeds.

      I suppose it also might be argued that OS X has matured faster as a result of the lagging performance of the G4 chips in that Apple has had to optimize lots of code to get things running fast, whereas Microsoft tends to rely on fast boxes to get through code bloat. Just look at Safari vs. IE as an example of this.

      --
      Visit Jonesblog and say hello.
  8. PC == Personal Computer || Wintel architecture? by Anonymous Coward · · Score: 5, Funny

    /* strcmp() compares contents; == on C strings only compares pointers */
    if (strcmp(PC, "Personal Computer") == 0)
        printf("Why do we say Mac vs PC?\n");
    else if (strcmp(PC, "Wintel architecture") == 0)
        printf("Why confuse people with something called 'PowerPC'?\n");
    else
        printf("WTF?\n");

  9. Easy by sydlexic · · Score: 5, Funny

    I wonder how they managed to up the clock so dramatically?

    Xeon + hobby paint.

    1. Re:Easy by Tumbleweed · · Score: 5, Funny

      Xeon + hobby paint.

      No way, man, VTEC stickers! :)

  10. Comment removed by account_deleted · · Score: 5, Insightful

    Comment removed based on user account deletion

  11. AltiVec confirmed by obi · · Score: 5, Insightful

    Interesting: this PR release seems to confirm the planned extensions are in fact, Altivec. I haven't followed it too closely, but I thought this wasn't confirmed yet.

    Guess that makes it clear this is Apple's next chip.

  12. Explanation by TWX_the_Linux_Zealot · · Score: 5, Informative

    "First of all, what is the processor that Apple using now? Isn't it some sort of PowerPC already? I see this one supports Altivec and I know that G3 and G4 Apple computers have the same instruction sets. Is this just another implementation, or is G3 and G4 relatives of this new processor?"

    Apple does currently use a PowerPC processor in their computers. They have for the past eight years or so. Currently they're using the "750" edition, a'la G3 and G4, which are supplied by both IBM and Motorola.

    "Second: what operating system does the IBM PowerPC run?"

    The IBM machines with these series of microprocessors are things like the later generation AS/400s and RS/6000's. There are also some workstation machines (both badged as such and badged differently) with IBM PowerPCs in them. AS/400s use OS/400. RS/6000s can run many different OSes, including Linux and AIX.

    "I suspect that the article is just confusing and processor itself is not made by IBM. Right??"

    Wrong, at least on who makes the microprocessor. Motorola hasn't been doing so well lately, and even early on they had to deal with IBM to meet quota. IBM's hand in the PowerPC line is visible in Macintosh 5200's, which were common schoolroom computers that are starting to be end-of-lifed. They're dating back to August 1996 or so.

    --

    IBM had PL/1, with syntax worse than JOSS,
    And everywhere the language went, it was a total loss...
  13. misinformation by Anonymous Coward · · Score: 4, Funny

    Here's some:

    - The new chip has a 54 stage pipeline, thus making it as effective as a current 700 MHz G4.

    - The chip tested eliminated all ability for cache, thus allowing the speedup in clock but making it slower than all current G4s available in Apple computers.

    - It is being developed as PowerPC but will be transitioned into x86.

    - It will not support multiprocessing and MP applications will have to be done through a hackneyed clustering.

    - This chip will help to propel Apple to 20% market share. (I'm a shareholder.)

    - When worked hard, the chip gives off an odor vaguely reminiscent of shrimp flavored chips.

    - The 970 is slightly faster than a Porsche 944.

    Please feel free to add your own misinformation because there's not all that much real information to be discussed, anyway.

    1. Re:misinformation by joe_bruin · · Score: 5, Funny

      the 970 achieves 64bit performance by having 4 on-die 16bit 68040 cpu's and doing hardware instruction translation (in realtime) from ppc to 68000.

      in a technology leap, this cpu bypasses intel's hyperthreading technology and proceeds directly to 'ludicrous threading'. this technology allows a thread to finish a task before it was even created.

      the 970 incorporates hardware acceleration for microsoft's windows media drm technology. Windows Media Player 9 Series(r): If You Struggle It Only Hurts More(tm).

      unlike endothermic cpu's commonly manufactured by intel and perfected by amd, the ppc 970 uses exothermic cmos technology. it therefore requires a constant heat source to avoid freezing.

      these chips use ibm's patented plutonium-on-silicon manufacturing process, and as such require a license from the nuclear regulatory commission to own.

  14. Re:please explain by MikeMo · · Score: 4, Informative

    The 970 has the same instruction set (99%) as the G4, but it also has a very, very different internal architecture that should make it quite a bit faster than the G4 at the same clock rate. It's actually a scaled-down version of the Power4 chip, the CPU in a lot of IBM's much larger systems. The Power family is the root of the PowerPC chip, which was actually created by IBM/Apple/Motorola to simply use the same instruction set.

    The IBM Power4 runs many of IBM's OS's.

  15. Re:please explain by binaryDigit · · Score: 4, Informative

    First of all, what is the processor that Apple using now? Isn't it some sort of PowerPC already? I see this one supports Altivec and I know that G3 and G4 Apple computers have the same instruction sets. Is this just another implementation, or is G3 and G4 relatives of this new processor?

    Apple currently uses the G4 and G3 family. The G4 has AltiVec, G3 does not. G4/G3 are product names, whereas 970 is more like a model number. They're all related in that they implement the PowerPC ISA (Instruction Set Architecture).

    Second: what operating system does the IBM PowerPC run?

    Depends on who is selling the machine the chip is in. Apple sells OS9 and OSX. IBM has AIX. And of course there's Linux and BSD. These are the most common.

    I suspect that the article is just confusing and processor itself is not made by IBM. Right??

    Nope, IBM does manufacture the 970. IBM also makes G3's. AFAIK Motorola is the only one making G4's right now (could be wrong here, could be that IBM is cranking some G4's as well). Also note that both Motorola and IBM sell other variations of the PowerPC (most well known is the PPC that powers the Nintendo GameCube).

  16. wiggy by DemiKnute · · Score: 5, Insightful

    Whodathunk that one day we'd be reading a story titled "Apple: ..." with an IBM icon? Maybe I'm getting old, but I think it's kinda cool.

    --
    .
  17. From the Specs... by aSiTiC · · Score: 5, Informative

    From reading the specs it says:

    9 Fetch, Decode Stages
    5-13 OoO Execute Stages
    2-3 Dispatch, Commit

    So a total of 16-25 pipelined stages. I also notice that the longest (25) is for the Alti-Vec engine. This is very comparable to Pentium 4 which has 26 pipelined stages, although Pentium 4 does not have a vector engine.

  18. Re:Let's see some FAB speed scores (specs here) by writertype · · Score: 5, Informative
    Well, hauling out the report from Microprocessor Forum it looks like:
    The core, as defined, contains 64 Kbytes of instruction cache, 32 Kbytes of data cache, and 512 Kbytes of 8-way set associative level 2 cache. Unlike the Power4, the core does not apparently contain an onboard cache controller to enable the use of off-chip L3 cache.

    The front-side bus electrically runs at 450 MHz, double-clocked to an effective rate of 900 MHz, generating a peak bandwidth of 7.2 Gbytes/s, or 6.4 Gbytes/s of usable bandwidth after transaction overhead is taken into account, Sandon said. Five instructions can be issued and acted upon at any one time, while a total of 200 instructions can be "in flight" at any time, taking into account instructions that are stored in queues.

    Performance-wise, IBM believes the chip can record a benchmark of 932 on SPECint2000 and a score of 1051 on SPECfp2000, both at 1.8 GHz. Peak SIMD GFLOPS should be about 14.4, Sandon said. Using Dhrystone MIPS, the chip should output a score of 5,220, or 2.9 DMIPS/MHz. IBM expects the chip should test 18 million RC5 keys per second.
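
    The bus figures in the report hang together as follows. This sketch uses only the quoted numbers; the 8 bytes per transfer is what they imply when divided out, not something the report states, and the function names are illustrative.

```python
def effective_rate_mhz(electrical_mhz, transfers_per_clock):
    """Effective transfer rate of a multi-pumped bus."""
    return electrical_mhz * transfers_per_clock


def implied_bytes_per_transfer(peak_gbytes_s, rate_mhz):
    """Bytes moved per transfer, implied by a peak bandwidth figure."""
    # GB/s * 1000 gives MB/s; dividing by millions of transfers/s leaves bytes.
    return peak_gbytes_s * 1000 / rate_mhz


def usable_fraction(usable_gbytes_s, peak_gbytes_s):
    """Share of peak bandwidth left after transaction overhead."""
    return usable_gbytes_s / peak_gbytes_s


rate = effective_rate_mhz(450, 2)
print(rate)
print(round(implied_bytes_per_transfer(7.2, rate), 2))
print(round(usable_fraction(6.4, 7.2), 3))
```

    So the quoted overhead costs roughly 11% of peak bandwidth.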

  19. Re:Help by TheRaven64 · · Score: 4, Funny

    Didn't you post this to the last article that mentioned Apple? If you hate your Mac so much, why is it still on your desk? And why do you keep copying this 19MB file around anyway? Your disk must be getting pretty full of copies of the same file by now...

    --
    I am TheRaven on Soylent News
  20. Re:quick question by TheRaven64 · · Score: 5, Insightful

    will laptops be feasible?

    These chips are targeted at blades. Blades require:

    1. Low power consumption
    2. Low heat dissipation

    Laptops, on the other hand, require:

    1. Low power consumption
    2. Low heat dissipation

    Draw your own conclusions

    --
    I am TheRaven on Soylent News
  21. Re:Chip speed won't save Apple by damiam · · Score: 5, Funny
    They're over and done with, and have been, for nearly half a decade now.

    And they will continue to be over and done with for several more decades, while still turning out incredible computers.

    --
    It's hard to be religious when certain people are never incinerated by bolts of lightning.
  22. Re:Let's see some FAB speed scores (specs here) by nosferatu-man · · Score: 4, Informative

    For comparison's sake, the P4 Xeon @ 1.8ghz pulls 703/717 (int/fp) on SPEC CPU2000.

    Assuming a linear scaling in SPEC performance, we can look forward to a 2.5ghz 970 scoring about 1294/1460, which is pretty respectable. Not a world beater (especially for 2H03), but a far cry from the abominable performance of the current G4.

    'jfb

    --
    To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
  23. No by Galahad2 · · Score: 4, Insightful

    The 2.5GHz number isn't the same as Intel talking about 5GHz P4s. IBM means that they're going to sell 2.5GHz Blade servers. The reason that Intel talks about their insane GHz processors is to impress consumers into buying Intel. People in the market for mid-range Blade servers couldn't care less about what IBM can do in one in a million chips, and they would likely be annoyed if IBM misrepresented it in that way. If IBM can't manufacture the chips in quantity (I'm not aware if they're manufacturing any 970's in mass yet), they will be able to shortly, certainly before the release of the chip.

  24. Re:Let's see some FAB speed scores (specs here) by nosferatu-man · · Score: 4, Interesting

    Fair enough. Right now, the fastest processor in the world is the Pentium 4 3.06ghz: 1130/1103 (int/fp). For pure floating-point horses, it's the Itanic 2 743/1427 (int/fp).

    So a 2.5ghz 970 would be close in performance to both of today's fastest shipping processors. It's likely that the P4 and Itanic will be 15-20% faster in six months, so IBM will still be lagging in the performance hunt. However, it's striking how much closer to the peak performers this chip will move IBM -- and, by extension, Apple.

    'jfb

    --
    To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
  25. Reality check by Rui+del-Negro · · Score: 4, Interesting

    At the same clock speed, and for short sequences of instructions, a Z80 can beat a P4. The problem is... they don't make them at the same clock speed.

    It's irrelevant how many times per second the chip's clock says "tic-tac", what matters is how fast real chips can get real jobs done. For real-world purposes, you can compare the best (ie, the fastest chips) or the most valuable (ie, the ones with the best speed/price ratio).

    So you see, Mr. Anonymous Coward, comparing the performance "per clock cycle" is irrelevant. It's like comparing the performance "per instruction length", or "per transistor count". It might be interesting from a theoretical point of view, but if a chip that does a lot of work per cycle cannot do more than a couple of cycles per second, it's still a terribly slow chip. The P4 was designed to do less work per cycle, but work at higher frequencies. The Athlon, on the other hand, does more work per cycle but cannot reach such high frequencies. In the end, they're more or less matched. So, in that situation, which one do you buy? Perhaps you buy the one with better "performance per clock cycle". I buy the one that's cheaper (funnily enough, in this case they would be the same).

    I thought Macs were competitive with PCs. Or are you saying that anyone who buys a Mac is totally clueless? It all depends on the market you're talking about. When this chip is finally released, PC processors will be twice as fast as they are now, and will probably cost half what they cost now. Anyone buying a Mac for raw number-crunching is an idiot, just as anyone using Windows for a firewall or a quad Xeon for an office machine is an idiot. It doesn't matter if something is faster or slower, as long as it's fast enough.

    To use a car metaphor (that most people seem to understand), not everyone needs or wants to drive a Lamborghini. It's expensive, it's hard to park, it's hard to drive, it's cramped and it drinks like a fish. Most people are better off with a "normal" car, that's fast enough and powerful enough for them, is easy to drive, and has room for the kids and the dog.

    Having said that, if you spot someone selling a metallic-gray Lamborghini Diablo Roadster (convertible) for less than 15K, let me know, will you?

    RMN
    ~~~

  26. Estimated Scores of 2.5GHz Chip by Galahad2 · · Score: 5, Informative

    Assuming the same bus speed (which is impossible, so take these numbers to be within, say, one hundred points of reality) and linear performance progression, the 2.5GHz chip should have:

    SPECint2000 =
    937 / 1.8 = 520.5 points/GHz * 2.5
    Estimated Score ~= 1300
    Average P4@3.0GHz score ~= 1080 (the 970 = 20% faster)

    SPECfp2000 =
    1051 / 1.8 = 583.9 points/GHz * 2.5
    Estimated Score ~= 1460
    Average P4@3.0GHz score ~= 1100 (the 970 = 33% faster)

    RC5 =
    18 / 1.8 = 10 * 2.5
    Estimated Score ~= 25M keys/sec
    Average P4@3.0GHz score ~= 4.3M keys/sec (the 970 = 481% faster, or about 5.8x the rate)

    Take these numbers with a grain of salt, but they're somewhat interesting. I like the RC5 score, especially. ;)
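
    The same estimate as a few lines of code, for anyone who wants to plug in other clock speeds. It makes the same assumption as the post (scores scale linearly with clock, which they won't exactly), and the P4 figures are the rough averages quoted above. Function names are made up.

```python
def scale_linearly(score, from_ghz, to_ghz):
    """Project a benchmark score to another clock, assuming linear scaling."""
    return score * to_ghz / from_ghz


def percent_faster(a, b):
    """How much faster a is than b, in percent (a ratio of 2x is 100% faster)."""
    return (a / b - 1) * 100


# (score at 1.8 GHz, rough P4 @ 3.0 GHz score) as quoted in the post.
quoted = {
    "SPECint2000": (937, 1080),
    "SPECfp2000": (1051, 1100),
    "RC5 Mkeys/s": (18, 4.3),
}

for name, (score_at_1_8, p4_score) in quoted.items():
    estimate = scale_linearly(score_at_1_8, 1.8, 2.5)
    print(f"{name}: ~{estimate:.0f}, "
          f"{percent_faster(estimate, p4_score):.0f}% faster than the P4 figure")
```

    Keep "times as fast" and "percent faster" apart when reading the output: 25M vs 4.3M keys/sec is about 5.8 times the rate, which is roughly 481% faster.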

  27. Re:x86 does have vector support by Dominic_Mazzoni · · Score: 5, Informative


    Yeah you're right I didn't account for MMX and SSE.

    However there is little comparison.

    Alti-Vec
    # 32 separate Registers
    # 128 bits per register
    # No interference with FP registers
    # no context or mode switching
    # max throughput: 8 Flops / cycle

    MMX/SSE
    # 8 MMX registers shared with the FPU, 8 for SSE
    # 64 bits per mmx register, 128 bits per xmm register
    # MMX stalls the FP registers
    # context switching required for MMX
    # max throughput: 2 Flops / cycle

    When you are playing a 3D game do you really want your FPU stalled for vector calculations?


    To be fair, you could program your 3D game to do all FPU calculations in SSE. gcc has an option to do this automatically now. And SSE2 is one step ahead of AltiVec in one regard - it supports a few double-precision operations.

    But aside from those two nitpicks, I agree completely. I've hand-optimized code for both Pentium/SSE and G4/AltiVec and there's no comparison: SSE provides a small performance boost for a lot of work, while AltiVec provides a large performance boost for a little bit of work. AltiVec has very fancy shift, rotate, and shuffle instructions that are completely lacking in SSE. These are useful for more than just RC5 - they're totally necessary to vectorize many more complicated algorithms without the overhead of putting the data in the right place eating up any potential speed gains.

    That's why the 970 in a Mac will easily beat the P4 in a number of tests: Apple has optimized hundreds of system calls to use AltiVec already, so many programs get the speed gain automatically.
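
    One way to read the "8 Flops / cycle" figure quoted for AltiVec, as a toy model in plain Python rather than real vector code. It assumes the peak comes from issuing one four-lane single-precision multiply-add per cycle; the thread doesn't spell that out, so treat it as an assumption. `vec_madd` here is just a list-based stand-in named after the AltiVec operation.

```python
REGISTER_BITS = 128   # per the AltiVec list above
FLOAT_BITS = 32       # single-precision float
LANES = REGISTER_BITS // FLOAT_BITS


def vec_madd(a, b, c):
    """Lane-wise a*b + c over 4-float 'registers' (plain lists in this sketch)."""
    if not (len(a) == len(b) == len(c) == LANES):
        raise ValueError(f"each operand needs exactly {LANES} lanes")
    return [x * y + z for x, y, z in zip(a, b, c)]


def flops_per_madd(lanes=LANES):
    """One multiply plus one add per lane."""
    return lanes * 2


print(LANES)
print(vec_madd([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0]))
print(flops_per_madd())
```

    Under that assumption, a single vector instruction accounts for all eight operations, which is why code has to be written (or compiled) to use the vector unit before any of that peak shows up.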