Ars Dissects POWER5, UltraSparc IV, and Efficeon

Posted by timothy on Thursday November 20, 2003 @07:58AM from the pirate-noise-plural dept.

Burton Max writes "There's an interesting article here at Ars about the POWER5, UltraSparc IV, and Efficeon CPUs. It's a self-styled "overview of three specific upcoming processors: IBM's POWER5, Sun's UltraSparc IV, and Transmeta's Efficeon." I found the insights into Efficeon (successor to Crusoe) to be particularly good (although it paints a sad picture of Transmeta, methinks)."

Good article by The_Ronin · 2003-11-20 08:02 · Score: 5, Interesting

Too bad they focused so heavily on Power and Transmeta while spending little time on UltraSparc IV and V and ignoring Itanium. With a little more balance it would have been a great read.

--

I don't drink because I have to, I drink to stop the voices in my head!
1. Re:Good article by AKAImBatman · 2003-11-20 08:16 · Score: 5, Interesting
  
  I think it would have been best to have an article devoted to the Transmeta chip, and split the Power5/UltraSparc discussion out into its own article. That way he could have given a great deal more attention to the powerhouse chips and how they're going to change the future. Transmeta's chips are on the level of ARM, not UltraSparc.
  
  --
  Javascript + Nintendo DSi = DSiCade
Sun? by Raven42rac · 2003-11-20 08:08 · Score: 3, Interesting

Why the heck did Sun's offering get thrown in there? For variety? The Efficeons look awfully nice to people who want less power hunger from their computing devices. If all you do is word processing and such, why the heck even use an Intel/AMD chip? Less heat, less power, what is not to love? Now the IBM chips have really piqued my interest; I am a huge fan of IBM's chips, especially in Apple computers (I am a proud owner of a 12" Powerbook).

--
I hate sigs.
One Power 5... by Realistic_Dragon · 2003-11-20 08:09 · Score: 4, Interesting

Will show up as _4_ processors to the OS! (2 cores both doing SMT.)

This means that in a (say) 512 processor box the OS will have to handle 2048 processors efficiently. That's placing a lot of control in the hands of the software designers, and a lot of money in the hands of the companies that license per processor.

On the other hand, UNIX is getting pretty efficient at scaling to large systems; perhaps it (and by extension Linux, thanks to SGI and IBM) will be able to handle it with no problems. One thread per processor on a desktop system might prove to be quite efficient :o)
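
For anyone checking the arithmetic, here is a minimal sketch of the count described above. It assumes the "512 processor box" means 512 POWER5 chips, each with 2 cores and 2 SMT threads per core as stated in the comment; the function name `logical_cpus` is made up for illustration.

```python
def logical_cpus(chips: int, cores_per_chip: int = 2, threads_per_core: int = 2) -> int:
    """Number of logical processors the OS would have to schedule.

    Assumes every chip exposes all of its cores and SMT threads to the OS.
    """
    return chips * cores_per_chip * threads_per_core


print(logical_cpus(1))    # one chip: 2 cores x 2 threads
print(logical_cpus(512))  # the 512-chip box from the comment
```

The same multiplier would apply to any license counted per logical processor rather than per chip, which is the cost concern raised above.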

--
Beep beep.
1. Re:One Power 5... by stevesliva · 2003-11-20 08:17 · Score: 3, Interesting
  
  I'm getting a lot of karma mileage from this Power5 MCM review these days. They visited the same Microprocessor Forum that Ars did.
  
  --
  Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
2. Re:One Power 5... by redgren · 2003-11-20 09:20 · Score: 2, Interesting
  
  Reliability is pretty impressive on these designs, considering the complexity. At least on the Power4 (similar design), MTBFs are measured in decades.
3. Re:One Power 5... by AKAImBatman · 2003-11-20 09:26 · Score: 2, Interesting
  
  Yes, but what are the advantages? IBM, Sun, and HP all make a business out of selling components with very high MTBF. Yet, if I have a 64 processor machine chugging along for years on end, I have a reasonably good chance of seeing a failure. (Particularly when chips come from a bad batch.)
  
  So, IBM is taking away the ability to hot swap individual chips in exchange for... what? That's the big question. If there's some major improvement in the design, say so! Inquiring minds want to know! :-)
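
A rough way to put numbers on the "reasonably good chance of seeing a failure" point is the sketch below. It is not from the article or from any vendor data: it assumes the MTBF figure is per chip, that chips fail independently at a constant rate (an exponential model), and that "decades" means something like 20 years. All three are assumptions, and `p_any_failure` is a hypothetical helper.

```python
import math


def p_any_failure(n_chips: int, years: float, mtbf_years: float) -> float:
    """Probability that at least one of n_chips fails within `years`.

    Assumes independent chips with a constant failure rate of 1/mtbf_years,
    so the whole machine sees failures at a rate of n_chips/mtbf_years.
    """
    return 1.0 - math.exp(-n_chips * years / mtbf_years)


# Assumed figures: 20-year per-chip MTBF, one year of uptime.
for chips in (1, 8, 64):
    print(chips, round(p_any_failure(chips, years=1.0, mtbf_years=20.0), 3))
```

Under these assumptions the chance of at least one failure climbs quickly with chip count, which is why the ability to swap a single chip matters more on a large box than the per-chip MTBF alone suggests.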
  
  --
  Javascript + Nintendo DSi = DSiCade
The "hyperthreading" thing. by Animats · 2003-11-20 08:18 · Score: 3, Interesting

First "Hyperthreading", now "prioritized hyperthreading".
It's amusing seeing this. It reflects mostly that Microsoft has finally managed to ship in volume OSs that can do more than one thing at a time. (Bear in mind that most of Microsoft's installed base is still Windows 95/98/ME. Transitioning the customer base to NT/Win2K/XP has gone much more slowly than planned.)
But Microsoft takes the position that if you have multiple CPUs, you have to pay more to run their software. So these strange beasts with multiple decoders sharing ALU resources emerge.
power consumption by bigpat · 2003-11-20 08:22 · Score: 4, Interesting

Wasn't low power consumption the number 1 benefit that Transmeta was looking to provide, so that you could get twice the battery life (or something like that) without sacrificing too much performance? Did Transmeta shoot itself in the foot by letting people think that it was going to provide higher performance chips than the competition?

The main selling point of Transmeta was always power consumption, so have they lost their edge in that area? If so, then that would be serious for them, but the article doesn't answer that question.
Why only two threads per core? by joib · 2003-11-20 08:50 · Score: 2, Interesting

Seems like the POWER5 will be able to run only two threads per core, like the Pentium 4. For the P4 it is understandable that they want to reduce cost as much as possible, but why be so frugal on a high-end CPU like the POWER5?

I mean, the MTA supercomputer, which pioneered the entire SMT concept, was able to run 128 threads per CPU. OK, so they had different design constraints as well. Basically, the idea was that the CPUs didn't have any cache at all, thus making them simpler and cheaper. To avoid the performance hit usually associated with this, they simply switched to another thread when one thread became blocked waiting for memory access.

Anyway, is there any specific reason why IBM didn't put more than 2, say 8 or 16, threads per CPU on the POWER5?
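
The switch-on-stall idea described above can be sketched with a toy model. This is only an idealized illustration, not how either the MTA or the POWER5 actually schedules threads: it assumes every thread alternates between a fixed burst of compute cycles and a fixed memory stall, that switching threads is free, and that the cycle counts below are made-up numbers.

```python
import math


def utilization(threads: int, compute_cycles: int, stall_cycles: int) -> float:
    """Fraction of cycles the core is busy when it switches threads on a stall.

    Each thread is busy for compute_cycles out of every
    compute_cycles + stall_cycles, and the core cannot exceed 100% busy.
    """
    return min(1.0, threads * compute_cycles / (compute_cycles + stall_cycles))


def threads_to_saturate(compute_cycles: int, stall_cycles: int) -> int:
    """Smallest thread count that keeps the core busy on every cycle."""
    return math.ceil((compute_cycles + stall_cycles) / compute_cycles)


# Hypothetical cacheless core: long stalls on every memory access.
print(threads_to_saturate(compute_cycles=10, stall_cycles=100))
# Hypothetical core whose caches absorb most of the latency.
print(threads_to_saturate(compute_cycles=10, stall_cycles=10))
```

In this model the number of threads worth supporting grows with the stall length a thread sees, so a cacheless design wants many more hardware threads than one with deep caches. Whether that is IBM's actual reasoning is not something the model can answer.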
1. Re:Why only two threads per core? by kcm · 2003-11-20 09:59 · Score: 3, Interesting
  
  In other words, you're laying out the basic problems of:
  
  1) Being able to FIND parallelism
  2) Being able to take advantage of it:
  a) Issuing multiple instructions (limited fetch bandwidth)
  b) Executing them in parallel (limited FUs)
  c) Committing them to memory / retiring
  
  20% is generous, but that's a limitation of the simplicity of HT with respect to the EV8 / UltraSparc-V scale of SMT implementation, which leans towards a more full-issue design.
So, despite being lower voltage/MIPS... by csoto · 2003-11-20 08:51 · Score: 5, Interesting

the author suggests that it's not worth "pissing off Intel" to go with Transmeta. Give me a break. Transmeta is the only thing pushing Intel to make Centrino and other lower-wattage chips. They recognize that anybody in the mobile computing/devices world will seriously consider anything that gives their customers increased battery life and less toasty pockets.

--
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
memory and processor watts not the same by pz · 2003-11-20 08:53 · Score: 5, Interesting

Multiple times while reviewing the Efficeon architecture the article's author suggests that the tradeoff of additional storage required for Transmeta's code-morphing approach will easily balance out the power savings from making a simpler CPU. This betrays a deep misunderstanding of power consumption in digital systems, as readily evidenced by the fact that modern non-Transmeta processors dissipate multiple tens of Watts of power (often nearly 100W) and a full complement of memory (4G, in modern machines) dissipates a few Watts at most.

Also in the article, the author suggests that processors spend most of their time waiting on loads, and then argues that since the code-morphing approach means more instruction fetches, the Efficeon processor will be spending disproportionately more time on loads. Then, after this assertion, he admits that he does not know *where* the translated Efficeon code is held. Might it be in one-cycle-accessible L1 cache? That point is conveniently sidestepped. He does not understand under what circumstances the profiling takes place, although he regurgitates the sales pitch nicely. He argues that transistors hold the translated code (trying to argue against the transistors-for-software tradeoff) but then does not realize that transistors in memory do not equate to transistors in logic (neither in power, as they are not cycled as frequently, nor in speed characteristics).

In all, I find the author's treatment of the Transmeta architecture sophomoric, and, after finding that section lacking, I left the rest of the article unread. Your mileage may vary.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
I need some explanations by Anonymous Coward · 2003-11-20 09:09 · Score: 1, Interesting

Interesting article indeed, yet there is a thing I don't quite understand about ILP (Instruction Level Parallelism):
If the number of decoded instructions is higher, then - the CPU being superscalar - the probability of having all pipelines working grows, which means that ILP's also going up.
Of course the ILP depends on the compiler quality and the program code itself, but having a good parallelism capacity in the CPU is also a key factor.
Sweeping generalizations by PetWolverine · 2003-11-20 12:07 · Score: 2, Interesting

In fact, you could tell the story of the past 15 years of computer evolution -- from the rise of the PC to the rise of the Internet -- in terms of the effects of the amount of time it takes various components -- from a processor all the way out to a networked computer -- to load data.

I like this assessment. Forget about Moore's Law as a measure of our progress; latency and throughput are far more important than processing power.

Computers used to be for processing information; these days, most people use them more for accessing and delivering information. Every new computer I've gotten before my current one has only satisfied me by being faster than the ones that went before, not by actually being fast enough. However, my current machine (dual-1.25GHz Power Mac G4) leaves me with no complaints about speed--while I certainly wouldn't complain if it were a little faster, I never feel like I'm waiting for the computer for an unreasonable amount of time; most of the time, it's waiting for me.

However, when it's not waiting for me, it's waiting for one of its hard drives to spin up and feed it with data, or for some slow server to send it something. I would trade one of my processors for a 2x improvement in either disk or network latency. While these aren't the types of latency directly addressed in the article, I would wager that on the rare occasions when I actually have to wait for some processing to take place, most of that time is spent loading data from memory, not actually processing it.

It's not that processors are fast enough for everybody and we should forget about making them any faster; I'm sure graphics and video professionals, among others, will always have a need for more raw speed. But for most computer users, the continued emphasis on speed is misplaced. If computer manufacturers could transfer just a little bit of their R&D spending from increasing speed to decreasing latency, we'd all be better off.

--
I found the meaning of life the other day, but I had write-only access.