Ars Dissects POWER5, UltraSparc IV, and Efficeon

Good article by The_Ronin · 2003-11-20 08:02 · Score: 5, Interesting

Too bad they focused too much on Power and Transmeta while paying little time on UltraSparc IV and V and ignored Itanium. Needs a little more balance and it would have been a great read.

--

I don't drink because I have to, I drink to stop the voices in my head!

Re:Good article by AKAImBatman · 2003-11-20 08:16 · Score: 5, Interesting

I think it would have been best to have an article devoted to the TransMeta chip, and split the Power5/UltraSparc discussion out into its own article. That way he could have given a great deal more attention to the powerhouse chips and how they're going to change the future. TransMeta's chips are on the level of ARM, not UltraSparc.

--
Javascript + Nintendo DSi = DSiCade
Re:Good article by Anonymous Coward · 2003-11-20 08:30 · Score: 3, Informative

There's a reason they ignored Itanium, it's about upcoming processor technologies. Last I checked there wasn't a new, soon to be released, Itanium that Intel was pushing.

In fact, the current Intel processor roadmap shows the same Itanium 2 processor for the first half of 2004 as it did for the second half of 2003.
Re:Good article by jaberwaki · 2003-11-20 08:39 · Score: 3, Informative

I believe he didn't spend much time on the UltraSparc IV because, quote:

"To get the "hyperthreading" effect of two processors on one chip, Sun stuck two full-blown UltraSparc III cores on a single chip, which is chip-pin compatible with the UltraSparc III."

He assumes the interested reader will already know something about the UltraSparc III. Sun didn't fundamentally change the chip architecture. Also the Itanium architecture is already discussed ad-nauseum in other articles. It wasn't meant to be a balanced overview of all new CPU architectures.
Re:Good article by pmz · 2003-11-20 12:22 · Score: 3, Informative

Sun didn't fundamentally change the chip architecture.

Probably the most significant outcome of the USIV will be 212-CPU Sun Fire 15K servers. That seems to imply something like 5 or 6 CPUs per rack-unit (although it appears the 15K is somewhat bigger than a standard rack).

--
Healthcare article at Kuro5hin

brain fart while reading the article by zontroll · 2003-11-20 08:03 · Score: 2, Funny

I had a brain fart for a second while reading the article:

"This is why the advances that have the most striking impact on the nature and function of the computer are the ones that move data closer to the functional units. A list of such advances might look something like: DRAM, PCI, on-die caches, DDR signaling, and even the Internet"

For a second there, I thought that the list of advances started with DRM, not DRAM, and I almost had a heart attack.

Transmeta is a joke by Anonymous Coward · 2003-11-20 08:04 · Score: 3, Insightful

Since day 1 they have skirted the benchmark issue, always trying to deflect the question.

Just like that article yesterday on their new chip. Did they ever cite a single benchmark? NO.

The basic performance of your CPU product, as measured by industry standard benchmarks, is essential knowledge.

I was under NDA on the previous gen Transmeta stuff. It was amusing how the other OEMs reacted - it was crap, but nobody could say anything in public.

Sun? by Raven42rac · 2003-11-20 08:08 · Score: 3, Interesting

Why the heck did Sun's offering get thrown in there? For variety? The Efficeons look awful nice to people who want less power-hunger from their computing devices. If all you do is word processing and such, why the heck even use an Intel/AMD chip? Less heat, less power, what is not to love? Now the IBM chips have really piqued my interest, I am a huge fan of IBM's chips, especially in Apple computers (I am a proud owner of a 12" Powerbook).

--
I hate sigs.

Re:Sun? by illumin8 · 2003-11-20 09:50 · Score: 4, Insightful

I am a huge fan of IBM's chips, especially in Apple computers (I am a proud owner of a 12" Powerbook).

I don't mean to burst your bubble, but your 12" PowerBook uses a Motorola processor, not an IBM one. I own a 15" PowerBook though and I love it.

That having been said, the IBM PPC 970 or G5 is breathing new life into the PowerMac line and Apple is doing really well because of it. I can't wait until they get it stuffed into a PowerBook.

--
"When the president does it, that means it's not illegal." - Richard M. Nixon
Re:Sun? by Kunta+Kinte · 2003-11-20 11:27 · Score: 2, Insightful

Why the heck did Sun's offering get thrown in there? For variety? The Efficeons look awful nice...
"I don't like or use it so one else does"
Real smart.
Any idea the amount of Sun systems are out there? People who use Sun hardware and software, and *gasp*, like it?! Should we only evaluate chips that currentlydo ok in the slashdot market?

--
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW

One Power 5... by Realistic_Dragon · 2003-11-20 08:09 · Score: 4, Interesting

Will show up as _4_ processors to the OS! (2 cores both doing SMT.)

This means that in a (say) 512 processor box the OS will have to handle 2048 processors efficiently. That's placing a lot of control in the hands of the software designers, and a lot of money in the hands of the companies that license per processor.

On the other hand, UNIX is getting pretty efficnelt at scaling to large systems, perhaps it (and by extension Linux thanks to SGI and IBM) will be able to handle it with no problems. One thread per processor on a desktop system might prove to be quite efficient :o)

--
Beep beep.

Re:One Power 5... by stevesliva · 2003-11-20 08:17 · Score: 3, Interesting

I'm getting a lot of karma mileage from this Power5 MCM review these days. They visited the same Microprocessor Forum that Ars did.

--
Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
Re:One Power 5... by Elwood+P+Dowd · 2003-11-20 08:30 · Score: 2, Insightful

This means that in a (say) 512 processor box the OS will have to handle 2048 processors efficiently. That's placing a lot of control in the hands of the software designers, and a lot of money in the hands of the companies that license per processor.

Fortunately for IBM, they are both the hardware designers and, frequently, the software designers. They can ensure that their big iron will be supported by software.

--

There are no trails. There are no trees out here.
Re:One Power 5... by stevesliva · 2003-11-20 08:31 · Score: 2, Informative

It's four dual-core SMT processor chips, and four L3 cache chips per MCM, actually. I think cache is sexy, but I'm biased.

--
Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
Re:One Power 5... by AKAImBatman · 2003-11-20 08:35 · Score: 2, Insightful

Alright, 4 two-way chips. But does it actually improve anything over individual processors? If I have to yank a board on an UltraSparc, I'm not going to throw away the entire board and all its processors! I'm simply going to replace the bad one and slap the board right back in the system. With IBM's design, I have to throw the whole thing away and get a new block of cement^W^W^W processor chip for my machine.

--
Javascript + Nintendo DSi = DSiCade
Re:One Power 5... by redgren · 2003-11-20 09:20 · Score: 2, Interesting

Reliability is pretty impressive on these designs, considering the complexity. At least on the Power4 (similar design), MTBFs are measured in decades.
Re:One Power 5... by AKAImBatman · 2003-11-20 09:26 · Score: 2, Interesting

Yes, but what are the advantages? IBM, Sun, and HP all make a business out of selling components with very high MTBF. Yet, if I have a 64 processor machine chugging along for years on end, I have a reasonably good chance of seeing a failure. (Particularly when chips come from a bad batch.)

So, IBM is taking away the ability to hot swap individual chips in exchange for... what? That's the big question. If there's some major improvement in the design, say so! Inquiring minds want to know! :-)

--
Javascript + Nintendo DSi = DSiCade
Re:One Power 5... by isaac · 2003-11-20 09:57 · Score: 3, Informative

So, IBM is taking away the ability to hot swap individual chips in exchange for... what? That's the big question. If there's some major improvement in the design, say so! Inquiring minds want to know! :-)
Damn, dude, RTFA if you're that curious!
What is gained is full-speed interconnect between processors within the same module. No "multipliers" - the bus between the cores within the module run at chip speeds. The timings are so tight at 2+ GHz that this is simply impossible to do with individual chips.
-Isaac

--
I am not a lawyer, and this is not legal advice. For Entertainment Purposes Only.

Performance != marketshare by G4from128k · 2003-11-20 08:11 · Score: 4, Insightful

The history of Wintel suggests that top-rated raw CPU performance is not the best predictor of adoption. Compatibility with market-dominating software platforms is a greater determinant of CPU sales. We might hope that advances in compiler design adn flexible cores can help any CPU run x86 code, but there are always the little nts that prevent true compatibility and drive computer buyers toward the dominant platform.

--
Two wrongs don't make a right, but three lefts do.

The "hyperthreading" thing. by Animats · 2003-11-20 08:18 · Score: 3, Interesting

First "Hyperthreading", now "prioritized hyperthreading".

It's amusing seeing this. It reflects mostly that Microsoft has finally managed to ship in volume OSs that can do more than one thing at a time. (Bear in mind that most of Microsoft's installed base is still Windows 95/98/ME. Transitioning the customer base to NT/Win2K/XP has gone much more slowly than planned.)

But Microsoft takes the position that if have multiple CPUs, you have to pay more to run their software. So these strange beasts with multiple decoders sharing ALU resources emerge.

power consumption by bigpat · 2003-11-20 08:22 · Score: 4, Interesting

Wasn't low power consumption the number 1 benefit that transmeta was looking to provide, so that you could get twice the battery life (or soemthing like that) without sacrificing too much performance. Did Transmeta shoot itself in the foot by letting people think that it was going to provide higher performance chips than the competition.

The main selling point of transmeta was always power consumption, so have they lost their edge in that area? If so, then that would be serious for them, but the article doesn't answer that question.

Why only two threads per core? by joib · 2003-11-20 08:50 · Score: 2, Interesting

Seems like the power5 will be able to run only two threads per core, like the pentium 4. For the P4 it is understandable that they want to reduce cost as much as possible, but why be so frugal on a high-end cpu like the power5?

I mean, the MTA supercomputer which pioneered the entire SMT concept, was able to run 128 threads per cpu. Ok, so they had different design constraints as well. Basically, the idea was that the cpu:s didn't have any cache at all thus making them simpler and cheaper. To avoid the performance hit usually associated with this they simply switched to another thread when one thread became blocked waiting for memory access.

Anyway, is there any specific reason why IBM didn't put more than 2, say 8 or 16 threads per cpu on the power5?

Re:Why only two threads per core? by AxelTorvalds · 2003-11-20 09:44 · Score: 2, Informative

I recall an article on Forbes (of all places.. they talk to the right guys though) on the matter comparing the Sun Niagra design goals and IBM's and Intel's. Basically the answer was that it's not clear what kinds of apps benefit from 8, 16, or 32 threads of parallelism. This is a low tech description but there are other bottle necks, you have to have that many "threads" of code that are ready to run to benefit from it or else it's cheaper to context switch.
Subsequently, I don't know how much you've played with Pentium IVs but they don't buy much in most circumstances. We're not talking about doubling performance or anything like that. If one SMT unit get's you a 20% improvment on a p4 in the best cases then what does the 4th unit buy? IBM is just hedging because the technology hasn't shown that it delivers serious punch yet.
Re:Why only two threads per core? by kcm · 2003-11-20 09:59 · Score: 3, Interesting

In other words, you're laying out the basic problems of:

1) Being able to FIND parallelism
2) Being able to take advantage of it:
a) Issuing multiple instructions (limited fetch bandwidth)
b) Executing them in parallel (limited FUs)
c) Committing them to memory / retiring

20% is generous, but that's a limitation of the simplicity of HT with respect to the EV8 / UltraSparc-V scale of SMT implementation, which leans towards a more full-issue design.
Re:Why only two threads per core? by Anonymous Coward · 2003-11-20 16:27 · Score: 3, Insightful

This is a false economy. Just because you have 32 threads to run doesn't mean you would benefit from 32-way SMT. Remember that you don't just need 32 contexts in your CPU, you need enough cache to be able to feed 32 unrelated threads. The reason SMT sometimes slows down a CPU is that the 2 or more threads running concurrently compete for cache space. If you just run a single thread at a time, it has a whole quantum to fill up the cache and use it.

The way this worked on the afforementioned MTA machine is that the processor had 128 contexts and NO cache. Memory latency was 32 cycles, so as long as you had at least 32 compute-bound threads, there were no cycles lost to latency. However, this means that each thread did take longer to run.

aQazaQa
Re:Why only two threads per core? by javiercero · 2003-11-20 20:46 · Score: 2, Insightful

Actually the multithreaded design is an answer to the lack of parallelism, as most deisgns are able to deal at the thread or process level, hence the parallelism is implicit and does not ned to be "found" That is the whole point. You are citing the limitations of superscalar, not SMT designs.

So, despite being lower voltage/MIPS... by csoto · 2003-11-20 08:51 · Score: 5, Interesting

the author suggests that it's not worth "pissing off Intel" to go with Transmeta. Give me a break. Transmeta is the only thing pushing Intel to make Centrino and other lower-wattage chips. They recognize that anybody in the mobile computing/devices world will seriously consider anything that gives their customers increased battery life and less toasty pockets.

--
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom

Re:So, despite being lower voltage/MIPS... by curtlewis · 2003-11-20 08:59 · Score: 3, Insightful

Centrino is not a chip!

it's a package of intel wireless, intel cpu and some other stuff.

memory and processor watts not the same by pz · 2003-11-20 08:53 · Score: 5, Interesting

Multiple times while reviewing the Efficion architecture the article's author suggests that the tradeoff of additional storage required for Transmeta's code-morphing approach will easily balance out the power savings from making a simpler CPU. This belies a deep misunderstanding of power consumption in digital systems, as readily evidences by the fact that modern non-Transmeta processers dissipate multiple tens of Watts of power (often nearly 100W) and a full complement of memory (4G, in modern machines) dissipates a few Watts at most.

Also in the article, the author suggests that processors spend most of their time wating on loads, and then argues that since the code-morphing approach means more instruction fetches, the Efficion processor will be spending disproportionatly more time on loads. Then, after this assertion, he admits that he does not know *where* the translated Efficion code is held. Might it be in one-cycle-accessible L1 cache? That point is conveniently sidestepped. He does not understand under what circumstances the profiling takes place, although he regurgitates the sales pitch nicely. He argues that transistors hold the translated code (trying to argue against the transistors-for-software tradeoff) but then does not realize that transistors in memory do not equate transistors in logic (neither in power, as they are not cycled as frequently, nor in speed characteristics).

In all, I find the author's treatment of the Transmeta architecture sophomoric, and, after finding that section lacking, I left the rest of the article unread. Your mileage may vary.

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.

Re:memory and processor watts not the same by Hannibal_Ars · 2003-11-20 09:53 · Score: 5, Informative

"Multiple times while reviewing the Efficion architecture the article's author suggests that the tradeoff of additional storage required for Transmeta's code-morphing approach will easily balance out the power savings from making a simpler CPU."

I neither suggest nor imply anything this simplistic. In fact, I go to great pains to show how complicated the whole power picture is for Efficeon.

"This belies a deep misunderstanding of power consumption in digital systems, as readily evidences by the fact that modern non-Transmeta processers dissipate multiple tens of Watts of power (often nearly 100W) and a full complement of memory (4G, in modern machines) dissipates a few Watts at most."

Er... you do realize, don't you, that comparing Efficeon to a 100W processor is not only unfair, but it's stupid and I didn't do it anywhere in the article. A more appropriate comparison is Centrino, which approaches Efficeon in MIPS/Watt without any help at all from any kind of CMS software. I think that you might be the one who needs to learn a bit more about digital systems.

"Also in the article, the author suggests that processors spend most of their time wating on loads, and then argues that since the code-morphing approach means more instruction fetches, the Efficion processor will be spending disproportionatly more time on loads. Then, after this assertion, he admits that he does not know *where* the translated Efficion code is held. Might it be in one-cycle-accessible L1 cache? "

No, it is most certainly all not stored in L1. TM claimed that the original CMS software that came with Crusoe took up about 16MB of RAM, and that this was paged in from a flash module on boot. What I'm not 100% certain of are the exact specs for Efficeon, but I've assumed in this article that they're similar. This is a reasonable assumption, especially given the fact that the new version of CMS contains significant enhancements and is unlikely to be smaller. In fact, it's much more likely to be larger than the original 16MB CMS footprint, especially given that DRAM modules have increased in speed and decreased in cost/MB, which gives TM more headroom and flexibility to increase the code size a bit.

"That point is conveniently sidestepped. He does not understand under what circumstances the profiling takes place, although he regurgitates the sales pitch nicely. He argues that transistors hold the translated code (trying to argue against the transistors-for-software tradeoff) but then does not realize that transistors in memory do not equate transistors in logic (neither in power, as they are not cycled as frequently, nor in speed characteristics)."

Of course I know that transistors in memory are not the same as transistors on the CPU. My point though is that they're still not "free" in terms of power draw, and that it also costs power to both page CMS into RAM and to move it from RAM to the L1. And even having pointed that out, I still don't claim that this cancells out all the power saving advantages of TM's approach.

As far as relying on the sales pitch for info on CMS's profiling, well, TM doesn't exactly release the source for CMS, nor do they provide a detailed user manual for it avialable to the public. As their core technology, details about CMS are highly guarded and the only information that either you or I will likely ever have access to about it is whatever they put in the sales pitch. So I, like everyone else, must draw inferences from their presentations and do the best I can.

Anyway, if you don't like the article, that's fine. But being a hater about it just makes you look lame.

--
Senior CPU Editor | Ars Technica | http://arstechnica.com/
Re:memory and processor watts not the same by PastaAnta · 2003-11-20 11:57 · Score: 2, Insightful

Oh my god! Please stop comparing apples and banjos and try to make sense of it!

DDR SDRAM does not "run" at around 400MHz - the frequency of the databus is 400MHz. As you state yourself the power usage is very dependant on the usage pattern and only very few memory cells actualle change state during each write (up to 8 for an 8 bit RAM). I would guess that leakage and discharge of the capacitor cells is a significant factor, which you totally ignore.

In a processor on the other hand, a lot of transistors change state every clock cycle - even during execution of NOPs. Some signals will even change state several times during a clock cycle due to asynchronous races in the logic paths.
Re:memory and processor watts not the same by PastaAnta · 2003-11-20 12:43 · Score: 3, Insightful

First of all I thank you for a great article. You have som interesting views on the Transmeta approach. But like the parent poster I feel you may jump to some conclusions based on assumptions.

It is true, that the CMS has a cost in terms of RAM usage but this does not necessarily translate into extra load latency. As I have understood the clue should be to utilize the fact that in common code you only execute a very little portion of the code most of the time (like 90%/10% or whatever). It should be expected that much can be gained by heavily optimizing these "inner loops", which should translate into reduced load latency as fewer instruction will be executed in total. The execution of the four optimisation runs or JIT compilation should drown in the millions of times these inner loops are executed.

You could say that it is a complete waste of transistors and power usage to have many transistors performing the same optimisation over and over again in the conventional processors. These hardware based optimisation will also never be as efficient as their scope is limited.

There are some interesting perspectives with the Transmeta approach as well. You state that POWER5, UltraSparcIV and Prescott tacle the problem with load latency by using SMT to fill pipeline bubbles from data stalls and thereby increase utilisation of the execution units. This should be possible for Transmeta as well, by upgrading their CMS to emulate two logic processors instead of one.

But you are right! A complete theoretical comparison is impossible - only real world experience will show...

contexts != threads by kcm · 2003-11-20 09:03 · Score: 5, Informative

first, you don't just automatically get a linear increase with the width of the multiple-threading capabilities. it's not like it's free to increase the RF size and/or FUs, etc.

you're also confusing contexts with active threads. the Tera^WCray MTA had 128 contexts available -- so that thread switching is more light-weight, more or less -- but only one could be active at one time.

SMT in the various forms have more than one active thread, which introduces the problem(s) of competing for resources in the issue and retire stages, etc et al.

Efficeon has integrated northbridge by chipace · 2003-11-20 09:31 · Score: 2, Redundant

One detail that they didn't mention was the integrated AGP and DDR memory controller on Efficeon. Blades don't use graphics, so I'm thinking that Efficeon was designed primarily for Japanese laptops.

Efficeon allows for a low chip count design. That could mean a smaller and more reliable laptop design.

Guess what??? by crgrace · 2003-11-20 09:43 · Score: 5, Funny

I actually read the article!!!!!

All my questions were answered so I have nothing to say.

Re:One Power 5... (Just a matter of scale...) by Anonymous Coward · 2003-11-20 09:46 · Score: 3, Insightful

Ok, so you are worried that your parts are no longer accessable.

One of the first computers I built had individual TTL parts (74xx type things) to make the CPU. If I fried on of those, I would just replace that single part and be going again. No need to replace the whole CPU.

I, for one, would never go back to that. Not just the size but the performance and the cost.

It used to be that I would buy 4K-bit RAM chips. Buy 8 of those to make a 8x4K RAM array (4K bytes) and then add a simple address decoder and put it on an S100 bus and you have more RAM for you system. Now I buy 512Meg DIMM modules where you can't (and don't want to) replace the individual chips. (Ok, you could if you have the fancy tools and you could get the chips in question but the cost factors just don't make that worth while)

Systems are getting faster because of the higher integration levels. Taking the off-chip caches (like systems build with 386 and 486 CPUs) and putting that onto the CPU (first in the P-Pro as multi-chip modules and then later, onto the CPU itself) has significantly improved performance. Yes, it has removed ability to replace the cache separate from the CPU but then who really wants to or needs to do that. And at what cost (go back to 100MHz cache memory interfaces? I like the 3GHz clock in my on-chip cache, thank you very much.)

The same is true of multi-cpu systems. As you increase the performance, the communications performance becomes a major bottle neck. First IBM put two CPU cores on one chip. Then Intel did a thing called Hyper-Threading (after they said that dual cores have no value :-) The next step is to somehow connect multiple (4/8/16/whatever) CPUs along with large (multi-meg) caches together using these specialized interconnect technologies.

Imagine the performance gain of having 1GHz+ clocked cache of, say, 256Meg connected to 8 really fast CPU cores. It would be as much of a step forward as going from my TTL based 8-bit CPU to the 6502 single-chip CPU.

I know I would not want to go back... So lets investigate how to move forward.

Code has to be loaded anyway by Effugas · 2003-11-20 11:10 · Score: 2, Insightful

Since the author of this article is lurking here, I thought I'd ask:

You make a rather big deal about Transmeta needing to run all x86 code through a "code morpher" (dynamic recompiler, actually), and come up with a decently large set of conclusions based on it.

What's the big deal? No processor executes raw x86 anymore. Everything translates into an internal microcode that bears little resemblance to the original asm. Of course, normal chips have hardware accelerated microcode translaters, whereas Transmeta must recode in software -- but Transmeta's entire architecture was designed from day one to do that, and concievably they have more context available to do recoding by involving main memory in the process.

And what is it with you neglecting the equivalence of main memory? Yes, transistors are necessary to store the translated program. They're also necessary to store the original one -- the Mozilla client I'm presently tapping away inside sure as hell doesn't fit in L1 on my C3! Outside of a small static penalty on load, and a smaller dynamic penalty from ongoing profiling, you can't blame performance on the fact that software needs to be in RAM. Software always needs to be in RAM.

Don't get me wrong -- Transmeta's a performance dog, and everyone's known that since day one. But I think it's reasonable to say the cause is mostly one of attention -- every man hour they threw into allowing the system to emulate x86 took away from adding pipelines, increasing clock rates, tweaking caches, etc. In other words, yes it's a feat that they got the code to work, but you don't need to blame the feat for the quality of work -- they simply did alot of work nobody else had to waste time on, and fell behind because of it.

Much easier explanation. Might even be true.

Yours Truly,

Dan Kaminsky
DoxPara Research
http://www.doxpara.com

Re:Code has to be loaded anyway by kma · 2003-11-20 15:59 · Score: 3, Informative

Ehh. In my opinion, people overestimate how big a deal x86 architecture complexity is, in part because it flatters their preconception that Intel is evil. ("If only dastardly Intel hadn't been holding the world back with this demon architecture from hell, think how fast CPUs could be now!") While working at VMware, I've gotten to know the x86 architecture on a first name basis. He now lets me call him "Archie."

While Archie is undoubtedly an ugly, drunk screw-up, he's really a droplet in the ocean of effort that goes into a competitive CPU implementation. Yeah, we've got lots of code to deal with him, and he's an ongoing source of work, but not all that much code, nor that much work. If Archie were really such a terrible guy, it wouldn't be possible for Intel and AMD to be eating so many RISC vendors' lunches.

Mike Johnson, the lead x86 designer at AMD, probably put it most succinctly when he said, "The x86 isn't all that complex -- it just doesn't make a lot of sense." It's peculiar all right, but not so peculiar that it can explain Transmeta's failure to be performance competitive. From speaking with Transmetans, I get the strong impression that they got bogged down because making a high performance dynamic translation system is ridiculously hard, rather than, say, because they just couldn't get the growdown segment descriptors right.

Perhaps because by TCaM · 2003-11-20 11:51 · Score: 2, Insightful

unlike the other x86 knockoff manufacturers they have actually attempted something somewhat new and different in their designs. They may not have met with a roaring success marketwise but they certainly did try to attack things from a different angle. The point of the article seems to be comparing the somewhat different aproaches the various cpu makers took in their designs, not how many millions of chips they have sold or billions of dollars they have in the bank.

Sweeping generalizations by PetWolverine · 2003-11-20 12:07 · Score: 2, Interesting

In fact, you could tell the story of the past 15 years of computer evolution -- from the rise of the PC to the rise of the Internet -- in terms of the effects of the amount of time it takes various components -- from a processor all the way out to a networked computer -- to load data.

I like this assessement. Forget about Moore's Law as a measure of our progress; latency and throughput are far more important than processing power.

Computers used to be for processing information; these days, most people use them more for accessing and delivering information. Every new computer I've gotten before my current one has only satisfied me by being faster than the ones that went before, not by actually being fast enough. However, my current machine (dual-1.25GHz Power Mac G4) leaves me with no complaints about speed--while I certainly wouldn't complain if it were a little faster, I never feel like I'm waiting for the computer for an unreasonable amount of time; most of the time, it's waiting for me.

However, when it's not waiting for me, it's waiting for one of its hard drives to spin up and feed it with data, or for some slow server to send it something. I would trade one of my processors for a 2x improvement in either disk or network latency. While these aren't the types of latency directly addressed in the article, I would wager that on the rare occasions when I actually have to wait for some processing to take place, most of that time is spent loading data from memory, not actually processing it.

It's not that processors are fast enough for everybody and we should forget about making them any faster; I'm sure graphics and video professionals, among others, will always have a need for more raw speed. But for most computer users, the continued emphasis on speed is misplaced. If computer manufacturers could transfer just a little bit of their R&D spending from increasing speed to decreasing latency, we'd all be better off.

--
I found the meaning of life the other day, but I had write-only access.

Slashdot Mirror

Ars Dissects POWER5, UltraSparc IV, and Efficeon

40 of 176 comments (clear)