1100 MHz 'Athlon Killer' Due From Intel in December
jeffstar writes "According to this article at The Register, Intel has an 1100 MHz 'Athlon Killer' IA32 chip coming out. Yum, that's the kind of sauce I like." Sounds great. If it comes out - and performs - as promised.
1. Megahertz is a dead end
Processors are already too fast for the rest of the system. For the last decade this has been alleviated by an increasingly complicated system of caches and chipsets. At worst you'll go through 3 levels of processor cache, main memory, disk cache and finally disk, for a total of 6 levels of memory. This could go on indefinitely but will have diminishing returns, unless the architecture of the rest of the computer can catch up and become generally faster. SGI/Cray has done this well.
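To put rough numbers on the diminishing returns, here is a minimal sketch of the usual average-access-time calculation. The hit rates and latencies are made up for illustration, not measurements of any real chip, and `average_access_time` is just a hypothetical helper name.

```python
def average_access_time(levels, backing_latency):
    """Average cost of one memory access, in cycles.

    levels: list of (hit_rate, latency) pairs, nearest cache first.
    backing_latency: cost of the final store that always hits.
    All numbers passed in below are illustrative assumptions.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_rate, latency in levels:
        total += reach * latency      # everyone who gets here pays the lookup
        reach *= (1.0 - hit_rate)     # only the misses go further down
    return total + reach * backing_latency


if __name__ == "__main__":
    memory = 100  # assumed main-memory latency in cycles
    l1, l2, l3 = (0.90, 1), (0.80, 10), (0.50, 30)  # assumed (hit rate, latency)
    for caches in ([], [l1], [l1, l2], [l1, l2, l3]):
        cycles = average_access_time(caches, memory)
        print(len(caches), "cache levels:", round(cycles, 2), "cycles")
```

With these assumed figures the first cache cuts the average from 100 cycles to about 11, the second to about 4, and the third only to about 3.6, which is the shape of the argument: each extra level buys less than the one before it.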
2. Megahertz == Marketing
Ever since the P2, it's been terribly obvious that Intel just develops to satisfy what the clueless majority of consumers want: a higher megahertz number. The P2 made it blatant by being inferior to the older P's when run at equal megahertz. The only benefit was that it would run at higher megahertz.
3. Efficiency
No x86 has been really efficient, in many ways: more gates, more watts, more space, more heat. The unfortunate predominance of x86 is leading to space robots being designed with Pentiums because Intel can push through to get the chips certified. When multiprocessing becomes a necessity as clock speeds hit a dead end, who will be able to afford the power and the large case for cooling that 8-64 P[3-5]'s will need? It's absurd.
Start Running Better Polls
I'm left wondering if this article is going to be any more accurate than one the Register ran earlier this year when they said that the 666MHz Coppermine would appear in late 1999, "clear 12 months before AMD is expected to reach the magical figure". Yeah, right.
HH
Yellow tigers crouched in jungles in her dark eyes.
She's just dressing, goodbye windows, tired starlings.
Branch prediction is the major problem. Sure, predicting one branch may work 90% of the time, but when you start talking about wide machines, all of a sudden you're predicting 2, 3 or 4+ branches at once. Your prediction rate goes way down. Fast.
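A quick back-of-the-envelope version of that, assuming each prediction is independent and equally accurate (real branches are correlated, so treat this as a rough sketch rather than a model of any actual predictor; `all_correct_probability` is a made-up helper name):

```python
def all_correct_probability(per_branch_accuracy, branches_in_flight):
    """Chance that every one of several simultaneous predictions is right.

    Assumes the predictions are independent, which is a simplification.
    """
    return per_branch_accuracy ** branches_in_flight


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        chance = all_correct_probability(0.90, n)
        print(n, "branches predicted at once:", round(chance, 3))
```

At 90% per branch, getting four predictions right at once happens only about 66% of the time (0.9 to the fourth power is 0.6561), so roughly one fetch group in three ends up on a wrong path.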
A student here did a study that showed >50% of the processor cycles were spent recovering from branches. And I don't think the study was on a particularly aggressive machine (though I can check that).
The encouraging thing is, if we can get around branch problems (and that's a huge if), the parallelism is there. But not where the machine can see it. There was a study exploring the limits of ILP in Spec95 (yes, not realistic benchmarks, but it's what was available). If you assume perfect prediction (yes, completely unrealistic, but this was a limit study) and remove the stack pointer (which is often on the critical path of instruction dependencies), you can get parallelism in the hundreds (for integer programs) or thousands (for floating point stuff) of instructions.
But there's a catch. If your instruction window is 10k instructions wide or less (a completely unrealistic size, by the way), the parallelism drops by an order of magnitude or more. The hardware doesn't have enough context to see it. But the compiler does. Think about forking threads on function calls when you can and you'll see where I'm going.
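As a rough illustration of the idea, done by hand here rather than by a compiler: two calls with no data dependence between them can each be forked as a thread and joined afterwards. The function names are hypothetical, and Python threads won't actually speed up CPU-bound work like this; the sketch only shows the fork/join shape.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares(values):
    return sum(v * v for v in values)


def sum_cubes(values):
    return sum(v * v * v for v in values)


def combined(values):
    # Neither call reads the other's result, so both can be forked
    # at the call site and joined only where the results are used.
    with ThreadPoolExecutor(max_workers=2) as pool:
        squares = pool.submit(sum_squares, values)  # fork
        cubes = pool.submit(sum_cubes, values)      # fork
        return squares.result() + cubes.result()    # join


if __name__ == "__main__":
    print(combined([1, 2, 3]))
```

The point is that the independence of the two calls is obvious from the source, where a compiler can see it, even though the matching instructions may sit far further apart than any hardware window reaches.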
Some kind of model like Simultaneous MultiThreading may be needed in the future. Compaq is working hard on this for the Alpha.
What's important to remember is that we've received the biggest speed boosts from the process guys. Cranking the clock and packing in gates (i.e. cache) does much more than adding another pipeline. Remember Moore's Law.
--