The Quest for More Processing Power
Hack Jandy writes "AnandTech has a very thorough, but not overly technical, article detailing CPU scaling over the last decade or so. The author goes into specific details on how CPUs have overcome limitations of die size, instruction size and power to design the next generation of chips. Part I, published today, talks specifically about the limitations of multiple cores and multiple threads on processors."
the quantum computer!! Until then we'll have to suck it up with these Si things.
That's what's been happening the last 10-15 years. Where are the indications that "time to market" and "sloppy programming" will suddenly vanish?
Run old software.
It's only new software that's sucking up all the extra processing power.
Remember back with really sluggish 33 MHz 486s etc. (and a lot lower) and thinking of the ultimate computer as being a whole 50 MHz?
Well, now you've got a computer that's over 10 times faster, with practically infinite capacity.
Fire up that old operating system and run your original software; you will be in heaven!
liqbase
Didn't the PowerPC have something approaching this?
I remember the old Motorola 68000 range having 16 32-bit registers for general coding, and one of the prime benefits of the PPC was its vastly greater register capacity.
I stopped coding assembler when I moved to x86. What a horrible kludge of a stack-biased platform it is.
liqbase
What kind of algorithm are you imagining would benefit from 256 fields of non-vectorized data?
Of course, those registers could be used in larger things for everything that's worthy of a local variable, but as soon as you run into a stack operation you'll either only want to push a subset of the registers to the stack, or face a harder blow of memory access times by making each function call a 2048 byte write to memory.
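To put a number on that: a minimal sketch of the stack traffic per call, assuming 64-bit (8-byte) registers, which is what makes 256 registers come out to 2048 bytes. The function name and the "6 live registers" figure are made up for illustration.

```python
REGISTER_BYTES = 8  # assumption: 64-bit registers, so 256 * 8 = 2048 bytes


def spill_bytes(registers_saved, register_bytes=REGISTER_BYTES):
    """Bytes written to the stack when this many registers are saved on a call."""
    return registers_saved * register_bytes


# Saving the whole hypothetical 256-entry register file on every call.
full_save = spill_bytes(256)

# Saving only the registers that are actually live (6 is an illustrative number).
subset_save = spill_bytes(6)

print(full_save)    # 2048
print(subset_save)  # 48
```

The gap between the two numbers is the whole argument for pushing only a subset.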
Explicit encoding of parallelism, hints to branch prediction, and similar stuff, seems far more appropriate.
Again, few single functions in an imperative language have 256 separate variables, without involving arrays of data. Unless the register file is addressable by index from another register (basically turning it into a very small addressed memory, which is what you try to avoid with registers), you have little use for 256 of them. Take, for example, a trivial string iteration algorithm: most of those registers would be completely useless. The same holds true for common graph algorithms.
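As a rough illustration of how few names a trivial string iteration needs, here is a sketch that counts the local variables of such a function. Python locals are not registers, so this is only a proxy for register demand; `count_char` is a made-up example function.

```python
def count_char(text, target):
    """Count how many times target occurs in text, one character at a time."""
    count = 0
    for ch in text:
        if ch == target:
            count += 1
    return count


# co_nlocals counts arguments plus local variables: text, target, count, ch.
live_names = count_char.__code__.co_nlocals

print(count_char("register file", "e"))  # 3
print(live_names)                        # 4
```

Four names, against a file of 256.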
http://www.anandtech.com/printarticle.aspx?i=2343
Same article without 90% of the ad-bloat.
Chances are that you aren't often pushing your CPU to capacity. What I'd like to see is a better way to identify bottlenecks in my system. There's no sense pumping more power into a system if it's all going to be throttled by something like a slow hard drive.
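One crude way to check whether a task is CPU-bound or waiting on something else is to compare CPU time against wall-clock time. The sketch below uses only the Python standard library; `cpu_fraction`, `cpu_bound`, and `waiting` are hypothetical names, and the `sleep` call merely stands in for a slow disk.

```python
import time


def cpu_fraction(workload):
    """Return CPU time divided by wall-clock time for one call of workload()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def cpu_bound():
    # Pure computation: the CPU is the limiting resource here.
    sum(i * i for i in range(2_000_000))


def waiting():
    # Stand-in for a slow disk or network: the process just waits.
    time.sleep(0.2)


print(f"cpu_bound: {cpu_fraction(cpu_bound):.2f}")
print(f"waiting:   {cpu_fraction(waiting):.2f}")
```

A fraction near 1 suggests a faster CPU would help; a fraction near 0 suggests the time is going somewhere else, and more processing power won't fix it.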
Socialism: A feeling of discontent and resentment caused by a desire for the possessions or qualities of another.
Smart power, not more power? How unamerican!
TERRORIST!
Pretty Pictures!
The Itanium has a huge register file with, IIRC, even more registers in total. They are not interchangeable, though; the (almost) only point of making them interchangeable would be to keep the total number of registers down while staying flexible for most types of code. As I think it's generally easier to keep them separate for different execution units, that's not very interesting. Also, note that the Itanium currently has a 2-cycle (again, IIRC) register access time! They tried to be visionary, adding a huge register set in addition to some parallelism encoding and the other things I mentioned in the parent, but they traded (what seems to be) far too much to get it.
A huge (defined as MMIX-like, not AMD64-like) register file might be great, but you need selective pushing of registers to the stack to get away with it, unless you or the compiler are performing very aggressive inlining. What's easier, if you're doing assembler -- calling a function and putting a local on the stack, or writing a huge fricking implementation of your main algorithm, taking great care to use different registers in each inlined function?