Clockless Chips
iarkin writes "TechReview is running a very interesting article about clockless chips.
Clockless, or asynchronous, chips run much faster and consume less power than their synchronous equivalents (Intel ran some experiments on these chips back in '97; the results showed that the asynchronous chips were three times faster and consumed only half the power)."
.. otherwise people would've noticed this has been posted before (Sept. 15)
Clockless chips will never take off. How are people supposed to draw incorrect conclusions about which chip is the fastest when there's no MHz/GHz rating?
In other news, AMD abandons all current R&D to work on clockless chips so they can win the clock-speed wars against Intel...
There is no escape from The Muffin.
If there is no clock, how do they know that they are 3 times faster? :-D
Async processing is a very old idea. The problem is that designing the logic for it is a far greater chore than for regular chips. CPU designers are simply not good enough to do it well yet.
"How Sun swerved to avoid Rambus"
http://www.theregister.co.uk/content/3/22279.html
More details on the CPU:
http://www.theregister.co.uk/content/3/22274.html
Sun press release:
Extends UltraSPARC III Chip Family Tree--First Use of Sun-Developed Asynchronous Logic Design in Chip's Memory Interface
At Sun Labs:
feature article
async research home page
I took the clock out of my computer with an X-Acto knife. I immediately noticed an infinite difference in the speed at which it ran.
I also have an asynchronous clock ever since the spring in my wristwatch snapped.
Clockless chips would result, perhaps, in the most interesting (funny?) marketing.
Intel would develop a standard way of indicating performance. Based on something their particular chips are good at. We'll say they release the Pentium Clockless 1000, Pentium Clockless 2000 and Pentium Clockless 3000.
AMD would, if trends indicate anything, market them using performance ratings. Instead of rating performance by the Intel standard, they would have new names to indicate that their processors, in some situations, are faster than their Intel counterparts. They'd probably be called the AMD Athlon Clockless XP 1100+, and so on.
In response, Intel would start releasing worse processors, but with higher numbers. Pentium Clockless II 5000 would be their flagship.
AMD would continue making their processors in the traditional manner, but would adopt a new naming mechanism. AMD Athlon Clockless Performance XP Super Fantastic 6000, maybe.
Repeat ad nauseam.
-NeoTomba
The main problem with async design is the asynchronous part of it. In a typical computer, you have tons of parts that you use interchangeably, and these parts operate at different speeds. How would two devices working at different speeds operate smoothly together? Generally, this is very hard. They can do it, but the devices need to agree on a few things. Async design is highly complicated because in a clockless environment you pretty much have to guarantee something like "I'll do this within 2 equivalent clock cycles," or else have some other kind of signalling negotiation. You can't trigger on a "clock" to do stuff. You have to trigger on an "async" signal.
This is the problem in the large. When you go down to the chip level, there are tons of nightmares. There can be feedback loops causing race conditions that only occur at certain times. There are load problems that can add far more complexity than the equivalent problems in a clocked design. Clocked design makes things a lot simpler, and designing a chip is still extremely difficult.
But I don't think the future is in clockless design; I think it is in "careful clock" design. For example, there are chips smart enough to stop sending the clock to certain parts of the chip when they know those parts will not be used. That saves a lot of power. There are chips which aim to distribute the clock carefully, thus increasing the speed. And remember, almost 50% of the power in a chip is lost due to the wiring!
me.
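For anyone wondering what the "signalling negotiation" in the comment above might look like, here is a minimal sketch of one common scheme, a four-phase request/acknowledge handshake. The comment does not name a specific protocol, so the choice of four-phase is an assumption, and `simulate`, `sender_delay` and `receiver_delay` are hypothetical names invented for this toy model. The delays only model how slow each side is; neither side consults them to decide what to do.

```python
def simulate(values, sender_delay, receiver_delay):
    """Toy four-phase (return-to-zero) handshake between two parts.

    Delays are in arbitrary simulation steps. Each side reacts only to the
    levels of the req/ack wires, never to a shared clock.
    """
    pending = list(values)
    received = []
    req = ack = False      # the two handshake wires
    data = None            # the data wires
    s_wait = r_wait = 0    # steps each side is still busy
    steps = 0
    while pending or req or ack:
        steps += 1
        # Sender side
        if s_wait > 0:
            s_wait -= 1
        elif not req and not ack and pending:
            data = pending.pop(0)   # drive the data, then raise req
            req = True
            s_wait = sender_delay
        elif req and ack:
            req = False             # receiver has latched the data
            s_wait = sender_delay
        # Receiver side
        if r_wait > 0:
            r_wait -= 1
        elif req and not ack:
            received.append(data)   # latch the data, then raise ack
            ack = True
            r_wait = receiver_delay
        elif ack and not req:
            ack = False             # return to idle, ready for the next item
            r_wait = receiver_delay
    return received, steps


if __name__ == "__main__":
    # A fast sender talking to a slow receiver, then the other way round.
    print(simulate([1, 2, 3], sender_delay=0, receiver_delay=5)[0])
    print(simulate([1, 2, 3], sender_delay=7, receiver_delay=0)[0])
```

In this model the data arrives intact and in order whichever side is slower; only the number of steps changes. The cost is that every transfer takes four wire transitions.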
As I understand it, traditional systems use a clock signal to let each stage of the pipeline know when the previous stage has completed. Each stage is designed so that a signal passes through few enough transistors to guarantee that it will be done by the time the next clock signal arrives. Clockless systems instead design the processor such that at each step in the processing, the difference between a partial result and a completed result is self-evident. This requires more work, both in the design of the processor and in terms of transistors, but with the benefit of eliminating the clock (and many associated transistors) and any waiting between when the processor has completed a step and when the clock signal arrives.
Since dealing with the clock signal has become increasingly complex, doing without one is becoming a more reasonable solution.
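One way a result can be "self-evidently" complete is dual-rail encoding, where each bit travels on two wires and "no value yet" is a distinct state. The comment above does not say which technique it has in mind, so treat this as a sketch of one possibility; `encode`, `is_complete` and `decode` are made-up helper names.

```python
# Each bit is a pair of wires (true_rail, false_rail):
#   (0, 0) = no data yet, (1, 0) = logic 1, (0, 1) = logic 0.
# (1, 1) is not a legal code in this scheme.
NULL = (0, 0)


def encode(bit):
    """Return the dual-rail pair for a single bit."""
    return (1, 0) if bit else (0, 1)


def is_complete(word):
    """A word is complete once every bit has exactly one rail high."""
    return all(t != f for t, f in word)


def decode(word):
    """Recover the plain bits; refuse to read a partial result."""
    if not is_complete(word):
        raise ValueError("result is still partial")
    return [t for t, _ in word]


if __name__ == "__main__":
    word = [encode(1), NULL, encode(0)]   # middle bit has not arrived yet
    print(is_complete(word))              # False
    word[1] = encode(1)                   # the slow bit finally settles
    print(is_complete(word), decode(word))
```

The next stage can start as soon as `is_complete` goes true, rather than waiting for a clock edge sized for the worst case. The price is two wires per bit plus the completion-detection logic, which matches the comment's point about extra transistors.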
Then stop reading about it, silly!
Practice random senselessness and act kind of beautiful.
Actually, I bet there would be at least some marketing cachet associated with a "clockless" chip. Remember a decade ago when CD player DACs went from 16 bits to 18 or 20 bits, then suddenly the coolest thing going was a "1 bit" DAC (i.e., a delta modulator)? The buying public will tend to go for whatever marketing decides is trendy.
The reason why asynchronous logic hasn't hit store shelves yet probably has more to do with implementation difficulties than marketing. I was taught synchronous logic design for my EE degree -- it's easier to design something when you know that results in remote parts of the chip are synchronized to the clock. When you looked at a timing plot for a circuit, it was usually pretty easy to debug because some parts of the circuit were clearly taking too long to execute their tasks -- and the solution was equally straightforward: decrease the clock speed. Designing asynchronous circuits is probably much harder, since tentative results can screw things up. Furthermore, it's hard to imagine how some design techniques such as pipelining can work in an asynchronous environment.
Toronto-area transit rider? Rate your ride.
Of course I'm used to things getting published a little late on slashdot ;-)
M0571y H@rml355.
How 'bout: "So fast it can execute an infinite loop in 15 seconds."
Admit nothing, deny everything and make counter-accusations.
There are some compelling reasons:
Though synchronous design has enabled great strides to be taken in the design and performance of computers, there is evidence that it is beginning to hit some fundamental limitations. A circuit can only operate synchronously if all parts of it see the clock at the same time, at least to a reasonable approximation. However, clocks are electrical signals, and when they propagate down wires they are subject to the same delays as other signals. If the delay to a particular part of the circuit takes a significant part of a clock cycle-time, that part of the circuit cannot be viewed as being in step with other parts.
For some time now it has been difficult to sustain the synchronous framework from chip to chip at maximum clock rates. On-chip phase-locked loops help compensate for chip-to-chip tolerances, but above about 50MHz even this isn't enough.
Building the complete CPU on a single chip avoids inter-chip skew, as the highest clock rates are only used for processor-MMU-cache transactions. However, even on a single chip, clock skew is becoming a problem. High-performance processors must dedicate increasing proportions of their silicon area to the clock drivers to achieve acceptable skew, and clearly there is a limit to how much further this proportion can increase. Electrical signals travel on chips at a fraction of the speed of light; as the tracks get thinner, the chips get bigger and the clocks get faster, the skew problem gets worse. Perhaps the clock could be injected optically to avoid the wire delays, but the signals which are issued as a result of the clock still have to propagate along wires in time for the next pulse, so a similar problem remains.
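To put a rough number on "a significant part of a clock cycle-time", the fraction of a cycle eaten by a fixed wire delay can be worked out directly. The 1 ns delay used below is a made-up illustrative figure, not a measurement from any real chip, and `skew_fraction` is a hypothetical helper name.

```python
def skew_fraction(wire_delay_ns, clock_mhz):
    """Fraction of one clock period taken up by a given wire delay."""
    period_ns = 1000.0 / clock_mhz   # e.g. 50 MHz gives a 20 ns period
    return wire_delay_ns / period_ns


if __name__ == "__main__":
    # Same assumed 1 ns delay, three different clock rates.
    for mhz in (50, 500, 2000):
        print(f"{mhz} MHz: {skew_fraction(1.0, mhz):.0%} of a cycle")
```

The same physical delay that is a rounding error at a slow clock becomes a large share of the period as the clock rate climbs, which is the trend the passage describes.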
Even more urgent than the physical limitation of clock distribution is the problem of heat. CMOS is a good technology for low power as gates only dissipate energy when they are switching. Normally this should correspond to the gate doing useful work, but unfortunately in a synchronous circuit this is not always the case. Many gates switch because they are connected to the clock, not because they have new inputs to process. The biggest gate of all is the clock driver, and it must switch all the time to provide the timing reference even if only a small part of the chip has anything useful to do. Often it will switch when none of the chip has anything to do, because stopping and starting a high-speed clock is not easy.
Early CMOS devices were very low power, but as process rules have shrunk CMOS has become faster and denser, and today's high-performance CMOS processors can dissipate 20 or 30 watts. Furthermore there is evidence that the trend towards higher power will continue. Process rules have at least another order of magnitude to shrink, leading directly to two orders of magnitude increase in dissipation for a maximum performance chip. (The power for a given function and performance is reduced by process shrinking, but the smaller capacitances allow the clock rate to increase. A typical function therefore delivers more performance at the same power. However you can get more functions onto a single chip, so the total chip power goes up.) Whilst a reduction in the power supply voltage helps reduce the dissipation (by a factor of 3 for 3 Volt operation and a factor of 6 for 2 Volt operation, relative to a 5 Volt norm in both cases), the end result is still a chip with an increasing thermal problem. Processors which dissipate several hundred watts are clearly no use in battery powered equipment, and even on the desktop they impose difficulties because they require water cooling or similar costly heat-removal technology.
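The quoted factors of 3 and 6 line up with the usual rule of thumb that CMOS switching power goes as the square of the supply voltage (roughly P = C * V^2 * f). The passage does not state that formula, so it is an assumption here, and `dynamic_power_ratio` is a hypothetical helper. A quick check with capacitance and frequency held equal:

```python
def dynamic_power_ratio(v_old, v_new):
    """How many times lower switching power is at v_new than at v_old,
    assuming power scales with the square of the supply voltage."""
    return (v_old / v_new) ** 2


if __name__ == "__main__":
    print(round(dynamic_power_ratio(5.0, 3.0), 2))   # 2.78, i.e. about 3
    print(round(dynamic_power_ratio(5.0, 2.0), 2))   # 6.25, i.e. about 6
```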
As feature sizes reduce and chips encompass more functionality it is likely that the average proportion of the chip which is doing something useful at any time will shrink. Therefore the global clock is becoming increasingly inefficient.