Asynchronous Logic: Ready For It?
prostoalex writes "For a while academia and R&D labs explored the possibilities of asynchronous logic. Now Bernard Cole from Embedded.com tells us that asynchronous logic might receive more acceptance than expected in modern designs. The main advantages, as article states, are 'reduced power consumption, reduced current peaks, and reduced electromagnetic emission', to quote a prominent researcher from Philips Semiconductors. Earlier Bernard Cole wrote a column on self-timed asynchronous logic."
so how long do you think this will take before it's implemented on any sort of a large scale?
Isn't that when your boss gives you several conflicting ideas on how he wants a product to be implemented, all at the same time?
Isn't this the same as having a CPU without a timer? (i.e. no MHz/GHz rating).
Brains use async logic elements. Maybe the only way to achieve good artificial intelligence with practical speeds is with async logic. With a cluster of async nodes you can build a physical simulation of neural nets. Consider having a small array of async nodes simulating parts of a neural net at a time. That would be a lot faster than what would be possible with ordinary sequential processing. Async logic might very well bring large neural net research into practicality.
On the flip side, the millions of simultaneous transitions in synchronous logic beg for a better way, and that may well be asynchronous logic.
The advantage outlined here seems to be independent functionality between different areas of the PC. It would be nice if the components could work independently and time themselves, but is there really a huge loss in sustained synchronous data transfer?
From what I've understood, in most aspects of computing, synchronous data communication is preferable, e.g. for network cards, sound cards, printers, etc. Don't better models support bi-directional synchronous communication?
The problem with asynchronous logic is that even though it might seem faster in theory, you have to deal with the introduction of many new race conditions. Thus, to prevent the race conditions you need to implement many handshaking methods. In the end it really becomes no faster than synchronous logic due to the handshaking. This is especially true these days with 2.5 GHz CPUs.
Isn't this where the idea of digital logic really got started? At least it's how it was taught when I was in school.
We even did some design work in async. Cool stuff. Easy to do, fast as hell...
Never did figure out why it never caught on. Except for the difficulty in being general purpose... it's so easy a job with sync logic. And I guess it does take a certain mindset to follow it.
---- Booth was a patriot ----
Intel recognises clockless as the future, and hence the P4 actually has portions that are designed to be clockless.
Before know-it-alls follow this up with "but it runs at 2.xx GHz", let them please read an article about how much of your chip is oscillating at that immense speed.
As it's said in the EE industry, "oh god imagine if that clock speed was let free on the whole system"
and it doesn't work all that great.
It usually goes like this: little head decides to take some action that big head later decides wasn't such a good thing to do.
Fortunately I've invested in a logic synchronization device, which I like to call "wife". Wife now keeps little head from failing to sync with big head through availability (not use) of tools "alimony", "child support", and "knife" (aka "I'll chop that damn thing off while you sleep!")
But more to the point: while asynchronous logic may appear to offer a simple tradeoff (slower processing time for more efficient battery life), recent advances in microsilic design make the argument for asynchronous components moot. For one thing, while two synchronous ICs take twice the power of one asynchronous IC (not quite, because of the impedance caused by the circuit pathway between two chips, but that's negligible under most circumstances), they will in general arrive at a result twice as quickly as their serial pal. Twice as quick, relatively equal power consumption.
The real reason for the drive towards asynchronicity is to cut down on the costs of an embedded design. Most people don't need their toaster to process the 'Is the bread hot enough' instruction with twice the speed of other people's toasters. But for PDAs (Personal Data Assistants) or computer peripherals I wouldn't accept an asynchronous design unless it was half as much.
Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
These chips are great for battery powered devices, such as pagers, because they don't have to power a clock. Extends the batt life at least 2x. But even if the advantages are superior to clocked chips for larger markets, how do you market something like this to people who want to see "Pentium XXXVIV 1,000,000 Ghz" on the packaging?
It'll be enlightening for people to just go there and read your information in context anyway, plus there are links to papers and stuff. You shoulda posted the link!!
In a way, an asynchronous circuit design already is a parallel computer. An asynchronous machine contains many (largely) independent components that communicate with each other in order to solve computational problems more efficiently by breaking them down into small pieces and working on them in parallel.
In this context, your notions of parallel computing will change greatly. Currently, individual nodes in a cluster are islands of computation, separated by (comparably) vast distances. Messages between nodes take orders of magnitude more time than messages within a node.
When you set out to build a supercomputing cluster in the asynchronous world, ideally the entire cluster would be within a single die. Then the latency between nodes would be reduced to microseconds or nanoseconds, and nodes could split work more effectively. The high-speed buses and complex arbitration schemes required for asynchronous computing will be equally useful for designing massively parallel clusters-on-a-chip.
I'm wondering how asynchronous logic stands up against transient errors induced by a cosmic ray?
On a synchronous circuit, most of the time such a glitch won't do anything, because it won't occur at the same time the clock "rings", so the incorrect transient value will be ignored.
As the "drawing size" of circuits gets smaller and smaller, every circuit must be hardened against radiation, not only circuits which must go into space or on planes.
... where are the design tools?
We all know about the advantages async logic has in many respects over clocked logic. The problem is, the async logic *design* tools are nowhere near as good or as numerous as the tools available for designing clocked logic.
Chicken and egg problem? Maybe, or maybe just another untapped opportunity for those crazy software people...
Ready? Not right now.
Because posting without giving credit to original author is wrong.
JOhn
Campaign for Liberty
There was an article in Scientific American about this just recently...
I'm sure that many /. readers, like me, are wondering if asynchronous chips get faster if you pour liquid nitrogen on them.
Seriously though, does the temperature affect the switching time? Or does the liquid nitrogen trick just prevent meltdown of an overclocked chip?
In synchronous circuits, there are power spikes as most of the gates transition at the clock edge. It's interesting that this issue is becoming a major one. ICs are starting to draw a zillion amps at a few millivolts and dissipate it in a small space while using a clock rate so high that speed of light lag across the chip is an issue. Off-chip filter capacitors are too far from the action, and on-chip filter capacitors take up too much real estate. Just delivering clean DC to all the gates is getting difficult. But async circuitry is not a panacea here. A load that is constant on average doesn't help if there are occasional spikes that cause errors.
One of the designers interviewed writes: "I suspect that if the final solution is asynchronous, it will be driven by a well-defined design methodology and by CA tools that enforce the methodology." That's exactly right. Modern digital design tools prevent the accidental creation of race conditions. For synchronous logic, that's not hard. For async logic, the toolset similarly has to enforce rules that eliminate the possibility of race conditions. This requires some formal way of dealing with these issues.
If only programmers thought that way.
Sounds like my wife.
But seriously, isn't that an oxymoron?
At first, I thought it meant that we take a program, break it up into logic elements and scramble them like an egg. That won't work.
But after reading, I see it means that everything isn't pulsed by the same clock. So, if a circuit of 1,000 transistors only needs 3 picoseconds to do its job, while another 3,000 transistors actually need 5 picoseconds, then the entire 4,000 transistors are turned on for 5 picoseconds. So, 1,000 transistors are needlessly powered for 2 picoseconds.
This adds up when we're talking 4 million transistors and living in the age of the Gigahertz.
Am I ready for asynchronous logic? It doesn't really matter -- it can come along whenever it wants, and I'll come use it when I have some spare cycles.
Josh Woodward
Second real-world definition: When someone else (usually of the opposite sex) answers your question with an accusation that's completely off-topic.
Third real-world definition: Many slashdot posts (sort of including this one :-)
Surely Digital Equipment Corporation's PDP-6 had it in 1963?
Or is this modern "asynchronous" logical some totally different concept?
"How to Do Nothing," kids activities, back in print!
For those who may have missed it (as I did the first time)...the article title itself is a bit playful.
Thats synchronus dislogic
Im not here now... Im out KILLING pepperoni
I've said it before, I'll say it again. Shut up! I don't know who you're trying to impress. Your comments don't add to the discussion. If I want to know about your research (mentioned twice in the past week I believe!) I'll go to your friggin homepage. You seem to post on every 5th story! Looking up that much information on the net to copy and paste into a slashdot comment must take up an enormous amount of time. Do yourself (and us) a favor and use this time to search for a girlfriend or at least go buy a fleshlight (warning, not safe for work or kids.)
Isn't a UART at least partly an asynchronous chip? So you've probably got one in your PC today...
And Chuck Moore's description of an asynchronous Forth chip is available in Google's cache (I don't know why he pulled it from the web site).
Sun has talked quite a bit about async logic in their own designs. I forget if it is in their current generation of chips or not, but they've talked about putting 'islands' of async logic into their chips, with an eventual goal of using it throughout.
The article at embedded.com talks about 'security'. What they really mean here is, for example, those smart access cards in a DirecTV box. They say that with a clockless design it is harder to figure out what is going on. So, it is a DRM monster, they say.
BTW, he goes by different names, usually those with the word "Physics" in it.
Here's another example of his copy and pasting:
This post: http://developers.slashdot.org/comments.pl?sid=42
is copied from this web page:
http://www.intuitor.com/moviephysics/mpmain.html
Take a look for yourself at his post history, the wide range of topics, and supposed knowledge.
Dave
FPGA, Wireless, ASIC, Verilog, VHDL, HW, 10yr exp, Team Lead, Ottawa (More? Email above. slashdotusername=dgmartin98 )
Those of us that have been around the block more than twice know that asynchronous design has been the technology of the future for a long, long time. My personal experience goes back to the mid-seventies, but I'm sure there were asynch he-men doing their thing with vacuum tubes and RTL. :-)
The catch, then as now, is that asynch logic is just plain more difficult for our tiny little human brains to grok. This was true back in the days when humans designed their own logic, and it is even more true now when 99%+ of all logic is designed not by humans, but by logic synthesis software (Synopsys DC and Cadence PKS).
That said, there are always folks out there doing Cool Stuff w/asynch circuits. Hope that Ivan Sutherland's group at Sun Labs survives Sun's recent massive layoffs.
It's not about English class, we don't need an MLA style credit here. A simple URL with a couple of quotes will do. It's not cool to copy-n-paste and not give credit, and it's straight up misrepresentation to do so. Misrepresentation should not be encouraged or tolerated on this forum or any other.
Also, I agree it's rather funny how we got modded down. Do your worst moderators, my Karma is still excellent and these small posts won't hurt it!!
No problem asynchronous logic will be. To program some say difficult but they weak minded people are. Excuse me, I have to post a response to the story on Slashdot about logic asynchronous now.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
That rung a bell with me. When I heard about 'clockless computing' a couple of years ago, one of the first examples was the microprocessor inside of a pager. They wanted to go clockless (which I assume is the same type of processor here, polite corrections invited...) so they could have a lower power processor. The idea? Make pagers last longer on a single battery.
I'd say it worked. However, from that same article, it'll be quite a while before desktop PC's use a processor like that. I should probably go read the article heh.
Aww hell, My SO has been doing this for *Years*, i mean she is the queen of one-sided logic for ages ;-)
p.s. Kylie if your reading, j/k love ya!
~what was that? I dunno, but you've got it's license plate number stamped on your forhead ~*ouch*~
The largest asynchronous project (to my knowledge) is the MiniMIPS, which was developed at Caltech in 1997 and has 1.5 M transistors. It was modelled after the R3000 MIPS architecture.
The best-selling large-scale asynchronous circuit seems to be a microcontroller that Philips developed and used in a pager series.
- How would this change the appearance of code that I can write?
- Would there be any difference at all?
- Would I need an entirely new programming language, replete with syntax to leverage asynchronous logic?
- Are there (sensible) examples of this for me to gawk at?
This really sounds interesting but being just a dumb programmer, I'd be interested in seeing this concept in terms I can understand (if it exists...).
Height: 38U, Weight: 0 Newtons, Eyes: #0000FF, OS: Gray Matter 1.0 (Alpha)
Eh... "Asynchronous" means "without synchronization" (i.e. "without clock"). It has nothing to do with serial vs. parallel operation.
HIBT?
I choose to remain celibate, like my father and his father before him.
Here at Caltech the CS department is into this kind of thing.
They've even built a nearly complete MIPS 3000 compatible processor using async logic.
Seems pretty cool, but I'm waiting for some company to expend the resources to implement a more current processor (such as the PowerPC 970 perhaps) in this fashion.
Computers Without Clocks
In most modern CPUs, all of those occur independently in different units in the pipeline.
But they still do their function once per global clock cycle. After that, they pass their results on to the next stage.
As a result, the clock rate is limited by the longest propagation time across a given pipeline stage. A solution that allows for higher clock speeds is to increase the number of pipeline stages. This means that each stage has to do less. (The P4 one-ups this by having stages that are the equivalent of a NOP just to propagate the signal across the chip. But they're still globally clocked and synchronous.)
The P4 has (I believe) a 20-stage pipeline, or something in that ballpark. The Athlon is sub-10, as are almost all other CPUs. This is why the P4 can achieve such a high clock rate, but its average performance often suffers. (Once you have a 20-stage pipeline, you have to make guesses when branching as to WHICH branch you're going to go on. Mispredict and you have to start over again, paying a clock cycle penalty.)
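To make the "slowest stage sets the clock" point concrete, here is a minimal sketch. The stage delays and the latch overhead are made-up numbers for illustration, and `split_evenly` is an idealised helper (real stages never split this cleanly), so treat it as a toy model rather than a description of any real CPU.

```python
# Toy model of a globally clocked pipeline. All delays are hypothetical, in picoseconds.
LATCH_OVERHEAD_PS = 50  # assumed fixed cost of the register at the end of each stage


def min_clock_period(stage_delays_ps):
    """The global clock can tick no faster than the slowest stage plus latch overhead."""
    return max(stage_delays_ps) + LATCH_OVERHEAD_PS


def split_evenly(stage_delays_ps, factor):
    """Idealised deeper pipeline: cut every stage into `factor` equal sub-stages."""
    return [d / factor for d in stage_delays_ps for _ in range(factor)]


shallow = [400, 700, 500, 650, 450]  # 5 stages
deep = split_evenly(shallow, 4)      # 20 stages

for name, stages in (("shallow", shallow), ("deep", deep)):
    period = min_clock_period(stages)
    # A mispredicted branch costs roughly one pipeline refill.
    print(name, "stages:", len(stages), "period (ps):", period,
          "refill (ps):", len(stages) * period)
```

The deeper pipeline gets a much shorter clock period, but a refill after a misprediction now costs 20 cycles instead of 5, and the per-stage latch overhead is paid 20 times.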
Shorter pipelines can get around the branch misprediction issue by simply dictating that certain instruction orders are invalid. (For example, the MIPS architecture states that the instruction in memory after a branch instruction will always be executed, removing the main pipeline dependency issue in MIPS CPUs.)
With asynch logic, each stage can operate independently. I see a MAJOR boon in ALU performance - Adds/subtracts/etc. take up FAR less propagation time than multiplies/divides - but in synch logic the ALU has to operate at the speed of the slowest instruction.
Most important is the issue of power consumption. CMOS logic consumes almost no power when static (i.e. not changing its state); power consumption is almost exactly a linear function of how often the state changes, i.e. how fast the clock is going. With async logic, if there's no need for a state change (i.e. a portion of the CPU is running idle), almost no power is consumed. It is possible to get some advantages in power consumption simply by changing the clock speed. (E.g. Intel SpeedStep allows you to change between two clock multiplier values dynamically, Transmeta's LongRun gives you FAR more control points and saves even more power, and many Motorola microcontrollers such as the DragonBall series can adjust their clock speed in small steps. One Moto uC can adjust from 32 kHz to 16 MHz with a software command.)
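For reference, the usual first-order textbook approximation behind the "linear in clock speed" claim (this is the standard CMOS switching-power expression, not something taken from the article):

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f
```

Here \alpha is the activity factor (the fraction of the capacitance that actually switches per cycle), C is the total switched capacitance, V_{DD} is the supply voltage and f is the clock frequency. Slowing the clock lowers f; an idle block in an async design corresponds to \alpha going to roughly zero for that block.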
retrorocket.o not found, launch anyway?
Since Windows sometimes seems asynchronous, perhaps it would crash less on such a machine?
You'll notice that Cadence (one of the big EDA software companies) is cooperating with/heavily investing in one of the async hardware companies. It's not an untapped opportunity - They're tapping it as we post. :)
In CMOS logic, power consumption is not related too much to the static state of the chips, i.e. "transistor is on for 5 ps".
It's related to how often the state change occurs.
A good example of where async logic might be useful:
ALU multiply operation takes 20 ps, LOTS of transistors
ALU add/subtract op takes 5 ps, FAR fewer transistors
In current designs, this usually means that add/subtract ops have to run at a clock rate that is slow enough to accommodate that 20 ps multiply
In an async design, the add/subtract instructions can run 4 times as fast. But since the multiply/divide stage is not clocked, those transistors aren't doing anything so overall power usage is less. (The add/subtract stage uses 4x the power it did before, but the mult/div stage was probably using 10x or more the power the add/sub stage was using)
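To put rough numbers on the add-vs-multiply example, here is a minimal sketch. It assumes the 5 ps and 20 ps figures above, a made-up instruction mix, and an optional per-operation handshake cost (`handshake_ps` is a hypothetical parameter, since an async design does pay something for completion signalling).

```python
# Latencies from the example above, in picoseconds.
LATENCY_PS = {"add": 5, "sub": 5, "mul": 20}


def sync_time_ps(ops):
    """Clocked ALU: every op takes one cycle sized for the slowest operation."""
    worst_case = max(LATENCY_PS.values())
    return worst_case * len(ops)


def async_time_ps(ops, handshake_ps=0):
    """Self-timed ALU: each op takes its own latency plus a handshake cost."""
    return sum(LATENCY_PS[op] + handshake_ps for op in ops)


mix = ["add"] * 8 + ["mul"] * 2  # hypothetical instruction mix

print(sync_time_ps(mix))                    # 200
print(async_time_ps(mix))                   # 80
print(async_time_ps(mix, handshake_ps=12))  # 200
```

With free handshakes the async version wins 80 ps to 200 ps on this mix; at 12 ps of handshake per operation the advantage is gone, which is the objection raised elsewhere in the thread.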
Imagine a beowulf cluster of these!
Or, as an asynchronous machine might say...
Iamgien boewulf a culstre of thees!
One of the guys heavily involved in that project is no longer at Caltech, but is a professor at Cornell University now.
:)
http://vlsi.cornell.edu/~rajit/
One of the best (albeit toughest) profs I've ever had. This guy knows his stuff, and is very good at passing the knowledge on.
Happens to be responsible for Cornell's only FreeBSD lab, which the CS students prefer to the CS department's own systems. Many of them continue using the CSL lab long after finishing ECE/CS 314. (Req'd for all ECE and CS majors.)
It's not new, companies like Sun have been pursuing this for years. Here's info on the FLEETzero prototype async chip they were showing off at the ASYNC 2001 conference last year.
From a friend of mine, who should know, because he used to fix it ;)
NORAD used to have an asynch computer as part of the defense system. They also had an old pneumatic mail tube system which passed right over it. The static electricity generated by the mail tubes would f**k up the timing signals. I'm surprised we didn't start WW III.
In the Big Mountain (where the Stargate is
Seymour Cray tried it after the 7600 and before making the Cray-1. He decided that regardless of the performance advantage, people wanted a computer that was KNOWN to work.
In this, like in most other things he did Cray was right
Sent from my ASR33 using ASCII
This is my favorite example of asynchronous logic.
As the rat speeds up or slows down, the chip compensates for it.
Not often that you can play with toy mice and call it research.
Mouse powered Chips, Open source Processors and Lego
The only real use I can see is to ignore clock skew. If the amount of power consumed by the clock has been steadily increasing, I expect that sooner or later the chip will be segmented off into "asynchronous" sections where the clock skew is expected to be unknown (but hopefully constant).
Of course, since the (not so) latest Xilinx FPGAs have a simpler (for the designer) approach (use PLLs to generate corrected clocks), I expect that this is either old hat or wrong.
I agree about the technology of the future part, though I might also put it in the "watch this technology when Moore's law ends" category. If it's featured on Slashdot coming from Sun, it's probably in this category.
Scott
This was back in the times when we didn't have decent enough software to prove the chips would work yet.
The thing is that async can nowadays be designed just as easily as sync. I have a great sync to async converter which improves speed by 30%.
Here is the Sun Labs page for all of this stuff. Please note that Ivan Sutherland has been doing this for years.
If we cannot be free, at least we can be cheap -- FZ
Am I ready for asynchronous logic?
"And like that
Is the earliest example I can think of. Roughly 50 ECL 10000 gate arrays. It was only synchronous at the "edges", like bus interfaces. Circa 1983. I was on the simulation tool design team. Loads of fun on the skew analysis portion of the simulation. You have to account for all the "local" varieties of skew (within a cell, within a quadrant of the chip, and within the chip overall, and more), and the lead and trace generated skew as well.
- Tjp
I am in wallow with my inner money grubbing capitalistic pig. ... Oink!
Dear god people, do you have any idea how impossible asynchronous circuits are to debug?!!?
I spend several hours a day with a hair dryer and go through many cans of freeze spray debugging many many stupid little asynchronous designs that engineers think are "cool" or "sweet". Yeah, and they work on the FPGA on their desk and none other.
Please please please, if you don't want to listen to me then go to http://www.chipcenter.com/pld/pldf030.htm
and read what Xilinx's Director of Applications Engineering has to say.
Quote: "Asynchronous design methods may ruin your project, your career and your health"
I am done now here you go is better than are you done yet???? or am I Missing something really big?
thank God the internet isn't a human right.
About time...I was trying to get my prof to go for this when I was doing my elec eng masters thesis back in 95...grrrrr..
oh well...
I'm studying chip design and my supervisor scoffs at asynchronous logic. I don't have any real input of my own, but his view is that we've been waiting for commercially viable asynchronous designs for as long as cheap fusion, and neither has happened yet despite many loud enthusiasts.
One of the real problems of asynchronous logic is in testing. With synchronous logic your design is partitioned into registers and combinational logic. The combinational stuff can be tested at production by use of every possible test vector, while registers are rather easy to test. Together these two tests virtually guarantee that the state machine works. Do that for every state machine and you're done.
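A minimal sketch of the "every possible test vector" idea for a small combinational block. The full adder here is a stand-in design chosen for illustration, and `exhaustive_mismatches` is a hypothetical helper name; a production tester works on real silicon, not Python functions.

```python
from itertools import product


def full_adder_gates(a, b, cin):
    """Gate-level full adder, playing the role of the design under test."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def full_adder_reference(a, b, cin):
    """Behavioural reference: just add the three bits."""
    total = a + b + cin
    return total & 1, total >> 1


def exhaustive_mismatches(dut, reference, n_inputs):
    """Apply all 2**n_inputs vectors and return the ones where the outputs differ."""
    return [v for v in product((0, 1), repeat=n_inputs)
            if dut(*v) != reference(*v)]


print(exhaustive_mismatches(full_adder_gates, full_adder_reference, 3))  # []
```

This only works because the block has no state: the output depends on the current inputs alone, so 2**n vectors cover everything. The vector count doubles with every extra input, which is why it is done per block of combinational logic between registers.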
Asynchronous state machines, however, have no obvious way to break them down. You have to give them sequences of inputs and check their sequential outputs. Even if you think it's working you can never be sure, and what happens when the temperature changes? Race conditions can result in the state machine breaking under changing temperatures.
Synchronous design is a very mature field. Nowadays you can be sure that a design works before fabrication (well, almost.. =) and then synthesise it into gates that ought to work first go. If they didn't then AMD and Intel would go under pretty soon!
Asynchronous design is hard and my hat goes off to the people who do it for a living. But the same amount of effort would result in far more development using standard techniques. I guess you really have to want to do it.
Yes, synchronous logic has serious issues with clock distribution, but it's still the most commercially viable design technique. The fact that your CPU is fully synchronous is testament to that.
So, which will come first: cheap fusion or reliable asynchronous logic?
In an article in Engineering magazine (it is distributed to all engineers at Kansas State University and probably many others), an Intel representative (I can't recall the name) claimed that anyone who came up with a completely asynchronous x86 solution would have a significant advantage. He is right for several reasons.
1) Power usage and heat are important in many devices. The Via Cyrix chips do reasonably well, even without the power of Intel and AMD designs.
2) Fully asynchronous chips do have the capacity to perform far better than synchronous designs.
The article also discusses various designs for asynchronous chips. A good read, if you can find it. The magazine came out some time towards the end of 2000 (I am thinking November), but not for sure.
As stated in the article, in an asynchronous system, the clocks are divided up on a modular basis
First of all, in a purely asynchronous system, there are no clocks at all. For example, suppose you had a circuit that did this:
x = (a AND b) OR c
Assuming this consists of an AND circuit and an OR circuit, x would be invalid while it waited for (a AND b) to finish. Once (a AND b) finished, x would become a valid result. In a synchronous system, x would be valid in time for the next clock cycle. No other circuit would operate using an invalid value in x because they would be waiting for the clock first before looking at x.
In an asynchronous circuit, other circuits can operate on x whenever they want (no clocks), including any invalid values currently stored in x. Therefore, a handshake signal is usually used to signify when x is valid. When other circuits see this signal, they can then start treating x as if it were valid. You can see how this is difficult to time when each stage is so dependent on the previous one.
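Here is a minimal sketch of why x needs a "valid" signal, using the x = (a AND b) OR c circuit above. The gate delays are made-up unit delays, and the `valid` flag models one common scheme (a request line delayed to match the worst-case logic delay); other handshake styles exist, so treat this as an illustration only.

```python
AND_DELAY = 2  # hypothetical gate delays, in arbitrary time units
OR_DELAY = 1


def simulate(old, new, steps=6):
    """Inputs (a, b, c) switch from `old` to `new` at t = 0.

    Returns a list of (t, x, valid) samples for t = 0 .. steps-1.
    """
    def inputs_at(t):
        return new if t >= 0 else old

    def and_out(t):
        a, b, _ = inputs_at(t - AND_DELAY)
        return a & b

    def x_at(t):
        _, _, c = inputs_at(t - OR_DELAY)
        return and_out(t - OR_DELAY) | c

    settle = AND_DELAY + OR_DELAY  # worst-case path through both gates
    return [(t, x_at(t), t >= settle) for t in range(steps)]


# Old inputs give x = 1 (via c); new inputs also give x = 1 (via a AND b).
for t, x, valid in simulate(old=(0, 0, 1), new=(1, 1, 0)):
    print(t, x, "valid" if valid else "not yet")
```

Even though the old and new answers are both 1, x dips to 0 for two time units because c changes faster than the AND gate can respond. A consumer that looked at x during that window would read garbage; waiting for `valid` avoids it.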
Sometimes, a hybrid approach is taken where asynchronous results are stored in a clocking circuit. This practice is in more common use I believe.
and only the modules that are running need power at all.
As for power saving, asynchronous circuitry is not the same as power saving circuitry. In the ALU, for example, often times it will perform many or all operations on any data handed to it in parallel (i.e. x AND y, x XOR y, at the same time). Then, if the operation asked for x AND y, it will return the value of x AND y and throw away the value for x XOR y. If this is the case, power consumption is not reduced because all circuits are being used regardless.
The AMULET group at the University of Manchester has been doing research and implementation for asynchronous processing (based around an asynchronous ARM design) for many years. They have a bunch of good information available on their projects, and the subject in general.
Kudos to Ferranti, Plessey and the University of Manchester who did a lot of the design work.
- Elimination of clock skew problems: the clock is a timing signal, but it takes a certain amount of time for the clock signal to propagate around the chip, so as the clock frequency goes up, this becomes a huge problem.
- Average-case performance: synchronous circuits must be timed to the worst-performing elements. Asynchronous circuits have dynamic speeds.
- Adaptivity to processing and environmental variations: dynamic speed here again. If temp goes down, circuit speeds up. If supply voltage goes up, speed goes up. Adapts to the fastest possible speed for given conditions.
- Component modularity and reuse: easier interfacing, because difficulties with timing issues are avoided (handshake signals are used instead).
- Lower system power requirements: it takes a lot of power to propagate the clock signal, plus spurious transistor transitions are avoided. (MOSFETs only use considerable power when they change states.)
- Reduced noise: all activity is locked to a single frequency in synchronous designs, so big current spikes cause large amounts of noise. A good analogy is the noise of 50 marching soldiers vs. the noise of 50 people walking at their own pace. The synchronous nature of the soldiers causes the magnitude of the noise to be much greater.
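The marching-soldiers analogy can be sketched in a few lines. This is a toy model with made-up numbers: 50 gates, each drawing one unit of current in the time slot where it switches, and a fixed random seed so the run is repeatable. It says nothing about real current waveforms.

```python
import random
from collections import Counter

N_GATES = 50
SLOTS = 10  # hypothetical number of time slots in one "cycle"


def peak_current(switch_times):
    """Peak demand = the largest number of gates switching in the same slot."""
    return max(Counter(switch_times).values())


# Synchronous: every gate switches on the clock edge (slot 0).
sync_times = [0] * N_GATES

# Self-timed: each gate switches whenever its own data arrives.
rng = random.Random(0)
async_times = [rng.randrange(SLOTS) for _ in range(N_GATES)]

print("sync peak:", peak_current(sync_times))
print("async peak:", peak_current(async_times))
```

The total amount of switching is identical in both cases; only the peak differs, which is what matters for supply noise and EMI.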
Major drawback: Not enough designers with experience, and a lack of asynchronous design tools. So far the book is a great read, but pretty technical (good for an EE or comp sci person who's had a basic digital logic class).
The book is "Asynchronous Circuit Design" by Chris J. Myers from the University of Utah.
Also I wrote a paper about this for my computer architecture class:
http://ee.okstate.edu/madison/asynch.pdf
Precisely. There are literally decades of research on the design and testability of synchronously clocked designs, whereas there is very little on asynchronous designs. All the EDA tools available for testing chips today (processors, ASICs, what have you) are based on synchronous design principles. To change to asynchronous design requires an entire paradigm shift, from functional design, to testability, to producing vendor tools to work the flow. Synchronous designs have been based on stuck-at fault testing for decades, which greatly simplifies the task of proving 10 million transistors are doing what they are supposed to. One basically only has to check for a 1 or a 0 in the right place during the right clock cycle. Asynchronous designs basically require verifying the timing and path delays from every gate in the chip to every other gate. There is no predictable time when things will happen, or when things will get there, since timing delays are dependent on fabrication process and variation.
For now, asynchronous logic works best in small pieces of a much larger, synchronous design, and where it makes sense - interfacing with things that are asynchronous (UARTs, ADCs, RF receivers, etc). Usually, one can verify with great confidence of achieving over 99% fault coverage on the synchronous portion, while resorting to just functional tests to see if the async logic works right. Writing functional tests for an entire chip these days however is almost insurmountable, unless you have all the time and money in the world to burn. Because synchronous designs have more structure and follow rigid design criteria, structural testing is far far easier.
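A minimal sketch of what "stuck-at fault coverage" means, using the small x = (a AND b) OR c circuit from elsewhere in this thread. The net names and the `coverage` helper are hypothetical; real ATPG tools work on full netlists and are far smarter about picking vectors.

```python
# Nets in the circuit: n1 = a AND b, x = n1 OR c.
NETS = ("a", "b", "c", "n1", "x")
ALL_FAULTS = [(net, value) for net in NETS for value in (0, 1)]  # 10 single faults


def evaluate(a, b, c, fault=None):
    """Evaluate x, optionally with one net stuck at 0 or 1."""
    def net(name, value):
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = net("a", a), net("b", b), net("c", c)
    n1 = net("n1", a & b)
    return net("x", n1 | c)


def coverage(vectors):
    """Fraction of single stuck-at faults that some vector exposes at the output."""
    detected = {f for f in ALL_FAULTS for v in vectors
                if evaluate(*v, fault=f) != evaluate(*v)}
    return len(detected) / len(ALL_FAULTS)


print(coverage([(1, 1, 0)]))                                   # 0.4
print(coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]))  # 1.0
```

Four well-chosen vectors out of the eight possible ones catch every single stuck-at fault here. The check is simply "is the output a 1 or a 0 when it should be", with no timing involved, which is exactly the property that async logic gives up.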
The drawbacks of synchronous designs are, of course, clock tree synthesis and controlling skew. Power can be dealt with these days by gating off clocks to unused regions, using lower power FFs, etc. However, controlling the clock skew on a single clock chip can be the largest hurdle during layout and fabrication. Even so, it is not an impossible task, and verifying full scan synchronous designs via ATPG and/or BIST outweighs most benefits of purely asynchronous designs.
A fully asynchronous processor or chip will not become economically feasible until more research is done. A chip that cannot be tested is worth nothing at all.
There is a company called Fulcrum Microsystems which grew out of Caltech's VLSI program and does primarily ASYNC stuff.
Isn't this where the idea of digital logic really got started?
Yes.
At least it's how it was taught when I was in school. We even did some design work in async. Cool stuff. Easy to do, fast as hell...
Never did figure out why it never caught on.
It DID catch on. But the chips kept getting bigger.
It's easier to design silicon compilers for synchronous designs than for asynchronous - and when you've got millions of gates per chip you REALLY want compiler assist, rather than to lay out all the circuit details by human effort.
It's also easier to make automated TEST program generators for synchronous designs, to run the machines that test the chips when they come out of fabrication and reject the ones that are broken. You NEVER get high-90s coverage with human-generated "functional" tests - but a compiler can get there easily:
- Add muxes in front of the flops to string 'em into "scan chains" - big shift registers connected either to the regular pins or a "JTAG" controller. Then on the tester you'll:
- "Scan in" a random starting state.
- Step the chip a few times.
- "Scan out" the result and see if it matches expected, simultaneously scanning in a different starting condition.
The test generation program becomes essentially a random-number generator, chip simulator, and fault-tested-so-far counter, with a few finesses for things like getting things reset properly, testing gates with big fanin, making sure busses aren't floating, rejecting patterns that don't test anything new, working around flops that weren't on the scan chain because they were on a critical path, avoiding logic loops that become implied RS flops or ring oscillators (depending on whether the loop has an even or odd number of inversions), identifying logic circuits that have untestable failure modes, and the like.
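The scan-in / step / scan-out loop and the "random-number generator plus simulator plus fault counter" idea can be sketched roughly as follows. Everything here is an assumption made for illustration: the four-flop design, its `next_state` logic, and the fault model (a stuck value on a flop's D input only). Scan-in and scan-out are also done as separate passes for clarity, whereas a real tester overlaps them as described above.

```python
import random

N_FLOPS = 4


def next_state(s, fault=None):
    """Hypothetical next-state logic for a 4-flop design.
    `fault` = (flop_index, stuck_value) forces one flop's D input."""
    nxt = [s[3] ^ s[0], s[0] & s[1], s[1] | s[2], 1 - s[2]]
    if fault is not None:
        nxt[fault[0]] = fault[1]
    return nxt


def scan_exchange(state, bits_in):
    """Shift new bits in through the chain while the old state shifts out."""
    chain = list(state)
    bits_out = []
    for bit in bits_in:
        bits_out.append(chain[-1])   # last flop drives the scan-out pin
        chain = [bit] + chain[:-1]   # every flop takes its neighbour's value
    return chain, bits_out


def run(start, steps, fault=None):
    """Scan in `start`, apply functional clocks, scan the result out."""
    chain, _ = scan_exchange([0] * N_FLOPS, start[::-1])   # scan in
    for _ in range(steps):
        chain = next_state(chain, fault)                   # step the chip
    _, out = scan_exchange(chain, [0] * N_FLOPS)           # scan out
    return out[::-1]


def generate_tests(seed=0, max_tries=200, steps=1):
    """Keep random patterns only when they detect a not-yet-detected fault."""
    rng = random.Random(seed)
    faults = {(i, v) for i in range(N_FLOPS) for v in (0, 1)}
    undetected = set(faults)
    kept = []
    for _ in range(max_tries):
        if not undetected:
            break
        start = [rng.randint(0, 1) for _ in range(N_FLOPS)]
        expected = run(start, steps)                # good-chip simulation
        newly = {f for f in undetected if run(start, steps, f) != expected}
        if newly:                                   # reject useless patterns
            kept.append((start, expected))
            undetected -= newly
    return kept, len(faults) - len(undetected), len(faults)


if __name__ == "__main__":
    patterns, detected, total = generate_tests()
    print(f"{len(patterns)} patterns kept, {detected}/{total} faults detected")
```

The `kept` list plays the role of the test program: each entry is a state to scan in plus the state the tester should expect to scan out.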
But full-scan and partial-scan don't work if the flops aren't tied to a small number of clock domains that can be tied together or otherwise controlled directly by the tester. Asynchronous logic elements (such as ripple counters or other circuitry where a flop's clock is driven from another flop's output, or other logic that's something other than a clock distribution and switching system) just don't scan well.
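For anyone who hasn't met a ripple counter: each flop is clocked by the previous flop's output rather than by a shared clock, so one input pulse passes through a string of intermediate states before settling. A small Python sketch of that behaviour (the `ripple_increment` and `to_int` helpers are hypothetical, and gate delays are modelled simply as "one flop changes at a time"):

```python
def to_int(bits):
    """Convert an LSB-first bit sequence to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_increment(bits):
    """Apply one input pulse to a ripple counter (`bits` is LSB-first).

    Each flop toggles on the falling edge of the flop before it.
    Returns every state the counter passes through, in order."""
    state = list(bits)
    seen = []
    for i in range(len(state)):
        state[i] ^= 1                # this flop toggles
        seen.append(tuple(state))
        if state[i] == 1:            # rose 0 -> 1: no falling edge, ripple stops
            break
    return seen


if __name__ == "__main__":
    # Counting from 7 to 8 on a 4-bit ripple counter.
    print([to_int(s) for s in ripple_increment([1, 1, 1, 0])])
```

Going from 7 to 8 the counter briefly shows 6, 4 and 0 on the way. None of those flops share a clock the tester can drive directly, which is the property that makes such logic awkward to put on a scan chain.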
There IS a way to get the same sort of massive observability and controllability over asynchronous designs - the Cross Check array - along with automatic test program generation systems to work with it. (Think of DRAM- or active-matrix-LCD-style cross-point addressing of test-points and signal-injection points - about one for every four regular transistors on the chip.) It tests async designs just fine, and gives better coverage than full scan with about half the silicon overhead.
But it's patented. The company that made it never got much market penetration in the US fabs. It has since merged and the product may be completely gone at this point. Except for Sony, which had an unlimited license from funding them when they were a startup, their own software, and (as of a few years back at least) used it in all of their consumer chips.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Imagine a Beowulf Cluster of THESE!!!
While what you say is true, the systems described are not purely asynchronous. Rather, they seem to be a collection of individually synchronized (and possibly purely asynchronous) modules that use handshaking to communicate between each other.
4X silicon die size and glitch issues.
I once worked as an assistant on this product, in 1996;
my boss got his degree and I haven't seen any further application since then.
First:
;)
evidence that biological brains are heavily chaotic, which ANNs traditionally are not.
I think that even the simplest traditional ANNs have chaotic properties due to their non-linear nature. But there are also more recent and complex models that use more chaotic elements...
Second, brains are extremely recurrent in ways that could never be simulated by traditional computers -- there are simply too many links.
Never say never, again
Traditional computers are growing fast (Moore's law) and according to Kurzweil, in the year 2019 a $1000 computer will have a capacity similar to a human brain (only the capacity, that is, the hardware... maybe not the algorithms.)
Furthermore, you don't need to simulate one brain with one computer... you could try to simulate one brain with -say- 10000 computers that are connected and communicate asynchronously
My own estimates indicate that NOW all the computers, joined together by the Internet, have more capacity than a single human brain (in computing power and memory).
I started a project (InterSAINT.org) to develop the algorithms for that: ANN and distributed/grid computing.
Third, the human brain is not based merely on reward and punishment. When I sit in a chair at night, pondering whether I agree or not with what Bush has done today, there's no clear source of reward or punishment. Yet, at the end of the day, my brain has changed. ANNs have no ability to self-contemplate and change in this way.
Are you sure? Maybe when you sit on a chair you do it because you have learnt that that chair is not broken (it doesn't carry a punishment). Maybe you sit because your body wants to rest for a while (it has a reward). Meanwhile you can think about Bush or anything else... but surely everything you think is for a reward (including sometimes the reward of thinking).
Fourth, when an ANN is trained, every weight in the network is changed. In a biological brain, particular links form and are destroyed, but learning is not a global process. I'm not a neuroscientist, so if I'm wrong, someone please point that out.
AFAIK, rather than links being formed and destroyed, the links are just reorganized; that is, instead of connecting to one neuron, the link moves and connects to another neuron. That process happens mostly in the first 6 or 7 years of life and I don't think it is the main process in learning. Other processes, like the one that changes the weights of the links, are more important and they are always present (not just before we are 7 years old).
In addition, the learning process in ANNs is based mainly on changing weights, but there are also models that change the network structure (moving links like the biological model, but also creating and destroying links).
Fifth, you can ask a human why he/she came to a particular conclusion.
As another poster said, you learnt that.
You learnt how to change the weights in your brain to save some "thoughts" and how to recall those thoughts when somebody asks...
I don't see anything that a machine couldn't do.
although they are extremely valuable computational tools, they are not a magic wand. Many pattern recognition and data organization tasks can be much better performed by traditional symbolic algorithms.
I agree very very much.
But, symbolic algorithms are quite useless when you want something that has self-organization, learning, fault tolerance properties... especially if it is also very complex.
--
ACid
But hey, at least I'm honest!
This is very off topic, but in reference to your signature: have you heard of screen? It's very nice. As many virtual terminals as you can shake your fist at, and it works over telnet and ssh and whatnot.
$ man screen
What if you want to be logged in as different users, access priv., etc. Then screen just doesn't cut it.
Within the space of a single clock cycle, the Pentium (or other designs) might make use of asynchronous logic, but (and this is the important bit) the asynchronicity only exists within the domain of the CPU. The external interface to the CPU is still governed by a clock: you supply the CPU with inputs, trigger its clock, and a short (fixed) while later it supplies you with outputs. Asynchronous logic removes the clock entirely.
Not strictly true - that's just asynchronous logic taken to its extreme.
A more practical approach, which I'm told has already been used here and there, is to build functional units that are asynchronous, while keeping the chip as a whole synchronous. Either the functional unit takes a varying number of cycles to produce its result, or the rest of the chip assumes that it will take the longest possible time, but the (multi-stage, multi-cycle) calculation is performed without the need for internal clocking.
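A ripple-carry adder is a handy example of why an unclocked functional unit's completion time depends on the data. Roughly speaking, the result settles once the longest carry chain has finished rippling, and for most operands that chain is much shorter than the full word width. The sketch below just measures that chain length; the `longest_carry_chain` helper is hypothetical and ignores real gate delays.

```python
def longest_carry_chain(a, b, width):
    """Length of the longest run of bit positions a single carry ripples
    through when adding `a` and `b` on a `width`-bit ripple-carry adder."""
    carry = 0
    run = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:                    # generate: a new carry starts here
            carry, run = 1, 1
        elif (x ^ y) and carry:      # propagate: incoming carry keeps rippling
            run += 1
        else:                        # kill, or no carry to propagate
            carry, run = 0, 0
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    print(longest_carry_chain(0b1111, 0b0001, 4))   # worst case for 4 bits
    print(longest_carry_chain(0b0101, 0b0101, 4))   # carries die immediately
```

A synchronous design has to budget for the first case on every add; a self-timed unit could, in principle, finish early on the second.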
Even an entirely asynchronous chip would have to have synchronous I/O if it was to be used in any conventional system. Redesigning all parts of a computer system from scratch would be a vast amount of work and result in a system that was far more expensive than necessary.
The feature of asynchronous logic that distinguishes it from a big block of arbitrary combinational logic is that asynchronous circuits have a way of indicating when computation has completed. This allows you to string them together in more complicated patterns without fear of race conditions, and means you don't have to always assume it will take the longest possible time to complete.
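One common way to get that "I'm done" indication is dual-rail encoding, which the comment above does not name, so treat this as one possible scheme rather than the one being described. Each bit travels on two wires: (0, 0) means "no data yet", (1, 0) means logic 1, and (0, 1) means logic 0. A completion detector simply checks that every bit has left the empty state. The helper names below are made up for illustration.

```python
NULL = (0, 0)   # spacer: no data on this bit yet
ONE = (1, 0)    # "true" rail high
ZERO = (0, 1)   # "false" rail high


def encode(bit):
    """Encode an ordinary bit as a dual-rail pair."""
    return ONE if bit else ZERO


def is_valid(wire):
    """A wire carries data when exactly one rail is high."""
    return wire in (ONE, ZERO)


def done(word):
    """Completion detector: every bit of the word carries valid data."""
    return all(is_valid(w) for w in word)


def dual_rail_and(x, y):
    """AND gate that stays NULL until both inputs have arrived."""
    if not (is_valid(x) and is_valid(y)):
        return NULL
    return encode(x == ONE and y == ONE)


if __name__ == "__main__":
    print(dual_rail_and(encode(1), NULL), done([dual_rail_and(encode(1), NULL)]))
    print(dual_rail_and(encode(1), encode(1)), done([dual_rail_and(encode(1), encode(1))]))
```

The downstream stage waits for `done(...)` instead of a clock edge, which is also where the extra area mentioned in this thread comes from: two wires per bit plus the detector.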
While asynchronous logic has yet to show a substantial speed benefit over well-designed synchronous logic, it tends to consume considerably less power, as you don't have to propagate clock signals to all parts of a functional unit. Clock gating only buys you so much.
The disadvantage is that it's often somewhat larger than the equivalent synchronous logic, and that verifying the correctness of an asynchronous design is a nightmare (a more difficult mathematical problem, and one with less background research at present). It's also hard to assess maximum power dissipation (you have to prove that no pathological one-in-a-trillion state transition that can cause vast amounts of current to be shunted can exist).
my dreams have been filled with visions of large processor cores with no fixed pipeline that can reconfigure themselves for different tasks, and have several operations in progress at once, taking several pathways through the core.
To some extent, this is done already.
Any superscalar processor will at least potentially have multiple operations being performed at once, in different pipelines as well as the same pipeline. The pipeline typically forks manyfold after the fetch/decode stages and recombines when results are ready to be committed and instructions retired. Under ideal conditions, most or all of the functional unit pathways in a processor could be busy at the same time (though code that does this is very rare - usually you're limited by data availability and by mismatch between the processor's facilities and the resources needed by the code).
As for self-reconfiguring processors, there is much work yet to be done. I've seen a handful of papers on partly reconfigurable processors; a few are on my prof's page. However, reconfigurable functional units suffer from the twin problems of being slower than a hard-wired functional unit at a specific task, and being extremely difficult to produce optimized code for (generally, the more ways you can use something to change the workings of a program, the harder it is to find an optimal use for it).
It's still an interesting idea with potential in several types of situation, though.
Asynchronous logic also doesn't affect the reconfigurability of a block of logic. Reconfigurability happens at a higher level of the design.
I suppose, for the immediate future, that asynchronous logic elements will simply augment current processor designs as you (and others) have outlined in this discussion. How long, then, until we reach a paradigm shift and start designing our processors in a fundamentally different way?
Pulling a number out of a hat, I get 10-15 years. Superscalar, highly-pipelined architectures have almost reached the limits of extendability without a major breakthrough in design, but there are still performance-enhancing tweaks to be made. CMP - multiple cores on one die - is the way of the future, but the number of cores involved will be small for the next few linewidth shrinks. Beyond a certain point, though, we're going to have to change from using conventional cores on a fast crossbar-type interconnect to something that more closely resembles the larger parallel machines of today. It is at that point that new opportunities for optimization and design changes will arise, which may (or may not) lead to significant changes in the way individual processor cores are designed (as the place of a core in the system will have changed).
It's also possible that there never will be a drastic paradigm shift. Modern superscalar processors represent a fairly mature solution that matches the nature of most computing problems fairly well. Only time will tell whether something radically better will come along.
In the meantime, asynchronous logic may very well replace synchronous logic for the low-level implementation of processor circuitry.