Self-Timed ARM Provides Low Power Consumption
hardcorebit writes: "The Amulet Group at the University of Manchester is working on a 'self-timed' or 'asynchronous logic' chip which uses the ARM architecture and instruction set. The benefits? Much lower power consumption, lower EMF emissions, and it works with everything written for the ARM. Their latest effort is 'broadly equivalent' to an ARM9. Anyone had a chance to get their hands on one of these beasts?"
Since CPU marketing has reduced to saying my clock speed is faster than yours (whether my CPU can do more useful work being irrelevant) what will be the "dumbed down figure of merit" (tm) for systems that don't have clocks?
ARM chips already have a vast amount of market share -- they dominate the cellular phone market. (In fact, I don't know of a single cellular phone chip set based on any processor core except the ARM7 or ARM9. Somebody help me here -- there has to be one...)
The thing is, that market is a great deal more power sensitive than anything you or I can imagine in a desktop machine, or even in a laptop or palmtop configuration. All the power cost of the system is in the processor (except when the user is actually transmitting), and any increases in processor power consumption come at the expense of standby battery life. If you want to add rich content to a cell phone, you need a faster processor, and standby life can't be reduced much more without losing users...bottom line, without improvements in processor technology, smart phones won't ever be a reality. We may have smart briefcases, but, in that case, why use a phone and not a wirelessly connected laptop?
I believe EDSAC came one year later. Check the links on the above posts and you'll find Manchester Baby dates back to 1948 & EDSAC to 1949.
I think Cambridge claim it was the first 'proper' computer as it was more like modern computers. Possibly had a teletype style keyboard/printer? Can't remember exactly.
For anyone curious about that statement, the way ARM does it is by making ALL instructions conditional. Rather than branch for a small piece of conditional code, you can just scream right thru it!
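A rough sketch of the idea, as a toy Python interpreter rather than real ARM semantics: every instruction carries a condition, and an instruction whose condition fails is simply skipped, so a small if/else needs no branch. The tuple format, the `run` function and the single-flag model are illustrative assumptions.

```python
# Toy model of ARM-style conditional execution (not the real ARM encoding).
# Each instruction is (condition, op, dest, a, b); it runs only if the
# condition matches the flags set by the last CMP.

def run(program, regs):
    flags = {"Z": False}  # only the zero flag is modelled here
    conditions = {
        "AL": lambda f: True,        # always
        "EQ": lambda f: f["Z"],      # equal
        "NE": lambda f: not f["Z"],  # not equal
    }
    for cond, op, dest, a, b in program:
        if not conditions[cond](flags):
            continue  # instruction is skipped, no branch taken
        if op == "CMP":
            flags["Z"] = regs[a] == regs[b]
        elif op == "MOV":
            regs[dest] = regs[a]
        elif op == "ADD":
            regs[dest] = regs[a] + regs[b]
    return regs

# if r0 == r1: r2 = r3  else: r2 = r3 + r4   (straight-line, no branches)
program = [
    ("AL", "CMP", None, "r0", "r1"),
    ("EQ", "MOV", "r2", "r3", None),
    ("NE", "ADD", "r2", "r3", "r4"),
]
```

Both arms of the if/else sit in the instruction stream; only the one whose condition holds has any effect.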
I used to work for Acorn (the original "A" in ARM before it was changed). The guy doing the Amulet work at U. Manchester is Steve Furber, who was one of the original Acorn design engineers and original architects of the ARM.
Quite so. Kinda like the ARM-based NewsPAD of old, innit. Hope it's rather more successful.
Yes.
--
This comment was brought to you by And Clover.
I'm a 2nd year CS student at Manchester University, and one of the final year projects on offer this year is to port Linux to the Amulet.
I didn't look too deeply into it, but I assume from the fact that this project is on offer that some work must need doing to get the present arm ports running on the Amulet.
just a little insider information
However, an asynchronous design requires that the delay lines be very conservatively designed: if the delay line were a little faster, and the logic a little slower, on the worst-case critical path, the chip would fail completely. That results in a slower processor by design.
;-).
This is the same argument I've heard for using asynchronous logic. The idea was that the speed of the entire core is limited by the portion that can't accept a clock higher than N MHz. With asynchronous logic, this theoretically allows each function block to work at its maximum speed, with the slow ones affecting only the operations that they're involved with.
I'm not necessarily disagreeing with you, but that's the general idea of the counterpoint
Larry
O.K., I'm sorry if my original post implied that I thought that PCs & Macs "were the only computers". Certainly not, I used to own an Amiga (not more than 2 years ago), and have used Acorns before. So I know what the ARM is like in use. :)
The question was a real one. Embedded systems honestly didn't cross my mind. You give some good examples. But, does/would the self-timed ARM designs still consume more power than a Transmeta? Does anyone have any hard data on the two for a comparison?
Syllable : It's an Operating System
There's plenty of projects available using the Amulet for students at Manchester University. One guy I know was building a robot control board using one of these (it had a particular lysick offboard communications system - it was self-programming using a Xilinx array, and used the bootstrap code to soft-load it).
One of next year's projects is to port Linux to Amulet. Should be interesting, as I don't know as there's an MMU yet.
Anyway, I have seen one of them running and it's quite impressive. It has a genuinely low power consumption (almost literally nothing when not doing anything), and because of the wonderful CMOS speedup effect, you can increase the speed of the thing by whacking up the core voltage. Very cool.
The implementation spec is fairly tightly controlled, though, so don't expect to get hold of it just yet.
Anyway. Come to Manchester! It's got pubs!
-- I reserve the right to be completely wrong --
they [both] would say that though wouldn't they.
Cheers for the info.
First off, I'm not an expert here (again, I don't get to get too deep because I second as the sysadmin -- a "pee-on as needed" engineer if you will ;-). I hope to get deeper into NCL design as the sysadmin duties die down (they were quite heavy when I got here because they had been operating without a sysadmin for ~3 years as they grew).
Secondly, I assume you understand the purpose of the "acknowledgement", which is essentially the "hey, I'm done with the previous result, I'm ready for the next set of inputs"? The acknowledgement along with the normal properties of CMOS prevents any "race condition" from occurring (I assume that is your fear?). Again, the physical design is pretty much "100% delay INsensitive". Again, I'm not exactly following you here, but remember, we're not using traditional NAND, etc... gates in CMOS, but NCL gates (e.g., 3 of 5) and they are designed specifically for NCL and the acknowledgement flow.
Third, the only problem 2NCL (the NCL math used in CMOS -- 4NCL is ideal, but NOT practical in physical design) has to deal with is what we call "orphans." They are unforeseen results that may either cause a condition where data can be "hung" from moving on, or (more likely), the input triggers a gate to open when it shouldn't (e.g., forgetting to take the acknowledgement into account). AFAIK (I've never messed with finding them myself) "orphans" are a "pure logic" problem and we can identify most of them through a post-design "orphan checker."
Again, I do *NOT* speak for Theseus Logic and there are much better individuals here who can clear up any questions. Feel free to fire off some questions to the address(es) on the web site.
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
Actually, tolerance of manufacturing variability is one of asynchronous design's strengths. Because each "chunk" of gates only performs its computation when all input data is available, it doesn't matter if a piece of data arrives early or late. The computation is data-driven. It runs as fast as the lines can switch. So it does not have to be "more conservative."
Async designs require more silicon area for simple things, like data lines. Rather than having a single "high = 1, low = 0" data line + a clock line, our group used three wires. First wire high = 1, second wire high = 0, third wire high = downstream component got the data, reset please.
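The three-wire scheme above can be sketched roughly like this. It is a software toy, not the Caltech group's actual circuit or notation; the class and method names are made up for illustration.

```python
# Sketch of the three-wire link described above: one wire signals "1",
# one signals "0", and an acknowledge wire from the receiver asks the
# sender to reset. All names here are hypothetical.

class ThreeWireLink:
    def __init__(self):
        self.one = False   # high means the bit is 1
        self.zero = False  # high means the bit is 0
        self.ack = False   # high means the receiver has taken the data

    def send(self, bit):
        # The sender may only drive data when the link is idle.
        assert not (self.one or self.zero or self.ack), "link busy"
        if bit:
            self.one = True
        else:
            self.zero = True

    def receive(self):
        # Data is valid exactly when one of the two data wires is high.
        assert self.one != self.zero, "no valid data on the link"
        bit = 1 if self.one else 0
        self.ack = True  # tell the sender we got it
        return bit

    def reset(self):
        # Sender sees ack and drops its data wire; receiver then drops ack.
        assert self.ack, "nothing to acknowledge"
        self.one = self.zero = False
        self.ack = False


def transfer(bits):
    link = ThreeWireLink()
    received = []
    for bit in bits:
        link.send(bit)
        received.append(link.receive())
        link.reset()
    return received
```

No clock appears anywhere: each bit moves when the wires say it is there, and the next one waits for the reset.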
The coolest geek feature of async processors is that if you improve the transistor physics (e.g., put an ice cube or some liquid nitrogen on the processor) the instruction rate increases. Whee!
James Cook
ex-"cook@vlsi.caltech.edu"
now james@cookmd.com
Remember, I said that when computers were invented, besides the "operator" and "operand," you had to account for the "control", which was previously the mathematician him/herself with boolean logic on paper.
The first computers WERE async (e.g., Eniac, etc...). But by the component speeds of the '60s, the race conditions were great enough to prompt a "formal" control line. This became the clock, which was simple and did the job. By the time ICs rolled around (early '70s with Intel's first memory package), the clock was the "thang" to use.
But starting with speeds of 100MHz+ and the number of transistors in the millions, clocks were limited by the speed of light (combined with the delays of the semiconductor material itself, silicon). As such, clocks are now localized to certain portions of the IC, but yet, have to be "synchronized" somehow.
Karl Fant, the man behind NCL, devised this embedding of control in logic at Honeywell over two decades. Theseus Logic was founded in 1996 to take NCL commercial (Honeywell had no interest in doing so). Our main argument is that WE FINALLY ADDRESSED "CONTROL" in the way computers should be designed. Remember, boolean logic and algebra were designed for mathematicians, NOT computers. And most of the industry is starting to side with us that dual-rail/acknowledgement is the way to go.
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
Here's a quick summary of the benefits of NCL:
[ One really "neat" feature of NCL is the inverter gate, THERE IS NONE! Invert in NCL is simply done by swapping the rails! ]
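As a tiny illustration of the rail swap (the tuple representation is an assumption chosen for this sketch, not Theseus' notation):

```python
# Dual-rail value as a (false_rail, true_rail) pair; (0, 0) means NULL.
FALSE = (1, 0)
TRUE = (0, 1)
NULL = (0, 0)

def invert(value):
    false_rail, true_rail = value
    return (true_rail, false_rail)  # just cross the wires, no gate needed
```

NULL stays NULL under the swap, so "no data" is never turned into data.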
The only negatives to NCL are:
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
First off, you are comparing two entirely different markets. ARM is NOT designed to run in an end-user, general-purpose desktop.
Secondly, StrongARM did go the same as Alpha ... to Intel in the cross-license and fab buy-out! Everything about the main, single reason Intel made the deal points solely at ... yes, StrongARM (of course, the nice side-effect was the dropping of the lawsuit -- and Digital wanted to dump its fabs anyway). The damn thing was eating everything up, MIPS, Hitachi, etc... and Intel's own i960 really needed a good replacement.
The smartest thing Intel has done in a long time (at least technically ;-) was to buy StrongARM from Digital. Man is StrongARM just gaining market and mindshare or what?!?!?!
But ARM itself (non-StrongARM) is far from dead. It's used in numerous products you use, just like MIPS.
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
Secondly, I assume you understand the purpose of the "acknowledgement", which is essentially the "hey, I'm done with the previous result, I'm ready for the next set of inputs"? The acknowledgement along with the normal properties of CMOS prevents any "race condition" from occurring (I assume that is your fear?).
My fear isn't a race condition; it's a spurious signal emitted from a previous output stage causing processing to begin before it should in the following stage, with invalid data. Spurious signals like this occur all of the time, and are called "glitches"; they result when multiple paths through a logic block have different lengths. The canonical solution is to ignore all outputs until enough time has passed for them to stabilize. Glitches can also be minimized by adding redundant logic terms.
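To make the glitch concrete, here is a small unit-delay simulation of the textbook static hazard in ordinary boolean logic: f = a·b + (NOT a)·c with b = c = 1 while a falls. It says nothing about how NCL gates behave; the circuit, the one-unit gate delays and the function name are assumptions for illustration. The optional consensus term b·c is the "redundant logic term" fix.

```python
def simulate(with_redundant_term, steps=6):
    def a(t):
        return 1 if t < 1 else 0   # input a falls from 1 to 0 at t = 1
    b = c = 1
    # Gate outputs, settled for a = 1.
    not_a, and_ab, and_nac = 0, 1, 0
    and_bc = 1 if with_redundant_term else 0  # consensus term b AND c
    out = 1
    trace = []
    for t in range(steps):
        trace.append(out)
        # Unit delay per gate: next outputs are computed from current ones.
        nxt = (
            1 - a(t),                   # inverter
            a(t) & b,                   # a AND b
            not_a & c,                  # (NOT a) AND c
            and_ab | and_nac | and_bc,  # OR gate
        )
        not_a, and_ab, and_nac, out = nxt
    return trace
```

Without the extra term the output dips to 0 for one time step, because the a·b path switches off one inverter delay before the (NOT a)·c path switches on; with the term the output stays at 1 throughout.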
Again, I'm not exactly following you here, but remember, we're not using traditional NAND, etc... gates in CMOS, but NCL gates (e.g., 3 of 5) and they are designed specifically for NCL and the acknowledgement flow.
However, your NCL gates are still composed of transistors set up using CMOS logic rules (or any of a variety of dynamic schemes that accomplish the same thing). This winds up giving effects similar to those you would see with standard boolean logic circuits. As far as I can tell from the documentation, in actual implementation NCL isn't so much a departure from boolean logic as a layer of meta-logic on top of it that makes it self-clocking. The actual physical signal encoding on individual lines is boolean (the lines are just grouped in interesting ways).
Thus, while the gates are self-clocked, they seem to be as vulnerable to glitching as any other combinational logic blocks.
Information regarding "orphans" noted. It's interesting, but doesn't relate to my question.
Again, I do *NOT* speak for Theseus Logic and there are much better individuals here who can clear up any questions. Feel free to fire off some questions to the address(es) on the web site.
Noted; thanks for posting the link, btw. This is a very interesting approach to asynchronous circuitry.
PalmOS is also moving to the ARM processor.
This is interesting because a cellular phone takes so much power for the RF transmission that the CPU consumption is relatively negligible. I don't know about you, but I really like the fact that a Palm runs forever on a pair of AAAs. I don't like the rechargeable Palm V, Palm IIIc, PocketPCs, etc.
----
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
> Still, long live Arthur!
Ah! This brings back memories.... none too pleasant ones. The days when GUIs were written in BASIC will not be missed.
Of course if you're hankering for that Archimedes feeling there's a pretty decent emulator up at this guy's page.
Most games and demos run, but sadly there's no sound yet.
Hahahahaha
Check your facts. ARM chips are used in many mobile devices, including the entire Psion family of organizers. They are also in the Cobalt Qube and Raq, and are about to become the CPU in all the new devices from Palm. Calling them "rusty old chips" betrays your ignorance of technology: the ARM family of processors is one of the most rich, varied and technologically-advanced around.
The Alphas were not "head and shoulders above the 386." When they were first introduced, they were sucking up way, way more power and requiring much more cooling than an average Intel chip. Faster, yes, but at a price. Alphas were targeted at the "performance at all costs" CPU market, not something for the average desktop or laptop.
I saw a demo of the Amulet a couple of years ago, when I was at Manchester University. They'd wired up one to a variable-voltage power supply, and a speaker.
By putting it into a loop where it powered the speaker every couple of cycles, it generated a tone. By adjusting the voltage of the power supply, it was possible to make the tone higher or lower, as it was having a direct effect on the running speed of the processor.
Also, when put into a 'halt' loop, it would power down until interrupted. An ammeter connected in series with it showed that it was using almost literally no power.
You could still use BogoMIPS, everybody's favorite!
"It's tough to be bilingual when you get hit in the head."
That does sound kind of harsh, but then I'd hate even more to have to do it for any other kind of modern chip architecture.
The ARM instruction set is pretty clean, and dead dead easy to program even large projects in. Mind you, some of the newer ARMv4, Thumb instructions must be pretty hairy from an implementation POV, especially keeping backwards-compatibility with 26-bit addressing.
Hang on, what's this story doing on /., anyways? The Amulet project has been going a long, long time and achieved ARM9-level performance some time ago, IIRC. Asynchronous chips are interesting but the power of mainstream (particularly x86) processors has kept increasing at such a rate no-one has yet needed to make the huge change of design strategy. I don't expect to see async chips in the mainstream until Moore's law is well and truly broken.
--
This comment was brought to you by And Clover.
Maybe you haven't been exposed to enough processor architectures? The ARM chips have the cleanest instruction set and overall architecture that I've seen, and that includes lots of hands-on experience with the PowerPC, x86, SHx, and MIPS chips. The ARM designers had some very good ideas for keeping instructions simple while getting a lot done and they had a novel way of avoiding the usual branch prediction troubles. Very slick.
Amulet's lead, Steve Furber (who also designed the original ARM), wrote a recent editorial cover story called "Kicking out the Clock" in the May 2000 edition of Integrated System Design (ISD) magazine.
In the article, he used an example of a "dual-rail" logic (as opposed to the "single-rail" found in most boolean designs) called Null Convention Logic (NCL) from Theseus Logic. Theseus' NCL approach goes a long way toward solving not only the power and noise problems (like most asynchronous designs), but also the greater problem of design reuse (a problem with both async and, especially, synchronous) -- the latter is something Furber was quoted on in a past EE Times article (cannot seem to find it on-line anymore?).
Timing verification is becoming increasingly difficult in IC design, adding ridiculous amounts of extra effort and, in some cases, complete design failures (e.g., AMD, IBM and Intel have all had timing-related design failures). Clocks may soon disappear in favor of async designs, especially those like Theseus Logic's nearly-100% delay INsensitive NCL technology. NCL's delay INsensitive nature comes from the fact that it is NOT boolean logic based, but a new method that breaks the traditional foundation of what boolean logic was designed for: mathematicians, not computers.
In addition to an "operand" and an "operator," as with traditional, human-based math, computers require a third "control" line. In synch/boolean, this is the clock. With the limitations of the speed of light, it is IMPOSSIBLE for 10M+ transistor ICs on one section of the chip to be timed synchronous with another. As such, most modern ICs have localized clocks, which further adds to design complexity.
NCL removes the clock as the control (as with most async) *BUT* it places the control back in the data flow lines themselves! NCL is a 3-state logic of "true" and "false", plus the control which is derived from NCL math to be "null" (no data). This representation is 2NCL in NCL math (see Theseus' site for more details on NCL including 4NCL and 3NCL, the latter being used with most off-the-shelf tools and optimizers). In 2NCL, the lines (again, "dual-rail") put the false value (0) on one line and the true value (1) on the other line *IF* voltage is present; otherwise, no voltage (or low) results in the state of "null" (again, no data). Acknowledgements are used to maintain a delay INsensitive combinational logic circuit, including the fact that NCL can be placed alongside synch/boolean and maintain 100% data flow and integrity (again, totally delay INsensitive). So instead of data having to "wait" on a clock to move forward, data moves forward when it arrives! This further increases performance!
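A minimal sketch of that two-line encoding, assuming a (false_line, true_line) pair per bit; the function names are hypothetical and this is only a software picture of the convention, not Theseus' tooling:

```python
# 2NCL-style dual-rail encoding sketch: each bit uses two lines.
# (false_line, true_line): (1, 0) = false, (0, 1) = true, (0, 0) = NULL.

def encode(bit):
    return (0, 1) if bit else (1, 0)

def word_state(word):
    """Classify a list of dual-rail pairs as 'DATA', 'NULL' or 'PARTIAL'."""
    if any(pair == (1, 1) for pair in word):
        raise ValueError("both lines high is not a legal code")
    asserted = [pair != (0, 0) for pair in word]
    if all(asserted):
        return "DATA"     # every bit has arrived; safe to consume
    if not any(asserted):
        return "NULL"     # no data present
    return "PARTIAL"      # some bits still in flight; keep waiting

def decode(word):
    if word_state(word) != "DATA":
        raise ValueError("word is not complete")
    return [pair[1] for pair in word]
```

The receiver needs no clock to know when a word is ready: it waits until `word_state` reports "DATA", which is the sense in which the control lives in the data lines.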
Although Theseus' NCL technology is NOT boolean based, it works with off-the-shelf synch/boolean IC design tools (unlike attempts like Cogency's), it is still CMOS-based, and it is not too difficult for an engineer to learn coming from the synch/boolean world.
[Bias: I am an employee of Theseus Logic and know Mr. Furber, the Amulet lead. I am NOT an engineering lead, just a regular engineer (who seconds as the sysadmin ;-).]
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
Its processor core is based on the ARM9 series, but since it is asynchronous (i.e. it hasn't got 'clock cycles' like normal synchronous processors) it should go very very fast (simple processes will rush through without being delayed by slightly harder/longer processes).
While I haven't had a chance to get my hands on one of these yet, the specs I've seen (I can't remember if they are public or not) look good and the chips should be compatible with current ARM chips - as used in my RISC PC (BTW a RISC PC is used to run the 'Who Wants to Be A Millionaire' shows!).
It is difficult to place an exact MHz rating on these chips due to the way they work, but the current version (AMULET3i) runs at roughly 120MHz - but they have started from the basics, without using much 'proven technology', so expect development to last a few more years - but the 120MHz version should be out next month/late this month.
Richy C.
--
Nah, ARM processors are used all over in embedded devices. This isn't just PDAs and palmtops, but all those other electronic devices that have some smarts and don't use '70s-derived OSes (*nix, MSDOS, WinX). Not using those isn't so bad - do you really want to program your microwave oven from a command line or a GUI? At 5AM Monday morning? After a late night?
These guys have an interesting way to deal with it.
They describe a way to build asynchronous circuits (using the same design even for different fabs) that run as quickly as the gate/wire delays allow. It takes more surface elements to build the same logic, but once you take removal of the clock lines into consideration, things look a lot closer.
IMHO, the real beauty of async designs is that your bit shifter op can take 1 nanosecond, your add op can take 3 nanoseconds, and your subtract op can take 4 nanoseconds, rather than having them each take a 4 nanosecond cycle. It really disturbs me to see designs where a multiplication (inherently slower by a minimum factor of lg(bits)) takes the same amount of time as an addition.
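A back-of-the-envelope version of that comparison, using the 1/3/4 ns figures from the post. The instruction mix is invented purely for illustration, and handshake overhead in the self-timed case is ignored.

```python
# Latencies from the post: shift 1 ns, add 3 ns, subtract 4 ns.
LATENCY_NS = {"shift": 1, "add": 3, "sub": 4}

def clocked_time(ops):
    # Every op takes one cycle, and the cycle is sized for the slowest op.
    cycle = max(LATENCY_NS.values())
    return cycle * len(ops)

def self_timed_time(ops):
    # Each op takes only as long as it needs.
    return sum(LATENCY_NS[op] for op in ops)

mix = ["shift", "add", "shift", "sub", "add"]  # hypothetical workload
print(clocked_time(mix), "ns clocked vs", self_timed_time(mix), "ns self-timed")
```

For this made-up mix the clocked design spends 20 ns and the self-timed one 12 ns; the gap shrinks as the mix tilts toward the slowest operation.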
When I was in grad school at Caltech, I worked on software tools in Alain Martin's asynchronous microprocessor group. The group had actually developed and fabricated a processor before I arrived. To quote their web page (www.cs.caltech.edu/~alains/previous/uP.html):
"Above is the layout of the 1.6 micron version of the Caltech Asynchronous Microprocessor, fabricated in 1989. It is a 16-bit RISC machine with 16 general-purpose registers. Its peak performance is 5 MIPS at 2V drawing 5.2mA of current, 18 MIPS at 5V drawing 45mA, and 26 MIPS at 10V drawing 105mA. The chip was designed by Professor Alain Martin and his group at Caltech. You can read about the chip in Caltech CS Tech Reports CS-TR-89-02 and CS-TR-89-07."
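Working the quoted figures through P = V × I gives a feel for the speed/power trade-off. This is plain arithmetic on the three operating points in the quote, nothing more; the function names are made up.

```python
# Operating points quoted above: (volts, milliamps, MIPS).
POINTS = [(2.0, 5.2, 5), (5.0, 45.0, 18), (10.0, 105.0, 26)]

def power_mw(volts, milliamps):
    return volts * milliamps  # V * mA = mW

def energy_per_instruction_nj(volts, milliamps, mips):
    # mW / MIPS = (1e-3 J/s) / (1e6 instr/s) = 1e-9 J = nJ per instruction
    return power_mw(volts, milliamps) / mips

for v, ma, mips in POINTS:
    print(f"{v:>4} V: {power_mw(v, ma):7.1f} mW, "
          f"{energy_per_instruction_nj(v, ma, mips):5.2f} nJ/instruction")
```

That works out to roughly 10 mW, 225 mW and 1050 mW, or about 2, 12.5 and 40 nJ per instruction: the chip runs about five times faster at 10V than at 2V, but each instruction costs about twenty times the energy.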
Keep in mind that this is a 1.6 micron process. The chip was later fabricated in gallium arsenide with very few design changes. This is because the chip, being completely data driven, will perform computation as fast as the underlying device physics will allow. There are no "timing issues" as these must all be worked out in high-level design (or the chip won't function at all... race conditions in hardware really suck).
Of course, the neatest geek feature is to pour liquid nitrogen on the chip and watch the instruction rate climb.
Since I left the group, they have also fabricated an asynchronous "digital filter" or simple DSP. Details at http://www.cs.caltech.edu/~lines/filter/filter.ht
The downside of all this stuff is that the design process is very formalized and arduous. Our group designed by writing parallel programs in a special chip-design notation, then transforming the program by hand and by software into a VLSI gate layout. It was a completely different synthesis method than most designers are used to, so it requires completely new software and designer training to be productive. It's sad, really, because the output chips are so very very nifty.
James Cook
ex-"cook@vlsi.caltech.edu"
now-"cook@alumni.caltech.edu"
This Asynchronous ARM has been around for a while, and it has yet to hit the shelves.
I recall reading announcements for it back in the mid 90s (I believe it was in Byte, or something silly like that), and despite my frantic attempts to acquire small quantities, I was not successful. It seems that, based on what they say on their web sites, they have no intention of manufacturing it unless you are a large corporation with a specific need.
Bottom line: Who cares. It isn't available to the average silicon hacker.
Feed The Need[goatse.cx]
Disclaimer: Yet Another Manchester University Student...
--
...how many people actually know about the company that started up ARM in the first place? They made what are in my opinion the best desktops around. Acorn Computers may not exist anymore but Castle technologies has taken up the task of developing them. RISC OS was doing things in 1986 that Windows only implemented in Windows 95. RISC OS, now owned and developed by RISC OS Ltd. (I think), seems to be going from strength to strength! People might be interested in Acorn computers, which would have taken over the world if people at Acorn hadn't decided that there was no need to advertise their new products because they were so good, they would advertise themselves.
BTW, I recommend you read Steve Furber's book on VLSI design (I can't remember the name). Very informative and interesting, using ARM chips as examples and such.
(ngh, pressing enter accidentally posted article)
2 4535801.pdf
Compare the ARM's nice simple orthogonal instruction set with the crawling horror that is to be Merced. 128 general-purpose registers, 128 floating-point, 64 predicates, 8 branch registers. Background register loads/spills when you do a function call. Multiple instructions issuing at once. No page faults - until you explicitly "commit" a bunch of memory accesses. Rollback.
ftp://download.intel.nl/design/ia-64/downloads/
(sadly not the 200 page full description, I can't find that)
I'd figure eCos would be ported before Linux. Amulet is not exactly something that you would use in a traditional thin-client/server system, but more, ultra-low-power/embedded systems.
eCos is the Linux complement in small-footprint, real-time space. Blows Windows CE out of the water, and Cygnus/RedHat are working hard to make EL/IX an API for cross-Linux/eCos development. An excellent model IMHO. Linux is great, but it can't run in the smallest of footprints.
-- Bryan "TheBS" Smith
Independent Author, Consultant and Trainer
i use my arm quite a bit when i pour a hot bowl of grits down my pants, and it doesn't consume a lot of power, though it helps if i eat a chunky bar first. thank you.
The AMULET itself has been around for *years*. It's led by, IIRC, Steve Furber, one of the original designers of the ARM when it was part of Acorn.
-- I'm drinking myself to sleep again...
I was one of the members who made AMULET3. Here are myths and truths about asynchronous design (as far as I know):
1. Asynchronous design uses delay cells: not necessarily. Amulet 3 uses delay cells because of commercial considerations in terms of chip size and the use of synchronous CAD tools.
2. Asynchronous design does not have CAD tools: false argument. There are several tools available. Even an industry-level tool is being used by Philips.
3. There is no commercial asynchronous chip: wrong. Philips made several chips and one of them is being used in their product.
4. Asynchronous design is not safe: wrong. This problem is solved in terms of CAD algorithms. Now it totally depends on your brain.
Using asynchronous design is mainly an engineering trade-off in my opinion. There are advantages and disadvantages. However, it mostly depends on your brain.
Makes sense to me, Redundancy being the important part. However because it makes sense the military probably won't use such an idea.
This could mean just as much a new wave as the Transmeta Crusoe - meaning portable devices will become even better! Oh boy, it's great to be a nerd nowadays.
I hope whoever is in charge of this project becomes aware of this technology - as other posters on the aforementioned story noted, EMF radiation could make these JEDIs a glowing target. Lower EMF means fewer KIA (Killed In Action, not the crappy car company) JEDIs.
Besides, the low power consumption is something that nearly every PDA user can appreciate: and in field-critical situations, could be another lifesaver.
--
We may not imagine how our lives could be more frustrating and complex—but Congress can. – Cullen Hightower
Hm. Maybe now I can chat on a cellphone without worrying deep down that I'm going to get brain cancer.... And, thus, give a nervous mother (mine) one less thing to harangue me about during that same chat.
AHHHHHHH! I'm burning with goodness again!
- Reakk, Sluggy Freelance
I'm still waiting for some consumer devices that will actually make use of these sorts of chips. Like, I'd love to see a laptop that will run for 10 hours and is cheap - if they were only 1000 bux or so, I'd pick one up, but they're still really expensive... it is going to be wonderful when these things get to market :)
Impress the dept. and gain the respect of /.! Certainly much more fun than the Java web-trawler I did this year...
Ever seen the Transmeta webpad? OOoooOOhhh I want one of those. First company to come out with one that runs decent gets my business.
Has anyone ported linux to ARM? That would be cool to have linux on a low power, portable device.
A group at Caltech built a 16-bit RISC-style self-timed CPU some years back (early 90's I believe) on a 1.5 micron process (I believe, somebody please correct me if I am wrong).
One of the cool features is that as you cool the CPU, it literally becomes faster!
The basic design of self timed CPU's has been around for probably more than 20 years.
S Unger's Asynchronous Sequential Switching Circuits, Krieger, Malabar, FL, 1983 is probably one of the books one encounters when taking a course in this subject. (the book is pretty rough going though - )
ARM has lots of market share. LOTS.
You are assuming that the main market for this type of chip is the home PC. This is absolutely not the case.
In light of the recent article about Palm switching to the ARM family of processors, one can only hope that they would consider a low-power alternative like this.
over engineered? that's a misuse of the term. also, i think you don't have any idea what you're talking about. if alpha is such a failure, why didn't compaq can it when they bought dec?
just cuz all you've ever seen are architecturally obsolete x86s, doesn't mean that's what the whole world uses. ever hear of a little thing called vms? also, alphas are big in scientific areas. So instead of trying to perfect the ARM--why not work on getting some market share first?
better products increase market share.
Arthur Lives!
--
This comment was brought to you by And Clover.
I've seen information indicating expressions of interest in a port of PalmOS to StrongARM; I'll believe in there being product when I actually see it on store shelves.
If you're not part of the solution, you're part of the precipitate.
Define over-engineered for me then.
Alpha was doing poorly because of poor marketing. Compaq didn't can it because they want to remedy that.
I have seen more than x86's. I'm not saying Alphas suck--I'm saying they aren't popular.
"better products increase market share."
So conversely, poorer products decrease market share? I guess that explains why Microsoft is doing so poorly...
--
Have Exchange users? Want to run Linux? Can't afford OpenMail?
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)
Cambridge teaches us "EDSAC was first coz Baby was just a device to test the memory tubes."
However, I've also heard Manchester's side of the story (having worked in the CS Dept. one summer) and nyaaaaaaaaah to Cambridge - I think Manchester has it.
Maz
-- not daring to walk on the streets for the next few days...
A true sign of the global-ness of the web. I grew up in Manchester, went to the University of Manchester, married a US Citizen, I now live in the US, go to /. and I end up reading about projects in my old University.
It is a small world...
I too have seen the AMULET processor and in fact have a final year degree exam on the very fundamentals of the processor. The new version of the Amulet (3i) is currently awaiting fabrication from what I am aware. I have lectures from a number of the design team, including Steve Furber, and have seen working examples of the processor. I believe there is also an ARM9 which is available with the asynchronous multiplyer from the AMULET processor. This allows the processor to be optimised and use less power. There are a number of aims surrounding the AMULET, mainly in low power, low EMC and actually proving that you can use asynchronous technology in real worlds applications. For anyone who doubts the use of this technology it has incredible potential. I think its best feature is that ability to enter and idle state where no power is consumed. Anyone wanting to use one of these processors should do a degree at Manchester!
Great! I have an apposite and timely post at last!
Five minutes ago, I clicked the online submission button indicating that I would NOT choose (for my 3rd year project at Manchester University):
Porting Linux to Amulet3
Project: 704 Supervisor: DAE Categories: SH=C
Amulet3 is the latest asynchronous version of the ARM microprocessor. Last year, a student designed a demonstrator board based around Amulet3 + an on-board Xilinx chip. This project is to port a cut-down version of linux to the board. Several ports of linux to similar systems exist. See DAE for Details.
Ah well...
Interesting. This sounds a lot like the "data-driven" graphical language LabVIEW, which I spent about three years programming in.
In LabVIEW, the operators only execute when data is present. "Data present" is a condition inherent in the incoming data stream and, as you said, does not require an extra control line to indicate.
The operators themselves can be programs, being activated only when data appears. So, the language is extensible by creating custom "instruments" which are activated by the presence of data.
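To make the "fires only when data is present" idea concrete, here is a minimal sketch in Python. It is not how LabVIEW is implemented; the `Node` class, its method names, and the use of `None` to mean "no data present" are all assumptions made up for illustration.

```python
class Node:
    """A dataflow operator that fires only once every input has a value."""

    def __init__(self, func, n_inputs):
        self.func = func
        self.inputs = [None] * n_inputs   # None stands for "no data present"
        self.listeners = []               # (downstream node, input port) pairs
        self.output = None
        self.fire_count = 0

    def connect(self, downstream, port):
        """Route this node's output to one input port of another node."""
        self.listeners.append((downstream, port))

    def receive(self, port, value):
        """Data arriving on a port is the only thing that can trigger work."""
        self.inputs[port] = value
        if all(v is not None for v in self.inputs):
            self._fire()

    def _fire(self):
        args = self.inputs
        self.inputs = [None] * len(args)  # consume the inputs
        self.output = self.func(*args)
        self.fire_count += 1
        for node, port in self.listeners:
            node.receive(port, self.output)


# A custom "instrument" is just another node wired into the graph.
add = Node(lambda a, b: a + b, 2)
double = Node(lambda x: 2 * x, 1)
add.connect(double, 0)

add.receive(0, 3)   # only one input present: nothing fires
add.receive(1, 4)   # both present: add fires, which in turn fires double
print(add.output, double.output)
```

There is no scheduler and no tick anywhere in the sketch: the arrival of the last missing input is what starts the computation, which is the property being compared with self-timed hardware.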
I've always said implementing LabVIEW in hardware would be a kick!
--The QuantumHack
(no relationship to National Instruments except a satisfied customer)
www.backwoodsengineer.com
The ideas presented in the papers on the Theseus Logic site are interesting. However, the True/False/Null logic scheme defined seems to be vulnerable to glitches in gate inputs. A brief transition to a valid state on all inputs as the previous stage's logic settled would be interpreted as a new input datum by the gate in question, possibly resulting in unwanted output being produced. In other words, using T/F/N logic seems to place stricter timing requirements on input signals than clocked logic with edge-triggered registers.
Is this correct, or am I missing something? I realize that glitching can be reduced by careful logic design, but this seems to be an issue that is addressed neither in your post nor in the papers on the Theseus site.
Yes, all very interesting, but this is hardly a new project! I have a recollection of reading about the Amulet project back in the heady days of the ARM 3. I think it might have even had a fairly large lump of magazine dedicated to it when it was still called Micro User! But ancient history aside, it's good that people are still pushing ARM processors even though x86 seems to have all but won the war. Even Intel seem to think so as there's a 400MHz StrongARM due real soon now, I hear.
Still, long live Arthur!
ARM has some market share - the ARM chip is used in all sorts of small low-power devices. The most popular of which is probably the Psion range.
Asynchronous logic appears, every once in a while, as a "new" hot topic within VLSI and computer architecture research. Yet it has consistently failed to offer the benefits it promises. Why?
It is true that clocks in synchronous design consume a great deal of power, but when low power designs are required, it is well understood how to gate and conditionalize clocks so they don't use power when the associated logic is not operating.
And asynchronous design has to be much more conservative than a synchronous design. With a synchronous design, a chip can be designed to operate at the maximum frequency, and then binned down if it fails to meet its target.
However, an asynchronous design requires that the delay lines be very conservatively designed: if the delay line were a little faster and the logic a little slower on the worst-case critical path, the chip would fail completely. The result is a processor that is slower by design.
Finally, the design methodology for building pipelined, synchronous devices is well understood as a purely digital system, while asynchronous logic relies on building delay lines, which are essentially analog operations. That is a great disadvantage.
Test your net with Netalyzr
I just finished a computer architecture course here at college (in fact, I'm just out of the final exam). Our main project for the semester was to build a behavioral and structural model of a pipelined ARM7 processor.
At this point in my life, there is not much I hate more than the ARM architecture. Well, maybe complexity theory... but that final doesn't begin for another hour, so I'm okay with it, I guess...
You should never take life too seriously - You'll never get out of it alive.
And I'm still trying to figure out why asynchronous smaller bandwidth (number of lines) buses are faster than synchronous parallel (more data lines).
They aren't; what asynchronous logic in an IC context deals with is reducing power consumption by not clocking all parts of the chip all of the time.
In a synchronous microprocessor, the system clock is distributed to all functional units, and the functional units, even when not in use, usually wind up having some kind of internal state change every clock cycle. This results in a lot of heat production, because every time the state of a bit in a register or of a bus line changes, heat is dissipated (by nature of the way the parasitic capacitances are charged and discharged).
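For reference, the usual first-order textbook model of this switching loss in CMOS is given below. The symbols are not from the post and are assumptions of the model: C is the capacitance of the node being switched, V_dd the supply voltage, f the clock frequency, and alpha the activity factor (the fraction of cycles in which the node goes through a full charge and discharge).

```latex
E_{\text{transition}} \approx \tfrac{1}{2}\, C\, V_{dd}^{2},
\qquad
P_{\text{dynamic}} \approx \alpha \, C \, V_{dd}^{2} \, f
```

In this model a unit whose nodes never toggle (alpha = 0) contributes no switching power, which is why stopping idle units from changing state every cycle matters. Leakage is ignored here.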
In a truly asynchronous microprocessor, there is no master system clock distributed to the functional units of the chip. Instead, actions in a functional unit take place when input data changes (i.e. new input data arrives). This results in only the state of units being used changing, which in turn means much less power dissipation if only one or two units are being used at a given time.
In practice, real systems don't fit into either category. Fully synchronous circuits burn a lot of power, but truly asynchronous circuits are difficult to design and are very sensitive to certain types of process variation. An often-used compromise is to use gated clocks: a synchronous clock is propagated, but only to the functional units that are being used. This principle is extended within the functional units themselves; internal clocks and data are propagated only when they need to be for the operation being performed. This results in a circuit that is much easier to design and fabricate than a truly asynchronous circuit, and that is almost as good from a power consumption point of view.
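A rough way to see the effect of gating is to count how many clock edges actually reach functional units. The sketch below is only a toy model: the function name, the four-unit core, and the activity schedule are all hypothetical, and it treats every delivered clock edge as costing the same amount of energy.

```python
def clock_events(schedule, n_units, gated):
    """Count clock edges delivered to functional units.

    schedule: for each cycle, the set of unit indices doing useful work.
    gated:    if False, every unit is clocked every cycle; if True, only
              the units in use that cycle receive the clock.
    """
    total = 0
    for active in schedule:
        total += len(active) if gated else n_units
    return total


# Hypothetical 4-unit core over 6 cycles; the numbers are made up.
schedule = [{0}, {0, 3}, {1}, {0}, set(), {2, 3}]

ungated = clock_events(schedule, n_units=4, gated=False)
gated = clock_events(schedule, n_units=4, gated=True)
print(ungated, gated)   # prints "24 7"
```

With this made-up schedule the gated design delivers 7 clock edges instead of 24. A truly asynchronous design would be counting data arrivals rather than clock edges, but the amount of activity would be similar, which is the "almost as good" point above.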
I hope this clarifies what the debate over asynchronous computing is about.
ARM has the largest marketshare of any 32 bit embedded processor. It overtook the 68K last year, when around 150-180 million ARM chips were sold.
Oh, forgot. Sony is also an Epoc licensee - and they make cool devices!
Go ARM!
it's in my head
Theseus Logic have some interesting papers on asynchronous logic design on their website, not directly connected to the story, but they're interesting nonetheless.
Choice of masters is not freedom.
In a self-timed circuit, the instant the gate changes, the next phase of the circuit is ready to go so there is no time "wasted" (19ns in the above example) waiting for the next clock.
This concept of uncertainty (between how much time the gate really takes to propagate and what the published maximum is) is also the reason why a small number of asynchronous lines can be faster than more synchronous lines. The more lines you have, the higher the possibility that there will be "skew" (i.e., different propagation delays) through them, hence you have to wait longer for all of them to come to the same state. The fewer the lines, the lower the skew, the less you have to wait (there is a reason why USB is a serial bus, not a parallel one).
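Both points can be put into a small back-of-the-envelope model. This is only a sketch under stated assumptions: the function names and all delay figures are hypothetical, and the `handshake_ns` parameter is an added assumption standing in for the completion-detection cost that a real self-timed stage would pay.

```python
def clocked_latency(stage_delays_ns, clock_period_ns):
    """Every stage costs one full clock period, however fast it really was."""
    if max(stage_delays_ns) > clock_period_ns:
        raise ValueError("clock period is shorter than the slowest stage")
    return len(stage_delays_ns) * clock_period_ns


def self_timed_latency(stage_delays_ns, handshake_ns=0):
    """Each stage hands off as soon as it is done, plus a handshake cost."""
    return sum(d + handshake_ns for d in stage_delays_ns)


def bus_settle_time(line_delays_ns):
    """A parallel word is only valid once its slowest line has arrived."""
    return max(line_delays_ns)


# Hypothetical actual stage delays, with a clock sized for the worst case.
stages = [4, 9, 5]
clocked = clocked_latency(stages, clock_period_ns=10)
self_timed = self_timed_latency(stages, handshake_ns=1)

# Hypothetical per-line propagation delays for a wide and a narrow bus.
wide = bus_settle_time([5, 7, 6, 9])
narrow = bus_settle_time([5])
print(clocked, self_timed, wide, narrow)   # prints "30 21 9 5"
```

The clocked pipeline pays for the worst case three times over, while the self-timed one pays only for what each stage actually took plus the handshake. The wide bus is limited by its slowest line, so adding lines can only keep the settle time the same or make it worse.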
There's some interesting reading on this topic at www.theseus.com. (I have no connection to them)
Yep. ARMs and StrongARMs are selling well, in PDAs (Palm may migrate to it soon) and network computerish type things. Very good for embedded devices.
I once heard about an ARM chip running off the waste heat of a Pentium, almost as fast. Sod the heatsink... shove a co-processor on there! =)
http://developer.intel.com/design/strong/quicklist/eval-plat/sa-110.htm
This is an ARM chip on a PCI card. You can also get it with a little backplane and build your own linux ARM box. Fun fun.
I need more me's, or more time in the day. So many fun things to hack, so little time.
--
blue
i browse at -1 because they're funnier than you are.
Not having been up on the topic, I just never thought that integrated circuit logic was an alternative. And I'm still trying to figure out why asynchronous smaller bandwidth (number of lines) buses are faster than synchronous parallel (more data lines). But I guess the speed has at least something to do with the noise tolerance. Anyway, I'm reading from one of the links followed from the site that seems to be a pretty good explanation/history of the asynchronous logic.