Slashdot Mirror


Building a 32-Bit, One-Instruction Computer

Hugh Pickens writes "The advantages of RISC are well known — simplifying the CPU core by reducing the complexity of the instruction set allows faster speeds, more registers, and pipelining to provide the appearance of single-cycle execution. Al Williams writes in Dr Dobbs about taking RISC to its logical conclusion by designing a functional computer called One-Der with only a single simple instruction — a 32-bit Transfer Triggered Architecture (TTA) CPU that operates at roughly 10 MIPS. 'When I tell this story in person, people are usually squirming with the inevitable question: What's the one instruction?' writes Williams. 'It turns out there's several ways to construct a single instruction CPU, but the method I had stumbled on does everything via a move instruction (hence the name, "Transfer Triggered Architecture").' The CPU is implemented on a Field Programmable Gate Array (FPGA) device and the prototype works on a 'Spartan 3 Starter Board' with an XS3C1000 device available from Digilent that has the equivalent of about 1,000,000 logic gates, costing between $100 and $200. 'Applications that can benefit from custom instruction in hardware — things like digital signal processing, for example — are ideal for One-Der since you can implement parts of your algorithm in hardware and then easily integrate those parts with the CPU.'"

63 of 269 comments

  1. That instruction is .......... by 140Mandak262Jamuna · · Score: 4, Insightful
    -------------drum roll

    0x2A

    That is the ultimate instruction.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:That instruction is .......... by MozeeToby · · Score: 2, Funny

      Unless of course, the ultimate question really is 'What is 6 times 9?' as some people believe (meaning 42 is base 13 for some unknown reason). Which would of course make the ultimate instruction 0x36.

    2. Re:That instruction is .......... by MozeeToby · · Score: 4, Informative

      Hence the '42 is in base 13' part of my comment. 42(base 13) == 54(base 10) == 36(base 16). Of course, Adams himself denied this was the case... "No one writes jokes in base 13" but after this theory emerged he did work it into some of his later jokes, probably just to keep people wondering.
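
      The arithmetic in this comment can be checked with a short Python sketch (an editorial illustration, not part of the original post); the built-in int() accepts a base argument:

```python
# Read the numeral "42" in base 13 and compare it with 6 * 9.
value = int("42", 13)      # 4 * 13 + 2
product = 6 * 9

print(value, product, hex(value))
```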

    3. Re:That instruction is .......... by dgatwood · · Score: 2, Funny

      Appropriate that the ultimate instruction would also be a wildcard (*) in ASCII.

      And speaking of your drums, on Apple II, it's rotate accumulator left, the ROL instruction.

      How curious.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    4. Re:That instruction is .......... by EkriirkE · · Score: 2, Funny

      But that's just 0xBAADF00D

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    5. Re:That instruction is .......... by mwvdlee · · Score: 2, Funny

      That's the only thing you can get at the 0xFEEDCAFE

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    6. Re:That instruction is .......... by mwvdlee · · Score: 2, Interesting

      It's got only one instruction. ...and the first parameter to that instruction controls what the instruction does with the rest of the parameters.

      (p.s. I wish this was just a joke, but this is pretty much what it seems to be doing)

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    7. Re:That instruction is .......... by nogginthenog · · Score: 2, Funny

      Shouldn't that be 0xABADCAFE ?

    8. Re:That instruction is .......... by MagicM · · Score: 2, Funny

      Is that where they sell 0xBADC0FEE?

    9. Re:That instruction is .......... by Anonymous Coward · · Score: 5, Funny

      This thread can be categorized as 0xNONEOFTHISISFUNNY

    10. Re:That instruction is .......... by psYchotic87 · · Score: 2, Interesting

      What you describe is pretty much how the linux kernel handles system calls. See this: How system calls work on linux/i86

      For an example of what a single instruction CPU might look like, take a look at this: Building the Turing complete coffee machine: an adequate assembly language

    11. Re:That instruction is .......... by V!NCENT · · Score: 2, Funny

      You mean 10 buttons?

      --
      Here be signatures
    12. Re:That instruction is .......... by Thud457 · · Score: 4, Funny

      This thread can be categorized as 0xNONEOFTHISISFUNNY

      I don't get it.
      That's not a valid hexadecimal number.

      --

      the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

  2. "ideal for One-Der"? by mpoulton · · Score: 4, Insightful

    It seems specious to say that One-Der is optimal for a task because it offers the flexibility of the underlying FPGA hardware. If you have the FPGA hardware present to run the One-Der implementation, then you could just configure a more optimally designed processor out of it for whatever task you are actually performing.

    --
    I am a geek attorney, but not your geek attorney unless you've already retained me. This is not legal advice.
    1. Re:"ideal for One-Der"? by Bakkster · · Score: 2, Interesting

      But most FPGAs utilize a CPU core, which is often hard-wired and has ports to access the programmable elements. Assuming the single-instruction MIPS runs faster than the common 'standard' CPUs such as PowerPC, then there would be a benefit. The CPU could be smaller (leaving more space for programmable elements) and more easily expanded upon (run additional functions by address rather than by OPCODE).

      That's a big 'if', but there's merit in exploring it. The biggest barrier I can think of right now is with programming time, and that's the most expensive part of most FPGA projects already.

      --
      Write your representatives! Repeal the 2nd Law of Thermodynamics!
    2. Re:"ideal for One-Der"? by SharpFang · · Score: 2, Interesting

      FPGA is usually the prototype phase.

      Actually, this could be implemented as a really small handful of transistors for the actual processor and a ton of various memory-mapped peripherals. Some of them being really simple old basic logic chips for ALU.

      It would mean a simple version for cheap microcontrollers would be really cheap to make, a family of compatible devices of different scale would be possible, and extending/upgrading existing instruction set would be easy too.

      The above is not a conflicting statement with the 1-instruction set idea. The MOV would not be really THE instruction set. The real instructions would be "place data in register A, read result from register B" and the memory map would be the real instruction set.

      But I just can't see it for anything bigger. It might be used for massively multicore processors if the address and data bus could be shared somehow. But I think it would be the bottleneck really fast.

      --
      45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  3. nihilist by Nadaka · · Score: 4, Insightful

    vaguely reminds me of the nihilist language joke. A language that realizes that ultimately all things are futile and irrelevant, thus allowing all instructions to be reduced to a no-op.

    1. Re:nihilist by Anonymous Coward · · Score: 4, Funny

      ... and then it does dead code elimination, right?

    2. Re:nihilist by krkoch · · Score: 2, Funny

      Or just the instruction KAH: Push the status register to stack and proceed to kill all humans.

  4. He's Building a One-Der, Stop Him by eldavojohn · · Score: 5, Funny

    Everyone attack him before he wins this round of Age of Empires. Quickly, he's probably low on resources right now.

    --
    My work here is dung.
    1. Re:He's Building a One-Der, Stop Him by Iamthecheese · · Score: 3, Funny

      Some of us are recovering AOE addicts you insensitive clod!

      --
      If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
    2. Re:He's Building a One-Der, Stop Him by Quantumstate · · Score: 4, Funny

      Some of us are still addicted you insensitive clod!

  5. Cheating? by happy_place · · Score: 4, Insightful

    So the one instruction is essentially a move command that has multiple modes... Ahem. Isn't that cheating? Isn't move considered two instructions already, a load and store? I guess this is really dependent upon how you define what is and isn't an instruction.

    --
    http://www.beanleafpress.com
    1. Re:Cheating? by Anonymous Coward · · Score: 2, Interesting

      Erm, no. The canonical single instruction machine uses "subtract and branch if negative" and that's not considered to be three instructions (subtract, test, branch) but one.
      Using memory-mapped facilities to perform operations like addition...now THAT is cheating.

    2. Re:Cheating? by quickOnTheUptake · · Score: 2, Informative

      Using memory-mapped facilities to perform operations like addition...now THAT is cheating.

      Isn't that what it does?
      Strikes me that that is just complicating things, insofar as you still effectively have multiple instructions, there is just another semantic level tacked on to hide them.

      --
      Mod points: Guaranteed to remove your sense of humor.
      Side effects may include gullibility and temporary retardation
    3. Re:Cheating? by Talennor · · Score: 2, Informative

      So the one instruction is essentially a move command that has multiple modes... Ahem. Isn't that cheating?

      Yes, it is cheating. He basically took the instruction bits of the program and said, "Behold, for they are now address bits!" With the caveat that the address bits happen to address INSTRUCTIONS. It's all pretty brain-dead.

      --

      //TODO: signature
    4. Re:Cheating? by maxwell+demon · · Score: 5, Interesting

      I'd also consider it cheating. I can also invent a one-instruction computer, where the one instruction is a move immediate instruction. The move instruction moves a byte-sized value into a "command register" which does different things depending on the value of the byte you load into it and the current state of the machine. Indeed, since there's just one instruction, and it always has a single one-byte operand, I just don't encode the instruction itself, I just put all the operands into memory, one after another. And I define the state machine so that the actions are exactly the same as the actions of an x86 interpreting those bytes as separate instructions. Therefore I can avoid doing an implementation myself; I can just use a stock x86 processor as proof of concept.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    5. Re:Cheating? by wd5gnr · · Score: 2, Interesting

      Well with bias, I don't think that's exactly the point. In fact, it is more like exposing the microcode engine directly to the programmer. The advantage is that you can readily add function blocks and therefore instructions without having to know about how the CPU works internally. So regardless of if it is really "one instruction" or not, it could be useful for quickly building application-specific CPUs, or even building CPUs on the fly to best suit certain programs (which could be interesting for "big iron" running lots of CPUs to solve a big problem, for example).

    6. Re:Cheating? by Nazlfrag · · Score: 2, Interesting

      I suppose it's cheating. I think it's useful though simply as a backbone for a custom processor, then patch in what you need. You might need an ALU and DSP for a complex project, and an accumulator & bit shifter for a simpler one. This lets you link them to a common bus architecture which could make for easy prototyping.

  6. GOTO ... by gstoddart · · Score: 4, Funny

    I vote for GOTO as the only instruction.

    That would be hilarious.

    Cheers

    --
    Lost at C:>. Found at C.
    1. Re:GOTO ... by gstoddart · · Score: 2, Funny

      Actually, it sounds an awful lot like a COME FROM instruction.

      Well, if we're going with joke operations, I'm changing my vote to HCF. ;-)

      Cheers

      --
      Lost at C:>. Found at C.
    2. Re:GOTO ... by anotheregomaniac · · Score: 2, Funny

      The Jack Palance computer:

      Curly: Do you know what the secret of life is?
      Curly: This. [holds up one finger]
      Mitch: Your finger?
      Curly: One thing. Just one thing. You stick to that and the rest don't mean shit.
      Mitch: But what is the "one thing?"
      Curly: [smiles] That's what you have to find out.

  7. Can be a bit tricky to program... by nokiator · · Score: 5, Interesting

    I built a single instruction microprocessor at grad school. The only instruction was to move a 32-bit data from one address to another address. All the ALU and I/O functions were memory mapped. For example, you could have an adder where address A was operand #1, address B was operand #2 and address C was the result. Branches were handled through ALU units where the result of the operation changed the instruction pointer for some future instruction. It was very easy to implement and notoriously difficult to program.
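
    A minimal sketch of the kind of machine described above, written in Python purely for illustration. Every address, the branch-if-zero unit, and the separate program store are assumptions made up for this example; the original design's memory map is not given in the comment.

```python
# Hypothetical memory map (all addresses invented for illustration).
ADD_A, ADD_B, ADD_OUT = 100, 101, 102   # adder: OUT = A + B
SUB_A, SUB_B, SUB_OUT = 110, 111, 112   # subtractor: OUT = A - B
JZ_TARGET, JZ_TEST = 120, 121           # writing 0 to JZ_TEST jumps to JZ_TARGET
JMP = 122                               # writing an address here always jumps
HALT = 123                              # any write here stops the machine


def run(program, ram, max_steps=10_000):
    """Execute a list of (src, dst) moves; the move is the only instruction."""
    units = {ADD_A: 0, ADD_B: 0, SUB_A: 0, SUB_B: 0, JZ_TARGET: 0}
    pc = 0
    for _ in range(max_steps):
        src, dst = program[pc]
        pc += 1
        # Reading: result ports compute on demand, everything else is storage.
        if src == ADD_OUT:
            value = units[ADD_A] + units[ADD_B]
        elif src == SUB_OUT:
            value = units[SUB_A] - units[SUB_B]
        elif src in units:
            value = units[src]
        else:
            value = ram[src]
        # Writing: some destinations trigger side effects.
        if dst == HALT:
            return ram
        elif dst == JMP:
            pc = value
        elif dst == JZ_TEST:
            if value == 0:
                pc = units[JZ_TARGET]
        elif dst in units:
            units[dst] = value
        else:
            ram[dst] = value
    raise RuntimeError("step limit reached")


LOOP, END = 1, 9                 # program indices used as jump targets
# RAM cells: 0 = n, 1 = total, 2 = constant 1, 3 and 4 = jump addresses.
ram = {0: 5, 1: 0, 2: 1, 3: LOOP, 4: END}
program = [
    (4, JZ_TARGET),    # 0: arm the branch unit with END
    (0, JZ_TEST),      # 1: LOOP: if n == 0, jump to END
    (1, ADD_A),        # 2: total -> adder operand A
    (0, ADD_B),        # 3: n -> adder operand B
    (ADD_OUT, 1),      # 4: total = total + n
    (0, SUB_A),        # 5: n -> subtractor operand A
    (2, SUB_B),        # 6: 1 -> subtractor operand B
    (SUB_OUT, 0),      # 7: n = n - 1
    (3, JMP),          # 8: jump back to LOOP
    (0, HALT),         # 9: END
]
result = run(program, ram)
print(result[1])       # sum of 5 + 4 + 3 + 2 + 1
```

    Note that even constants and jump targets have to live in memory cells, because a pure move has no immediate operand. That is a fair taste of why such machines are easy to build and tedious to program.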

    1. Re:Can be a bit tricky to program... by purpledinoz · · Score: 4, Interesting

      For a few seconds there, I thought you said grade school. Made me feel very inferior :) Wouldn't the complexities of programming it be handled by a compiler? If someone managed to write one for a 1 instruction processor?

    2. Re:Can be a bit tricky to program... by multipartmixed · · Score: 2, Interesting

      Interesting.

      First off, your one-instruction CPU, I guess you didn't need to express the instruction in machine code, just the arguments.

      Here's the funny question, why not develop an assembler with synthetic instructions, like SPARC v9? It would certainly make it easier to program.
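
      A rough sketch of what such an assembler could look like, in Python. The mnemonics, operand order, and functional-unit addresses are all hypothetical; the idea is only that each synthetic instruction expands into a fixed pattern of raw moves.

```python
# Hypothetical functional-unit addresses, invented for this sketch.
UNITS = {
    "ADD": (100, 101, 102),   # (operand A port, operand B port, result port)
    "SUB": (110, 111, 112),
}


def assemble(source):
    """Expand synthetic instructions into raw (src, dst) moves."""
    moves = []
    for line in source.strip().splitlines():
        line = line.split(";")[0].strip()      # drop comments
        if not line:
            continue
        op, *args = line.replace(",", " ").split()
        args = [int(a) for a in args]
        if op == "MOV":                         # MOV dst, src
            dst, src = args
            moves.append((src, dst))
        elif op in UNITS:                       # e.g. ADD dst, a, b
            dst, a, b = args
            port_a, port_b, port_out = UNITS[op]
            moves += [(a, port_a), (b, port_b), (port_out, dst)]
        else:
            raise ValueError(f"unknown synthetic instruction: {op}")
    return moves


listing = assemble("""
    ADD 2, 0, 1     ; mem[2] = mem[0] + mem[1]
    SUB 3, 2, 0     ; mem[3] = mem[2] - mem[0]
    MOV 4, 3        ; mem[4] = mem[3]
""")
for move in listing:
    print(move)
```

      Three lines of synthetic source become seven moves here, which is the trade the comment is pointing at: the programmer sees familiar mnemonics while the hardware still only sees address pairs.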

      --

      Do daemons dream of electric sleep()?
  8. Wrong part number in summary by mako1138 · · Score: 5, Insightful

    It's XC3S1000, not XS3C1000. Been working with these parts too long...

  9. So old it's new. by LaminatorX · · Score: 5, Insightful

    Sounds a hell of a lot like the read/write head of the Turing Machine to me.

  10. What's the one instruction? by Chris+Mattern · · Score: 5, Funny

    Why, DWIW (Do What I Want), of course.

    1. Re:What's the one instruction? by V!NCENT · · Score: 4, Funny

      get me a sandwich is not in the sudoers file. This incident will be reported.

      --
      Here be signatures
    2. Re:What's the one instruction? by wd5gnr · · Score: 2, Informative

      I thought it was: Anonymous Coward is not in the sudoers file. This incident will be reported. ;-)

  11. One instruction... by hey · · Score: 3, Insightful

    ... whose first operand is the task to perform. Followed by the necessary operands for that task.

    1. Re:One instruction... by pz · · Score: 5, Interesting

      ... whose first operand is the task to perform. Followed by the necessary operands for that task.

      Exactly. It isn't a single instruction computer.

      And the idea isn't new.

      If a single instruction architecture is designed, then there is only one instruction (duh), and there's no reason to encode that instruction in the instructions themselves. All that will be left is encoding for operands. There's a tempting but brief foray into semantics where you can argue that the first handful of bits in TFA's instruction set are operands to the execution control unit, but that is, in fact, what most would consider defining a set of instructions where each distinct value in that first handful of bits describes more-or-less a distinct instruction. One quickly realizes, however, that there is a fundamental difference between data operands and instruction operands, and, by stating that it is a single instruction architecture, the implication is that there are no instruction operands. Therefore, TFA's architecture is not single instruction.

      It's well known that there are universal logic elements (like the two-input NOR gate), and, by extension, you can create single instruction architectures that compute the universal logic element operation on two arguments, writing the results to a third. Instructions in such architectures are just memory locations -- source A, source B and destination. While incredibly simple, such a machine is going to have a very, very low instruction set density. It's an interesting project for intellectual curiosity (like in an introductory graduate level machine architecture course) but hardly worthy of a Slashdot front page mention.
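
      To make the NOR idea concrete, here is a small Python sketch under stated assumptions: 32-bit words, a straight-line program with no control flow, and a memory layout invented for the example. Each instruction is nothing but three addresses.

```python
MASK = 0xFFFFFFFF  # 32-bit words, an assumption for this sketch


def run_nor(program, mem):
    """Each instruction is three addresses: mem[dst] = NOR(mem[a], mem[b])."""
    for a, b, dst in program:
        mem[dst] = ~(mem[a] | mem[b]) & MASK
    return mem


# Cells 0 and 1 hold the inputs; 2 and 3 are scratch; 4 receives x AND y.
and_program = [
    (0, 0, 2),   # cell 2 = NOT x
    (1, 1, 3),   # cell 3 = NOT y
    (2, 3, 4),   # cell 4 = NOR(NOT x, NOT y) = x AND y
]
mem = run_nor(and_program, [0b1100, 0b1010, 0, 0, 0])
print(bin(mem[4]))
```

      A single AND already costs three instructions and two scratch cells, which illustrates the low instruction set density mentioned above.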

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
  12. Memory of this from Engineering School by systemeng · · Score: 3, Funny

    I remember hearing about building a one instruction computer back in engineering school. The one I heard about was based on Subtract and Branch if Not Equal. My roommate at the time figured it ought to be a way to get a very high clock rate. It seems like he found a proof in a hoary old book that such a computer was in fact Turing complete. I'm sure I'll get flamed for posting a vague recollection but. . . here it is.

  13. AAA AA A A by tonique · · Score: 5, Funny

    AA A AA  AAAA A  AAA AA   A A  AA  A A AAA    A A AAAA    AAA  AAAA

  14. Not new, and not too useful by Animats · · Score: 5, Interesting

    That's an old idea. The classic "one instruction" is "subtract, store, and branch if negative". This works, but the instructions are rather big, since each has both an operand address and a branch address.

    Once you have your one instruction, you need a macroassembler, because you're going to be generating long code sequences for simple operations like "call". Then you write the subroutine library, for shifting, multiplication, division, etc.
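
    A sketch of both halves of that idea in Python: a subtract-and-branch-if-negative interpreter plus a toy macro layer. The separate program store, the halt-on-running-off-the-end rule, and the macro names are assumptions for illustration, not a description of any particular historical machine.

```python
def run_sbn(program, mem, max_steps=10_000):
    """One instruction: mem[b] -= mem[a]; if the result is negative, goto c."""
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):        # running off the end halts
            return mem
        a, b, c = program[pc]
        mem[b] -= mem[a]
        pc = c if mem[b] < 0 else pc + 1
    raise RuntimeError("step limit reached")


class Macros:
    """Toy macro assembler; every macro falls through to the next instruction."""

    def __init__(self, scratch):
        self.code = []
        self.z = scratch              # scratch cell that holds 0 between macros

    def _sbn(self, a, b):
        nxt = len(self.code) + 1      # branch target equals fall-through target
        self.code.append((a, b, nxt))

    def clr(self, x):                 # x = 0
        self._sbn(x, x)

    def add(self, src, dst):          # dst += src
        self._sbn(src, self.z)        # z = -src
        self._sbn(self.z, dst)        # dst = dst - (-src)
        self.clr(self.z)              # restore z = 0

    def mov(self, src, dst):          # dst = src (src and dst must differ)
        self.clr(dst)
        self.add(src, dst)


# Cells: 0 = x, 1 = y, 2 = result, 3 = scratch.
m = Macros(scratch=3)
m.mov(0, 2)                           # result = x
m.add(1, 2)                           # result += y
mem = run_sbn(m.code, [7, 5, 0, 0])
print(mem[2], len(m.code))
```

    Copying one value and adding another takes seven instructions of three addresses each, which is the code-density cost described in this comment.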

    It's a lose on performance. It's a lose on code density. And the guy needed a 1,000,000 gate FPGA to implement it, which is huge for what he's doing. Chuck Moore's original Forth chip, from 1985, had fewer than 4,000 gates and delivered good performance, with one Forth word executed per clock.

  15. RISC vs CISC - sigh by peter3125 · · Score: 2, Informative

    "The advantages of RISC are well known — simplifying the CPU core by reducing the complexity of the instruction set allows faster speeds, more registers, and pipelining to provide the appearance of single-cycle execution." I know this has been argued to death already - but it just isn't completely true that a RISC has advantages over a CISC. The gain in speed is usually negated by the lack of expressiveness and the number of registers would help a CISC just as much as a RISC. Why is this being dragged up again?

  16. "One-der" by porges · · Score: 4, Insightful

    The hyphen being so everyone doesn't call it "The O-need-er", as in That Thing You Do.

  17. Re:AAA AA A A by Yvan256 · · Score: 5, Funny

    Compile error. Instruction "A" missing after "A".

  18. One command? by HockeyPuck · · Score: 2, Interesting

    Reminds me of this old saying,

    "Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work."

    I just wish I knew who came up with it.

  19. Re:Ummmm by Anonymous Coward · · Score: 2, Informative

    This isn't true. Modern processors are highly RISCy -- they just have front-ends that translate from CISC ISAs. The last genuinely CISC processor was, I believe, the Pentium (non-pro edition).

  20. According to MS the instruction is by KiwiCanuck · · Score: 3, Funny

    nop

  21. Re:Ummmm by julesh · · Score: 5, Informative

    Is it just me, or does this sound like RISC fanboyism from the 1990s? The "advantages" of RISC are not nearly so clear these days. Indeed, it is getting rather hard to find real RISC chips. While there are chips based on RISC ISA idea (like being load/store and such), they are not RISC. RISC is about having few instructions and instructions that are simple and only do one thing. Those concepts are pretty much thrown out when you start having SIMD units on the chip and such.

    I wouldn't say that's what RISC was about at all; the basic idea was to have only instructions that could be implemented using a few simple pipeline stages. This is a substantial improvement over the microcoded architectures that were prevalent prior to RISC, because it can be much more easily pipelined (or, indeed, pipelined at all). I don't see SIMD as incompatible with RISC in any fashion; it just happens that the instruction operates on very wide data, but it's still a relatively simple instruction that should be able to complete quite quickly.

    These days complex processors are the norm. They have special instructions for special things and that seems to work well. RISC is just not very common, even in systems with a RISC heritage.

    I'd say it's more the other way around. Even in systems with a CISC ISA (e.g. x86), you tend to find that under the hood the CISC instructions are translated into a series of microops that are then dispatched in a system that is somewhat RISC-like. The most common processor family in the world is the ARM family, and all of those processors subscribe pretty well to the original principles of RISC, from instruction set to internal design of the processor core.

    All of these are much more faithful to the principles of RISC than the chip described in TFA, whose instruction performs two memory accesses on each execution -- note that the removal of such instructions and consequent simplification of the execution pipeline (by having only a single pipeline stage that could access memory) was the original motivation behind RISC architectures.

  22. Re:Not the Ultimate RISC Architecture by harry666t · · Score: 4, Interesting
  23. I think it's misleading to call it 1 instruction by shoor · · Score: 2, Insightful

    There can be different architectures for computers, but, nowadays, for many of us, I'd say there is one particular model of an architecture that is likely to be the only one we're really familiar with, and that automatically comes to mind when one speaks of a computer architecture. It's a rather compartmentalized architecture in which the CPU is the place where opcodes are executed and memory is just a big flat address space for data, including instructions. This "transfer triggered" architecture strikes me as being not so much a 1 instruction computer as one where instructions are implemented in a less compartmentalized fashion, spread out among special units activated by addresses, as opposed to the more plain architecture where bit patterns on the address bus simply activate individual generic memory cells along with a read/write signal. More than that may happen, cache memory comes into play with all its complications for instance, but the 'model' for the programmer is that simple one.

    --
    In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)
  24. Re:AAA AA A A by Dr.+Evil · · Score: 4, Funny

    Press the key to continue.

  25. I'd settle for base 2 by macraig · · Score: 4, Funny

    All this talk about 13th Base makes me jealous, 'cause I've never even got to 2nd Base yet. I'll have to die first and go to heaven before I'll get to 13th Base with a chick.

  26. Geez, History Repeats Itself by CAOgdin · · Score: 2, Informative

    I invented this and published it more than 30 years ago, during the early debate between CISC and RISC microprocessors. It was in the (now defunct) "Modern Data" magazine, in my column "Carol's Microcosm." It's an obvious solution for any computer programmer who understands hardware logic.

  27. Could really crank up the speeds by straponego · · Score: 3, Funny

    ...if the one instruction is NOP. He could easily crack the petanop barrier.

  28. Oh my, you'll never believe what I'm about to say by stonewolf · · Score: 4, Interesting

    A cousin of mine (Howdy Rusty!) described this concept to me in the '70s while I was taking classes toward my CS degree.

    A little background: I went to the good old University of Utah which had a Burroughs B1700 with user writable microcode and so a lot of projects centered around writing microcode and designing micro architectures. A friend was trying to code up a single instruction machine based on Curry Combinators. I thought he was nuts, but I liked the idea of a single instruction machine. So, I was talking to my cousin and he described an architecture that had one instruction that was a source and a destination address. Any address could be either memory or a register in a functional unit, an FU for short. No kidding, that is how he described it.

    The only trouble was trying to figure out how to do a conditional branch.

    A few years later while I was in gradual school I solved that problem and wrote a paper about it. Being a gradual student I could not publish without permission from my adviser. Well, he got a good laugh out of the idea and told me not to show it to anyone. So, of course I sent it to everyone I knew. They all had a good laugh too. Said it was the funniest thing I had ever written. You see, I was into writing humorous stories at the time and people thought this was another one. Oh well, I have a print out of the thing around here somewhere.

    What I really liked about the architecture is that if you started modifying it to make it more economical, doing things like making the addresses have different lengths and adding a bit to tell you if the long address is the source or the destination, the move architecture starts looking more and more like a classic instruction set architecture. I thought that was very cool. When you look at micro coded architectures and think about a pure move based processor it really does look like all traditional architectures are attempts to make the one instruction machine make more economical use of instruction bits.

    So, how did I solve the conditional branch problem? Pretty much the way this fellow did. Every FU may, or may not, cause condition flags to be set. I added registers where you could read and write the condition bits and read and write the program counter. I also added a mask register that was anded with the condition register so you could enable and disable conditions. Then I just made the current instruction conditional on the values of the flags register anded with the mask register. If the result was non-zero the current instruction was skipped. Of course, the machine had to clear the condition register after each instruction was executed. (Hmm, it would make more sense to only make moves to the program counter conditional and it would make more sense to only clear the flags after a move to the instruction counter... Hey, I was a gradual student back then! :) That approach allowed you to select say the sign bit from one ALU, do a subtraction by moving values to two registers in the ALU, then jump if the sign bit is set. It also let you directly make any instruction conditional so you could implement something like the ABS() function without any jumps. Or, at least that was the idea.
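
    One possible reading of that scheme, sketched in Python. The flag bits, register addresses, and the rule that flags survive for exactly one following instruction are assumptions chosen to make the example work; the original paper's details are not given here. The program is straight-line (no program counter) and computes ABS() without any jumps.

```python
NEG, NONNEG = 0b01, 0b10               # hypothetical condition bits
SUB_A, SUB_B, SUB_OUT = 110, 111, 112  # hypothetical subtractor ports
MASK = 120                             # hypothetical mask register


def run(program, mem):
    """Run (src, dst) moves; a move is skipped when flags & mask is non-zero."""
    regs = {SUB_A: 0, SUB_OUT: 0, MASK: 0}
    flags = 0
    for src, dst in program:
        skip = flags & regs[MASK]
        flags = 0                      # flags live for one following instruction
        if skip:
            continue
        value = regs[src] if src in regs else mem[src]
        if dst == SUB_B:               # writing operand B triggers A - B
            regs[SUB_OUT] = regs[SUB_A] - value
            flags = NEG if regs[SUB_OUT] < 0 else NONNEG
        elif dst in regs:
            regs[dst] = value
        else:
            mem[dst] = value
    return mem


def abs_via_moves(x):
    # Cells: 0 = x, 1 = constant 0, 2 = constant NEG (mask value), 3 = result.
    mem = [x, 0, NEG, 0]
    program = [
        (2, MASK),       # only the NEG flag can suppress an instruction
        (0, 3),          # result = x
        (1, SUB_A),      # A = 0
        (0, SUB_B),      # compute 0 - x and set the flags
        (SUB_OUT, 3),    # result = -x, skipped when -x is negative (x > 0)
    ]
    return run(program, mem)[3]


print([abs_via_moves(v) for v in (-7, 0, 7)])
```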

    I called my one instruction: The Conditional Move From Here To There And Clear Flags, or TCMFHTTACF instruction. The assembly for it was really dull, it just always had the same op code down the left hand edge of the screen... Ok, really, I just never listed anything but addresses when I wrote code for it.

    Nice to see that someone actually built one of these. BTW, this kind of architecture makes it easy to add multiple execution units. With parallel execution and careful use of shared and private FUs and memories you can build a pretty damn powerful special purpose processor without a lot of hardware complexity.

    This is just too damn cool... someone finally built it!

    Stonewolf

  29. the amazing zit shrinking cream by epine · · Score: 4, Interesting

    x86 is with us because of backwards compatibility. even Intel were unable to shrug it off with Itanium and various other things.

    x86 is still with us because is-gross turned out to be 20% is-gross and 80% with-gross. The 20% that actually is-gross has been a minor cross to bear, the other 80% was relegated to traps, microcode, and emulation. The most ridiculous CISC instruction from 1980 is a pimple on a bedbug in silicon area thirty years later. Moore's law: the amazing zit shrinking cream.

    you almost need a different compiler for each generation of CPUs

    If your compiler doesn't work well on a 486, it's badly broken. Since then, there have been two different approaches by Intel which annoy the compiler gods: the Pentium and Pentium IV which place a premium on low level instruction scheduling, and everything else, starting with the Pentium Pro and including the Core Duo, all non-deterministic data-flow architectures at heart.

    The main differences in a good Pentium Pro compiler was a few hazard-aware instruction order tweaks, mostly focused on the complex/simple/simple instruction decode architecture. Hand tweaking for the Pentium Pro did not offer as much as with other architectures. It was hard to gain complete control for cycle precise scheduling, and the OOO logic did a good job of mitigating dependency chains on the fly: you neither had a large problem to solve, nor much control in solving it.

    There's a rumour the trace cache is making a reappearance in Sandy Bridge, so perhaps the pendulum is swinging back to the Pentium/Pentium IV side of the fence.

    A long time ago I read some long papers on TTA, around the time Intel went the wrong direction with Itanium (defining bundles as a unit of independent instructions, rather than bundles as units of dependent instructions).

    What makes TTA interesting is having many buses, with as many buses utilized on each clock cycle as possible. This guy has not invented an instruction set. He has invented a microcode engine. In doing so, he's muddied the notion of processor state, so there's no abstraction for handling interrupts. The great thing on an FPGA is that you can program around the need for interrupts, if you can devote a small core to each concurrent task.

    Real microcode instructions tend to have very long bit vectors, so that multiple buses can be coordinated on the same clock cycles. If you aren't trying to throw maximal resources at a single, dominant task, you can instead have many concurrent execution engines, each with a single function unit bus. This works for some applications.

    My feeling about Itanium is that it should have allowed instruction clusters such as complex multiply in a single bundle.

    r = ac - bd
    i = ad + bc

    This requires four inputs from the register file, two outputs to the register file, four multiplications, and two additions. You can find many examples in TAOCP V4F1 of small instructions clusters of this nature. A single eight byte bundle will be hard pressed to encode six arbitrary registers from a 256 register set, but I would argue that you don't need to. Compilers are extremely clever at register colouring, so a clever subset of full generality would prove more than adequate. Hint: invent the compiler and prove this, before committing the design to silicon.
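
    The operation count can be seen directly in a short Python sketch (the function name is made up for illustration): four inputs, four multiplications, one subtraction, one addition, two outputs.

```python
def complex_multiply(a, b, c, d):
    """(a + bi)(c + di) using four multiplies and two add/subtract steps."""
    ac, bd, ad, bc = a * c, b * d, a * d, b * c   # four multiplications
    r = ac - bd                                    # real part
    i = ad + bc                                    # imaginary part
    return r, i


r, i = complex_multiply(3, 2, 1, 4)
print(r, i)
print(complex(3, 2) * complex(1, 4))               # cross-check with built-in complex
```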

    From a TTA perspective, such a bundle achieves six operations at the expense of just four reads and two writes to the shared register file, with some intermediate results briefly shunted on local sidings. Managing the local sidings introduces some non-determinism from the perspective of the compiler, but nowhere near the scope of OOO shunting overhead in the Pentium Pro.

    I think the Itanium design fell victim to ATM logic: determinism at the expense of higher aggregate throughput in the common case. That bet rarely pays off. They tricked themselves into believing they could bet against the grain by shuffling the downside of this fictio

  30. Re:Not the Ultimate RISC Architecture by epee1221 · · Score: 2, Funny

    Prolog implemented in hardware?

    --
    "The use-mention distinction" is not "enforced here."
  31. Pardon me for injecting something serious, but... by Bruce+Perens · · Score: 2, Insightful

    It seems to me that a transfer oriented architecture is conceptually very easy to parallelize.

  32. Re:Pardon me for injecting something serious, but. by Bruce+Perens · · Score: 2, Insightful

    The problem with social networking is that society, as an aggregate, sucks :-)