Next Generation Stack Computing

Assembly Code was fun by neonprimetime · 2006-08-10 07:23 · Score: 1, Informative

Sounds like fun. I remember writing in RISC assembly code. I fealt like a real man back then. C# just doesn't cut it.

fyi - The Open Office format Slide links don't work, so sadly I had to open the PPT file in Open Office instead.

wikipedia link by whitelines · 2006-08-10 07:24 · Score: 5, Informative

I didn't know either:
http://en.wikipedia.org/wiki/Stack_machines

--
/* TBD */

Re:wikipedia link by SatanicPuppy · 2006-08-10 08:41 · Score: 3, Informative

This source about stack computing is better.

Sadly I actually still work on a stack computer, and I had to go look it up.

--
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.

Forth? by dslmodem · 2006-08-10 07:28 · Score: 2, Informative

I remember that FORTH is a language support STACK COMPUTING. Hopefully, it is not totally wrong. Unfortunately, it is really hard to understand FORTH program.

http://en.wikipedia.org/wiki/Forth_programming_lan guage

--

^(oo)^pig~

Re:Forth? by CaptnMArk · 2006-08-10 07:55 · Score: 2, Informative

wouldn't that be more like:

face; mouth; teeth; brush; wash; wash;

Re:.NET Compatibility by jfengel · 2006-08-10 07:31 · Score: 3, Informative

The Java Virtual Machine is also sort of stack-based. The JVM bytecode set uses stack operations but the safety checks that it runs make it equivalent to a sliding set of registers not unlike, say, the SPARC architecture. A JIT implementation could do away with the stack, at least in the individual method calls, though the call-return stack would still have to be there.

Text of PPT by Anonymous Coward · 2006-08-10 07:40 · Score: 4, Informative

Introduction
Discovered field by chance in 2000 (blame the Internet)
Hobby project (simulations and assembly) until 2004
Transformed into Independent Study thesis project
Overview of current state of research
Focus on programmer's view

Stack Computers: Origins
First conceived in 1957 by Charles Hamblin at the University of New South Wales, Sydney.
Derived from Jan Lukasiewicz's Polish Notation.
Implemented as the GEORGE (General Order Generator) autocode system for the DEUCE computer.
First hardware implementation of LIFO stack in 1963: English Electric Company's KDF9 computer.
Stack Computers: Origins (Part 2)
Independently discovered in 1958 by Robert S. Barton (US).
Implemented in the Burroughs B5000 (also in 1963).
Better known
Spawned a whole family of stack computers
The First Generation
The First Generation: Features
Multiple independent stacks in main memory
Stacks are randomly accessible data structures
Contained procedure activation records
Evaluated expressions in Reverse Polish Notation
Complex instructions sets trying to directly implement high-level languages (e.g.: PL/1, FORTRAN, ALGOL)
Few hardware buffers (four or less typically)
Supplanted in the 1980's by RISC and better compilers
Stack Computers: A New Hope
Enter Charles H. ("Chuck") Moore:
Creator of the stack-based FORTH language, circa 1970
Left Forth, Inc. in 1981 to pursue hardware implementations
NOVIX (1986), Sh-BOOM (1991), MuP21 (1994), F21 (1998), X18 (2001)
Currently CTO of Intelasys, still working on hardware
product launch expected April 3, 2006 at Microprocessor Summit
Enter Prof. Philip Koopman, Carnegie-Mellon University
Documented salient stack designs in "Stack Computers: The New Wave", 1989
The Second Generation
The Second Generation: Features
Two or more stacks separate from main memory
Stacks are not addressable data structures
Expression evaluation and return addresses kept separate
Simple instruction sets tailored for stack operations
Still around, but low-profile (RTX-2010 in NASA probes)
Strangely, missed by virtually all mainstream literature
Exception: Feldman & Retter's "Computer Architecture", 1993
Arguments and Defense
Taken from Hennessy & Patterson's "Computer Architecture: A Quantitative Approach", 2nd edition
Summary: Valid for First Generation, but not Second
Argument: Variables
More importantly, registers can be used to hold variables. When variables are allocated to registers, the memory traffic reduces, the program speeds up (since registers are faster than memory), and the code density improves (since a register can be named with fewer bits than a memory location).
[H&P, 2nd ed, pg 71]
Manipulating the stack creates no memory traffic
Stacks can be faster than registers since no addressing is required
Lack of register addressing improves code density even more (no operands)
Globals and constants are kept in main memory, or cached on stack for short sequences of related computations
Ultimately no different than a register machine
Argument: Expression Evaluation
Second, registers are easier for a compiler to use and can be used more effectively than other forms of internal storage. For example, on a register machine the expression (A*B)-(C*D)-(E*F) may be evaluated by doing the multiplications in any order, which may be more efficient due to the location of the operands or because of pipelining concerns (see Chapter 3). But on a stack machine the expression must be evaluated left to right, unless special operations or swaps of stack position are done.
[H&P, 2nd ed, pg. 71]
Less pipelining is required to keep a stack machine busy
Location of operands is always the stack: no WAR, WAW dependencies
However: always a RAW dependency between instructions
Infix can be easily compiled to postfix
Dijkstra's "shunting yard" algorithm
Stack swap operations equivalent to register-register move operations
S

JVM by TopSpin · 2006-08-10 07:40 · Score: 4, Informative

Java bytecode is interpreted on a virtual stack based processor. Most bytecode gets JITed into native register based instructions, but the model JVM processor is a stack processor.

Some previous poster noted that CLI is also a stack based model. I can't verify that myself but it wouldn't surprise me; Microsoft is, after all, highly 'innovative' or something.

--
Lurking at the bottom of the gravity well, getting old

Re:JVM by Anonymous Coward · 2006-08-10 07:55 · Score: 3, Informative

Its not like Java was super-innovative to use the stack-based architecture. Java was designed with web-applications in mind, and as such having small code size was extremely important for bandwidth reasons. One of the best features of stack machines is the small instruction size (no need to store register locations). So a stack machine is a natural choice for the JVM. If you wanna nag on .NET copying Java, there are plenty of good reasons, but this isn't one.

Stop Hurting My Eyes by Anonymous Coward · 2006-08-10 07:49 · Score: 5, Informative

Dear Slashdot Contributors,

Please stop describing undergrads doing independent studies as "Experts". Theres a reason that mainstream processors haven't picked up on "Stack Processors", and it has nothing to do with binary compatibility, the difficulty of writing a compiler for their instruction set, or general programming complexity. Stack Machines are really only good for In-Order processing. Wonder why NASA probes have Stack Processors? Because they don't freaking need to do out of order processing in order to get the performance they require, and they probably found stack processors to have a favorable power / performance ratio for their application. You will never see a full blown Windows running on a Stack processor, because Superscalar processors destroy their performance.

"My research project shows that some people wrote nifty papers in the 1970s, but everyone ignored them for an obvious reason I don't understand." -> Not an Expert

Postscript is based on stacks by Anonymous Coward · 2006-08-10 07:53 · Score: 1, Informative

Once upon a time when laser printers cost $10K, before CorelDraw, I learned to program Postscript. Yes it's a real programming language but it prefers to do things with stacks. You could write real programs that did real calculations etc. What a pain. I'd rather program in machine code. I'd rather program in microcode. aargh. You get the point.

I still use postscript to create graphics but if any computation is involved, I use another language.

Having said the above I realize that most languages insulate you from the architecture. If you're programming in Python or Java, you probably wouldn't notice the difference. My experience with Postscript convinces me that such an architecture isn't as efficient as some people think it is though. There's just too much popping and pushing just to get at a value. Since a given value changes its position on the stack, you also need to keep track of where it is. Bleah. This isn't to say that stacks don't have a place. A variant of a stack called a circular buffer really speeds things up on a dsp. For general purpose use though ...

Re:For the same reason language choice always matt by HiThere · 2006-08-10 08:04 · Score: 4, Informative

Sorry, but LISP (though I don't mean Common LISP) is just as much a stack language as FORTH. I think the first LISP that wasn't was LISP 1.5...but I'm rather vague on LISP history. Still, s-expressions are as stack oriented as FORTH is. The interesting thing is the first Algol 60 compiler (well...really an interpreter) I ever used was a stack machine. (That was why it was an interpreter. The real computer was an IBM 7090/7094 BCS system so it ran a stack machine program that Algol was compiled to run on. Whee!) So if you want a good stack computer language you could pick Algol 60. But FORTH is easier, and could even be the assembler language.

OTOH, most FORTHs I've seen use 3 or more stacks. I.e., most of them have a separate stack for floats. What would be *really* nice is if someone built a machine that used Neon as it's assembler. Neon is/was an Object-oriented dialect of FORTH for the Mac that allowed the user to specify early or late binding for variables. It was developed by Kyria Systems, a now-defunct software house. Unfortunately Neon died during a transition to MSWind95. I understand that it is survived by MOPS, but I've never had a machine that MOPS would run on, so I don't know how similar it was.

I think that FORTH would make a truly great assembler...and the more so if that dialect of FORTH were NEON. But I forget how many stacks it used. At least three, but I have a vague memory that it was actually four. The main stack, the return stack, the floating stack, and ??the Object stack??...I don't know.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Re:Cup of Joe by AKAImBatman · 2006-08-10 08:32 · Score: 2, Informative

Some modern embedded processors have been specifically designed to execute Java naively. e.g. ARM Jazelle and the new Atmel AVR32.

Yes, I'm aware of these processors. However, they're not actually stack-based. They convert the Java instructions into ARM RISC instructions which are register-based. So while such chips are very useful in accelerating Java on standard RISC architectures (also VLIW architectures such as MAJC), they are not actually stack machines.

The only modern example of a stack-based processor for accelerating Java that I am aware of, is the Java Optimized Processor (JOP).

--
Javascript + Nintendo DSi = DSiCade

Re:Oh? by The_Wilschon · 2006-08-10 08:53 · Score: 4, Informative

(and Turing complete)

Bzzzzt! No actual machine can ever be Turing complete, because theoretical Turing machines are capable of calculations which require an unbounded amount of space. That is, there exist algorithms which a Turing machine can execute which require more memory than any computer that you make.

Computer languages can be Turing complete, but physical computers cannot be.

--
SIGSEGV caught, terminating

wait... not that kind of sig.

Re:For the same reason language choice always matt by Anonymous Coward · 2006-08-10 09:34 · Score: 1, Informative

Hmm...I'm playing around with implementing a simple Lisp...to get lexical closures I'm putting all frames in the heap instead of the stack, which I think is a fairly standard technique (tho not necessarily what a really optimized compiled Lisp does). Doesn't really seem stack-based to me.

A useful HTML article (better than 9999TB AVI) by roman_mir · 2006-08-10 09:51 · Score: 4, Informative

And here it is.

--
You can't handle the truth.

Re:For the same reason language choice always matt by BlueGecko · 2006-08-10 10:27 · Score: 2, Informative

If you miss Neon, you'll be happy to know that you can get about 90% of its implementation and 100% of its concepts in the form of PowerMops, which is open-source and runs great and natively on Leopard. I haven't used it for anything recently, but it's worked fine for hobbyist stuff I've done in the past. I strongly encourage you to check it out.

Re:Oh? by novus+ordo · 2006-08-10 11:04 · Score: 2, Informative

What a brainfuck.

--
"You're everywhere. You're omnivorous."

Re:Size and functionality by Chris+Burke · 2006-08-10 11:11 · Score: 4, Informative

Between 20 - 40% of your compiled code is spent moving data in and out of the accumulator register, since most instructions depend on
specific data being in that register - to the point that they introduced zero-cycle add/mov functionality in the P4 line - basically, if your code performs an add and then movs ax immediately
out to memory (like the above code - and possibly the most common arithmetic operation in compiled code), if the pipeline and data caches are all available, the P4 will
execute both instructions with enough time to put something else in the instruction pipeline that
cycle. It's not really a zero-cycle function - you can do something like 2.5 (add,mov,add,mov,add) a cycle if you stack them back to back to back, for instance...

The only zero-cycle mov I'm familiar with on the P4 is a register-to-register mov, and that just takes advantage of the fact that the P4 has a physical register file and a map between the architectural registers and the physical ones. E.g. given
add bx, [cx]
mov ax, bx

the mapper might assign bx to physical register 10. It will then realize that ax is just a copy of bx, so it will make ax point at register 10 as well, and the mov never has to execute at all, thus 'zero cycle'.

You seem to be saying that the P4 can write the result of an add to the cache in zero cycles, or more than two values in a cycle, which doesn't mesh with what i know of the P4 which is that it has a two-ported cache. But I'm only intimately familiar with early revs of P4; if you know what rev this was added in I would be interested.

--

The enemies of Democracy are

Re:Stack - bad for speed, good for low power by treyb · 2006-08-10 11:21 · Score: 2, Informative

Chuck Moore (the Forth guy) came to a slightly different conclusion: good for speed and good for low power. He uses the chip real estate you want to use pipelining instructions to add another core. In the case of the SeaForth processors, he added 23 other cores. Granted, that chip doesn't pretend to do anything but target embedded devices, but he demonstrates that stack machines can run quickly and use little power.

Slashdot Mirror

Next Generation Stack Computing

20 of 347 comments (clear)