Next Generation Stack Computing
mymanfryday writes "It seems that stack computers might be the next big thing. Expert Eric
Laforest talks
about stack computers and why they are better than register-based
computers. Apparently NASA uses stack computers in some of their probes. He
also claims that a kernel would only be a few kilobytes large! I wonder if
Windows will be supported on a stack computer in the future?"
In Redmond, 640 bytes isn't enough for anybody.
Sounds like fun. I remember writing in RISC assembly code. I fealt like a real man back then. C# just doesn't cut it.
fyi - The Open Office format Slide links don't work, so sadly I had to open the PPT file in Open Office instead.
It say blah blah blah here's some videos to torrent.
I thought the 387 and Burroughs B5000 were odd, antiquated architectures, but apparently they're the wave of the future.
Ewige Blumenkraft.
I didn't know either:
http://en.wikipedia.org/wiki/Stack_machines
/* TBD */
Well, time to test out the CSC's new torrenting capabilities.
He also claims that a kernel would only be a few kilobytes large!
I've seen sub-1k kernels for FORTH systems before. The question is, how much functionality do you want to wrap into that kernel? More capable kernels would, of course, be correspondingly larger.
That said, stack computing and languages like FORTH have long been underrated. Depending on the application, the combination of stack computers and postfix languages can be quite powerful.
Proud member of the Weirdo-American community.
Someone's having a larf. Oh you do crack me up Messrs mymanfryday and CmdrTaco.
Please try the bittorrent. No, wait... Teach em a lesson, make em burn.
Deleted
Interestingly enough the Microsoft Intermediate Language (MSIL) that .NET apps are compiled to before being JITed into machine code is actually built around a stack based system as well... No doubt porting the .NET Framework over to such a system would be quite easy... and give much in the way of performance boosts (especially on startup).
Of course... that would still depend on a version of Windows for it to run on.
Help Brendan pay off his student loans
Mathematicians like stack computers because its easier to formally prove the behaviour of algorithms using stacks.
Hardware engineers like stack computers because the hardware is interesting and easy to design
Investors hate them because they keep loosing money on them.
Evil people are out to get you.
Apparently NASA uses stack computers in some of their probes.
In space no one can hear you blue screen of death. Unless you work for Lucas Films.
I once had a job where I had to sort through stacks of computers. Overall the stacks were pretty useless, a bunch of burnt out 286s. Even if you put all your redundant computing power into a stack doesn't neccesarily make it better!
Does this mean my old HP48GX will be considered cutting edge? I should get ready to sell it on EBay when the craze hits! All my old classmates will be forced to allow me to have the last laugh after I was on the recieving end of much ridicule for using the HP when the TI was the only thing "officially" endorsed by all the calculus textbooks. I don't know if I could ever part with it though. I still use it almost daily, the thing continues to kick ass.
I remember that FORTH is a language support STACK COMPUTING. Hopefully, it is not totally wrong. Unfortunately, it is really hard to understand FORTH program.
http://en.wikipedia.org/wiki/Forth_programming_lan guage
^(oo)^pig~
Damnit, try and make a half-ass witty remark... and then add first pr0st at the end. Something you speculate about the artcile, which will get you modded high at first, and by the time anyone reads the article and realises you were wrong, it's too late.
This way you get first post, and karma whore at the same time. You will soon learn, my pupil.
Find Nearby Indie Events
Universities may have tons of bandwidth, but the servers just can't take it. Looks like this site's dead.
Anyone seeding the torrents yet?
Nothing sucks like a Vax, nothing blows like a PowerMac G4
No, no, no, NO! This is SLASHDOT! The proper response is "Does it run Linux "?
Evil is as eval("does");
Patrick Stewart would be displeased by this misleading headline.
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
Since the dawn of time, the x86 FPU has been organized as a stack, which has been recognized as a mistake by modern computer architects. For one thing, it is hard to get a stack architecture to take advantage of multiple functional units. Only recently, with the development of SSE, 64 bit modes and other additions have we been able to move away from the stack on the x86.
[...]
In case anyone is wondering, I'm only half joking. Java is a stack-based platform, perfectly suited to processors that don't actually exist in real-life. Sun created the picoJava in the 90's, and claimed that it was faster than the Pentium of the day. They may have been correct at the time, but the chip was never widely used, so it was difficult to say for sure. With CPU speed becoming less important than stability, I/O, and correctness of code, it's possible that such machines may start showing up in more mainstream applications.
Javascript + Nintendo DSi = DSiCade
It's all fun and games until someone hits a stack underflow.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
I can hear all this experts, and I wonder why they are THE experts, one of the important movements in programming is non-stack based programming methodologies. (http://www.theserverside.com/tss) had another expert opinion on why non-stack programming was better.
do you have a mirror for that? I'm too lazy to look.
Introduction
Discovered field by chance in 2000 (blame the Internet)
Hobby project (simulations and assembly) until 2004
Transformed into Independent Study thesis project
Overview of current state of research
Focus on programmer's view
Stack Computers: Origins
First conceived in 1957 by Charles Hamblin at the University of New South Wales, Sydney.
Derived from Jan Lukasiewicz's Polish Notation.
Implemented as the GEORGE (General Order Generator) autocode system for the DEUCE computer.
First hardware implementation of LIFO stack in 1963: English Electric Company's KDF9 computer.
Stack Computers: Origins (Part 2)
Independently discovered in 1958 by Robert S. Barton (US).
Implemented in the Burroughs B5000 (also in 1963).
Better known
Spawned a whole family of stack computers
The First Generation
The First Generation: Features
Multiple independent stacks in main memory
Stacks are randomly accessible data structures
Contained procedure activation records
Evaluated expressions in Reverse Polish Notation
Complex instructions sets trying to directly implement high-level languages (e.g.: PL/1, FORTRAN, ALGOL)
Few hardware buffers (four or less typically)
Supplanted in the 1980's by RISC and better compilers
Stack Computers: A New Hope
Enter Charles H. ("Chuck") Moore:
Creator of the stack-based FORTH language, circa 1970
Left Forth, Inc. in 1981 to pursue hardware implementations
NOVIX (1986), Sh-BOOM (1991), MuP21 (1994), F21 (1998), X18 (2001)
Currently CTO of Intelasys, still working on hardware
product launch expected April 3, 2006 at Microprocessor Summit
Enter Prof. Philip Koopman, Carnegie-Mellon University
Documented salient stack designs in "Stack Computers: The New Wave", 1989
The Second Generation
The Second Generation: Features
Two or more stacks separate from main memory
Stacks are not addressable data structures
Expression evaluation and return addresses kept separate
Simple instruction sets tailored for stack operations
Still around, but low-profile (RTX-2010 in NASA probes)
Strangely, missed by virtually all mainstream literature
Exception: Feldman & Retter's "Computer Architecture", 1993
Arguments and Defense
Taken from Hennessy & Patterson's "Computer Architecture: A Quantitative Approach", 2nd edition
Summary: Valid for First Generation, but not Second
Argument: Variables
More importantly, registers can be used to hold variables. When variables are allocated to registers, the memory traffic reduces, the program speeds up (since registers are faster than memory), and the code density improves (since a register can be named with fewer bits than a memory location).
[H&P, 2nd ed, pg 71]
Manipulating the stack creates no memory traffic
Stacks can be faster than registers since no addressing is required
Lack of register addressing improves code density even more (no operands)
Globals and constants are kept in main memory, or cached on stack for short sequences of related computations
Ultimately no different than a register machine
Argument: Expression Evaluation
Second, registers are easier for a compiler to use and can be used more effectively than other forms of internal storage. For example, on a register machine the expression (A*B)-(C*D)-(E*F) may be evaluated by doing the multiplications in any order, which may be more efficient due to the location of the operands or because of pipelining concerns (see Chapter 3). But on a stack machine the expression must be evaluated left to right, unless special operations or swaps of stack position are done.
[H&P, 2nd ed, pg. 71]
Less pipelining is required to keep a stack machine busy
Location of operands is always the stack: no WAR, WAW dependencies
However: always a RAW dependency between instructions
Infix can be easily compiled to postfix
Dijkstra's "shunting yard" algorithm
Stack swap operations equivalent to register-register move operations
S
Java bytecode is interpreted on a virtual stack based processor. Most bytecode gets JITed into native register based instructions, but the model JVM processor is a stack processor.
Some previous poster noted that CLI is also a stack based model. I can't verify that myself but it wouldn't surprise me; Microsoft is, after all, highly 'innovative' or something.
Lurking at the bottom of the gravity well, getting old
that almost every /. user encounters every day: Postscript and PDF.
Clear, Dark Skies
Even in assembler, the mainstream hasn't been programming to the metal since Pentium I.
Beginning with Pentium II, and propagating to pretty much all of the other archictures in a short time, non of the mainstream CPUs have exposed their metal. We have an instruction set, but it's torn into primitives and scheduled for execution. We don't see the primitives, not even in assembler. AFAIK, there isn't even a way to use the true primitives, except perhaps on the Transmeta, where it was undocumented.
So in this light, since we're already fairly far from the true metal, it seems to me that it makes a lot of sense to re-evaluate the instruction set itself. Of course one could raise the Itanium argument, but I would also argue that politics were too big a part, there. Then again, one could also argue that x86 and amd64 are just so entrenched that it doesn't matter, and they do run well on today's hardware.
Then again I could cite my old favorite, the 6809. It started from the same origins and precepts as RISC, but a different attitude. RISC simply tried to optimize the most common operations, at the expense of less common ones. With the 6809, they tried to understand WHY certain things were happening, and how those things could be done better and faster. They ended up with a few more transistors, the same speed, and something approaching 3X the throughput, as compared to the 6800. More similar to the current topic, there was a paper on 'contour mapping', mapping blocks of cache into stacks and data structures. The 6809 was too old for a cache, but it seems to me that combining it's concepts with the contour mapping would be interesting indeed.
But like stack engines, it's not x86/amd64 compatible.
The living have better things to do than to continue hating the dead.
You “wonder if Windows will run on a stack computer?” Where do you people come up with this nonsense? This is as irrelevant as saying: "someday, car tires will not be made of rubber. I wonder if Windows will support them?" Really, there is no need to try to come up with insightful remarks or questions to tack on the end of your story submissions. Just present the article and leave it at that. Let everyone else do the thinking.
Join Tor today!
because you should choose a language that fits the problem, not the reverse.
If the problem is "make this work on a stack based machine" then look out! You're gonna have aging LISP programmers crawling out of the woodwork to show off their obsolete, er, elite, programming skills.
Clear, Dark Skies
Don't forget the venerable HP3000 "Classic" machines like the Series 68 and 70 machines.
Dear Slashdot Contributors,
Please stop describing undergrads doing independent studies as "Experts". Theres a reason that mainstream processors haven't picked up on "Stack Processors", and it has nothing to do with binary compatibility, the difficulty of writing a compiler for their instruction set, or general programming complexity. Stack Machines are really only good for In-Order processing. Wonder why NASA probes have Stack Processors? Because they don't freaking need to do out of order processing in order to get the performance they require, and they probably found stack processors to have a favorable power / performance ratio for their application. You will never see a full blown Windows running on a Stack processor, because Superscalar processors destroy their performance.
"My research project shows that some people wrote nifty papers in the 1970s, but everyone ignored them for an obvious reason I don't understand." -> Not an Expert
1. stack-based computers -> register-based computers -> stack-based computers
2. virtual machine monitor -> operating system (e.g., MS-DOS, Unix, and Windows) directly on top of the hardware -> virtual machine monitor
3. dumb terminal -> personal computer -> thin client
4. Al Capone's favorite car -> Chrysler LeBaron -> PT Cruiser
5. 1967 Camaro with aggressive, muscular form -> 1982 Camaro with slick, crack-cocaine form -> 2009 Camaro with aggressive, muscular form
6. 1960's "Mission Impossible" -> lots of boring TV-series/theater-movies -> 1990's "Mission Impossible"
7. 1960's "Bewitched" -> lots of boring TV-series/theater-movies -> 2000's "Bewitched"
8. 1970's "Brady Bunch" -> lots of boring TV-series/theater-movies -> 1990's "Brady Bunch"
In art, what goes around comes around. Note that I said, "computer architecture", not "computer science". Computer science is real science. Computer architecture is not. It is art. Just see the 8 items in the above list.
Do these come in short- and tall-stack versions?
Are maple syrup and butter options?
If "disco" means "I learn" in Latin, does "discothèque" mean "I learn technology"?
Once upon a time when laser printers cost $10K, before CorelDraw, I learned to program Postscript. Yes it's a real programming language but it prefers to do things with stacks. You could write real programs that did real calculations etc. What a pain. I'd rather program in machine code. I'd rather program in microcode. aargh. You get the point.
...
I still use postscript to create graphics but if any computation is involved, I use another language.
Having said the above I realize that most languages insulate you from the architecture. If you're programming in Python or Java, you probably wouldn't notice the difference. My experience with Postscript convinces me that such an architecture isn't as efficient as some people think it is though. There's just too much popping and pushing just to get at a value. Since a given value changes its position on the stack, you also need to keep track of where it is. Bleah. This isn't to say that stacks don't have a place. A variant of a stack called a circular buffer really speeds things up on a dsp. For general purpose use though
Who can forget the English Electric Leo-Marconi KDF9, the British stack machine from 1960. That, and the Burroughs 5000, were where it all began.
Stack machines are simple and straightforward to build, but are hard to accelerate or optimize. Classically, there's a bottleneck at the top of the stack; everything has to go through there. With register machines, low-level concurrency is easier. There's been very little work on superscalar stack machines. This student paper from Berkeley is one of the few efforts.
It's nice that you can build a Forth machine with about 4000 gates, but who cares today? It would have made more sense in the vacuum tube era.
If I saw one, I would exclaim, "Wow, that computer is stacked!"
Click here or here.
How important is this parallism? Consider that modern processors have 10-30 pipeline stages, 3-6 execution units that can have an instruction executing at each stage; moreover, most of them have out-of-order execution units that handle instructions more in the order that data is available for them rather than the order they are listed in the object file (and main memory is hundreds of times slower than the processors themselves, so this is important!). Typically, such processors can have more than 100 instructions in some stage of execution (more than 250 for IBM POWER5 :-)
Consider, also, that the only pieces of anything-like-current stack hardware are Intel x87-style floating point units, that Intel is throwing away -- for good reason! -- in favor of (SSE) vector style units. In the current Intel processors, the vector unit emulates an x87 if it needs to -- but giving only a quarter of the performance.
Someone made remarks about Java and .Net interpreters: in both cases, the interpreter is simulating a purely scalar machine with no fine grained parallelism; no wonder an extensible software-stack implementation is one of the simplest to implement. Stacks are not the way that true Java compilers like gjc generate code, though!
No, stack-based hardware is not a good idea. And haven't been since some time in the eighties, when processors started to be pipelined, and processor speed started outstripping memory speed.
"My opinions are my own, and I've got *lots* of them!"
Apparently NASA uses stack computers in some of their probes.
Is that supposed to be a ringing endorsement? I thought NASA was using components the rest of the world treated as obsolete due their proven durability and reliability in the radiation of space.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
A big problem with trying to sell stack-based CPUs, or for that matter any hardware different from what "everyone else" uses, is that since they're highly incompatible everything would have to be rewritten for them, and besides, the stuff that's already out there is "good enough". This is one of the reasons for the big Apple/x86 battle that still rages today, and also comes into effect with Linux and other *ix OSes.
The only way, IMO, that a stack-based machine would ever get a toe in the door would be if one were produced that was priced similarly to the current x86 platforms but was several times as fast, or was within 3X the price and was two orders of magnitude faster.
Nothing to see here. Sorry.
But why do you need out-of-order execution? Well, misses to memory are very expensive these days - it can easily take from 200 to 400 cycles to service a load that misses all the way to main memory. This can have a significant effect on performance. What out-of-order execution does is to allow independent instructions that are younger than the load to execute in parallel with it. Quite often these parallely-executed instruction will generate other misses to main memory, overlapping their latencies. So - latency of loads that miss is still very high, but at the very least the processor is not idle while servicing them (for a good read see "MLP Yes! ILP no!" by Andy Glew)
Itanium and Sparc compensate for the fact that they don't do stuff out-of-order by putting sh*tloads of L2/3 cache on-chip. The cost of a miss is still very high, but it happens much less often. The manufacturing cost of a chip is also much higher.
Note that what NASA is sending into space is "old" tech. The reason - well, cosmic rays are much stronger in outer space, and the smaller the gate, the easier it is for them to flip its state.
P.S. I'm a computer architect.
The Raven
Sorry, but LISP (though I don't mean Common LISP) is just as much a stack language as FORTH. I think the first LISP that wasn't was LISP 1.5...but I'm rather vague on LISP history. Still, s-expressions are as stack oriented as FORTH is. The interesting thing is the first Algol 60 compiler (well...really an interpreter) I ever used was a stack machine. (That was why it was an interpreter. The real computer was an IBM 7090/7094 BCS system so it ran a stack machine program that Algol was compiled to run on. Whee!) So if you want a good stack computer language you could pick Algol 60. But FORTH is easier, and could even be the assembler language.
OTOH, most FORTHs I've seen use 3 or more stacks. I.e., most of them have a separate stack for floats. What would be *really* nice is if someone built a machine that used Neon as it's assembler. Neon is/was an Object-oriented dialect of FORTH for the Mac that allowed the user to specify early or late binding for variables. It was developed by Kyria Systems, a now-defunct software house. Unfortunately Neon died during a transition to MSWind95. I understand that it is survived by MOPS, but I've never had a machine that MOPS would run on, so I don't know how similar it was.
I think that FORTH would make a truly great assembler...and the more so if that dialect of FORTH were NEON. But I forget how many stacks it used. At least three, but I have a vague memory that it was actually four. The main stack, the return stack, the floating stack, and ??the Object stack??...I don't know.
I think we've pushed this "anyone can grow up to be president" thing too far.
Chu__ Moo__ is my hero!
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Okay, that's a priceless quote!
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
A 1K kernel will hardly make a lot of difference to anyone. I have a whole gigabyte of RAM in my machine. 1K? 100K? 1 Megabyte? 10 Megs? That's a pretty chunky kernel and we're still taking up a trivial chunk of our memory. Even the huge executables that we have these days aren't causing serious memory issues on modern PCs. Memory has outpaced executable size.
What we do want is lower power, and smaller. We can always take advantage of small or low power devices. If you want to reduce executable size, write a simple stack machine and an interpreter.
Normally this kind of stuff doesn't bug me, but this is like an article in 2006 proclaiming the benefits of object-oriented programming. Doesn't anyone know their computing history?
There were stack computers in the 1960s and 1970s. There was a resurgence of interest in the 1980s--primarily because of Forth's popularity in embedded systems--resulting in a slew of stack-based "Forth engines." Forth creator Chuck Moore has been working on a series of custom Forth CPUs for 20+ years now. His latest version has 24 cores on one chip (and was entirely designed by one person and uses MILLIWATTS of power).
Stack processors and languages have one big advantage: they minimize the overall complexity of a system. The tradeoff is that they often push some of that complexity onto the programmer. That's why Forth tends to shine for small, controlled systems (like a fuel injector or telescope controller or fire alarm), but you don't see people writing 3D video games or web browsers in Forth.
I'm surprised no one's mentioned the transputer.
I always equivocate. Well, almost always.
That would rock! Finally, the supposedly superior yet classically discarded technologies are coming back!
I cannot consider your post valid, as you've claimed that 2000's "Bewitched" was 'art'...
http://en.wikipedia.org/wiki/Saturn_(microprocesso r)
James Tiberius Kirk: "Spock, the women on your planet are logical. No other planet in the galaxy can make that claim."
If stack-based hardware does become more the norm, JIT JVMs will become a lot simpler since the JVM is stack-based.
the proper term is "griddle computing" and it encompasses both pancakes -and- waffles, along with syntactic salt like bacon and eggs.
there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.
As others have undoubltedly mentioned, there's not much concurrency you can do with a stack. The top of stack operation has to finish before the next op can pop anything off.
Forget stacks.
In case y'all forgot, the Intel x87 FP engine was stack based until SSE. And it is still in there!
You pushed FP numbers onto F(0:7) and the operations worked on the stack. They than had to be popped off to the accumulator to load to memory.
Kids these days, I tells ya.
https://www.accountkiller.com/removal-requested
You, my friend, are the geekiest geek I've ever seen!
Imagine a Beowulf cluster of these.
(Somebody had to do it.)
0*0
00*
***
IMHO, for all purposes. Everything gets passed on the stack anyway. The last register OS I can remember was AmigaOS.
"I wonder if Windows will be supported on a stack computer in the future?"
Why?
Being small and simple is precisely the point. Tiny stack machines give the opportunity for massive parallelism. Imagine hundreds or thousands of processors in a very small space, handling CPU intensive but relatively simple algorithms. How many such tiny processors could be printed onto a single chip?
Most computer users only consider the high-end chips that they use every day in their laptops and desktops. They are unaware of the amazing chip families at the other end of the spectrum; the new controllers and DSP's. IMHO, simple systems like FORTH will always have a role in ubiquitous computing.
Does that mean you could get a stack-based coprocessor card and make their code run faster?
Stack computers, are basically like rack computers, except you can't pull out the one at the bottom.
The revolution will not be televised... but it will have a page on Wikipedia
I'm a chip designer, and I am working on my Ph.D. in CS. The idea of stack machines is something I have researched a bit, and I have drawn some of my own conclusions.
The main advantage of stack machines is that all or most parameters for each instruction are implicit. Aside from stack shuffle/rotate instructions, the operands are always the top few on the stack. This makes instructions very small. The logic is also exceedingly simple (for fixed-stack designs). If you want a simple, low-power CPU, a stack machine is what you want.
Where I explored this issue, however, is in the realm of high-performance computing. The key advantage of a stack architecture is that smaller instructions take less time to fetch from memory. If your RISC instructions are 32 bits, but your stack machine instructions are 8 bits, then your instruction caches are effectively 4x larger, and your over-all cache miss penalty is greatly reduced.
The problem with stack machines is that they're damn near impossible to add instruction-level parallelism to. With a RISC machine, near-by instructions that deal with different registers (i.e. no dependencies) can be executed in parallel (whether that's multi-issue or just pipelining). With a stack machine, everything wants to read/write the top of the stack.
I came up with two things to deal with this problem, that are very much like the CISC-to-RISC translation done by modern x86 processors, so it's more of a stack ISA on a RISC architecture. One is that the stack is virtual. When you want to pop from the stack, what's happening in the front-end of the CPU is that you're just popping register numbers corresponding to a flat register file. When you want to push, you're allocating an assigned register number from the flat register file. Now, if you can get two instructions going that read different parts of the stack and write (naturally) to different locations, you can parallelize them. The second part is a healthy set of register shuffling instructions. Since you're doing all of this allocation up front, shuffling registers is as simple as renumbering things in your virtual stack. So a swap operation swaps two register numbers (rather than their contents), and a rotate operation renumbers a bunch of them, but the pending instructions being executed still dump their results in the same physical registers.
This all sounds great, but there are some problems with this:
(1) The shuffling instructions are separate instructions. With a RISC processor, you have more information all in one unit. Although you could try to fetch and execute multiple stack instructions at once, it's much more complicated to execute four stack instructions in parallel than to execute a single RISC instruction, even though they require the same amount of memory.
(2) You need a lot of shuffling instructions. Say your stack contains values A, B, C, and D, and you want to sum them. Without shuffling, you'd add A and B, yielding E, then add E and C, yielding F, then add F and D. Three add instructions. If your adder(s) is/are pipelined, you'd like to add A+B and C+D in parallel or overlapping, THEN wait around for their results and do the third add. The problem is that to do that, you'd need to add A+B, then rotate C to the top then D to the top, then add, then add again. The first case was 3 instructions; the second case is 5 instructions. Depending on your architecture, the extra shuffle instructions may take so long to process that you might as well just have waited. No speed gain at all.
(3) The extra shuffing instructions take up space. Optimizers are hard to write. Although it's conceivable that one could optimize for this architecture so as to avoid as many shuffling instructions as possible, you still end up taking up quite a lot of space with them, potentially offsetting much of the space savings that you got from switching from RISC to stack.
So, there you have it. Somewhat OT, because surely NASA's primary goal has got to be low-power, but also somewhat on-topic because stack architectures aren't the holy grail. Just ideal for some limited applications.
KIDS YOU OH !
." Those who don't remember the past get to have all the fun reimplementing it!"
while it was a stack computer, i always thought the most distinctive feature of the transputer was its parallel design, which could be exploited when programming it in occam
Knuth's MMIX architecture uses registers, but the registers themselves are on a register stack. Perhaps this architecture provides the best of both worlds.
What a fool believes, he sees, no wise man has the power to reason away.
Space chips do have a few requirements that those on Earth don't. Ones that are exposed to the space environment need to be resistant to alpha particles (can cause bit flips). Power consumption must be as low as possible. And very important, especially in manned flight, is heat dissipation. Most common off-the-shelf laptops generate way too much heat to be used in a controlled life support system where heat must be carefully managed into and out of the system.
I'm still waiting for a 64 bit processor that treats all registers the same. i.e. one load and one store instruction, but you can do fpadd or regular add on the same registers. This IMHO will reduce the number of opcodes needed, and you usually don't use a lot of FP registers and a lot of integer registers at the same time. Pure stacks suck BTW - you really need a swap (up to some depth) or a copy instruction to bring data to the top. A pure stack is too destructive. I do like the idea of a pure return stack with separate data stack.
You Forth (heart) if honk then
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
architectually PowerPC is better than x86/x86_64. but the economics of the situation gives us much high performance and lower prices on x86. If industry focused as much energy on PowerPC as they do x86, then we'd have slightly better computers. I think the same applies to stack computers. They will never be "better" than x86 unless the market dramatically changed and moved away from x86. Right now there isn't much experience out there for implementing stack computers. Common tricks used for branch prediction and cache optimization would be significantly different on a stack computer. I would say that stack computers are only theoretically better than x86 in terms of possible performance.
“Common sense is not so common.” — Voltaire
Hmm...I'm playing around with implementing a simple Lisp...to get lexical closures I'm putting all frames in the heap instead of the stack, which I think is a fairly standard technique (tho not necessarily what a really optimized compiled Lisp does). Doesn't really seem stack-based to me.
because by the time Windows Vista (Forever) comes out, stack machine might have become mainstream
In Soviet Russia, Windows runs YOU!
Wouldnt that be cool.. we have come full circile..
---- Booth was a patriot ----
Anybody remember the RPN vs algebraic entry wars? (HP Rules!)
.. paranoid crackpot leftover from the days of Amiga.
Previous, Microsoft, Stacker, bought?
Hmm, except that it was a compression technology and they first stole it, then bought it after a lawsuit. Oh, well - or rather: Well, oh...
"Stack computers, are basically like rack computers, except you can't pull out the one at the bottom."
...Either way you save a lot of money on the actual rack.
Right, if you pull them from the bottom they are called queue computers. For stack computers it is preferred that you take and replace from the top.
It is bullshit, all a stack computer basically does is to shift the register based ops back into the stack pointer domain, there is not too much to win there except the processors become somewhat simpler. The code itself in fact becomes even somewhat larger due to more excessive stack operations. Many VMs nowadays use stack machines in their core processing (java, .net, lisp etc...) the reason for this is that those machines map easier to many processor architectures than a vm which uses lots of registers.
And here it is.
You can't handle the truth.
PostScript and PDF are stack-based languages used by millions of people very day. PostScript, in particular, has been around a sinc the early 1980s.
http://en.wikipedia.org/wiki/PostScript
Just think, the next big thing in stack computing: Towers^H^H^H^H^H^H Stacks of Hanoi at 2Ghz!
http://ficl.sourceforge.net/
If you liked neon you might like ficl; it has the same features you mentioned with choice of early and late binding.
Many forth object systems do.
I did a port of Ficl to MS Smartphone 2002/2003 but couldn't get permission to release it.
(By port I mean filling in some missing libc functions)
Sam
blog.sam.liddicott.com
Of course... that would still depend on a version of Windows for it to run on.
That was a planned feature for Vista, but of course, it got dropped.
a Beowolf 'stack' of these!
Imagine a Beowulf cluster of them.
---- "XML is like violence. If it doesn't fix the problem, you aren't using enough."
Kind of looks like the way my HP RPN calculator works
Shop smart, Shop S-Mart.
If you miss Neon, you'll be happy to know that you can get about 90% of its implementation and 100% of its concepts in the form of PowerMops, which is open-source and runs great and natively on Leopard. I haven't used it for anything recently, but it's worked fine for hobbyist stuff I've done in the past. I strongly encourage you to check it out.
For more background see the Slashdot interview with Chuck Moore.
Seastead this.
Look at the stack on that one!
How come noone has mentioned the language Joy?
I've looked into it a couple times, and it seems pretty neat. In a word, functional concatenation.
Plus, as we all know, functional languages are so much more fun than procedural.
-------
Incite and flee.
An excerpt from a bit longer essay I wrote:
Seastead this.
It is bullshit, all a stack computer basically does is to shift the register based ops back into the stack pointer domain, there is not too much to win there except the processors become somewhat simpler.
Actually, much, much simpler. Sometimes 3 orders of magnitude or more (in terms of # of transistors). Yes, you lose out-of-order execution, branch prediction, all of the fancy pipeline stuff. What you gain is simplicity and a good cost/performance ratio.
The code itself in fact becomes even somewhat larger due to more excessive stack operations.
Now that is BS. If your code is larger in a stack-based language than in an imperative language then you are doing something wrong. Refactoring: learn it, love it.
I totally agree that instruction level parallelism for a stack machine is a horrible monster to match up to a risc processor. But there are some nice peices for parallism that work well. The circuits for a stack processor take up a small amount of relative space, and require less power per stack "unit". So while making a given stack chip faster might be a monster problem, adding more stack units to a CPU die may allow for some serious performance gains. So there are a few items where the fine grained parallelism just isnt going to happpen. But as far as coarser grained parallelism goes there might be some possibilities.
There are some super low latency situations that I could see this as an untenable situation, and a few linear problems that just wont crumble down simply. Of course I might be missing something obvious here...
Storm
Since I am one of the few people who rad slash dot who has actually programmed a stack machine - my opinion might count.
This is a good architecture. It makes a great deal of sense. One of the things that a stack machine does is pull the memory (at least the automatic variables) for the function currently being executed together at the top of the stack. This means it is feasable to load this memory into cache or even into cpu registers which greatly increases speed. The HP3000 only had 4 registers for the stack top - today a CPU can have 100's. Whether there is 4 or 40 or 400 registers the instruction set is the same.
This has always been a very smart way to organise a machine.
I did a computer architecture course a number of years ago. One day, we came to the consensus that the X86 architecture was an example of every computer architecture in existence. You want load store: look at all those MOV AX, xxxx instructions. You want register RISC, look at all those registers AX, BX, CX, DX, SI, DI, SP, BP. You want stack based: look at the FPU. You want vector parallel processing, look at those MMX/SSE instructions. You want symmetric multi-processing, look at those dual cores.
...
The course went quickly downhill after this observation. No one could figure out how incorporating every processor architecture into one product was a good thing
A few kilobytes for a kernel? But how USEFUL is it? I got a kernel right ova' hea': PUSH $0 POP EAX JMP .-$9
done.
If a CISC with a small logical register set can find leafs in the pipe and distribute them to cores, even a true TOS limited stack processor could be designed to find the independent leaves and distribute them.
loop:
jmp
bra loop
or something like that?
Direct threading actually turned out to be slower than indirect threading. I know from the implementation I did (which is still out there and downloadable, somewhere).
by the virtual execution model. (Witness the 8086 descendent that is probably munching your data stream right now.)
I'll tire of saying so pretty soon, but here is one more try.
The current x86 chips _rename_ their registers to ease the bottleneck.
With only one TOS, renaming becomes a much simpler process. SWAP and ROT and DROP could easily become zero-cycle no-ops.
The pipline optimizer simply has to go hunting for independent leaf expressions, and since they are bounded by the patterns of growth and shrinkage of the stack, they should be easy to find. It has been done, and it works.
As I haven't seen anyone mention it, I will.
Factor is a fucking brilliant (yes, I mean that) stack-based language. It is, in many ways, a hyper-modern Forth. If you've never dealt with such a thing before, give it a look. It'll completely change how you think about programming.
is way, way, overkill for the stack.
Expression leafs exist, the patterns of stack usage reveal them, if the leafs can be found they can be distributed.
No, the pipelines you're probably used to are not appropriate for the redistribution.
I wonder if Windows will be supported on a stack computer in the future?"
Considering it took Apple over three years longer than Microsoft to support 64-bit computers, I think the better question is how many centuries it will be before OS-X will run on a stack.
are mostly derived from everyone being dazzled by the magic and missing the meaning.
The biggest issue is the source of the compactness. That much code re-use makes it difficult to separate modules, either for design purposes or for security/safety.
I'd go on, but I have to go to bed. Graveyard shift was fun, but I'm tired.
... concepts is to be expected.
In other words, stacks is a programming concept that can be applied at many different levels of computing.
This applies to other programming concepts as well. You might say its the recursive nature of programming to re-explore a concept in a differently configured computing environment and/or at a different level of computing abstraction.
Really no different than some mathmatical equasion where some element of the equasion is the use of stacks.
My reply to the main themes in the comments are here: http://funos.livejournal.com/367820.html
none
Sorry, but LISP (though I don't mean Common LISP) is just as much a stack language as FORTH.
Ummm. Yeah. I know that. That's why I mentioned it. It was a joke, you know?
I've programmed in Forth, Lisp, APL, hand-written Postscript, and just about every other computer language that was ever popular, and several that never were.
Heh. I can still remember writing Postscript by hand to convince our laser printer to make the signs I needed for my N-gauge train layout...
Clear, Dark Skies
I guess it was back in 2003 that I was introduced to a "highly proprietary, specialized scripting language" that I immediately recognized as FORTH - which the salesman denied.... Later, after playing with it I even figured out which PD implementation of FORTH they had ripped off, but I also discovered that you can no longer get "Thinking FORTH" or "Starting FORTH" and that the FORTH had apparently become a completely dead language.
To be honest, I wouldn't recommend using it in a modern environment - or as a machine architecture - but for squeezing complex applications into machines with 4k of RAM it was very sweet.
Clear, Dark Skies
What makes it so great?
I pulled it down, and (superficially) it looks like any other FORTH interpreter.
Seriously - I'm not trying to run it down, tell me what makes it worth my time to revisit a UI model I first saw on a C64?
Clear, Dark Skies
There's a very good reason for that.
Clear, Dark Skies
I think it technically qualifies, much like fingerpainting...
Range Voting: preference intensity matters
Yeah, sound familiar. PS-Algol (Persistent S-Algol) designed at St Andrews by a team including one of the Java designers, did this. There was a standard stack for evaluations etc., but I think the stack frames, structures etc., themselves lived on the heap.
... think a programming assignment where you open a db, then reflect the interface, then interact with a closure saved by the professor in order to perform the assignment. One example was implementing a communications protocol to communicate with stubs in a pre-saved closure.
PS-Algol included persistent hashes through a simple but very powerful api, and support serialisation of anything you could point to. Ron Morrison would probably kill me if I didn't mention that procedures in PS are first class objects, permitting not just closures, but save/load of closures
Google for PS-Algol to find papers and research. Hi to anyone at St Andrews that still remembers me!
-- "It's not stalking if you're married!" My Wife.
If stack-based computing is so great, powerful, and cheap, why aren't IBM PPC, AMD Athlon, Intel Core pick-a-number, and Sun Sparc dueling it out for the best stack-based chip. Why aren't the next-gen game consoles all using it, since Microsoft and Sony at least (Wii is just a faster GC) went to new architectures. Don't tell me no one has ever heard of the concept before. The Burroughs 5500 dates back to the late 1960's. I think there's more here than is being told.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
I'm not a salesman. If you're not interested enough by what I said to give it a closer look, then why would I waste my time?
Unfortunately, I no longer use a Mac as my primary computer...though I did at the time that Neon was alive. (I'm glad to hear that it's still going, however.)
I think we've pushed this "anyone can grow up to be president" thing too far.
As anyone that has ever had the misfortune to have to reverse anything done for the ST20 can confirm, stack based processors just hurt the brain. I personally take three times as long to grok a program on an ST20, which uses a transputer core, as I do when working on an ST40, which is based on an SH-4 RISC architecture. That said, it would make life a bit more difficult for virus writers as well :-)
The instruction set of the CPU is no longer even relevant. As far as I can tell, there is no logical reason to even suggest that stack based vs. register based is an issue when discussing CPU architecture any longer.
If you read the Core 2 Duo architecture details and all the well written articles across the web describing the design of them, you'll quickly come to the two conclusions that I came to early on :
- Branch prediction/Out of Order execution handlers completely render assembler irrelevant to anyone except the compiler developer
- The instruction set decoder no longer supports x86 but a wider architecture more similar (in many ways) to RISC computing than to x86. x86 appears to be intepretted into simpler RISC style instructions for faster decoding and execution when feeding the arithmetic/logical/etc... units.
The one place where this is less obvious is in the case of the SSE/SIMD/Vector processing units since they appear to be designed for two spefic groups of people
- Hand coding assember for faster operations
- Parallel Processing compilers that seem to ignore the existance of an FPU and use vectors instead when calculating massive equations. (such as Intel/former Kai compilers).
As for stack based computing. As a hobby, I've developed a great deal of code for both Open Firmware and for EFI using stack based computing. It's a great thing, but frankly, although it works well enough, when it comes to any other type of development, I don't really care what style of instruction set is used, that's the compilers problem.
My most important issue here is that it doesn't matter what advantages the CPU architecture has over another type, the pure simple fact is that there are thousands of compilers already in existance for register based computing and for the most part, compiler engineers/scientists across the world have the register based architecture more or less worked out. The drawbacks they have are no involved with performance regarding registers anymore, most optimizations made to compilers these days are focussed on memory allocation and usage.
If you want to write a paper which truly revolutionizes computing, focus instead on memory allocation technologies. Products such as SmartHeap and Doug Lea's Malloc and other technologies are only a minor fraction of the issue, though they themselves are amazing products.
The real problem in design is that there is no real method of dealing with object oriented developers that have little or no understanding of memory allocation architectures and therefore cause programs to use far too much heap and the cpu uses WAY too many cycles dealing simply with allocation and deallocation of memory. Fragmentation is an issue and no matter how advance the MMU in a system becomes, there's a certain limit to what can be done in hardware. Almost every single application I've ever seen the code of regularly expands its' own heap by calls to functions such as sbrk(), but almost none of the systems actually every give heap back to the system. In an environment such as Windows/Linux desktop/Mac it doesn't matter that much, but on systems with limited space, it's a killer.
So why waste time with ridiculos issues such as register vs. stack architectures when in reality it would make little real difference?
come on, you can use every register based cpu in the meaning of a "stack computer" ..... just keep away at registermongering and use only push/pop (or an emulation of these ...) and memory access ..... how innovative
Back when Java was new and mysterious, there was a lot of talk about java chips figuring in the near future, and they would most probably be stack based, since the runtime spec is. Never really happened, but maybe it's time for that sort of thinking, now that instruction sets are virtual anyway?
sudo ergo sum
I downloaded it and it looks like the exact same thing I was using in 1983. I'm not going to waste time trying to find out any different if you don't care enough to write a 30 second message.
Clear, Dark Skies
If a stack machine is that much simpler, couldn't you either have:
- A vast amount of cores for many unrelated threads
- Or: Multiple pipelines and explicit division of instructions into the pipelines?
The second refers to an instruction coding similar to VLIW such that you parallelise the code on multiple stacks but it still shares an instruction/data cache and allows for parallelism without heavy multi-threading at the high-level (and instead having parallelism as a compiler optimization at the low-level).I've dropped into EFI off and on, but I never worked up the courage to install that EFI pong game that came out at one point.
Clear, Dark Skies
I have been using stacks of computers for years. Isn't that about the same thing?
Computer Sysstem Architects, for whom I was working, was selling the Inmos Occam development kit and Logical Systems C, both of which allowed you to exploit the parallelism.
I was already familiar with C before I started there, and I learned Occam and Transputer Assembly Language on the job. I'd never even HEARD of a stack-based architecture before, so I was in for a real eye-opener.
What's truly sad is that, these days, some geeks would just port Linux to it and it would have a thriving market. Back then, it didn't run DOS or Windows 3.0, so it didn't sell.
... by the Dew of Mountains the thoughts acquire speed, the hands acquire shakes, the shakes become a warning
However, MOPS' developer Mike Hore has stated that he wasn't too keen on porting the whole system to Intel Macs.
Knowledge is power; knowledge shared is power lost.
Sounds interesting, thanks!
Why not use variable-length integers, consisting of significant bits followed by a special symbol that designates the end of a pointer? For example, MIDI uses this concept for delays between notes.
Good responses from the original author.
Also, the Bernd Paysan mentioned is one of the prime authors of GNU Forth, one of the most popular Forth interpreters in the Unix world.
Too bad Display Postscript didn't really seem to go anywhere. Did it evolve into anything?
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
a free and open architecture for a forth based cpu/microcontroller is here http://indi.joox.net/ Any comments?
Thanks for the heads up - I googled for "thinking forth" and found the source forge site. In my defense, it was opened a good 2 years after I went looking for the book. At the time I needed it, I could only find copies on eBay.
Heh. And to think, if that source forge site had been open 2 years earlier, I might not have quit my job.
Clear, Dark Skies
Just adding a stack is no good. You have to be coordinated about it.
I mentioned, I think, dedicated stack caches? Hysteric, with spill and fill at the 1/4 and 3/4 points? Much less complicated than 4-way set associative, and much smaller for the performance gain. And that cache is what the renaming function maps onto.
Cache for the call/return stack only needs to be maybe 16 or 32 words deep for almost any GP hardware, including all consumer stuff.
Cache for the parameter stack would need a bit more, but 1K words would be easily sufficient for moste purposes. Needless to say, renaming would only target the top 16 or 32 registers.
Small stack cache would allow implementing cache pairs for faster context switching. (Sets of four might be useful for dedicated real-time hardware.)
There are other optimization that are enabled (but not implied) by using multiple stacks. But it's my bedtime now, so I'll leave it alone.