Next Generation Stack Computing

Twelfth of Never by Tackhead · 2006-08-10 07:20 · Score: 5, Funny

> He also claims that a kernel would only be a few kilobytes large! I wonder if Windows will be supported on a stack computer in the future?

In Redmond, 640 bytes isn't enough for anybody.

Re:Twelfth of Never by jellomizer · 2006-08-10 07:35 · Score: 2, Insightful

But in reality this would be a major redesign of the OS, and all apps would need to be recompiled or emulated. Registers are a core part of assembly language. Remaking Windows would be like making Windows for the PowerPC, if not more difficult.

--
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Re:Twelfth of Never by real_b0fh · 2006-08-10 07:47 · Score: 3, Insightful

actually, if windows is 'done right' all it would take is a recompile. I don't think there is a lot of assembler code in the windows source that needs to be rewritten, most code will be in C or C++.

--
"Contrary to popular belief, UNIX is user friendly. It just happens to be selective on who it makes friendship with"
Re:Twelfth of Never by Anonymous Coward · 2006-08-10 07:49 · Score: 0

actually, if windows is 'done right' all it would take is a recompile. I don't think there is a lot of assembler code in the windows source that needs to be rewritten, most code will be in C or C++.

As a former Microsoft employee, all I can say is...

hah
Re:Twelfth of Never by OrangeTide · 2006-08-10 08:11 · Score: 1

I've run C code on stack computers before. What's harder is running on a stack-less computer. Register-based CPUs typically have some form of stack, but some microcontrollers don't have enough space for a call stack, or they have some unusual limitation (like no CALL/RET instructions).

--
“Common sense is not so common.” — Voltaire
Re:Twelfth of Never by lcam · 2006-08-10 08:51 · Score: 1

Or was it 640K?
Re:Twelfth of Never by Anonymous Coward · 2006-08-10 09:05 · Score: 0

Windows is built on top of a HAL (Hardware Abstraction Layer). Assuming the majority of platform specific code is located in the HAL then one could port the HAL and retarget the C/C++ compiler (and other tools) to the new CPU architecture and hopefully not have a lot of porting left to be done.

IN THEORY :-)
Re:Twelfth of Never by gfody · 2006-08-10 10:30 · Score: 1

Isn't Intermediate Language in .NET stack based?

--

bite my glorious golden ass.
Re:Twelfth of Never by aminorex · 2006-08-10 11:46 · Score: 1

In fact, Windows NT did ship for the PowerPC PREP platform, back in the late 90's.

--
-I like my women like I like my tea: green-
Re:Twelfth of Never by Knetzar · 2006-08-10 15:44 · Score: 1

If I recall correctly, .NET is register based, and Java is stack based.
Re:Twelfth of Never by Nutria · 2006-08-10 20:18 · Score: 1

In fact, Windows NT did ship for the PowerPC PREP platform

And the Alpha. We (the company I worked for) actually had an AlphaStation 255(?) running Windows NT 3.51(?).

--
"I don't know, therefore Aliens" Wafflebox1
Re:Twelfth of Never by Hal_Porter · 2006-08-11 01:16 · Score: 1

And MIPS.

Development started on the i860, the MIPS port was next, and the x86 one was done later.

http://www.winsupersite.com/reviews/winserver2k3_gold1.asp

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:Twelfth of Never by Hal_Porter · 2006-08-11 01:20 · Score: 1

Call can just store the return address in a register, and return can get it back from there.

If you need to nest calls, you can spill the register to a stack which you manage in software. MIPS is like this, even PUSH and POP instructions need to be synthesized out of a load or store followed by an increment or decrement.
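A minimal sketch of that scheme in Python, not tied to MIPS or any real instruction set: a call only writes a link register, and a routine that nests calls spills the link register to a stack kept in ordinary memory. The names `Machine`, `call`, `ret`, `spill_link` and `restore_link` are hypothetical, chosen only for illustration.

```python
class Machine:
    """Toy CPU with a link register and no hardware call stack (illustrative only)."""

    def __init__(self):
        self.pc = 0        # program counter
        self.link = None   # link register: holds exactly one return address
        self.memory = []   # software-managed stack living in ordinary memory

    def call(self, target):
        # CALL just remembers where to come back to, in a register.
        self.link = self.pc + 1
        self.pc = target

    def ret(self):
        # RETURN jumps to whatever the link register holds.
        self.pc = self.link

    def spill_link(self):
        # Synthesized "push": store the link register on the software stack.
        self.memory.append(self.link)

    def restore_link(self):
        # Synthesized "pop": reload the link register from the software stack.
        self.link = self.memory.pop()


m = Machine()
m.pc = 10
m.call(100)       # outer call: link now holds 11
m.spill_link()    # about to nest, so save the link register first
m.call(200)       # inner call overwrites link with 101
m.ret()           # inner return, back to 101
m.restore_link()  # recover the outer return address
m.ret()           # outer return, back to 11
```

A leaf routine never touches `memory` at all, which is the appeal of the link-register approach.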

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:Twelfth of Never by lenski · 2006-08-11 01:53 · Score: 1

For a good example, see the classical IBM 360/370/etc./Z-series "SAVE" macro. Also the PowerPC, which has no distinct instruction-level stack. It does, however, have some nice instructions and addressing modes that manipulate registers in a stack-friendly way.
Re:Twelfth of Never by Hal_Porter · 2006-08-11 02:32 · Score: 1

The ARM has single instruction PUSH and POP - even for multiple registers. You can use it as a function prolog/epilog. Even though it's a RISC chip, it actually does this stuff in fewer instructions than an x86. xxMFD instructions are more efficient too, since they're easy to turn into burst accesses to external memory.
; we need more than r0-r3 for computation, also we need to call functions so
; we must preserve R14 aka the link reg
stmfd R13!, {r5-r6,r14} ; push some registers
bl func ; call a function, overwriting the link reg
ldmfd R13!, {r5-r6,pc} ; get them back and return to caller by setting pc to the old R14

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:Twelfth of Never by Eric+LaForest · 2006-08-11 06:55 · Score: 2, Interesting

Correct, except 2nd-gen stack machines have a dedicated stack to hold those return addresses, so they never get to memory. Makes for very fast calls and returns. Experiments by Prof. Koopman have shown that for all practical purposes, a return address stack of 16 elements is deep enough. There is such a 16-deep stack (hidden from the programmer) on the Pentium 4 (and the Alpha AXP too I think): http://blogs.msdn.com/oldnewthing/archive/2004/12/16/317157.aspx
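A rough sketch of a fixed-depth return address stack, in Python. The class name `ReturnStack` is hypothetical, and the overflow behaviour is an assumption: here the oldest entry is silently dropped, as a hidden predictor stack might do, whereas a real stack machine would need to spill to memory or trap instead.

```python
from collections import deque


class ReturnStack:
    """Fixed-depth return address stack (illustrative; overflow policy is assumed)."""

    def __init__(self, depth=16):
        # A deque with maxlen discards the oldest entry when a new one
        # is appended to a full stack.
        self.entries = deque(maxlen=depth)

    def push(self, return_address):
        self.entries.append(return_address)

    def pop(self):
        # Returns None when empty, i.e. there is nothing left to return to.
        return self.entries.pop() if self.entries else None


rs = ReturnStack()
for address in range(20):   # 20 nested calls into a 16-deep stack
    rs.push(address)
popped = [rs.pop() for _ in range(17)]
```

With 20 nested calls the four oldest return addresses are lost, so only the 16 most recent returns come back out.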

--
none

Assembly Code was fun by neonprimetime · 2006-08-10 07:23 · Score: 1, Informative

Sounds like fun. I remember writing in RISC assembly code. I felt like a real man back then. C# just doesn't cut it.

fyi - The Open Office format Slide links don't work, so sadly I had to open the PPT file in Open Office instead.

Re:Assembly Code was fun by hal2814 · 2006-08-10 07:52 · Score: 4, Funny

RISC assembly code? That's so weak. I'd rather spend a day writing an assembly routine that has an equivalent single obscure machine instruction I didn't know about beforehand, thank you very much.
Re:Assembly Code was fun by x2A · 2006-08-10 08:51 · Score: 1

Yeah that's the great thing about CISC assembly... you write your code, feel great because it does actually work, then you can go learn a few more instructions, use them to make your code smaller and faster, and feel great about yourself AGAIN!

--
The revolution will not be televised... but it will have a page on Wikipedia
Re:Assembly Code was fun by Thuktun · 2006-08-10 12:05 · Score: 3, Funny

I'd rather spend a day writing an assembly routine that has an equivalent single obscure machine instruction I didn't know about beforehand, thank you very much.
http://www.netfunny.com/rhf/jokes/97/Nov/assembly.html
Re:Assembly Code was fun by rbanffy · 2006-08-10 15:00 · Score: 1

RISC Assembly is for sissies. Men do it in microcode.

Real men do it in hardware. ;-)

--
http://www.dieblinkenlights.com

Re:I Know... by Anonymous Coward · 2006-08-10 07:23 · Score: 0

It says blah blah blah here's some videos to torrent.

Oh? by qbwiz · 2006-08-10 07:23 · Score: 3, Funny

I thought the 387 and Burroughs B5000 were odd, antiquated architectures, but apparently they're the wave of the future.

--
Ewige Blumenkraft.

Re:Oh? by Rob+T+Firefly · 2006-08-10 07:26 · Score: 1

Everything old is new again! *kick-starts the old ENIAC*

--
Slashdot Burying Stories About Slashdot Media Owned
Re:Oh? by HiThere · 2006-08-10 07:49 · Score: 1

ENIAC's a great machine. All you need to do is re-implement it in nano-technology. Unfortunately, it's analog, so you won't be able to translate C to run on it.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Oh? by mclaincausey · 2006-08-10 07:51 · Score: 0, Offtopic

TheFNORD! last time I wrote anyFNORD!thing in a stack-based languaFNORD!ge it was silly little progrFNORD!ams in Reverse Polish LiFNORD!sp on an HP-48.

--
(%i1) factor(777353); (%o1) 777353
Re:Oh? by Rob+T+Firefly · 2006-08-10 08:04 · Score: 2, Funny

I had a good C interpreter ported over to punch cards, but one day I accidentally dropped crate #147 off the forklift and they went everywhere. Damn my lazy habit of not labelling my media!

I should be finished unshuffling them in another six or seven months.

--
Slashdot Burying Stories About Slashdot Media Owned
Re:Oh? by OrangeTide · 2006-08-10 08:15 · Score: 1

It's like how bellbottoms were poised to make a comeback in the early 1990s. It was just a big scam cooked up by the fashion industry because they were out of ideas (again).

Also Java JVM is a stack architecture, and we have lots of microcontrollers that run JVM natively. so basically you are all way behind the times on this "stack cpu fad"

--
“Common sense is not so common.” — Voltaire
Re:Oh? by cnettel · 2006-08-10 08:15 · Score: 0, Redundant

ENIAC was digital. That, and electronic, and programmable (and Turing complete).
Re:Oh? by The_Wilschon · 2006-08-10 08:53 · Score: 4, Informative

(and Turing complete)
Bzzzzt! No actual machine can ever be Turing complete, because theoretical Turing machines are capable of calculations which require an unbounded amount of space. That is, there exist algorithms which a Turing machine can execute which require more memory than any computer that you make.

Computer languages can be Turing complete, but physical computers cannot be.

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:Oh? by The_Wilschon · 2006-08-10 09:00 · Score: 1

In my experience at an American high school and then college in the south, bellbottoms didn't make a comeback, but flare-leg jeans did.......

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:Oh? by OrangeTide · 2006-08-10 09:11 · Score: 1

being a levi 501 man for the past 20 years I don't know if I could tell the difference between flare-leg and bellbottoms. likely there is no difference.

--
“Common sense is not so common.” — Voltaire
Re:Oh? by shawb · 2006-08-10 09:16 · Score: 1

Your post frightens me. I don't know what it is, but for some reason faint images of a fan with a black and white spiral come to mind.

--
I'll never make that mistake again, reading the experts' opinions. - Feynman
Re:Oh? by kfg · 2006-08-10 09:36 · Score: 1

No actual machine can ever be Turing complete, because theoretical Turing machines are capable of calculations which require an unbounded amount of space.

What if the universe isn't bounded?

Think about what the mice might be up to and be afraid.

KFG
Re:Oh? by homer_ca · 2006-08-10 09:39 · Score: 1

All fashion gets recycled. Five years ago the vest came back (the early 80's Back to the Future style vest). Two years ago trucker hats came back. I think the late 80's ripped jeans are coming back this year.
Re:Oh? by WilyCoder · 2006-08-10 09:41 · Score: 1

If you can write an algorithm that requires an unbounded amount of space, it should be assumed that memory contents can be written to disk. If that disk starts to get full, another disk is swapped in. Process continues.

From a theoretical perspective, RAM and HDD space are the same. They are both memory. Maybe disk swaps aren't 'practical' in real life, but we're dealing with theoretical constructs (like the Turing Machine itself). Is it theoretically possible to run a turing program on a stored-program computer? Yes.
Re:Oh? by fbjon · 2006-08-10 10:01 · Score: 1

Only if you have an unlimited amount of matter/energy from which to build the computer.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:Oh? by Slithe · 2006-08-10 10:02 · Score: 1

But what if you run out of hard drives?

--
---- "XML is like violence. If it doesn't fix the problem, you aren't using enough."
Re:Oh? by The_Wilschon · 2006-08-10 10:06 · Score: 1

Precisely. But nobody would buy bellbottoms, because OMGWTFBBQ that's soooo 70s. But flareleg jeans, which are exactly the same, sold. So in one sense the projection of bellbottoms coming back was wrong, because you can't go to the store and pick up "bellbottom jeans". But you can get the same thing under a different name, so the projection was right in another sense.

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:Oh? by novus+ordo · 2006-08-10 11:04 · Score: 2, Informative

What a brainfuck.

--
"You're everywhere. You're omnivorous."
Re:Oh? by Anonymous Coward · 2006-08-10 12:16 · Score: 1, Insightful

Bzzzzt! No actual machine can ever be Turing complete, because theoretical Turing machines are capable of calculations which require an unbounded amount of space.

Bzzzzt! No actual machine may be Turing complete because all actual machines are bounded (even if the difference rarely matters in practice), but a machine architecture can certainly be Turing complete, because they're also theoretical, abstract constructs. (To make a connection with your statements, a machine architecture is equivalent to a programming language, although that's a rather limited way of looking at Turing completeness as well.)

Proof: Simply specify any of the available explicit Turing machine descriptions as your machine architecture. You may only be able to build a reduced version of it, but your machine architecture is still Turing complete. And that's what we're talking about, machine architectures, not implementations.
Re:Oh? by TClevenger · 2006-08-10 16:27 · Score: 1

Kinda like the Dodge Magnum. 30 years ago, it would have been called a station wagon. Now station wagons are selling like hotcakes because they're called "crossovers."
Re:Oh? by HiThere · 2006-08-10 17:54 · Score: 1

Sorry. I was assuming that the AC of ENIAC stood for Analog Computer, as I believe was usual.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Oh? by chthon · 2006-08-10 18:40 · Score: 1

Electronic Numerical Integrator And Computer
Re:Oh? by rbarreira · 2006-08-11 00:18 · Score: 1

In the real world, when people say a machine is turing complete, what they mean is "it would be turing complete if it had infinite memory". Which is more than enough for all practical purposes.

--

The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
Re:Oh? by bWareiWare.co.uk · 2006-08-11 00:28 · Score: 1

But you would need pointers to that infinite amount of memory (which HDD did we store what on). Unfortunately a pointer in an infinite address space needs an infinite amount of storage, meaning you would have converted all matter in the universe into HDD before you had finished storing the first pointer.
Re:Oh? by fbjon · 2006-08-11 02:44 · Score: 1

No, it's Electronic Netiquette-Ignoring Anonymous Coward.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:Oh? by OrangeTide · 2006-08-12 15:20 · Score: 1

I can't wait for acid wash jeans to make a comeback!

--
“Common sense is not so common.” — Voltaire

wikipedia link by whitelines · 2006-08-10 07:24 · Score: 5, Informative

I didn't know either:
http://en.wikipedia.org/wiki/Stack_machines

--
/* TBD */

Re:wikipedia link by Anonymous Coward · 2006-08-10 07:37 · Score: 0

Damn, when I checked out the Wiki site, it said

"In computer science, a stack machine is a model of computation in which the computer's memory takes the form of one or more horse penises."

Good old Wikipedia!
Re:wikipedia link by The+MAZZTer · 2006-08-10 07:41 · Score: 1, Offtopic

Some idiot changed "Turing machine" to "Penis machine". But then someone else changed it back before I could... good ol' wikipedia indeed.
Re:wikipedia link by Anonymous Coward · 2006-08-10 08:06 · Score: 0

Odd, shouldn't a stack(ed) machine be related to large boobs rather than penises?
A well hung machine?
Re:wikipedia link by Anonymous Coward · 2006-08-10 08:12 · Score: 0

A better Wiki reference for stacked machines:

http://en.wikipedia.org/wiki/Seven_of_nine
Re:wikipedia link by SatanicPuppy · 2006-08-10 08:41 · Score: 3, Informative

This source about stack computing is better.

Sadly I actually still work on a stack computer, and I had to go look it up.

--
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
Re:wikipedia link by EvilBudMan · 2006-08-10 08:48 · Score: 1

I thought the concept was last in first out.
Re:wikipedia link by Anonymous Coward · 2006-08-10 10:01 · Score: 0

Good old Wikipedia!

It said that for precisely 51 seconds. Not bad for a page that anyone can vandalise.
Re:wikipedia link by aminorex · 2006-08-10 11:59 · Score: 1

Dang, I had a plan to rake in the bucks, but the yokels refused to give me an adultcheck id for TuringMachines.com.

--
-I like my women like I like my tea: green-

Bit-torrent by KingEomer · 2006-08-10 07:25 · Score: 1

Well, time to test out the CSC's new torrenting capabilities.

Size and functionality by Angst+Badger · 2006-08-10 07:25 · Score: 4, Insightful

He also claims that a kernel would only be a few kilobytes large!

I've seen sub-1k kernels for FORTH systems before. The question is, how much functionality do you want to wrap into that kernel? More capable kernels would, of course, be correspondingly larger.

That said, stack computing and languages like FORTH have long been underrated. Depending on the application, the combination of stack computers and postfix languages can be quite powerful.

--
Proud member of the Weirdo-American community.

Re:Size and functionality by mrchaotica · 2006-08-10 07:35 · Score: 1

Depending on the application, the combination of stack computers and postfix languages can be quite powerful.

Why would the type of notation matter? Couldn't you program a stack computer just as well with a prefix functional language like Scheme?

--
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
Re:Size and functionality by merlin_jim · 2006-08-10 07:55 · Score: 4, Insightful

Couldn't you program a stack computer just as well with a prefix functional language like Scheme?

Sure you can - and it compiles to postfix notation anyways, rather inefficiently I might add (get it, add????)

let's say you wanted to write a function like:
function addsubandmultiply(b, c, d, e) {
a = (b + c) * (d - e);
return a;
}

and you've got assembly level instructions such as mov, add, sub, mult, push, and pop, as well as
the very stack-centric stor and lod, allowing you to move one or more stack variables to memory and
the reverse.

A typical register based computer might compile the above as:

pop b
pop c
pop d
pop e
mov b, ax
mov c, bx
add bx
mov ax, temp_memory
mov d, ax
mov e, bx
sub bx
mov temp_memory, bx
mult bx
push a

Whereas a stack-based computer might compile as:

add
stor temp_memory
sub
lod temp_memory
mult
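To make "operations are carried out directly on the stack" concrete, here is a toy stack machine in Python. It is a sketch only: the opcode names `push`, `add`, `sub` and `mult` are borrowed from the post, the function `run` is hypothetical, and the program pushes each operand when it is needed rather than using the stor/lod sequence in the listing above.

```python
def run(program, env):
    """Execute a list of stack instructions and return the final stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            # Operand comes from a named variable; everything else is implicit.
            stack.append(env[instr[1]])
            continue
        # Arithmetic ops take their two operands from the top of the stack
        # and leave the result there. No register names appear anywhere.
        right = stack.pop()
        left = stack.pop()
        if op == "add":
            stack.append(left + right)
        elif op == "sub":
            stack.append(left - right)
        elif op == "mult":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# a = (b + c) * (d - e)
program = [
    ("push", "b"), ("push", "c"), ("add",),
    ("push", "d"), ("push", "e"), ("sub",),
    ("mult",),
]
result = run(program, {"b": 2, "c": 3, "d": 7, "e": 4})
```

Only the `push` instructions carry an operand, which is where the code density of stack machines comes from.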

In a stack based computer, operations are carried out directly on your stack... it's very convenient, since most languages compile function calls to use the stack anyways, and as you can see not having to deal with an accumulator register makes for much terser code. Between 20 - 40% of your compiled code is spent moving data in and out of the accumulator register, since most instructions depend on specific data being in that register - to the point that they introduced zero-cycle add/mov functionality in the P4 line - basically, if your code performs an add and then movs ax immediately out to memory (like the above code - and possibly the most common arithmetic operation in compiled code), if the pipeline and data caches are all available, the P4 will execute both instructions with enough time to put something else in the instruction pipeline that cycle. It's not really a zero-cycle function - you can do something like 2.5 (add,mov,add,mov,add) a cycle if you stack them back to back to back, for instance...

Yes, Intel released a benchmark for it. No, I can't imagine why you would want to keep adding and moving the results around memory - maybe some esoteric functions like a Fibonacci generator or even a DSP algorithm of some sort might need to do it, but I don't think it'll be all that often... or that any compiler would have an optimisation to specifically output that sequence if appropriate...

--
I am disrespectful to dirt! Can you see that I am serious?!
Re:Size and functionality by Anonymous Coward · 2006-08-10 08:27 · Score: 4, Insightful

Actually x86 is in between a stack machine and a register based machine.
Most register machines compile the following code:
function addsubandmultiply(b, c, d, e) {
a = (b + c) * (d - e);
return a;
}
Into something like (sorry for PPC asm):
add r3, r3, r4
sub r4, r5, r6
mullw r3, r3, r4
blr #(return)

Now tell me that is not just as simple (or even simpler) as the stack based one?
Re:Size and functionality by treyb · 2006-08-10 08:48 · Score: 1

In Forth, we'd write:
: -rot rot rot ;
: addsubmul ( b c d e -- a ) - -rot + * ;
which in many stack machines maps exactly to machine instructions:
sub rot rot add mul
where rot rotates the top three items on the stack, moving the third to the top ( a b c -- b c a ).
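The definition can be checked with a toy word interpreter in Python. This is only a sketch of how colon definitions expand into primitives, not a real Forth: the tables `PRIMITIVES` and `DEFINITIONS` and the function `execute` are hypothetical names.

```python
def rot(stack):
    # ( a b c -- b c a ): move the third item to the top.
    c = stack.pop()
    b = stack.pop()
    a = stack.pop()
    stack.extend([b, c, a])


def binary(fn):
    # Wrap a two-argument function as a word that consumes two items
    # and leaves one result on the stack.
    def word(stack):
        right = stack.pop()
        left = stack.pop()
        stack.append(fn(left, right))
    return word


PRIMITIVES = {
    "rot": rot,
    "+": binary(lambda x, y: x + y),
    "-": binary(lambda x, y: x - y),
    "*": binary(lambda x, y: x * y),
}

# The two colon definitions from the post, stored as lists of words.
DEFINITIONS = {
    "-rot": ["rot", "rot"],
    "addsubmul": ["-", "-rot", "+", "*"],
}


def execute(word, stack):
    if word in DEFINITIONS:
        for inner in DEFINITIONS[word]:
            execute(inner, stack)
    else:
        PRIMITIVES[word](stack)


stack = [10, 20, 9, 4]       # b c d e, with e on top
execute("addsubmul", stack)  # leaves (10 + 20) * (9 - 4)
```

Expanding `addsubmul` fully gives `- rot rot + *`, the same five words as the machine instruction sequence quoted above.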
Re:Size and functionality by BlueDreaux · 2006-08-10 08:50 · Score: 1

Reminds me of my language translators class. In writing my translator I saved myself a bunch of gorilla work by using Macros in order to avoid having to manually type many common sets of machine code instructions to translate to. Of course you could do things with a stack and think you're flying fast shit but then you're really just flying slow shit. You should give a processor with a proper matrix instruction set a try some time and you'll see that there's far superior methods to handling, maintaining, and engineering problem solutions than just using a simple stack. Too bad the prices on real matrix operating processors haven't gotten to be affordable yet, perhaps then people would become a lot more programming literate.
Re:Size and functionality by merlin_jim · 2006-08-10 08:53 · Score: 1

lol forth was one of my first languages... how I missed the rot!!!!

But yeah forth's big two advantages; most operations map to atomic instructions, and postfix notation matches instruction order. Again assuming you have a stack machine....

If only every language could be a simple lol...

--
I am disrespectful to dirt! Can you see that I am serious?!
Re:Size and functionality by shawnce · 2006-08-10 08:53 · Score: 1

Thank god most OS vendors are utilizing register based ABIs on x86-64 (would have loved 32 named registers... but 16 is better than nothing) to allow operations similar to what you outlined for PPC.
Re:Size and functionality by Anonymous Coward · 2006-08-10 09:04 · Score: 0

ACC!
Re:Size and functionality by Anonymous Coward · 2006-08-10 09:23 · Score: 0

Dude, apart from your "register based code" being in horrible AT&T syntax, it's also horribly inefficient... Here's how it {w,c}ould look on x86 for unsigned integers.

; arg_* are read directly from the stack (ie., through ESP + offset)
mov edx, [arg_d]
mov eax, [arg_c]
mov ecx, [arg_b]
sub edx, [arg_e]
add eax, ecx
imul eax, edx
; return value in EAX, as per x86-standard Intel ABI
Re:Size and functionality by Guy+Harris · 2006-08-10 09:40 · Score: 1

Wouldn't that be more like
mov edx, [arg_d]
mov eax, [arg_c]
sub edx, [arg_e]
add eax, [arg_b]
imul eax, edx
(one instruction fewer) given that x86 supports memory-to-register arithmetic ops?
Re:Size and functionality by speculatrix · 2006-08-10 09:41 · Score: 1

Forth lives on at Sun Microsystems: the OpenBoot PROM, the equivalent of the BIOS, is a Forth engine.
A simple home computer in the UK called the Jupiter Ace was entirely Forth based; it still has a fan club today!
Re:Size and functionality by Chris+Burke · 2006-08-10 11:11 · Score: 4, Informative

Between 20 - 40% of your compiled code is spent moving data in and out of the accumulator register, since most instructions depend on specific data being in that register - to the point that they introduced zero-cycle add/mov functionality in the P4 line - basically, if your code performs an add and then movs ax immediately out to memory (like the above code - and possibly the most common arithmetic operation in compiled code), if the pipeline and data caches are all available, the P4 will execute both instructions with enough time to put something else in the instruction pipeline that cycle. It's not really a zero-cycle function - you can do something like 2.5 (add,mov,add,mov,add) a cycle if you stack them back to back to back, for instance...

The only zero-cycle mov I'm familiar with on the P4 is a register-to-register mov, and that just takes advantage of the fact that the P4 has a physical register file and a map between the architectural registers and the physical ones. E.g. given
add bx, [cx]
mov ax, bx

the mapper might assign bx to physical register 10. It will then realize that ax is just a copy of bx, so it will make ax point at register 10 as well, and the mov never has to execute at all, thus 'zero cycle'.
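A sketch of that mapping trick in Python. It is a simplification under stated assumptions: the class `RenameMap` and its methods are hypothetical, physical register numbers are simply handed out in order, and the bookkeeping real hardware needs to free physical registers that are shared by several names is left out.

```python
class RenameMap:
    """Toy map from architectural to physical registers (illustrative only)."""

    def __init__(self, arch_regs):
        self.next_phys = 0
        self.map = {}
        for reg in arch_regs:
            self.map[reg] = self._allocate()

    def _allocate(self):
        phys = self.next_phys
        self.next_phys += 1
        return phys

    def write(self, dest):
        # An instruction that produces a new value gets a fresh physical
        # register for its destination.
        self.map[dest] = self._allocate()
        return self.map[dest]

    def mov(self, dest, src):
        # Register-to-register mov: nothing is executed, the destination
        # name is simply pointed at the source's physical register.
        self.map[dest] = self.map[src]


rm = RenameMap(["ax", "bx", "cx"])
bx_phys = rm.write("bx")   # add bx, [cx]  -> bx gets a new physical register
rm.mov("ax", "bx")         # mov ax, bx    -> ax aliases it, no execution needed
```

After the mov, both names resolve to the same physical register, which is why the copy costs no execution cycle.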

You seem to be saying that the P4 can write the result of an add to the cache in zero cycles, or more than two values in a cycle, which doesn't mesh with what I know of the P4, which is that it has a two-ported cache. But I'm only intimately familiar with early revs of P4; if you know what rev this was added in I would be interested.

--

The enemies of Democracy are
Re:Size and functionality by Anonymous Coward · 2006-08-10 12:15 · Score: 0

"In a stack based computer, operations are carried out directly on your stack..."

How big is this stack? I hope it's not arbitrarily large, 'cause then you're using main RAM and you can kiss your performance gains goodbye. Regardless, I don't see a stack machine doing very well with serious frequent context switches. I've seen some tiny forth apps do amazing things... but rarely in real time.
Re:Size and functionality by Anonymous Coward · 2006-08-10 13:20 · Score: 0

There is no inherent difference in performance. It's just as easy to cache the top of the stack as any other memory. If anything, stack machines have LESS state to save on a context switch.

Although forth is OK, the best stack language I've programmed in is Postscript. In addition to having full general purpose programming capabilities, it has all those neat matrix operations.
Re:Size and functionality by budgenator · 2006-08-10 16:02 · Score: 1

Comparing a stack based computer to a register based computer is the same as comparing an HP calculator to a TI. The stack based machine will be blazingly efficient, but it takes a proper mindset to program it and not everybody can do it.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds
Re:Size and functionality by Hal_Porter · 2006-08-11 02:08 · Score: 1

in x64 (64 bit x86) you could do something like this -
function addsubandmultiply(b, c, d, e) {
a = (b + c) * (d - e);
return a;
}
; args in rcx, rdx, r8, r9
; we need to do rax = ( rcx + rdx ) * ( r8 - r9 )
sub r8, r9 ; r8 -= r9
lea rax, [rcx+rdx] ; rax = rcx+rdx
mul r8 ; rax *= r8 (high half of the product lands in rdx)
mul needs one of the params in RAX, and the LEA is just a way to cause that and do the add in a single instruction.

On an ARM
function addsubandmultiply(b, c, d, e) {
a = (b + c) * (d - e);
return a;
}
; args in R0, R1, R2, R3
; we need to do r0 = ( r0 + r1 ) * ( r2 - r3 )
sub r2, r2, r3 ; r2 = r2 - r3
add r0, r0, r1 ; r0 = r0 + r1
mul r0, r0, r2 ; r0 = r0 * r2
No need for any monkey business on the ARM as you'd expect.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Re:Size and functionality by merlin_jim · 2006-08-11 02:19 · Score: 1

But I'm only intimately familiar with early revs of P4; if you know what rev this was added in I would be interested.

I stumbled on it while researching the Intel Performance pack, which uses the various MMX/SSE/SSE2/etc features of a processor to implement a nice fast floating point library. A few of the functions use this and a few of the other zero-cycle instructions to get performance down below the 0.5 instructions/cycle mark...

I spent a few minutes trolling intel.com trying to find some document that mentions this - there are zero-cycle instructions on the P4 (reg-reg mov, fxch, a few others) but I couldn't find anything about this so I may have been mistaken or confused...

--
I am disrespectful to dirt! Can you see that I am serious?!
Re:Size and functionality by Guy+Harris · 2006-08-14 13:13 · Score: 1

Between 20 - 40% of your compiled code is spent moving data in and out of the accumulator register, since most instructions depend on specific data being in that register - to the point that they introduced zero-cycle add/mov functionality in the P4 line

Got any numbers to back up that "most" claim? Or by "the accumulator register" and "that register" do you really mean "any of the 8 GPRs"?
Re:Size and functionality by Guy+Harris · 2006-08-14 13:29 · Score: 1

mul needs one of the params in RAX

...but imul doesn't - one of the remaining irregularities.
Re:Size and functionality by dezert_fox · 2006-08-15 12:59 · Score: 1

The actual code is simple. With the stack based machine, the code and the actual process which ensues are simple. What you've shown there is not a mere 4 clock cycles, but more like ~15; it makes use of a complicated instruction set which implies a much larger amount of work to physically do than to write.

Linking to 300MB video files from Slashdot? by Colin+Smith · 2006-08-10 07:25 · Score: 2, Funny

Someone's having a larf. Oh you do crack me up Messrs mymanfryday and CmdrTaco.

Please try the bittorrent. No, wait... Teach em a lesson, make em burn.

--
Deleted

.NET Compatibility by DaHat · 2006-08-10 07:26 · Score: 4, Interesting

Interestingly enough the Microsoft Intermediate Language (MSIL) that .NET apps are compiled to before being JITed into machine code is actually built around a stack based system as well... No doubt porting the .NET Framework over to such a system would be quite easy... and give much in the way of performance boosts (especially on startup).

Of course... that would still depend on a version of Windows for it to run on.

--
Help Brendan pay off his student loans

Re:.NET Compatibility by jfengel · 2006-08-10 07:31 · Score: 3, Informative

The Java Virtual Machine is also sort of stack-based. The JVM bytecode set uses stack operations but the safety checks that it runs make it equivalent to a sliding set of registers not unlike, say, the SPARC architecture. A JIT implementation could do away with the stack, at least in the individual method calls, though the call-return stack would still have to be there.
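A sketch of why verified stack code is equivalent to register code, in Python. Because the stack depth is known statically at every instruction, stack slot N can simply be renamed to register rN. The opcodes `load`, `add`, `sub` and `mul` and the function `to_registers` are hypothetical; this is not real JVM bytecode, and it handles straight-line code only.

```python
def to_registers(program):
    """Rewrite straight-line stack code as three-address register code."""
    depth = 0      # current stack depth, known statically at each step
    output = []
    for instr in program:
        op = instr[0]
        if op == "load":
            # A push writes the next free stack slot, i.e. register r<depth>.
            output.append(f"r{depth} = {instr[1]}")
            depth += 1
        elif op in ("add", "sub", "mul"):
            # Two operands are consumed and one result produced, so the
            # result lands in the slot of the lower operand.
            depth -= 1
            output.append(f"r{depth - 1} = {op} r{depth - 1}, r{depth}")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


# a = (b + c) * (d - e)
stack_code = [
    ("load", "b"), ("load", "c"), ("add",),
    ("load", "d"), ("load", "e"), ("sub",),
    ("mul",),
]
register_code = to_registers(stack_code)
```

The maximum depth reached (three here) is the number of registers the method needs, which a JIT can then map onto real ones.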
Re:.NET Compatibility by evil_Tak · 2006-08-10 08:02 · Score: 1

Nah, just some platform for http://www.mono-project.com/
Re:.NET Compatibility by Anonymous Coward · 2006-08-10 09:36 · Score: 0

Interestingly enough the Microsoft Intermediate Language (MSIL) that .NET apps are compiled to before being JITed into machine code is actually built around a stack based system as well.

Of course it is! They just copied Java, which is stack based.
Re:.NET Compatibility by Anonymous Coward · 2006-08-10 10:28 · Score: 0

Interestingly enough the Microsoft Intermediate Language (MSIL) that .NET apps are compiled to before being JITed into machine code is actually built around a stack based system as well

As indeed is the Java Virtual Machine - as was the UCSD p-code.

Both of which (and I'd be surprised if the MS is not) derive from Peter Landin's SECD machine (Stack, Environment, Control, Dump) described in 1964.
Re:.NET Compatibility by Chysn · 2006-08-10 10:32 · Score: 1

The language previously known as MSIL is now called CIL, Common Intermediate Language, and the idea is that other (non-Micro$oft) platforms can run .NET Framework stuff.

So far, I haven't had the experience that C# and .NET are anything other than an MS-specific skill. But one always hopes for... heh... interoperability and portability.

--
--I'm so big, my sig has its own sig.
-- See?
Re:.NET Compatibility by Jerry+Coffin · 2006-08-10 11:16 · Score: 1

They just copied Java, which is stack based.

..or maybe Sun copied it from Microsoft, who'd been using it in VB for years, and in QuickBASIC for years before that. MS even had a C/C++ compiler that produced stack-based P-code (MS C/C++ 7.0, the immediate predecessor to VC++ 1.0). QuickPascal (short-lived though it was) also used stack-based P-code.
Of course, none of this is anywhere close to original with either MS or Sun. Early Pascal compilers (the Pascal-P system from the early 1970s) produced stack-based P-code, which was then interpreted by a virtual machine.
Getting back to the original subject of TFA, I'm afraid the poster was letting his enthusiasm get the better of him. Stack-based machines have one fairly fundamental problem: they make extremely heavy usage of a few registers at the very top of the stack. Nearly every instruction depends on those registers. Modern CPUs attempt to execute a number of instructions in parallel. They do this (in large part) by keeping track of dependencies between instructions -- e.g. when a value used as the input to one instruction is the value produced by a previous instruction.
Now, when you have a number of registers you can use interchangeably, you do your best to cycle through them to produce values in different registers before using them in subsequent instructions. The instructions that don't directly depend on each other can be executed in parallel. Most modern processors can theoretically produce around 3 or 4 results per clock, and most actually DO produce close to 2 results per clock as a general rule.
A stack-based processor makes this substantially more difficult. Most of the difficulty results from the fact that most instructions have implicit sources and an implicit destination (e.g. an ADD instruction takes the two items at the top of the stack, adds them together, and deposits the result back on the top of the stack).
This means a few registers near the top of the stack are used almost constantly, and nearly every instruction depends on them. It's possible to break this dependency by using register renaming. For example, if you have something like:
lod a
lod b
add
lod c
lod d
mul
sub
sto x

[equivalent to: x = (a+b)-(c*d) ]
You can break the dependency by internally allocating a number of different registers that will each act as the top of the stack at different times. Each time you load a value, you allocate a new register for it, independent of the previous top of stack register. Internally, the processor can figure out that the code above is equivalent to:
lod R0, a
lod R1, b
add R0, R1
lod R2, c
lod R3, d
mul R2, R3
sub R0, R2
sto x, R0

But even a minimal look at that reveals something pretty obvious: you no longer have a stack-based CPU at all -- you're really using a perfectly normal register-based CPU.
This is largely why stack-based architectures are so popular in virtual machines and such. On one hand, they abstract out most of the details that vary between different CPUs, such as the number of registers. OTOH, it's generally quite easy to execute stack-based code efficiently on a register-based CPU.
There might be an advantage to this being done internally in the CPU though. The most obvious would be that instructions that need to specify sources and destinations are larger than instructions that use implicit sources and destinations. That means the CPU is reading data from memory that it could pretty easily figure out on its own. Given the large (and increasing) disparity between memory speed and processor speed, this could be a real improvement.
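
For anyone who wants to play with the renaming idea, here is a minimal sketch in Python, not a description of what any real CPU does. The function name `rename_stack_code` and the output notation (destination first, with the store naming its source register explicitly) are assumptions made for illustration. It keeps a symbolic stack of register names, allocates a fresh register on every load, and lets each binary op write into the register holding its left operand:

```python
def rename_stack_code(program):
    """Translate stack ops (lod/add/sub/mul/sto) into register-style ops.

    Illustrative only: every load gets a fresh register, and a binary op
    leaves its result in the register that held its left operand.
    """
    stack = []       # register names currently standing in for stack slots
    out = []
    next_reg = 0
    for line in program:
        op, *args = line.split()
        if op == "lod":
            reg = f"R{next_reg}"
            next_reg += 1
            stack.append(reg)
            out.append(f"lod {reg}, {args[0]}")
        elif op in ("add", "sub", "mul"):
            right = stack.pop()
            left = stack.pop()
            out.append(f"{op} {left}, {right}")
            stack.append(left)           # result now lives in `left`
        elif op == "sto":
            out.append(f"sto {args[0]}, {stack.pop()}")
        else:
            raise ValueError(f"unknown op: {op}")
    return out


stack_code = ["lod a", "lod b", "add", "lod c", "lod d", "mul", "sub", "sto x"]
register_code = rename_stack_code(stack_code)
print("\n".join(register_code))
```

Note that the `add` and the `mul` in the output touch disjoint registers, which is exactly what lets a scheduler run them in parallel.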

--
The universe is a figment of its own imagination.
Re:.NET Compatibility by anothy · 2006-08-11 01:05 · Score: 1

...it's generally quite easy to execute stack-based code efficiently on a register-based CPU.
i don't think this is true. it's easy to write the code to do it, but the translation between a stack-based system and a more traditional system remains an expensive operation in most cases. take a look at The design of the Inferno virtual machine for a description of the Dis virtual machine and a discussion of the comparative benefits of stack and memory transfer virtual machines.

of course, this discussion is somewhat removed from the article topic, which is the implementation of a real stack-based processor. this sounds promising, as long as it's not designed to be a "java processor" or a "CLI processor" - those always fail. there's another really good Bell Labs paper talking about their experiences with the hobbit/crisp chip, designed to be a "C processor", and why that's a fundamentally flawed idea. lacking that, see the wikipedia article on the Hobbit, particularly the last paragraph.

--

i speak for myself and those who like what i say.

They're great by TechyImmigrant · 2006-08-10 07:26 · Score: 5, Funny

Mathematicians like stack computers because it's easier to formally prove the behaviour of algorithms using stacks.
Hardware engineers like stack computers because the hardware is interesting and easy to design.
Investors hate them because they keep loosing money on them.

--
Evil people are out to get you.

Re:They're great by freeweed · 2006-08-10 08:43 · Score: 1

Investors hate them because they keep loosing money on them.

Yeah, but do they ever make any of it back? :)

--
Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.

We are heard this before... by __aaclcg7560 · 2006-08-10 07:26 · Score: 3, Funny

Apparently NASA uses stack computers in some of their probes.

In space no one can hear you blue screen of death. Unless you work for Lucas Films.

PC Stacks by celardore · 2006-08-10 07:26 · Score: 5, Funny

I once had a job where I had to sort through stacks of computers. Overall the stacks were pretty useless, a bunch of burnt out 286s. Putting all your redundant computing power into a stack doesn't necessarily make it better!

Awesome by LinuxFreakus · 2006-08-10 07:27 · Score: 4, Insightful

Does this mean my old HP48GX will be considered cutting edge? I should get ready to sell it on eBay when the craze hits! All my old classmates will be forced to allow me to have the last laugh after I was on the receiving end of much ridicule for using the HP when the TI was the only thing "officially" endorsed by all the calculus textbooks. I don't know if I could ever part with it though. I still use it almost daily, the thing continues to kick ass.

Re:Awesome by Anonymous Coward · 2006-08-10 07:46 · Score: 0

You should have beaten them unconscious with your calculator, then resumed performing real mathematics. HP calculators were built like tanks and used by tank designers. One of my classmates accidentally ran his over with his car, and it didn't break.
Re:Awesome by pclminion · 2006-08-10 08:05 · Score: 1

Just because the HP calc used RPN doesn't mean the CPU itself was stack based. Does anybody know the specific processors used in those calculators?
Re:Awesome by LinuxFreakus · 2006-08-10 08:13 · Score: 1

Yes, they used 8 bit saturn microprocessors. I believe they are in fact register based. But my comment was meant to be sarcastic anyway :)
Re:Awesome by seminumerical · 2006-08-10 08:24 · Score: 1

I have an HP 48GX. One cannot think with a TI in hand. Once one has typeset the equation and carefully paired off the matching parentheses a TI reports the answer, but one isn't involved. On the other hand, solving problems with an HP is an interactive, instructive, and inspirational process. I'd be happy if they gave five function RPN calculators to elementary school students (reciprocal, 1/x, would be the fifth function).
But by preference I use a 1970's HP 65 (with a cardreader! I got it on Ebay). When they gave HP's more power and more features, they made them less good. HP peaked with the HP 67. Also HP failed to give the 48 GX a good programming language, and they gave up on the mythology of greatness.
I got in trouble in college for designing a stack computer and writing a sample program for it, in defiance of instructions to design a register based machine as an assignment.

--
In wartime... truth is so precious that she should always be attended by a bodyguard of lies. (Churchill)
Re:Awesome by jepaton · 2006-08-10 08:28 · Score: 1

<< "" "dneirf olleH" while dup size repeat dup head swap tail rot rot swap + swap end drop msgbox >> Perhaps I should stick to Perl :)
Re:Awesome by $RANDOMLUSER · 2006-08-10 08:39 · Score: 1

> Perhaps I should stick to Perl
Doesn't matter, they're both write-only languages.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Re:Awesome by caseih · 2006-08-10 09:10 · Score: 1

Nope. The Saturn processor is a 9-register cpu (each register is 64-bits) coupled with an outrageous 4-bit data bus. RPL and the stack are created in software.

http://en.wikipedia.org/wiki/Saturn_(microprocessor)
Re:Awesome by ettlz · 2006-08-10 10:24 · Score: 1

Also HP failed to give the 48 GX a good programming language, and they gave up on the mythology of greatness.
I think HP did very well with User RPL considering the hardware limitations. It's a remarkably powerful and elegant language that should've been applied elsewhere. For example, its object-orientation features could've been greatly expanded to permit a user-defined class structure. Parallel execution is another possible feature (not on a calculator, but you could easily run multiple concurrent stacks on a computer). And you could go berserk with RPL's namespace system. How many other calculators have a programming language that can re-write its own programs?
Re:Awesome by Anonymous Coward · 2006-08-10 11:28 · Score: 0

Just my opinion, but they should have used a minimal, stripped down, conventional programming language. Or offered the choice of many. I ran pascal on a 32 kb pc in 81 or 82. A stripped down c or python would have been acceptable on an HP also. The 48 GX asking me to learn yet another programming language was hubris. I had no choice with the HP 65/67 (it was like a low assembler), but with the 48 GX I would just say "F" it and turn to the PC. Plus, a car could run over a 1970s HP, but I couldn't even step on my 48 GX without breaking it.
remember when the HP salesman would throw the HP against a concrete wall and then show you that it still worked? remember when one got run over by a tank (on a dirt road, not concrete admittedly (and it was the calculator not the salesman who was run over)) and though deformed it still worked. And the feel of the keys, the feedback from them, was great. I solved some really hard problems on the 65, with only 100 instructions. Wythoff's nim, factoring 9 digit numbers, orbital calculations. Twas great.

Forth? by dslmodem · 2006-08-10 07:28 · Score: 2, Informative

I remember that FORTH is a language that supports stack computing. Hopefully, I am not totally wrong. Unfortunately, FORTH programs are really hard to understand.

http://en.wikipedia.org/wiki/Forth_programming_language

--

^(oo)^pig~

Re:Forth? by dslmodem · 2006-08-10 07:32 · Score: 1

FORTH is a fun language to learn though. I've had 4 happy months with it. Just think about it: I started to speak to others like this:
"wash your face; wash your mouth; brush your teeth;", which is totally out of common order. :-)

--
^(oo)^pig~
Re:Forth? by CaptnMArk · 2006-08-10 07:55 · Score: 2, Informative

wouldn't that be more like:

face; mouth; teeth; brush; wash; wash;
Re:Forth? by Eric+LaForest · 2006-08-10 08:27 · Score: 1

Second-gen stack computers are basically a hardware implementation of the Forth virtual machine, so Forth code maps pretty much directly to such a machine.

--
none
Re:Forth? by MROD · 2006-08-10 09:12 · Score: 1

Ah, yes, FORTH.... what a wonderful write-only language that is! I remember it on the venerable Jupiter Cantab Jupiter Ace.

Anyway, there's the obligatory Slashdot...

May the FORTH be with you! ;-)

--

Agrajag: "Oh no, not again!"
Re:Forth? by Ignominious+Cow+Herd · 2006-08-10 16:48 · Score: 1

"Forth with you may be."

Wow! All this time Yoda was speaking in Forth?
Or is this just some Soviet Russia mind-trick?

--
Lump lingered last in line for brains, and the ones she got were sorta rotten and insane.
Re:Forth? by NickFitz · 2006-08-11 01:45 · Score: 1

Having worked with Forth from time to time over the last 22 years, I would say that it's badly-written Forth programs that are hard to understand - in other words, the same as any other language.

The key to understanding Forth is that it is really a language for writing application-specific languages, in which you then write your program. All the words you define become part of the language with the same status as any other word, but for preference you use the low-level words to define high-level words which don't contain many of the "noise" words relating to stack manipulation.

As an example, when I worked as a games programmer in the 1980s I implemented Forth on the Atari ST, and then implemented a conversion of a C64 game in Forth. One particular aspect of this abstract puzzle game was that when time ran out, everything on the playfield blew up. The Forth word that did this was:

: kill-everything ( - ) player explodes mine explodes pod explodes toprim explodes siderim explodes bubble bursts ;

(Slashdot's useless formatting messes up the indentation in these examples; why don't they support <pre>?)

The word "explodes" did a bunch of low-level stuff resetting animations to their start positions and assigning the relevant frame sequences to them; but you could read the high-level words and understand exactly what was going on.

For what it's worth, I could have said "bubble explodes", but bubbles don't explode, they burst. So I defined one extra word:

: bursts ( a - ) explodes ;

because it reads better that way :-)
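
To see why user-defined words end up with the same status as built-in ones, here is a toy sketch in Python of a Forth-like interpreter. It is nothing like the original Atari ST implementation: the names `make_interpreter`, `run` and `log` are hypothetical, and the primitives are stand-ins in which each object word pushes its own name and `explodes` pops one object and records what happened to it.

```python
def strip_comments(tokens):
    """Drop Forth-style ( ... ) stack comments from a token list."""
    kept, in_comment = [], False
    for tok in tokens:
        if tok == "(":
            in_comment = True
        elif tok == ")" and in_comment:
            in_comment = False
        elif not in_comment:
            kept.append(tok)
    return kept


def make_interpreter():
    stack, log, words = [], [], {}

    # Hypothetical primitives standing in for the game's low-level words:
    # each object name pushes itself; "explodes" consumes one object.
    for name in ("player", "mine", "pod", "bubble"):
        words[name] = lambda name=name: stack.append(name)
    words["explodes"] = lambda: log.append(stack.pop() + " explodes")

    def execute(word):
        definition = words[word]
        if callable(definition):
            definition()                 # built-in primitive
        else:
            for inner in definition:     # user-defined word: run its body
                execute(inner)

    def run(source):
        tokens = strip_comments(source.split())
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":         # colon definition: ": name body ;"
                end = tokens.index(";", i)
                words[tokens[i + 1]] = tokens[i + 2:end]
                i = end + 1
            else:
                execute(tokens[i])
                i += 1

    return run, log


run, log = make_interpreter()
run(": bursts ( a - ) explodes ;")
run(": kill-everything ( - ) player explodes mine explodes pod explodes bubble bursts ;")
run("kill-everything")
print(log)
```

Once `bursts` and `kill-everything` are defined they sit in the same dictionary as the primitives, so later definitions can use them without caring which layer they came from.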

--
Using HTML in email is like putting sound effects on your phone calls. Just say <strong>no</strong>.

Re:All your architecture are belong to us by ahsile · 2006-08-10 07:29 · Score: 0, Offtopic

Damnit, try and make a half-ass witty remark... and then add first pr0st at the end. Something you speculate about the article, which will get you modded high at first, and by the time anyone reads the article and realises you were wrong, it's too late.

This way you get first post, and karma whore at the same time. You will soon learn, my pupil.

--

Find Nearby Indie Events

When will you learn? by frostilicus2 · 2006-08-10 07:30 · Score: 1

Universities may have tons of bandwidth, but the servers just can't take it. Looks like this site's dead.

Anyone seeding the torrents yet?

--
Nothing sucks like a Vax, nothing blows like a PowerMac G4

Does it run Windows?!? by Stealth+Dave · 2006-08-10 07:30 · Score: 5, Funny

I wonder if Windows will be supported on a stack computer in the future?

No, no, no, NO! This is SLASHDOT! The proper response is "Does it run Linux?"

--
Evil is as eval("does");

Re:Does it run Windows?!? by Kaenneth · 2006-08-10 08:00 · Score: 1

I think you mean a Beowolf Cluster...
Re:Does it run Windows?!? by doi · 2006-08-10 08:13 · Score: 2, Funny

I think you mean a Beowolf Cluster...
I think you mean a Beowolf STACK...
Which would be better, a cluster of Beowolf Stacks, or a stack of Beowolf Clusters? Of course, the answer is a stacked cluster of Beowolf Clustered Stacks.

--
A man's reach must exceed his grasp, or what's an erection for?
Re:Does it run Windows?!? by roman_mir · 2006-08-10 08:52 · Score: 4, Funny

No, the proper response here is this: it Linux run does?

--
You can't handle the truth.
Re:Does it run Windows?!? by pdbaby · 2006-08-10 09:58 · Score: 1

I think you mean a Beowolf Cluster...
I think you mean a Beowolf STACK... Which would be better, a cluster of Beowolf Stacks, or a stack of Beowolf Clusters? Of course, the answer is a stacked cluster of Beowolf Clustered Stacks.

Either way, I think we can all agree... we welcome our new Beowolf stack overlords

--
Global symbol "$deity" requires explicit package name at line 2. - If only $scripture started "use strict;"
Re:Does it run Windows?!? by Anonymous Coward · 2006-08-10 11:58 · Score: 1, Funny

>>>think you mean a Beowolf Cluster...

>> I think you mean a Beowolf STACK... Which would be better, a cluster of Beowolf Stacks, or a stack of Beowolf Clusters? Of course, the answer is a stacked cluster of Beowolf Clustered Stacks.

>Either way, I think we can all agree... we welcome our new Beowolf stack overlords

On our way to

1. Build Beowolf stack overlords using linux
2. ...
3. Profit
Re:Does it run Windows?!? by ScrewMaster · 2006-08-10 12:46 · Score: 1

it Linux run does?

Interesting ... I never knew that Yoda's brain was stack-oriented.

--
The higher the technology, the sharper that two-edged sword.
Re:Does it run Windows?!? by cagle_.25 · 2006-08-10 13:42 · Score: 1

How about a Marshall Stack amping someone reading Beowulf into a microphone?

--
Human being (n.): A genetically human, genetically distinct, functioning organism.
Re:Does it run Windows?!? by Hillgiant · 2006-08-10 14:09 · Score: 1

Of course it runs NetBSD.

--
-
Re:Does it run Windows?!? by Eideewt · 2006-08-10 14:23 · Score: 1

That would be, "Linux, does it run?"
Re:Does it run Windows?!? by Anonymous Coward · 2006-08-10 21:25 · Score: 0

stacked cluster of Beowolf Clustered Stacks.
Compiler error: stack overflow

Next Generation? by Lord+Ender · 2006-08-10 07:31 · Score: 1

Patrick Stewart would be displeased by this misleading headline.

--
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.

X86 FPU's finally losing their stackness by GGardner · 2006-08-10 07:31 · Score: 4, Interesting

Since the dawn of time, the x86 FPU has been organized as a stack, which has been recognized as a mistake by modern computer architects. For one thing, it is hard to get a stack architecture to take advantage of multiple functional units. Only recently, with the development of SSE, 64 bit modes and other additions have we been able to move away from the stack on the x86.

Re:X86 FPU's finally losing their stackness by Anonymous Coward · 2006-08-10 07:50 · Score: 2, Interesting

we been able to move away from the stack on the x86
As someone who has written several Forth compilers for the x86 I'd like to point out that the design of the stacks on the x86 is very inefficient. The main stack is just that: a stack tied to one particular register. The FP stack was just a joke; a dead weasel could have designed something better. Anyway, I do like using Forth even under the x86 model - it's nice to know that my entire programming system fits into the on-die cache!
Re:X86 FPU's finally losing their stackness by Tumbleweed · 2006-08-10 08:05 · Score: 2, Funny

Since the dawn of time, the x86 FPU has been organized as a stack

No no no, since the dawn of time, Man has yearned to destroy the Sun!

x86 came much later, right after the COBOL and the other dinosaurs.
Re:X86 FPU's finally losing their stackness by $RANDOMLUSER · 2006-08-10 08:48 · Score: 1

> Since the dawn of time, the x86 FPU has been organized as a stack

Or, at least close enough, for non-technical people.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Re:X86 FPU's finally losing their stackness by wiredlogic · 2006-08-10 10:15 · Score: 1

Well this is more of a long standing historical artifact. The x86 FP stack derives from the expansion mechanism originally devised for the 8086 whereby any unrecognised opcode could be passed off to a co-processor device. Due to pin limitations, a stack architecture was the most efficient way to send operands over to the FPU.

--
I am becoming gerund, destroyer of verbs.
Re:X86 FPU's finally losing their stackness by alw53 · 2006-08-10 11:33 · Score: 1

Having built a code generator for the X86 FPU I can attest to the difficulty. It's just about impossible to keep stuff on the stack around loops, plus common subexpressions cause problems.

Cup of Joe by AKAImBatman · 2006-08-10 07:32 · Score: 1

Expert Eric Laforest talks about stack computers and why they are better than register-based computers. Apparently NASA uses stack computers in some of their probes.

Therefore, we should consider moving to Java-based Operating Systems and accelerator chips!

[...]

In case anyone is wondering, I'm only half joking. Java is a stack-based platform, perfectly suited to processors that don't actually exist in real-life. Sun created the picoJava in the 90's, and claimed that it was faster than the Pentium of the day. They may have been correct at the time, but the chip was never widely used, so it was difficult to say for sure. With CPU speed becoming less important than stability, I/O, and correctness of code, it's possible that such machines may start showing up in more mainstream applications.

--
Javascript + Nintendo DSi = DSiCade

Re:Cup of Joe by 4815162342 · 2006-08-10 07:53 · Score: 1

Actually, the idea of processors running Java natively hasn't died.
Some modern embedded processors have been specifically designed to execute Java natively, e.g. ARM Jazelle and the new Atmel AVR32.
These processors use various hardware mechanisms to make part of the RISC register file look like a stack. They then convert as much as 80% of the Java instruction set into native RISC instructions and interpret only those instructions which they cannot convert in simple hardware.
Such processors are mainly targeted at the increasing number of embedded devices which run Java, e.g. mobile phones.

--
There are only 10 types of people in the world. Those who understand binary and those who don't!
Re:Cup of Joe by Anonymous Coward · 2006-08-10 07:57 · Score: 0

Ever hear of JStamp? Java is even used in real-time systems.

http://www.jstamp.com/reality.htm
Re:Cup of Joe by AKAImBatman · 2006-08-10 08:32 · Score: 2, Informative

Some modern embedded processors have been specifically designed to execute Java natively, e.g. ARM Jazelle and the new Atmel AVR32.
Yes, I'm aware of these processors. However, they're not actually stack-based. They convert the Java instructions into ARM RISC instructions which are register-based. So while such chips are very useful in accelerating Java on standard RISC architectures (also VLIW architectures such as MAJC), they are not actually stack machines.

The only modern example of a stack-based processor for accelerating Java that I am aware of, is the Java Optimized Processor (JOP).

--
Javascript + Nintendo DSi = DSiCade

Fun and games by Carnildo · 2006-08-10 07:32 · Score: 4, Funny

It's all fun and games until someone hits a stack underflow.

--
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.

Re:Fun and games by Eric+LaForest · 2006-08-10 08:02 · Score: 1

Cute. :)
Then you either fill from memory, or you check for it at compile time.
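
The compile-time option can be sketched roughly like this, in Python. This is only an illustration: the `EFFECTS` table and the name `check_underflow` are assumptions, and the check covers straight-line code only, since branches and loops would need the depths on each path to be reconciled.

```python
# (items popped, items pushed) for a few illustrative stack operations.
# The opcode names are assumed for this sketch, not taken from a real ISA.
EFFECTS = {
    "lit":  (0, 1),
    "dup":  (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "add":  (2, 1),
    "mul":  (2, 1),
}


def check_underflow(program, initial_depth=0):
    """Return the index of the first op that would underflow, or None."""
    depth = initial_depth
    for index, op in enumerate(program):
        pops, pushes = EFFECTS[op]
        if depth < pops:
            return index                 # not enough items on the stack
        depth += pushes - pops
    return None


print(check_underflow(["lit", "lit", "add"]))
print(check_underflow(["lit", "add"]))
print(check_underflow(["lit", "dup", "mul", "drop", "drop"]))
```

The hardware alternative (refilling the on-chip stack from memory when it runs dry) needs no such analysis, at the cost of occasional memory traffic.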

--
none
Re:Fun and games by Anonymous Coward · 2006-08-10 10:54 · Score: 0

Be careful with those stacks, Eugene!
Re:Fun and games by rubycodez · 2006-08-10 15:07 · Score: 1

oh NOW you tell me, after I underflowed and poked my eye out!

just confused by crodrigu1 · 2006-08-10 07:36 · Score: 0

I hear all these experts, and I wonder why they are THE experts. One of the important movements in programming is toward non-stack-based programming methodologies. TheServerSide (http://www.theserverside.com/tss) had another expert opinion on why non-stack programming was better.

Re:just confused by Eric+LaForest · 2006-08-10 08:14 · Score: 1

Both methodologies work. Using stacks is better in the small, when software and hardware size are the limiting factors.

--
none

Re:I Know... by SnowZero · 2006-08-10 07:39 · Score: 1

do you have a mirror for that? I'm too lazy to look.

Text of PPT by Anonymous Coward · 2006-08-10 07:40 · Score: 4, Informative

Introduction
Discovered field by chance in 2000 (blame the Internet)
Hobby project (simulations and assembly) until 2004
Transformed into Independent Study thesis project
Overview of current state of research
Focus on programmer's view

Stack Computers: Origins
First conceived in 1957 by Charles Hamblin at the University of New South Wales, Sydney.
Derived from Jan Lukasiewicz's Polish Notation.
Implemented as the GEORGE (General Order Generator) autocode system for the DEUCE computer.
First hardware implementation of LIFO stack in 1963: English Electric Company's KDF9 computer.

Stack Computers: Origins (Part 2)
Independently discovered in 1958 by Robert S. Barton (US).
Implemented in the Burroughs B5000 (also in 1963).
Better known
Spawned a whole family of stack computers

The First Generation

The First Generation: Features
Multiple independent stacks in main memory
Stacks are randomly accessible data structures
Contained procedure activation records
Evaluated expressions in Reverse Polish Notation
Complex instruction sets trying to directly implement high-level languages (e.g.: PL/1, FORTRAN, ALGOL)
Few hardware buffers (typically four or fewer)
Supplanted in the 1980's by RISC and better compilers

Stack Computers: A New Hope
Enter Charles H. ("Chuck") Moore:
Creator of the stack-based FORTH language, circa 1970
Left Forth, Inc. in 1981 to pursue hardware implementations
NOVIX (1986), Sh-BOOM (1991), MuP21 (1994), F21 (1998), X18 (2001)
Currently CTO of IntellaSys, still working on hardware
product launch expected April 3, 2006 at Microprocessor Summit
Enter Prof. Philip Koopman, Carnegie Mellon University
Documented salient stack designs in "Stack Computers: The New Wave", 1989

The Second Generation

The Second Generation: Features
Two or more stacks separate from main memory
Stacks are not addressable data structures
Expression evaluation and return addresses kept separate
Simple instruction sets tailored for stack operations
Still around, but low-profile (RTX-2010 in NASA probes)
Strangely, missed by virtually all mainstream literature
Exception: Feldman & Retter's "Computer Architecture", 1993

Arguments and Defense
Taken from Hennessy & Patterson's "Computer Architecture: A Quantitative Approach", 2nd edition
Summary: Valid for First Generation, but not Second

Argument: Variables
"More importantly, registers can be used to hold variables. When variables are allocated to registers, the memory traffic reduces, the program speeds up (since registers are faster than memory), and the code density improves (since a register can be named with fewer bits than a memory location)."
[H&P, 2nd ed, pg. 71]
Manipulating the stack creates no memory traffic
Stacks can be faster than registers since no addressing is required
Lack of register addressing improves code density even more (no operands)
Globals and constants are kept in main memory, or cached on stack for short sequences of related computations
Ultimately no different than a register machine

Argument: Expression Evaluation
"Second, registers are easier for a compiler to use and can be used more effectively than other forms of internal storage. For example, on a register machine the expression (A*B)-(C*D)-(E*F) may be evaluated by doing the multiplications in any order, which may be more efficient due to the location of the operands or because of pipelining concerns (see Chapter 3). But on a stack machine the expression must be evaluated left to right, unless special operations or swaps of stack position are done."
[H&P, 2nd ed, pg. 71]
Less pipelining is required to keep a stack machine busy
Location of operands is always the stack: no WAR, WAW dependencies
However: always a RAW dependency between instructions
Infix can be easily compiled to postfix
Dijkstra's "shunting yard" algorithm
Stack swap operations equivalent to register-register move operations
S
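
As an illustration of the "infix can be easily compiled to postfix" point on the slide above, here is a minimal sketch of Dijkstra's shunting-yard algorithm in Python. It is not taken from the presentation: the function name `to_postfix`, the whitespace-separated token format, and the restriction to four left-associative binary operators are all assumptions made to keep the example short.

```python
# Operator precedence; all four operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(tokens):
    """Convert a list of infix tokens to postfix (Reverse Polish) order."""
    output, ops = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators that bind at least as tightly as the new one.
            while (ops and ops[-1] != "("
                   and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]):
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                output.append(ops.pop())
            ops.pop()                    # discard the matching "("
        else:
            output.append(tok)           # operand goes straight to output
    while ops:
        output.append(ops.pop())
    return output


infix = "( A * B ) - ( C * D ) - ( E * F )".split()
postfix = to_postfix(infix)
print(" ".join(postfix))
```

The postfix sequence maps one-to-one onto stack machine instructions: each operand becomes a push and each operator becomes a zero-operand instruction.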

Re:Text of PPT by Eric+LaForest · 2006-08-10 07:48 · Score: 1

Thank you!

--
none

JVM by TopSpin · 2006-08-10 07:40 · Score: 4, Informative

Java bytecode is interpreted on a virtual stack based processor. Most bytecode gets JITed into native register based instructions, but the model JVM processor is a stack processor.

Some previous poster noted that CLI is also a stack based model. I can't verify that myself but it wouldn't surprise me; Microsoft is, after all, highly 'innovative' or something.

--
Lurking at the bottom of the gravity well, getting old

Re:JVM by Anonymous Coward · 2006-08-10 07:55 · Score: 3, Informative

It's not like Java was super-innovative to use the stack-based architecture. Java was designed with web-applications in mind, and as such having small code size was extremely important for bandwidth reasons. One of the best features of stack machines is the small instruction size (no need to store register locations). So a stack machine is a natural choice for the JVM. If you wanna nag on .NET copying Java, there are plenty of good reasons, but this isn't one.

There is one very widely used FORTH-type language by porkchop_d_clown · 2006-08-10 07:46 · Score: 2, Insightful

that almost every /. user encounters every day: Postscript and PDF.

--
Clear, Dark Skies

Appropriate instruction set by dpilot · 2006-08-10 07:46 · Score: 4, Insightful

Even in assembler, the mainstream hasn't been programming to the metal since Pentium I.

Beginning with Pentium II, and propagating to pretty much all of the other architectures in a short time, none of the mainstream CPUs have exposed their metal. We have an instruction set, but it's torn into primitives and scheduled for execution. We don't see the primitives, not even in assembler. AFAIK, there isn't even a way to use the true primitives, except perhaps on the Transmeta, where it was undocumented.

So in this light, since we're already fairly far from the true metal, it seems to me that it makes a lot of sense to re-evaluate the instruction set itself. Of course one could raise the Itanium argument, but I would also argue that politics were too big a part, there. Then again, one could also argue that x86 and amd64 are just so entrenched that it doesn't matter, and they do run well on today's hardware.

Then again I could cite my old favorite, the 6809. It started from the same origins and precepts as RISC, but a different attitude. RISC simply tried to optimize the most common operations, at the expense of less common ones. With the 6809, they tried to understand WHY certain things were happening, and how those things could be done better and faster. They ended up with a few more transistors, the same speed, and something approaching 3X the throughput, as compared to the 6800. More similar to the current topic, there was a paper on 'contour mapping', mapping blocks of cache into stacks and data structures. The 6809 was too old for a cache, but it seems to me that combining its concepts with the contour mapping would be interesting indeed.

But like stack engines, it's not x86/amd64 compatible.

--
The living have better things to do than to continue hating the dead.

Re:Appropriate instruction set by Sebastopol · 2006-08-10 08:27 · Score: 1

Even in assembler, the mainstream hasn't been programming to the metal since Pentium I.

What are you talking about, you clueless git?

Nearly every device driver in your Windows, Linux or Mac machine has assembly code modules which are HAND-TUNED to the processor type (which is why every processor offers a CPUID). And I'm not referring just to graphics cards... There are teams where I work that still need to use MSofts MASM 6.22 to compile 16 bit portions of BIOS code.

I'd say it is 50/50 assembly vs. higher level in the world outside college. The embedded market is far larger than the PC market.

--
https://www.accountkiller.com/removal-requested
Re:Appropriate instruction set by Anonymous Coward · 2006-08-10 08:44 · Score: 0

What he's talking about, you clueless git, is that on recent CPUs, the x86 code is decomposed into "micro-ops" which are what actually gets executed; even when you're "writing assembly", you aren't writing native code.

Of course, this is nothing new -- microcoded architectures have been around since the '50s or so.
Re:Appropriate instruction set by stevesliva · 2006-08-10 08:47 · Score: 2, Insightful

He's saying that the latest Intel chips run micro-ops that do not have a 1-to-1 correspondence with the x86 ISA to which you refer. Git it?

--
Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
Re:Appropriate instruction set by Sebastopol · 2006-08-10 08:47 · Score: 1

Another "The 50's had it first" reply. Yawn.

Ok, please enlighten me with the 50's era microcode-translation CPU you are referring to, because I've never heard of such a device. This is your chance to show us how smart you are.

Too bad you posted as AC.

--
https://www.accountkiller.com/removal-requested
Re:Appropriate instruction set by Waffle+Iron · 2006-08-10 09:10 · Score: 1

Ok, please enlighten me with the 50's era microcode-translation CPU you are referring to, because I've never heard of such a device.

The roots of microcoding started in 1947, and reached a more modern form by 1951. Now you've heard of such a device.
And to buy you a clue, you haven't been programming any X86 CPU to anything near the "bare metal" since the Pentium II came out. Maybe you should go read up on the actual internal architecture of modern CPUs before spouting off.
Re:Appropriate instruction set by Anonymous Coward · 2006-08-10 09:13 · Score: 0

You haven't actually looked at much device driver code, have you? The amount of assembly code in there for modern operating systems (well, NT and lin00x) is *very* small, except perhaps for some hand-tuned SSE/2/3 code in graphics drivers.

Also, as already noted, the OP was referring to the fact that modern x86 CPUs have a frontend decoding x86 instructions to the micro-ops the execution backend actually handles.

n00b.
Re:Appropriate instruction set by vtcodger · 2006-08-10 09:23 · Score: 2, Interesting

***Then again I could cite my old favorite, the 6809.***
The 6809 was not only easy and fun to program, 6809 programs tended to benchmark out significantly faster than programs for comparable CPUs like the Z80, 6800 and 8080. If the industry ever decides to scrap the x86 mess -- which they won't -- going back to the 6809 for a starting point might not be a bad idea at all. I once did a plot of measured times for a benchmark where timings were available for a bunch of CPUs (Sieve of Eratosthenes). When you plotted out clockspeed vs word width, all the CPUs from the 8080 to the Cray something or other fell out into an untidy straight line, except for the 6809. There were, as I recall, three different results published for SOE on the 6809 and all three were an order of magnitude faster than they had any reasonable expectation of being based on the hardware's apparent capabilities.
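
For anyone who hasn't seen the benchmark, here is a rough sketch of it in Python. It assumes the odd-numbers-only variant that BYTE magazine popularized; the post doesn't say which variant the published timings used, and the name `byte_sieve` and the default size of 8190 are assumptions for illustration only.

```python
def byte_sieve(size=8190):
    """Count odd primes with a flags array where index i stands for 2*i + 3."""
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            # Index i + prime stands for 3 * prime; each step of `prime`
            # in the index adds 2 * prime, so only odd multiples are struck.
            for k in range(i + prime, size + 1, prime):
                flags[k] = False
            count += 1
    return count
```

The inner loop is nothing but an add, a compare and a store, which is presumably why it made a handy yardstick across such different CPUs.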

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
Re:Appropriate instruction set by Sebastopol · 2006-08-10 09:40 · Score: 1

And to buy you a clue, you haven't been programming any X86 CPU to anything near the "bare metal" since the Pentium II came out. Maybe you should go read up on the actual internal architecture of modern CPUs before spouting off.

You clearly think bare metal means programming microcode. Since I don't program micro code for intel, I don't do this.

Or do you mean microcode isn't bare metal, maybe you mean i should be programming bits into the ALU myself?

Maybe you think there is something below assembly which is bare metal. Maybe you mean wiring 74Cxx ICs together and programming your own ROMs.

Or maybe you're just a relic from the RISC v. CISC debate, and in your mind you need to program RISC uop to be "bare metal".

I simply do not understand your claim that microcode is not bare metal. It makes zero sense from an architecture point of view.

--
https://www.accountkiller.com/removal-requested
Re:Appropriate instruction set by Waffle+Iron · 2006-08-10 09:53 · Score: 1

Or maybe you just enjoy posting long lists of strawman questions?
Re:Appropriate instruction set by dpilot · 2006-08-10 11:42 · Score: 1

Perhaps an appropriate point, but I still think there's something different between microcode and the micro-ops of modern architectures.

By the way, I *have* done microcode. In a way, it's maybe the most enlightening programming ever, because with microcode you know (and have to know) exactly on a cycle-by-cycle basis what's going on. For that matter, even the cycle is too coarse a measure, you have to know what's going on inside the cycle.

As an extra qualifier, I know there's horizontal and vertical microcode, and I did fairly narrow horizontal. But I guess I think that the difference is that microcode, at least the stuff I did, didn't really dispatch instructions, but rather the bits in the microcode corresponded, after a little simple decode, to actual wires that did things like select registers, pick a leg of a mux, or gate read/write operations. The correspondence was stunningly one-to-one, so that a hardware type with a few software courses could pick it up pretty readily.

By contrast, I think of the dispatches of the Pentium II - come to think of it, really the Pentium Pro and later were higher-level constructs, not simple decoded wires. As I said, maybe the distinction between this and vertical microcode is blurrier than it is with horizontal.

--
The living have better things to do than to continue hating the dead.
Re:Appropriate instruction set by dpilot · 2006-08-10 11:48 · Score: 1

As long as we're talking about the 6809 and stack architectures, particularly Forth, in the same thread...

Did you ever do Forth on the 6809?
Did you ever see the source?
The main "execute loop" was 2 instructions long. I figured it out once, that had they had another address mode (I forget what it was now, but it did exist on other architectures at the time, PDP-11 and/or Series/1.) it would have been a single instruction.

I tweaked up a Sieve myself on a 1MHz 6809 that beat the published values for a 4.77MHz 8088.

--
The living have better things to do than to continue hating the dead.
Re:Appropriate instruction set by aminorex · 2006-08-10 12:15 · Score: 1

You can, in fact, load microcode into many recent Intel CPUs. And you can, in fact, thereby
make it faster to perform certain specialized tasks. It is conceivable, even, that you could
make a superior ISA, in both design and performance terms.

Programming to the bare metal is more closely akin to VLIW/EPIC ISA programming than it is to
RISC programming. Whereas RISC designs attempt to eliminate microcode from the stack in favor of
a highly orthogonalized ISA, VLIW designs essentially incorporate the microcode into each opcode,
thus giving the programmer the ability to schedule the gates on functional units manually. The
Intel/HP Itanic is an example of such an architecture (although generally distinguished from
classical -- e.g. Multiflow -- VLIW through the use of the marketing term EPIC).

--
-I like my women like I like my tea: green-
Re:Appropriate instruction set by printman · 2006-08-10 14:40 · Score: 1

I still have my old 6809 reference manual by Lance Leventhal - 8-bit multiply (producing a 16-bit result) was a mere 5 clock cycles, and a full 16-bit multiply could be implemented in about 25 clock cycles. Motorola's 68xx 8-bit CPUs were definitely the best ever made... It's too bad that they could never get the 680x0 series to run as fast as other 32-bit CPUs, as it was really easy to program, too... :(

--
I print, therefore I am.
Re:Appropriate instruction set by Sebastopol · 2006-08-11 04:13 · Score: 1

I win.

--
https://www.accountkiller.com/removal-requested
Re:Appropriate instruction set by Sebastopol · 2006-08-11 04:26 · Score: 1

So if I understand correctly, the argument is really, "the x86 decoder gets in the way", and that people could do better if they could schedule uops themselves.

One only need to look at the past 15 years of CPU history to see the flaws in that assumption.

--
https://www.accountkiller.com/removal-requested
Re:Appropriate instruction set by Waffle+Iron · 2006-08-11 05:13 · Score: 1

You can't help but win, in your own mind, when you sidestep every point others make with pedantic semantic quibbling. Congratulations.
Re:Appropriate instruction set by raftpeople · 2006-08-11 06:06 · Score: 1

I programmed the 6809 also and I too had that Lance Leventhal book. I couldn't afford an assembler so I programmed in machine language, used that book a lot. When I saw what my friends were doing with the 6502 and other processors I was surprised with how basic they were and how advanced the 6809 was with 16 bit operations and it seems like there was something about addressing that was more advanced on the 6809 but I can't remember what.
Re:Appropriate instruction set by printman · 2006-08-11 10:50 · Score: 1

The 6809 had an indexed + offset addressing mode (i.e. index register + offset to access memory, not just index register), plus it had two index registers instead of the 1 provided by the 6502.
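
In case the arithmetic isn't obvious, a minimal sketch in Python of what "index register + offset" buys you. This is not the 6809's actual instruction encoding, just the address calculation; `effective_address`, `load_field`, and the 16-bit wraparound default are assumptions made for the example.

```python
def effective_address(index_reg, offset, bits=16):
    """Indexed + offset: wrap (register + signed offset) to the address width."""
    return (index_reg + offset) & ((1 << bits) - 1)


def load_field(memory, index_reg, offset):
    """Read one byte of a record whose base address sits in the index register."""
    return memory[effective_address(index_reg, offset)]
```

With the base of a record in the index register, each field is just a different constant offset, so the register never has to be adjusted between accesses.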

--
I print, therefore I am.

Why these downright stupid comments? by Jerk+City+Troll · 2006-08-10 07:47 · Score: 3, Insightful

You “wonder if Windows will run on a stack computer?” Where do you people come up with this nonsense? This is as irrelevant as saying: "someday, car tires will not be made of rubber. I wonder if Windows will support them?" Really, there is no need to try to come up with insightful remarks or questions to tack on the end of your story submissions. Just present the article and leave it at that. Let everyone else do the thinking.

--
Join Tor today!

Re:Why these downright stupid comments? by LWATCDR · 2006-08-10 08:29 · Score: 1

Yep I agree but the answer is who cares. I bet NetBSD already does.

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Re:Why these downright stupid comments? by Anonymous Coward · 2006-08-10 11:37 · Score: 0

Actually, the correct analogy is ... Someday tires will no longer be made of rubber, but will I still be able to run my mule-driven cart on them without it crashing? Also will this finally be the silver bullet that will eliminate the need for my manufactured buggy-whip?
Re:Why these downright stupid comments? by JonnyQabbala · 2006-08-10 15:19 · Score: 0

>Really, there is no need to try to come up with insightful remarks or questions to tack on the end of your story submissions.
You hear that woosh? That's the sound of a joke going WAY over your head

-------*

*
|-
|\

--
This sig intentionally left blank
Re:Why these downright stupid comments? by Anonymous Coward · 2006-08-10 15:43 · Score: 0

That was a joke?

For the same reason language choice always matters by porkchop_d_clown · 2006-08-10 07:48 · Score: 1

because you should choose a language that fits the problem, not the reverse.

If the problem is "make this work on a stack based machine" then look out! You're gonna have aging LISP programmers crawling out of the woodwork to show off their obsolete, er, elite, programming skills.

--
Clear, Dark Skies

Don't forget the classic HP3000 by Nick+Driver · 2006-08-10 07:49 · Score: 1

Don't forget the venerable HP3000 "Classic" machines like the Series 68 and 70 machines.

Re:Don't forget the classic HP3000 by walt-sjc · 2006-08-10 09:16 · Score: 1

Oh Gawd... I took a Cobol class on one of those back in the early 80's. What a horrible pile of shit that thing was. They retired it after our class.
Re:Don't forget the classic HP3000 by aminorex · 2006-08-10 11:49 · Score: 1

If a system designed in 1973 was still operating in the early 80's, even in the guise of a POS, then I'd have to consider it a roaring success. Can you imagine trying to do all the things you want and expect to be able to do today, using a computer designed in 1994, running software from 1994? POS would be kind words!

--
-I like my women like I like my tea: green-
Re:Don't forget the classic HP3000 by NaDrew · 2006-08-11 12:47 · Score: 1

Can you imagine trying to do all the things you want and expect to be able to do today, using a computer designed in 1994, running software from 1994? POS would be kind words!
I don't have to imagine it. I see it every Sunday and two evenings a week at my second job at Barnes & Noble, where the registers are antique Compaq boxes ca. 1994, running on (natch) Windows 95. POS is indeed a kind word for these.

--
Vista:XPSP2::ME:98SE

Stop Hurting My Eyes by Anonymous Coward · 2006-08-10 07:49 · Score: 5, Informative

Dear Slashdot Contributors,

Please stop describing undergrads doing independent studies as "Experts". There's a reason that mainstream processors haven't picked up on "Stack Processors", and it has nothing to do with binary compatibility, the difficulty of writing a compiler for their instruction set, or general programming complexity. Stack Machines are really only good for In-Order processing. Wonder why NASA probes have Stack Processors? Because they don't freaking need to do out of order processing in order to get the performance they require, and they probably found stack processors to have a favorable power / performance ratio for their application. You will never see a full blown Windows running on a Stack processor, because Superscalar processors destroy their performance.

"My research project shows that some people wrote nifty papers in the 1970s, but everyone ignored them for an obvious reason I don't understand." -> Not an Expert

Re:Stop Hurting My Eyes by HiThere · 2006-08-10 08:23 · Score: 2, Insightful

I believe that your criticisms apply to only specific stack based architectures. That they do apply to the commonly presumed architectures I accept, but this is far from asserting them as general truths.

Actually, even asserting that register based computers solve the problems that you are describing is not a general truth. You need to specify how many registers of what type can deal with how many out of order processes. And I suspect that a stack computer with 5 or 6 stacks could deal as flexibly with that problem as is commonly required...with a bit of extra leeway. It would need to be able to implement rapid task switching based on a "high priority task stack"...and maintaining that would be a bit of a nuisance...but that particular stack could have a very short limit, say 50 items. I'll agree that this is one function that it would be better to handle in scratchpad memory, but it would be eminently possible to do it purely from a stack based approach. (Still, there's a good reason that priority queues are queues rather than stacks...and I would argue that a deque would be an even better approach.)

Well, I'm neither a hardware engineer nor a computer system designer, so I could be wrong. OTOH, you're anonymous, which means that your arguments only deserve the weight that their own internal logic provides.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Stop Hurting My Eyes by Anonymous Coward · 2006-08-10 08:25 · Score: 0

The difficulty of writing a compiler for their instruction set, or general programming complexity. Stack Machines are really only good for In-Order processing.
Nope; wrong on both counts. Forth compilers are quite easy to write and one which optimises to better quality code than most C compilers is not as hard to write as those same C compilers. Out of order processing has been done in stack machines for years now. The reason it's not mainstream is the same reason no processor is mainstream other than the crappy Intel architecture: scale of pre-deployment and economies of production. Simply put - no new system of computing can get a hold in the market until the current Intel/AMD design finally hits the wall for good. Until then the cost/benefit of backwards compatibility and large scale fabs will keep stack machines and pretty well anything else in the fringes.
Re:Stop Hurting My Eyes by Anonymous Coward · 2006-08-10 09:11 · Score: 1, Interesting

Agreed. I work with this guy; he's an idiot.
Re:Stop Hurting My Eyes by Anonymous Coward · 2006-08-10 10:02 · Score: 0

He also makes a bunch of ridiculous simplifying assumptions - e.g. flat memory, no pages. It's a lot easier to go fast when you don't have to support features required by OSes!
Re:Stop Hurting My Eyes by AcidPenguin9873 · 2006-08-10 11:24 · Score: 1

how many out of order processes...It would need to be able to implement rapid task switching

The parent was referring to out-of-order execution of a single thread/process, not multiple threads/processes. OoO execution is something that mainstream microprocessors have done since the Pentium Pro. A stack-based instruction set (ISA) basically precludes any sort of out-of-order execution of individual instructions, because every instruction is directly dependent on the one before it (basically, every instruction is dependent on both the value at the top of the stack AND the top-of-stack pointer). When every instruction is dependent on the one before it, the ONLY thing a microprocessor can do is execute the very next instruction. It can't "look ahead" in its instruction window to find independent instructions to execute out-of-order because there aren't any.

The upshot of this is, a stack-based microprocessor is stuck executing instructions in-order. This is very slow compared to out-of-order execution. Now, you've inadvertently brought up a decent point: with the multi-core CPU trend upon us, and assuming software ever gets parallel enough (it's not even close right now), maybe it would be better to have 32 or 64 cheap, low-power, in-order, stack-based CPU cores on a single die. But that's 10-15 years away. For now, OoO cores kill in-order cores.
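
To put rough numbers on the dependency argument, here is a sketch in Python. It symbolically runs a tiny stack program and tracks, for each stack slot, how long a chain of true data dependencies produced it. The op names (`push`, `add`, `mul`) and the `dependency_depth` helper are invented for the example and don't correspond to any real stack ISA.

```python
def dependency_depth(program):
    """Symbolically run a stack program; return (instruction count, critical path).

    Each stack slot holds the length of the data-dependency chain behind it,
    ignoring the implicit dependency on the top-of-stack pointer.
    """
    stack = []
    longest = 0
    for op, *_operands in program:
        if op == "push":
            depth = 1  # a literal or load depends on no earlier result
        elif op in ("add", "mul"):
            right = stack.pop()
            left = stack.pop()
            depth = max(left, right) + 1
        else:
            raise ValueError(f"unknown op: {op}")
        stack.append(depth)
        longest = max(longest, depth)
    return len(program), longest


# (a + b) * (c + d) in postfix order
PROGRAM = [
    ("push", "a"), ("push", "b"), ("add",),
    ("push", "c"), ("push", "d"), ("add",),
    ("mul",),
]
```

For `PROGRAM` this returns `(7, 3)`: seven instructions if each one waits on the top of stack left by the one before it, but only three levels of real data dependency. Getting from the first number toward the second means the hardware has to rename stack slots, which is the hard part.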
Re:Stop Hurting My Eyes by Eric+LaForest · 2006-08-10 11:57 · Score: 2, Interesting

"Out of order processing has been done in stack machines for years now."

As far as I know, there are no implemented second-gen stack computers that support that feature.
(There have been a few theoretical ones.)
Which ones are you talking about?

--
none
Re:Stop Hurting My Eyes by HiThere · 2006-08-10 18:11 · Score: 1

I thought I had handled that case. Now admittedly if you have 256 general purpose registers, then you have a large number of choices as to what you can do next, while if you only have, say, 6 stacks then your choices are fewer. This doesn't mean that they disappear.

One can't run connected calculations out of order (except as provided by commutativity and associativity) no matter how many registers you have. With multiple stacks the requirement would merely be to ensure that each stack only contained the connected calculations in order to have several logical threads proceed simultaneously. (No, I'm not talking about calling a Thread Library...I'm talking about the logical thread linking the calculations.) It's true that this was a minor consideration, but I did think I'd covered it.

Actually, even that is too strong a statement, though one would desire for efficiency that the top of each stack be a logically connected series of entities (numbers, processes, return values, etc.)

It's not clear to me that this would be a particularly efficient approach, but that's far different from asserting that it's impossible.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Stop Hurting My Eyes by AcidPenguin9873 · 2006-08-11 01:31 · Score: 1

After thinking about it more, I agree that it's possible. However, the overall tone of your post seemed to indicate only a multi-threaded model, not a single-thread, out-of-order execution model.
Re:Stop Hurting My Eyes by Portfolio · 2006-08-11 04:14 · Score: 1

...maybe it would be better to have 32 or 64 cheap, low-power, in-order, stack-based CPU cores on a single die. But that's 10-15 years away

This is exactly what Chuck Moore, the inventor of Forth, has been designing for the last few decades. Many of his MIMD grid multi-core designs made it to prototyped silicon, though they didn't take off in the market. His latest effort is called SEAforth (for "a sea of processors").
Re:Stop Hurting My Eyes by HiThere · 2006-08-11 05:57 · Score: 1

You are correct. I was mainly thinking about simultaneous execution of different "light weight threads" or processes, not logical threads. So that's where I did put all the emphasis.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Computer-Science Motto: Back to the Future by Anonymous Coward · 2006-08-10 07:50 · Score: 0

The motto in computer architecture and any other field of art is "Let's go back to the future and see whether any old ideas have traction." Here are some examples of u-turns in art.

1. stack-based computers -> register-based computers -> stack-based computers

2. virtual machine monitor -> operating system (e.g., MS-DOS, Unix, and Windows) directly on top of the hardware -> virtual machine monitor

3. dumb terminal -> personal computer -> thin client

4. Al Capone's favorite car -> Chrysler LeBaron -> PT Cruiser

5. 1967 Camaro with aggressive, muscular form -> 1982 Camaro with slick, crack-cocaine form -> 2009 Camaro with aggressive, muscular form

6. 1960's "Mission Impossible" -> lots of boring TV-series/theater-movies -> 1990's "Mission Impossible"

7. 1960's "Bewitched" -> lots of boring TV-series/theater-movies -> 2000's "Bewitched"

8. 1970's "Brady Bunch" -> lots of boring TV-series/theater-movies -> 1990's "Brady Bunch"

In art, what goes around comes around. Note that I said, "computer architecture", not "computer science". Computer science is real science. Computer architecture is not. It is art. Just see the 8 items in the above list.

Question about stack computer types by thewiz · 2006-08-10 07:51 · Score: 3, Funny

Do these come in short- and tall-stack versions?
Are maple syrup and butter options?

--
If "disco" means "I learn" in Latin, does "discothèque" mean "I learn technology"?

Re:Question about stack computer types by Tumbleweed · 2006-08-10 08:07 · Score: 1

Do these come in short- and tall-stack versions?
Are maple syrup and butter options?

I'm more into grid computing, where it's all about the waffles!

Postscript is based on stacks by Anonymous Coward · 2006-08-10 07:53 · Score: 1, Informative

Once upon a time when laser printers cost $10K, before CorelDraw, I learned to program Postscript. Yes it's a real programming language but it prefers to do things with stacks. You could write real programs that did real calculations etc. What a pain. I'd rather program in machine code. I'd rather program in microcode. aargh. You get the point.

I still use postscript to create graphics but if any computation is involved, I use another language.

Having said the above I realize that most languages insulate you from the architecture. If you're programming in Python or Java, you probably wouldn't notice the difference. My experience with Postscript convinces me that such an architecture isn't as efficient as some people think it is though. There's just too much popping and pushing just to get at a value. Since a given value changes its position on the stack, you also need to keep track of where it is. Bleah. This isn't to say that stacks don't have a place. A variant of a stack called a circular buffer really speeds things up on a dsp. For general purpose use though ...

Re:Postscript is based on stacks by Eric+LaForest · 2006-08-10 08:35 · Score: 1

The problem of buried values on a stack is dealt with by factoring the program really finely. A procedure should ideally use no more than 2-3 elements on a stack. More than that and the code gets very hard to follow and needed data items get buried, as you mention.
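
A toy illustration of that kind of factoring, sketched in Python rather than Forth or Postscript. The `run` helper and the word lists are made up for the example; real Forth words would be defined with `:` and `;`.

```python
def run(program, stack=None):
    """Run a list of tokens on a data stack: numbers are pushed, words executed."""
    stack = list(stack or [])
    for tok in program:
        if isinstance(tok, (int, float)):
            stack.append(tok)
        elif tok == "dup":
            stack.append(stack[-1])
        elif tok == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif tok == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif tok == "*":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown word: {tok}")
    return stack


# ( n -- n*n ): only ever touches the top element
SQUARE = ["dup", "*"]
# ( a b -- a*a+b*b ): built from SQUARE plus a single swap
SUM_OF_SQUARES = SQUARE + ["swap"] + SQUARE + ["+"]
```

`run(SUM_OF_SQUARES, [3, 4])` leaves `[25]` on the stack, and no word in it ever reaches deeper than the top two elements.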

--
none

Stack machines - again? by Animats · 2006-08-10 07:54 · Score: 3, Insightful

Who can forget the English Electric Leo-Marconi KDF9, the British stack machine from 1960? That, and the Burroughs 5000, were where it all began.

Stack machines are simple and straightforward to build, but are hard to accelerate or optimize. Classically, there's a bottleneck at the top of the stack; everything has to go through there. With register machines, low-level concurrency is easier. There's been very little work on superscalar stack machines. This student paper from Berkeley is one of the few efforts.

It's nice that you can build a Forth machine with about 4000 gates, but who cares today? It would have made more sense in the vacuum tube era.

Re:Stack machines - again? by powerlord · 2006-08-10 08:09 · Score: 1

It's nice that you can build a Forth machine with about 4000 gates, but who cares today

Considering that we seem to be entering the vacuum tube era in nano-tech, perhaps a 4000 gate forth machine can be used to run programmable nano-machines.

--
This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
Re:Stack machines - again? by Anonymous Coward · 2006-08-10 09:02 · Score: 0

You could always just put a few thousand of them on one chip. If you built it like the transputer, you'd have interprocessor communications as a primitive. Task switching can also be very fast. The entire thing would probably be an excellent target for things like Erlang (heavily used in telecoms) programs in which tend to use thousands of (internal, lightweight) processes.
Re:Stack machines - again? by aminorex · 2006-08-10 14:11 · Score: 1

> It's nice that you can build a Forth machine with about 4000 gates, but who cares today?

Now put one of those CPUs on the RAM chip for every 256KB of RAM. Think about the petaflops that result
from a decent sized array of those. Not MDGRAPE3 petaflops mind you, but real, general-purpose, linpack
smacking petaflops, for pennies on the gigaflop.

I care.

--
-I like my women like I like my tea: green-

Maybe funny only to me, but.. by totallygeek · 2006-08-10 07:54 · Score: 0, Redundant

If I saw one, I would exclaim, "Wow, that computer is stacked!"

--
Click here or here.

Re:Maybe funny only to me, but.. by The_Wilschon · 2006-08-10 09:52 · Score: 1

(Score:1)

Yep. Funny only to you.

--
SIGSEGV caught, terminating

wait... not that kind of sig.

Not a good idea by coats · 2006-08-10 07:55 · Score: 4, Insightful

The reason modern systems are so fast is that they hide a lot of fine grained parallelism behind the scenes. It is very hard to express this kind of parallelism in a way that it can be executed on a stack machine.

How important is this parallelism? Consider that modern processors have 10-30 pipeline stages, 3-6 execution units that can have an instruction executing at each stage; moreover, most of them have out-of-order execution units that handle instructions more in the order that data is available for them rather than the order they are listed in the object file (and main memory is hundreds of times slower than the processors themselves, so this is important!). Typically, such processors can have more than 100 instructions in some stage of execution (more than 250 for IBM POWER5 :-)

Consider, also, that the only pieces of anything-like-current stack hardware are Intel x87-style floating point units, that Intel is throwing away -- for good reason! -- in favor of (SSE) vector style units. In the current Intel processors, the vector unit emulates an x87 if it needs to -- but giving only a quarter of the performance.

Someone made remarks about Java and .Net interpreters: in both cases, the interpreter is simulating a purely scalar machine with no fine grained parallelism; no wonder an extensible software-stack implementation is one of the simplest to implement. Stacks are not the way that true Java compilers like gcj generate code, though!

No, stack-based hardware is not a good idea. And hasn't been since some time in the eighties, when processors started to be pipelined, and processor speed started outstripping memory speed.

--
"My opinions are my own, and I've got *lots* of them!"

Re:Not a good idea by Anonymous Coward · 2006-08-10 08:13 · Score: 0

But stack-based hardware is much simpler; it would run at a higher clock rate, and you could put dozens of them on the same piece of silicon required for one VLIW CPU.
Re:Not a good idea by coats · 2006-08-13 00:58 · Score: 2, Interesting

Suppose you put *two* dozen of them on a chip, and suppose they are *four* times faster. You still have less than a quarter the performance of a Conroe or POWER5 (both of which are dual-core, with each core sustaining more than 200 instructions in flight at a time), and you still have to manage that parallelism "by hand". Actually, the "four times faster" won't work, either -- remember that memory is still 200 times slower than Conroe or POWER5; if memory were 800 times slower than your processor, you'd really lose your performance!
This has been discussed ad nauseam in the computer architecture community, and I repeat: it's not a good idea!

--
"My opinions are my own, and I've got *lots* of them!"

NASA by HTH+NE1 · 2006-08-10 07:55 · Score: 4, Insightful

Apparently NASA uses stack computers in some of their probes.

Is that supposed to be a ringing endorsement? I thought NASA was using components the rest of the world treated as obsolete due to their proven durability and reliability in the radiation of space.

--
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?

Re:NASA by Medievalist · 2006-08-10 08:18 · Score: 2, Interesting

I thought NASA was using components the rest of the world treated as obsolete due to their proven durability and reliability in the radiation of space.
Essentially correct. It is so costly and time-consuming to get a new component certified for use that it's usually less work to find a clever way to use old components. Then ten months after launch production ceases on the old part, and you have to have special ones built at a hundred times the cost (military option) or scavenge them on eBay (scientific option).
Re:NASA by Anonymous Coward · 2006-08-10 08:26 · Score: 0

These are the guys that missed Mars.
Re:NASA by LOTHAR,+of+the+Hill · 2006-08-10 08:33 · Score: 1

They hit Mars. It's just that they were going 100,000 miles an hour at the time.

I read that the Rover just explored the newly created and aptly named "Beagle Crater"
Re:NASA by wiredlogic · 2006-08-10 10:25 · Score: 1

The stack processor in question is the 1750, a government specified design available from multiple vendors. It isn't exactly the paragon of high performance. There are more convenient space grade processors available that don't stick you with an inconvenient development system.

--
I am becoming gerund, destroyer of verbs.
Re:NASA by Eric+LaForest · 2006-08-10 11:39 · Score: 1

Actually, the ones I refer to are the RTX-2000 and RTX-2010.
See http://forth.gsfc.nasa.gov/ for examples

--
none

"good enough" syndrome by Anonymous Coward · 2006-08-10 07:55 · Score: 0

A big problem with trying to sell stack-based CPUs, or for that matter any hardware different from what "everyone else" uses, is that since they're highly incompatible everything would have to be rewritten for them, and besides, the stuff that's already out there is "good enough". This is one of the reasons for the big Apple/x86 battle that still rages today, and also comes into effect with Linux and other *ix OSes.

The only way, IMO, that a stack-based machine would ever get a toe in the door would be if one were produced that was priced similarly to the current x86 platforms but was several times as fast, or was within 3X the price and was two orders of magnitude faster.

FORTH post! by Anonymous Coward · 2006-08-10 07:58 · Score: 2, Funny

Nothing to see here. Sorry.

No, they're not better by vlad_petric · 2006-08-10 08:01 · Score: 3, Interesting

I didn't even try to torrent-download the thing, but I can tell you why stack machines aren't better than register-based ones. The main reason is that it's much much much harder to do renaming of a stack than of "regular" registers. Without renaming you can't do out-of-order execution ... Currently, there are two "mainstream" architectures that include stacked register files: Itanium and Sparc. Neither of them have out-of-order implementations.

But why do you need out-of-order execution? Well, misses to memory are very expensive these days - it can easily take from 200 to 400 cycles to service a load that misses all the way to main memory. This can have a significant effect on performance. What out-of-order execution does is to allow independent instructions that are younger than the load to execute in parallel with it. Quite often these instructions executed in parallel will generate other misses to main memory, overlapping their latencies. So - latency of loads that miss is still very high, but at the very least the processor is not idle while servicing them (for a good read see "MLP Yes! ILP no!" by Andy Glew).
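A rough back-of-the-envelope sketch of the overlap argument, in Python. The 300-cycle latency is just a point inside the 200 to 400 range quoted above, and the function names and the `in_flight` parameter are made up for illustration; a real core is far messier than this idealized model.

```python
import math

def stalled_cycles_in_order(misses: int, miss_latency: int) -> int:
    """In-order core: each miss to main memory is paid in full, one after another."""
    return misses * miss_latency

def stalled_cycles_overlapped(misses: int, miss_latency: int, in_flight: int) -> int:
    """Idealized out-of-order core: up to `in_flight` independent misses overlap,
    so each batch of misses costs roughly one miss latency."""
    batches = math.ceil(misses / in_flight)
    return batches * miss_latency

if __name__ == "__main__":
    print(stalled_cycles_in_order(8, 300))       # 2400
    print(stalled_cycles_overlapped(8, 300, 4))  # 600
```

The latency of each individual miss is unchanged in both cases; only the number of misses that can be outstanding at once differs.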

Itanium and Sparc compensate for the fact that they don't do stuff out-of-order by putting sh*tloads of L2/3 cache on-chip. The cost of a miss is still very high, but it happens much less often. The manufacturing cost of a chip is also much higher.

Note that what NASA is sending into space is "old" tech. The reason - well, cosmic rays are much stronger in outer space, and the smaller the gate, the easier it is for them to flip its state.

P.S. I'm a computer architect.

--

The Raven

Re:No, they're not better by Eric+LaForest · 2006-08-10 11:47 · Score: 1

I agree about the advantage of out-of-order execution, but what about thread-level parallelism?
Given multiple smaller, simpler processors on-chip, if one stalls on a memory fetch, the others may still be crunching away on stack and on cache. This is not unlike the approach Sun is taking with the UltraSparc T1 (aka "Niagara").

Admittedly, this will not parallelize single-threaded code. But it's a lot easier to design. :)

--
none
Re:No, they're not better by Anonymous Coward · 2006-08-10 12:07 · Score: 0

"P.S. I'm a computer architect."
great, let us know when you're a computer scientist, because all those issues can be overcome.
Re:No, they're not better by aminorex · 2006-08-10 15:01 · Score: 1

But, all of those problems would cease to be relevant if a few K gates were present on every DRAM die, implementing a CPU with latencies to memory that no COTS chip designer would dare to dream about in his wildest fancies.

--
-I like my women like I like my tea: green-
Re:No, they're not better by vlad_petric · 2006-08-10 20:50 · Score: 1

Yep, if you have a server app (web/app/db server) with a lot of inherent parallelism, that's great! Niagara kicks ass. As you were saying - that won't parallelize single-threaded code though. You could, I guess, do some speculative parallelization (there are a few very good research proposals here). OTOH you could also try various tricks to increase your out-of-order instruction window (there are some good proposals here as well).
BTW, I wasn't trying to say that stacked architectures are not worth it, I just didn't like the blanket statement from the story: "stacked architectures are better than register-based ones". In architecture one size doesn't fit all (which can be applied the other way around, too: you wouldn't want out-of-order execution for an embedded app).

--
The Raven

Re:For the same reason language choice always matt by HiThere · 2006-08-10 08:04 · Score: 4, Informative

Sorry, but LISP (though I don't mean Common LISP) is just as much a stack language as FORTH. I think the first LISP that wasn't was LISP 1.5...but I'm rather vague on LISP history. Still, s-expressions are as stack oriented as FORTH is. The interesting thing is the first Algol 60 compiler (well...really an interpreter) I ever used was a stack machine. (That was why it was an interpreter. The real computer was an IBM 7090/7094 BCS system so it ran a stack machine program that Algol was compiled to run on. Whee!) So if you want a good stack computer language you could pick Algol 60. But FORTH is easier, and could even be the assembler language.
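To illustrate the claim that s-expressions map naturally onto a stack, here is a minimal Python sketch. It is not how any particular LISP or FORTH is implemented; the tuple encoding of expressions and the names `to_postfix`, `run`, and `OPS` are assumptions made up for this example.

```python
# Hypothetical operator table for the sketch; only binary integer ops.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def to_postfix(expr):
    """Flatten a nested prefix expression such as ("+", 1, ("*", 2, 3))
    into postfix order: both operands first, then the operator."""
    if isinstance(expr, int):
        return [expr]
    op, left, right = expr
    return to_postfix(left) + to_postfix(right) + [op]

def run(program):
    """Execute a postfix program on a single data stack, Forth style."""
    stack = []
    for token in program:
        if isinstance(token, int):
            stack.append(token)      # literals are pushed
        else:
            b = stack.pop()          # top of stack is the right operand
            a = stack.pop()
            stack.append(OPS[token](a, b))
    return stack.pop()

if __name__ == "__main__":
    program = to_postfix(("+", 1, ("*", 2, 3)))
    print(program)       # [1, 2, 3, '*', '+']
    print(run(program))  # 7
```

The nested form and the postfix form carry the same information; the postfix one just makes the evaluation order explicit.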

OTOH, most FORTHs I've seen use 3 or more stacks. I.e., most of them have a separate stack for floats. What would be *really* nice is if someone built a machine that used Neon as its assembler. Neon is/was an Object-oriented dialect of FORTH for the Mac that allowed the user to specify early or late binding for variables. It was developed by Kyria Systems, a now-defunct software house. Unfortunately Neon died during a transition to MSWind95. I understand that it is survived by MOPS, but I've never had a machine that MOPS would run on, so I don't know how similar it was.

I think that FORTH would make a truly great assembler...and the more so if that dialect of FORTH were NEON. But I forget how many stacks it used. At least three, but I have a vague memory that it was actually four. The main stack, the return stack, the floating stack, and ??the Object stack??...I don't know.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Re:There is one very widely used FORTH-type langua by $RANDOMLUSER · 2006-08-10 08:05 · Score: 1

Chu__ Moo__ is my hero!

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill

mod parent funny! by RingDev · 2006-08-10 08:07 · Score: 1

Okay, that's a priceless quote!

-Rick

--
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs

Always fighting the wrong battle by 91degrees · 2006-08-10 08:07 · Score: 1

A 1K kernel will hardly make a lot of difference to anyone. I have a whole gigabyte of RAM in my machine. 1K? 100K? 1 Megabyte? 10 Megs? That's a pretty chunky kernel and we're still taking up a trivial chunk of our memory. Even the huge executables that we have these days aren't causing serious memory issues on modern PCs. Memory has outpaced executable size.

What we do want is lower power, and smaller. We can always take advantage of small or low power devices. If you want to reduce executable size, write a simple stack machine and an interpreter.

Re:Always fighting the wrong battle by Ant+P. · 2006-08-10 09:30 · Score: 1

A 1K kernel means you have 1024KB L2 cache + 15KB of L1 cache left over at about 20GB/s each, instead of 0.

Stack computers are hardly new by Junks+Jerzey · 2006-08-10 08:09 · Score: 4, Insightful

Normally this kind of stuff doesn't bug me, but this is like an article in 2006 proclaiming the benefits of object-oriented programming. Doesn't anyone know their computing history?

There were stack computers in the 1960s and 1970s. There was a resurgence of interest in the 1980s--primarily because of Forth's popularity in embedded systems--resulting in a slew of stack-based "Forth engines." Forth creator Chuck Moore has been working on a series of custom Forth CPUs for 20+ years now. His latest version has 24 cores on one chip (and was entirely designed by one person and uses MILLIWATTS of power).

Stack processors and languages have one big advantage: they minimize the overall complexity of a system. The tradeoff is that they often push some of that complexity onto the programmer. That's why Forth tends to shine for small, controlled systems (like a fuel injector or telescope controller or fire alarm), but you don't see people writing 3D video games or web browsers in Forth.

Re:Stack computers are hardly new by Sebastopol · 2006-08-10 08:22 · Score: 1

how fast does it run specINT @milliwatts? It doesn't? Oh, there ya go.

Saying it is low power is meaningless when there are CPUs that use hundreds of MICROWATTS in a plethora of embedded devices today.

--
https://www.accountkiller.com/removal-requested
Re:Stack computers are hardly new by Junks+Jerzey · 2006-08-10 08:24 · Score: 1

Okay, I posted before reading the article. Can you tell :)

I think the fault is more with the submitter of the story than the author's presentation. The author just gives an overview of how stack computers work and their history. The submitter apparently never knew about stack computers and is all excited about them as a possible future of computing. The presentation is simply a history, mostly about stuff that's 20+ years old. So, yes, while stack processors have been commercially available and they've been used in commercial embedded systems work, they've failed to get outside that realm.

(As an aside, I'd say 99% of desktop programmers have no clue about all the stuff that's gone on in the embedded systems world, so it isn't surprising that old stuff seems new.)
Re:Stack computers are hardly new by $RANDOMLUSER · 2006-08-10 08:36 · Score: 1

> ...often push some of that complexity onto the programmer. That's why Forth tends to shine for small, controlled systems...

Truer words were never typed. However, as a low-overhead, "portable assembly language", Forth is a beautiful way to go. The nature of the language causes you to think about the problem as a hierarchy of "procedural objects", which is really ideal for polling the inputs and turning on the lights and motors. I used Forth a lot in an environment where others were using PLCs and ladder logic, and the leverage in programmer time and the difference in what could and couldn't be reasonably done was just amazing. As you say, you don't see people writing 3D video games in Forth, but I'd certainly prefer Forth if I had to write the firmware for the GPU.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Re:Stack computers are hardly new by Anonymous Coward · 2006-08-10 11:39 · Score: 0

but you don't see people writing 3D video games or web browsers in Forth.
Your point is well taken, however, one correction is in order: Battle Zone, an ancient Atari 3D FPS using wire-frame objects, was written in Forth.
Re:Stack computers are hardly new by Junks+Jerzey · 2006-08-11 01:12 · Score: 1

Your point is well taken, however, one correction is in order: Battle Zone, an ancient Atari 3D FPS using wire-frame objects, was written in Forth.

What's your source for this? At one time or another I've heard people say that various classic video games were programmed in Forth (Pac-Man, Defender, etc.), but so far none of these rumors have been true. To the best of my knowledge, Battle Zone was programmed in assembly language.

One notable game actually written in Forth, BTW, is Starflight for the IBM PC (released by Electronic Arts in 1986).

Re: transputer wikipedia link by Kevster · 2006-08-10 08:09 · Score: 2, Interesting

I'm surprised no one's mentioned the transputer.

--
I always equivocate. Well, almost always.

Can we bring back BetaMax too? by JoshDM · 2006-08-10 08:10 · Score: 1

That would rock! Finally, the supposedly superior yet classically discarded technologies are coming back!

Re:Can we bring back BetaMax too? by wild_berry · 2006-08-10 09:02 · Score: 1

Yes. And set up a futures market for questions like which of the HD-DVD or blu-ray will be resurrected in future as superior technology.

Re:Computer-Science Motto: Back to the Future by Andrew+Kismet · 2006-08-10 08:10 · Score: 2, Funny

I cannot consider your post valid, as you've claimed that 2000's "Bewitched" was 'art'...

Saturn by Ghoser777 · 2006-08-10 08:16 · Score: 1

http://en.wikipedia.org/wiki/Saturn_(microprocessor)

--
James Tiberius Kirk: "Spock, the women on your planet are logical. No other planet in the galaxy can make that claim."

Java by HeaththeGreat · 2006-08-10 08:20 · Score: 1

If stack-based hardware does become more the norm, JIT JVMs will become a lot simpler since the JVM is stack-based.

that's a common misnomer by bunions · 2006-08-10 08:26 · Score: 1

the proper term is "griddle computing" and it encompasses both pancakes -and- waffles, along with syntactic salt like bacon and eggs.

--
there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.

Who likes them? by Ancient_Hacker · 2006-08-10 08:28 · Score: 1

Ivory-tower types LOVE stack computers. No messy "addresses". No messy "registers". Nice and simple. Many an undergrad has written a stack machine emulator. And occasionally a stack machine makes it to the market. A few old Burroughs mainframes. The HP 3000 series. oops, that one had to be recalled cause it was so ungodly slow. Lots of software virtual stack machines: The Pascal P-machine. UCSD P-code. FORTH. PostScript. Java byte code (I think). All kinds of weird and wonderful and generally SLOW languages.

As others have undoubtedly mentioned, there's not much concurrency you can do with a stack. The top of stack operation has to finish before the next op can pop anything off.

Forget stacks.

Re:Who likes them? by Anonymous Coward · 2006-08-10 08:39 · Score: 2, Interesting

You're Just Wrong(tm) about that, actually. See BOOST. No, not the masochistic c++ template library (ANYTHING written in C++ is masochistic), Berkeley's Out of Order Stack Thingy. http://www.cs.berkeley.edu/~satrajit/cs252/BOOST.pdf

It's probably mostly just an accident of history that register machines went superscalar first and "won" (mostly because, since stack machines were more efficient, the need for superscalarity maybe didn't hit so early...). But, in short: stack machines, with similar design overheads to register machines, can extract at least as much concurrency as register machines, maybe more.
Re:Who likes them? by Ancient_Hacker · 2006-08-10 23:13 · Score: 1
Thanks for the info. Let me be less glib:
History seems to show that when it comes to actual implementations, stack machines are slow. Perhaps this is just a coincidence, or there may be various factors at play:
- The folks that implement stack machines are trying for simplicity rather than speed.
- The allure of stacks leads implementors to other fancy fribbles, such as:
  - separate segments for each major data item (Burroughs)
  - implementation thru microcode (LSI-11 with P-Code Microcode)
  - totally software implementation (Postscript, FORTH, J-code)
  - Combinations of the above (HP 3000)
- The strain of implementing a stack machine (whatever that may be) leads implementors to poop out before they get to the point of implementing optimizations and parallelism.
My point is, there may be some sub-surface reasons why stack machines have tended to be slow.
Re:Who likes them? by Anonymous Coward · 2006-08-11 02:08 · Score: 0

Most stack machines were implemented before any superscalar architectures to speak of *anyway*. The modern ones implemented are all for embedded devices, where simplicity/reliability (think rad-hard...) and low power consumption are indeed more important than performance. BOOST shows that in principle, if someone wanted to build a "desktop/server" superscalar stack machine, it needn't be any slower or particularly harder to design than a conventional superscalar register machine. (Remember, it wasn't until Pentium/MC68060 that superscalar architectures really hit mainstream.) But the register machines are already designed, so who'd bother except as a curiosity or to "own" their own proprietary architecture?

Intel x87 is A STACK ENGINE by Sebastopol · 2006-08-10 08:30 · Score: 1

In case y'all forgot, the Intel x87 FP engine was stack based until SSE. And it is still in there!

You pushed FP numbers onto the register stack, ST(0) through ST(7), and the operations worked on the stack. They then had to be popped off the stack to store them to memory.
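For anyone who never touched it, here is a toy Python model of that register stack. It only sketches the addressing scheme (eight physical registers plus a TOP index, with ST(i) meaning physical register (TOP + i) mod 8). The class name and the `depth` counter are made up for this example, the real hardware tracks empty slots with a tag word, and the method names merely borrow the mnemonics of the corresponding instructions.

```python
class X87StackModel:
    """Toy model of the x87 register stack, not an emulator."""

    def __init__(self):
        self.regs = [0.0] * 8   # eight physical registers
        self.top = 0            # index of the register currently called ST(0)
        self.depth = 0          # hypothetical bookkeeping for this sketch

    def st(self, i):
        """Read ST(i), counted from the current top of stack."""
        return self.regs[(self.top + i) % 8]

    def fld(self, value):
        """Push: move TOP down by one (mod 8) and write the new ST(0)."""
        if self.depth == 8:
            raise OverflowError("register stack full")
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value
        self.depth += 1

    def faddp(self):
        """ST(1) = ST(1) + ST(0), then pop, leaving the sum in the new ST(0)."""
        if self.depth < 2:
            raise IndexError("need two operands")
        self.regs[(self.top + 1) % 8] += self.regs[self.top]
        self.top = (self.top + 1) % 8
        self.depth -= 1

    def fstp(self):
        """Pop ST(0) and return it; stands in for a store to memory."""
        if self.depth < 1:
            raise IndexError("register stack empty")
        value = self.regs[self.top]
        self.top = (self.top + 1) % 8
        self.depth -= 1
        return value

if __name__ == "__main__":
    fpu = X87StackModel()
    fpu.fld(1.5)
    fpu.fld(2.0)
    fpu.faddp()
    print(fpu.fstp())  # 3.5
```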

Kids these days, I tells ya.

--
https://www.accountkiller.com/removal-requested

Re:For the same reason language choice always matt by Anonymous Coward · 2006-08-10 08:39 · Score: 0

You, my friend, are the geekiest geek I've ever seen!

Obligatory by Hyram+Graff · 2006-08-10 08:43 · Score: 1

Imagine a Beowulf cluster of these.

(Somebody had to do it.)

--
0*0
00*
***

Re:Obligatory by The_Wilschon · 2006-08-10 10:01 · Score: 1

0*0
00*
***

Didn't you mean a glider?

0**
0*0
00*

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:Obligatory by Hyram+Graff · 2006-08-10 10:20 · Score: 2, Funny

First, rotate your version 90 degrees counter-clockwise. Next exchange all '0's and '*'s. What do you have? The answer is down there in my sig.

I was going to use '*'s and '.'s but with variable width fonts I couldn't get it to come out in a grid and I couldn't figure out how to have a monospace font appear in my sig. Thus, I replaced the '.'s with '0's and have the version that you see.

--
0*0
00*
***
Re:Obligatory by The_Wilschon · 2006-08-11 02:03 · Score: 1

Oh geez. I am an idiot. Thanks for pointing that out!

--
SIGSEGV caught, terminating

wait... not that kind of sig.

x86 OS's are stack based by s800 · 2006-08-10 08:44 · Score: 0

IMHO, for all purposes. Everything gets passed on the stack anyway. The last register OS I can remember was AmigaOS.

Re:x86 OS's are stack based by alnicodon · 2006-08-10 10:52 · Score: 1

Elaborating on this, it seems to me that this "hardware stack" is really an implementation detail: there's a stack anyway, and since its top is really often accessed, I suspect it gets easily loaded in the innermost, fastest, cache of a processor. Don't forget that knowledgeable people here keep saying that "even assembly is now far from the metal".

What people do say about this "fine grained parallelism" is only currently doable because some C compiler had the guts to allocate registers for some of the variables that would have otherwise (say, gcc -O0) been put on the stack: there may be some information a stack based processor would miss in order to achieve the same optimisation, but, well, I'm just talking out of the blue.

Would there be something to do with the "function boundaries" that the C language imposes on the register usage scheme? But then, that would only be a problem wrt C based languages. And a stack based language may rely on an efficient implementation of pure stack manipulations by an x86 clone.

Any comments appreciated

Strange thing to wonder about by Anonymous Coward · 2006-08-10 08:52 · Score: 0

"I wonder if Windows will be supported on a stack computer in the future?"

Why?

Orthogonality by ishmalius · 2006-08-10 08:53 · Score: 1

With the same logic, why use a laptop when a mainframe can do your word processing?

Being small and simple is precisely the point. Tiny stack machines give the opportunity for massive parallelism. Imagine hundreds or thousands of processors in a very small space, handling CPU intensive but relatively simple algorithms. How many such tiny processors could be printed onto a single chip?

Most computer users only consider the high-end chips that they use every day in their laptops and desktops. They are unaware of the amazing chip families at the other end of the spectrum; the new controllers and DSP's. IMHO, simple systems like FORTH will always have a role in ubiquitous computing.

Programming languages use stack machines? by Ant+P. · 2006-08-10 08:53 · Score: 1

Does that mean you could get a stack-based coprocessor card and make their code run faster?

Re:Programming languages use stack machines? by aminorex · 2006-08-10 11:56 · Score: 1

Sometimes yes, often no.

The reason we have the x86 architecture today as a de facto standard is economy of scale. Were a wholesale conversion to occur, such that economies of scale applied to stack machines, or lisp machines, or jellybean machines, or *T machines, then those would likely execute your code faster than anything else in the commodity price bracket.

Even Intel and HP in alliance, with all their weight behind it, couldn't shake loose the x86 monopoly through revolutionary change, as witness the fate of (VLIW/)EPIC architecture in the Itanic. Now an evolutionary, gradualistic change could be made which converted the world to stack machines, and the result would probably be better than a perpetuation of x86 (although the impact of unanticipated technological change in the meanwhile might change that equation), but barring a centrally planned economy (which would fail it on other grounds) that sort of optimized path forward seems extremely unlikely.

--
-I like my women like I like my tea: green-

Re:I Know... by x2A · 2006-08-10 08:54 · Score: 3, Funny

Stack computers, are basically like rack computers, except you can't pull out the one at the bottom.

--
The revolution will not be televised... but it will have a page on Wikipedia

Stack - bad for speed, good for low power by Theovon · 2006-08-10 08:59 · Score: 5, Insightful

I'm a chip designer, and I am working on my Ph.D. in CS. The idea of stack machines is something I have researched a bit, and I have drawn some of my own conclusions.

The main advantage of stack machines is that all or most parameters for each instruction are implicit. Aside from stack shuffle/rotate instructions, the operands are always the top few on the stack. This makes instructions very small. The logic is also exceedingly simple (for fixed-stack designs). If you want a simple, low-power CPU, a stack machine is what you want.

Where I explored this issue, however, is in the realm of high-performance computing. The key advantage of a stack architecture is that smaller instructions take less time to fetch from memory. If your RISC instructions are 32 bits, but your stack machine instructions are 8 bits, then your instruction caches are effectively 4x larger, and your over-all cache miss penalty is greatly reduced.

The problem with stack machines is that they're damn near impossible to add instruction-level parallelism to. With a RISC machine, near-by instructions that deal with different registers (i.e. no dependencies) can be executed in parallel (whether that's multi-issue or just pipelining). With a stack machine, everything wants to read/write the top of the stack.

I came up with two things to deal with this problem, that are very much like the CISC-to-RISC translation done by modern x86 processors, so it's more of a stack ISA on a RISC architecture. One is that the stack is virtual. When you want to pop from the stack, what's happening in the front-end of the CPU is that you're just popping register numbers corresponding to a flat register file. When you want to push, you're allocating an assigned register number from the flat register file. Now, if you can get two instructions going that read different parts of the stack and write (naturally) to different locations, you can parallelize them. The second part is a healthy set of register shuffling instructions. Since you're doing all of this allocation up front, shuffling registers is as simple as renumbering things in your virtual stack. So a swap operation swaps two register numbers (rather than their contents), and a rotate operation renumbers a bunch of them, but the pending instructions being executed still dump their results in the same physical registers.
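A minimal Python sketch of the virtual-stack idea, under the assumption that the front end keeps nothing but a list of physical register numbers. The class and function names are invented for illustration, and register reclamation is left out (sources would be freed when the instruction retires).

```python
class VirtualStack:
    """Front-end view of the stack: an ordered list of physical register numbers.
    Push allocates a free register; swap and rotate only renumber entries."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))  # unallocated physical registers
        self.order = []                        # order[-1] is the top of stack

    def push(self):
        reg = self.free.pop(0)
        self.order.append(reg)
        return reg

    def pop(self):
        return self.order.pop()

    def swap(self):
        """Exchange the two topmost register numbers; no data moves."""
        self.order[-1], self.order[-2] = self.order[-2], self.order[-1]

    def rotate(self, depth):
        """Bring the entry `depth` slots below the top up to the top."""
        self.order.append(self.order.pop(-1 - depth))

def rename_add(vs):
    """Translate a stack 'add' into a three-register op: (dest, src1, src2)."""
    src2 = vs.pop()
    src1 = vs.pop()
    dest = vs.push()
    return dest, src1, src2

if __name__ == "__main__":
    vs = VirtualStack(8)
    for _ in range(4):           # four values pushed: r0, r1, r2, r3 (r3 on top)
        vs.push()
    first = rename_add(vs)       # (4, 2, 3)
    vs.rotate(2)                 # bring r0 to the top
    vs.rotate(2)                 # bring r1 to the top
    second = rename_add(vs)      # (5, 0, 1)
    print(first, second)
```

In this run the two renamed adds read and write disjoint physical registers, which is the condition under which a back end could issue them together.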

This all sounds great, but there are some problems with this:

(1) The shuffling instructions are separate instructions. With a RISC processor, you have more information all in one unit. Although you could try to fetch and execute multiple stack instructions at once, it's much more complicated to execute four stack instructions in parallel than to execute a single RISC instruction, even though they require the same amount of memory.
(2) You need a lot of shuffling instructions. Say your stack contains values A, B, C, and D, and you want to sum them. Without shuffling, you'd add A and B, yielding E, then add E and C, yielding F, then add F and D. Three add instructions. If your adder(s) is/are pipelined, you'd like to add A+B and C+D in parallel or overlapping, THEN wait around for their results and do the third add. The problem is that to do that, you'd need to add A+B, then rotate C to the top then D to the top, then add, then add again. The first case was 3 instructions; the second case is 5 instructions. Depending on your architecture, the extra shuffle instructions may take so long to process that you might as well just have waited. No speed gain at all.
(3) The extra shuffling instructions take up space. Optimizers are hard to write. Although it's conceivable that one could optimize for this architecture so as to avoid as many shuffling instructions as possible, you still end up taking up quite a lot of space with them, potentially offsetting much of the space savings that you got from switching from RISC to stack.
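The trade-off in point (2) can be sketched in a few lines of Python. This is a toy model, not a timing simulator: each stack entry is just the length of the dependent add chain behind it, and the instruction encoding (`"add"` and `("rot", depth)`) is made up for the example.

```python
def analyze(program, initial_depths):
    """Run a stack program where each entry is the length of the chain of
    dependent adds that produced it (0 for an input value).
    Returns (instruction count, critical path length in adds)."""
    stack = list(initial_depths)          # last element is the top of stack
    for instr in program:
        if instr == "add":
            a = stack.pop()
            b = stack.pop()
            stack.append(max(a, b) + 1)   # result waits for its slower operand
        else:
            _, depth = instr              # ("rot", depth): bring an entry to the top
            stack.append(stack.pop(-1 - depth))
    return len(program), stack[-1]

# Stack holds D, C, B, A with A on top.
SERIAL = ["add", "add", "add"]                          # ((A+B)+C)+D
PAIRED = ["add", ("rot", 1), ("rot", 2), "add", "add"]  # (A+B)+(C+D)

if __name__ == "__main__":
    print(analyze(SERIAL, [0, 0, 0, 0]))  # (3, 3)
    print(analyze(PAIRED, [0, 0, 0, 0]))  # (5, 2)
```

The paired order shortens the dependency chain from three adds to two, at the cost of two extra instructions; whether that is a net win depends on how expensive the shuffles are in a given design.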

So, there you have it. Somewhat OT, because surely NASA's primary goal has got to be low-power, but also somewhat on-topic because stack architectures aren't the holy grail. Just ideal for some limited applications.

Re:Stack - bad for speed, good for low power by 0xABADC0DA · 2006-08-10 10:14 · Score: 2, Interesting

Uh, I guess I'm too daft to get a Ph. D, but it sure seems to me like optimizing on the instruction level with a stack machine is solving the wrong problem.

With a stack machine, running one instruction stream in parallel is very hard, while very easy on a register-based one. But the flip side of this is that on a stack machine running multiple instruction streams in parallel is incredibly easy while *Very* difficult on a register based CPU.

For instance, take "add 1 to each element of this 30-length array" and the optimization to unroll the loop by three:

The stack version can use parallel streams:

push array # "stack[2]"
push 30 # "stack[1]"
push 0 # "stack[0]" of stream #1
push 1 # "stack[0]" of stream #2
push 2 # "stack[0]" of stream #3
push 3 # number of parallel streams to run
fork
loop:
add 1 to mem at (stack[2] + stack[0])
stack[0] += 3
if stack[0] < stack[1] goto loop
join

You'll have to use your imagination to expand the loop body into what it would look like in stack-instructions, but basically the fork pops the number of parallel stacks to run and then the join waits for each parallel stack to complete. Of course in a real implementation you would also push a number of stack elements to copy, etc. Since instruction decoders for stack machines are so simple your cpu can have literally hundreds of them on a die and each one still doing useful work.

The register-based machine will unroll the loop:

set r1 to 30
set r2 to 0
set r3 to array
loop:
set r4 to r3[r2]
set r5 to r3[r2+1]
set r6 to r3[r2+2]
add 1, r4 store in r4
add 1, r5 store in r5
add 1, r6 store in r6
store r4 to r3[r2]
store r5 to r3[r2+1]
store r6 to r3[r2+2]
add 3 to r2
compare r2 to r1, jump to loop

Now try to run that in parallel and you get a couple memory fetches/writes overlaid, but mostly it is pretty slow. Just one hiccup in the pipeline and all of the parallelism stops. Not to mention the code to catch the remainder of the loop if not an even multiple.
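For comparison, here is a hedged Python sketch of the two shapes. It only models the structure: the function names are invented, the thread pool stands in for the hypothetical fork/join instructions, and CPython threads will not actually run this arithmetic in parallel. Each stream touches a disjoint set of elements, so the streams never write the same slot.

```python
from concurrent.futures import ThreadPoolExecutor

def add_one_strided(array, start, stride):
    """One stream: increment elements start, start+stride, start+2*stride, ..."""
    for i in range(start, len(array), stride):
        array[i] += 1

def add_one_forked(array, streams=3):
    """'fork' one stream per starting offset, then 'join' by waiting for all."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(add_one_strided, array, s, streams)
                   for s in range(streams)]
        for future in futures:
            future.result()   # re-raises any exception from the stream

def add_one_unrolled(array):
    """Single stream, loop unrolled by three, plus cleanup for the remainder."""
    n = len(array)
    i = 0
    while i + 3 <= n:
        array[i] += 1
        array[i + 1] += 1
        array[i + 2] += 1
        i += 3
    while i < n:              # leftover elements when n is not a multiple of 3
        array[i] += 1
        i += 1

if __name__ == "__main__":
    a = list(range(30))
    add_one_forked(a)
    b = list(range(31))
    add_one_unrolled(b)
    print(a[:3], b[-1])  # [1, 2, 3] 31
```

Note that the strided version needs no separate remainder loop, since each stream simply stops when its index runs past the end.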
Re:Stack - bad for speed, good for low power by phantomfive · 2006-08-10 10:47 · Score: 1

Comments like this are why I read slashdot and not digg. Thanks.

One thought I had was, what if the stack could also be accessed as an array? You could easily build hardware that would allow you to remove or insert values into the stack/array at any location in O(1) time. This would allow you to treat it as a stack or as a register based processor. This would allow you to do your add in just two cycles, add A+B, and C+D in two different threads then add the result together. For example, 'add A B' would remove B from the stack (from whatever position it's at), add them together, and put the result in 'A'. I admit this has a number of issues, but it is worth thinking about.

--
Qxe4
Re:Stack - bad for speed, good for low power by treyb · 2006-08-10 11:21 · Score: 2, Informative

Chuck Moore (the Forth guy) came to a slightly different conclusion: good for speed and good for low power. He uses the chip real estate you would spend on pipelining instructions to add another core. In the case of the SeaForth processors, he added 23 other cores. Granted, that chip doesn't pretend to do anything but target embedded devices, but he demonstrates that stack machines can run quickly and use little power.

For... telescopes? by Anonymous Coward · 2006-08-10 09:00 · Score: 0

KIDS YOU OH !
." Those who don't remember the past get to have all the fun reimplementing it!"

Re: transputer wikipedia link by trb · 2006-08-10 09:04 · Score: 1

while it was a stack computer, i always thought the most distinctive feature of the transputer was its parallel design, which could be exploited when programming it in occam

MMIX uses a register stack by bunratty · 2006-08-10 09:06 · Score: 2, Interesting

Knuth's MMIX architecture uses registers, but the registers themselves are on a register stack. Perhaps this architecture provides the best of both worlds.

--
What a fool believes, he sees, no wise man has the power to reason away.

Re:MMIX uses a register stack by aminorex · 2006-08-10 15:40 · Score: 1

Like a stack of register windows. SPARC even.

--
-I like my women like I like my tea: green-

Don't forget alpha and heat by ishmalius · 2006-08-10 09:08 · Score: 1

Space chips do have a few requirements that those on Earth don't. Ones that are exposed to the space environment need to be resistant to alpha particles (can cause bit flips). Power consumption must be as low as possible. And very important, especially in manned flight, is heat dissipation. Most common off-the-shelf laptops generate way too much heat to be used in a controlled life support system where heat must be carefully managed into and out of the system.

Re:FP stacks by gr8_phk · 2006-08-10 09:15 · Score: 1

I'm still waiting for a 64 bit processor that treats all registers the same. i.e. one load and one store instruction, but you can do fpadd or regular add on the same registers. This IMHO will reduce the number of opcodes needed, and you usually don't use a lot of FP registers and a lot of integer registers at the same time. Pure stacks suck BTW - you really need a swap (up to some depth) or a copy instruction to bring data to the top. A pure stack is too destructive. I do like the idea of a pure return stack with separate data stack.

A bumper sticker I saw once by Michael+Woodhams · 2006-08-10 09:17 · Score: 4, Funny

You Forth (heart) if honk then

--
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.

Re:A bumper sticker I saw once by whitehatlurker · 2006-08-10 09:36 · Score: 1

Great quote!

--
.. paranoid crackpot leftover from the days of Amiga.

And PowerPC is better than x86. by OrangeTide · 2006-08-10 09:19 · Score: 1

architecturally PowerPC is better than x86/x86_64. but the economics of the situation gives us much higher performance and lower prices on x86. If industry focused as much energy on PowerPC as they do x86, then we'd have slightly better computers. I think the same applies to stack computers. They will never be "better" than x86 unless the market dramatically changed and moved away from x86. Right now there isn't much experience out there for implementing stack computers. Common tricks used for branch prediction and cache optimization would be significantly different on a stack computer. I would say that stack computers are only theoretically better than x86 in terms of possible performance.

--
“Common sense is not so common.” — Voltaire

Re:For the same reason language choice always matt by Anonymous Coward · 2006-08-10 09:34 · Score: 1, Informative

Hmm...I'm playing around with implementing a simple Lisp...to get lexical closures I'm putting all frames in the heap instead of the stack, which I think is a fairly standard technique (tho not necessarily what a really optimized compiled Lisp does). Doesn't really seem stack-based to me.

it isn't a totally stupid question by Anonymous Coward · 2006-08-10 09:35 · Score: 0

because by the time Windows Vista (Forever) comes out, stack machines might have become mainstream

*sigh* in the essence of completeness . . . by Anonymous Coward · 2006-08-10 09:35 · Score: 0

In Soviet Russia, Windows runs YOU!

Re:*sigh* in the essence of completeness . . . by roman_mir · 2006-08-10 09:42 · Score: 1

here the same thing correctly: runs, YOU, in, Windows, Soviet Russia

--
You can't handle the truth.
Re:*sigh* in the essence of completeness . . . by maxwell+demon · 2006-08-10 11:48 · Score: 1

Actually it's: Russia Soviet Windows YOU runs in

--
The Tao of math: The numbers you can count are not the real numbers.
Re:*sigh* in the essence of completeness . . . by roman_mir · 2006-08-10 13:27 · Score: 1

you have a different interpretation of what 'in' and 'runs' here means.

--
You can't handle the truth.

Return of FORTH and true RISC? by nurb432 · 2006-08-10 09:39 · Score: 1

Wouldn't that be cool... we have come full circle.

--
---- Booth was a patriot ----

Calculator Wars by whitehatlurker · 2006-08-10 09:41 · Score: 1

Anybody remember the RPN vs algebraic entry wars? (HP Rules!)

--
.. paranoid crackpot leftover from the days of Amiga.

Stacker by Anonymous Coward · 2006-08-10 09:42 · Score: 0

Previous, Microsoft, Stacker, bought?

Hmm, except that it was a compression technology and they first stole it, then bought it after a lawsuit. Oh, well - or rather: Well, oh...

Re:I Know... by yaphadam097 · 2006-08-10 09:43 · Score: 1

"Stack computers, are basically like rack computers, except you can't pull out the one at the bottom."

Right, if you pull them from the bottom they are called queue computers. For stack computers it is preferred that you take and replace from the top. ...Either way you save a lot of money on the actual rack.

Re:Next Generation Stack Computing by MemoryDragon · 2006-08-10 09:44 · Score: 1

It is bullshit, all a stack computer basically does is to shift the register based ops back into the stack pointer domain, there is not too much to win there except the processors become somewhat simpler. The code itself in fact becomes even somewhat larger due to more excessive stack operations. Many VMs nowadays use stack machines in their core processing (java, .net, lisp etc...) the reason for this is that those machines map easier to many processor architectures than a vm which uses lots of registers.
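A rough Python sketch of why VM designers like stack code: compiling an expression tree is just a post-order walk, with no register allocation at all. The instruction names (push, add, sub, mul) and the tuple encoding are invented for this example and do not match the JVM or .NET bytecode.

```python
def compile_expr(node):
    """Post-order walk: emit code for both operands, then the operator."""
    if isinstance(node, (int, float)):
        return [("push", node)]
    op, left, right = node
    return compile_expr(left) + compile_expr(right) + [(op, None)]


def run(code):
    """Tiny stack VM for the instructions emitted by compile_expr."""
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if op == "add":
            stack.append(left + right)
        elif op == "sub":
            stack.append(left - right)
        elif op == "mul":
            stack.append(left * right)
        else:
            raise ValueError("unknown op: " + op)
    return stack.pop()


# (2 + 3) * (7 - 4)
expr = ("mul", ("add", 2, 3), ("sub", 7, 4))
code = compile_expr(expr)
value = run(code)
```

The same expression takes seven stack instructions here, against three three-operand register instructions if the constants can be immediates. That is the instruction-count side of the trade-off, though each stack instruction is also much smaller to encode.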

A useful HTML article (better than 9999TB AVI) by roman_mir · 2006-08-10 09:51 · Score: 4, Informative

And here it is.

--
You can't handle the truth.

Will they run PostScript and PDF natively? by thingie · 2006-08-10 09:55 · Score: 1

PostScript and PDF are stack-based languages used by millions of people every day. PostScript, in particular, has been around since the early 1980s.

http://en.wikipedia.org/wiki/PostScript

Towers of Hanoi by BierGuzzl · 2006-08-10 09:59 · Score: 1

Just think, the next big thing in stack computing: Towers^H^H^H^H^H^H Stacks of Hanoi at 2Ghz!

Re:For the same reason language choice always matt by samjam · 2006-08-10 10:05 · Score: 1

http://ficl.sourceforge.net/

If you liked neon you might like ficl; it has the same features you mentioned with choice of early and late binding.

Many forth object systems do.

I did a port of Ficl to MS Smartphone 2002/2003 but couldn't get permission to release it.
(By port I mean filling in some missing libc functions)

Sam

--
blog.sam.liddicott.com

Vista Feature by norminator · 2006-08-10 10:06 · Score: 1

Of course... that would still depend on a version of Windows for it to run on.

That was a planned feature for Vista, but of course, it got dropped.

Imagine.. by JohnnyOpcode · 2006-08-10 10:08 · Score: 1

a Beowolf 'stack' of these!

No, the *PROPER* response is by Slithe · 2006-08-10 10:12 · Score: 2, Funny

Imagine a Beowulf cluster of them.

--
---- "XML is like violence. If it doesn't fix the problem, you aren't using enough."

Re:No, the *PROPER* response is by AAWood · 2006-08-11 05:23 · Score: 1

I'd offer an even better response, but I don't want my post to clog up the tubes.

Stack computers by BigGar' · 2006-08-10 10:23 · Score: 1

Kind of looks like the way my HP RPN calculator works

--

Shop smart, Shop S-Mart.

Re:For the same reason language choice always matt by BlueGecko · 2006-08-10 10:27 · Score: 2, Informative

If you miss Neon, you'll be happy to know that you can get about 90% of its implementation and 100% of its concepts in the form of PowerMops, which is open-source and runs great and natively on Leopard. I haven't used it for anything recently, but it's worked fine for hobbyist stuff I've done in the past. I strongly encourage you to check it out.

Previous slashdot interview of Chuck Moore by Baldrson · 2006-08-10 10:56 · Score: 1

For more background see the Slashdot interview with Chuck Moore.

--
Seastead this.

Mmmm... by Anonymous Coward · 2006-08-10 11:03 · Score: 0

Look at the stack on that one!

Joy! by selfdiscipline · 2006-08-10 11:09 · Score: 2, Interesting

How come no one has mentioned the language Joy?
I've looked into it a couple times, and it seems pretty neat. In a word, functional concatenation.
Plus, as we all know, functional languages are so much more fun than procedural.

--

-------
Incite and flee.

Re:Joy! by waxwing · 2006-08-11 03:16 · Score: 1

Joy is more than interesting. http://www.latrobe.edu.au/philosophy/phimvt/joy.html
Recursion Theory and Joy, by Manfred von Thun. Abstract: Joy is a functional programming language which is not based on the application of functions to arguments but on the composition of functions. Many topics from the theory of computability are particularly easy to handle within Joy. They include the parameterisation theorem, the recursion theorem and Rice's theorem. Since programs are data, it is possible to define a Y-combinator for recursion and several variants. It follows that there are self-reproducing and self-describing programs in Joy. Practical programs can be written without recursive definitions by using several general purpose recursion combinators which are more intuitive and more efficient than the classical ones.
I wrote an interpreter for a subset of Joy (in Oberon, btw) and as an unexpected side effect, I now understand continuations.
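To give a flavour of "composition instead of application", here is a small Python sketch. It is not Joy syntax and not my Oberon interpreter; the helper names (lit, dup, mul, add, run) are made up for this post. A program is a list of stack-to-stack functions, and concatenating two programs gives the program that runs one and then the other.

```python
def lit(value):
    """Return a word that pushes `value` onto the stack."""
    return lambda stack: stack + [value]


def dup(stack):
    # ( a -- a a )
    return stack + [stack[-1]]


def mul(stack):
    # ( a b -- a*b )
    return stack[:-2] + [stack[-2] * stack[-1]]


def add(stack):
    # ( a b -- a+b )
    return stack[:-2] + [stack[-2] + stack[-1]]


def run(program, stack=None):
    """Apply each word in order; running p + q is running p and then q."""
    stack = [] if stack is None else stack
    for word in program:
        stack = word(stack)
    return stack


# A new word is defined by concatenation alone; no parameters are named.
square = [dup, mul]
program = [lit(3)] + square + [lit(4)] + square + [add]
result = run(program)
```

Note that square never mentions its argument. That is the point of the concatenative style: list concatenation of programs corresponds to function composition.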

A Near Miss for Stack Computing Circa 1981 by Baldrson · 2006-08-10 11:12 · Score: 5, Interesting

Stack computing came close to changing the course of the computer industry, including setting networking forward 15 years (displacing Microsoft's stand-alone approach to software) back in 1981.

An excerpt from a bit longer essay I wrote:

In August 1980, Byte magazine published its issue on the Forth programming language.
At that time, I was working with Control Data Corporation's PLATO project, pursuing a mass market version of that system using the Intelligent Student Terminal (IST). The IST's were Z80 processor terminals sporting 512*512 bit mapped displays with touch sensitive screens and 1200bps modems that went for about $1500. We were shooting for, and actually successfully tested, a system that could support almost 8,000 simultaneous users on 7600-derived Cybers (the last machine designed by Seymour Cray to be marketed by CDC --with 60 bits per word, 6 bits per character, no virtual memory, but very big and very fast) with under 1/4 second response time (all keys and touch inputs went straight to the central processor) for $40/month flat rate including terminal rental. Ray Ozzie had been working at the University of Illinois on offloading the PLATO central system to the Z80 terminal through downloaded assembly language programming, doing exotic things like "local key echo" and such functions.
I was interested in extending Ray's work to offload the mass-market version of the PLATO central system. In particular I was looking at a UCSD Pascal-based approach to download p-code versions of terminal functions -- and even more in particular the advanced scalable vector graphics commands of TUTOR (the "relative/rotatable" commands like rdraw, rat, rcircle, rcircleb, etc.) if not entire programs, to be executed offline. Pascal was an attractive choice for us at the time because CDC's new series of computers, the Cyber 180 (aka Cyber 800) was to have virtual memory, 64 bit words, 8 bit characters and be programmed in a version of the University of Minnesota Pascal called CYBIL (which stood for Cyber Implementation Language). Although this was a radically different architecture than that upon which PLATO was then running, I thought it worthwhile to investigate an architecture in which a reasonable language (you should have seen what we were used to!) could be made to operate on both the server and the terminal so that load could be dynamically redistributed. This idea of dynamic load balancing would, later, contribute to the genesis of Postscript.
Over one weekend a group of us junior programmers managed to implement a good portion of TUTOR's (PLATO's authoring language) advanced graphics commands in CYBIL. Our little hunting pack at CDC's Arden Hills Operations was in a race against the impending visit of Dave Anderson of the University of Illinois' PLATO project who was promoting what he called "MicroTUTOR". Anderson was going to take the TUTOR programming language and implement a modified version of it for execution in the terminal -- possibly in a stand-alone mode. Many of us didn't like TUTOR, itself, much. Indeed, I had to pull teeth to get the authorization to put local variables into TUTOR -- and we were determined to select a better board from our quiver with which to surf Moore's Shockwave into the Network Revolution. CDC management wasn't convinced that such a radical departure from TUTOR would be wise, and we hoped to demonstrate that a p-code Pascal approach could accomplish what microTUTOR purported to -- and more. We quickly ported a TUTOR central sy

--
Seastead this.

Re:Next Generation Stack Computing by Anonymous Coward · 2006-08-10 11:20 · Score: 0

It is bullshit, all a stack computer basically does is to shift the register based ops back into the stack pointer domain, there is not too much to win there except the processors become somewhat simpler.

Actually, much, much simpler. Sometimes 3 orders of magnitude or more (in terms of # of transistors). Yes, you lose out-of-order execution, branch prediction, all of the fancy pipeline stuff. What you gain is simplicity and a good cost/performance ratio.

The code itself in fact becomes even somewhat larger due to more excessive stack operations.

Now that is BS. If your code is larger in a stack-based language than in an imperative language then you are doing something wrong. Refactoring: learn it, love it.

But is it really that bad for speed? by tempest69 · 2006-08-10 11:23 · Score: 1

I'm not a chip designer, but I still tend to run my mouth (er um fingers) quite a bit.

I totally agree that instruction level parallelism for a stack machine is a horrible monster to match up to a RISC processor. But there are some nice pieces for parallelism that work well. The circuits for a stack processor take up a relatively small amount of space, and require less power per stack "unit". So while making a given stack chip faster might be a monster problem, adding more stack units to a CPU die may allow for some serious performance gains. So there are a few items where the fine grained parallelism just isn't going to happen. But as far as coarser grained parallelism goes there might be some possibilities.

There are some super low latency situations where I could see this being untenable, and a few linear problems that just won't crumble down simply. Of course I might be missing something obvious here...

Storm

Re:But is it really that bad for speed? by philipgar · 2006-08-10 15:42 · Score: 1

Ah yes, there are many many options and thread level parallelism will almost always yield better performance on the perfect application, when the programmer is willing to spend tons of time working on the code. Better yet, we could use FPGAs with custom written hardware kernels running on them. This will yield even higher performance. Of course development might take a couple extra years, and the cost... well what does that matter, THIS THING WILL BE FAST!!! We can end up with superfast computers, it's just a lot of work. Basically the problem comes down to do we spend a lot of money on hardware, or do we spend a ton of money on software, or do we balance the two? Most of the time the balance works best. Out of order execution wastes the majority of its hardware trying to improve performance by tiny amounts, yet for most of the workloads people run, it's been considered worthwhile. Good computer architects have learned that they have to balance the needs of the programmer before they can make a successful architecture.

Whoever thought this story was worthy of Slashdot is a moron. There's nothing new here, it's just that stack machines are good for some low power applications. Move along, and consult someone who has actually done some research on computer architecture before posting some random person's worthless presentation.

Phil

MPE & HP3000 by cdn-programmer · 2006-08-10 11:52 · Score: 1

Since I am one of the few people who read Slashdot who has actually programmed a stack machine - my opinion might count.

This is a good architecture. It makes a great deal of sense. One of the things that a stack machine does is pull the memory (at least the automatic variables) for the function currently being executed together at the top of the stack. This means it is feasible to load this memory into cache or even into CPU registers, which greatly increases speed. The HP3000 only had 4 registers for the stack top - today a CPU can have 100's. Whether there are 4 or 40 or 400 registers, the instruction set is the same.

This has always been a very smart way to organise a machine.

The X86 is an example of everything! by Cassini2 · 2006-08-10 12:28 · Score: 3, Interesting

I did a computer architecture course a number of years ago. One day, we came to the consensus that the X86 architecture was an example of every computer architecture in existence. You want load store: look at all those MOV AX, xxxx instructions. You want register RISC, look at all those registers AX, BX, CX, DX, SI, DI, SP, BP. You want stack based: look at the FPU. You want vector parallel processing, look at those MMX/SSE instructions. You want symmetric multi-processing, look at those dual cores.

The course went quickly downhill after this observation. No one could figure out how incorporating every processor architecture into one product was a good thing ...

I got your kernel right here... by sethmeisterg · 2006-08-10 12:33 · Score: 1

A few kilobytes for a kernel? But how USEFUL is it? I got a kernel right ova' hea': PUSH $0 POP EAX JMP .-$9 done.

leaf expressions, register renaming by Joseph_Daniel_Zukige · 2006-08-10 12:37 · Score: 1

If a CISC with a small logical register set can find leafs in the pipe and distribute them to cores, even a true TOS limited stack processor could be designed to find the independent leaves and distribute them.

Re:leaf expressions, register renaming by Anonymous Coward · 2006-08-10 15:06 · Score: 0

If a CISC with a small logical register set can find leafs in the pipe and distribute them to cores, even a true TOS limited stack processor could be designed to find the independent leaves and distribute them.

"distribute them to cores"? Current processors can run one thread only on a single core, whether or not it has independent groups of instructions.

next loop by Joseph_Daniel_Zukige · 2006-08-10 12:48 · Score: 1

loop:
jmp ,y++
bra loop

or something like that?

Direct threading actually turned out to be slower than indirect threading. I know from the implementation I did (which is still out there and downloadable, somewhere).

re-ordering is not prevented by Joseph_Daniel_Zukige · 2006-08-10 12:52 · Score: 1

by the virtual execution model. (Witness the 8086 descendant that is probably munching your data stream right now.)

Re:re-ordering is not prevented by Anonymous Coward · 2006-08-11 02:17 · Score: 0

I don't think you understand what "virtual execution model" means.

order of execution is independent of the model by Joseph_Daniel_Zukige · 2006-08-10 12:57 · Score: 1

I'll tire of saying so pretty soon, but here is one more try.

The current x86 chips _rename_ their registers to ease the bottleneck.

With only one TOS, renaming becomes a much simpler process. SWAP and ROT and DROP could easily become zero-cycle no-ops.
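A toy Python model of what I mean, as a sketch only: the class RenamedStack and its counter are invented for illustration and say nothing about how a real rename stage is wired. The logical stack holds names (indices into a physical register file), so SWAP, ROT and DROP only permute or discard names and never move a value.

```python
class RenamedStack:
    """Logical stack of names that point into a physical register file."""

    def __init__(self):
        self.regs = []        # physical registers: values are written once
        self.names = []       # logical stack: indices into self.regs
        self.data_moves = 0   # counts actual value writes

    def push(self, value):
        # The only operation here that writes a value.
        # (A real design would recycle freed registers; this sketch never does.)
        self.regs.append(value)
        self.names.append(len(self.regs) - 1)
        self.data_moves += 1

    def swap(self):
        # ( a b -- b a ): exchange two names, no data moves.
        self.names[-1], self.names[-2] = self.names[-2], self.names[-1]

    def rot(self):
        # Forth ROT ( a b c -- b c a ): rotate three names.
        self.names.append(self.names.pop(-3))

    def drop(self):
        # ( a -- ): forget the top name.
        self.names.pop()

    def view(self):
        """The stack as the program sees it, bottom first."""
        return [self.regs[i] for i in self.names]


s = RenamedStack()
for v in (1, 2, 3):
    s.push(v)
s.rot()
after_rot = s.view()
s.swap()
s.drop()
after_drop = s.view()
```

After all the shuffling, data_moves is still 3, one per push. Whether that counts as "zero cycles" in hardware depends on the front end, which this model does not try to capture.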

The pipeline optimizer simply has to go hunting for independent leaf expressions, and since they are bounded by the patterns of growth and shrinkage of the stack, they should be easy to find. It has been done, and it works.
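As a sketch of the hunting, assuming straight-line code with only pushes and binary operators (no SWAP/DUP, no branches): remember, for every live stack slot, the instruction index where the code that produced it begins. When a binary operator fires, the code ranges behind its two operands cannot depend on each other, so they are candidates to run in parallel. The instruction encoding and the function name are made up for this example.

```python
# Stack effect of each instruction: (values popped, values pushed).
EFFECT = {"push": (0, 1), "add": (2, 1), "sub": (2, 1), "mul": (2, 1)}


def independent_operands(code):
    """For each binary op, return the two half-open instruction ranges
    (start, end) that computed its left and right operands."""
    starts = []  # per live stack slot: index where its producing code begins
    pairs = []
    for i, (op, _arg) in enumerate(code):
        pops, _pushes = EFFECT[op]
        if pops == 0:
            starts.append(i)
        else:
            right_start = starts.pop()
            left_start = starts.pop()
            pairs.append(((left_start, right_start), (right_start, i)))
            # The result slot covers everything that fed this operator.
            starts.append(left_start)
    return pairs


# (2 + 3) * (7 - 4) in postfix form
code = [("push", 2), ("push", 3), ("add", None),
        ("push", 7), ("push", 4), ("sub", None),
        ("mul", None)]
pairs = independent_operands(code)
```

For the final mul this reports ranges (0, 3) and (3, 6), that is, the code for 2 + 3 and the code for 7 - 4, found purely from how the stack grows and shrinks.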

Re:order of execution is independent of the model by AcidPenguin9873 · 2006-08-10 15:17 · Score: 1

Basically what you're proposing is taking the single-stack ISA and converting it, internally, to a multiple-stack execution model (sort of analogous to how current processors convert the architectural registers into physical registers. sort of.). I'll agree that this is possible; however, how exactly is this faster/better than current register-based processors? You'd still have the rename, scheduling, and reorder buffer logic, so there goes any complexity/power argument. And stack swap operations only exist because the instructions (say, subtract) only operate on the top few elements of the stack (say, M[top-of-stack] - M[top-of-stack - 1]), and so those are actually overhead operations when compared with the register-based model. I have yet to hear a convincing argument for building such a processor.

Factor by John+Nowak · 2006-08-10 13:02 · Score: 1

As I haven't seen anyone mention it, I will.

Factor is a fucking brilliant (yes, I mean that) stack-based language. It is, in many ways, a hyper-modern Forth. If you've never dealt with such a thing before, give it a look. It'll completely change how you think about programming.

four way set associative by Joseph_Daniel_Zukige · 2006-08-10 13:03 · Score: 1

is way, way, overkill for the stack.

Expression leafs exist, the patterns of stack usage reveal them, if the leafs can be found they can be distributed.

No, the pipelines you're probably used to are not appropriate for the redistribution.

Twelfth of Never, but still 3 years ahead of apple by Anonymous Coward · 2006-08-10 13:04 · Score: 0

I wonder if Windows will be supported on a stack computer in the future?"

Considering it took Apple over three years longer than Microsoft to support 64-bit computers, I think the better question is how many centuries it will be before OS-X will run on a stack.

The problems with FORTH by Joseph_Daniel_Zukige · 2006-08-10 13:18 · Score: 1

are mostly derived from everyone being dazzled by the magic and missing the meaning.

The biggest issue is the source of the compactness. That much code re-use makes it difficult to separate modules, either for design purposes or for security/safety.

I'd go on, but I have to go to bed. Graveyard shift was fun, but I'm tired.

Abstraction physics - looping thru programming... by 3seas · 2006-08-10 13:33 · Score: 1

... concepts is to be expected.

In other words, stacks are a programming concept that can be applied at many different levels of computing.
This applies to other programming concepts as well. You might say it's the recursive nature of programming to re-explore a concept in a differently configured computing environment and/or at a different level of computing abstraction.

Really no different than some mathematical equation where some element of the equation is the use of stacks.

Some clarifications and a reply by Eric+LaForest · 2006-08-10 14:28 · Score: 1

My reply to the main themes in the comments are here: http://funos.livejournal.com/367820.html

--
none

Ummm... What? by porkchop_d_clown · 2006-08-10 14:38 · Score: 1

Sorry, but LISP (though I don't mean Common LISP) is just as much a stack language as FORTH.

Ummm. Yeah. I know that. That's why I mentioned it. It was a joke, you know?

I've programmed in Forth, Lisp, APL, hand-written Postscript, and just about every other computer language that was ever popular, and several that never were.

Heh. I can still remember writing Postscript by hand to convince our laser printer to make the signs I needed for my N-gauge train layout...

--
Clear, Dark Skies

I'll have to look into that. by porkchop_d_clown · 2006-08-10 14:45 · Score: 1

I guess it was back in 2003 that I was introduced to a "highly proprietary, specialized scripting language" that I immediately recognized as FORTH - which the salesman denied.... Later, after playing with it I even figured out which PD implementation of FORTH they had ripped off, but I also discovered that you can no longer get "Thinking FORTH" or "Starting FORTH" and that FORTH had apparently become a completely dead language.

To be honest, I wouldn't recommend using it in a modern environment - or as a machine architecture - but for squeezing complex applications into machines with 4k of RAM it was very sweet.

--
Clear, Dark Skies

Re:I'll have to look into that. by samjam · 2006-08-10 19:15 · Score: 1

Forth isn't dead, and you can get the beginner forth books online.

Try comp.lang.forth and see what's kicking.
Look for Forth Webring websites and get all the online books you need.

Forth will never die just because it's too useful and it's too easy to develop reliable systems with it.

Sam

--
blog.sam.liddicott.com
Re:I'll have to look into that. by mrchaotica · 2006-08-11 00:32 · Score: 1

It saddens me that Apple's switch from OpenFirmware to EFI means that Macs no longer have a built-in Forth interpreter. : (

--
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

So, for those of us without limitless spare time.. by porkchop_d_clown · 2006-08-10 14:55 · Score: 1

What makes it so great?

I pulled it down, and (superficially) it looks like any other FORTH interpreter.

Seriously - I'm not trying to run it down, tell me what makes it worth my time to revisit a UI model I first saw on a C64?

--
Clear, Dark Skies

Yes, and... by porkchop_d_clown · 2006-08-10 15:08 · Score: 1

There's a very good reason for that.

--
Clear, Dark Skies

Re:Yes, and... by BigGar' · 2006-08-11 03:42 · Score: 1

I agree, there are a lot of advantages. I love the RPN way of doing things.

--

Shop smart, Shop S-Mart.

Re:Computer-Science Motto: Back to the Future by CTachyon · 2006-08-10 15:25 · Score: 1

I think it technically qualifies, much like fingerpainting...

--
Range Voting: preference intensity matters

Re:For the same reason language choice always matt by kabz · 2006-08-10 15:41 · Score: 1

Yeah, sounds familiar. PS-Algol (Persistent S-Algol) designed at St Andrews by a team including one of the Java designers, did this. There was a standard stack for evaluations etc., but I think the stack frames, structures etc., themselves lived on the heap.

PS-Algol included persistent hashes through a simple but very powerful API, and supported serialisation of anything you could point to. Ron Morrison would probably kill me if I didn't mention that procedures in PS are first class objects, permitting not just closures, but save/load of closures ... think a programming assignment where you open a db, then reflect the interface, then interact with a closure saved by the professor in order to perform the assignment. One example was implementing a communications protocol to communicate with stubs in a pre-saved closure.

Google for PS-Algol to find papers and research. Hi to anyone at St Andrews that still remembers me!

--
-- "It's not stalking if you're married!" My Wife.

If Stack-Based Computing Is So Great... by Nom+du+Keyboard · 2006-08-10 16:13 · Score: 2, Interesting

If stack-based computing is so great, powerful, and cheap, why aren't IBM PPC, AMD Athlon, Intel Core pick-a-number, and Sun Sparc dueling it out for the best stack-based chip? Why aren't the next-gen game consoles all using it, since Microsoft and Sony at least (Wii is just a faster GC) went to new architectures? Don't tell me no one has ever heard of the concept before. The Burroughs 5500 dates back to the 1960s. I think there's more here than is being told.

--
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."

Re:So, for those of us without limitless spare tim by John+Nowak · 2006-08-10 17:43 · Score: 1

I'm not a salesman. If you're not interested enough by what I said to give it a closer look, then why would I waste my time?

Re:For the same reason language choice always matt by HiThere · 2006-08-10 17:56 · Score: 1

Unfortunately, I no longer use a Mac as my primary computer...though I did at the time that Neon was alive. (I'm glad to hear that it's still going, however.)

--

I think we've pushed this "anyone can grow up to be president" thing too far.

On behalf of all reverse-engineers - Noooo!!!!!! by demallien2 · 2006-08-10 19:07 · Score: 1

As anyone that has ever had the misfortune to have to reverse anything done for the ST20 can confirm, stack based processors just hurt the brain. I personally take three times as long to grok a program on an ST20, which uses a transputer core, as I do when working on an ST40, which is based on an SH-4 RISC architecture. That said, it would make life a bit more difficult for virus writers as well :-)

My Point Exactly by IDontLinkMondays · 2006-08-10 19:22 · Score: 1

The instruction set of the CPU is no longer even relevant. As far as I can tell, there is no logical reason to even suggest that stack based vs. register based is an issue when discussing CPU architecture any longer.

If you read the Core 2 Duo architecture details and all the well written articles across the web describing the design of them, you'll quickly come to the two conclusions that I came to early on:
- Branch prediction/Out of Order execution handlers completely render assembler irrelevant to anyone except the compiler developer
- The instruction set decoder no longer supports x86 but a wider architecture more similar (in many ways) to RISC computing than to x86. x86 appears to be interpreted into simpler RISC style instructions for faster decoding and execution when feeding the arithmetic/logical/etc... units.

The one place where this is less obvious is in the case of the SSE/SIMD/Vector processing units since they appear to be designed for two specific groups of people
- Hand coding assembler for faster operations
- Parallel Processing compilers that seem to ignore the existence of an FPU and use vectors instead when calculating massive equations. (such as Intel/former Kai compilers).

As for stack based computing. As a hobby, I've developed a great deal of code for both Open Firmware and for EFI using stack based computing. It's a great thing, but frankly, although it works well enough, when it comes to any other type of development, I don't really care what style of instruction set is used, that's the compiler's problem.

My most important issue here is that it doesn't matter what advantages one CPU architecture has over another; the pure simple fact is that there are thousands of compilers already in existence for register based computing and for the most part, compiler engineers/scientists across the world have the register based architecture more or less worked out. The drawbacks they have no longer involve register performance; most optimizations made to compilers these days are focussed on memory allocation and usage.

If you want to write a paper which truly revolutionizes computing, focus instead on memory allocation technologies. Products such as SmartHeap and Doug Lea's Malloc and other technologies are only a minor fraction of the issue, though they themselves are amazing products.

The real problem in design is that there is no real method of dealing with object oriented developers that have little or no understanding of memory allocation architectures and therefore cause programs to use far too much heap, so the CPU uses WAY too many cycles dealing simply with allocation and deallocation of memory. Fragmentation is an issue and no matter how advanced the MMU in a system becomes, there's a certain limit to what can be done in hardware. Almost every single application I've ever seen the code of regularly expands its own heap by calls to functions such as sbrk(), but almost none of them ever give heap back to the system. In an environment such as Windows/Linux desktop/Mac it doesn't matter that much, but on systems with limited space, it's a killer.

So why waste time with ridiculous issues such as register vs. stack architectures when in reality it would make little real difference?

... what an innovation by Anonymous Coward · 2006-08-10 22:13 · Score: 0

come on, you can use every register based cpu in the meaning of a "stack computer" ..... just keep away at registermongering and use only push/pop (or an emulation of these ...) and memory access ..... how innovative

Java chip? by famebait · 2006-08-10 23:03 · Score: 1

Back when Java was new and mysterious, there was a lot of talk about java chips figuring in the near future, and they would most probably be stack based, since the runtime spec is. Never really happened, but maybe it's time for that sort of thinking, now that instruction sets are virtual anyway?

--
sudo ergo sum

Right. Well. by porkchop_d_clown · 2006-08-10 23:09 · Score: 1

I downloaded it and it looks like the exact same thing I was using in 1983. I'm not going to waste time trying to find out any different if you don't care enough to write a 30 second message.

--
Clear, Dark Skies

Re:Right. Well. by 11223 · 2006-08-11 01:43 · Score: 1

It's dynamically typed. It's got a presentation-based user interface ala CLIM from Common Lisp, which itself was derived from the user interface of the Symbolics Lisp machines. It's native-compiled, not just threaded.
Re:Right. Well. by John+Nowak · 2006-08-11 04:37 · Score: 1

It also takes quite a few interesting ideas from Joy. It adds dynamic scoping as well.

Parallalism by Peaker · 2006-08-11 00:15 · Score: 2, Interesting

How about achieving parallelism by using multiple stacks?

If a stack machine is that much simpler, couldn't you either have:

A vast amount of cores for many unrelated threads
Or: Multiple pipelines and explicit division of instructions into the pipelines?

The second refers to an instruction coding similar to VLIW such that you parallelise the code on multiple stacks but it still shares an instruction/data cache and allows for parallelism without heavy multi-threading at the high-level (and instead having parallelism as a compiler optimization at the low-level).

Re:Parallalism by Theovon · 2006-08-11 00:46 · Score: 0

As you know, automatic multi-threading is an unsolved problem. Your multi-stack VLIW idea is very interesting, but even that would be damn difficult for a compiler to take advantage of.
Re:Parallalism by Peaker · 2006-08-11 12:26 · Score: 1

As you know, automatic multi-threading is an unsolved problem.

Of course (except for purely functional languages, that is), that's why my first suggestion only applies to multithreaded code, just as multicore CPUs do today.

Your multi-stack VLIW idea is very interesting, but even that would be damn difficult for a compiler to take advantage of.

As it was difficult in the original VLIW, wasn't it?
Though I think it should be just as simple and probably a lot simpler than the compiler doing a lot of shuffling in order to perform multiple unrelated parallel jobs on the same stack...

The compiler could deduce that multiple expressions it needs to calculate are indeed unrelated and split them into multiple VLIW's...
Perhaps even the compiler can expose a different interface in the high-level language to allow explicit use of this VLIW parallelism - but that is another difficult idea :-)

Heh. by porkchop_d_clown · 2006-08-11 00:56 · Score: 1

I've dropped into EFI off and on, but I never worked up the courage to install that EFI pong game that came out at one point.

--
Clear, Dark Skies

stack computer? by garwain · 2006-08-11 02:28 · Score: 1

I have been using stacks of computers for years. Isn't that about the same thing?

Re: transputer wikipedia link by DaChesserCat · 2006-08-11 02:32 · Score: 1

Computer System Architects, for whom I was working, was selling the Inmos Occam development kit and Logical Systems C, both of which allowed you to exploit the parallelism.

I was already familiar with C before I started there, and I learned Occam and Transputer Assembly Language on the job. I'd never even HEARD of a stack-based architecture before, so I was in for a real eye-opener.

What's truly sad is that, these days, some geeks would just port Linux to it and it would have a thriving market. Back then, it didn't run DOS or Windows 3.0, so it didn't sell.

--
... by the Dew of Mountains the thoughts acquire speed, the hands acquire shakes, the shakes become a warning

Re:For the same reason language choice always matt by SigmundFloyd · 2006-08-11 02:39 · Score: 1

However, MOPS' developer Mike Hore has stated that he wasn't too keen on porting the whole system to Intel Macs.

--
Knowledge is power; knowledge shared is power lost.

Re:For the same reason language choice always matt by Anonymous Coward · 2006-08-11 03:15 · Score: 0

Sounds interesting, thanks!

Variable length integers by tepples · 2006-08-11 04:35 · Score: 1

But you would need pointers to that infinite amount of memory (which HDD did we store what on). Unfortunately a pointer in an infinite address space needs an infinite amount of storage, meaning you would have converted all matter in the universe into HDD before you had finished storing the first pointer.

Why not use variable-length integers, consisting of significant bits followed by a special symbol that designates the end of a pointer? For example, MIDI uses this concept for delays between notes.
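
A rough sketch of the MIDI flavour of this idea, for illustration only. Strictly speaking, MIDI does not use a separate end symbol: each byte carries 7 payload bits, and the top bit says whether more bytes follow. The function names are made up, and unlike Standard MIDI Files (which cap a quantity at four bytes) this version puts no limit on the length.

```python
def encode_vlq(n):
    """Encode a non-negative integer as a MIDI-style variable-length quantity."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [n & 0x7F]                   # final byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))        # most significant group first


def decode_vlq(data, pos=0):
    """Decode one quantity starting at pos; return (value, next position).

    Raises IndexError if the data ends before a terminating byte is seen.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:               # top bit clear marks the last byte
            return value, pos
```

Small values cost one byte, and the cost grows by one byte per 7 bits of magnitude, so a pointer is only as long as the address it names.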

Re:Variable length integers by Anonymous Coward · 2006-08-11 05:50 · Score: 0

Performance reasons. :)

Mod Parent UP by Portfolio · 2006-08-11 04:41 · Score: 1

Good responses from the original author.

Also, the Bernd Paysan mentioned is one of the prime authors of GNU Forth, one of the most popular Forth interpreters in the Unix world.

Agree by p3d0 · 2006-08-11 06:08 · Score: 1

Too bad Display Postscript didn't really seem to go anywhere. Did it evolve into anything?

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Open and Free OFSIC architecture specification by Anonymous Coward · 2006-08-11 11:40 · Score: 0

A free and open architecture for a Forth-based CPU/microcontroller is here: http://indi.joox.net/ Any comments?

Interesting. by porkchop_d_clown · 2006-08-11 12:14 · Score: 1

Thanks for the heads up - I googled for "thinking forth" and found the SourceForge site. In my defense, it was opened a good 2 years after I went looking for the book. At the time I needed it, I could only find copies on eBay.

Heh. And to think, if that SourceForge site had been open 2 years earlier, I might not have quit my job.

--
Clear, Dark Skies

benefits of multistack? by Joseph_Daniel_Zukige · 2006-08-11 12:19 · Score: 1

Just adding a stack is no good. You have to be coordinated about it.

I mentioned, I think, dedicated stack caches? With hysteresis, spilling and filling at the 1/4 and 3/4 points? Much less complicated than 4-way set associative, and much smaller for the performance gain. And that cache is what the renaming function maps onto.

Cache for the call/return stack only needs to be maybe 16 or 32 words deep for almost any GP hardware, including all consumer stuff.

Cache for the parameter stack would need a bit more, but 1K words would be easily sufficient for most purposes. Needless to say, renaming would only target the top 16 or 32 registers.

Small stack cache would allow implementing cache pairs for faster context switching. (Sets of four might be useful for dedicated real-time hardware.)
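
A toy software model of the spill/fill behaviour, to show what the hysteresis buys. This is a sketch under assumptions that go beyond what is stated above: the cache spills once it is more than 3/4 full, fills once it drops below 1/4 full, and each transfer brings it back to half full. The class name and the transfers counter are made up for illustration; real hardware would do this in the background.

```python
class StackCache:
    """Top-of-stack cache backed by slower memory, with hysteresis."""

    def __init__(self, capacity=16):
        if capacity < 4:
            raise ValueError("capacity must be at least 4")
        self.low = capacity // 4          # fill when occupancy drops below this
        self.high = (3 * capacity) // 4   # spill when occupancy exceeds this
        self.mid = capacity // 2          # assumed target after a transfer
        self.cache = []                   # fast entries; top of stack is cache[-1]
        self.memory = []                  # spilled entries; most recent is memory[-1]
        self.transfers = 0                # words moved between cache and memory

    def push(self, word):
        self.cache.append(word)
        if len(self.cache) > self.high:
            # Move the oldest cached words out until the cache is half full.
            count = len(self.cache) - self.mid
            self.memory.extend(self.cache[:count])
            del self.cache[:count]
            self.transfers += count

    def pop(self):
        if not self.cache:
            raise IndexError("pop from empty stack")
        word = self.cache.pop()
        if len(self.cache) < self.low and self.memory:
            # Bring the most recently spilled words back under the cached ones.
            count = min(self.mid - len(self.cache), len(self.memory))
            self.cache[:0] = self.memory[-count:]
            del self.memory[-count:]
            self.transfers += count
        return word
```

The gap between the two thresholds is the point: code that pushes and pops around one depth never touches memory, whereas a single threshold would spill and refill on every call/return pair that straddles it.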

There are other optimizations that are enabled (but not implied) by using multiple stacks. But it's my bedtime now, so I'll leave it alone.

Slashdot Mirror

347 comments