GNU Guile Scheme Gets a Register VM and CPS-Based IL
In late November, Andy Wingo pushed a new register VM to the master branch of Guile, the GNU implementation of the Scheme language. It brought a number of performance improvements, but introduced a conceptual mismatch between the compiler's direct-style intermediate language and the virtual machine. Earlier this week he announced a new continuation-passing style intermediate language for Guile. From the article:
"To recap, we switched from a stack machine to a register machine because, among other reasons, register machines can consume and produce named intermediate results in fewer instructions than stack machines, and that makes things faster. To take full advantage of this new capability, it is appropriate to switch at the same time from the direct-style intermediate language (IL) that we had to an IL that names all intermediate values. ... In Guile I chose a continuation-passing style language. ... Guile's CPS language is composed of terms, expressions, and continuations. It was heavily inspired by Andrew Kennedy's 'Compiling with Continuations, Continued' paper. ... The optimizations I have currently implemented for CPS are fairly basic. Contification was tricky. One thing I did recently was to make all non-tail $call nodes require $kreceive continuations; if, as in the common case, extra values were unused, that was reflected in an unused rest argument. This required a number of optimizations to clean up and remove the extra rest arguments for other kinds of source expressions: dead-code elimination, the typical beta/eta reduction, and some code generation changes."
The article describes the CPS language provided by Guile and explains the reasons behind choosing CPS over SSA or A-Normal Form. The Guile manual contains draft documentation. The new VM and Intermediate Language will be released with Guile 2.2, which should be out later this year.
Remember when a nerd was someone who cared about tail call optimizations and SSA, and not which corporation made their cellphone?
Off my lawn.
Don't blame me, I voted for Baltar.
The frightening thing is, a year ago, I wouldn't have understood this post. Now I'm reading the paper on contification and nodding my head, going "Yeah, yeah... huh... uh... yeah, okay..." It's been that kind of year. This is the kind of stuff that's starting to show up on webdevs' radars. With the release of ECMAScript 6 and the precompiler suites, the essential core Scheme-ness of even JavaScript is starting to infect us all.
Hierarchic thinking has infested Computer Science for decades. The entire Object Oriented Programming paradigm was founded on hierarchy, with Inheritance defined as from one parent only, and many not too sure if multiple inheritance is a good idea or needed. That the notion of inheritance even has to be qualified with that word "multiple", because it implicitly means single otherwise, is a barrier to thought. Trees are useful data structures, but they aren't the ultimate, universal data structure that can succinctly describe all other organizations of data.
There are a lot of foolish programmers who think that because they've got inheritance available to them, they've got to use it. It's the If All You Have Is A Hammer principle. Delegation/composition is much more useful in practice (and can express arbitrary graphs just fine).
"Little does he know, but there is no 'I' in 'Idiot'!"
GNU guile's built-in reader includes support for SRFI-105, so you can use infix expressions directly. In particular, you can use {...} instead of (...) and put the operator in the EVEN position, e.g., {n https://www.gnu.org/software/guile/manual/html_node/SRFI_002d105.html
If you want to eliminate more of the parens, you can use guile with SRFI-110, which provides support for indentation-sensitive semantics. An implementation is available with an MIT license. See more here: http://readable.sourceforge.net/
- David A. Wheeler (see my Secure Programming HOWTO)
To those interested in the implementation of programming languages, it is immediately apparent that this is a fundamental change in the compiler behind the GNU Guile system which implements the Scheme programming language, inasmuch as it now has a virtual machine based on the register model instead of the implied stack model, along with an intermediate language in its compilation path that is based on continuation-passing style.
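For anyone wondering what "continuation-passing style" buys you, here is a minimal Python sketch of the idea. It is not Guile's CPS language (which has its own terms, expressions, and continuations); the helper names add_k and mul_k are made up for illustration. The point is only that in CPS every intermediate result gets a name, because it arrives as the argument of a continuation.

```python
# Illustrative only: a toy contrast between direct style and CPS.
# add_k, mul_k, and compute_cps are hypothetical names, not Guile APIs.

def compute_direct(a, b, c):
    # Direct style: the result of (a + b) is an anonymous temporary.
    return (a + b) * c


def add_k(x, y, k):
    # Pass the sum to the continuation k instead of returning it.
    return k(x + y)


def mul_k(x, y, k):
    # Pass the product to the continuation k instead of returning it.
    return k(x * y)


def compute_cps(a, b, c, k):
    # CPS: the sum is bound to the name `total` by the continuation,
    # so the next step refers to it explicitly.
    return add_k(a, b, lambda total: mul_k(total, c, k))


if __name__ == "__main__":
    # The identity function stands in for "return to the caller".
    print(compute_direct(2, 3, 4))
    print(compute_cps(2, 3, 4, lambda value: value))
```

A named intermediate like `total` is the kind of thing a compiler can then assign to a VM register, which is the connection to the new register VM.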
I think that the lesson here is for everyone: There are many segments of nerd culture, and it is very unlikely that any randomly-selected Slashdot reader understands and appreciates all of those segments. For example, the earlier headline today "Why Transivity Violations Can Be Rational" has no meaning to many readers, even after the title was corrected to spell transitivity correctly. After reading a little about that topic, I see that it is an area of obvious interest to many nerds.
That being said, there are plenty of topics Slashdot poorly reports on which are not of interest to any segment of nerd culture, at least not beyond the overlap between nerd culture and the mainstream news where we already read the same information three days earlier except through the words of a literate, competent reporter with real editing before it hit the press.
Register VMs have been coming back into vogue at least since the Dis VM in Inferno, the Bell Labs follow-on to Plan 9. The Bell Labs developers wrote a paper about Dis which effectively stated that, "yo, yo, register VMs are faster out-of-the-box, easier to optimize, and easier to translate to native code".
By the time Dalvik came along (10 years after the Dis paper) it was already conventional wisdom that register-based VMs were the obvious choice for performance. But most projects still use stack-based VMs because they're easier to implement and easier to generate code for.
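To make the trade-off concrete, here is a rough Python sketch of the same expression, (a + b) * c, compiled for two toy VMs. The instruction sets are invented for this example and do not correspond to Guile, Lua, Dalvik, or Dis. The stack version needs explicit loads because operands are implicit; the register version names its operands and its result in each instruction.

```python
# Toy VMs for illustration only; the opcodes are hypothetical.

def run_stack(program, env):
    """Run a stack-machine program; operands live on an implicit stack."""
    stack = []
    for op, *args in program:
        if op == "load":
            stack.append(env[args[0]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown stack op: {op}")
    return stack.pop()


def run_register(program, registers):
    """Run a register-machine program; every operand and result is named."""
    regs = dict(registers)  # copy so the caller's mapping is not mutated
    for op, dst, src1, src2 in program:
        if op == "add":
            regs[dst] = regs[src1] + regs[src2]
        elif op == "mul":
            regs[dst] = regs[src1] * regs[src2]
        else:
            raise ValueError(f"unknown register op: {op}")
    return regs


# (a + b) * c for each machine.
STACK_PROGRAM = [("load", "a"), ("load", "b"), ("add",), ("load", "c"), ("mul",)]
REGISTER_PROGRAM = [("add", "t", "a", "b"), ("mul", "result", "t", "c")]

if __name__ == "__main__":
    values = {"a": 2, "b": 3, "c": 4}
    print(len(STACK_PROGRAM), run_stack(STACK_PROGRAM, values))
    print(len(REGISTER_PROGRAM), run_register(REGISTER_PROGRAM, values)["result"])
```

Fewer instructions means fewer trips through the interpreter's dispatch loop, but the register compiler has to pick names (registers) for every temporary, which is the "harder to generate code for" part.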
Lua 5.0 (of almost 10 years ago) was register-based and the developers wrote a paper about it: http://www.lua.org/doc/jucs05.pdf
Note that Lua (not just LuaJIT) blows the pants off of Python and Ruby in terms of performance, and it's largely because of their efficient VM. Lua has full lexical closures, coroutines, GC, tail calls, extensible metatypes, etc, yet it's ridiculously fast for a purely interpreted language.
Here's the Inferno Dis paper: http://doc.cat-v.org/inferno/4th_edition/dis_VM_design