Low Level Virtual Machine 1.3 Released
RSpencer writes "The Low Level Virtual Machine project has released version 1.3. There are full release notes available. LLVM is a source-language agnostic toolkit for building compilers, optimizers, and JIT or interpreted virtual machines. LLVM provides extensive optimization support, three mid-level IR formats (bytecode, assembly, and C++), three backend targets (x86, SPARC, PPC), full documentation, and a very simple and unique design. This new toolkit approach to compiler-related tools is quickly attracting new developers who are making significant contributions to the work. Visit the home page where you can learn all the details. LLVM is funded by the National Science Foundation, MARCO/DARPA, and supported by UIUC's Computer Science department and other developers."
LLVM is a very young project (only 3 years old) but has already made dramatic progress in that time. Check out the status updates on the left-hand side of the main site to see the rate of progress.
Building a full C/C++ compiler is no small feat!
-Chris
So...now we have various implementations of the Java VM, the .NET VM, Parrot, and LLVM, plus various emulators of real machines, and let's not forget the real machines themselves.
What I would like to know is how they all compare. How fast does a typical program run? How portable is the implementation; how easily can the bytecode be transformed to native code for various architectures? How easy is it to target this machine? How well does the machine cope with various programming languages (esp. Common LISP)? How stable (backward compatible) is the bytecode? What are the licensing terms? Does it communicate with the host system, and how well? Etc...
Please correct me if I got my facts wrong.
If I ever do build a satisfactory parser with Bison, I wonder how it would interface with LLVM. I tried converting a toy Bison parser to C++ and it seemed like there were some rough edges.
The casual Slashdot reader may roll their eyes when they see yet another Virtual Machine, but this project is much more than that. It's a complete compiler infrastructure project that will one day surpass GCC. Why? Because it's around ten times easier to understand and written in a modern language (C++) rather than C. An expert C++ programmer could start contributing code to LLVM in under a month, whereas the equivalent learning curve for GCC is at least a year. Writing new compiler passes or complete language front ends for LLVM is very straightforward, for example. The LLVM AST has the advantage of strong typechecking instead of arcane tree macros as in GCC. LLVM is not burdened with the legal or philosophical wrangling of GCC, where they do NOT WANT their compiler to be the backend of a commercial compiler and try their damnedest to obscure and change the programming API from release to release. The GCC "Toy" example language has not worked in many releases for this very reason.
GCC recently voted down using C++ in their core code. Perhaps LLVM at the very least will drag GCC into the modern age due to competition.
The VM part of LLVM is just icing on the cake.
(And yes, I am aware that LLVM uses GCC 3.4's C and C++ front-end code. That's a good thing for the short term. Perhaps longer term they will develop their own front-ends.)
How complete is the API? The power of the Java and .NET VMs (I don't know Parrot well enough to comment) is their standard libraries -- perhaps to a larger extent than the bytecodes themselves.
Array bounds checking is not new. Dynamically loading code isn't new. What was new was the creation of a standardized toolkit and API that handled threading and network I/O and GUI and database access and XML parsing and... You get the picture.
Another portable VM holds little value for me if in the end I just end up back in the "good ol' days" of C where you were given a hammer and told to build a house. POSIX isn't enough.
- I don't need to go outside, my CRT tan'll do me just fine.
The very best trolls always start with a grain of truth. (LLVM is much easier to understand than GCC. The GCC infrastructure is very baroque, dating from a time when assuming the presence of an ANSI C bootstrap compiler was too much. One of the major LLVM guys has presented his toolchain work at the annual GCC Summit, and maintains close communication with the rest of the GCC team -- and we wish him well. All very true; no GCC hacker would say any less.)
The trolls then move on into wild exaggerations and complete lies. Such as:
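"...they do NOT WANT their compiler to be the backend of a commercial compiler and try their damnedest to obscure and change the programming API from release to release."
Pure malicious bullshit. RMS doesn't want proprietary backends to be able to read the GCC IR, and so we don't ever write it out in a machine-readable format. But we've never gone out of our way to obfuscate the internal API.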
"GCC recently voted down using C++ in their core code."
Again, a complete lie. We asked RMS whether we could make use of C++ in parts of the compiler. While a skilled and brilliant C and LISP hacker, RMS is a reactionary fuckhead when it comes to anything other than C or LISP. In his opinion, all GNU programs should be written in C, and only C, ever.
There was no vote.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
"RMS doesn't want proprietary backends to be able to read the GCC IR, and so we don't ever write it out in a machine-readable format."
Then open source backends won't be able to read it either, but that's apparently okay with RMS, given his priorities. Since only REALLY good commercial software would have a chance against a no-cost incumbent, he is willing to keep great open source alternatives from becoming available in order to keep great commercial alternatives from becoming available.
This is the guy who proclaims that ALL commercial software developers are "unethical".
Open Source to me is about openness: the source is yours to use as you wish, the code can be scripted externally via a command-line interface, it can be incorporated as a library into your own code with minimal effort through a nice API intentionally designed for that purpose, it has a modular architecture that encourages others to compete at the replacement-module level without having to rebuild the whole app, etc. In short, all the ways code can be opened: the fewest restrictions and the greatest usability.
This is not RMS's agenda. RMS has made his priorities clear. He has never claimed to support the "open source movement", only to be somewhat allied with it to the extent it supports his own anti-commercial software movement.
"We asked RMS whether we could make use of C++ in parts of the compiler. While a skilled and brilliant C and LISP hacker, RMS is a reactionary fuckhead when it comes to anything other than C or LISP."
Having heard him speak on many occasions over the years, it's my impression that this characterization is correct and "anything other" applies to more than just programming languages, though not to everything.
I think RMS is right on in many (but not all) of his complaints about IP laws. He's a very bright and skilled guy with a lot of great ideas, but his old-fashioned leftist political philosophies are more "anti" than they are "pro" and handicap the usefulness of his products in some unfortunate ways, and that appears to include GCC.
To the extent that LLVM and other technologies compete with GCC, I'm all for it.
"Those who have never entered upon scientific pursuits know not a tithe of the poetry by which they are surrounded."