New Intermediate Language Proposed
WillOutPower writes "Sun is inviting Cray (of supercomputer fame) and IBM (needs no introduction...) to join and create a new intermediate run-time language for high-performance computing. With Java's bytecode, Java Grande, and Microsoft's IL for the Common Language Runtime, it seems a natural progression. I wonder if the format will be in XML? Does this mean ubiquitous grid computing? Maybe now I won't have to write my neural network in C for performance :-)"
"There is no problem in computer science that cannot be solved by adding another layer of indirection."
All good compilers use at least one intermediate language. It's practically impossible to do good optimizations otherwise, even on a single platform. For example, you want to inline functions if that would improve performance, but determining whether it does means looking at things like register allocation, which depends on things like the machine code implementation of complex expressions; the inlining itself, however, needs to be done with the higher-level information about flow control and the structure of the function call. So you basically can't do any of the interesting optimizations without a good intermediate language.
Furthermore, getting from the high-level language to the intermediate language is cross-platform, which means that any optimizations done at this level are then available to all of the code generators for different platforms; this code is reused across back-ends. It also means that you can support multiple front-ends with the same back-end, and make your C++ and Java automatically compatible by virtue of sharing an intermediate language, and they also both benefit from the same architecture-specific back-end.
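To make the front-end/back-end sharing concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the "IR" is just nested tuples, the two front-ends parse postfix and prefix arithmetic, and the two back-ends emit a toy stack-machine listing and an infix string. No real compiler's IR looks like this; the point is only that one constant-folding pass written against the IR serves every combination.

```python
# Toy intermediate representation (hypothetical): nested tuples such as
# ("const", 3), ("var", "x"), ("add", left, right), ("mul", left, right).

def parse_rpn(text):
    """Front-end 1: postfix notation, e.g. '2 3 + x *'."""
    stack = []
    for tok in text.split():
        if tok in ("+", "*"):
            right = stack.pop()
            left = stack.pop()
            stack.append(("add" if tok == "+" else "mul", left, right))
        elif tok.isdigit():
            stack.append(("const", int(tok)))
        else:
            stack.append(("var", tok))
    return stack.pop()

def parse_prefix(text):
    """Front-end 2: prefix notation, e.g. '* + 2 3 x'."""
    tokens = text.split()

    def walk(pos):
        tok = tokens[pos]
        if tok in ("+", "*"):
            left, pos = walk(pos + 1)
            right, pos = walk(pos)
            return ("add" if tok == "+" else "mul", left, right), pos
        if tok.isdigit():
            return ("const", int(tok)), pos + 1
        return ("var", tok), pos + 1

    tree, _ = walk(0)
    return tree

def fold_constants(node):
    """Optimization written once, at the IR level."""
    if node[0] in ("const", "var"):
        return node
    op, left, right = node[0], fold_constants(node[1]), fold_constants(node[2])
    if left[0] == "const" and right[0] == "const":
        value = left[1] + right[1] if op == "add" else left[1] * right[1]
        return ("const", value)
    return (op, left, right)

def emit_stack(node):
    """Back-end 1: instructions for an imaginary stack machine."""
    if node[0] == "const":
        return [f"push {node[1]}"]
    if node[0] == "var":
        return [f"load {node[1]}"]
    return emit_stack(node[1]) + emit_stack(node[2]) + [node[0]]

def emit_infix(node):
    """Back-end 2: a parenthesized infix expression."""
    if node[0] in ("const", "var"):
        return str(node[1])
    sym = "+" if node[0] == "add" else "*"
    return f"({emit_infix(node[1])} {sym} {emit_infix(node[2])})"
```

Both `parse_rpn("2 3 + x *")` and `parse_prefix("* + 2 3 x")` produce the same tree, so `fold_constants` and either emitter work unchanged no matter which front-end the program came from.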
There's no reason that having an intermediate language means that you'll stop compiling at that level and use an interpreter for the intermediate language to run the program. In fact, gcc always compiles its intermediate language into machine code, and it can compile Java bytecode into machine code as well. Modern JVMs compile the bytecode into native machine code when the tradeoff seems to be favorable, and they can do optimizations at this point that a C compiler can't do (such as inlining the function that a function pointer usually points to).
An intermediate language essentially pushes more of the skill into the optimizing compiler, because the same optimizing compiler can be used for more tasks. Also, if the compiler is used at runtime, it can optimize based on profiling the actual workload on the actual hardware. This is especially important if, for example, IBM decides to distribute a single set of binaries which should run optimally on all of their hardware; you run the optimizer with the best possible information.
Java is on the retreat??? Wow, I've been gainfully employed as a Java architect for the past five years; it musta' been a fluke. IBM, Oracle, Novell, et al must not know what they're doing by investing millions in building their products around the Java platform. Come to think of it, there are sooo many alternatives to Java for enterprise, server-side computing. Thank you for your insight. I'll turn in my resignation and pick up a .Net book tomorrow.
Xenon, where's my money? -Borno
In terms of compiler optimisation, the higher the language the better. Strict typing and a language that allows the compiler to infer more about the call tree should enable better global optimisation. Lower level languages suffer from the problem that the programmer is explicitly describing how to do something, and not what the program is trying to do; thus the compiler can do little more than unroll loops and perform peephole optimisations.
If a language were sufficiently high-level that you could describe to the compiler that you were implementing a recursive function (e.g. shell sort), the compiler should then be able to perform fold-unfold optimisation and convert the code into a more efficient tail iterative function. Fans of Haskell and similar languages might recognise this. Some C compilers will convert recursion to iteration where possible, but this is only in simple cases.
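For anyone who hasn't seen the recursion-to-iteration rewrite, here's a hand-worked sketch in Python. The function names are invented for the example, and the transformation is applied by hand; CPython does not do this for you. It just shows the three stages an optimiser would move through: plain recursion, an accumulator-passing tail call, and the loop that the tail call collapses into.

```python
def total_recursive(xs):
    """Plain recursion: the addition happens *after* the recursive call returns,
    so every call needs its own stack frame."""
    if not xs:
        return 0
    return xs[0] + total_recursive(xs[1:])

def total_tail(xs, acc=0):
    """Accumulator-passing form: the recursive call is the last thing done,
    so nothing in the current frame is needed afterwards."""
    if not xs:
        return acc
    return total_tail(xs[1:], acc + xs[0])

def total_loop(xs):
    """The tail call rewritten as a jump: rebind the parameters and loop."""
    acc = 0
    while xs:
        acc, xs = acc + xs[0], xs[1:]
    return acc
```

The step from `total_tail` to `total_loop` is purely mechanical, which is why it's the easy case for a compiler; the step from `total_recursive` to `total_tail` relies on knowing that addition is associative, which is the kind of fact a higher-level language makes easier to establish.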
The fact is that today, even though C has reached maturity, and as high-level as it is, there are still some optimisations that are impossible because of subtleties of the language. For example, multiple pointers may point to the same memory, but depending on how the pointers are assigned, the compiler often cannot tell whether this is the case, and has to follow the code in a literal fashion.
My personal view is that languages like Java still have a lot to offer. I would like to see a lot more investment in the compiler to perform better optimisations, and would also like to see a compile-on-install system for Java, like C# has; if I run an application it would at least be nice if the compiled parts were cached somewhere. This I believe could make good performance gains. It's interesting that Sun's Server Hotspot VM actually performs more optimisation when compiling a class than the Client VM; because of the increase in time taken to load and compile a class, the Client VM omits some optimisation techniques to favour speedier loading. I guess this decision is to make GUIs more responsive and reduce app load times; compile at install would remove this constraint. We should be going to higher level languages, not lower, and concentrate on getting the compiler correct.
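Here's a small sketch of why aliasing ties the optimiser's hands. It's in Python, with one-element lists standing in for C pointers (that stand-in is my assumption, as are the function names), but the semantics are the same: the "obviously equivalent" fused version only matches the literal one when the two arguments refer to different storage.

```python
def add_twice(dst, src):
    """What the programmer wrote: two separate read-modify-write steps."""
    dst[0] += src[0]
    dst[0] += src[0]

def add_twice_fused(dst, src):
    """What an optimiser would like to emit: read src once, write dst once.
    Only valid if dst and src never refer to the same storage."""
    dst[0] += 2 * src[0]

def run(fn, aliased):
    """Call fn with either distinct cells or the same cell passed twice."""
    if aliased:
        cell = [10]
        fn(cell, cell)
        return cell[0]
    dst, src = [1], [10]
    fn(dst, src)
    return dst[0]
```

With distinct cells both versions return 21. With the same cell passed twice, the literal version returns 40 (10 becomes 20, then 20 becomes 40) while the fused one returns 30, so a compiler that can't rule out aliasing has to keep the slower literal form.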
-- Mike