New Intermediate Language Proposed

Didn't we do this once before? by Uzik2 · 2003-12-26 10:34 · Score: 4, Interesting

I recall a system based on UCSD Pascal. You would
write an interpreter on your target hardware that
would run the Pascal p-code. It was supposed to
solve all sorts of problems. Except it was slow.
Nobody would write anything for it, I guess
because they didn't like Pascal, or UCSD didn't
fire anybody's imagination with the product.
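
For readers who never saw a p-code system, the idea can be sketched in a few lines. This is only an illustration: the opcode names (PUSH, ADD, MUL, HALT) and the run function are made up here and are not the real UCSD p-code instruction set. The point is that every instruction goes through a software dispatch step, which is where the slowness came from.

```python
# Hypothetical mini "p-code": a list of (opcode, operand) pairs for a stack machine.
# These opcodes are invented for illustration, not the real UCSD p-code set.
def run(program):
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        # Every instruction pays for this dispatch chain in software.
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            break
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return stack

# (2 + 3) * 4 expressed in the made-up p-code
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("HALT", None),
]

result = run(PROGRAM)
```

Porting this to new hardware means rewriting only run, not the programs, which was the selling point.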

I don't see why we need to go through this again.
If you need performance write it in assembler or
use nicely optimized C. If you don't then an
interpreted scripting language will usually
suffice. What's the benefit to yet another
layer of abstraction?

--
-- Programming with Boost is like building a house with Lego. It's cool, but I wouldn't want to live in it

Re:Didn't we do this once before? by VertigoAce · 2003-12-26 11:38 · Score: 5, Informative

One interesting feature of .Net is language interoperability. Someone can write a class in VB.NET and I can inherit from that class in C++ just the same as if the original was written in C++. Sure there were ways of doing this before, but you generally had to treat other components in a different way from stuff written and compiled in your current project's language.

A more typical usage would be to write anything that needs better performance or that needs access to non-.NET libraries in C++ (since it can be compiled to machine code before distributing) and then use that component in other languages that are easier for putting a GUI together. Again, it's always been possible to do stuff like this, but .NET makes it seamless (it's just like linking to any other library).

Re:Didn't we do this once before? by angel'o'sphere · 2003-12-26 11:41 · Score: 4, Insightful

I recall a system based on UCSD Pascal.

So do I :-)

Except it was slow.

Well, on my Apple ][ it produced the fastest code after assembler. It was only overtaken when Z80 coprocessors with CP/M and Turbo Pascal came into vogue.
I really did a lot of programming in UCSD Pascal, and for a long time UCSD p-code was the most widespread operating system/virtual machine.

If you need performance write it in assembler or
use nicely optimized C.

Assembler loses all higher-level abstractions, like inheritance, interface implementation, class relationships (relations, aggregations and compositions), and thread synchronization. The same is true for C; on top of that, C cannot express higher-level concepts even at the source level. You might as well use assembler instead of C.
How do you optimize assembler? The operating system, the nonexistent but hypothetical VM, the loader, the processor: none of them can optimize "assembler". I mean: in Java byte code I have all the higher-level abstractions of the system inspectable via reflection etc. In assembler I have nothing.
New bytecodes could express more higher-level information, e.g. parallelization, or even address this problem: consider you have a CPU server, consider you have code migrating to your server, consider you want to trust that code, consider the "owner" of the code does not want to trust you .... So you need a VM on your CPU server that is able to execute encrypted bytecode, so that you, as owner of the CPU, don't see what the code is calculating. BUT you, as the CPU server, don't want your system compromised, or the code of other clients compromised, by any piece of code.
Or consider this: you want byte code as a mobile agent, similar to the scenario above, that is allowed to replicate over a GRID, but only under certain restrictions.
You want to optimize every replica at the VM where it is finally executed, to make optimal use of the resources at that point. How do you do that in "assembler"?
Modern byte codes will likely be even closer to the constructs of the high-level languages than current byte code is: resource allocation, object creation, class loading, higher-level concepts like delegation, parallelism, synchronization (probably on multiple mutexes), serialization, distributed (pervasive) computing, probably OODB support built in, probably a lightweight EJB-like execution environment, probably a 4-level hierarchy of VM, meta container, container and executed code ... probably where the VM is itself only "executed" code inside a meta container. That means modern VMs will probably move core VM features like garbage collection and thread scheduling out of the VM into a library, and every piece of code may "class load" its own garbage collection scheme. Consider different garbage collectors per thread rather than per VM.
Well, I could continue for a day with improvements ....

What's the benefit to yet another
layer of abstraction?

The benefit is that you optimize on that layer of abstraction and then project/generate/assemble the optimization down onto the machine layer (or the next lower layer).

angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

Re:Didn't we do this once before? by voodoo1man · 2003-12-26 13:21 · Score: 4, Insightful

Sure, as long as your class looks just like a C# one. Need multimethods, dynamic class redefinition, method combination, a non-crippled model of multiple inheritance, or maybe even prototypes? You're out of luck, because for this interoperability to work, your classes will either have to be C# classes or you have to make them look like ones, and .NET doesn't give you a Meta Object Protocol to do it.

--
In the great CONS chain of life, you can either be the CAR or be in the CDR.

Re:Didn't we do this once before? by cakoose · 2003-12-26 22:48 · Score: 4, Insightful

I think what he's saying is that the syntax isn't the only thing that defines a language. A language's type system probably plays a more important part in defining how the language works.

With .Net, it may seem like you have a lot of interoperating languages, but they're all basically the same language with different superficial characteristics. VB developers complain about how VB.Net is totally different from previous versions of Visual Basic. It's because they gutted its internals and implanted C#. I wouldn't be able to tell the difference because I see similar syntax, but someone who really knows the language will detect a different core.

That's not to say that different type systems cannot be emulated. Nice is a language with Java-like syntax but with a much better type system (among other things) and it still runs on an ordinary JVM. However, any interoperability will have to be at the level of the lowest common denominator. If you want to call Nice code from Java, your interface ends up losing or having to give up some power.

You really can't even share libraries between truly different languages. The STL just doesn't fit into the Java/C#-style type systems (though generics are a step towards accommodating the STL). Perl libraries are also distinct. Imagine dealing with a Haskell-style lazy list in your C# code. It just won't feel right.

XML ? by noselasd · 2003-12-26 10:36 · Score: 4, Funny

Did I see XML and performance in the same sentence ?! ... brain overload.. does not make sense...

Re:XML ? by benjamindees · 2003-12-26 10:47 · Score: 5, Funny

I saw that too.
Then I saw Posted by michael and everything was better.

--
"I assumed blithely that there were no elves out there in the darkness"

but the biggest question is... by b17bmbr · 2003-12-26 10:37 · Score: 4, Funny

The effort is part of a government-sponsored program under which the three companies are competing to design a petascale-class computer by 2010.

Will Sun survive until then?

--
My problem? I was perfectly gruntled, until some numbnuts came by and dissed me.

GCC by norwoodites · 2003-12-26 10:39 · Score: 4, Insightful

Sun should have invited us GCC developers to help out with this too, because most of us want a way to do intermodule optimizations, but we have the FSF looking over our shoulder on how we implement it. Right now (on the mainline) you have to compile all the source files at the same time to get IMA to work correctly, and you have to tell it to produce a .o file first.

Buzzword compliance by ajs · 2003-12-26 10:41 · Score: 4, Interesting

I really hope the author's smiley was to indicate that he understood that his string of buzzwords was meaningless.

What I hope is that Sun takes a good, long look at the only intermediate assembly that has been designed with language neutrality in mind, Parrot. While this article is over 2 years old, it's a decent starting point. Parrot has already been used to implement rudimentary versions of Perl 5, Perl 6, Python, Java, Scheme and a number of other languages. The proof of concept is done, and Sun could start with a wonderfully advanced next generation byte code language if they can avoid dismissing Parrot as "a Perl thing" with their usual disdain for things "not of Sun".... IBM, on the other hand, is usually more open to good ideas.

Re:Buzzword compliance by Elian · 2003-12-26 12:54 · Score: 4, Informative

Nah, we put that in to not scare people. Parrot is, for all intents and purposes, completely independent from Perl 6 and has been for ages (well before that article was written). While we're going to put in anything we need to make Perl 6 run on Parrot, the same can be said of anything we need to run Python and Ruby. (Which has already happened, FWIW.) The only difference is that Matz and Guido haven't asked for anything yet...

Reminds me of the old quote... by Waffle+Iron · 2003-12-26 10:50 · Score: 5, Insightful

"There is no problem in computer science that cannot be solved by adding another layer of indirection."

Re:What's the point? by basking2 · 2003-12-26 10:54 · Score: 5, Interesting

This is a good question to ask.

So, one of the ideas behind C# was to make an intermediate language (MS-Java-byte-code, if you will) which could be quickly compiled for the CPU in question. Stick a system call environment and garbage collector around it and you have [roughly] what C# is. One of the nice things about Java was that it was for no specific machine... it was very very simple at the instruction level, but making native code from that can be a pain.

Now, from the looks of the posted article, some folks want an intermediate language that can represent concepts like instruction vectorization and maybe SMP (hyper-threading) and perhaps some other more complicated constructs that Java's machine code just doesn't talk about.

The end result is that you would have very fast machine code for the number-crunching loops in the code, plus portability. The compile time would be fairly quick, and the optimization for the local CPU would be "smart" and fast if you marked up which instructions were vectorizable.

Why C# falls short, I can't say. I've only looked at the Java machine, never at how C# represents a program.

Hope this is helpful!

--
Sam

Re:What's the point? by plastik55 · 2003-12-26 11:09 · Score: 4, Insightful

Because a well-designed intermediate language will help optimization. Being somewhat higher-level than raw machine code, not yet having to worry about the specific details of registers and pipelining, makes it easier to perform higher-level optimizations because the IL can be more easily analyzed. And when you compile from IL to the target you will have just the same opportunities for platform-specific optimizations as if you had compiled straight from the source language.

The other benefits of using an IL are manifold. New languages can be implemented without having to write a compiler for each platform. New architectures can be supported without having to write compilers for each language.

--

I have a positive modifier on Troll. When I mod someone Troll their karma should go UP!

Re:What's the point? by iabervon · 2003-12-26 11:21 · Score: 5, Insightful

All good compilers use at least one intermediate language. It's practically impossible to do good optimizations otherwise, even on a single platform. For example, you want to inline functions if that would improve performance, but determining whether it improves performance means looking at things like register allocation, which depends on things like the machine code implementation of complex expressions; however, inlining a function needs to be done with the higher-level information about flow control and the structure of the function call. So you basically can't do any of the interesting optimizations without a good intermediate language.

Furthermore, getting from the high-level language to the intermediate language is cross-platform, which means that any optimizations done at this level are then available to all of the code generators for different platforms; this code is reused across back-ends. It also means that you can support multiple front-ends with the same back-end, and make your C++ and Java automatically compatible by virtue of sharing an intermediate language, and they also both benefit from the same architecture-specific back-end.

There's no reason that having an intermediate language means that you'll stop compiling at that level and use an interpreter for the intermediate language to run the program. In fact, gcc always compiles its intermediate language into machine code, and it can compile Java bytecode into machine code as well. Modern JVMs compile the bytecode into native machine code when the tradeoff seems to be favorable, and they can do optimizations at this point that a C compiler can't do (such as inlining the function that a function pointer usually points to).
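
The function-pointer inlining trick mentioned above can be sketched roughly as follows. This is a toy model, not how any particular JVM implements it: make_specialized_call_site and the helper functions are hypothetical names, and a real JIT does this on generated machine code using profile data. The shape is a cheap guard in front of an inlined body, with a fallback to the ordinary indirect call.

```python
# Toy model of guarded (speculative) inlining. All names here are hypothetical.
def make_specialized_call_site(expected_target, inlined_body):
    """Build a call site specialized for the callee that profiling saw most often."""
    def call_site(fn, x):
        if fn is expected_target:   # cheap guard: is it the usual callee?
            return inlined_body(x)  # fast path: callee's body pasted in place
        return fn(x)                # slow path: ordinary indirect call
    return call_site

def square(x):
    return x * x

def negate(x):
    return -x

# Pretend the profiler observed that this call site almost always calls square.
call_site = make_specialized_call_site(square, lambda x: x * x)

usual = call_site(square, 7)   # takes the inlined fast path
rare = call_site(negate, 7)    # guard fails, falls back to a normal call
```

A static C compiler generally cannot do this because it has no idea which target the pointer usually holds; the runtime does.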

An intermediate language essentially pushes more of the skill into the optimizing compiler, because the same optimizing compiler can be used for more tasks. Also, if the compiler is used at runtime, it can optimize based on profiling the actual workload on the actual hardware. This is especially important if, for example, IBM decides to distribute a single set of binaries which should run optimally on all of their hardware; you run the optimizer with the best possible information.

Re:What's the point? by grotgrot · 2003-12-26 11:45 · Score: 4, Interesting

What's wrong with making a good compiler that writes directly to machine code?

Because that doesn't give you the best performance. Machine code represents an exact processor implementation. Tradeoffs have to be made with backwards compatibility (e.g. Red Hat is compiled for Pentium), expected cache sizes (optimising size vs performance), processor specifics (Itanium has 3 instructions per bundle, SPARC has one delay-slot instruction after a branch) etc.

While it is true that you could compile for an exact machine, it is a horrible way of trying to ship stuff to other people, and it does require recompilation if anything changes. (The former is why Red Hat pretty much picks base Pentium: if they didn't, they would need 5 or so variants of each package just in the Intel/AMD space. Granted, they do supply a few variants of some packages, but not everything, and Gentoo people can confirm that doing everything does help.)

Using IL lets the system optimise for the exact system you are running at the point of invocation. It can even make choices not available at compile time. For example if memory is under pressure it can optimise for space rather than performance.

It also allows for way more aggressive optimisation based on what the program actually does. While whole program optimisation is becoming available now (generally implemented by considering all source as one unit at link time), that still doesn't address libraries. At runtime, bits of the standard libraries (e.g. UI, networking) can be more optimally integrated into the running program.

Machine code also holds back improvements. For example they could have made an x86 processor with double the number of registers years ago. If programs were using IL, a small change in the OS kernel and suddenly everything is running faster.

Needless to say, using IL aggressively is not new. To see it taken to the logical conclusion, look into the AS/400 (or whatever letter of the alphabet IBM calls it this week). I highly recommend Inside the AS/400 by Frank Soltis.

If you've not implemented parallel code X-Arch by fw3 · 2003-12-26 12:01 · Score: 4, Informative

Then you (like the pos(t)er of this article and most of the comments) probably don't follow what the value is here.

Maintaining high performance code across CPU architectures is bad enough (and I know of some supercomputing centers which are continuing with technically inferior AMD64/Xeon clusters rather than switch to PPC970 precisely because they know they can't afford to re-optimize for that arch).

Factor in that today most numerically intensive code is still written in FORTRAN because competing languages simply can't be as easily optimized.

Now let's think about SMP: while POSIX threads are portable, the best performance probably requires different threading code depending on arch/Unix variant. (And of course NPTL for Linux is still in CVS.)

Now let's think about massively parallel, where inter-cpu communication will be handled a bit differently on every platform.

So the payoffs to developing an efficient cross-platform language layer are pretty substantial. (Which does not imply that I expect IBM to jump on to Sun's bandwagon on this :-))

--
Linux is Linux, if One need clarify their dist: <Dist>/GNU Linux
bsds are of course just BSD

too little, too late by penguin7of9 · 2003-12-26 12:28 · Score: 5, Interesting

The effort is part of a government-sponsored program under which the three companies are competing to design a petascale-class computer by 2010.

We already have such a runtime: it's called "CLR". The CLR is roughly like the JVM but with features required for high performance computing added (foremost, value classes).

Sun wants the so-called Portable Intermediate Language and Run-Time Environment to become an open industry standard.

I hope people won't fall for that again. Sun promised that Java would be an "open industry standard", but they withdrew from two open standards institutions and then turned Java over to a privately run consortium, with specifications only available under restrictive licenses.

Sun's goal is to apply its expertise in Java to defining an architecture-independent, low-level software standard - like Java bytecodes - that a language could present to any computer's run-time environment.

Sun's "expertise" in this area is not a recommendation: the JVM has a number of serious design problems (e.g., conformant array types, arithmetic requirements, lack of multidimensional arrays) that attest to Sun's lack of expertise and competence in this area.

What this amounts to is Sun conceding that Java simply isn't suitable as a high-performance numerical platform and that it will never get fixed (another broken promise from Sun). But because the CLR actually has many of the features needed for a high-performance numerical platform, Sun is worried about their market share.

The question for potential users is: why wait until 2010 when the CLR is already here? And why trust Sun after they have disappointed the community so thoroughly, both in terms of broken promises on an open Java standard and in terms of technology?

Maybe we will be using a portable high-performance runtime other than the CLR by 2010, but I sure hope Sun will have nothing to do with it. (In fact, I think there is a good chance Sun won't even be around then anymore.)

Re:What's the point? by Lazy+Jones · 2003-12-26 12:36 · Score: 4, Interesting

Because a well-designed intermediate language will help optimization

All good compilers already use well-designed intermediate languages. A general intermediate language that aims to be equally suitable for many high level languages will most likely be inferior to the best intermediate language for a particular high level language.

The other benefits of using an IL are manifold. New languages can be implemented without having to write a compiler for each platform.

Great. Just what we need - another of those braindead technological "advances" like human-readable data interchange formats that makes life easier for a few developers (simpler, cheaper compiler development) and harder for millions of users (worse performance). Frankly, the only advantage for the rest of us I can think of would be the higher probability of the resulting tools being mostly bug-free.

--
"I love my job, but I hate talking to people like you" (Freddie Mercury)

Re:Next try? by kingkade · 2003-12-26 12:45 · Score: 4, Informative

Ok, so now that Java is on the retreat they try to enter a new area?

It's probably because there's no Java user community or useful implementations out there. And it has virtually no practical application on the desktop for that matter. Maybe because it doesn't do 3D or sound. Or is not so useful as far as scalable RDBMS abstraction or a real application server for the enterprise. Maybe they need to move into the mobile market. What's really needed is a good Java IDE to get developers on board. Changes should be driven by the software community and making the source open would help as well. Sun should also be making improvements in Java's next(?) version.

You're right, I guess "we" should just cut our losses.

--
why run from Vincenzo?

Re:Next try? by bckrispi · 2003-12-26 13:09 · Score: 5, Insightful

Java is on the retreat??? Wow, I've been gainfully employed as a Java architect for the past five years; it musta' been a fluke. IBM, Oracle, Novell, et al must not know what they're doing by investing millions in building their products around the Java platform. Come to think of it, there are sooo many alternatives to Java for enterprise, server-side computing. Thank you for your insight. I'll turn in my resignation and pick up a .Net book tomorrow.

--
Xenon, where's my money? -Borno

Re:What's the point? by Mike+McTernan · 2003-12-26 13:18 · Score: 5, Insightful

In terms of compiler optimisation, the higher the language the better. Strict typing and a language that allows the compiler to infer more about the call tree should enable better global optimisation. Lower level languages suffer from the problem that the programmer is explicitly describing how to do something, and not what it is trying to do; thus the compiler can just unroll loops and perform peephole optimisations.

If a language were sufficiently high-level that you could describe to the compiler that you were implementing a recursive function (e.g. shell sort), the compiler should then be able to perform fold-unfold optimisation and convert the code into a more efficient tail-iterative function. Fans of Haskell and similar languages might recognise this. Some C compilers will convert recursion to iteration where possible, but this is only in simple cases.
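
As a rough illustration of the recursion-to-iteration rewrite, here is the simple case done by hand. The function names are made up for this sketch, and it shows only the mechanical transformation of a tail call into a loop, not the full fold-unfold technique.

```python
# Hand-applied tail-call elimination. Function names are illustrative only.
def total_recursive(items, acc=0):
    """Tail-recursive sum: the recursive call is the last thing the function does."""
    if not items:
        return acc
    return total_recursive(items[1:], acc + items[0])

def total_iterative(items, acc=0):
    """Same function after the rewrite: the tail call becomes a parameter
    update followed by a jump back to the top of the loop."""
    while items:
        items, acc = items[1:], acc + items[0]
    return acc

same = total_recursive([1, 2, 3, 4]) == total_iterative([1, 2, 3, 4])
```

Because the recursive call is in tail position, no work is pending after it returns, so the stack frame can be reused. The recursive version needs one frame per element; the loop needs one in total.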

The fact is that today, even as C has reached maturity and as high level as it is, there are still some optimisations that are impossible because of subtleties of the language. For example, multiple pointers may point to the same memory, but depending on how the pointers are assigned, the compiler has no idea that this is the case, and has to follow the code in a literal fashion.
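
The aliasing problem can be shown in miniature. The sketch below uses one-element Python lists as stand-ins for C pointers (an assumption made purely for illustration; the function names are invented). The "optimized" version reads the source once, which is only valid if the two arguments never refer to the same memory.

```python
# One-element lists stand in for C pointers; names are illustrative only.
def add_twice(dst, src):
    """Literal version: re-reads src[0] each time, as the source code says."""
    dst[0] += src[0]
    dst[0] += src[0]

def add_twice_cached(dst, src):
    """'Optimized' version: reads src[0] once. Only valid if dst is not src."""
    s = src[0]
    dst[0] += 2 * s

# No aliasing: both versions agree.
a, b = [1], [10]
add_twice(a, b)
c, d = [1], [10]
add_twice_cached(c, d)
agree_when_distinct = (a == c)

# Aliasing: dst and src are the same object, and the results differ.
e = [1]
add_twice(e, e)          # 1 -> 2 -> 4
f = [1]
add_twice_cached(f, f)   # 1 -> 3
differ_when_aliased = (e != f)
```

Unless the compiler can prove the two never alias, it has to emit the literal version.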

My personal view is that languages like Java still have a lot to offer. I would like to see a lot more investment in the compiler to perform better optimisations, and would also like to see a compile-on-install system for Java like C#'s; if I run an application it would at least be nice if the compiled parts were cached somewhere. This, I believe, could yield good performance gains. It's interesting that Sun's Server HotSpot VM actually performs more optimisation when compiling a class than the Client VM; because of the extra time taken to load and compile a class, the Client VM omits some optimisation techniques to favour speedier loading. I guess this decision is to make GUIs more responsive and reduce app load times; compile at install would remove this constraint. We should be going to higher-level languages, not lower, and concentrate on getting the compiler correct.

--
-- Mike

Compilers 101 by p3d0 · 2003-12-26 14:14 · Score: 4, Informative

I don't understand your question, but if you're asking why we need intermediate languages, then I can answer that.

Imagine N high-level languages and M target platforms. A naive approach would wind up creating NxM separate compilers.

Intermediate languages (ILs) allow you to write N "front-ends" that compile the N high-level languages to the IL, and M "back-ends" that compile from the IL to the M target platforms. So rather than needing NxM compilers, you only need N+M.

Even more significant is the optimizer. Front-ends and back-ends are relatively straightforward, but optimizers are very hard to write well. In the naive approach, you need NxM optimizers. With an IL, you only need one. The front-end translates to IL; the optimizer transforms IL to better IL, and the back-end translates to native code.
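
The N+M arrangement can be sketched with two toy front-ends, one shared optimizer, and two toy back-ends. Everything here is invented for illustration: the tuple-based IL, the function names, and the tiny expression languages are assumptions, not anything from the article.

```python
import ast

# Hypothetical IL: nested tuples ("const", n), ("var", name),
# ("add", left, right), ("mul", left, right).

def front_end_infix(source):
    """Front-end 1: Python-style infix, e.g. "2 * 3 + x"."""
    def lower(node):
        if isinstance(node, ast.Constant):
            return ("const", node.value)
        if isinstance(node, ast.Name):
            return ("var", node.id)
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            op = "add" if isinstance(node.op, ast.Add) else "mul"
            return (op, lower(node.left), lower(node.right))
        raise ValueError("unsupported syntax")
    return lower(ast.parse(source, mode="eval").body)

def front_end_rpn(source):
    """Front-end 2: reverse Polish notation, e.g. "2 3 * x +"."""
    stack = []
    for tok in source.split():
        if tok in ("+", "*"):
            right, left = stack.pop(), stack.pop()
            stack.append(("add" if tok == "+" else "mul", left, right))
        elif tok.isdigit():
            stack.append(("const", int(tok)))
        else:
            stack.append(("var", tok))
    return stack.pop()

def optimize(il):
    """The single shared optimizer: constant folding on the IL."""
    if il[0] in ("const", "var"):
        return il
    op, left, right = il[0], optimize(il[1]), optimize(il[2])
    if left[0] == "const" and right[0] == "const":
        value = left[1] + right[1] if op == "add" else left[1] * right[1]
        return ("const", value)
    return (op, left, right)

def back_end_stack(il):
    """Back-end 1: instructions for a made-up stack machine."""
    if il[0] == "const":
        return [("PUSH", il[1])]
    if il[0] == "var":
        return [("LOAD", il[1])]
    return back_end_stack(il[1]) + back_end_stack(il[2]) + [(il[0].upper(), None)]

def back_end_c(il):
    """Back-end 2: a C-like expression string."""
    if il[0] in ("const", "var"):
        return str(il[1])
    sym = "+" if il[0] == "add" else "*"
    return "(%s %s %s)" % (back_end_c(il[1]), sym, back_end_c(il[2]))

il = optimize(front_end_rpn("2 3 * x +"))
```

Both front-ends produce the same IL for the same expression, so the one optimize function serves all four language/target combinations.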

In summary, to answer one of your questions:

What's wrong with making a good compiler that writes directly to machine code?

Every optimizing compiler uses an IL anyway. These companies, I presume, are simply agreeing to use the same IL across their products (though I'm only guessing because the article is slashdotted).

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Slashdot Mirror
