New Intermediate Language Proposed
WillOutPower writes "Sun is inviting Cray (of supercomputer fame) and IBM (needs no introduction...) to join and create a new intermediate run-time language for high-performance computing. Given Java's bytecode, Java Grande, and Microsoft's IL language for the Common Language Runtime, it seems a natural progression. I wonder if the format will be in XML? Does this mean ubiquitous grid computing? Maybe now I won't have to write my neural network in C for performance :-)"
What's wrong with making a good compiler that writes directly to machine code? I would think Cray and IBM would be even more inclined to do so, given their control over the hardware their software will run on.
Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
Sun should have invited us GCC developers to help out with this as well, because most of us want a way to do intermodular optimizations, but we have the FSF looking over our shoulder on how we implement it. Right now (on the mainline) you have to compile all the source files at the same time to get IMA to work correctly, and you have to tell it to produce an .o file first.
No wonder we have to keep making faster CPUs just to maintain the same performance. Is Java on a PIII or G4 any faster than hand-optimized assembly code on a 486 or 68030?
Soon we'll need a 10 GHz CPU just to be able to boot tomorrow's OS in less than 5 minutes.
Good god. XML is VERY VERY good for what it was designed for: semantic markup of texts. It is not very good as a straightjacket on a programming language.
The article is very light on details.
Huh?
So, how many languages are being proposed here? A new "low-level" one, plus a higher-level "technical computing language" designed to make the most of the lower-level one? Just what's so special about this new low-level language that requires a specific new language to get the "maximum benefit" out of it? I don't have to write in Java to be able to compile to the JVM bytecode. For that matter, I could write in Java and compile to some other assembly language.
New back-ends ("low-level languages," if I understand the article) are added to GCC all the time. We never needed to add a whole 'nother front-end just for them.
I suspect that the real situation is less weird, and the journalist got confused... or heck, who knows, maybe they're proposing half a dozen new languages. It's Sun, after all.
Odd. I wouldn't have thought you'd need to do that these days anyway.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
"There is no problem in computer science that cannot be solved by adding another layer of indirection."
Language neutral? Perhaps I'm just skimming your linked-to article too quickly, but this is what leapt out of the page at me:
"Parrot is strongly related to Perl 6... Perl 6 plans to separate the design of the compiler and the interpreter. This is why we've come up with a subproject, which we've called Parrot that has a certain, limited amount of independence from Perl 6." [emphasis added]
That certainly doesn't sound like it's been designed with language neutrality in mind. For what it's worth, MS's IL was designed with at least four languages in mind - VB.NET, C#, managed C++ and J#, and a couple of dozen others have been or are being ported to it, including Fortran, Cobol, Haskell, and (iirc) even Perl.
As you say, the article is over two years old, so maybe they've changed their goals since then - but that article at least gives a very strong impression that Parrot is tied intimately in with Perl.
It's official. Most of you are morons.
I thought an open, peer-reviewed, high performance IL/runtime was exactly what Parrot was trying to accomplish.
Haven't used Mozilla recently, have you?
Are you in management?
I would imagine that the data for this new IL would be very uniform, like assembly. Unless they plan on adding gobs of metadata, there is absolutely no need for XML. XML is a very useful tool, but don't forget that if you give someone a hammer then everything will look like a nail.
Would you represent an array of bytes using XML just because you can? This IL will most likely be a sequence of well-defined binary tokens. What would it benefit from XML? Maybe programs/functions/classes can have some XML metadata, but the actual sequence of commands will most likely be a chunk of binary data.
If it was useful/practical to have an IL language specified in XML, then Microsoft would have done so with MSIL already. BTW, don't bother bashing MS about the usefulness and practicality of their products. MS has some brilliant engineers so at least give them some credit for trying to make a decent IL.
I recall a system based on UCSD Pascal. I also :-) Except it was slow. Well, on my Apple ][ it was good for the fastest code after Assembler. It only got overtaken when Z80 coprocessors with CP/M and Turbo Pascal came into vogue. .... So you need a VM on your CPU server, able to execute encrypted bytecode, so that you as owner of the CPU don't see what the code is calculating. BUT you, a CPU server, you don't want your system compromised, or the code of other clients compromised by any piece of code. ... probably where the VM is itself only "executed" code inside of a meta container. That means modern VMs probably will extract core VM features like garbage collection and thread scheduling outside of the VM into a library, and every piece of code may "class load" its own garbage collection schema. Consider different garbage collectors per thread and not per VM. ....
I really did a lot of programming in UCSD Pascal, and for a long time UCSD p-code was the most widespread operating system/virtual machine.
If you need performance, write it in assembler or use nicely optimized C.
Assembler loses all higher-level abstractions, like inheritance, interface implementation, class relationships (relations, aggregations and compositions), and thread synchronization. The same is true for C; besides, at the source level it is not able to express higher-level concepts. You might as well use assembler instead of C.
How do you optimize assembler? The operating system, the non-existent but hypothetical VM, the loader, the processor: none of them can optimize "assembler". I mean: in Java byte code I have all the higher-level abstractions of the system inspectable via reflection etc. In assembler I have nothing.
New bytecodes, able to express more higher-level information, e.g. parallelization, or even this problem: consider you have a CPU server, consider you have code migrating to your server, consider you want to trust that code, consider that the "owner" of the code does not want to trust you.
Or consider this: you want byte code as a mobile agent, similar to the scenario above, but it should be allowed to replicate over a GRID, but only under certain restrictions.
You want to optimize every replica at the VM where it is finally executed, to make the best use of the resources at that point. How do you do that in "assembler"?
Modern byte codes will likely be even closer to the constructs of high-level languages than current byte code is: resource allocation, object creation, class loading, higher-level concepts like delegation, parallelism, synchronization (on multiple mutexes, probably), serialization, distributed (pervasive) computing, probably OODB support built in, probably a lightweight EJB-like execution environment, probably a 4-level hierarchy of VM, meta container, container and executed code.
Well, I could continue for a day with improvements.
What's the benefit to yet another layer of abstraction?
The benefit is to optimize on that layer of abstraction and then to project/generate/assemble the optimization down onto the machine layer (or the next lower layer).
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
For its time, UCSD Pascal was an excellent language and operating system. Its main problems were price and politics, not performance or technical issues. Many people, including myself, wrote software for it. The speed penalty of the p-code interpreter was offset by the compactness of p-code, which was important on the memory-constrained PCs of the time. UCSD Pascal, like other alternative operating systems of the period, could not compete with MS-DOS and PC-DOS, which sold for well under $100, on price.
Mea navis aericumbens anguillis abundat
There are several issues with regard to current programming techniques and grid computing for HPC. Some include:
Java isn't a bad way to offer the capability to run your code on many platforms, but it is easy to write slow code that really doesn't match the HPC speed requirement, although some do use it for HPC. Faster bytecode or JVMs that do even better at optimising bytecode would be a help, but I am not sure if there is enough algorithmic information left in the bytecode to allow the best optimisations on all architectures. Perhaps this is where the new initiative is aimed?
An alternative route is to publish capabilities for processing via web or grid service type mechanisms and then use brokers and discovery services. This would work well for widely used production codes, e.g. charm, fluent, etc
Parrot looks like it will be a nice intermediate language for languages like Python, Perl, and Java. But Parrot lacks the right primitives for an intermediate language for high-performance numerical computing.
Right now the only widely used intermediate language that comes close to being suitable for high-performance numerical computing is Microsoft's CLR (JVM actually still has better implementations, but it lacks important primitives like value classes).
Or just enable automatic overflow into arbitrary precision bignums like lisp has had for several decades now...
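To make the difference concrete, here is a minimal sketch in Python contrasting silent 32-bit wraparound with Lisp-style promotion to a bignum. The function names and the 32-bit fixnum width are assumptions chosen for illustration; real Lisp implementations pick their own fixnum sizes.

```python
# Illustrative only: a 32-bit fixnum width is assumed here.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def add_wrapping(a: int, b: int) -> int:
    """Two's-complement 32-bit addition that silently wraps on overflow."""
    s = (a + b) & 0xFFFFFFFF
    return s - 2**32 if s > INT32_MAX else s

def add_promoting(a: int, b: int):
    """Add exactly, reporting whether the result still fits in a fixnum."""
    s = a + b  # Python ints are arbitrary precision, so this never overflows
    kind = "fixnum" if INT32_MIN <= s <= INT32_MAX else "bignum"
    return s, kind

wrapped = add_wrapping(INT32_MAX, 1)     # wraps around to INT32_MIN
promoted = add_promoting(INT32_MAX, 1)   # exact result, tagged as a bignum
```

The promoting version pays for a range check on every operation, which is presumably why a numerics-oriented IL would want it to be optional rather than the default.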
- Lack of scientific data types, such as complex numbers.
- Lack of multidimensional arrays.
- Inept implementation of floating point arithmetic.
- Poor choices for defaults, such as array bounds checking and pretty printing ascii I/O.
- Onerous penalties for JNI calls and serialization.
- Interminable process for correcting deficiencies with the language.
Sun has not displayed an understanding of HPC. Adding OpenMP or other "HPC"-friendly capabilities to the VM is not going to correct design decisions which remove performance from "High Performance" computing.
Java is on the retreat??? Wow, I've been gainfully employed as a Java architect for the past five years; it musta' been a fluke. IBM, Oracle, Novell, et al must not know what they're doing by investing millions in building their products around the Java platform. Come to think of it, there are sooo many alternatives to Java for enterprise, server-side computing. Thank you for your insight. I'll turn in my resignation and pick up a .Net book tomorrow.
Xenon, where's my money? -Borno
Sure, as long as your class looks just like a C# one. Need multimethods, dynamic class redefinition, method combination, a non-crippled model of multiple inheritance, or maybe even prototypes? You're out of luck, because for this interoperability to work, your classes will either have to be C# classes or you have to make them look like ones, and .NET doesn't give you a Meta Object Protocol to do it.
In the great CONS chain of life, you can either be the CAR or be in the CDR.
So now it's considered a defeat or a "retreat" to create a new and improved version of one of your products?
Hey, I heard that Microsoft just released a new version of their OS and called it "Longhorn". Can I say "Ok, so now that WinXP is on the retreat they try to enter a new area?"
Personally, I would consider "Hm, Microsoft seems to be catching up to us. Let's make something better than current Java OR .net." to be the most extreme sign of life possible. Honestly I wish they'd done it sooner.
Irritable, left-wing and possibly humorous bumper stickers and t-shirts
In all honesty, the XML would be generated at a level where hands would likely never touch it, more likely through a series of transformations. Having written XML generators for C++, C# and Java, I've found that the XML is, by itself, very verbose, because it is fundamentally a meta-level description. You wouldn't write:
<func optargn="burp">arg1 arg2 arg3 lst</func>
you'd write
<function name="foo">
<param name="arg1" type="xs:string"/>
<param name="arg2" type="xs:integer"/>
<param name="arg3" type="cplxOpj"/>
<param name="arg4" type="xs:string" optional="yes"/>
<!-- implementation code -->
</function>
In all likelihood, the fragment will have been generated via a UML interface or something similar, and this would then be produced through a simple transformation.
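As a rough illustration of how mechanical such a transformation can be, here is a small Python sketch that reads the verbose fragment above and emits a one-line signature. The `signature` helper and its output format are invented for this example; only the element and attribute names come from the fragment itself.

```python
import xml.etree.ElementTree as ET

# The verbose function description from the comment above.
FRAGMENT = """<function name="foo">
<param name="arg1" type="xs:string"/>
<param name="arg2" type="xs:integer"/>
<param name="arg3" type="cplxOpj"/>
<param name="arg4" type="xs:string" optional="yes"/>
<!-- implementation code -->
</function>"""

def signature(xml_text: str) -> str:
    """Render a <function> element as a compact one-line signature."""
    func = ET.fromstring(xml_text)
    parts = []
    for param in func.findall("param"):
        name = param.get("name")
        if param.get("optional") == "yes":
            name += "?"  # hypothetical marker for optional parameters
        parts.append(f"{name}: {param.get('type')}")
    return f"{func.get('name')}({', '.join(parts)})"

print(signature(FRAGMENT))
```

Going the other way (compact form to XML) is just as mechanical, which is the point: nobody would need to type the verbose form by hand.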
Before objecting to the cost involved, consider that both an XML parser and an XSLT transformation are fairly straightforward finite state machines, and could very easily be dropped into firmware (something that is already beginning to happen). Because of the ubiquity of XML, firmware processing of XML is making more and more sense, and once you have that, it becomes a natural for building ILs and related compiler technology.
Over the last few decades, there have been many exotic parallel architectures. Dataflow machines, connection machines, vector machines, hypercubes, associative memory machines (remember LINDA?), perfect shuffle machines, random-interconnect machines, networked memory machines, and partially-shared-memory machines have all come and gone. Some have come and gone more than once. None has been successful enough to sell commercially in quantity. Very few of these machines have ever been purchased by any non-government entity.
There are two ends of the parallelism spectrum - the shared-memory symmetrical multiprocessor, where all memory is shared, and the networked cluster, where no memory is shared. Both are successful and widely used. Everything in between has been a flop.
Despite decades of failure, people keep coming up with new bad ways to hook CPUs together, and getting government agencies to fund them. It's more a pork program than a way to get real work done.
By the time one of these big weirdo machines is built, debugged, and programmed, it's outdated. A few years later, people are getting the same job done on desktops. Look at chess. In 1997, it took Deep Blue to beat Kasparov. Kasparov is now losing games to a desktop four-processor IA-32 machine.
Figuring out more effective ways to use clusters is far more cost effective than putting a National Supercomputer Center in some Congressman's district in Outer Nowhere. There's a whole chain of these tax-funded "National Supercomputer Centers". The "Alabama Supercomputer Center" has ended up as an ISP for the public school system, hosting E-mail accounts and such. It's all pork.
Putting corporate politics aside, what would be nice from a technical perspective is an intermediate language that is register-based. Microsoft decided to copy Java so thoroughly they also copied Java's mistakes by making the .NET runtime a stack machine. Market reality tells us Intel/AMD is not going away anytime soon; it would have been wise to make MSIL fit more nicely into the x86 architecture for performance purposes.
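For readers who have not seen the two styles side by side, here is a minimal Python sketch that rewrites stack-style code for (x + y) * z into three-address register-style code by simulating the operand stack with virtual register names. The opcode names and the `stack_to_registers` helper are made up for illustration and are not real MSIL or JVM instructions; a real JIT would also have to map the unlimited virtual registers onto the few that x86 actually has.

```python
def stack_to_registers(stack_code):
    """Translate stack ops into three-address ops over virtual registers."""
    operands = []   # simulated operand stack, holding register names
    out = []
    next_reg = 0
    for instr in stack_code:
        op = instr[0]
        dest = f"r{next_reg}"
        next_reg += 1
        if op == "push":
            # A push becomes a load into a fresh virtual register.
            out.append(("load", dest, instr[1]))
        elif op in ("add", "mul"):
            # Binary ops pop their two operands and name a result register.
            right = operands.pop()
            left = operands.pop()
            out.append((op, dest, left, right))
        else:
            raise ValueError(f"unknown opcode: {op}")
        operands.append(dest)
    return out

# (x + y) * z in stack form
stack_code = [("push", "x"), ("push", "y"), ("add",), ("push", "z"), ("mul",)]
for line in stack_to_registers(stack_code):
    print(line)
```

The stack form is more compact because operands are implicit, while the register form names every operand explicitly, which is closer to what the hardware wants.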
The Mono/DotGNU projects are similarly unfathomable. It will be nice to have these tools available to run more bloated GUIs, but if one of these projects really wanted to differentiate itself, that project should instead focus on a C#-to-native compiler using gcc's backend and let the other project focus on a compiler-to-MSIL. I guarantee you that project would become the 'winner'.
I think what he's saying is that the syntax isn't the only thing that defines a language. A language's type system probably plays a more important part in defining how the language works.
With .Net, it may seem like you have a lot of interoperating languages, but they're all basically the same language with different superficial characteristics. VB developers complain about how VB.Net is totally different from previous versions of Visual Basic. It's because they gutted its internals and implanted C#. I wouldn't be able to tell the difference because I see similar syntax, but someone who really knows the language will detect a different core.
That's not to say that different type systems cannot be emulated. Nice is a language with Java-like syntax but with a much better type system (among other things) and it still runs on an ordinary JVM. However, any interoperability will have to be at the level of the lowest common denominator. If you want to call Nice code from Java, your interface ends up losing or having to give up some power.
You really can't even share libraries between truly different languages. The STL just doesn't fit into the Java/C#-style type systems (though generics is a step towards accommodating the STL). Perl libraries are also distinct. Imagine dealing with a Haskell-style lazy list in your C# code. It just won't feel right.
Unless you really need to use every cycle, you're better off writing in a high level language and then recoding the critical portions (as identified through thorough profiling) in assembly language. (I speak as one who needed to use every cycle when I was a games programmer in the 80s. I've often thought of doing an all-assembly, no OS required app today, just to see how ludicrously fast it would run.)
You gain extensive experience with the processor and platform to which you are writing, and you work bloody hard. It also depends on whether you are optimising for space or speed. For example: writing a game for the Amiga, I was told by the customer that it had to run on machines with half a meg of RAM (the entry-level machine). I once spent a whole day seeking a way to save 12 bytes; the first part of the solution involved recoding a routine using a different algorithm. The rewrite saved me 8 of my 12 bytes, and executed in the same number of clock cycles (that was a crucial constraint). I then got the other four bytes by using the interrupt vector for an interrupt I'd disabled. As I was writing to the silicon (not even using any ROM routines), I could get away with this. I wonder what kind of warnings a modern C++ compiler would throw up for this kind of behaviour ;-)
Assembly language is fun, but life can be too short. I had to spend so much time fitting the above-mentioned game into half a meg that, by the time it came to market, 1Mb was the standard required by all games anyway.
If you design your code well you have plenty. Even when you inline code to save the overhead of call/return, you will be aware of the functional purpose of those 50 instructions considered as a single entity. The same discipline required to write well-constructed code is needed for assembler. It's similar to using an old version of BASIC, with only GOTO and GOSUB for transferring control; although it allows the sloppy thinker to produce spaghetti code, a good coder will adhere to the same abstractions as they would use in a higher level language.
I'll stop rambling about the past and go and write myself a Forth system now :-)
(P.S. p-code was extremely cool. When I first got acquainted with Java, it was the first thing I thought of. Plus ca change, plus ca ne change pas...)
Using HTML in email is like putting sound effects on your phone calls. Just say <strong>no</strong>.
First, you've made the mistake of confusing the language with an implementation of the language. These are different things entirely. I'm not even sure what it would mean for the language itself to be "free". Maybe if it were submitted to a truly open standards group (like ANSI/ISO C and C++) that would make it more "free" but I don't see how that would help. Of course having a good free implementation of the language is important, but that doesn't mean that Sun needs to provide that implementation. gcc is not provided by the original implementors.
Of course there are free software implementations of Java.
As for releasing Java under the GPL -- I don't see it happening. Releasing it under an appropriate open source license would help Sun's implementation become more popular, but they wouldn't be able to make money licensing their source code.
Do you even know what you are talking about anymore?
Intermediate languages are essentially a processor-independent instruction set. You compile down to this instruction set and then let the virtual machine translate to the native instruction set, hence cross-platform. These intermediate languages are binary and have no concept of decimal or hexadecimal.
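A toy example may help here. The Python sketch below defines an invented three-opcode binary instruction set and a tiny interpreter for it; the opcode values and the `run` function are assumptions for illustration and do not correspond to any real IL. The program is just a sequence of bytes, and the same bytes run unchanged on any machine that has the interpreter.

```python
# Hypothetical opcodes for a toy stack-based intermediate language.
PUSH, ADD, MUL = 0x01, 0x02, 0x03

def run(bytecode: bytes) -> int:
    """Interpret the toy bytecode and return the value left on the stack."""
    stack = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH:
            # PUSH is followed by a one-byte immediate operand.
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op in (ADD, MUL):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == ADD else left * right)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc}")
    return stack.pop()

# (2 + 3) * 4 encoded as raw bytes
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL])
print(run(program))
```

A real virtual machine would typically translate such bytes to native code rather than interpret them one at a time, but the portability argument is the same.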
-- Fighting mediocrity one bad post at a time.