Generic VMs Key To Future of Coding
snydeq writes "Fatal Exception's Neil McAllister calls for generic VMs divorced from the syntactic details of specific languages in order to provide developers with some much-needed flexibility in the years ahead: 'Imagine being able to program in the language of your choice and then choose from any of several different underlying engines to execute your code, depending upon the needs of your application.' This 'next major stage in the evolution of programming' is already under way, he writes, citing Jim Hugunin's work with Python on the CLR, Microsoft's forthcoming Dynamic Language Runtime, Jython, Sun's Da Vinci Machine, and the long-delayed Perl/Python Parrot. And with modern JITs capable of outputting machine code almost as efficient as hand-coded C, the idea of running code through a truly generic VM may be yet another key factor that will shape the future of scripting."
Correct me if I'm wrong, but isn't Silverlight 2 out, and doesn't it have the DLR with Python, Ruby, etc.?
I remember some years ago the elation people felt when Parrot was announced. At last, we could leverage the strengths of either Python or Perl--or whatever other interpreted languages--but work with a common interpreter. But then the hype started to die down, and the last edition of O'Reilly's book on the subject appeared over four years ago. Within the Python community, interest in Parrot seems completely dead. Are the Perl folks going it alone, and when might we see the project reach a successful deployment?
One standard, several implementations? Sounds nice in theory, just like the numerous standards that Sun has put out where each vendor delivers its own implementation (JPA, JDBC, J2EE among others). However, in practice you pick *one* vendor and *one* implementation and run with it. Only a fool would dare switching implementation mid-development, making the choice really just academic, because there are always minor differences that "shouldn't" matter, but do.
Reminds me of architects and developers who create generic database access engines so their product can be "platform independent" and then wonder why its performance is so bad no matter which of the six major databases is used.
sPh
Software development recursively disappears up its own arse.
We already have different, generic, virtual machines. They are called operating systems. They run on bits of silicon and steel.
You can't fix the problems you have writing software by running away from them
article didn't include it, but this open source project seems to have similar goals
http://llvm.org/
Surely this sounds quite a bit like something Microsoft, of all people, tried to create? That's right, I'm talking about .Net! Microsoft loved touting how you could develop .Net applications in C#, C++ or even good ol' VB and it should all work the same and even interoperate. I'm sure anyone with any experience of .Net knows that, despite the supposed advantages, it has quite a few disadvantages as well. But at least it made VB somewhat useful again.
But it's
Nonetheless, I wouldn't hold my breath on this one. It sounds like a pipe dream to me, and I'm sure some would argue: what's the point in running your code through a VM if you can just run it natively?
On a side note: As efficient as hand-coded C? In my experience, 90% of the time someone tries to write "efficient" C, they end up causing more problems than it's worth (early-optimisation and all that). Perhaps it should be reworded to say something like "Hand-crafted C from a C Master".
+1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
Within the Python community, interest in Parrot seems completely dead.
Generic VMs are so 2005, the future of Python runtime is PyPy. From a single implementation of Python (written in Python), they can compile Python code to C, JVM, automatically create a customizable JITed VM, etc...
Check them out: they are doing some seriously cool stuff and they can use a bit of help.
There's a hidden treasure in Python 3.x: __prepare__()
Am I the only one that sees this as completely ass backwards? I mean, part of the lure of scripting languages is that we skip that whole compile phase of things, and so achieve a certain degree of platform independence. So long as the system being targeted has an implementation of the scripting language's interpreter, you just run the script inside of it, and you can distribute the same script (more or less) for any system with an interpreter. Now they're talking about essentially compiling a scripting language to one of several different byte codes to target one of several different VMs, which then of course need implementations on whatever systems you're targeting. How is this an improvement over the previous way of doing things?
What exactly are we getting out of this? The language developers don't have to worry about the details of the underlying machine, but as a trade-off they now need to write implementations for whatever VM is out there, which in turn will require them to worry about the details of the underlying machine, so we've just pushed that pain point down one level of abstraction, but not eliminated it. The only upside I can see to the entire thing is language interoperability, which is nice and all, but how does that fit in with the multiple-VM approach being touted here? Each language is most likely going to require some minor changes in order to support interoperability at the VM level, and of course there will be quirks and gotchas on each VM as well. Unless all the VM developers get together and agree on the exact changes that will be required to each language, we could end up with a situation in which each language comes in multiple slightly different syntaxes depending on exactly which VM it targets.
Curiosity was framed, Ignorance killed the cat.
All I know is that every large Java system seems to have parts written in native code called through the JNI.
The JVM has been around for a long time and still can't do things like device drivers. Performance code, like parts of Java Advanced Imaging, is native. A lot of people turn the native parts off though because they use ridiculous amounts of memory.
I think it's just too hard to make VMs that do everything well.
Equine Mammals Are Considerably Smaller
This all sounds great for a single programmer or small team, but how does this play in today's corporate programming environment? Today you can have teams split up into 3 or 4 time-zones, contractors and perms, outsource coders in India, China, and who knows where else...all working on the same project with their own opinions of what is "best" for the project. Will allowing each to code in their own programming "dialect" really work?
Intel stock rose sharply as investors realized that ubiquitous VMs will require faster processors because more programs will be written in scripting languages. Shortly after, Intel stock plummeted as investors realized that intermediate VMs decouple the programs from the processor architecture.
Ian Piumarta and the VPRI [http://vpri.org] are doing some amazing work related to this story.
COLAs: Combined Object Lambda Architectures - A Complete System in 20,000 Lines of Code.
http://piumarta.com/software/cola
The system is slowly evolving towards version 1.0 which
* is completely self-describing (from the metal, or even FPGA gates, up) exposing all aspects of its implementation for inspection and incremental modification;
* treats state and behaviour as orthogonal but mutually-completing descriptions of computation;
* treats static and dynamic compilation as two extremes of a continuum;
* treats static and dynamic typing as two extremes of a continuum; and
* late-binds absolutely everything: programming (parsing through codegen to runtime and ABI), applications (libraries, communications facilities), interaction (graphics frameworks, rendering algorithms), and so on.
http://piumarta.com/papers/colas-whitepaper.pdf
http://piumarta.com/papers/EE380-2007-slides.pdf
http://piumarta.com/pepsi/objmodel.pdf
http://www.vpri.org/html/work/NSFproposal.pdf
Allen Wirfs-Brock and Dan Ingalls are currently working on bringing notions like Colas to the browser so that we can use any programming language WE choose to for our browser based applications. Check out their interview here.
http://channel9.msdn.com/posts/Charles/Dan-Ingalls-and-Allen-Wirfs-Brock-On-Smalltalk-Lively-Kernel-Javascript-and-Programming-the-Inter/
So I heard you like coding on VMs? So we put a VM on your VM so you could code while you code.
Bot Assisted Blogging
Microsoft promised this with .NET. (Just buy our tools and you build to .NET and run on all Windows platforms, XP SP1, XP SP2 AND Vista! It's sooo much better than that... Java thing.)
Microsoft promised us this with Windows CE. (Just buy our tools and with a simple compiler switch, voila, you're targetting CE... it couldn't be easier.)
Microsoft couldn't even do it with DirectX where OpenGL could (Oh hey, that XBox directX.. it works a little differently than Windows DirectX)
For that matter, the Windows Printer driver APIs aren't consistent (Yeah, we know it's called GetMarginSpaceFromEdge but driver A measures the edge from half an inch in and driver b measures the edge from the print head detects the edge of the page which is sometimes an inch greater than the page itself...)
Y'know what the greatest VM is right now? i386! And has been for nigh-on 10 years!
I LIKE Microsoft products, don't get me wrong... but I'm not going to buy Visual Studio 2011, which has no changes other than a GUI enhancement and the ability to target my development towards the hot new sweetness of the .DNET APIs, so that 3 years later Microsoft can abandon .DNET for DCOM# because, hey, that's what our research said people wanted and it'll be supported on Windows 7.1.1 along with Blackbird 2.0
My first thought on reading this was an old software engineering maxim, usually (and probably correctly) attributed to Don Knuth:
Universal VMs are old as the hills (anyone [else] here old enough to have programmed on the UCSD p-System?). We shift towards VMs to gain independence and portability, and then we shift back to direct, spot or JIT compilation to improve performance. It's an old, old dance, and one that will likely go on for years to come. ..bruce..
Bruce F. Webster (brucefwebster.com)
I find it kind of funny how there's the battle between wresting the most performance out of the hardware versus the ease of use for the programmers and users. Back in the day, every character was significant and code with too much documentation simply ate up too much space. (And this is talking about after we gave up on punch cards and were typing the code into terminal screens.) Every step we take to make computers easier to understand and easier to use makes the backend so much more complicated. A base install of XP is something like what, tens of thousands of times larger than DOS? But it's also thousands of times more powerful. But at the same time, we can sacrifice too much. I could run Win2k and Office 97 just fine on a good machine from 1999, and it would still suffice for the typical office worker even today. Of course, that machine cost around $1000 back then and a basic office machine with so much more power goes for $600 today, including the Vista tax, but wait, Office is gonna ding you another $600 now. Funny how the cost of software used to be the cheapest part of the machine and now it's become the most painful. But also, when you get right back to it, the secretary isn't typing her letters any faster on the newer machine. There's probably nothing she needs in any of the newer versions of Office that she didn't have in 97.
It just strikes me as kind of funny how we make these huge advances in performance, in hardware capability, and it seems like the software is really lagging behind in the effort to fully exploit these gains. But then I look at how hard it is to write the code and it's amazing we've come even this far.
Kwisatz Haderach
Sell the spice to CHOAM
This Mahdi took Shaddam's Throne
I've been thinking about this topic for a long time. The use of a virtual machine is usually hampered by the lack of a proper language agnostic, operating system ambivalent, linking-loading mechanism.
Bear with me.. Being able to consistently identify precise versions and provide a global library namespace with automatic cross-language compatibility (calling convention, and datatypes with or without support for cross-language OOP) makes the benefits of a VM much easier to attain.
There is a JSR to address this on the JVM but I am not convinced that interop between languages on a single VM will be transparent. I mix Java libraries with JRuby and I often end up writing thin facade classes to make interop better.
Oh, and real men write their own compilers.
Real men code in P".
.... wanting to fully understand it I followed the links where I typically found a new link after the first paragraph, recursively. So after 15 minutes of reading I determined that I hadn't gotten anywhere in understanding much of anything except for one thing:
How many programs must we run, layer upon layer, in order to run an application?
Doesn't adding more and more layers of complexity contribute to the failure side of the failure vs. success equation?
I do understand the ideals behind .NET, such as the CLI and CLR, but I also see the downside: the general objective is not addressed much sooner in the software development cycle. Addressing the general objective at the later stage of runtime only overcomplicates the fix.
To use an analogy: a social science teacher once described to the class how quality control was once done in the USSR. A fine china plate manufacturer would produce the plates, put them on a truck and ship them to the store. To buy a plate you would stand in a line for a number, and once you had the number you'd stand in another line to pay for the plate, where you'd then get a receipt. Then you'd stand in a third line to pick up your plate. Once you got to the front of the line, the store employee would look at your receipt, go over to the plates, pick one up and strike it with a wand as a quality control step. If it broke, they would do the same with the next dish until they had completed your order.
Likewise, the ideal of write once, run anywhere via a runtime engine is the same sort of failure: it arrives just in time to be too late, at the cost of excess complexity.
Where the general objective needs to be addressed is at the very beginning of the development process, perhaps even before code is written.
A programming language, anything above machine language, is an abstraction, and this is recursive. But when an application runs, the machine must see it in terms of machine language, so whatever the level of abstraction, it gets boiled down to machine language (granted, the quality of the resulting machine language is dependent on the TRANSLATION method used). This is common knowledge to anyone who knows anything about programming.
The Common Language Infrastructure (CLI) is the ideal of taking all the more popular programming concepts and data types and combining them in a non-conflicting manner, which is then used in the translation process to convert to a Common Intermediate Language that then runs on the Common Language Runtime.
The key point here is "TRANSLATION": by addressing translation early on in coding, it becomes possible to translate whatever to whatever else, to then compile and run any way you want, be it directly on the hardware, natively on any OS that is capable, or even on a VM.
The point is that computer programming languages are abstractions, and it is in dealing with and translating such abstractions from one form to another that the magic of the future is to be found.
It is in understanding the Natural Laws and Physics of Abstraction creation and use, and in understanding translation mechanics, that software development solutions will genuinely be found. Deal with Abstraction Translation prior to compile (though compile is itself a translation to machine binary)... But even with this focus on abstraction translation, simpler yet powerful programming languages will evolve. It's the whole point of programming: to take some complexity and make it easier to use and reuse by defining it and giving it a simplified interface. Done, of course, in the only place it can be done: at the abstraction creation and use level.
Not in some down the line VM additional complexity that is designed mainly to generate licensing fees.
This concept is not new. It was implemented on the IBM System/38 in the late 70s and was called MI (Machine Interface, later renamed TIMI, Technology Independent Machine Interface). It allowed IBM over the years to make radical changes to the hardware (S/38 => AS/400 => iSeries, CISC to RISC to POWER etc.) without end users having to modify any of their code, written in any of a number of languages: CL, many variants of RPG, REXX, PL/I, C, C++ ... Curiously, Microsoft for years ran its internal operations on a pair of S/38s. I wonder ...
generic VMs divorced from the syntactic details of specific languages
The syntax of a programming language is something understood by the front-end of a compiler. The front-end then translates the code into code that does the same thing in the back-end language (such as JVM/PyVM/x86/LLVM bytecode). None of these back-ends knows about the syntax of the front-end language.
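To make that concrete, here is a toy sketch of two front-ends feeding one back-end. Everything in it is invented for illustration (the instruction names PUSH/ADD/MUL and the functions compile_rpn, compile_infix and run are hypothetical, not taken from any real VM), and the "languages" only handle integers with + and *.

```python
# Toy illustration only: all names here are hypothetical.

def compile_rpn(source):
    """Front-end 1: postfix syntax, e.g. '2 3 4 * +'."""
    code = []
    for tok in source.split():
        if tok == "+":
            code.append(("ADD",))
        elif tok == "*":
            code.append(("MUL",))
        else:
            code.append(("PUSH", int(tok)))
    return code


def compile_infix(source):
    """Front-end 2: infix syntax with '+' and '*', no parentheses."""
    code = []
    # Splitting on '+' first makes '*' bind tighter than '+'.
    for i, term in enumerate(source.split("+")):
        for j, factor in enumerate(term.split("*")):
            code.append(("PUSH", int(factor)))
            if j > 0:
                code.append(("MUL",))
        if i > 0:
            code.append(("ADD",))
    return code


def run(code):
    """Back-end: a stack machine that only ever sees instructions."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


# Two different syntaxes, identical instruction stream, one engine.
print(compile_infix("2 + 3 * 4") == compile_rpn("2 3 4 * +"))
print(run(compile_infix("2 + 3 * 4")))
```

By the time run() sees the program there is no trace of which syntax it was written in, which is the sense in which the back-end is already "divorced" from syntax.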
The real challenge is to adopt conventions on the back-end VM that allow different languages to talk together. It'd be straightforward to implement an x86 emulator on top of the JVM and run the ${language} VM on that x86. Wow, you now have ${language} running on the JVM. So? You can't talk to the Java library that way.
If you want languages to talk together, they need to agree on data representation formats and calling conventions. Try getting object.field if you don't know where field is relative to the base address of object. Try calling object.method() if you don't know the format (or location) of object.__vtbl.
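A rough sketch of the object.field problem, using Python's ctypes to lay out the same logical object two ways. The two "runtimes" and their header sizes are made up for the example; only the ctypes and struct calls are real.

```python
import ctypes
import struct


class ObjectInRuntimeA(ctypes.Structure):
    # Hypothetical runtime A: 4-byte object header, then the field.
    _fields_ = [("header", ctypes.c_uint32), ("field", ctypes.c_uint32)]


class ObjectInRuntimeB(ctypes.Structure):
    # Hypothetical runtime B: 8-byte object header, then the field.
    _fields_ = [("header", ctypes.c_uint64), ("field", ctypes.c_uint32)]


def read_u32(memory, offset):
    """Read a native-endian 32-bit unsigned integer at a byte offset."""
    return struct.unpack_from("=I", memory, offset)[0]


obj = ObjectInRuntimeA(header=0xDEADBEEF, field=42)
# Raw memory of the object, zero-padded so the wrong read stays in bounds.
raw = bytes(obj) + bytes(8)

offset_a = ObjectInRuntimeA.field.offset  # where A keeps 'field'
offset_b = ObjectInRuntimeB.field.offset  # where B expects 'field'

print(offset_a, read_u32(raw, offset_a))  # A's convention finds the value
print(offset_b, read_u32(raw, offset_b))  # B's convention reads padding
```

Code generated under runtime B's convention reads the wrong bytes from an object created by runtime A, with no error raised. That silent mismatch is what a shared VM has to rule out by fixing one layout (or one lookup protocol) for everybody.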
Also, the semantics of some operations have to be considered if a language has to deal with a foreign object model. Let's say we target the Java VM. How do you implement multiple inheritance? What does .super do on a class with multiple parents? How do you implement "Object *p = malloc(...); *p = my_object;"? How do you implement C++'s delete? How do you implement python's generators?
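As one data point for the .super question, this is how Python itself answers it. The class names are invented for the example; the behaviour shown is Python's standard method resolution order.

```python
class Base:
    def describe(self):
        return ["Base"]


class Left(Base):
    def describe(self):
        return ["Left"] + super().describe()


class Right(Base):
    def describe(self):
        return ["Right"] + super().describe()


class Child(Left, Right):
    def describe(self):
        return ["Child"] + super().describe()


# super() follows the linearised order of the *instance's* class,
# so inside Left it dispatches to Right, not straight to Base.
print([cls.__name__ for cls in Child.__mro__])
print(Child().describe())
```

Note that super() inside Left ends up calling Right, a class Left knows nothing about. A VM built around single inheritance has no native operation for that, so a Python implementation targeting it presumably has to carry this lookup logic itself rather than map it onto the host's own super call.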
To support a set of languages, the VM must support the union of features. To make the languages talk together smoothly, the VM must support each feature in a reasonably straightforward way. The two demands pull the VM in opposite directions.
I don't want to just poo-poo this idea, but my experience with dealing with the Java VM (I've written a java-important-subset compiler in my compiler course) is that it's tightly coupled to the Java way of doing things. My experience with different languages (C, C++, Java, python, perl, ruby, haskell, scheme) says that things are different enough that you can transfer most of what you know from one language to another [at least for the oo/procedural], but that the devil is in the details, and the VM has to handle all values of $details.
"Imagine being able to program in the language of your choice..."
Which bigger enterprise would allow you to program in the language of your choice? We have a code base written by around 1000 developers during the last thirty years. Do you really think we give developers a choice about their language?
Depending on the problem you have to solve there is one language to pick. Maintaining this code is extremely expensive. This is where the real complexity lies and this is the problem we have to address.
I really do not care whether our developers have it cozy so that they can pick the language of the day...
Despite that it was still in use for some business apps (mostly accountancy style stuff) right into the early 90's.
When I look at e.g. MS "Singularity" I see something suspiciously similar to my old (multi-user) Sage II (sadly now long departed).
Andy
Wasn't platform independence the selling point of UCSD's p-system? Yes, it worked, but it never really caught on. One camp of software development says that hardware is always getting faster, cheaper and more efficient, so adding a layer of abstraction between the source code and the hardware is not a problem. The other camp says we can use those same performance improvements to build software that does more things, on larger data sets, with better graphics, and in general make what once were impractically large and complex software tasks run on the average users' systems. Over the last three decades, the market has favored the latter.
...as long as it's Python.
No thanks.
... since it's all about abstractions and translation, by doing it up front you have more control and opportunity to advance.
To deal with translation on the back end is avoidance or hindrance of genuine programming advancement in exchange for licensing fees for another level of abstraction/translation.
Maybe because not even Sun's engineers actually wanted to write actual code in Java?
no don't laugh, it works very well! there are a number of very good reasons for this.
1) javascript is actually an incredibly powerful language, in particular due to the concept of "prototype"ing.
2) javascript, thanks to web browsers, has an unbelievably large amount of attention spent on it, to optimise the stuffing out of it. as a result, the latest incarnation to hit the streets - the V8 engine - actually compiles to i386 or ARM assembler.
3) the number of "-to-javascript" compilers is really quite staggering. see the comments from pyv8 article for an incomplete list.
GWT has a java-to-javascript compiler; Pyjamas has a python-to-javascript compiler. There's a ruby-to-javascript compiler - the list just goes on.
then there's the pypy compiler collection, which has javascript as a back-end. (and, for completeness, it's worth mentioning that it also has a CLR backend, LLVM.org backend, and a java backend).
you have to bear in mind that scripting languages, in order to be _reasonably_ efficient, have to do intermediate byte code _anyway_.
python uses a FORTH-like intermediate byte code, for example. the similarity to CLR will be pretty high.
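if you want to see that byte code for yourself, CPython ships a disassembler in the standard library. a small sketch (the function add_and_double is just a made-up example, and the exact opcode names differ between Python versions, so no output is shown here):

```python
import dis


def add_and_double(a, b):
    return (a + b) * 2


# Each instruction pushes to or pops from an operand stack,
# which is the FORTH-like part.
instructions = list(dis.get_instructions(add_and_double))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

you should see the arguments being loaded, a couple of binary operations and a return, with no named registers anywhere.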
when you come to things like V8, that does on-the-fly _compilation_ which is basically the same thing as intermediate byte code, only a bit more extreme and aggressive.
so the technology is beginning to move in the direction of "grey area" - thinning the distinctions.
i like the idea of using javascript as the VM intermediate language.
what's really neat about using javascript is that people have been optimising the hell out of it for a loooong time.
so, pyv8 demonstrated an empirical result of running python TEN times faster than the standard compiler does, by translating the python into javascript and then V8 compiling it to i386 assembler on-the-fly.
that's _very_ cool.
Look, matey, I know a dead parrot when I see one, and I'm looking at one right now.
...
No no he's not dead, he's, he's restin'! Remarkable bird, the Norwegian Blue, idn'it, ay? Beautiful plumage!
My favorite quote doesn't fit into 120 characters. Now no one will like me.
The future is the 70s?
http://lkml.org/lkml/2005/8/20/95
The point is that it's not far off from what this article is talking about.
+1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
If memory serves, all of their compilers compiled to a genericized ``pcode'' for which multiple engines existed (one per processor architecture I believe it was) --- all that was missing was multiple implementations per architecture.
William
Sphinx of black quartz, judge my vow.
Actually you have to use C++/CLI, which isn't quite the same thing as C++. For example, you can't use multiple inheritance. Whether you think that is a good thing or a bad thing, it's still a restriction imposed by the CLI that disallows truly and fully using any language.
Software Inventor
"Hardware independence is invented. That is what Java is doing for you."
So any multi-threaded or multi-process Java code I write will run identically on all systems regardless of the processor used or how many cores are present?
Imagine being able to commercially sell a software program without worrying about your competitors reverse engineering your work and using it against you?
Java/.NET/perl/python..et al are all worthless until the reverse engineering penalty even with obfuscation is anywhere near that of compiled C code.
The real innovation in programming languages is to be able to say what you want rather than how to do it like various execution engines for very application specific systems already do. Spouting about Language and virtual machines is all noise.
Only a fool would dare switching implementation mid-development
Unless the requirements change mid-development. For example, an application originally intended to run on a notebook computer (which has an x86 CPU) might get retargeted to run on a handheld device (which more than likely has an ARM CPU), or vice versa. With C++, you switch to a different implementation that supports a different instruction set. Or perhaps you want to develop a product and deploy it on multiple platforms. For example, XNA for Xbox 360 can only run CLR bytecode, and MIDP for mobile phones can only run JVM bytecode, so you'd need to write a cross-platform video game's model[1] in a language that can target both the CLR and the JVM.
[1] Here, I distinguish the "model", the core of a program that defines the rules of a domain such as a business or a game, from the "view", the way a program presents the model to the user. Physics and AI are major components of a game's model; things like graphics, sound, and some of the input make up the view. Ideally, porting a program should require rewriting only the view.
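A minimal sketch of that split, with a deliberately trivial "game". All names (CounterGameModel, text_view, bar_view) are hypothetical; the point is only that the model does no I/O, so it is the part you would want to compile for both the CLR and the JVM, while each platform gets its own view.

```python
class CounterGameModel:
    """Model: the rules only. No graphics, sound, or input handling."""

    def __init__(self, target):
        self.target = target
        self.score = 0

    @property
    def won(self):
        return self.score >= self.target

    def press(self):
        # Rule of the game: presses stop counting once the target is hit.
        if not self.won:
            self.score += 1


def text_view(model):
    """View 1: e.g. for a platform with a text display."""
    suffix = " (won)" if model.won else ""
    return f"score {model.score}/{model.target}{suffix}"


def bar_view(model):
    """View 2: a different presentation of the same model."""
    return "#" * model.score + "." * (model.target - model.score)


game = CounterGameModel(target=3)
game.press()
game.press()
print(text_view(game))
print(bar_view(game))
```

Porting to a new platform then means rewriting text_view or bar_view, not CounterGameModel.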
We already have different, generic, virtual machines. They are called operating systems.
So if I have some customers who own hardware that runs one operating system, and other customers who run another operating system, how do I deploy a solution to both?
I read that as Genetic VMs and that sounded really cool. It even made interesting sense almost all the way through the OP.
I was sadly disappointed when I realized my error. Generic VMs? Like everybody else said, boring.
Being a programmer, I feel I can say this.
I really hate it when programmers get architectural ideas. This is how we ended up with Java, and look how that turned out. Hasn't lived up to its promises, is completely pointless now and outdone by a lot of other languages.
Scala is an excellent example of a functional, multi-threaded language implemented on a virtual machine - the JVM.
Scala's a really nice language, actually.
Yes it does. They had great ideas at that time but not the CPU power to make them real. Which is rather sad, as lesser ideas needing less CPU power made it and still haunt us even though we now do have the needed CPU power.
Martin