Microsoft Roslyn: Reinventing the Compiler As We Know It
snydeq writes "Fatal Exception's Neil McAllister sees Microsoft's Project Roslyn potentially reinventing how we view compilers and compiled languages. 'Roslyn is a complete reengineering of Microsoft's .NET compiler toolchain in a new way, such that each phase of the code compilation process is exposed as a service that can be consumed by other applications,' McAllister writes. 'The most obvious advantage of this kind of "deconstructed" compiler is that it allows the entire compile-execute process to be invoked from within .NET applications. With the Roslyn technology, C# may still be a compiled language, but it effectively gains all the flexibility and expressiveness that dynamic languages such as Python and Ruby have to offer.'"
Doesn't this allow malicious programs to get even more malicious?
What exactly do they mean by "flexibility and expressiveness of other dynamic languages"?
I remember a demo at a Microsoft developer conference where C# would be able to execute and rebuild itself dynamically.
At the time it got me really excited (I've bumped into many problems that would have a much more beautiful solution if I could compile at runtime), but this seems to be yet another technology?
I think we can keep recursing like this until someone returns 1
yeah I know I must be new here http://developers.slashdot.org/story/11/09/16/0253202/microsoft-previews-compiler-as-a-service-software
If I wanted to, I could rig GCC and the like to do that too: That's the wonderful thing about command-line tools and piping, you can munge things together any way you want. And of course you can always tell gcc to stop partway through the compilation if you need assembler code or a parse tree or something. This sort of thing is common in open-source compilers, because they need these features for debugging purposes and have no reason to leave them out of the released version.
Of course, I probably don't want to include a feature like this dynamic code execution, because if I screw up, it would be a fantastic way to get a machine to execute code that it's not supposed to.
I am officially gone from
It seems that Neil McAllister has never heard of LLVM and Clang, while Microsoft obviously has.
That's something that I haven't seen a language really get right since FORTH. I'd love to be able to use C# in a similar way, entering small function definitions from the command line, compiling them as they're entered, interactively testing functions as they're written. It's a great way to speed development.
Oh look, M$ has found a new way to infect windows with viruses.
This sounds great if you're doing stuff like autotuning, but for the vast (vast, vast, vast) majority of programmers out there I don't really see how opening up the internals of the compiler is useful. Who cares if that loop gets fused or that function gets unrolled?
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
Does this mean that they are taking out the linker?
Roslyn is a complete reengineering of Microsoft's .NET compiler toolchain in a new way, such that each phase of the code compilation process is exposed as a service that can be consumed by other applications,
Sounds like LLVM.
Compile and execute code from within an application? That's exactly what Krita (http://www.krita.org) does with OpenGTL (http://opengtl.org) -- we have code written in special languages for filters and so on which gets compiled by Krita and then executed as native code. It's pretty safe as well.
This isn't exactly new. LISP had it from the early days. It's an idea that's been tried before, now available with more modern buzzwords, like "the compiler as a service".
Can't wait to get my hands on a FOSS clone of it.
"When information is power, privacy is freedom" - Jah-Wren Ryel
...and the Malware writer community collectively cried: WOOHOO!!! :-)
Now malware can be shipped in various partially-compiled steps and in different packaging (one, two, three modules, arriving from different vectors, etc.), making detection harder, and can then be compiled targeting the CPU it lands on. Oh, what a fricken great IDEA! platform-independence for malware just got easier! It's really getting hard to distinguish between the bad guys and producers of ideas like this.
Pavlov wouldn't be so famous if he'd used a can opener instead of a bell.
Like the Scala compiler? It has an API, plugin support, and more. The Scala shell uses it and serves as an example of how to use it.
It's not so much about seeing what the compiler did as changing what it will do. Roslyn will enable tool writers to parse and analyze programs consistently. This seems most helpful for tools like ReSharper and NDepend. I think you could also use it to make AOP possible without an IoC container, i.e., you mutate the partially compiled output to cleanly integrate cross-cutting concerns.
Small modules that can be assembled in different ways to achieve many objectives -- great idea!
"If you're not passionate about your operating system, you're married to the wrong one."
LISP has been compiled, and has an eval statement.
SPITBOL is a compiled version of Snobol and supports the Snobol CODE statement which allows one to construct code and invoke it dynamically.
The Java Compiler can be invoked from within Java.
Am I missing something?
platform-independence for malware just got easier!
"I've got more toys than Teruhisa Kitahara."
If I got a dime for each time someone "reinvents" Common Lisp, I would be rich.
Please continue innovating, Microsoft. Hint: I think whoever invents the bicycle again will make the headlines too.
http://de.wikipedia.org/wiki/Klein_Zaches,_genannt_Zinnober
"Rosalyn, who's your daddy?"
Progressive compilation that allows access to parse tree, AST, symbol tables, and other such artifacts is a great help in IDE and other "introspective" applications.
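To make "access to parse tree, AST, symbol tables" concrete, here is a minimal sketch using Python's standard `ast` and `symtable` modules rather than Roslyn itself (whose API is not shown anywhere in this thread). The `SOURCE` snippet and the `area` function are made up for illustration.

```python
import ast
import symtable

# Hypothetical source text that an IDE might be inspecting.
SOURCE = "def area(w, h):\n    scale = 2\n    return w * h * scale\n"

# Phase 1: parse the text into an abstract syntax tree and list the functions.
tree = ast.parse(SOURCE, filename="<demo>")
func_names = [node.name for node in ast.walk(tree)
              if isinstance(node, ast.FunctionDef)]

# Phase 2: build the symbol tables and inspect the scope of `area`.
table = symtable.symtable(SOURCE, "<demo>", "exec")
area_scope = table.lookup("area").get_namespace()
params = list(area_scope.get_parameters())
local_names = sorted(area_scope.get_locals())

# Phase 3: compile the same tree to a code object and run it.
namespace = {}
exec(compile(tree, "<demo>", "exec"), namespace)
result = namespace["area"](3, 4)
```

Each phase hands back an ordinary object that a tool can examine before deciding whether to go on to the next one, which is roughly the property an IDE wants.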
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Microsoft has patents on this. Only joking.
I was thinking CPU-specific, not OS-independent. Sorry for the ambiguity. CPU-specific compilation may allow for use of idiosyncratic features/bugs in the production of invasive code, something a little more difficult if the target hardware is unknown.
So, they're reinventing LISP?
C#, Ruby, and Python are all (in their main implementations) compiled languages. Where they differ is that C# is mostly-statically-typed, and Ruby and Python are dynamically-typed. The .NET compiler toolchain being exposed as a runtime service doesn't really make C# much more like Ruby or Python, since it doesn't change the main area of difference between the languages. It does mean that you can implement the equivalent of eval for .NET languages that don't already have it (like C#), which makes it a little bit more like Ruby or Python, but I don't think "C# doesn't have eval" is really the main reason people would think Ruby or Python is better for certain tasks than C#.
Oh?
I run it on Windows, Linux, FreeBSD and MacOS.
I run it on x86 and ARM.
Seems pretty damn independent to me.
Regarding what the GP stated though, with the right libraries and a little clever coding, a similar independent 'partially compiled' method could be used with C as well. Of course the partial compile of the Windows version would have to check for a C compiler, and download/install one if it isn't available. Java and Flash could conceivably be used to do the same. So, it's really not adding a whole lot of new threats to the ecosystem.
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
"each phase of the code compilation process is exposed as a service that can be consumed by other applications."
How about if the 'other app' is a web browser window? TFA suggests this will be possible with MS's product.
I think when a content producer submits their own article, Slashdot ought to flag that fact - just like they do "whoever is a first-time submitter" - have language to that effect:
"This article was submitted by the same company that originally produced it."
I also think that a self-submitted article ought to be given the most negative rating possible so it is buried. If it's any good, someone will find it and submit it.
I also think Slashdot ought to disclose financial relationships, if any, with content producers for article placements. InfoWorld, Computer World, certain book publishers like Packt, etc are over-represented especially considering the quality of their articles is low.
Where's that survey link again?
Sounds pretty familiar. Oh, yeah, Mono does this already.
// file: mice.h
#include "frickin_lasers.h"
So? My code could be put in an Apache module. Use WSGI and it is available in Python. PHP has the ability to do it straight away.
It's still not adding any vulnerabilities to the ecosystem that haven't existed before. Yes they used it as a demo, but that's probably because it's a quickly visible demo that everyone can easily see what it is doing. Only an idiot would use it like that on a production system, just like only an idiot would use C, PHP or Python to do the same thing, and those have had that feature for almost as long as they've been around.
They found a way to shove XML into the compiler! Kudos to MS!
(see sig)
Stupidity is an equal opportunity striker.
Fellow slashdotter Bill Dog
We used that approach for PBL some years ago. It is wasteful to have to rewrite parsers and lexers for languages to build IDEs and other tooling.
For example, code indentation can be done by walking the AST (you need to be careful to preserve hidden tokens, such as comments).
You can also allow code completion by changing the compiler to accept a "COMPLETION" token in some places in the grammar. Then, from the editor, when someone presses "Ctrl+SPACE" (or whatever) you mark the location in the lexer and send the code to the compiler. When you build the AST, you insert a completion node in it, and you now have contextual information about what can go there and can produce a list of candidates.
Also, syntax highlighting can use the lexer for basic coloring and some type information to then add more information (such as which names are fields, functions, etc.)
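The lexer-only coloring pass can be sketched in a few lines. This is not the PBL tooling mentioned above (I have no access to that); it's an illustration using Python's standard `tokenize` module, and the `classify` helper and its category names are invented for the example.

```python
import io
import keyword
import token
import tokenize

def classify(source):
    """Return (text, category) pairs for a basic, lexer-only highlighter."""
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == token.NAME:
            # The lexer alone cannot tell fields from functions; that
            # refinement needs type information from later compiler phases.
            category = "keyword" if keyword.iskeyword(tok.string) else "identifier"
        elif tok.type == token.NUMBER:
            category = "number"
        elif tok.type == token.STRING:
            category = "string"
        elif tok.type == token.COMMENT:
            category = "comment"
        elif tok.type == token.OP:
            category = "operator"
        else:
            continue  # skip NEWLINE, INDENT, ENDMARKER and other layout tokens
        pairs.append((tok.string, category))
    return pairs

highlighted = classify("total = price * 2  # doubled\n")
```

Note that comments survive as tokens here, which is the "hidden tokens" concern from the indentation point: a tool that only keeps the AST would lose them.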
What's new is exposing these phases in a standardized manner in the language. That's a bold move, since backward compatibility will be tricky to maintain. Maybe they're thinking of finally stabilizing C#.
There are already APIs to emit IL or to invoke a C# compiler built into .NET and the security systems built into .NET give you a way to prohibit them. There's no additional risk exposed by Roslyn. Rather, it's a way of getting at the juicy knowledge about the code that the compiler builds up before it exits and that libraries have been written to poorly piece together. That's a good idea that I'd like to see accompany more official language compilers, static or not.
Ruby and JavaScript were interpreted languages. The kicker isn't the eval function, but rather the def/prototype functions. In Ruby, you can instantiate a String object named str, add a method to String, and then immediately call that method on str. Upshot? Imagine for a moment replacing (or removing) an object's toString method on the fly.
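For anyone who hasn't seen this in action, here is a minimal sketch of the same trick in Python instead of Ruby. One assumption to flag: CPython won't let you add methods to the built-in `str`, so the example uses a made-up `Greeting` class in place of String.

```python
class Greeting:
    """Hypothetical stand-in for Ruby's String in the comment above."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

g = Greeting("hello")  # the instance exists before any patching happens

def shout(self):
    return self.text.upper() + "!"

# Add a method to the class at runtime; the existing instance sees it at once.
Greeting.shout = shout
shouted = g.shout()

# Replace the string conversion on the fly, the "toString" case.
Greeting.__str__ = lambda self: "<redacted>"
shown = str(g)
```

Nothing was recompiled and `g` was never recreated, which is the part a compile-at-runtime service doesn't give you by itself.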
It's only "meh for everyone else" until the few folks who are interested start building tools that other people can use.
I'd love to have a toolchain that gave me the option of directly integrating what are now stand-alone tools into my build process. Static analysis tools, syntax checkers, security analysis, and even future tools that nobody has even thought of creating yet.
Ever used JSP before? You know that JSP pages are compiled (either on the fly or precompiled) and (if you're smart) you store off the compiled .java files so you can debug when your page goes belly-up.
(You have to store the pages, because the line numbers match the .java classes, not the JSP pages themselves)
Now, we're removing the compiling mess, moving it to .NET as a service, and standardizing the calling of compiling those pages.
This premise, a managed AST you can manipulate programmatically (a SOM, Source Object Model), plus a managed compiler pipeline to compile, is nothing new. The Boo language was doing this on .NET, and I'm sure there are many examples before it; Boo was started in 2003.
"C# may still be a compiled language, but it effectively gains all the flexibility and expressiveness that dynamic languages such as Python and Ruby have to offer".
..
And ties you even more effectively into the mothership
Seems to me Microsoft is now attempting to do with compilers what they attempted to do with the mobile phone.
Join the Slashcott! Feb 10 thru Feb 17!
How is this any different at all from Javassist?
Arbitrary CPU instructions aren't a problem. Arbitrary holding of the CPU and arbitrary API calls are a problem. The OS shouldn't award too big a slice to just any arbitrary sub-process. The parent process should check for unauthorized API calls. Something like this can be sandboxed. The question is, what kind of box are they putting it in?
Given the current ability of scripts to lock up IE and perform drive-by attacks, I'm not too optimistic about how they've secured it either. I'm just saying that it's not impossible.
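To illustrate the "check for unauthorized API calls" idea, here is a rough sketch in Python: the parent inspects the syntax tree before it agrees to run anything. The `ALLOWED_CALLS` whitelist and the `guarded_eval` helper are hypothetical, and this is emphatically not a real sandbox. It does nothing about CPU time, and static checks like this are known to be easy to get wrong.

```python
import ast
import builtins

ALLOWED_CALLS = {"abs", "min", "max"}  # hypothetical whitelist for this sketch

def check_calls(source):
    """Parse an expression and reject calls or attribute access not whitelisted."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_CALLS):
                raise PermissionError("call not allowed")
        elif isinstance(node, ast.Attribute):
            raise PermissionError("attribute access not allowed")
    return tree

def guarded_eval(source):
    """Evaluate the expression only after the static check has passed."""
    tree = check_calls(source)
    # Expose only the whitelisted builtins to the evaluated code.
    safe_builtins = {name: getattr(builtins, name) for name in ALLOWED_CALLS}
    return eval(compile(tree, "<guarded>", "eval"), {"__builtins__": safe_builtins})

value = guarded_eval("max(abs(-3), 2)")
```

Having the parse tree available as data is what makes the check possible at all; with an opaque compiler you could only run the code and hope.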
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
http://www.occupytheboardroom.org/
Ah, you are so right. Shame you'll just get modded down for telling the Truth (shame there are so many folk out there who think Windows' desktop dominance means that Windows == computing).
THAT'S WHAT CLANG/LLVM DOES!
Jesus Christ. There are some real dysfunctions pointed out, which is regrettable and may even be enough to effectively kill the project. BUT.
But ALL THAT WAILING about the choice of version control system. Waaaah! Waaaah! You won't use the version control system I prefer! Yours doesn't work! You suck! Piss off! I can't be bothered accommodating it and learning to use it to best advantage! Never mind that countless projects have used cvs perfectly successfully. Just about ALL open source projects for a long time.
That shit just turned my stomach. Specifically that shit. Just saying.
Janino, Javassist, and now the JDK have had a feature like this for a long time.
Caerusone (at code.Google.com/p/caerusone) can be used to extend the idea and use a similar facility even on Google App Engine.
So, if these functions are available to a redistributable program then anyone would be able to distribute a Microsoft compiler pretty much for free. If these functions are not available then it limits your customers to those people willing to buy an $800 compiler.
I've been around a long time, and I've never heard that. It has the kind of plausible ring that usually sends me to Snopes, where two thirds of the time I come away chastised for loaning the idea five seconds of credence.
What I know about GCC is that it had a rough adolescence and that over-arching design hardly entered into it for long stretches of time.
There's some truth to this aphorism. GPL is designed around what Stallman doesn't want people to do. It builds from a negative. Stallman doesn't want others to take away his freedom by building something he can't have.
I admire what Stallman's dogmatism enabled him to achieve. We're probably better off on both sides of the license fence because of it. At the same time, his repurposing of the word "freedom" is one of the most toxic subversions in the history of language. No, he couldn't just come up with his own word, he had to take someone else's word away. I wonder what Marshall McLuhan could have come up with given the starting point "gift culture Lebensraum".
I think closer to the truth of the matter is that gcc gained far too many extremely important use cases to start dabbling in architectural modernism. You'll note over the same time period, that Linux remained fairly far to the monolithic end of the spectrum. When a project reaches that scale, specific success factors put the stomp on architectural ideology.
Furthermore, on the C++ side, the rapid evolution of the C++ language wasn't doing anyone any favours in IDE integration.
The time is ripe for a new approach. The king is dead. Long live the king.
Here you go! Microsoft Roslyn: Reinventing The Wheel As We Already Know It
Only an idiot would use it like that on a production system
Would you call a JIT compiler idiotic, then? Because this is exactly how I foresee this stuff being used in enterprise apps, particularly ones that rely heavily on dynamic entities. We could have the app generate code on the fly that is then reused as needed, rather than reinterpreted every time with a hundred DB calls and long-winded generic form-generating code.
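A rough sketch of that generate-once, reuse-many pattern, in Python rather than C# and with made-up names throughout (`make_source`, `renderer_for`, and the `id`/`name` fields are all assumptions for the example). It expects a non-empty list of field names coming from the app's own schema, not from user input.

```python
_cache = {}         # maps a tuple of field names to its generated function
compile_count = 0   # counts how often we actually had to compile

def make_source(fields):
    """Generate source for a renderer specialized to one set of fields."""
    parts = [f"{(name + '=')!r} + str(record[{name!r}])" for name in fields]
    body = " + ', ' + ".join(parts)
    return f"def render(record):\n    return {body}\n"

def renderer_for(fields):
    """Compile a renderer the first time a field set is seen, then reuse it."""
    global compile_count
    key = tuple(fields)
    if key not in _cache:
        namespace = {}
        exec(compile(make_source(key), "<generated>", "exec"), namespace)
        _cache[key] = namespace["render"]
        compile_count += 1
    return _cache[key]

render = renderer_for(["id", "name"])
line = render({"id": 7, "name": "Ada"})
same = renderer_for(["id", "name"])  # served from the cache, no recompile
```

The generic "walk the schema every time" logic runs once per entity shape; every later call goes straight to the specialized function.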
-Billco, Fnarg.com
I've worked with FORTRAN 66 programs that could do that - the base program was a configuration processor that would read configuration files, then generate variables (usually just arrays and their sizes) and write them into a FORTRAN main program. The program would then start a chain to the compiler/linker, which would in turn chain to the compiled program.
It was a way to get dynamically resized arrays specific to a given data set.
It got easier when the dynamic load functions were added to UNIX (the dlopen/dlsym/dlclose library functions).
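The configure, generate, then chain pattern described above can be sketched in Python (not FORTRAN, and not using dlopen). The config format of one "name size" pair per line, the `generate_program` helper, and the file name are all invented for the illustration; the "chain" here is just starting a second interpreter on the generated file.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical configuration: one "<array name> <size>" pair per line.
CONFIG = "grid 4\nweights 3\n"

def generate_program(config_text):
    """Write a main program whose array sizes come from the configuration."""
    lines = []
    for entry in config_text.splitlines():
        name, size = entry.split()
        if not name.isidentifier():
            raise ValueError("bad array name: " + name)
        lines.append(f"{name} = [0.0] * {int(size)}")
        lines.append(f"print({name!r}, len({name}))")
    return "\n".join(lines) + "\n"

def run_generated(config_text):
    """Save the generated program, then chain to it in a separate process."""
    source = generate_program(config_text)
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "generated_main.py")
        with open(path, "w") as handle:
            handle.write(source)
        completed = subprocess.run([sys.executable, path],
                                   capture_output=True, text=True, check=True)
    return completed.stdout

output = run_generated(CONFIG)
```

The generated program reports each array and its size, so the sizes really are fixed at generation time rather than at run time, same as in the FORTRAN setup.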
Would anyone want to make a POS slow language like C# even slower?
I see a lot of other tools that do this, but since C# mostly started off as a ripped-off Java, it's also worth pointing out that since Java 1.6, that language has also provided public interfaces to compile code at runtime.
These are nice features. Sometimes, they are even useful (as opposed to just another hammer developers can abuse). But the announcement makes it seem, wrongly, that MSFT is doing something really unique here.
Please read the previous comments; a JIT compiler is NOT what we are discussing. Those are used to compile static, pre-existing code.
Addendum, didn't read your whole comment, still not what we were discussing. We were discussing the use of dynamic code entered by the user making a request via the web, and compiled/executed by the application. Not a specific set of pre-defined templated code that is modified without taking code directly or indirectly from the user.
C# has always supported compiling additional code at runtime.
I've had it in projects since the 1.0 release.
They may be redoing the structure and making it easier to do, but doing it isn't new.
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager