Extensible Programming for the 21st Century
Anonymous Cowardly Lion writes "An interesting article written by a professor at the University of Toronto argues that next-generation programming systems will combine compilers, linkers, debuggers, and that other tools will be plugin frameworks [mirror], rather than monolithic applications. Programmers will be able to extend the syntax of programming languages, and programs will be stored as XML documents so that programmers can represent and process data and meta-data uniformly. It's a very insightful and thought-provoking read. Is this going to be the next generation of extensible programming?"
The document is mirrored here to help compensate for the bandwidth deluge.
When life gives you lemons, you CLONE those lemons, and make SUPER-LEMONS. -- Dr. Cinnamon Scudworth, Ph.D
... will obviously be "forbidden." Yes, I did RTFA.
This is incredibly stupid. How come XML helps in dealing with data and metadata? Metadata *is* data.
What we really want is a user-extensible type system, like the one proposed by Date and Darwen in _The Third Manifesto_ for relational database systems. Remember, types are domains plus operators.
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
programs will be stored ... so that programmers can represent and process data and meta-data uniformly.
Yup. Back in the day, we called this "Lisp". It was about as readable as XML, but a hella lot more fun.
Extensible, so that each programmer can use a whole heap of functional and syntactic elements that no other programmer has ever heard of...
XML, so stuff that doesn't need to be human-readable is human-readable, and the whole mess is a good six times larger than it needs to be...
Plugins, so that everything can be dependent upon proprietary, bulky, inefficient runtime engines...
I am all for progress, and not married to old-school solutions by any means. However, some things can sound good in theory without actually representing progress.
XML? Bah. Next generation languages will be written in "WIMNNWIS" (What I mean not necessarily what I say) and will run on processors liberally sprinkled with pixie dust.
taken! (by Davidleeroth) Thanks Bingo Foo!
And suddenly he's prophesying the future?
Editors like Emacs, Visual SlickEdit and even the loved/loathed MS Visual Studio have plug-in frameworks.
As for XML being the "glue" for holding things together... No. It'll be a data-neutral "modulator": you emit your data from your program by name in a particular format. Transmission and receipt by the other programs will be handled by a remodulator. In between it might be XML, it might be binary, it might be whatever you feel like using that day.
(and no, I haven't read the article (FORBIDDEN))
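A rough sketch of what such a format-neutral "modulator" might look like, assuming Python and its standard library. Nothing here comes from the article: the names `emit` and `MODULATORS`, the record fields, and the binary layout are all made up for illustration.

```python
import json
import struct
import xml.etree.ElementTree as ET

# Hypothetical "modulators": each turns the same named fields into one wire format.
def to_xml(record):
    root = ET.Element("record")
    for name, value in record.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root)

def to_json(record):
    return json.dumps(record, sort_keys=True).encode("utf-8")

def to_binary(record):
    # Length-prefixed name/value pairs; a purely illustrative layout.
    out = b""
    for name, value in sorted(record.items()):
        n, v = name.encode("utf-8"), str(value).encode("utf-8")
        out += struct.pack("!HH", len(n), len(v)) + n + v
    return out

MODULATORS = {"xml": to_xml, "json": to_json, "binary": to_binary}

def emit(record, fmt):
    """Emit named data in whatever format was picked that day."""
    return MODULATORS[fmt](record)
```

The program only ever names its data (`{"file": "a.c", "line": 42}`); which encoding travels over the wire is a lookup in a table, so swapping XML for something else touches one string.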
How can humans tell what will be in a few years if they can't tell what will be tomorrow?
I'd completely agree if the claim wasn't "that next-generation programming systems will combine compilers"... but "should combine...".
Right, the idea is nice. But where the market will go, how big corporations will guide development, what will become the new fancy, or whether some new development will render XML completely obsolete and make it feel ugly compared to that "new thing" - we don't know.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
Instead of treating each new idea as a special case, they allow programmers to say what they want to, when they want to, as they want to.
Is this not the Ultimate goal of programming? The Holy Grail of programming perhaps?
I have a theory that the truth is never told during the nine-to-five hours. -- Hunter S. Thompson
Dude, I have enough trouble debugging my code without having my homemade, guaranteed to be buggy, optimizer introducing even more bugs...
/greger
I don't understand this fascination with XML. It's just a generic container for storing data - nothing more. OpenOffice uses it as the underlying format for storing documents, but that doesn't mean I have to deal with it when writing a document. It's transparent to the end user.
In the same way, why should I have to deal with it when coding? It's sort of like requiring coders to be able to pop up a hex editor and cruise through the code.
Remember MVC (model-view-controller)? Being able to disassociate the different parts was considered a good thing. Swing decided it was too cumbersome, ASP.NET joins them at the hip, and now we've come all the way around, with Microsoft proclaiming with XAML that everything should be dumped into one big XML box.
Bleah.
Sometimes XML is not the answer. That being said, there are also some really great uses, but XML was not made for everything.
This is not the next generation of programming systems but rather the present one for pretty much everyone except for those using Microsoft tools.
Again, nothing new.
There is no way in hell that would ever happen. Ever.
No.
Now, I will read the entire article, but somehow, I am not holding my breath...
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Now you can do this in C++, but look at what you need to implement to do it
It would be great, if instead, I could hook into the compiler and tell it exactly how it should handle vectors.
Umm... what makes you think that programming a compiler is going to be more straightforward than doing generic programming? That seems like a huge assumption to me.
The closest thing I've seen to what this article talks about was CLOS's MOP, which was great, but once again, a lot of people had a hard time grokking it.
sigs are a waste of space
So this guy's familiar with UNIX, he's familiar with Lisp, yet he thinks the future is XML and hideous frameworks with ever-changing APIs? Not often you see someone with a hammer AND a screwdriver using the hammer to pound screws.
Everybody's a libertarian 'till their neighbour's becomes a crack house.
/greger
that next-generation programming systems will combine compilers, linkers, debuggers, and that other tools will be plugin frameworks, rather than monolithic applications.
yup, it already happened. more than 10 years ago. it's called Rule of modularity and Rule of Composition. In case you don't know. It's the Basics of the Unix Philosophy
#
#\ @ ? Colonize Mars
#
We've seen what can happen to languages when countries get conquered. English is one of the best examples. Try to read some Old English to see for yourself. With XPL (Extensible Programming Language), you can no longer say "I know C++" or "I know C#". Someone will ask you to maintain some code, and you'll take a look at it and have no idea what is going on until you learn the extensions. This will happen over and over again with every project you are supposed to maintain. This is BRAIN FRYING and opens up huge possibilities for mistakes. It is just like waking up every day and being asked to speak in another human language. Today English, tomorrow French, the day after tomorrow Bengali, can you do it?
Yep, and around the same time, we'll all be typing on Dvorak keyboards in Esperanto talking about the new flat tax. :)
If I recall correctly, weren't Fourth-Generation languages going to be the future of programming back in the early 80's?
(Machine code, Fortran/Basic-type languages and Pascal/C-type languages being the supposed first, second and third generations, IIRC)
Then in the early 90's.. OOP was going to save the world. Not that it hasn't had impact, but it certainly hasn't fundamentally changed things.
And now it's XML that's going to save the programmers, while the old-timers whine that we should all really be using Lisp.
Not that I'm a computer-language conservative myself, but it's worth pointing out that historically, there has been quite a big discrepancy between which languages the Comp-Sci researchers feel everyone should be using, and the ones which actually are used.
that next-generation programming systems will combine compilers, linkers, debuggers,
...THINK Pascal (for the Mac) was doing this almost 20 years ago: the editor served as the front end to the compiler --- so the syntax highlighting in the THINK Pascal editor was driven by the lexer (really was the lexer): you knew about syntax errors immediately. The debugger was fully integrated into the environment. It was really sweet, and probably one of the best programming environments ever written.
and that other tools will be plugin frameworks
Like Unix pipes and Eclipse?
Tomorrow arrived yesterday and appears today.
but much of this 'vision' is implemented in Microsoft's .Net Framework and Visual Studio!
So basically, we get to combine the speed of Java/.NET with the user friendliness of XML and the security of COM? May god have mercy on our souls...
Adrian
...I'd say roughly 10% of you have actually read the article. He mentions specifically many of the criticisms you've mentioned. I don't think this is earth shattering, but some of the ideas are pretty good.
I for one like the idea of source code stored as XML, but not displayed or edited as XML. Imagine, viewing source code in the format you specify (eg positioning of braces). And it would be really nice to be able to treat source code as data without breaking your back writing a parser. And for those of you worried about bloat - honestly, we're talking about text files here!
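To make "source code as data" concrete, here is a small sketch using Python's standard `ast` module (with `ast.unparse`, which needs Python 3.9 or later). It is not XML and not what the article proposes; it only shows the general idea of querying a program as a tree and regenerating the text from it. The sample functions are invented.

```python
import ast

SOURCE = """
def area(w, h):
    return w * h

def perimeter(w, h):
    return 2 * (w + h)
"""

tree = ast.parse(SOURCE)

# Query the program as data: map every function name to its parameter names.
functions = {
    node.name: [arg.arg for arg in node.args.args]
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
}

# Regenerate text from the tree; the layout comes from the printer, not the author.
regenerated = ast.unparse(tree)
```

No hand-written parser is involved, and a different printer over the same tree could emit whatever brace or indentation style the reader prefers.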
It's bad enough tracking down the umpteen libraries that an open source program depends upon now. Now we have to track "Bob's special compiler" as well?
Besides, we already have compiler "plugins" for extending the syntax. They have names like bison and flex. Anyone can layer new functionality on a language through meta-translation, if there is a reason to do so. But you better have a reason!
why simple application software needs 2G of RAM and multi-GHz CPUs to get the responsiveness I got on a 100MHz 486 with Win3.11.
Engineering is the art of compromise.
We have had these kinds of integrated, extensible systems: Smalltalk-80, Lisp, and others. And we have had the same tired, old arguments against UNIX since its original design (you can read up on them in the UNIX-HATERS Handbook). Smalltalk-80 and Lisp didn't fail because there was some grand conspiracy against them, they failed because people voted with their feet.
Most real-world programmers apparently just want to put up a bunch of dialog boxes and windows, interact with the user a little, and interact with a database. They don't want to extend the programming tools or language or modify the optimizer, they want it to just do what they need it to do. And if it doesn't do what they need it to do, they just pick a different language and environment and don't go on a crusade to develop zillions of plug-ins and modifications. Programmers stick with text files not because they believe that they are the best representation, but because they actually work pretty much everywhere.
Some of the changes Wilson advocates are happening. That's not surprising, given that the features he advocates have been around for decades and many people are familiar with them. But they are happening in an incremental way and people pick and choose carefully which aspects of Lisp and Smalltalk-80 they like and which ones they don't. For example, you can get versions of GNU C that output interface definitions in XML format. IBM VisualAge maintains Java sources inside databases (not text files) and permits incremental recompilation. Many Java development environments have plug-in architectures. Many editors now permit structure-based editing operations ("refactoring") and display "styled" source code, using the raw ASCII text just as a formal (non-XML) representation of the program structure. Aspect-oriented programming adds a great deal of extensibility to languages like C++ and Java. On the other hand, general-purpose macros are out--language designers made deliberate decisions not to include them in Java, C#, and similar languages.
Altogether, it looks to me like Wilson is merely restating what is already happening and combining that with a good dose of UNIX hatred. If he would like the industry to move in a different direction, there is a simple way of doing that: he should implement what he thinks needs to be done. I think an XML-based programming language (and several have been proposed) has about as much chance at flying as a lead balloon, but, hey, surprise us.
How about developing Maintainable programming?
That's exactly one of the author's points! You shouldn't (and in his vision won't) have to deal with the XML directly, -unless- you are one of the people actually writing new plugins rather than just using them.
His suggestion is primarily that we start using editors that transparently present the 'code file' in our choice of format rather than forcing us to edit it byte-by-byte. It's like the syntax-highlighting you probably use now, only affecting more than just colors.
Using XML for the underlying syntax is mostly irrelevant to his proposal, but he suggests it merely because it is currently popular, well supported, and well suited to its primary job of presenting data in an easily MACHINE READABLE format.
His proposal is, in fact, exactly the opposite of requiring coders to pop open a hex editor, and he likens our current ASCII-only coding methods to doing exactly that at one point.
That's not how I read the article's proposal at all!
The code you've been asked to maintain is stored in some standard machine readable format. When you come in you then use the code-editor program to view it using -your- extensions, and the underlying primitives of the code objects are presented in the manner you're used to.
Whatever extensions and transformations the original author used to create the code would be relatively meaningless, which (for many of the reasons you describe) is a good thing.
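A toy illustration of that reading, assuming Python and a made-up XML storage format. The element names (`function`, `call`, `return`) and the `render` helper are hypothetical and not taken from the article; the point is only that one stored tree can be shown in each reader's own style.

```python
import xml.etree.ElementTree as ET

# Hypothetical storage format: element and attribute names are invented here.
STORED = """
<function name="main">
  <call name="puts"><string>hello</string></call>
  <return>0</return>
</function>
"""

def render(xml_text, brace_on_new_line=False, indent="    "):
    """Render the stored tree as C-like text in the reader's preferred style."""
    fn = ET.fromstring(xml_text)
    header = f"int {fn.get('name')}()"
    lines = [header, "{"] if brace_on_new_line else [header + " {"]
    for stmt in fn:
        if stmt.tag == "call":
            args = ", ".join(f'"{a.text}"' for a in stmt.findall("string"))
            lines.append(f"{indent}{stmt.get('name')}({args});")
        elif stmt.tag == "return":
            lines.append(f"{indent}return {stmt.text};")
    lines.append("}")
    return "\n".join(lines)
```

Calling `render(STORED)` and `render(STORED, brace_on_new_line=True, indent="\t")` gives two differently laid-out views of the same stored program; neither reader ever sees the XML.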
- Low level developers. People programming in C; the ones writing Linux and KDE.
- Quasi-low level developers. People programming in Java; the ones writing much of the business software right now.
- High level developers. People programming in scripting languages, like Ruby, Python, PHP, JSP, Javascript.
The second group is the most visible, because business loves them. The first group is the second most visible, because -- while it isn't as "hot" a technology in Monster -- most of the software we use is written at this level. I suspect that the third group is the one that will goose the business community in the future, and will probably eclipse the second group. I'd guess that this is a submarine technology; you don't see many job postings for Ruby programmers, but a heck of a lot of software is being written in it. Even more is being written in PHP, JSP, and Python.
I imagine something like Python or Ruby, or some other high-level language that's easy to write software with, coupled with a decent compiler will be the real winner in the near future. Get some type inference for one of these languages, and the ability to compile it (as with Parrot), and group two will mostly go away. Java claims to be a more productive language than C because of higher level features; modern scripting languages are even better at increasing productivity, and their only real limitation is their speed, or lack of it. Just as Java eventually overcame the speed issue, so, too, do I expect some future version of a scripting language.
But, maybe Java will hang in there. If you look at Java 1.5, you see a lot of increased syntactic sugar that has usually been only available in languages like Ruby -- I've heard that this was motivated by similar constructions in C#. Perhaps Java or C# will evolve enough syntactic sugar that hacking out code will be as easy as doing so in Ruby. IMO, it'll take a more radical language change than that provided by 1.5; my biggest complaint about Java these days is that it gets in your way; a large chunk of the code you write for any application is infrastructure, and you write it over, and over, and over (anybody else sick of ActionListeners yet?). I'd like to see the typing system changed to type inference... but it is possible.
I doubt, however, that software development is going to evolve into choosing black boxes from a set of tools and plugging them into each other, mostly because, to cover all possible jobs, the framework would need access to a huge number of fine-grained tools, and by that point, you might as well just write the code yourself. Look at the size of the Java APIs. How many packages are there? How many classes? How many methods? This is making our lives, as programmers, easier... how?
Where Wilson goes wrong is in assuming that this kind of environment will be built based on plug-ins. The interrelationships needed between the components to get the required level of functionality are too great. What many people have already noted is that the current Unix environment is in fact based on plug-in development. Editors, debuggers and compilers are modularized as programs, with clean lines of communication between them in the forms of files and streams (which Unix again abstracts to one concept). The limitation of this system lies in the fact that the modules all use their own separate address spaces, and hence each one has to have a private representation of the program. This can't be mitigated by having the separate tools communicate to a central database (this is the most that Wilson's proposal of using XML as the underlying format can accomplish), because then the method of communication would be the limiting factor. Of course, you can use the neutral code-data representation to make the communications between the modules and the database be in terms of sending closures (from reading the paper, I don't think Wilson even considers this), but then you've just designed a single distributed address space, and in the process removed all the encapsulation and modularity advantages of the communication links (not to mention introducing a whole slew of concurrency issues)!
One such integrated system has been built in the past, called Interlisp. Barstow, Shrobe, and Sandewall's book (mentioned above) has a few papers that describe the system, but briefly a few lessons can be distilled from it. First of all, the system itself was an integrated development environment for a dialect of Lisp, where everything was done in one in-core address space: source code (including comments) was represented by data structures in memory, upon which the structure editor (residing in the same address space) operated directly. Code could either be interpreted from the data structure or compiled by the (yes, in-core) compiler. There were several extended packages (besides a Lisp macro-like facility), notably the structure editor and "Conversational LISP," a pseudo-natural language command-prompt parsing system. Although source code (and data) could be serialized to files (there was a sophisticated change-tracking facility that took care of this), the usual way of working was by saving the core image to disk and loading it next session, so the whole environment was persistent. There were hooks for everything from the parser to the compiler to error handling down to the most basic frame-handling code of the stack-based VM, and in order to implement the facilities mentioned above (and some other ones I left out, like the ever-present DWIM automatic error-correction facility) the code took full advantage of them. This caused some trouble when it came to portability of the components and of Interlisp itself (the heavy interdependence caused many problems in bootstrapping the system). Some of these incidents are documented in Barstow et al.'s book, but the Interlisp bootstrapping difficulty has been mentioned in all of the Interlisp porting papers I've read.
Unfortunately, I don't think a system with those capabilities can be built with the restrictions of modularization, since all of the things it did are applicable to programming in any language, and to do them required precisely the
In the great CONS chain of life, you can either be the CAR or be in the CDR.
The ultimate expression of this was realized with the Symbolics LISP machine. Everything was in LISP. Everything was hackable. The MIT Space Cadet keyboard, with six shift keys (Shift, Ctrl, Meta, Super, Hyper, and Top). All 2^16 keycodes could be bound to any EMACS function.
I've used both. They sucked. Partly because they didn't work very well, but mostly because all that flexibility and programmability had negative value. Language and UI design are hard. Evading the problem by making everything changeable does not fix the problem.
His point about XML being another way to put LISP S-expressions into textual form is well taken, though. They're both trees. The problem with LISP is that while the data structures are valuable, the programming notation really is a pain.
LISP works well as a web development environment. Viamall, which later became Yahoo Store, was written in LISP. That was one of the first web applications that really did something elaborate on the server. You could create web pages on the server from a web browser. And the overhead was lower than with XML, where you're forever re-parsing text strings into trees.
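The "they're both trees" point above can be shown in a few lines. This is only a sketch, assuming Python's standard `xml.etree.ElementTree`; the `to_sexpr` helper and the sample document are invented, and attributes are simply ignored.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Render an XML element as an S-expression string (attributes ignored)."""
    parts = [elem.tag]
    if elem.text and elem.text.strip():
        parts.append(elem.text.strip())
    parts.extend(to_sexpr(child) for child in elem)
    return "(" + " ".join(parts) + ")"

doc = ET.fromstring("<add><num>1</num><mul><num>2</num><num>3</num></mul></add>")
sexpr = to_sexpr(doc)
```

Here `sexpr` comes out as `(add (num 1) (mul (num 2) (num 3)))`: same tree, a lot fewer angle brackets.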
Well of course that's what templates are. Yes, their syntax is horrendous but that's what comes of trying to wedge the concept into the existing crannies of C syntax (or when, as Stroustrup remarked to me once, "the ecological niche was already polluted").
If you hanker for a language in which metasyntactic extension is natural, you need Lisp macros (or here and here for a more complex example), Scheme "hygienic" macros or the CLOS MOP.
But if you really want to consider "hooking into the compiler" as you say, then you should look at the reflective programming work, the groundwork for which was laid down almost 25 years ago by Brian Cantwell Smith and was even implemented, by me and others, back then. Although a lot of work continued in this area, that vein pretty much got mined out: unless you can think up a completely new control structure there's not a huge amount more you can do with such a system than you could with a normal metasyntactic extension mechanism.
HTH
-d
compilers, linkers, debuggers, and other tools will be plugin frameworks, rather than monolithic applications
Uh? My compiler acts as a "plugin" via make, which is called from emacs. If I want another compiler, I tell make, and voilà, it's "plugged in". Welcome to the world of 'NIX, Mr. Wilson.
What is worse, every tool's command-line mini-language is different from every other's
But this is their strength! Different tools solve different problems - and they use different languages to describe what they do, because they are *fundamentally* different (awk is not sed is not grep is not ls). How would you possibly write up a single language to describe what both sed and awk do, without poorly re-creating perl?
Attempts to stick to simple on-or-off options lead to monsters like gcc, which now has so many flags that programmers are using genetic algorithms to explore them
Most CS majors will know that modern CPU architectures are complex beasts, and that it is pretty hard to come up with which combination of optimization methods will yield the best performance on some particular revision of some particular CPU on some particular hardware configuration. Nothing mysterious about that. I completely fail to see what that has to do with command line options.
And instead of squeezing their intentions through the narrow filter of command-line mini-languages, programmers can specify their desires using loops, conditionals, method calls, and all the other features of familiar languages
Instead of squeezing my intentions thru the narrow filter of command-line mini-languages, I can specify my desires using what a standard shell (like bash) has to offer. Ladies and gentlemen of the jury, this is not making sense!
The result is that today's Windows developers can write programs in Visual Basic, C++, or Python that use Visual Studio to compile and run a program, Excel to analyze its performance, and Word to check the spelling of the final report.
Oh come on, please... So if I develop on windows, I can use VB, C++ and Python. How is this relevant? There are more useful languages available on the dreaded "command line systems" ('NIX), but let's just agree that there are plenty of languages available on most OS'es out there - regardless of the windowiness or commandlineness of the system.
Using VS to compile and run the application? Well, if your command line absolutely sucks, then I can imagine why you would want to launch your app from your editor - a matter of taste too maybe. But relevant? How?
Somehow I need COM in order to put numbers into Excel? Ever heard of CSV? You know, new-line terminated lines of T-E-X-T which can be processed by these little all-different tools, like, for example, Excel.
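For the record, getting numbers into a spreadsheet this way takes a handful of lines. A minimal sketch, assuming Python's standard `csv` module; the phase names and timings are made up for illustration.

```python
import csv
import io

# Hypothetical timing data; phase names and numbers are invented.
timings = [("parse", 0.12), ("optimize", 0.87), ("codegen", 0.31)]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["phase", "seconds"])
writer.writerows(timings)
csv_text = buffer.getvalue()
```

Write `csv_text` to a file and a spreadsheet program can open it directly, as can awk, sort, or anything else that reads lines of text.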
The part about Word and spell-checking of a final report... What? What's your point? If I use COM for developing software, I can spell-check in Word? If I use a command line, I cannot spell check a report that I write about it in Word?
A similar API allows the popular memory-checking tool Purify to be used in place of Visual Studio's own debugger, and so on.
Absolutely! Plugins make perfect sense in certain places. Dude - GUD is written in Emacs LISP, it's a plugin for GDB. You could write an elisp file for Purify as well - in fact, Intel actually ships an elisp file for their debugger, even on Windows... Plugins make sense in some places, in other places they don't. Which, lo and behold, is why they are used in certain places and not others.
One of the great ironies of the early 21st Century is that the programmers who build component systems for others are strangely reluctant to componentize their own tools. Compilers and linkers are still monolithic command-line applications: files go in, files come out
Why does he not see what he's writing?!? A compiler reads a number of input files and generates an output file - this is a perfect match for a command-line too
This is not the golden, unifying solution or anything, but there's some ideas in there that could be useful.
I saw his thoughts in the first section of the paper and took the rest as some quick examples of how it might look.
I can think of plenty of directions this could go. The first thing I got out of it is applying the same level of abstraction we try to implement in programs to the act of programming itself. This is happening in all kinds of areas of computers anyway (like abstracting file systems, GUI's, etc.) why not put programming into the mix?
It's not about using scripting instead of programming languages, it sounded more like building the same features into our programming tools as we build into the apps we write with 'em.
Why all the negative reactions? If it's about losing your editor to write code, you didn't read the article. If it's about too much abstraction to program, then it seems kinda hypocritical considering all the frameworks we use for other people's tools. Or is it just irritation about having to relearn a bit and keep on coding as before? The complaints about XML are odd too. He chose a machine-parseable, human-readable, widely used format as a possible way to store programs at a low level.
AB HOC POSSUM VIDERE DOMUM TUUM