Are Extensible Programming Languages Coming?
gManZboy writes "Programming writer and instructor Greg Wilson is proposing that the next generation of programming languages will use XML to store not only such things as formatting (so you can see indentation your way, and I can see it my way, via XSLT) but even programmatic entities -- like: <invoke-expr method="myMethod"><evaluate>record</evaluate></invoke-expr>. Wacky, but perhaps wacky enough to be possible?"
Look, I can understand using XML to convey data... but honestly, you don't need to use XML for everything under the sun. Good old proven methods work just fine, thank you very much.
Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
...programmatic entities -- like: <invoke-expr method="myMethod"><evaluate>record</evaluate></invoke-expr>. Wacky, but perhaps wacky enough to be possible?
Hopefully, no. Christ almighty, why is there this surge in interest for pointless layers of abstraction on top of the code? It seems some people are desperate to do anything to avoid actual implementation (work?), preferring to dance around the periphery of a project, adding needless fluff and speedbumps. Honestly, will the addition of XML markup in source code REALLY help to advance a project, make the code more readable or avoid bugs?
Code, Hardware, stuff like that.
We all know how programmers like languages that require typing a lot of
verbose and lengthy expressions. Y'ever notice how *popular* COBOL is?
Did you notice how many more languages have copied Pascal's style of
delimiters BEGIN/END versus the C style {/} or the lisp style (/), and
how popular those languages are?
It's different for data, because you don't type them in by hand most of
the time; you write a program that generates them.
Cut that out, or I will ship you to Norilsk in a box.
Most good IDEs already support autoformatting a document so that the indenting and bracketing fit the user. I don't see how making formatting a core part of the language will really help the language at all.
Not to mention the fact that programming languages (not assembly) by definition, are extensible. Most programming languages provide loops, if statements, and ways to define classes, methods, and variables. Some programming languages provide standard libraries so you don't have to do everything from scratch. I don't see anything new that this will offer.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Seriously. Isn't software bloated enough? Why obfuscate things further. Dumbest... idea... ever.
"With sufficient thrust, pigs fly just fine." -- RFC 1925
1 - Compilers with plug-in architectures - GCC anyone? I know, he probably means something quicker and easier than writing new front- and back- ends for the Gnu Compiler Collection, but the concept is already out there.
2 - Just about any modern language does this to some degree depending on your definition. Under even the most rigorous definition of this, the good old language LISP does it with flair. Users can extend LISP syntax with ease, and user-added extended LISP syntax is virtually indistinguishable in style and functionality from the built-in elements of the language.
3 - Since existing languages have a well-known syntax which is easily machine parseable (in fact, that's what the parser and compiler do every time you use them on your source code), existing computer languages are already in a format which allows easy conversion into other formats and representation, and the gathering of metadata. Converting semicolons, whitespace, and parentheses (or whatever your language of choice uses) to xml tags doesn't really change anything, except to make things uglier and harder to type.
11*43+456^2
XML is certainly more portable than binary code
That's a huge fricking lie that I wish would die.
Your TCP/IP packets don't all start and end with < and >, and they seem to be fairly portable.
Endian-ness and packing are not rocket science.
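To make that concrete, here is a minimal Python sketch of a binary record with a pinned byte order. The record layout (a 16-bit id, a 32-bit count, a 64-bit float) and the function names are made up for illustration; the only point is that stating the endianness and packing explicitly gives the same bytes on any machine.

```python
import struct

# Hypothetical record layout, invented for this example:
# 16-bit unsigned id, 32-bit unsigned count, 64-bit float.
# The "<" prefix fixes little-endian byte order and turns off
# native alignment padding, so the layout is machine-independent.
RECORD = struct.Struct("<HId")


def pack_record(rec_id, count, value):
    """Serialize one record into exactly RECORD.size bytes."""
    return RECORD.pack(rec_id, count, value)


def unpack_record(data):
    """Deserialize bytes produced by pack_record into a tuple."""
    return RECORD.unpack(data)


blob = pack_record(7, 1000, 2.5)
print(RECORD.size, blob.hex(), unpack_record(blob))
```

Whether this is more or less "portable" than XML is the argument of the thread; the sketch only shows that the byte-order part is a one-character decision.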
Education is the silver bullet.
S-expr's and xml are interchangeable, for the most part. Congratulations, you can now be a total degenerate and program in xmlisp.
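A rough sketch of that interchangeability, in Python, with nested lists standing in for S-expressions. The `atom` tag and both function names are my own assumptions, not anything from the article, and the mapping breaks down if a list's head is itself called "atom".

```python
import xml.etree.ElementTree as ET


def sexpr_to_xml(expr):
    """Convert a nested list such as ["add", "1", ["mul", "2", "3"]]
    into an Element. The head of each list becomes the tag name and
    every non-list item becomes an <atom> child."""
    head, *rest = expr
    node = ET.Element(head)
    for item in rest:
        if isinstance(item, list):
            node.append(sexpr_to_xml(item))
        else:
            ET.SubElement(node, "atom").text = str(item)
    return node


def xml_to_sexpr(node):
    """Inverse of sexpr_to_xml: rebuild the nested list."""
    if node.tag == "atom":
        return node.text
    return [node.tag] + [xml_to_sexpr(child) for child in node]


expr = ["add", "1", ["mul", "2", "3"]]
as_xml = sexpr_to_xml(expr)
print(ET.tostring(as_xml, encoding="unicode"))
print(xml_to_sexpr(as_xml))
```

The round trip works, which is the whole joke: the XML form carries exactly the same tree as (add 1 (mul 2 3)), just with a lot more typing.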
Wow, I never had this much trouble posting on slashdot before. Try making a joke in XML. You're screwed... First it rejects repeated tags and then silently deletes the tags it doesn't like.
<parenthesis>
<parenthesis>
a
<comma>
b
</parenthesis>
c
</parenthesis>
-- http://thegirlorthecar.com funny dating game for guys
Programs are written by humans and they should stay easily legible and comprehensible to humans. Going to such extremes as to use XML as the author of the article suggests would defeat that purpose. It's a common trap that people who get too deeply involved with something fall into: they want to make everything use the object of their obsession. I don't recall any ASN.1 zealots trying to push for something that extreme, but with XML there are more and more people pushing for XML to be where it should not be.
Don't certain crappy HTML-producing languages, for example CFML, already operate this way?
For example (and yes, I've forgotten my crappy CFML syntax, so this is pseudo-crappy CFML (as opposed to crappy pseudo-CFML)),
<CF-FOR INDEX="I" START="0" END="5" STEP="1">
<CF-PRINT>%I%</CF-PRINT>
<CF-IF CONDITION="I=3">
<CF-PRINT>Aw yeah, baby, three!</CF-PRINT>
</CF-IF>
</CF-FOR>
And, to expand upon my question from my subject:
Aren't they already here? And aren't they already crappy?
If I'm wrong, then this might be slightly more interesting in the long run than, say, Cyclone, where you have to learn a tiny amount more of additional syntax to mark that "this pointer was meant to point to data, not code", "this pointer should not write beyond this boundary", "this function has no business mucking up its stack", etc.
Alternatively, look at Visual Studio.NET. vs. The latter is a bit more readable but more annoying to write. Better we have tools to generate this stuff for us.
And then someone will come out of the woodworks to say "Knuth had Literate Programming back in the 80s, why the fuck aren't we using that?" but that's another rant altogether.
[o]_O
-XML is not the panacea.
-XML was made for communication between different programs, not for humans to write or think in.
-This was done before in LISP.
10 times each morning. If in a week you are still thinking about this, call me back.
If you think that extensible programming languages aren't already here, then read On Lisp (some familiarity with Lisp is necessary).
Why do people love to use XML for all sorts of inappropriate things?
XML does not make data immediately understandable. All it does is remove one parsing problem, leaving the much more important problem of understanding the meaning of the tags, data, and their combination.
XML might make sense as a compiler intermediate format, or even as a source archive format, but it has essentially nothing to offer in terms of extensible syntaxes (except for reminding us that the surface syntax of a programming language and the abstract syntax it represents can be as independent as we choose) or semantics in programming languages. (By the end of the article, this is essentially the point he comes to, with the only argument for XML being that it is popular.)
There are now 140 or so comments, and it is painfully clear from them that almost none of the posters have actually bothered to read the article. If they had, they wouldn't be confused on the following:
The author also makes a persuasive case about programmers' hyper-conservatism compared to other computer users:
Something to think about.
Languages need to evolve out of the pure text medium. This has been happening as incremental hacks to classic languages through code folding editors and AST-aware, intelligent IDEs like Eclipse, literate programming and Python's doctest module. High-level development tools like Delphi were early adopters of the philosophy that code doesn't need to be visualized as text when it's better to visualize it graphically.
The next step is to store not text but structure. For example, why shouldn't I be able to comment on -- annotate -- a specific number in a mathematical formula in my code? With current text-based languages this would be a headache:
Instead, I could just select the value in my editor, click on the annotate key, and enter (in nice WYSIWYG HTML or whatever) my comment there. As a result, the editor will show a tiny icon next to the number, or perhaps in the margin, indicating that there's an annotation.
And why are formulas like that represented with such a poor syntax? Why can't I easily use proper Greek letters and standard math notations such as dots for multiplication, a horizontal line for divisions/fractions, etc.? Why can't I insert images into the source file which illustrate the concept it implements?
What I'm talking about isn't just "rich source code", which Donald Knuth's literate programming concept covers to some extent. Languages will experience a revolutionary leap when they start treating language elements as flexible blocks of content as opposed to tokens in an AST. Consider internationalization; instead of looking up a string from a language-specific message table, your source code can include the string in every possible language, hidden away in a single visual representation -- it might look something like:
where "English ..." is a link that opens up a nice GUI letting you change the strings in different languages. The logic to select the string to choose at runtime exists in the string "component" itself.
A common problem in dynamically-typed languages is that it's hard to implement optional static typing at the language level. It adds a lot of noisy syntax, and unless you add a lot of syntax, it's hard to solve many ambiguities and special cases. With a rich source format, you can hide away the details, similar to my annotation example.
Unix geeks typically balk at non-textual files, but I blame it on a fundamental lack of imagination. You can have both! Rich source code can be represented as text -- it's just not convenient to edit it like text. Instead, you add intelligence and convenience to your tools. You don't edit your PNG files with Vi -- you use a tool like GIMP or Photoshop.
Insert picture of programmer popping the key tops off a keyboard and clipping them back on in different places.
Language conversion. Say you find some open source Perl code that does exactly what you want, but you are a Java shop. So, just run the XML version of the code through an XSLT and voila!
Lovely theory, but I'd like to see you pull that off in practice. What if I start using some very idiomatic language paradigms in Perl, which all make good sense there, but result in, at best, a tangled, barely intelligible mess of Java, and at worst something unconvertible? What this does, in effect, is reduce every language down to a poor quality "lowest common denominator". How do you easily convert a functional language into a procedural one? How do you convert your OO Java code into C? Sure, it can be done, but if it's done in an automated way I'm not sure I would want to be the one responsible for editing and maintaining the results.
Jedidiah.
Craft Beer Programming T-shirts
I'm not a fan of using the wrong tool for the job. At work we normally use PHP for web applications, but when I see an advantage, I will stray from the "norm" and use Perl.
XML can be a very good candidate for coding logic. We are starting to do this with several libraries we have developed for manipulating data. It is much easier to get a text document at a major company published than it is to get a DLL published. The DLL is the main engine, controlled by XML documents. We can then create a "custom version" of the library by supplying different XML documents that contain layout and logic. We can write the engine once, then customize it via XML.
...I just want to say:
:)
Congratulations!
You are now on step 1 on a long and tedious journey to building a poorly-designed lisp dialect!
Other posters have already made this case well enough that there's not much point in me elaborating!
I don't see why you gave up the benefits of C++ for such a small improvement. One day you might want to display video on the sides of your cubes. With C++, you just pass a VideoCube to renderer.spin(Cube&cube) and it will call the appropriate virtual functions to get bitmaps of each of the sides. With C code, you are likely accessing internals of struct Cube directly and cannot change its implementation without rewriting a lot of code.
Besides, if you really need efficiency, you can write low-level routines in C and still compile them using a C++ compiler. Make Renderer a friend of Cube if you really want to hardcode its internals. Of course, some C++ features like non-virtual method calls have no extra overhead, and some - like inline functions and references instead of pointers - can potentially generate faster code.
OOP can be overdone, but a small degree is useful in any program longer than 2 pages. By contrast, I don't see how coding directly in XML would ever be helpful. If that's an internal representation used by my editor or compiler - well, whatever works for them.
My understanding is that there was a big push for XML because of a perceived need for open document formats.
The advantage of XML is that you can use an off the shelf parser for every language instead of writing a new parser for each language. Let someone else handle the parsing and you handle only what you have to.
As opposed to computer languages now, where most modern languages (LISP-family excepted) have context-dependent grammars that are incredibly hard to parse correctly and each language has to have a parser written specially for it.
Binary formats are "closed" only in so far as we do not have access to the source of the program that created them.
Yeah, I'm sure that if you got the code to Microsoft Word, you could figure out the format just like that.
Even if you can, then you add another large ball of code to your project, for reading Word files. In the end, you've got a dozen different libraries attached, each one for reading a different format of file.
JPEG is a binary file format, yet we have open standards and the committee who designed it released open source reference implementations of the decoder and encoder.
Look at how many file formats the average graphical viewer has to support. Each one has its own library, its own bugs, its own security holes...
JPEG is an open format and nobody goes around trying to stuff pixels in XML files.
No; they stuff pixels in PNG files and TIFF files and PNM files and GIF files and a dozen other formats that need to be parsed by completely different parsers.
This leads to a logical paradox: if programmers continue to write code in "plain" ascii format, how is it going to acquire the XML markup? Why, someone would have to write a parser, of course!
This XML encapsulation is a misguided effort to create a standard interface to code parsing. Guess what! A highly effective parser already exists for every single programming language. It's the compiler!
To encapsulate source code in an XML form that redundantly specifies how it is to be parsed is asking for trouble:
- What if the XML markup ever gets out of sync with the plain text source code? Either:
- Software for editing/validating/formatting the code WILL catch the problem. If so, that software must include a parser for the ASCII source code, thus rendering the XML useless, OR...
- XML-ified software WILL NOT catch the problem. The code might get highlighted incorrectly in the editor or incorrectly validated by the code checker (yikes!)
- What if compiler writers get lazy and start relying on the "pre-parsed" source code. Now the programmer might be in the lovely situation of editing plain text code that's marked up INCORRECTLY with *hidden* XML code that's affecting the compilation of the program.
My point is, don't include redundant information in the fundamental form of your source code, because it will get out of sync somehow. Remember all those wParam variables in the 16-bit Windows API? The "w" was supposed to "document" that it's a 16-bit variable. Now it's declared as a 32-bit variable and everyone calls it wParam... if they still code in C. At worst, it's misleading, at best useless and annoying.
If people want a standard way to decipher language syntax, then compiler writers should write hooks in their compilers to export the parse tree in a standard format. Heck, it could even be an XML format, but this should be treated like an object file (it's derived from the source code rather than the source code itself).
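For what it's worth, some languages already expose that kind of hook. Here is a rough Python sketch that uses the standard `ast` module to parse source text and dump the resulting tree as XML. The element and attribute layout is something I made up for the example, not a standard format, and the XML is derived output, never the thing you edit.

```python
import ast
import xml.etree.ElementTree as ET


def ast_to_xml(node):
    """Mirror a Python AST node as an XML element.

    The element name is the node class (Module, Assign, BinOp, ...).
    Child nodes go under an element named after the field; plain
    values (identifiers, constants) become attributes holding their
    repr(). Non-node items inside list fields are skipped to keep
    the sketch short."""
    elem = ET.Element(type(node).__name__)
    for name, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            ET.SubElement(elem, name).append(ast_to_xml(value))
        elif isinstance(value, list):
            holder = ET.SubElement(elem, name)
            for item in value:
                if isinstance(item, ast.AST):
                    holder.append(ast_to_xml(item))
        elif value is not None:
            elem.set(name, repr(value))
    return elem


tree = ast.parse("x = 1 + 2")
root = ast_to_xml(tree)
print(ET.tostring(root, encoding="unicode"))
```

The plain text stays the source of truth, so there is nothing to get out of sync: regenerate the XML whenever a tool wants it.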
My bicyles
I'm not certain I see any real benefits listed; 2, 3 and 4 already exist without using XML. This is because of something very simple... languages have formally defined syntax. That's so the compiler can do its job without becoming a mind reader to work out what the programmer really meant. I'm not even sure how using XML as a storage mechanism (whilst still editing something that looks like normal source code) could force well defined variable names. That comes down to the discipline of the programmer in question; I could still use i, j, k and l if I wanted to.
If 4 was ever implemented, you would be welcoming in maybe the slowest programming language on the planet. Besides, drag and drop already exists (VB anyone?), not to mention that most IDEs have templates which provide what you want more quickly. I wouldn't even want to think of how big the drag-and-drop palette would need to be for all the possible language constructs out there.
If I were to pick one example of where this already exists I'd pick Eclipse, though Visual Studio has similar features. It creates in-memory ASTs of the source being edited so that most refactoring operations are a breeze (2). This also means it validates the code as it's being typed (3). Templates allow dropping in most common language constructs and it can automagically fill in what it thinks are the most appropriate variables for method calls (4).
So far, besides the first point the rest already exists..
after invoking the correct DTD (and finding one of the three people on the planet who can write a complex DTD without fcsking it up b.a.r. or making it pig slow to parse)
then we can write things like
<loop_check_top><initialize arguments><assign><argument><variable> i </variable></argument><value><integer>0</integer></value></initialize /arguments>
<evaluate><argument><variable>i</variable><comparison>lessthan</comparison>
<integer>3</integer></evaluate>
<modify><variable>i</variable><operator> <integer>increment</integer></operator></loop_check_top>
<execute>
Which is terrifically easier to read than
for(i=0;i<3;i++){
and because execute can be rendered in different ways so that the { goes on the next line. That's some powerful stuff.
And because with our really smart DTD's we can emulate this exotic loop structure in languages that only support checking loops at the end by creating some automatic variables!
Give me a break.
If you have XML you can suck it into a DOM parser and then do node walking. Then you can write the data from the nodes into structures in whatever language you have. And for this reason it makes a great way to feed data from one program to another.
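A small Python sketch of that workflow, using the standard library's `xml.dom.minidom`. The `<points>` document, the `Point` structure and the helper names are all invented for the example.

```python
from dataclasses import dataclass
from xml.dom import minidom


@dataclass
class Point:
    name: str
    x: float
    y: float


# Hypothetical input document, made up for this example.
DOC = """<points>
  <point name="a"><x>1.0</x><y>2.0</y></point>
  <point name="b"><x>3.5</x><y>-4.0</y></point>
</points>"""


def text_of(parent, tag):
    """Return the text inside the first <tag> element under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data


def load_points(xml_text):
    """Parse the XML into a DOM, walk the <point> nodes, and copy
    their data into plain Point structures."""
    dom = minidom.parseString(xml_text)
    return [
        Point(el.getAttribute("name"),
              float(text_of(el, "x")),
              float(text_of(el, "y")))
        for el in dom.getElementsByTagName("point")
    ]


points = load_points(DOC)
print(points)
```

Once the data is in `Point` objects the DOM can be thrown away, which is the sensible split: XML at the boundary, native structures while the program runs.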
It is a very inefficient way to have the data for a program while the program is running.
I agree that XML can be whatever you want it to be, and I agree that it is very over-hyped and the OOPSLA mongers, who make their money trying to confuse people into buying into their solutions, are behind XML in a large way.
XML is still good for many things.
But it is very bad for high-performance programming like robotics or video games, or graphics or music. It is a good thing to use to store data, or at startup in a real-time process.
For web pages, having the tags around all the data makes XML-formatted pages very easy to spider. And for that reason alone it is very useful in web pages. But that XML will look just like HTML.
So don't disregard XML all the way. But please do continue your healthy skepticism about it.
The object-tool mongers caused a lot of problems and a lot of grief for many engineering products by selling tools that were designed by amateurs and supposed to work in real-world real-time situations where they just couldn't hack it.
Were there ever any refunds made for any of these so-called tools? These professors got rich selling their seminars and a lot of very good companies got duped.
No; they stuff pixels in PNG files and TIFF files and PNM files and GIF files and a dozen other formats that need to be parsed by completely different parsers.
You say that like it's a bad thing.
png is good for icons and webpage graphics (unless your target is IE). It compresses well, is lossless, and has good transparency.
tiff is a good choice for very large, very high color images, such as those produced for poster prints.
pnm is an excellent format for doing batch transformations (such as sticking images together, rotations, etc.). Just see the pnm* tools with any Linux distro.
gif is paletted, and compresses better than png for low color images. If you only have 20 or so colors and only a need for a mask (or perhaps would like some animation) gif is your bet. See 'screensavers' on mobile phones.
In your world, webpages would take much longer to load, poster prints seem faded and image transformation tools would take forever. Oh, and I wouldn't be able to fit a 'screensaver' on my mobile phone. All because you can't be bothered to use one of the many, existing, image transformation tools.
Seriously. Isn't software bloated enough? Why obfuscate things further. Dumbest... idea... ever.
Couldn't agree more. Leave it to some MS XML pin head to think you need a new language to be extensible.
The fool should study polymorphism, and an object oriented language like Java or C++. But I suspect that is all too much for the child's brain.
Going to be flame bait for this:
ALL XML IS, IS FREE STYLE HTML/SGML and you're smokin' crack if you do not realize it.
And when I type code I hate typing
all the freaking time to make my job productive. Or at least I don't think it makes me virile.
Maybe too much schooling has made me a stodgy young academic, but didn't LISP provide us with extensibility and everything else XML could possibly offer, in a much cleaner and more elegant syntax?
Whenever I see people trying to pull something like this I tend to remind them of one thing (everybody seems to forget this whenever XML is brought up, I dunno why...):
It's great that you can read the syntax of a language (which is basically what this idea boils down to -- people just have to implement an XML parser instead of a $LANG-parser) without effort, but if you don't understand the semantics of what you're reading it's rather pointless, unless all you want to do is trivial tree-based transforms. This applies to XML in general and it applies here. As you pointed out, the semantics of languages are different, and so your tools have to understand all the different languages anyway (or, as you say, reduce them all to some common denominator).
HAND.
How on earth is an argument like that "insightful"?
"Getting milk from the store is more convenient than having your own cow" is the biggest lie ever!
Lots of farmers get their milk from cows without a problem.
But for the topic:
When I write a binary format that is basically just a filedump of some C struct, how compatible is that with C#?
Sure - I CAN read it - but it takes effort.
If I wrote the same file in xml, it would be pretty effortless to read it in php, c#, java, VB, you name it.
And with the binary file I get the added bonus that extending the file WHILE maintaining backwards compatibility is a bitch.
Can this be overcome by a smart developer - sure, but it would take effort.
Extending the xml-file would be pretty simple.
There are other bad things about xml - why not focus on these, instead of pulling things out your behind?
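On the extension point, here is a minimal Python sketch of what "pretty simple" looks like in practice. The `<user>` documents and the reader function are invented for the example; the idea is just that a reader written against the old layout looks fields up by name, so it doesn't care when a newer writer adds an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents: V2 adds an <email> element that the
# original format (V1) did not have.
V1 = "<user><name>ann</name><age>30</age></user>"
V2 = ("<user><name>ann</name><age>30</age>"
      "<email>ann@example.com</email></user>")


def read_user_v1(xml_text):
    """Reader written for the V1 layout. It fetches fields by tag
    name, so elements it does not know about are simply ignored."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "age": int(root.findtext("age")),
    }


print(read_user_v1(V1))
print(read_user_v1(V2))
```

A raw struct dump read by fixed offsets gives you none of that for free; you would need a version field or length prefixes, which is the "effort" mentioned above.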