Comparing the Size, Speed, and Dependability of Programming Languages
In this blog post, the author plots the results of 19 different benchmark tests across 72 programming languages to create a quantitative comparison between them. The resulting visualizations give insight into how the languages perform across a variety of tasks, and also how some languages perform in relation to others.
"If you drew the benchmark results on an XY chart you could name the four corners. The fast but verbose languages would cluster at the top left. Let's call them system languages. The elegantly concise but sluggish languages would cluster at the bottom right. Let's call them script languages. On the top right you would find the obsolete languages. That is, languages which have since been outclassed by newer languages, unless they offer some quirky attraction that is not captured by the data here. And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use."
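The four corners described in the quote can be sketched as a small classifier. This is only an illustration: the function name, the two inputs (time and code size relative to the best entry), and the cutoff values are assumptions made here, not anything taken from the blog post.

```python
def name_corner(relative_time, relative_size, time_cutoff=3.0, size_cutoff=3.0):
    """Name the chart corner for one language.

    relative_time and relative_size are ratios against the best entry
    (1.0 means as fast or as short as the best). The cutoffs are
    arbitrary, hypothetical values chosen only for this sketch.
    """
    slow = relative_time > time_cutoff
    verbose = relative_size > size_cutoff
    if not slow and verbose:
        return "system"    # fast but verbose: top left
    if slow and not verbose:
        return "script"    # concise but sluggish: bottom right
    if slow and verbose:
        return "obsolete"  # slow and verbose: top right
    return "ideal"         # fast and short: bottom left


print(name_corner(1.2, 5.0))
print(name_corner(20.0, 1.1))
```

With these made-up cutoffs, a language that is 1.2 times slower than the best but needs 5 times the code lands in the "system" corner.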
I am surprised how they manage to get Scala to perform so much worse than pure Java.
Scala compiles to pure Java .class files, uses static typing, and makes the claim that the bytecodes are almost identical.
I wonder if the benchmarks are executed in the same environment.
http://shootout.alioth.debian.org/ has a Gentoo label behind the java benchmarks, but not the Scala one.
Where are Cobol and RPG, the languages that run business?
I think verbosity in moderation is necessary. I have read many an article with developers arguing that they don't need to document their code when their code is self-documenting. Do you make all of your variables and class/function/methods a single character to avoid verbosity? I hope not. And I would think that reading and maintaining that code would be far less than a joy.
Long meaningful identifiers are useful. Needing 5 lines of setup for each API call is annoying, particularly if those 5 lines are usually the same. Requiring lots of redundant long keywords to "look more like English" is annoying. Large standard libraries that let you remove most of the tedious parts from your code are useful.
Every time I see one of these things, OCaml always rocks it. I wonder why it never caught on to a greater degree?
This kind of fits in with my thinking.
When I was starting out in programming, I just wanted results. I wasn't concerned about performance because the computer was a million times faster than me. I was most concerned about how many "non-vital" keywords were necessary to describe what I wanted the machine to do (e.g. "void main(...)" isn't *vital* because it's just boilerplate. However "if", "for", "while" etc. would be vital - and even for/while are just cousins), and how many of the vital keywords (i.e. those that specifically interfered with the way my program would *actually* operate... a "static" here or there would hardly matter in the course of most programs) were "obvious". Java failed miserably at this... I mean, come on: System.out.println() and the standard wrapping take up too much room.
So, BASIC was an *ideal* first language (sorry, but it was, and the reason nobody uses it much now is because EVERYONE has used it and moved on to something else - doesn't mean it "breaks" people). In this regard, even things like C aren't too bad - 30-50 keywords / operators depending on the flavour, all quite simple - you could memorise them perfectly in an afternoon. However things like Forth and Perl can be hideous.
And even C++ is tending towards the stupid. Believe it or not, even things like bash scripting come out quite well under that test. And, to me, that correlates with the amount of effort I have to put in to write in a particular language. If I just want to automate something, bash scripting is fast and easy. Most of the stuff I write is a "one-job program" that will never be reused. If I want to write a program to work something out or show somebody how something is done programmatically, BASIC is a *perfect* prototyping language (no standard boilerplate, no guessing obscure keywords, etc.). If I want to write a program that does things fast, or accurately, or precisely, or for something else to build upon, C is perfect.
I see no real need to learn other languages in depth past what I'm required to know for my work. I have *zero* interest in spending weeks and weeks and weeks learning YAPL (Yet Another Programming Language) just to spend 90% of that time memorising obscure keywords, boilerplate and the language's shortcuts to things like vectors, string parsing, etc. If I was going to do that, I'd just learn a C library or similar.
I think that these graphs correlate quite well with that thinking. Let's be honest, 99% of programming is reusing other code or shortcuts - short of programming in a Turing machine, C is one of the simplest languages to learn because it *doesn't* have a million shortcuts... you want to iterate over an array or create a hash / linked list, etc. you have to do it yourself from basic elements. In modern programming, that means a one line include of a well-written library. As far as I was concerned when learning it, even the "pointer++ increases by the size of the pointed-to type" was far too smarty-pants for me, but incredibly useful.
But with C++, I instantly lost interest because it's just too damn verbose to do a simple job. Java OOP is slightly better but still nasty once things get complicated and the underlying "functional" language is basically a C-a-like.
I'm a fuddy-duddy. Old fashioned. If I write a program, the damn computer will damn well do instruction 1 followed by instruction 2 with the minimum of flying off into libraries and class systems. If I want 4 bytes of memory to change type, then I will damn well have them change type. And I'll even get to specify *what* 4 bytes of RAM if I want and I'll clean up after them if it's necessary. That's how I think, so things like C match perfectly when I want to code. The fact that C is damn powerful, fast, low-level and so common also adds to its appeal.
I worry about what will happen when people *only* code in OOP languages. The abstraction is so large that people forget that they are still telling a computer to handle bits and bytes and suddenly they get lazy.
It didn't seem to me like being concise or verbose was a help or hindrance aside from his comment. Per those graphs I could say I want something as fast as Java (how often have you heard that on /.), but a little less verbose... "oh, csharp might be worth a look".
I found it interesting.
Regina(Rexx) is a pretty fast and clear language. My only issue with it is how other functionality has been added such as SQL and network connections as their implementation(or maybe their documentation since there seems to be very little) doesn't seem quite as clear as using, say, PHP or Ruby. If they'd get more than a couple of developers working on the project, it could be easily transformed into something far more useful.
That said, for 2 years I ran a website entirely off of Regina(or maybe it was ooRexx..basically the same thing) and despite the limitations of the machine it was running on, it performed faster than any other scripted site I ever visited. These days with the heavy AJAX and java implementations, it'd run circles around them.
First off, he presents the big chart twice. The second version is meant to compare functional languages with imperative languages, but it's also small enough to fit on my screen, so if you're browsing the article, you might want to look at that one first.
His "obsolete" sector is really more like a special-purpose sector. For instance, Erlang shows up in the obsolete sector, but that's because Erlang wasn't designed to be especially terse or fast. Erlang was designed to be fault-tolerant and automatically parallelizable. Io also ends up looking lousy, but Io also wasn't designed to be terse or fast; it was designed to be small and simple.
The biggest surprise for me was the high performance of some of the implementations of functional programming languages, even in cases where the particular languages aren't generally known for being implementable in a very efficient way. Two of the best-performing languages are stalin (an implementation of scheme/lisp) and mlton (an implementation of ml). However, as the author notes, it's common to find that if you aren't sufficiently wizardly with fp techniques, you may write fp code that performs much, much worse than the optimal; that was my own experience with ocaml, for instance.
The choice of a linear scale for performance can be a little misleading. For instance, csharp comes out looking like it's not such a great performer, and yet its performance is never worse than the best-performing language by more than a factor of 2 on any task. Typically, if two languages differ by only a factor of 2 in speed, then speed isn't an important factor for choosing between them. The real thing to look out for is that some of the languages seem to have performance that's a gazillion times worse than normal on certain specific tasks.
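To make the "factor of 2" point concrete, here is a minimal sketch that computes each language's worst slowdown relative to the best time on each task, plus the same ratio on a log2 scale, where a factor of 2 is exactly one unit. The timings and language names below are invented placeholders, not shootout data.

```python
import math

# Hypothetical timings in seconds: task -> {language: time}.
# These numbers are made up purely for illustration.
timings = {
    "task_a": {"lang_x": 1.0, "lang_y": 1.8, "lang_z": 40.0},
    "task_b": {"lang_x": 2.0, "lang_y": 3.0, "lang_z": 2.5},
}


def slowdowns(timings, language):
    """Ratio of this language's time to the best time, one per task."""
    return [times[language] / min(times.values()) for times in timings.values()]


def worst_slowdown(timings, language):
    """Largest slowdown over all tasks for this language."""
    return max(slowdowns(timings, language))


for lang in ("lang_x", "lang_y", "lang_z"):
    worst = worst_slowdown(timings, lang)
    print(lang, round(worst, 2), round(math.log2(worst), 2))
```

In this invented data set, lang_y never falls behind by more than a factor of 2, while lang_z is fine on one task and far behind on the other. That second pattern is the one the paragraph above says to look out for.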
Many of the languages are hard to find, because they're listed by the names of their implementations. In particular, g95 is an implementation of fortran.
If the code size attribute is measured in number of lines, I suspect that forth, which is practically an assembly language, will rank very low (near the top of the graph, if not at the very top), though it ought to be very fast (near the left). It depends so much on stack operations that I suspect its left to right ranking would depend a great deal on the processor it's running on.
I love forth. I learned it many years ago. But I've never been in a position to use it for anything, which is a shame.
These sorts of things never fail to amaze me.
The verbs, nouns, semantics and such used in a given programming language have nothing, I repeat... NOTHING to do with performance!
What does have to do with performance is the talent of the compiler / interpreter author, nothing more, nothing less.
C implements ++ and so forth and so on. Pascal does not, you have to express it as var := var + x or in some implementations as inc(var) or inc(var,100). The smart compiler / interpreter author would implement those in the fastest possible way regardless of the particular language.
The one metric that has real meaning is programmer enjoyment. Do you prefer terseness over verbosity, or something in between? Does this language's flow make you truly appreciate working with it?
The only other real metric that has any true meaning is again the talent of the compiler / interpreter author. Was the language parser built so that it can unfold complex statements that are often required to express certain ideas and perform certain operations? Does the language implement your favorite expression, e.g. ++, or something like that, which again harkens back to programmer enjoyment?
So what it really leaves us with is, "Do you enjoy using that language?" and only you, the programmer, can answer that question.
Hey KID! Yeah you, get the fuck off my lawn!
Ha ha. Good joke. You've forgotten the most important feature "needed in a great programming lang": higher-order and first-class functions with proper closures. Oh wait, C doesn't have that.
Any truly great statically typed language will also have at least algebraic data types, parametric polymorphism (even C++ only has ad-hoc polymorphism), type constructors and functions, maybe even a Turing complete type system (heh). C doesn't have any of those.
Even aside from types, great languages should include tail-call optimization, pattern matching and hygienic macros (CPP macros are a bad joke).
Now don't get me wrong. C is a great portable assembly language. It's close to the metal, widely known and easy to read. But as far as programming languages go, C is feature-poor.
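For anyone unsure what "higher-order and first-class functions with proper closures" means in practice, here is a minimal sketch in Python. The helper names (make_counter, compose) are made up for this example. C has function pointers, but a plain C function cannot capture a local variable the way `increment` captures `count` below.

```python
def make_counter(start=0):
    """Return a function that remembers and updates its own count."""
    count = start

    def increment(step=1):
        nonlocal count  # the inner function closes over `count`
        count += step
        return count

    return increment


def compose(f, g):
    """Higher-order function: takes two functions, returns a new one."""
    return lambda x: f(g(x))


counter = make_counter(10)
print(counter())   # 11
print(counter(5))  # 16

# Functions are ordinary values: built, passed around, and returned.
double_after_increment = compose(lambda x: x * 2, lambda x: x + 1)
print(double_after_increment(3))  # 8
```

Each call to make_counter produces an independent counter with its own captured state.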
Show me a language impossible to write ugly code in, and I'll show you a language which is unnecessarily restrictive.
If you can't recognize the beauty in near infinite flexibility and the associated amount of power provided, you're not qualified to participate in such a discussion. Come back when you've gotten to know some actual talented programmers. One way to identify those programmers is that they don't blame their tools for their own incompetence.
I wrote a little "literate" FORTH tutorial if any readers of the above comment are interested in it: jonesforth.
libguestfs - tools for accessing and modifying virtual machine disk images
"Oh but Java is a plodding, stumbling, lumbering, slug of slowness."
Java does well in these kinds of synthetic tests because it doesn't have to invoke the garbage collector. All the *real life* java programs I use are significantly slower than roughly equivalent C++ programs. E.g. compare NetBeans to Visual Studio, or Azureus to uTorrent. I tried Eclipse once but it was unusably slow.
Find me a speedy desktop Java program and I'll change my mind about it.
Snarky et al are ancient words used up to the 60s; their resurgence can only make me hope that we are potentially seeing the return to precise use of language. This would be a fantastic event; a reversal of the trend to the dilution of semantics and language in general.
Of course my use of 'et al' is symptomatic of this, as is the common use of acronyms for everything... Ah weel, it was a nice idea while it lasted
Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
You should not draw too many conclusions about the results of those two without taking into account the fact that both are whole-program optimizing compilers. Those two systems just do not support separate compilation of individual source files; if you change one line of one file in your program, every file must be recompiled.
This type of compiler has a performance advantage over the more common separate compilation systems, simply because it can inline anything anywhere, and thus optimize far more aggressively. But it's next to useless for developing large software systems, and thus mostly really useful only for writing smallish programs, in very high-level style, that perform some really expensive computations really fast.
Are you adequate?
Funny, isn't this what Twitter thought too before dumping RUBY entirely? Wasn't this what Twitter thought as they threw more and more hardware at the problem and still could not solve the problem? Didn't Twitter end up spending more on IT to administer 2-3 times the numbers of servers that it would take to do the same thing in Python, PHP or Java?
Yeah, throw hardware at it. That's a viable solution for a company. As long as you aren't thinking about who has to maintain all those servers and the fact that RUBY STILL DOESN'T SCALE.
This is my sig. There are many like it but this one is mine.
If this were the case then Perl 6 would have stuck with the Pugs implementation.
That's silly. Pugs was not designed as a production implementation of Perl 6 - it was a proof of concept for the new syntax. The fact that Pugs was so surprisingly easy to implement, and so surprisingly performant before any effort at all was put into its performance, was a stunning demonstration of the power of functional programming.
GHC would have stuck with Darcs and not gone to GIT.
GHC did stick with Darcs. There was a time when a move to Git was considered, but it had nothing to do with Darcs being written in a functional language. At the time, the Darcs team was not big enough or well enough organized to keep up with the heavy support demands of being the RCS for a large and growing compiler project. That has changed. Darcs has become one of the best run (and most fun) open source projects, with a large team of active and talented developers. Darcs is now an excellent choice of RCS even for very large projects.
That is typical of the process that has been happening with functional programming during the past few years - a quick transition from the theoretical to the best-of-breed in practice.
So if you want to continue this conversation start reading the various performance related discussions. There are 2 decades of papers on trying to resolve specific examples of this problem.
If you're interested in history, you are the one who ought to have a good look at that literature. Learn to tell the difference between knotty problems that were identified twenty years ago, and the astonishing continuous progress that has been made since then.
But if you want to continue this conversation, you should look at what is happening in the present. One thing that has happened is that you no longer need to read academic papers to learn and use functional languages in practice. There are books, online tutorials and resources, many developer-friendly tools, and a huge and super-friendly community.
The shootout is a nice demonstration that speed optimization is another aspect of the increasing strength of modern functional compilers and functional programming techniques. No one will claim that a functional language can beat C at speed right now, but it says a lot that such high level languages can now compete well in the same league as C. As hardware architectures continue to move farther and farther away from the classical imperative model, watch for this trend to continue.
I love functional languages but putting your head in the sand regarding where the problems are and considering the evaluations FUD is not going to advance the cause.
I called the great-grandparent post FUD because it is FUD. These are common misconceptions, caused by lingering impressions of where functional programming was decades ago. Open your eyes, look at the facts, see what is happening now. Don't let the FUD lull you into apathy.
FYI, the new Padre Perl IDE is itself written in Perl.
http://padre.perlide.org/wiki/Screenshots
If you like Forth, you should check out Factor, which is basically a modernized version of Forth (dynamically typed, no *very* low level filesystem junk that Forth has). I've recently started playing with it.
Nice post. It is true that APL sends you to places you never go with any other language. And it is also true that this isn't necessarily a good thing.
My first language was APL on an IBM System/360 in about 1973. I recall one of the lab assistants had a workspace of text functions he'd created that some of us were looking at. One in particular was designed to take a text string and reduce occurrences of multiple spaces down to single spaces. The program was a one-liner, of about 120 characters. Several of us looking at it could see that it could be simplified from this 120 character monstrosity, and of course that it *should* be, so we set ourselves to the task. We divided up into a few groups, and a buddy and I worked on it for awhile and got it down to 14 characters. We concluded that was as good as it could get and were firmly convinced we would win the informal contest we were having. But then one of the other students showed us his result which was almost identical to ours but had it at 13 characters because he had noticed a logical not that could be combined with an operator in order to eliminate it. We were crushed because we worked so hard on it and were sure we had it aced...
But that was pretty typical with APL, you could spend a huge amount of time juggling array elements to be just so, in order to evaluate all the answers in parallel, when in any other language you would have just written a for loop with a few statements and been done with it. Not as challenging as APL, but as I said, that could very well be a Good Thing(TM)...
Plus, the varying quality of APL programmers meant that you may be looking at a 120 character monstrosity of obscure and unnecessary logic that could have been done more concisely in 13 characters...
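To illustrate the two styles contrasted above, here is a rough Python sketch of the space-squeezing task. The first version mimics the array approach by building a boolean keep-mask over the whole string at once; the second is the plain loop. Neither is the APL one-liner from the story, which was not quoted; the function names are invented for this example.

```python
def squeeze_spaces_mask(text):
    """Array style: compute a keep-mask for every position, then filter."""
    is_space = [ch == " " for ch in text]
    prev_is_space = [False] + is_space[:-1]  # the mask shifted right by one
    # Drop a character only if it is a space that follows another space.
    keep = [not (s and p) for s, p in zip(is_space, prev_is_space)]
    return "".join(ch for ch, k in zip(text, keep) if k)


def squeeze_spaces_loop(text):
    """Loop style: walk the string, skipping a space that follows a space."""
    out = []
    for ch in text:
        if ch == " " and out and out[-1] == " ":
            continue
        out.append(ch)
    return "".join(out)


sample = "too   many    spaces  here"
print(squeeze_spaces_mask(sample))
print(squeeze_spaces_loop(sample))
```

Both print "too many spaces here". The mask version is closer in spirit to evaluating all the answers in parallel, while the loop is what most other languages would reach for first.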