Comparing the Size, Speed, and Dependability of Programming Languages
In this blog post, the author plots the results of 19 different benchmark tests across 72 programming languages to create a quantitative comparison between them. The resulting visualizations give insight into how the languages perform across a variety of tasks, and also how some languages perform in relation to others.
"If you drew the benchmark results on an XY chart you could name the four corners. The fast but verbose languages would cluster at the top left. Let's call them system languages. The elegantly concise but sluggish languages would cluster at the bottom right. Let's call them script languages. On the top right you would find the obsolete languages. That is, languages which have since been outclassed by newer languages, unless they offer some quirky attraction that is not captured by the data here. And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use."
APL was faster than C and there has never been a more terse language.
Some drink at the fountain of knowledge. Others just gargle.
And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use.
I must ask why the author assumes that verbosity is bad and why lack thereof makes it a "joy to use."
I think verbosity in moderation is necessary. I have read many an article with developers arguing that they don't need to document their code when their code is self-documenting. Do you make all of your variables and class/function/methods a single character for the sake of verbosity? I hope not. And I would think that reading and maintaining that code would be far less than a joy.
I don't even need to argue this, according to his graphs we should all be using Regina, Mlton or Stalin (a scheme implementation). But instead languages like Java and Perl and C++ prevail. And I would guess that support and a mediocre range of verbosity are what causes that.
Great work in these graphs! But in my opinion, verbosity when used in moderation--like a lot of things--is far better than either extreme.
My work here is dung.
This site is awesome. It's very simple. They have code in over 1200 different languages that prints the lyrics to the "99 bottles of beer on the wall" song. Check out the perl example (yes, it really does work): http://99-bottles-of-beer.net/language-perl-737.html
"I have never let my schooling interfere with my education." --Mark Twain
I am surprised how they manage to get Scala to perform so much worse than pure Java.
Scala compiles to pure Java .class files, uses static typing, and makes the claim that the bytecodes are almost identical.
I wonder if the benchmarks are executed in the same environment.
http://shootout.alioth.debian.org/ has a Gentoo label behind the java benchmarks, but not the Scala one.
Oh but Java is a plodding, stumbling, lumbering, slug of slowness. All thoroughly indoctrinated Slashdotters know that already. No need to RTFA...
OMG!!! Ponies!!!
And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use."
Plain ASCII. The shorter and faster it is, the more joy it is to use.
What can compare to the joy and speed of, for example, the command "Go fuck yourself!"
Even shorter and faster syntax: the command "F.U.!"
And for conciseness of comments - "SHIT!" and "oops!" and "WTF???"
Looping constructs: "Sit on it and rotate!"
If-else constructs: "Dat so? F.U. 2!"
foreach: "You, your mamma, and the horse you rode into town on!"
Exit statements: "Just fuck off!"
c-style assertions: "Eat shit and DIE!"
#defines: "#define YOU One dumb motherfucka"
conditional #includes "#ifdef YO_MAMMA"
real-time preemption: "I OWN you, beotch!"
Programming languages don't have attributes like size and speed: implementations of these languages do. Take Common Lisp for example: SBCL is blazing fast, while CLISP is rather pudgy (albeit smaller). Any conforming Common Lisp program will run on both. Or consider Python --- IronPython and CPython have different performance characteristics. (I'm too lazy to link these now.)
Point being, describing a programming language as "fast" makes about as much sense as describing a natural, human language as "smart".
http://en.wikipedia.org/wiki/Forth_(programming_language)
--
Forth is a simple yet extensible language; its modularity and extensibility permit the writing of high-level programs such as CAD systems. However, extensibility also helps poor programmers to write incomprehensible code, which has given Forth a reputation as a "write-only language". Forth has been used successfully in large, complex projects, while applications developed by competent, disciplined professionals have proven to be easily maintained on evolving hardware platforms over decades of use
--
Forth is still used today in many embedded systems (small computerized devices) because of its portability, efficient memory use, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as machine language have been produced
--
Verbosity = ( 1 / Expressiveness )
Where are Cobol and RPG, the languages that run business?
I think verbosity in moderation is necessary. I have read many an article with developers arguing that they don't need to document their code when their code is self-documenting. Do you make all of your variables and class/function/methods a single character for the sake of verbosity? I hope not. And I would think that reading and maintaining that code would be far less than a joy.
Long meaningful identifiers are useful. Needing 5 lines of setup for each API call is annoying, particularly if those 5 lines are usually the same. Requiring lots of redundant long keywords to "look more like English" is annoying. Large standard libraries that let you remove most of the tedious parts from your code are useful.
Every time I see one of these things, OCaml always rocks it. I wonder why it never caught on to a greater degree?
CAN HAS LOLCODE?
KTHXBYE
It would be better if it was extended to support classes.
ML, Haskell, Scheme, Fortran and Common Lisp are on the list. You probably didn't notice them because they are listed by their implementation name (e.g. mlton/O'Caml, GHC, Stalin/Gambit/Ikarus/Chicken, G95, SBCL/CMUCL, etc.). The Shootout front page lists which implementations implement which languages.
This kind of fits in with my thinking.
When I was starting out in programming, I just wanted results. I wasn't concerned about performance because the computer was a million times faster than me. I was most concerned about how many "non-vital" keywords were necessary to describe what I wanted the machine to do (e.g. "void main(...)" isn't *vital* because it's just boilerplate. However "if", "for", "while" etc. would be vital - and even for/while are just cousins), and how many of the vital keywords (i.e. those that specifically interfered with the way my program would *actually* operate... a "static" here or there would hardly matter in the course of most programs) were "obvious". Java failed miserably at this... I mean, come on: System.out.println() and the standard wrapping take up too much room.
So, BASIC was an *ideal* first language (sorry, but it was, and the reason nobody uses it much now is because EVERYONE has used it and moved on to something else - doesn't mean it "breaks" people). In this regard, even things like C aren't too bad - 30-50 keywords / operators depending on the flavour, all quite simple - you could memorise them perfectly in an afternoon. However things like Forth and Perl can be hideous.
And even C++ is tending towards the stupid. Believe it or not, even things like bash scripting come out quite well under that test. And, to me, that correlates with the amount of effort I have to put in to write in a particular language. If I just want to automate something, bash scripting is fast and easy. Most of the stuff I write is a "one-job program" that will never be reused. If I want to write a program to work something out or show somebody how something is done programmatically, BASIC is a *perfect* prototyping language (no standard boilerplate, no guessing obscure keywords, etc.). If I want to write a program that does things fast, or accurately, or precisely, or for something else to build upon, C is perfect.
I see no real need to learn other languages in depth past what I'm required to know for my work. I have *zero* interest in spending weeks and weeks and weeks learning YAPL (Yet Another Programming Language) just to spend 90% of that time memorising obscure keywords, boilerplate and the language's shortcuts to things like vectors, string parsing, etc. If I was going to do that, I'd just learn a C library or similar.
I think that these graphs correlate quite well with that thinking. Let's be honest, 99% of programming is reusing other code or shortcuts - short of programming in a Turing machine, C is one of the simplest languages to learn because it *doesn't* have a million shortcuts... you want to iterate over an array or create a hash / linked list, etc. you have to do it yourself from basic elements. In modern programming, that means a one line include of a well-written library. As far as I was concerned when learning it, even the "pointer++ increases by the size of the pointed-to type" was far too smarty-pants for me, but incredibly useful.
But with C++, I instantly lost interest because it's just too damn verbose to do a simple job. Java OOP is slightly better but still nasty once things get complicated and the underlying "functional" language is basically a C-a-like.
I'm a fuddy-duddy. Old fashioned. If I write a program, the damn computer will damn well do instruction 1 followed by instruction 2 with the minimum of flying off into libraries and class systems. If I want 4 bytes of memory to change type, then I will damn well have them change type. And I'll even get to specify *what* 4 bytes of RAM if I want and I'll clean up after them if it's necessary. That's how I think, so things like C match perfectly when I want to code. The fact that C is damn powerful, fast, low-level and so common also adds to its appeal.
I worry about what will happen when people *only* code in OOP languages. The abstraction is so large that people forget that they are still telling a computer to handle bits and bytes and suddenly they get lazy. M
On the plus side, both versions of Python can claim many of the smallest programs in the collection. Ruby (8, 1) might also compete for titles, but unfortunately its performance is so bad its star falls off the performance chart.
Then why the fuck is the Ruby community hyping it so much, and drawing naive young developers into a trap?
Not flamebait.
Why can't they make a language, or extend a language like Ruby, such that one can program it as a scripting language, but then optionally add verbosity (i.e. declaring the data types and their sizes, private / static etc. & whatever the hell makes a program light weight and fast)? It's my hope that if I stick with Ruby, one day I won't be forced to learn Python because performance won't be "Ruby's big issue" in every discussion, but really, that is *just* a hope. I hope this isn't a mistake.
"You know you don't act like a scientist, you're more like a game show host." Dana Barret
Contexts can be deceiving.
Be careful not to use these charts to decide what language to learn or what language is better for a given solution.
Let's remember the web server ecosystems: cgi, c#, perl, java, python, php, ruby.
A given algorithm implemented in your language of choice can give you the upper hand and instant notoriety; but running the whole operation (labor/maintenance/testing) goes far beyond controlled environment testing.
Lately I've been thinking that the more powerful solution (language wise) is the one that you can build and tear down from scratch in less time/effort.
That gives you more confidence to try new/innovative solutions.
my 2 cents.
- these are not the droids you are looking for -
Computer algebra systems are high level programming languages. Writing good code does not need
documentation. The code itself shows what is done. Here is an example which takes two pictures
and produces a GIF movie interpolating them:
A=Import["image1.jpg"]; B=Import["image2.jpg"];
width=Length[A[[1,1]]]; height=Length[A[[1]]];
ImageInterpolate[t_]:=Image[(t A[[1]]+B[[1]] (1-t)),Byte,ColorSpace->RGB,ImageSize->{width,height}];
Export["mix.gif",Table[ImageInterpolate[k/50],{k,0,50}],"GIF"]
It takes over a minute to process. A simple C program doing the same is many times larger but also
takes many times less time to process. But it needs to be documented, because even simple things like
reading in a picture take this much code:
fgets(buffer,1025,in);
if(strncmp(buffer,"P6",2)){
fprintf(stderr,"Unsupported file format (need PPM raw)\n");
exit(1);
}
do fgets(buffer,1025,in); while(*buffer == '#'); // get picture dimension
x_size = atoi(strtok(buffer," "));
y_size = atoi(strtok(NULL," "));
fgets(buffer,1025,in); // get color map size
c_size = atoi(buffer);
if((image = (char *) malloc(3*x_size*y_size*sizeof(char)))==NULL){
fprintf(stderr,"Memory allocation error while loading picture\n");
exit(1);
}
i = 0;
ptr = image;
while(!feof(in) && i<3*x_size*y_size){ *ptr++ = fgetc(in); i++;}
fclose(in);
But with C it is worth the effort. For more advanced image manipulation tasks, for example,
Mathematica often can no longer be used, due to memory or just because it takes too long.
(MathLink does not help much here, since an object like a movie (a vector of images) cannot
be fed into a computer algebra system, which deals with a movie as a whole, without running into memory problems.
For computer vision, for example, one needs to deal with large chunks of the entire movie.)
While the simplicity of programming with high level programming languages is compelling, speed often matters.
There is another nice benefit of a simple language like C: the code will work in 20 years. Computer algebra
systems evolve very fast, and much of what is done today does not work tomorrow in a new version. Higher
level languages also evolve faster. And large chunks of internal CAS code are a "black box" invisible to the
user. Both worlds make sense: the primitive, transparent and fast low level language and the slower, but
extremely elegant high level language.
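For readers without Mathematica, the blend used in the example above (t times image A plus 1-t times image B, for 51 values of t) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: images are taken to be nested lists of rows, pixels and integer channel values, the names blend and frames are hypothetical, and no file reading or GIF export is attempted.

```python
from typing import List

Image = List[List[List[int]]]  # rows -> pixels -> channel values (assumed layout)


def blend(a: Image, b: Image, t: float) -> Image:
    """Per-channel linear interpolation t*a + (1-t)*b of two equally sized images."""
    return [
        [
            [round(t * ca + (1 - t) * cb) for ca, cb in zip(pixel_a, pixel_b)]
            for pixel_a, pixel_b in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(a, b)
    ]


def frames(a: Image, b: Image, steps: int = 50) -> List[Image]:
    """Frames for t = 0/steps, 1/steps, ..., steps/steps (steps + 1 frames in total)."""
    return [blend(a, b, k / steps) for k in range(steps + 1)]
```

With steps=50 this gives the same 51 frames as the Table in the Mathematica version, starting at image B (t=0) and ending at image A (t=1).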
First off, he presents the big chart twice. The second version is meant to compare functional languages with imperative languages, but it's also small enough to fit on my screen, so if you're browsing the article, you might want to look at that one first.
His "obsolete" sector is really more like a special-purpose sector. For instance, Erlang shows up in the obsolete sector, but that's because Erlang wasn't designed to be especially terse or fast. Erlang was designed to be fault-tolerant and automatically parallelizable. Io also ends up looking lousy, but Io also wasn't designed to be terse and fast; it was designed to be small and simple.
The biggest surprise for me was the high performance of some of the implementations of functional programming languages, even in cases where the particular languages aren't generally known for being implementable in a very efficient way. Two of the best-performing languages are stalin (an implementation of scheme/lisp) and mlton (an implementation of ml). However, as the author notes, it's common to find that if you aren't sufficiently wizardly with fp techniques, you may write fp code that performs much, much worse than the optimal; that was my own experience with ocaml, for instance.
The choice of a linear scale for performance can be a little misleading. For instance, csharp comes out looking like it's not such a great performer, and yet its performance is never worse than the best-performing language by more than a factor of 2 on any task. Typically, if two languages differ by only a factor of 2 in speed, then speed isn't an important factor for choosing between them. The real thing to look out for is that some of the languages seem to have performance that's a gazillion times worse than normal on certain specific tasks.
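One way to make the "factor of 2" point concrete is to express each timing as a slowdown relative to the fastest entry and then look at it on a log scale. The sketch below is a rough illustration, not the method used in the article: the function names are hypothetical and the timings in the usage note are made up, not taken from the Shootout.

```python
import math
from typing import Dict


def slowdown_factors(times: Dict[str, float]) -> Dict[str, float]:
    """Divide every timing by the best (smallest) one, so the fastest entry is 1.0."""
    best = min(times.values())
    return {name: t / best for name, t in times.items()}


def log2_slowdowns(times: Dict[str, float]) -> Dict[str, float]:
    """Same factors on a log2 scale: each unit is one doubling of run time."""
    return {name: math.log2(f) for name, f in slowdown_factors(times).items()}
```

For invented timings such as {"a": 1.0, "b": 2.0, "c": 64.0}, a linear axis would squash "a" and "b" together next to "c", while the log2 values 0, 1 and 6 show that "b" is one doubling behind and "c" is the real outlier.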
Many of the languages are hard to find, because they're listed by the names of their implementations. In particular, g95 is an implementation of fortran.
Find free books.
Isn't that the 'dlang' reference?
Where's Ada?
One item in the list is gnat which is one particular implementation of Ada. So, there is at least one Ada implementation on the list. I did not recognize any others.
--- Liberty in our Lifetime
These sorts of things never fail to amaze me.
The verbs, nouns, semantics and such used in a given programming language have nothing, I repeat... NOTHING to do with performance!
What does have to do with performance is the talent of the compiler / interpreter author, nothing more, nothing less.
C implements ++ and so forth and so on. Pascal does not, you have to express it as var := var + x or in some implementations as inc(var) or inc(var,100). The smart compiler / interpreter author would implement those in the fastest possible way regardless of the particular language.
The one metric that has real meaning is programmer enjoyment. Do you prefer terseness over verbosity or something in between? Does this language's flow make you truly appreciate working with it?
The only other real metric that has any true meaning is again the talent of the compiler / interpreter author. Was the language parser built so that it can unfold complex statements that are often required to express certain ideas and perform certain operations? Does the language implement your favorite expression, eg: ++ , or something like that, which again harkens back to programmer enjoyment?
So what it really leaves us with is, "Do you enjoy using that language?" and only you, the programmer can answer that question.
Hey KID! Yeah you, get the fuck off my lawn!
Ha ha. Good joke. You've forgotten the most important feature "needed in a great programming lang": higher-order and first-class functions with proper closures. Oh wait, C doesn't have that.
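For anyone unsure what "first-class functions with proper closures" buys you, here is a minimal sketch in Python. The names make_counter, make_adder and apply_twice are made up for illustration; the point is only that functions can be created at run time, capture local state, and be passed around like any other value.

```python
from typing import Callable


def make_counter(start: int = 0) -> Callable[[], int]:
    """Return a function that keeps its own private, mutable count."""
    count = start

    def step() -> int:
        nonlocal count  # the inner function closes over 'count'
        count += 1
        return count

    return step


def make_adder(n: int) -> Callable[[int], int]:
    """Return a function that adds the captured value n to its argument."""
    return lambda x: x + n


def apply_twice(f: Callable[[int], int], x: int) -> int:
    """A higher-order function: takes another function as an argument."""
    return f(f(x))
```

Each call to make_counter produces an independent counter, and apply_twice(make_adder(3), 1) applies the generated adder two times. Getting the same effect in C typically means passing a function pointer together with a hand-rolled context struct.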
Any truly great statically typed language will also have at least algebraic data types, parametric polymorphism (even C++ only has ad-hoc polymorphism), type constructors and functions, maybe even a Turing complete type system (heh). C doesn't have any of those.
Even aside from types, great languages should include tail-call optimization, pattern matching and hygienic macros (CPP macros are a bad joke).
Now don't get me wrong. C is a great portable assembly language. It's close to the metal, widely known and easy to read. But as far as programming languages go, C is feature poor.
I think your statement is strictly speaking true but not useful in practice.
Here's what I mean: strictly speaking, with unlimited intelligence on the compiler's part, the compiler can understand what a program does and rewrite it completely as it wishes to conform to the same behavior. This means any Turing-complete language can have the same performance, with a sufficiently intelligent compiler.
In practice and in current times, however, a language's features determine how well the state-of-the-art in compilers can optimize a program. To give a very simple example... You don't see compilers inserting statements to free memory in Java programs, even though that would sometimes make them faster than running them with a garbage collector as happens in practice.
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
It's *not* several layers of objects.
System is the package containing the 'out' object, which is of type "PrintStream". println() is simply the method that outputs a line of text to the standard output, followed by a newline character for the current system.
Now, compare this to C++: std:: is the namespace, which is the same as "System". However in C++, this code: "using std::cout;" will allow you to omit the name space when you use the object. "cout" is an object of type ostream, like "out" in Java. And then, "operator
So in both languages, we're dealing with 3 different things: a namespace, an object, and a function call. Some people think C is better because you can just write printf(), but I disagree because I cringe whenever I need to type silly things with libraries such as BASS_DoThis(), BASS_DoThat(), etc. Namespaces are *way* better.
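The same three-part shape (namespace, object, method call) shows up in other languages too. As a rough parallel, here it is in Python, where the module sys plays the role of the namespace and "from sys import stdout" is loosely comparable to a using-declaration. The helper print_line is a hypothetical name added only so the idea can be exercised on any file-like object.

```python
import io
import sys
from sys import stdout  # brings the object into scope, so the namespace can be omitted


def print_line(stream, text: str) -> None:
    """Write text followed by a newline to any file-like object."""
    stream.write(text + "\n")


if __name__ == "__main__":
    sys.stdout.write("namespace.object.method\n")  # fully qualified
    stdout.write("object.method\n")                # namespace omitted after the import
    print_line(io.StringIO(), "goes to an in-memory buffer instead")
```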
When the first 4K TRS-80 showed up at my high school, I had the option to learn BASIC, Z80 machine code, or APL up the street at the local university.
One of my early exercises with BASIC was writing a set of nested for loops which called a subroutine (gosub). I put the next statement that controlled the for loop iteration inside the subroutine, and the return statement for the subroutine inside the nested for loops. It still worked! At that point I understood that there were mechanistic languages and languages with a solid conceptual basis.
APL's reputation for inscrutability was only halfway deserved. Often the problems arose when you were trying to shoe-horn a data structure that didn't want to be an array into an array, because that was your only hammer. Later APL supported nested arrays, which increased the data structuring options, but I think by then the PR battle was lost.
In the original APL, it was kind of painful to pass more than two arguments to an APL function. This led to programmers passing in flags to the function encoded in the array's rank, which were extracted to the tune of rho rho rho, while imagining the knapsack folding problem in Colossal Cave Adventure. As brutal as any language I've used. But you have to give APL a bit of a pass in some respects. Like vi, it was designed in 1963 to work well within the constraints of a paper teletype.
The next level of inscrutability arose because the APL primitives could often be combined in novel ways to yield surprisingly powerful algorithms. IIRC, the IBM 370 APL included a JIT compiler for certain common APL idioms. A one line program I wrote in APL to find primes (sieve of Eratosthenes) ran ten times faster than the compiled PASCAL program by the CS student sitting next to me.
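For reference, the sieve of Eratosthenes mentioned above looks like this when written out step by step. This is a plain Python sketch of the textbook algorithm, not a reconstruction of the APL one-liner or of the Pascal program; the function name primes_up_to is an assumption.

```python
from typing import List


def primes_up_to(n: int) -> List[int]:
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p
            # (smaller multiples were already crossed out by smaller primes).
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]
```

The explicit loops and flag array are the kind of iterative solution an array language can express as a single expression over whole vectors.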
Understanding APL was a lot easier if you were familiar with functional programming languages, but these hadn't been invented yet. Hey, I didn't know this: the Wikipedia page credits APL as a direct influence on FP, which I first heard of in 1982. Father knows best.
So you encounter this unfamiliar pattern of 15 familiar symbols for the first time, and your brain is polluted with horrible iterative solutions from BASIC or PASCAL, and the beauty of the expression is denied to your limited frame of consciousness.
Like solving a Sudoku? Hardly. It takes me twenty minutes to solve a typical five star Sudoku. It used to take me about the same amount of time to puzzle out an unfamiliar APL one liner, which might be anywhere from 10 to 40 characters. There is one small difference: after decoding the APL algorithm, I usually slapped myself across the head and moaned to myself, "I am unworthy to drool on the shoe laces of the grand designer, but I will learn!" Never got that feeling from Sudoku.
Wrestling with the higher art of APL was like giving your ignorance a root canal. Sometimes the root canal made me barf up my milk: when the highest art of APL was applied to shoe horn a data structure unsuitable to array representation into an array representation anyway, like the Beethoven scene in Clockwork Orange.
The third case is where the one liner isn't all that difficult, but it's doing it in more dimensions than the brain wishes to visualize. This is a case where a picture is worth a thousand words. Your 20 character APL function would have been better presented as a caption on a one page UML diagram. Never figured out how to embed a UML diagram in an APL lamp statement on my VT100 terminal.
Another problem APL suffered was too much kinship with Forth. To thrive in APL, you needed to create hundreds of tiny APL functions, which the implementations of the day mashed together into a single unmaintainable workspace.
And the system interface tended to suck.
But other than that, what's not to like?
I could have composed instead a tedious, but germane post on Shannon's first law: concision is a function of preconception. It's a rare breed of programmer who thrives in a language which provides su
Performance is created by the compiler, not the language. A C program compiled with a shitty compiler is going to run slower than a Ruby one in a good VM, even though C is running native on the CPU. For that matter, what if I take the C code and compile it with the CLR as a VM target?
I wish people would stop trying to compare languages by performance, it does not make any sense. The only language it makes any sense for at all is assembler.
If you would like APL to be on the list, then submit benchmarks for APL to the Shootout (the blog got its data from there).
~I'm sure he would definitely like to. But he broke the necessary keyboard~
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
I'm sure there are others, but I have work to do.
All of these numbers are based on completely flawed microbenchmarks from a site that used to be called "The Language Shootout". The numbers have been thoroughly debunked several times in the past. See this thread, for example: http://www.javalobby.org/java/forums/t86371.html?start=30 Or just google for "language shootout". It's not that the people who run this are just incompetent (making dumb mistakes like including JVM startup times). It's that they actively allow and even encourage cheating. For example, at least one of the "C" benchmarks actually uses hand-coded assembly (libgmp), and rather than stick to the obvious "program must be written in the language it's listed under" rule, the site maintainer suggests that the "Java" benchmark could be changed to also use assembly. This is all documented in the thread listed above. After several of these debunkings over the years, they had to change the name from "the language shootout" to something else, as any quick google will show that these benchmarks are completely bogus. Nothing to see here, move along.
You should not draw too many conclusions about the results of those two without taking into account the fact that both are whole-program optimizing compilers. Those two systems just do not support separate compilation of individual source files; if you change one line of one file in your program, every file must be recompiled.
This type of compiler has a performance advantage over the more common separate compilation systems, simply because it can inline anything anywhere, and thus optimize far more aggressively. But it's next to useless for developing large software systems, and thus mostly really useful only for writing smallish programs, in very high-level style, that perform some really expensive computations really fast.
Are you adequate?
Funny, isn't this what Twitter thought too before dumping RUBY entirely? Wasn't this what Twitter thought as they threw more and more hardware at the problem and still could not solve the problem? Didn't Twitter end up spending more on IT to administer 2-3 times the numbers of servers that it would take to do the same thing in Python, PHP or Java?
Yeah, throw hardware at it. That's a viable solution for a company. As long as you aren't thinking about who has to maintain all those servers and the fact that RUBY STILL DOESN'T SCALE.
This is my sig. There are many like it but this one is mine.
If this were the case then Perl 6 would have stuck with the Pugs implementation.
That's silly. Pugs was not designed as a production implementation of Perl 6 - it was a proof of concept for the new syntax. The fact that Pugs was so surprisingly easy to implement, and so surprisingly performant before any effort at all was put into its performance, was a stunning demonstration of the power of functional programming.
GHC would have stuck with Darcs and not gone to GIT.
GHC did stick with Darcs. There was a time when a move to Git was considered, but it had nothing to do with Darcs being written in a functional language. At the time, the Darcs team was not big enough or well enough organized to keep up with the heavy support demands of being the RCS for a large and growing compiler project. That has changed. Darcs has become one of the best run (and most fun) open source projects, with a large team of active and talented developers. Darcs is now an excellent choice of RCS even for very large projects.
That is typical of the process that has been happening with functional programming during the past few years - a quick transition from the theoretical to the best-of-breed in practice.
So if you want to continue this conversation start reading the various performance related discussions. There are 2 decades of papers on trying to resolve specific examples of this problem.
If you're interested in history, you are the one who ought to have a good look at that literature. Learn to tell the difference between knotty problems that were identified twenty years ago, and the astonishing continuous progress that has been made since then.
But if you want to continue this conversation, you should look at what is happening in the present. One thing that has happened is that you no longer need to read academic papers to learn and use functional languages in practice. There are books, online tutorials and resources, many developer-friendly tools, and a huge and super-friendly community.
The shootout is a nice demonstration that speed optimization is another aspect of the increasing strength of modern functional compilers and functional programming techniques. No one will claim that a functional language can beat C at speed right now, but it says a lot that such high level languages can now compete well in the same league as C. As hardware architectures continue to move farther and farther away from the classical imperative model, watch for this trend to continue.
I love functional languages but putting your head in the sand regarding where the problems are and considering the evaluations FUD is not going to advance the cause.
I called the great-grandparent post FUD because it is FUD. These are common misconceptions, caused by lingering impressions of where functional programming was decades ago. Open your eyes, look at the facts, see what is happening now. Don't let the FUD lull you into apathy.