Comparing the Size, Speed, and Dependability of Programming Languages
In this blog post, the author plots the results of 19 different benchmark tests across 72 programming languages to create a quantitative comparison between them. The resulting visualizations give insight into how the languages perform across a variety of tasks, and also how some some languages perform in relation to others.
"If you drew the benchmark results on an XY chart you could name the four corners. The fast but verbose languages would cluster at the top left. Let's call them system languages. The elegantly concise but sluggish languages would cluster at the bottom right. Let's call them script languages. On the top right you would find the obsolete languages. That is, languages which have since been outclassed by newer languages, unless they offer some quirky attraction that is not captured by the data here. And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use."
And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use.
I must ask why the author assumes that verbosity is bad and why lack thereof makes it a "joy to use."
I think verbosity in moderation is necessary. I have read many an article with developers arguing that they don't need to document their code when their code is self-documenting. Do you make all of your variables and class/function/methods a single character for the sake of verbosity? I hope not. And I would think that reading and maintaining that code would be far less than a joy.
I don't even need to argue this, according to his graphs we should all be using Regina, Mlton or Stalin (a scheme implementation). But instead languages like Java and Perl and C++ prevail. And I would guess that support and a mediocre range of verbosity are what causes that.
Great work in these graphs! But in my opinion, verbosity when used in moderation--like a lot things--is far better than either extreme.
My work here is dung.
This site is awesome. It's very simple. They have over code in over 1200 different languages that spits out the lyrics to the "99 bottles of beer on the wall" song. Check out the perl example (yes, it really does work): http://99-bottles-of-beer.net/language-perl-737.html
"I have never let my schooling interfere with my education." --Mark Twain
I am surprised how they manage to get scala to perform so much worse than pure java.
.class files and uses static typing and the makes claim that the bytecodes are almost identical.
Scala compiles to pure java
I wonder if the benchmarks are executed in the same environment.
http://shootout.alioth.debian.org/ has a Gentoo label behind the java benchmarks, but not the Scala one.
And finally, in the bottom left corner you would find probably nothing, since this is the space of the ideal language, the one which is at the same time fast and short and a joy to use."
Plain ASCII. The shorter and faster it is, the more joy it is to use.
What can compare to the joy and speed of, for example, the command "Go fuck yourself!"
Even shorter and faster syntax: the command "F.U.!"
And for conciseness of comments - "SHIT!" and "oops!" and "WTF???"
Looping constructs: "Sit on it and rotate!"
If-else constructs: "Dat so? F.U. 2!"
foreach: "You, your mamma, and the horse you rode into town on!"
Exit statements : Just fuck off!"
c-style assertions: "Eat shit and DIE!"
#defines: "#define YOU One dumb motherfucka"
conditional #includes "#ifdef YO_MAMMA"
real-time peremption: "I OWN you, beotch!"
Programming languages don't have attributes like size and speed: implementations of these languages do. Take Common Lisp for example: SBCL is blazing fast, while CLISP is rather pudgy (albeit smaller). Any conforming Common Lisp program will run on both. Or consider Python --- IronPython and CPython have different performance characteristics. (I'm too lazy to link these now.)
Point being, describing a programming language as "fast" makes about as much senese as describing a natural, human language as "smart".
http://en.wikipedia.org/wiki/Forth_(programming_language)
--
Forth is a simple yet extensible language; its modularity and extensibility permit the writing of high-level programs such as CAD systems. However, extensibility also helps poor programmers to write incomprehensible code, which has given Forth a reputation as a "write-only language". Forth has been used successfully in large, complex projects, while applications developed by competent, disciplined professionals have proven to be easily maintained on evolving hardware platforms over decades of use
--
Forth is still used today in many embedded systems (small computerized devices) because of its portability, efficient memory use, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as machine language have been produced
--
Verbosity = ( 1 / Expressiveness )
Where Cobol and RPG, the languages that run business?
I think verbosity in moderation is necessary. I have read many an article with developers arguing that they don't need to document their code when their code is self-documenting. Do you make all of your variables and class/function/methods a single character for the sake of verbosity? I hope not. And I would think that reading and maintaining that code would be far less than a joy.
Long meaningful identifiers are useful. Needing 5 lines of setup for each API call is annoying, particularly if those 5 lines are usually the same. Requiring lots of redundant long keywords to "look more like English" is annoying. Large standard libraries that let you remove most of the tedious parts from your code are useful.
Every time I see one of these things, OCaml always rocks it. I wonder why it never caught on to a greater degree?
CAN HAS LOLCODE?
KTHXBYE
This kind of fits in with my thinking.
When I was starting out in programming, I just wanted results. I wasn't concerned about performance because the computer was a million times faster than me. I was most concerned about how many "non-vital" keywords were necessary to describe what I wanted the machine to do (e.g. "void main(...)" isn't *vital* because it's just boilerplate. However "if", "for", "while" etc. would be vital - and even for/while are just cousins), and how many of the vital keywords (i.e. those that specifically interfered with the way my program would *actually* operate... a "static" here or there would hardly matter in the course of most programs) were "obvious". Java failed miserably at this... I mean, come on: System.out.println() and the standard wrapping take up too much room.
So, BASIC was an *ideal* first language (sorry, but it was, and the reason nobody uses it much now is because EVERYONE has used it and moved on to something else - doesn't mean it "breaks" people). In this regard, even things like C aren't too bad - 30-50 keywords / operators depending on the flavour, all quite simple - you could memorise them perfectly in an afternoon. However things like Forth and Perl can be hideous.
And even C++ is tending towards the stupid. Believe it or not, even things like bash scripting come out quite well under that test. And, to me, that correlates with the amount of effort I have to put in to write in a particular language. If I just want to automate something, bash scripting is fast and easy. Most of the stuff I write is a "one-job program" that will never be reused. If I want to write a program to work something out or show somebody how something is done programmatically, BASIC is a *perfect* prototyping language (no standard boilerplate, no guessing obscure keywords, etc.). If I want to write a program that does things fast, or accurately, or precisely, or for something else to build upon, C is perfect.
I see no real need to learn other languages in depth past what I'm required to know for my work. I have *zero* interest in spending weeks and weeks and weeks learning YAPL (Yet Another Programming Language) just to spent 90% of that time memorising obscure keywords, boilerplate and the language's shortcuts to things like vectors, string parsing, etc. If I was going to do that, I'd just learn a C library or similar.
I think that these graphs correlate quite well with that thinking. Let's be honest, 99% of programming is reusing other code or shortcuts - short of programming in a Turing machine, C is one of the simplest languages to learn because it *doesn't* have a million shortcuts... you want to iterate over an array or create a hash / linked list, etc. you have to do it yourself from basic elements. In modern programming, that means a one line include of a well-written library. As far as I was concerned when learning it, even the "pointer++ increases by the size of the pointer" was far too smarty-pants for me, but incredibly useful.
But with C++, I instantly lost interest because it's just too damn verbose to do a simple job. Java OOP is slightly better but still nasty once things get complicated and the underlying "functional" language is basically a C-a-like.
I'm a fuddy-duddy. Old fashioned. If I write a program, the damn computer will damn well do instruction 1 followed by instruction 2 with the minimum of flying off into libraries and class systems. If I want 4 bytes of memory to change type, then I will damn well have them change type. And I'll even get to specify *what* 4 bytes of RAM if I want and I'll clean up after them if it's necessary. That's how I think, so things like C match perfectly when I want to code. The fact that C is damn powerful, fast, low-level and so common also add to it's appeal.
I worry about what will happen when people *only* code in OOP languages. The abstraction is so large that people forget that they are still telling a computer to handle bits and bytes and suddenly they get lazy. M
On the plus side, both versions of Python can claim many of the smallest programs in the collection. Ruby (8, 1) might also compete for titles, but unfortunately its performance is so bad its star falls off the performance chart.
Then why the fuck is the Ruby community hyping it so much, and drawing nieve young developers in to a trap?
Not flamebait.
Why can't they make a language, or extend a language like Ruby, such that one can program it as a scripting language, but then add verbosity optionally (i.e. declaring the data types and their sizes, private / static etc. & whatever the hell makes a program light weight and fast) optionally? It's my hope that if I stick with Ruby one day it I won't be forced to learn Python because performance won't be "Ruby's big issue" in every discussion, but really, that is *just* a hope. I hope this isn't a mistake.
"You know you don't act like a scientist, you're more like a game show host." Dana Barret
If you would like APL to be on the list, then submit benchmarks for APL to the Shootout (the blog got its data fro there). The Shootout is mainly driven by user submissions. They do have some fairly strict rules about how to submit. However, if you can name an implementation (along with enough guidelines to make installing it easy) and provide a reasonably full set of benchmarks, then the language will generally be added.
One tip about submitting though: try to make life as easy as possible for the guy who runs it. He doesn't have time to figure out typos or to fix "easy" bugs in some programming language they he may not even know.
First off, he presents the big chart twice. The second version is meant to compare functional languages with imperative languages, but it's also small enough to fit on my screen, so if you're browsing the article, you might want to look at that one first.
His "obsolete" sector is really more like a special-purpose sector. For instance, Erlang shows up in the obsolete sector, but that's because Erlang wasn't designed to be especially terse or fast. Erlang was designed to be fault-tolerant and automatically parallelizable. Io also ends up looking lousy, but Io also wast designed to be terse and fast; it was designed to be small and simple.
The biggest surprise for me was the high performance of some of the implementations of functional programming languages, even in cases where the particular languages aren't generally known for being implementable in a very efficient way. Two of the best-performing languages are stalin (an implementation of scheme/lisp) and mlton (an implementation of ml). However, as the author notes, it's common to find that if you aren't sufficiently wizardly with fp techniques, you may write fp code that performs much, much worse than the optimal; that was my own experience with ocaml, for instance.
The choice of a linear scale for performance can be a little misleading. For instance, csharp comes out looking like it's not such a great performer, and yet its performance is never worse than the best-performing language by more than a factor of 2 on any task. Typically, if two languages differ by only a factor of 2 in speed, then speed isn't an important factor for choosing between them. The real thing to look out for is that some of the languages seem to have performance that's a gazillion times worse than normal on certain specific tasks.
Many of the languages are hard to find, because they're listed by the names of their implementations. In particular, g95 is an implementation of fortran.
Find free books.
Where's Ada?
One item in the list is gnat which is one particular implementation of Ada. So, there is at least one Ada implementation on the list. I did not recognize any others.
--- Liberty in our Lifetime
These sorts of things never fail to to amaze me.
The verbs, nouns, semantics and such used in a given programming language have nothing, I repeat... NOTHING to do with performance!
What does have to do with performance is the talent of the compiler / interpreter author, nothing more, nothing less.
C implements ++ and so forth and so on. Pascal does not, you have to express it as var := var + x or in some implementations as inc(var) or inc(var,100). The smart compiler / interpreter author would implement those in the fastest possible way regardless of the particular language.
The one metric that has real meaning is programmer enjoyment. Do you prefer terseness over verbosity or something in between. Does this languages flow amke you truly appreciate working with it.
The only other real metric that has any true meaning is again the talent of the compiler / interpreter author. Was the the language parser built so that it can unfold complex statements that are often required to express certain ideas and perform certain operations. Does the language implement your favorite expression, eg: ++ , or something like that, which again harkens back to programmer enjoyment.
So what it really leaves us with is, "Do you enjoy using that language?" and only you, the programmer can asnwer that question.
Hey KID! Yeah you, get the fuck off my lawn!
Ha ha. Good joke. You've forgotten the most important feature "needed in a great programming lang": higher-order and first-class functions with proper closures. Oh wait, C doesn't have that.
Any truly great statically typed language will also have at least algebraic data types, parametric polymorphism (even C++ only has ad-hoc polymorphism), type constructors and functions, maybe even a Turing complete type system (heh). C doesn't have any of those.
Even aside from types, great languages should include tail-call optimization, pattern matching and hygienic macros (CPP macros are a bad joke).
Now don't get me wrong. C is a great portable assembly language. It's close to the metal, widely known and easy to read. But as far as programming languages go, C feature poor.
When the first 4K TRS-80 showed up at my high school, I had the option to learn BASIC, Z80 machine code, or APL up the street at the local university.
One of my early exercises with BASIC was writing a set of nested for loops which called a subroutine (gosub). I put the next statement that controlled the for loop iteration inside the subroutine, and the return statement for the subroutine inside the nested for loops. It still worked! At that point I understood that there were mechanistic languages and languages with a solid conceptual basis.
APL's reputation for inscrutability was only halfway deserved. Often the problems arose when you were trying to shoe-horn a data structure that didn't want to be an array into an array, because that was your only hammer. Later APL supported nested arrays, which increased the data structuring options, but I think by then the PR battle was lost.
In the original APL, it was kind of painful to pass more than two arguments to an APL function. This lead to programmers passing in flags to the function encoded in the array's rank, which were extracted to the tune of rho rho rho, while imaging the knapsack folding problem in Colossal Cave Adventure. As brutal as any language I've used. But you have to give APL a bit of a pass in some respects. Like vi, it was designed in 1963 to work well within the constraints of a paper teletype.
The next level of inscrutability arose because the APL primitives could often be combined in novel ways to yield surprisingly powerful algorithms. IIRC, the IBM 370 APL included a JIT compiler for certain common APL idioms. A one line program I wrote in APL to find primes (sieve of Eratosthenes) ran ten times faster than the compiled PASCAL program by the CS student sitting next to me.
Understanding APL was a lot easier if you were familiar with functional programming languages, but these hadn't been invented yet. Hey, I didn't know this: the Wikipedia page credits APL as a direct influence on FP, which I first heard of in 1982. Father knows best.
So you encounter this unfamiliar pattern of 15 familiar symbols for the first time, and you brain is polluted with horrible iterative solutions from BASIC or PASCAL, and the beauty of the expression is denied to your limited frame of consciousness.
Like solving a Suduko? Hardly. It takes me twenty minutes to solve a typical five star Sudoku. It used to take me about the same amount of time to puzzle out an unfamiliar APL one liner, which might be anywhere from 10 to 40 characters. There is one small difference: after decoding the APL algorithm, I usually slapped myself across the head and moaned to myself, "I am unworthy to drool on the shoe laces of the grand designer, but I will learn!" Never got that feeling from Sudoku.
Wrestling with the higher art of APL was like giving your ignorance a root canal. Sometimes the root canal made me barf up my milk: when the highest art of APL was applied to shoe horn a data structure unsuitable to array representation into an array representation anyway, like the Beethoven scene in Clockwork Orange.
The third case is where the one liner isn't all that difficult, but it's doing it in more dimensions than the brain wishes to visualize. This is a case where a picture is worth a thousand words. Your 20 character APL function would have been better presented as a caption on a one page UML diagram. Never figured out how to embed a UML diagram in an APL lamp statement on my VT100 terminal.
Another problem APL suffered was too much kinship with Forth. To thrive in APL, you needed to create hundreds of tiny APL functions, which the implementations of the day mashed together into a single unmaintainable workspace.
And the system interface tended to suck.
But other than that, what's not to like?
I could have composed instead a tedious, but germane post on Shannon's first law: concision is a function of preconception. It's a rare breed of programmer who thrives in a language which provides su
Performance is created by the compiler, not the language. A C program compiled with a shitty compiler is going to run slower than a Ruby one in a good VM, even though C is running native on the CPU. For that matter, what if I take the C code and compile it with the CLR as a VM target?
I wish people would stop trying to compare languages by performance, it does not make any sense. The only language it makes any sense for at all is assembler.
All of these numbers are based on what completely flawed microbenchmarks from a site that used to be called "The Language Shootout". The numbers have been thoroughly debunked several times in the past. See this thread, for example: http://www.javalobby.org/java/forums/t86371.html?start=30 Or just google for "language shootout". It's not that the people who run this are just incompetent (making dumb mistakes like including JVM startup times). It's that they actively allow and even encourage cheating. For example, at least one of the "C" benchmarks actually uses hand-coded assembly (libgmp), and rather than stick to the obvious "program must be written in the language it's listed under" rule, the site maintainer suggests that the "Java" benchmark could be changed to also use assembly. This is all documented in the thread listed above. After several of these debunkings over the years, they had to change the name from "the language shootout" to something else, as any quick google will show that these benchmarks are completely bogus. Nothing to see here, move along.
Funny, isn't this what Twitter thought too before dumping RUBY entirely? Wasn't this what Twitter thought as they threw more and more hardware at the problem and still could not solve the problem? Didn't Twitter end up spending more on IT to administer 2-3 times the numbers of servers that it would take to do the same thing in Python, PHP or Java?
Yeah, throw hardware at it. That's a viable solution for a company. As long as you aren't thinking about who has to maintain all those servers and the fact that RUBY STILL DOESN"T SCALE.
This is my sig. There are many like it but this one is mine.