Rosetta Code Study Weighs In On the Programming Language Debate
An anonymous reader writes: Rosetta Code is a popular resource for programming language enthusiasts to learn from each other, thanks to its vast collection of idiomatic solutions to clearly defined tasks in many different programming languages. The Rosetta Code wiki is now linking to a new study that compares programming language features based on the programs available in Rosetta Code. The study targets the languages C, C#, F#, Go, Haskell, Java, Python, and Ruby on features such as succinctness and performance. It reveals, among other things, that: "functional and scripting languages are more concise than procedural and object-oriented languages; C is hard to beat when it comes to raw speed on large inputs, but performance differences over inputs of moderate size are less pronounced; compiled strongly-typed languages, where more defects can be caught at compile time, are less prone to runtime failures than interpreted or weakly-typed languages."
If you are characterizing performance on "large inputs" without quantifying machine behaviors (cache, TLB, RAM), you're doing it wrong.
So it's telling us just what we already knew? Interesting.
especially if it makes the code unreadable. Give me the verbose, easy to read code any time. If you really, really want succinctness, use Perl or even better, APL and don't worry about the next poor slob who has to maintain your code.
Isn't that kind of the point?
Is this supposed to be something we didn't know? Or just confirming something we did?
Lost at C:>. Found at C.
The difference, as the summary noted, is that when using a scripting language, you are trading all your compile-time errors (build breaks) for runtime errors that your users will see.
If you write 'C' code, would you declare all your input and output return types as 'void*'?
If you write 'Java' code, would you declare all your input and output return types as 'Object'?
Why someone would willingly give up the function of a compiler is beyond me. Sure, use scripts for little tasks, prototyping, etc. Any long-term project should be using a proper language that provides type-checking (at compile time) and proper encapsulation, so that 'private' means 'private' (looking at you, Groovy). I don't want to be forced to read every line of your crappy code just to try to figure out what object-type your method supports because you are too damn lazy to define it in the method's interface.
When you change the behavior of the method and assume different input/output object-types, I want that to be a BUILD-BREAK instead of me once again having to reverse engineer your code.
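A minimal sketch of the trade-off described above, using Python as the example scripting language. The function names are hypothetical. The type hint documents the intent, but the interpreter does not enforce it, so the mismatch only surfaces when the offending call actually runs; a separate static checker such as mypy is one way to turn it into something closer to a build break.

```python
from typing import List


def total_length(items: List[str]) -> int:
    """Sum the lengths of the given strings (hypothetical helper)."""
    return sum(len(item) for item in items)


def run_rare_branch() -> str:
    """Simulate a rarely executed code path that passes the wrong type."""
    try:
        # The hint says List[str], but Python itself does not check it,
        # so this only fails once the line is executed.
        total_length([1, 2, 3])
    except TypeError as exc:
        return f"failed at runtime: {exc}"
    return "no error"


print(run_rare_branch())
```

In a statically typed language the equivalent call would normally be rejected before the program ever ships.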
It is disappointing that this is not the only comment for this post. You didn't even need to waste your time on a body or quoting just a title of Duh!
Average Go programmers.
Maybe it tells us something about the maturity of a language (wide knowledge of best practices, or even their existence) and the availability of skilled programmers, rather than runtime performance.
And in the real world, it might be a lot more important how fast/readable/maintainable the code written by people you can hire will be, rather than how fast/readable/maintainable it could possibly be in the most idealized situation.
Simply because a language is billed as a "scripting" language (by which people tend to mean distributed as source code and partially compiled for each execution rather than compiled once and distributed as object code rather than actually used primarily to script other programs) doesn't mean there's no programming paradigm associated with them. They can support procedural, functional, actor-based, object-oriented, logical, dataflow, reactive, late binding, iteration, recursion, concurrency, and whatever other paradigms and methods people want. Some of them support mixing and matching even in the same program.
Languages that are typically fully compiled can even be run in an interpreter. C-- comes to mind. Often languages known for interpretation (most of which are actually partially compiled rather than interpreted line-by-line) have support for compiling at least portions of a program up front, too. Examples include the .pyc files of Python, LuaJIT, Facebook's HHVM, Steel Bank Common Lisp, and Reini Urban's work on perlcc.
People making claims about one type of language vs. another should really keep straight what types they are talking about.
It always comes down to personal preference.
All I care about is performance. If I want performance, I will learn how to use C++, regardless of what new writing methods I need to learn.
People who can't or don't want to learn a "better language" will always try to brickwall every other language with an excuse that suits their "locked in" writing ability.
Funny how C++ is missing from these "let's try and justify programming languages using a graph made by a 2 year old with crayons" tests.
Not obvious at all that C is hard to beat on raw speed on large inputs. Fortran and COBOL and Forth do that.
And yet, Python is one of the most succinct languages in the study...
Why aren't there more languages that allow strong typing where desired but weak typing where not? One can kind of emulate such with generic "variant" or "object" types in some languages, but you have to keep declaring everything "object". If I want dynamic parameters, I shouldn't have to put any type-related keyword in the parameter definition.
For example, one should be able to type: function foo(x, y)... versus function foo(x:int, y:string)... for weak and strong typing, respectively. And types could be converted as needed for strong parameters rather than require an explicit conversion/casting. For example, if you send a string to x in the 2nd function def above, it would attempt to parse and convert it to an integer.
Strong typing tends to be best for the "root" routines and deeper infrastructure guts of a system, but weak typing for the top-layer business logic, where being closer to pseudo-code helps one read and fiddle with ever-changing business logic. Dual typing would allow a single language to better fill both roles.
And one should be able to specify that a given class or module or name-space requires all variable and parameter definitions to be explicitly typed to enforce strong-typing for selected sections.
The idea that a language must be type-heavy OR type-light seems false; a mere habit of the industry.
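A rough sketch of the foo(x, y) versus foo(x:int, y:string) idea, written in Python purely for illustration. The decorator name coerce_annotated and both foo variants are hypothetical, and the sketch only handles plain classes such as int and str as annotations: unannotated parameters are left alone, annotated ones are converted on the way in.

```python
import inspect
from functools import wraps


def coerce_annotated(func):
    """Convert each argument to its annotated type; skip unannotated ones."""
    sig = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in list(bound.arguments.items()):
            param = sig.parameters[name]
            target = param.annotation
            # Leave weakly typed parameters and *args/**kwargs untouched.
            if target is inspect.Parameter.empty:
                continue
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            # Only plain classes (int, str, float, ...) are handled here.
            if isinstance(target, type) and not isinstance(value, target):
                bound.arguments[name] = target(value)  # may raise ValueError
        return func(*bound.args, **bound.kwargs)

    return wrapper


def foo_weak(x, y):
    """Weak typing: takes whatever it is given."""
    return (x, y)


@coerce_annotated
def foo_strong(x: int, y: str):
    """Strong typing: x is parsed to int, y is converted to str."""
    return (x, y)


print(foo_weak("42", 7))    # ('42', 7)
print(foo_strong("42", 7))  # (42, '7')
```

Passing a string that cannot be parsed, such as foo_strong("abc", "y"), raises ValueError at the call, which is still a runtime failure rather than a build break.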
Table-ized A.I.
If I wrote a C program using one line and lots of ;s, it would be the most concise program possible.
Rosetta Code solutions were chosen precisely because they're idiomatic, and hence not tuned to these benchmarks.
Poor python, where newlines have syntactic effect!
There are loads of Python solutions posted on http://codegolf.stackexchange.... which *are* tuned to similar benchmarks.
It's remarkable how much can be achieved with a single list comprehension.
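As a small illustration (not taken from Rosetta Code or Code Golf), here is a trial-division prime filter written as one list comprehension:

```python
# Keep n when no d in 2..floor(sqrt(n)) divides it evenly.
primes = [n for n in range(2, 30)
          if all(n % d for d in range(2, int(n ** 0.5) + 1))]

print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```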
My /. account still works. Haven't used it in ...err... a few years.
Just here to bump this article for my friend Mike and his fantastic work on RC.
A study recently found that "Duh" is far more succinct than "You're telling us something we already knew", and "No shit, Sherlock" applies greater annoyance at the repetition of redundant information.
Is not true for all scripted languages. AutoHotkey for instance gained a #warn flag, that among other things:
1) Tells you when you have a local variable with the same name as a global. The local trumps the global, unless it's a forced/hard Global, but it lets you know so as to be aware.
2) Tells you when you reference a variable that hasn't been assigned a value, prior to the reference.
3) Warn when an environment variable is automatically used in place of an empty script variable.
AHK is written in C++. Supports normal braces usage {}. Has an interesting take on Objects, as well as things like Regexp Match objects. Can pass functions as variables to functions, ByRef variables, varargs, etc. AHK can be compiled to an .exe instead of just run-time interpreted.
Lexikos has done some pretty cool stuff with it since ~2007; AHK2 is shaping up. More info here for anyone interested.
Note: autohotkey.com has been subverted by a f'ing wanker that won't give up the domain-name and is pushing an agenda. He also only provides a version of AHK that hasn't been updated in 7 years (from 2007).
If I understand the statistics correctly, the average program has 71 lines of code. Those are Mickey Mouse tests for which scripting languages shine. All the verbosity of imperative languages becomes handy when you have tasks that are a few hundred KLOC long.
This is a lesson Perl learned the hard way: once your program is long enough, you beg on your knees for a strong static type-checking system.
The problem with programming language evaluations is that they tend to be based on small snippets of code, like this one, or data from novice student programmers, or worse, popularity. Yet what really tends to matter is how much trouble a language causes in large systems and in later years. That's where high costs are incurred because changes in module A affect something way over in module Z. Undetected cross-module bugs, high costs of changing something because too much has to be recompiled, that sort of thing. How much help the language gives you then matters.
A really good programming language study should digest data from change logs on some major open source projects.
Neigh-sayers?
What do you have against horses?
In my opinion the basic trade-off is that "scriptish" languages can be written to be closer to pseudo-code and thus easier to read and grok. Strong/heavy typing tends to be verbose and redundant, slowing down reading.
Better grokkability often means less "conceptual" errors, but at the expense of more "technical" errors, such as type mismatches. There's no free lunch, only trade-offs.
In some projects the conceptual side overpowers the technical-error side, and vice versa. It also depends on the personality of the coder or team.
Table-ized A.I.
So it's telling us just what we already knew? Interesting.
For three or more decades. (Before that some of the classes of things they're comparing didn't exist, with enough deployment, to characterize.)
On the other hand, it's nice to have it confirmed with some rigor and measures.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Take that you neigh-sayers! ;-)
We are no longer the programmers that say neigh, we are now the programmers that say Ekki-ekki-ekki-ekki-PTANG. Zoom-Boing, z'nourrwringmm
If my comment didn't sound as good in your head as it did in mine, then I guess we all know who's to blame
The biggest surprise for me is how well Go does:
"Go is the runner-up but still significantly slower with medium effect size: the average Go program is 18.7 times slower than the average C program. Programs in other languages are much slower than Go programs, with medium to large effect size (4.6–13.7 times slower than Go on average)."
My only objection is that they classify Go as "procedural" along with C, Ada, PL/1 and FORTRAN. It may not have inheritance (a good thing in my book!) but it has many OO features including support for abstraction and encapsulation.
You can strip out all C newlines (after removing escaped newlines) and replace them with spaces, except for ones right before a #, and the exactly identical code will score much higher on succinctness.
Did you do this to a bunch of examples on Rosetta Code before the database dump was taken that this study is based on? No? Then it doesn't matter, because as I said Rosetta Code represents idiomatic solutions.
Code Golf already takes this into account (counting bytes).
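To make the lines-versus-bytes point concrete, here is a rough sketch that assumes succinctness is measured in non-blank lines of code, as the comments above suppose (the study's exact metric is not restated here). The helper names and the C snippet are made up for illustration.

```python
def line_count(source: str) -> int:
    """Count non-blank lines, the measure that newline stripping games."""
    return sum(1 for line in source.splitlines() if line.strip())


def byte_count(source: str) -> int:
    """Count UTF-8 bytes, the measure Code Golf uses."""
    return len(source.encode("utf-8"))


# A small C function with no preprocessor lines or escaped newlines,
# so replacing every newline with a space leaves its meaning unchanged.
c_source = """int main(void)
{
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += i;
    return total;
}
"""

one_liner = c_source.replace("\n", " ")

print(line_count(c_source), byte_count(c_source))
print(line_count(one_liner), byte_count(one_liner))
```

The line count drops from 7 to 1 while the byte count stays the same, which is why a byte-based measure is not fooled by the trick.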
Poor python, where newlines have syntactic effect!
So do semicolons. For example:
>>> print "this";print "that"
this
that
>>>
works.