Is Perl Better Than a Randomly Generated Programming Language?
First time accepted submitter QuantumMist writes "Researchers from Southern Illinois University have published a paper comparing Perl to Quorum(PDF) (their own statistically informed programming language) and Randomo (a programming language whose syntax is partially randomly generated). From the paper: 'Perl users were unable to write programs more accurately than those using a language designed by chance.' Reactions have been enthusiastic, and the authors have responded."
Better? How about we start with distinguishable?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
I always thought Perl was a randomly generated programming language.
There's no -1 for "I don't get it."
I'd have to say PERL is better than a lot of purposefully crafted languages. Its syntax is very forgiving, and there are lots of ways to do most things. Those two components are likely the reason this study came to that conclusion. This in no way means that PERL is not a good language. It does mean that many people can write PERL badly, but many people speak English badly and that doesn't reflect poorly on the language. PERL is, IMO, and should always be: Easy to do, but impossible to do "perfectly". But then I'm not sure that anything can truly be done "perfectly". Things may be done poorly, well, very well, or nearly perfectly, but to claim perfection is to deny the possibility of improvement.
How does C++ fair?
Farely average.
How does C++ fair? LOL
#%@$&#@^UGSOWDYRO&F@#L(EGFGP*$TW
This Script written in Perl computes the answer.
They claim that Perl is not significantly better than Randomo, but that's just due to the test they chose. Looking at their figure, Perl programmers outperformed Randomo programmers in 6/6 tasks (that is, their means were greater). Using a simple sign test on the differences between the means, the two-tailed p value is about 0.03, and the one-tailed p value (I think we're justified here in having a directional hypothesis...) is about 0.015. Both of these numbers are less than 0.05; we are justified in saying that Perl programmers performed significantly better than Randomo programmers, in spite of what the paper says.
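The arithmetic behind that sign test can be checked in a few lines. This is a minimal Python sketch that assumes only what the comment states (6 tasks, Perl's mean higher in all 6) and a null hypothesis where each difference is equally likely to be positive or negative. The function name `sign_test_p` is made up for illustration.

```python
from math import comb

def sign_test_p(wins: int, n: int) -> tuple[float, float]:
    """Exact sign test: return (one_tailed, two_tailed) p values.

    Under the null hypothesis each of the n differences is positive
    with probability 0.5, so the number of wins is Binomial(n, 0.5).
    """
    # Probability of seeing at least `wins` positive differences.
    one_tailed = sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n
    # The distribution is symmetric, so double it for two tails (capped at 1).
    two_tailed = min(1.0, 2 * one_tailed)
    return one_tailed, two_tailed

one, two = sign_test_p(wins=6, n=6)
print(one, two)  # 0.015625 0.03125
```

With only six tasks, 6/6 is the single outcome that can reach significance at 0.05 in a two-tailed sign test; 5/6 already gives a one-tailed p of 7/64, about 0.11.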
Are they telling me that Quorum is better than a randomly generated language at teaching and that Perl makes bad programmers? This sounds more like someone setting up a study and trying to rig it so that their horse (Quorum) gets taught in the classroom. Personally I stick Perl in the same bucket as VB and most scripts. They may have their uses but new programmers need to be beaten with languages like C and C++ first. Otherwise they learn bad habits. Perl only starts getting good when you use strict so that it has been given permission to beat the programmer for any little mistake.
Yes it was Perl 4, which is one of the flaws in this study.
Is Betteridge's Law of Headlines Correct?
Languages that consider whitespace need to die.
While Perl has never had a particular reputation for clarity, the fact that our data shows that there is only a 55.2% (1 - p) chance that Perl affords more accurate performance amongst novices than Randomo, a language that even we, as the designers, find excruciatingly difficult to understand, was very surprising.
This is a complete misunderstanding of what a p value means in statistical inference. The p value is not, and should not be interpreted as, the chance that "Perl affords more [or less] accurate performance." The p value is the chance, given that there is no difference, of obtaining a difference as large or larger. This is covered in first-year statistics.
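To make that definition concrete, here is a small exact permutation test in Python. It is only a sketch: the two score lists are invented for illustration and are not data from the paper, and `permutation_p_value` is a hypothetical helper name. It counts how often a random relabelling of the pooled scores (the "no difference" world) yields a gap in means at least as large as the observed one.

```python
from fractions import Fraction
from itertools import combinations

def permutation_p_value(a, b):
    """One-sided exact permutation test on the difference in means."""
    pooled = list(a) + list(b)
    n_a = len(a)

    def mean_diff(idx_a):
        # Mean of the scores labelled "a" minus mean of the rest.
        in_a = [pooled[i] for i in idx_a]
        in_b = [pooled[i] for i in range(len(pooled)) if i not in idx_a]
        return Fraction(sum(in_a), len(in_a)) - Fraction(sum(in_b), len(in_b))

    observed = mean_diff(tuple(range(n_a)))
    # Every way of relabelling the pooled scores is equally likely
    # if the labels make no difference.
    splits = list(combinations(range(len(pooled)), n_a))
    as_extreme = sum(1 for s in splits if mean_diff(s) >= observed)
    return Fraction(as_extreme, len(splits))

# Made-up scores out of 10 for two groups of four novices (NOT from the paper).
group_a = [6, 5, 7, 6]
group_b = [4, 5, 3, 5]
print(permutation_p_value(group_a, group_b))  # 3/70
```

The result says that 3 of the 70 possible relabellings produce a gap at least this large. It does not say there is a 67/70 chance that group A is really better, which is the misreading the comment objects to.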
But for now... If I were a Samurai, I would not start newbies with a live sharp sword. And Modern Perl is so, so very sharp...
I keep reading the full paper (+points for publishing the whole thing!) and have yet to hit upon the definition of the word "accurate" they are using to measure the results. Apparently that is contained inside their previous paper with no direct link. On page 3 though, Perl is described as "A well-known commercial programming language". Really? C# is a commercial language, Perl is an Open Source language with wide commercial adoption that has evolved over the years into several distinct beasts.
1 Dachshund + 1 Dachshunds = A Paradox.
Well said.
If you want your code properly indented, just indent it. It's like the Python apologists are incapable of formatting their code properly unless the language forces its particular version of "properly" on you.
Before the trolls fire back: In the case of code written by others, run it through a pretty-printer. Problem solved. Oh, as a bonus, you can use that same tool to format code the way you prefer, and switch it back to whatever style your company requires at the press of a button. Why is this a bad thing?
When Perl is well written, including indents and not jamming multiple lines all together on one line, it looks very similar to Python, but with a semicolon at the end of each line.
I was going to post a string of close parens as representing the termination of a Lisp program, but the comment moderation nanny would not let me do that. So much for trying to tell a geek joke around here.
If those punctuation marks (or keywords) make the code more readable, then they're not gratuitous are they? I, for one, find brace-less languages fantastically hard to read, Python especially.
"we did not train participants on what each line of syntax actually did in the computer programs. Instead, participants attempted to derive the meaning of the computer code on their own."
They were not trained. They were just shown code samples with no explanation. The code samples had 1-letter variable names and no comments. The Perl sample uses $_[0] for getting the first sub argument instead of shift, and "for ($i = $a; $i <= $b; $i++)" to do a for loop instead of "foreach $i ($a .. $b)", so it is deliberately obfuscated Perl.
Fortran (at least, IV and earlier) totally ignored white space, even in the middle of an identifier. Of course, this led to problems like
DO 10 I = 1.10
meaning "assign the floating point number 1.10 to variable DO10I", when the programmer meant to type
DO 10 I = 1,10
meaning "loop from here to label 10 varying I from 1 to 10".
An error something like this caused the Mariner 1 probe to Venus to go off course at launch, and the Range Safety Officer hit the destruct.
-- Alastair
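The DO-loop ambiguity described above is easy to reproduce with a toy classifier. This Python sketch is not a Fortran parser; it assumes only the two statement shapes from the comment, and `classify_fortran_iv` is a hypothetical name. It drops all blanks first, as the comment says old Fortran did, and then decides from the comma alone.

```python
import re

def classify_fortran_iv(statement: str) -> str:
    """Roughly classify a statement the way a blank-ignoring compiler would."""
    # Old fixed-form Fortran ignored blanks, so drop them all first.
    squeezed = statement.replace(" ", "")
    # DO <label> <var> = <start>,<end>: the comma is what makes it a loop.
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", squeezed)
    if loop:
        label, var, start, end = loop.groups()
        return f"loop to label {label}: {var} from {start} to {end}"
    # Otherwise <name> = <expr> is a plain assignment.
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", squeezed)
    if assign:
        name, value = assign.groups()
        return f"assign {value} to {name}"
    return "unrecognized"

print(classify_fortran_iv("DO 10 I = 1,10"))  # loop to label 10: I from 1 to 10
print(classify_fortran_iv("DO 10 I = 1.10"))  # assign 1.10 to DO10I
```

One character separates the two readings, and because the variable DO10I needs no declaration, nothing flags the mistake.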
Most languages consider whitespace. In most programming languages where both of the following are valid, they will have different semantics:
1: foo bar
2: foobar
Quite a lot of languages even distinguish between different types of whitespace, e.g., C where the following two constructs are different, despite differing only in which particular kind of whitespace:
1: //
foo();
bar();
2: // bar();
foo();
Python may be unusual in which differences in whitespace it considers significant, but not in that it considers whitespace significant. People need to stop confusing the issue.
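The first pair above can be checked against a real tokenizer. This is a short Python sketch using the standard library `tokenize` module; `name_tokens` is just an illustrative helper. It shows that one space turns a single identifier into two tokens, before indentation ever comes into play.

```python
import io
import tokenize

def name_tokens(source: str) -> list[str]:
    """Return the identifier tokens Python's own tokenizer finds."""
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.NAME]

print(name_tokens("foo bar\n"))  # ['foo', 'bar']
print(name_tokens("foobar\n"))   # ['foobar']
```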
Perl is a language, just like Dutch, Swedish, English, German and most of the others. In just about any language there is, to paraphrase a well-known Perl motto, more than one way to say something. That is in many ways a good thing, especially when it comes to using the language creatively as a novelist or poet or similar type of wordsmith does.
It is true that this quality does tend to make Perl programs somewhat hard to grasp for the uninitiated in the programmer's style of writing. That is another quality the Perl language shares with those other languages mentioned above - did you understand all of Finnegans Wake the first time you read it?
In other words, Perl is a writers' language. It is not an editors' language. Once you get into the right mood, Perl flows like your native language does. Done right, this can lead to great things. It can also lead to the sort of notes you made when attending those lectures you did not care about in the first place, and did not understand in the second. Use Perl for things you care about, and it will provide you the means to express yourself in just the right way (for you).
--frank[at]unternet.org
Fortran is interesting, theologically - it considers God to be real unless declared integer.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
I dunno. Since it's a comment on Perl, starting with a # would seem to be entirely accurate according to the syntax.
If those punctuation marks (or keywords) make the code more readable, then they're not gratuitous are they? I, for one, find brace-less languages fantastically hard to read, Python especially.
I LUUUUURV Python so much that if it was legal I would marry it, but I completely agree. Curly braces to denote block starts and stops make the code easier to read and manage. I should not have to wonder whether a function or block continues past the bottom of the current screen's worth of code when it ends with a few lines of whitespace because I have to know the indentation level of the next line of code to know if it's in a different block context than the last line of code on the current page. I also should never have to wonder if I re-indented code correctly when cut/pasting or adding/removing a level of block nesting.
I don't care if Python wants to keep the indentation requirements. Forcing the code of awful programmers to be more readable in this way is a good thing. Forcing all code to be less readable in another way is a bad trade-off. Just add in the damn braces! Then I can use tools to auto-indent for additional readability.
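The cut/paste worry is easy to demonstrate. In this Python sketch (function names invented for illustration) the two functions differ only in the indentation of one line, both are valid, and they return different results, so no tool can tell which one was intended.

```python
def doubled_inside_if(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
            total *= 2   # part of the `if` block: runs for positive values only
    return total

def doubled_after_if(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
        total *= 2       # one level out: now runs for every value
    return total

print(doubled_inside_if([1, -1, 2]))  # 8
print(doubled_after_if([1, -1, 2]))   # 12
```

In a brace language the same slip changes only the layout, and an auto-indenter can repair it from the braces.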
The enemies of Democracy are
I've always felt that version control systems should store syntax trees, but have never had the time to do the work to do that.
It in fact has three disadvantages: it bypasses any prototype coercions, it passes @_ unmodified by default, and it's unidiomatic.
All of these fencepost errors I've fixed argue otherwise.
Personally, I find that curly braces make code easier to read on top of perfect indentation. In truth, though, it's not so much the braces, as it is the nearly-empty lines of code that are spent to put those braces (note: this specifically applies to ANSI-style brace layout only, not K&R style). It creates a kind of a visual box, clearly delimited, with body of the block in it - more so than plain indentation does by itself.
That said, I wouldn't call Python "fantastically hard to read", quite the opposite - it tends to be one of the easiest languages to read. Not because of indentation, but because its basic syntax is rather clean.
No-one is saying that Python is good because it forces you to indent. Quite the opposite: all sane people indent their code anyway, whatever the language, so why not use that to indicate program structure?
Unfortunately, C++ remains the only language with a full-featured yet concise RAII, which is its main advantage when compared to C. And templates, while messy, are also extremely efficient in terms of generated code - more so than similar mechanisms (generics etc) in pretty much any other language I know of.
Just enforce a formatter on commit. If the formatted code is any different from the original file, abort the commit. Git makes this kind of thing easy. It also means the repository is always in a sane state. A simple script can reformat all changed files trivially before a commit operation.
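The check described here could look roughly like the following Python sketch. Everything in it is an assumption for illustration: the stand-in formatter only trims trailing whitespace (a real setup would call the project's actual formatter), and the function names are made up. The one fact it relies on is that Git aborts a commit when its pre-commit hook exits with a non-zero status.

```python
import sys

def strip_trailing_whitespace(text: str) -> str:
    """Stand-in formatter (an assumption): trims trailing blanks on each line."""
    return "\n".join(line.rstrip() for line in text.split("\n"))

def unformatted_files(files: dict[str, str],
                      formatter=strip_trailing_whitespace) -> list[str]:
    """Return the names of files whose content changes when formatted."""
    return sorted(name for name, text in files.items()
                  if formatter(text) != text)

def main(paths: list[str]) -> int:
    """Read the given files and report any that are not already formatted."""
    files = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            files[path] = handle.read()
    bad = unformatted_files(files)
    for name in bad:
        print(f"not formatted: {name}", file=sys.stderr)
    # A non-zero exit status from a pre-commit hook makes Git abort the commit.
    return 1 if bad else 0
```

A hook script would pass the staged file names to `main` and hand its return value to `sys.exit`. Rewriting the files in place instead of rejecting them is the other variant the comment mentions.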
The study cited has several biases in favor of the scripted languages that are acknowledged by the author in the references of your supplied link.
Primarily:
- The non-scripted languages (C, C++, Java) were tested under formal conditions in 1997 / 1998 (Java 1.1 I assume), the script programmers wrote their programs at home and self reported their times (and in most cases spent several days thinking about the problem before starting work, time which was not included).
- The script programmers were told that the programmer effort and elegance of their solution was a criterion, the non-script programmers were only told that the program would be judged based on its correctness (accuracy).
- The script programmers had immediate access to a hint (to resolve a misread requirement) which was only available to non-script programmers after they failed an acceptance test.
- The non-script group would have a cost deducted from them each time their program failed an acceptance test, whereas the script group had access to the final acceptance test data.
Overall, the comparison between the languages does not seem fair, or at least not the comparisons of the scripted and non-scripted languages.
Because not everyone uses the same indentation as everyone else. If indentation rules need to be worked out before starting a project, you're wasting more time than a language where indentation has no meaning.
Every time I start to have faith in humanity, I ruin it by driving to work between 7 and 8 am.
Really, Python's problem is that both spaces and tabs are legal - if the language required one or the other, it would be fine, modulo subjective readability arguments about braces.
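For what it's worth, Python 3 narrows this complaint: both characters are still legal, but indentation whose meaning would depend on the tab width is rejected with a TabError. Here is a minimal sketch using the built-in `compile`; the helper name is made up.

```python
def mixes_tabs_and_spaces(source: str) -> bool:
    """True if Python 3 rejects the source for ambiguous tab/space indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return True
    return False

ambiguous = "if True:\n\tx = 1\n        y = 2\n"   # a tab, then eight spaces
consistent = "if True:\n    x = 1\n    y = 2\n"    # spaces only
print(mixes_tabs_and_spaces(ambiguous))   # True
print(mixes_tabs_and_spaces(consistent))  # False
```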
Socialism: a lie told by totalitarians and believed by fools.
Modern IDEs make it so easy to convert from one style to another I wish they would just make it a display option. Then each developer could read and write in the style they prefer.
bite my glorious golden ass.
Why use four characters when one will do?
That is only one of the two syntactic roles assigned to parentheses. The other is to disambiguate priority. For instance, you have to write (a + b) * c if you don't want it to mean add(a, mult(b, c)). But you see, combining multiple lines is exactly this: priority disambiguation. Consider "if (cond) a; b". Priority is such that the statement is parsed like "(if (cond) a); (b)", because the if statement doesn't eat up semicolons. If you want it to mean "(if (cond) ((a); (b)))", then you could just put parentheses like this: "if (cond) (a; b)". Combining multiple lines into a single logical command requires overriding the fact that statements have higher priority than statement separators. This is perfectly consistent with what parentheses *already* do. Syntactically speaking, the if statement is to the semicolon what multiplication is to addition. Seriously.
This being said, I consider that using parentheses both for priority disambiguation and for function calls is one of mathematical notation's biggest fuckups. For instance, a(b + c) can be either multiplication or a function call depending on context. That's just mind-bogglingly terrible. It's not nearly as bad in programming languages because straight up juxtaposition is often a syntax error, but nonetheless I would say that function calls should NOT use parentheses. They should use square or curly brackets, and parentheses should only serve to disambiguate priority, which includes grouping statements. You might see my point better then.
A shift would have been more intuitive?
No, but perhaps a "my ($a,$b,$c) = @_;" would have been. Since I'm a long-time Perl programmer, I can't really speak for the newbie. But the use of the numerous $_[n]-lines is probably unclear. In any case, it is considered bad code, since it is both hard to read and error prone.
Using a foreach, instead of the C-style for loop, is certainly easier and MUCH closer to the implementation used in Quorum and Randomo. So that, at least, was very poorly thought-out. And Randomo? Is it really random? Or is it really Quorum with a bunch of substitutions made? Just look at the code samples.
When I had a look at the paper, the first thing I noticed was the use of the ampersand sigil in a function call. This has been considered bad code in Perl since time immemorial and really goes to show two things:
* The researchers didn't know the first thing about contemporary Perl and didn't bother to find out, i.e. do research.
* The researchers did nothing to make the Perl code readable, which is paramount for newbies to any language.
And worst of all, and this is really appalling, they are cherry-picking their methods. Just look at the table and the numbers, then read their analysis. And don't even get me started on the sample-size...
Lemon curry???
The Perl sample uses $_[0] for getting the first sub argument instead of shift, and "for ($i = $a; $i <= $b; $i++)" to do a for loop instead of "foreach $i ($a .. $b)", so it is deliberately obfuscated Perl.
As someone with a grand total sum of one hour of Perl experience about 10 years ago, that does not look like deliberately obfuscated Perl, but Perl written by a C programmer.
I'm willing to bet I would understand this "obfuscated" code a lot better than your idea of non-obfuscated Perl code. That doesn't mean I think the "obfuscated" code is better - just that it's a better match for the way my mind has been twisted by the particular programming language I use most often.
That's also why this study is somewhat interesting - it starts from people without any prior experience in programming, just your average human experience. And the language based on "intuitive" concepts (i.e. concepts from general human experience instead of programming-language specific concepts) did better. Is that surprising? Not very, except for providing a first insight that some concepts may in fact be more intuitive than others and thus easier to learn. Does that mean in the long run these concepts will also be the most productive? I cannot say for sure, but I had a lot of fun trying to bend my mind around Lisp, and noticed sudden leaps in productivity when things "clicked", so my grand total of one anecdote tends to make me believe that some non-intuitive concepts may be a lot more productive than the intuitive concepts.
My main objection to semantic indent can be summarized in this pseudocode example:
class fubu
  function foo(bar)
    start function
    more code
    all's well
console.log("debug message")
    more real code
  //end foo
//end fubu
Having that debug console statement out of band with the rest of the functional indents makes it easy to notice when scanning code. Now you might say one should never debug that way, which is fine, but I do sometimes (and know others who do), and so I don't like semantic indent languages b/c they prevent me from doing something in a way that is helpful/useful in my workstream. Sure I can solve this with search/replace etc, but I like the visual cue sometimes.
It's a minor point but at least for me significant.
Oh the fucking irony of it. I was trying to post the following using pre and code tags without success and just ended up proving your point:
Sure. Because
def function(): if condition: while ok: do_something() end while end if end def Is much more readable than: def function(): if condition: while ok: do_something()