Modeling How Programmers Read Code
An anonymous reader writes "Following up on an experiment from December, Michael Hansen has recorded video of programmers of varying skill levels as the read and evaluate short programs written in Python. An eye tracker checks 300 times per second to show what they look at as they mentally digest the script. You can see some interesting differences between experts and beginners: 'First, Eric's eye movements are precise and directed from the beginning. He quickly finds the first print statement and jumps back to comprehend the between function. The novice, on the other hand, spends time skimming the whole program first before tackling the first print. This is in line with expectations, of course, but it's cool to see it come out in the data. Another thing that stands out is the pronounced effect of learning in both videos. As Eric pointed out, it appears that he "compiled" the between function in his head, since his second encounter with it doesn't require a lengthy stop back at the definition. The novice received an inline version of the same program, where the functions were not present. Nevertheless, we can see a sharp transition in reading style around 1:30 when the pattern has been recognized.'"
As "the" read and evaluate?
if i smoked enough pot i bet i could make my eyes move like that too
This article is complete garbage. They tested 2 people with different code that produces the same results and then made up a narrative of how novice and expert coders think in different ways. Use the same code to test a much larger pool of programmers and then the results might actually be interesting.
Is that video real time (adjusted for the 300Hz sample rate)? I ask because I'm not a Python programmer (I do know C, C++, asm) but about 10 seconds into the video I knew what the program would print and yet the video went on for 3 minutes. Something does not add up.
I don't even read it, I close my eyes, which are a hindrance, and use my inner eye to feel the code. I become the code. Most times I look like a bowl of spaghetti
They should link to the follow up post that talks about the experiment with 162 programmers http://synesthesiam.com/posts/what-makes-code-hard-to-understand.html. It also links to the paper that has even more information.
The code between these two individuals is completely different, even if it produces the same results. How do you discern any meaningful results out of two people reading two different sets of code?
The style of the code might also make a difference (as well as the specific language's use of form).
Code of more than average complexity also shifts the reading patterns, as would nontrivial algorithms that require more study to figure out what they actually do.
I have my own code style that assists in fast scanning and placement of specific language features for when I'm working on a project of 100000 lines of actual code (30 years a programmer, and I also don't care for having 100s/1000s of tiny files spreading out content).
The display itself (and IDE tools that might not display code efficiently) will also have an effect, as will how much text is visible without scrolling.
Some call centers / help desks suck with BS metrics and scripts.
We need fewer BS metrics, as people just game them: people who do a good job end up with poor metric scores, and some with lines and lines of bloated code get a good score.
It seems like Hansen is still a novice too - sampling 30fps video at 300 Hz...
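For what it's worth, the two rates don't have to conflict: the tracker samples the eye at 300 Hz, and the video only has to display those samples. A rough sketch of the arithmetic, assuming the 30 fps figure from the comment above (not confirmed anywhere in the article) and assuming the gaze samples are simply averaged per frame, which is a guess at one reasonable approach and not a description of what Hansen actually does:

```python
# Illustration only: the rates come from this thread, the averaging step is an assumption.
TRACKER_HZ = 300   # eye-tracker sample rate quoted in the summary
VIDEO_FPS = 30     # video frame rate claimed in the comment above


def samples_per_frame(tracker_hz=TRACKER_HZ, video_fps=VIDEO_FPS):
    """Number of gaze samples that fall within one video frame."""
    return tracker_hz // video_fps


def gaze_per_frame(samples, per_frame):
    """Average consecutive (x, y) gaze samples into one point per video frame."""
    frames = []
    for start in range(0, len(samples) - per_frame + 1, per_frame):
        chunk = samples[start:start + per_frame]
        mean_x = sum(p[0] for p in chunk) / per_frame
        mean_y = sum(p[1] for p in chunk) / per_frame
        frames.append((mean_x, mean_y))
    return frames
```

So each rendered frame would have about 10 gaze samples behind it, which is more data per frame, not a sampling mistake.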
Excuse me, but please get off my Pennisetum Clandestinum, eh!
I'm not sure this topic requires another article:
http://developers.slashdot.org/story/12/12/19/1711225/how-experienced-and-novice-programmers-see-code
Just run the code. There, you have the output. Problem solved. Now, seriously. It took me less than a minute to figure out what it would do. So maybe I'm an expert programmer? Most of my time was spent making sure their code wasn't doing something sneaky like changing a variable later on, or printing a variable with a slightly different name.
Brain-damaged lemmings are exactly the sort of hipsters that are into python. Back in my day there were great languages like Delphi that would draw the fashion-conscious coders and keep them far away from the rest of us.
Unfortunately, this thing seems to be en vogue in the computer fashion industry. I just attended a conference where this phrase could sum up a bunch of the presentations:
"We are modeling, tapping into the power of social networks, and doing visual analytics!"
I happen to be reading "The Psychology of Computer Programming, Silver Anniversary Edition" right now. An interesting quote:
The only thing that's changed here in twenty-five years is the fact that the funds dedicated by executives to eliminating programmers from their payrolls have become far more staggering than I imagined back then. And, now, I finally recognize in this executive desire a pattern so strong, so emotional, that it has blinded these executives to two facts:
1. None of these schemes has succeeded in eliminating programmers. (We have now at least ten times as many as we did then.)
2. Every one of these schemes has been concocted by programmers themselves, the very people the executives want so passionately to eliminate.
So, although people say that programmers lack interpersonal skills, they evidently have a skill at persuasion that surpasses that of the late, great P. T. Barnum, famous for his theory: "There's a sucker born every minute."
I guess if I need some money for something from executive, I'll tell them that I need it to model, tap into the power of social networks and do visual analytics. That ought to get me my funds.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
Obviously, you haven't progressed past the 'beginner' stage...
Sleep your way to a whiter smile...date a dentist!
Oh no! *Explodes*
Cognitive scientists have been studying saccades for a long, long time. Interesting to see it used to measure screen-reading changes in an expert vs. a novice.
Did anyone notice that the "expert" got the answer wrong in his video? See here.
The second line should be "1 0 8 1" and the third line should be "8 9 0". The "novice" got these answers correct.
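For anyone who would rather check the expected output than trust either video, here is a minimal sketch of the function-based version. It is a reconstruction, not the study's exact source: `between` is the name used in the summary, `common` is an assumed name for the third step, and the list values are the ones from the inline snippet posted elsewhere in this thread.

```python
# Sketch of the function-based program; not the study's exact source.
# `between` is named in the summary; `common` is an assumed name.

def between(numbers, low, high):
    """Return the items strictly between low and high, in original order."""
    return [n for n in numbers if low < n < high]


def common(list1, list2):
    """Return the items of list1 that also appear in list2, in list1's order."""
    return [n for n in list1 if n in list2]


def run():
    """Return the three output lines as lists of numbers."""
    x = [2, 8, 7, 9, -5, 0, 2]
    y = [1, -3, 10, 0, 8, 9, 1]
    return [between(x, 2, 10), between(y, -2, 9), common(x, y)]


if __name__ == "__main__":
    for line in run():
        print(" ".join(str(n) for n in line))
```

Run as a script, this prints "8 7 9", "1 0 8 1" and "8 9 0", which agrees with the second and third lines given above.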
The executives in my company seemed genuinely surprised that a programme to let call centre staff _make customers happy_ rather than sticking to scripts and hitting their metrics, actually resulted in better satisfaction and retention metrics.
Customers are only calling us because something went wrong, so the chance that a mechanical "just follow the script" solution is appropriate is very small, and thus it ought to have been no surprise at all that the new programme was an improvement, but still they had to be cajoled into it by expensive consultants.
One data point in a multidimensional space does not a dataset make.
Ezekiel 23:20
when confronted with a situation falling within their specialized field, experts can process information in large chunks. Whereas laymen and novices tend to process things one small piece at a time; and on top of that, they flail around a lot.
Actually, that's a load of nonsense:
1) With programs that are *actually* large, you won't find "experts" that consume them in "large" chunks, unless they use very small fonts.
2) With programs that are new to the readers, you might have to read them in toto anyway. There's no guarantee that the *actual* dependencies in the code will allow you to read it in a limited or strictly hierarchical fashion. You gotta read what you gotta read. It's not like people will only shove neat and pleasurable code on you in real life. If the code is messy, your reading of it will most likely be messy, too. Especially if you hit duplications and have "wait a minute, didn't I see this somewhere else? Lemme check" moments.
If you want to know what a code prints, just evaluate it, it will tell you instantly without having to worry whether you made a mistake in evaluating the code in your head.
Now if you want to understand the code, that's something else, but it's not what's asked here. The code is sufficiently straightforward that the only explanation you could give is the code itself.
It's amazing what kind of inaccurate stuff gets posted on here.
Clearly from both videos - the tested programs to read were not even the same. The novice one doesn't have functions and does everything inline, whereas the 'expert' one uses functions.
Also - both 'programmers' made mistakes when predicting their outputs. The novice used brackets which aren't part of the output stream and the expert's last two lines read 1 0 9 1 , 9 instead of 1 0 8 1 , 8 9 . It's nice to see the amount of people on slashdot who can't even comprehend a 20 line program yet still bother to comment on it. Grep!
SELF-DEFENSE: Apply a generous portion of raspberry jam on the camera.
OFFENSE: Have a radio-controlled drone 24/7 tracking the researcher's eyes, hands and every other part of his body and stream it on the web.
Workplaces are pushing "Nobody has a right to privacy" and "We are going to measure you whether you like it or not".
Reports like these make the slower coders feel like they are worthless. Eventually this will backfire and "the observed" will retaliate by reclaiming their digital freedoms and privacy. They will also retaliate by making those researchers feel worthless also. We wouldn't mention other methods to do so because we don't want those researchers to see it coming.
When reading the lines
In this last 30 seconds or so of the novice video above, you can see her back-and-forth comparison of the x and y lists. If you look carefully, however, the red dot (her gaze point) is often undershooting the numbers on both lists. Why is this? While it could be a miscalibration of the eye-tracker, the participant may also have been using her parafoveal (the region outside the fovea) to read the numbers. This and the fact that foveation and visual attention are not necessarily always the same (i.e., looking at something doesn't always mean you're thinking about it) encourages us to be cautious when interpreting eye-tracking data.
I suddenly realized that while reading those lines I indeed was focusing below those lines. I have no idea if this was an unconscious reaction on reading this, or if I do that always (unfortunately my very awareness of this will affect any result, so I can't test this for myself).
This gives "reading between the lines" a whole new meaning ;-)
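If someone wanted to check whether the undershoot is systematic, one rough approach would be to measure how far each gaze sample lands from the nearest line of text. The sketch below is only an illustration: the function names are made up, the inputs are assumed to be vertical pixel positions (y growing downward), and it does not read any real eye-tracker format.

```python
from statistics import median


def vertical_offsets(gaze_ys, line_ys):
    """Signed distance in pixels from each gaze sample to its nearest text line.

    Positive values mean the gaze landed below the line (y grows downward).
    """
    offsets = []
    for gy in gaze_ys:
        nearest = min(line_ys, key=lambda ly: abs(gy - ly))
        offsets.append(gy - nearest)
    return offsets


def typical_offset(gaze_ys, line_ys):
    """Median signed offset; a value far from zero suggests a consistent shift."""
    return median(vertical_offsets(gaze_ys, line_ys))
```

A consistent nonzero offset would fit either explanation from the quoted passage (miscalibration or parafoveal reading), so this can only show that a shift exists, not why. The nearest-line matching also stops being reliable once the shift approaches half the line spacing.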
The Tao of math: The numbers you can count are not the real numbers.
One data point in a multidimensional space does not a dataset make.
Of course it does. It doesn't make a very useful dataset, and certainly not a statistically significant one, but it is a dataset nonetheless.
x = [2, 8, 7, 9, -5, 0, 2]
print([xn for xn in x if 2 < xn < 10])
y = [1, -3, 10, 0, 8, 9, 1]
print([yn for yn in y if -2 < yn < 9])
print([xn for xn in x if xn in y])
Personally, I would scan the whole code, if it was short, to assess the style of whoever wrote it, plus alterations. It's art, not science.
It's funny how there's huge amounts of people criticizing flash on slashdot every day, and then we have an article with a flash video (I can only assume it's a video, since I don't run flash) attached to it.
Do these guys even know who their target audience is?
I can't believe this shit is published on a popular news website. Seriously. First, what the fuck do we learn by studying how different programmers look at different codes? My guess: absolutely nothing. Next, the "expert" programmer made two huge mistakes and didn't even realize it. Is this shit for real? When the "expert" had to find numbers between (but not including) -2 and 9, he somehow missed 8, using 9 instead. Then, when he had to find numbers that were common to both lists, he missed 66% of them because he used the wrong goddamn lists. I won't even mention the fact that you don't even need to look at the code to figure out what the functions do - they are written in plain english. I could figure out this code in 10 seconds with 100% accuracy. It took their "expert" 1 minute to do a really bad fucking job. So now we are comparing two different programmers looking at two different codes - one does it correctly, the other does shit. And of course, it makes Slashdot because you dickheads are so blinded by "hey, it's programming!" to actually curate good content. Fuck this shit, I'm going back to the NYT Science Blog, bitches.
Why? It's a reasonably good language for most high level stuff
Ultimately this study only is there to prove that programmers don't really "see" code like this http://duelofseashells.files.wordpress.com/2011/12/matrix11.jpeg
Every non-techie person I know seems to think techies see the world this way. Honestly, the reality is that all of us techies see the world like this. http://2.bp.blogspot.com/-I1amyM5uSt4/TlG9sr4SX8I/AAAAAAAAADk/vMmsvYgqIrc/s1600/matrix-code-neo.jpg
Do all expert programmers read all code the same way? If I had a mass of uncommented C code to digest, I'd read it much differently than well-written Python or Java with complete JavaDoc comments. Wouldn't the quality of code affect how the code is read by an expert? I don't read math books the same way I read history books, either.
There is a follow-up blog post here with more data: http://synesthesiam.com/posts/what-makes-code-hard-to-understand.html and there are many more videos available here: https://www.youtube.com/user/synesthesiam/videos?sort=dd&view=0&tag_id=&shelf_index=0
I've been programming for a long time, and I still tend to scan over the code, looking at defs. Or if I'm in a proper IDE, looking at the class outline first. Then I go to the entry point, if there is one, or the init if there isn't, and read through that line by line.
Actually, the FIRST thing I try to do when I get a non-trivial program/system dumped on me is attempt to find and review the technical documentation.
THEN I start looking for/at code, based on the documentation.
In cases where the sole documentation consists of oral folklore, I spend a brief interval meditating on the best way to hunt down and kill those responsible.
How did they determine a programmer's skill level? Length of time as a programmer? Ability to find and fix bugs? Self-assessment? Some other metric? Maybe they should have asked on Slashdot how to determine a programmer's skill level. That would have yielded the one, definitive, indisputable method of determining a programmer's skill level.
An eye tracker checks 300 times per second to show what they look at as they mentally digest the script ...
The screen.