Modeling How Programmers Read Code
An anonymous reader writes "Following up on an experiment from December, Michael Hansen has recorded video of programmers of varying skill levels as they read and evaluate short programs written in Python. An eye tracker checks 300 times per second to show what they look at as they mentally digest the script. You can see some interesting differences between experts and beginners: 'First, Eric's eye movements are precise and directed from the beginning. He quickly finds the first print statement and jumps back to comprehend the between function. The novice, on the other hand, spends time skimming the whole program first before tackling the first print. This is in line with expectations, of course, but it's cool to see it come out in the data. Another thing that stands out is the pronounced effect of learning in both videos. As Eric pointed out, it appears that he "compiled" the between function in his head, since his second encounter with it doesn't require a lengthy stop back at the definition. The novice received an inline version of the same program, where the functions were not present. Nevertheless, we can see a sharp transition in reading style around 1:30 when the pattern has been recognized.'"
All we need is confirmation from Netcraft.
Trolling is a art,
This article is complete garbage. They tested 2 people with different code that produces the same results and then made up a narrative of how novice and expert coders think in different ways. Use the same code to test a much larger pool of programmers and then the results might actually be interesting.
Is that video real time (adjusted for the 300Hz sample rate)? I ask because I'm not a Python programmer (I do know C, C++, asm) but about 10 seconds into the video I knew what the program would print and yet the video went on for 3 minutes. Something does not add up.
That just confirms that Soulskill is an expert editor who processes the meaning of an entire paragraph at a time, instead of scanning each word like a beginner would.
I don't even read it, I close my eyes, which are a hindrance, and use my inner eye to feel the code. I become the code. Most times I look like a bowl of spaghetti
They should link to the follow up post that talks about the experiment with 162 programmers http://synesthesiam.com/posts/what-makes-code-hard-to-understand.html. It also links to the paper that has even more information.
The code between these two individuals is completely different, even if it produces the same results. How do you discern any meaningful results out of two people reading two different sets of code?
The style of the code might also make a difference (as well as the specific language's use of form).
Code of more than average complexity also shifts the reading patterns, as would non-trivial algorithms that require more study to figure out what they actually do.
I have my own code style that assists in fast scanning and in placing specific language features when I'm working on a project of 100,000 lines of actual code (30 years a programmer, and I also don't care for having 100s/1000s of tiny files spreading out content).
The display itself (and the tools or IDE, which might not display code efficiently) also has an effect, as does how much text is visible without scrolling.
Some call centers / help desks suck with BS metrics and scripts.
We need fewer BS metrics. People just game them, so people who do a good job get poor metric scores while someone with lines and lines of bloated code gets a good score.
It seems like Hansen is still a novice too - sampling 30fps video at 300 Hz...
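For what it's worth, the two rates are not necessarily in conflict: a 300 Hz tracker simply yields 10 gaze samples for every frame of a 30 fps video. A minimal sketch of one way to reconcile them, by averaging each group of samples into a single point per frame. This is only an illustration, not Hansen's actual method; the function names and the sample data are hypothetical.

```python
def samples_per_frame(sample_rate_hz, video_fps):
    """Number of tracker samples that fall within one video frame."""
    return sample_rate_hz // video_fps


def downsample(gaze, per_frame):
    """Average consecutive groups of (x, y) gaze points into one point per frame.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(gaze) - per_frame + 1, per_frame):
        chunk = gaze[start:start + per_frame]
        mean_x = sum(p[0] for p in chunk) / per_frame
        mean_y = sum(p[1] for p in chunk) / per_frame
        frames.append((mean_x, mean_y))
    return frames


# Hypothetical data: 20 samples, i.e. two frames' worth at 300 Hz / 30 fps.
per_frame = samples_per_frame(300, 30)
gaze = [(100.0, 50.0)] * 10 + [(200.0, 80.0)] * 10
frames = downsample(gaze, per_frame)
print(per_frame, frames)
```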
Excuse me, but please get off my Pennisetum Clandestinum, eh!
I'm not sure this topic requires another article:
http://developers.slashdot.org/story/12/12/19/1711225/how-experienced-and-novice-programmers-see-code
Unfortunately, this thing seems to be en vogue in the computer fashion industry. I just attended a conference where this phrase could sum up a bunch of the presentations:
"We are modeling, tapping into the power of social networks, and doing visual analytics!"
I happen to be reading "The Psychology of Computer Programming, Silver Anniversary Edition" right now. An interesting quote:
The only thing that's changed here in twenty-five years is the fact that the funds dedicated by executive to eliminating programmers from their payrolls have become far more staggering than I imagined back then. And, now, I finally recognize in this executive desire a pattern so strong, so emotional, that it has blinded these executives to two facts:
1. None of these schemes has succeeded in eliminating programmers. (We have now at least ten times as many as we did then.)
2. Every one of these schemes has been concocted by programmers themselves, the very people the executives want so passionately to eliminate.
So, although people say that programmers lack interpersonal skills, they evidently have a skill at persuasion that surpasses that of the late, great P. T. Barnum, famous for his theory: "There's a sucker born every minute."
I guess if I need some money for something from executive, I'll tell them that I need it to model, tap into the power of social networks and do visual analytics. That ought to get me my funds.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
And are you sure the code isn't malicious? Not a good idea.
Ignoring that possibility, I very rarely need to know what a program does- I need to know how it does it in order to fix it or add features. You don't get that from running it.
I still have more fans than freaks. WTF is wrong with you people?
Obviously, you haven't progressed past the 'beginner' stage...
Sleep your way to a whiter smile...date a dentist!
I need to know how it does it in order to fix it or add features. You don't get that from running it.
Surely once you verify a function does what it is supposed to do you don't keep going back and stepping through it again, though? (referring to Eric the expert's eye movements here). What's the point in that?
Not unless I think I missed something, or it's a tightly coupled implementation with another function. My point was just that "just running" code isn't a good idea and doesn't generally get you what you need.
when confronted with a situation falling within their specialized field, experts can process information in large chunks. Whereas laymen and novices tend to process things one small piece at a time; and on top of that, they flail around a lot.
Actually, that's a load of nonsense:
1) With programs that are *actually* large, you won't find "experts" that consume them in "large" chunks, unless they use very small fonts.
2) With programs that are new to the readers, you might have to read them in toto anyway. There's no guarantee that the *actual* dependencies in the code will allow you to read it in a limited or strictly hierarchical fashion. You gotta read what you gotta read. It's not like people will only shove neat and pleasurable code on you in real life. If the code is messy, your reading of it will most likely be messy, too. Especially if you hit duplications and have "wait a minute, didn't I see this somewhere else? Lemme check" moments.
Ezekiel 23:20
If you want to know what a code prints, just evaluate it, it will tell you instantly without having to worry whether you made a mistake in evaluating the code in your head.
Now if you want to understand the code, that's something else, but it's not what's asked here. The code is sufficiently straightforward that the only explanation you could give is the code itself.
You can tell it's not malicious from a glance. To be malicious it would need to access the filesystem, networking or interprocess subsystems.
> (has Dilbert thought you nothing?).
I suppose he doesn't think about many people except himself, that's why he doesn't teach anymore :)
When reading the lines
In this last 30 seconds or so of the novice video above, you can see her back-and-forth comparison of the x and y lists. If you look carefully, however, the red dot (her gaze point) is often undershooting the numbers on both lists. Why is this? While it could be a miscalibration of the eye-tracker, the participant may also have been using her parafoveal (the region outside the fovea) to read the numbers. This and the fact that foveation and visual attention are not necessarily always the same (i.e., looking at something doesn't always mean you're thinking about it) encourages us to be cautious when interpreting eye-tracking data.
I suddenly realized that while reading those lines I indeed was focusing below those lines. I have no idea if this was an unconscious reaction on reading this, or if I do that always (unfortunately my very awareness of this will affect any result, so I can't test this for myself).
This gives "reading between the lines" a whole new meaning ;-)
The Tao of math: The numbers you can count are not the real numbers.
How would a 9 end up in the last line if the second line has no 9 in it?
As far as I can see, the third line should only have an 8, nothing else.
OH, BTW, I just tried to print an array with Python, and it *did* output brackets and commas. So the novice was right and the expert wrong in that regard.
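A quick way to check this yourself, written in Python 3 syntax (the programs in the videos appear to use the Python 2 print statement, but the list formatting is the same). The variable names here are just for illustration:

```python
values = [8, 7, 9]

# print() on a list writes the list's str() form, brackets and commas included.
rendered = str(values)
print(rendered)

# Getting bare, space-separated numbers takes an explicit join.
joined = " ".join(str(v) for v in values)
print(joined)
```

So anyone expecting plain numbers on the output line would have to format the list by hand.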
One data point in a multidimensional space does not a dataset make.
Of course it does. It doesn't make a very useful dataset, and certainly not a statistically significant one, but it is a dataset nonetheless.
It could output a text which, when read, causes harm to you. Like the funniest joke of the world.
Why just not read the comment that tells you what it does?
Then it would be a good death.
Personally, I would scan the whole code, if it was short, to assess the style of whoever wrote it, plus alterations. It's art, not science.
It's funny how there's huge amounts of people criticizing flash on slashdot every day, and then we have an article with a flash video (I can only assume it's a video, since I don't run flash) attached to it.
Do these guys even know who their target audience is?
Why? It's a reasonably good language for most high level stuff
x = [2, 8, 7, 9, -5, 0, 2]
print [xn for xn in x if 2 < xn < 10]
y = [1, -3, 10, 0, 8, 9, 1]
print [yn for yn in y if -2 < yn < 9]
print [xn for xn in x if xn in y]
Funny thing is, I find that easier to read and understand than the original. It's like "make a list of the elements in this range and print it", twice over, and finally "make a list of the stuff in x that's also in y and print it".
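For anyone who wants to run it today, here is a sketch of the same snippet in Python 3 syntax, with each list bound to a name so the three steps described above are explicit. The intermediate variable names are my own, not from the original program.

```python
x = [2, 8, 7, 9, -5, 0, 2]
y = [1, -3, 10, 0, 8, 9, 1]

# "Make a list of the elements in this range", once for each list.
x_between = [xn for xn in x if 2 < xn < 10]
y_between = [yn for yn in y if -2 < yn < 9]

# "Make a list of the stuff in x that's also in y".
# Note this compares the full lists, not the filtered ones.
xy_common = [xn for xn in x if xn in y]

print(x_between)  # [8, 7, 9]
print(y_between)  # [1, 0, 8, 1]
print(xy_common)  # [8, 9, 0]
```

Because the last line works on the unfiltered x and y, it reports 8, 9 and 0. Intersecting the two filtered lists instead would leave only 8.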
By the time I'm working on a program I'm already told what it does. I don't think I've ever gone into a situation that blind. What I don't know is how it does it.
There is a follow-up blog post here with more data: http://synesthesiam.com/posts/what-makes-code-hard-to-understand.html and there are many more videos available here: https://www.youtube.com/user/synesthesiam/videos?sort=dd&view=0&tag_id=&shelf_index=0
I've been programming for a long time, and I still tend to scan over the code, looking at defs. Or if I'm in a proper IDE, looking at the class outline first. Then I go to the entry point, if there is one, or the init if there isn't, and read through that line by line.
Actually, the FIRST thing I try to do when I get a non-trivial program/system dumped on me is attempt to find and review the technical documentation.
THEN I start looking for/at code, based on the documentation.
In cases where the sole documentation consists of oral folklore, I spend a brief interval meditating on the best way to hunt down and kill those responsible.
How did they determine a programmer's skill level? Length of time as a programmer? Ability to find and fix bugs? Self-assessment? Some other metric? Maybe they should have asked on Slashdot how to determine a programmer's skill level. That would have yielded the one, definitive, indisputable method of determining a programmer's skill level.