How Experienced And Novice Programmers See Code
Esther Schindler writes "We always talk about how programmers improve their skill by reading others' code. But the newbies aren't going to be as good at even doing that, when they start. There's some cool research underway, using eye tracking to compare how an experienced programmer looks at code compared to a novice. Seems to be early days, but worth a nod and a smile."
Reader Necroman points out that if the above link is unreachable, try this one. The videos are also available on YouTube: Expert, Novice.
I see Blonde, Brunette,...
At this moment, novice programmers think the network is down. Experienced programmers know the site from TFA has been slashdotted.
I imagine one of the first things a programmer learns about reading code, if they're going to be any good at it, is to skip over the inline comments. Reading them will only prejudice your interpretation of the code in favor of the original author's expectations, preventing you from seeing what the code is actually doing.
Comments are useful when you come across a block of code you can't otherwise understand, but the rest of the time they tend to either duplicate information which is already in the code, or confuse matters by being vague, misleading, or just plain wrong.
High-level documentation of modules and functions is invaluable, of course, but those comments should be in a block of their own, or even a separate file, and not be mixed in with the rest of the code.
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
My personal story:
When I was but a wee lad skill-wise, I read other people's code so I could figure out how it worked. I knew WHAT it was doing, just not HOW.
Now when I read others' code it's usually either to figure out what it's doing or, more likely, what it's NOT doing that it's supposed to be doing.
There's another difference:
Now I can skip over the code that isn't "interesting" and zero in on where I think the bug is or where I think the "undocumented feature" is. My younger self had the luxury of time to study every line and learn as he did so.
By the way, when I'm trying to learn something totally new, I do revert to "the way of the young learner." Why? Because it works when I'm starting nearly from scratch.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
That's actually pretty scary.
Novice programmers are simply overwhelmed by vast amounts of code, and have no idea how to do large-scale software development.
When you teach them about tools that allow you to find your way through the code, they're all impressed.
Universities simply aren't teaching these boys right.
I don't even see code anymore. All I see is Blond, Brunette, Redhead ...
"I use a Mac because I'm just better than you are."
A Novice sees it done wrong
A Master sees it done differently
Join the Slashcott! Feb 10 thru Feb 17!
The guy who was tested in the video has a blog post up (and his server actually works): http://blog.theincredibleholk.org/blog/2012/12/18/how-do-we-read-code/
Direct youtube links to the videos:
Video 1
Video 2 (Novice)
It's not what it is, it's something else.
Young programmers see code as a way to show how good they are
Old programmers see code as something that puts money in the bank.
What off topic ?
The Cloud - because you don't care if your apps and data are up in the air.
The algorithms required to make a mainframe work vs. a distributed system are vastly different. Your failure to recognize this is probably why your thoughts are outdated.
I don't know if this is correct or PC, but when I first started I read code like a book. Every single word, left to right, down to the next line.
Now I skip.
I look where arguments go in, where they come out, where functions are called.
If I go into a code block, I look at where the result is first and backtrack as needed.
I look at the connections and only go deeper when I need to.
If I have to read legacy procedural code, I do what I do when I read the web. I ignore sub-levels of information until I finish a level. On a web page I read the content, ignore the sidebars and don't click on links until I am done with that page. In procedural code I read top down and ignore nested blocks until I see what is going on at the first level.
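For what it's worth, the "finish the first level before descending" habit described above can be approximated mechanically. This is only a rough sketch, assuming Python source and the standard `ast` module; the helper name `first_level_outline` and the sample snippet are made up for illustration and are not from the study or the videos.

```python
import ast

# Hypothetical helper: the name and output format are illustrative only.
def first_level_outline(source):
    """Return one line per top-level statement, eliding nested blocks."""
    lines = source.splitlines()
    outline = []
    for node in ast.parse(source).body:
        # Take the first source line of each top-level statement.
        header = lines[node.lineno - 1].rstrip()
        # Compound statements (if, for, while, def, ...) carry a body;
        # show only their header and mark the body as skipped.
        if hasattr(node, "body"):
            header += " ..."
        outline.append(header)
    return outline

# Made-up sample code to outline.
SAMPLE = """\
total = 0
for x in range(3):
    if x % 2:
        total += x
print(total)
"""

for line in first_level_outline(SAMPLE):
    print(line)
```

The nested `if` never shows up in the outline; you would only expand the `for` once the top level makes sense.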
Just an FYI, the test is really poorly done because the code that the novice and the expert are looking at is different. I can read the Expert's code and figure out what the output should read in no time. The inlined code, on the other hand, I have to do the full iteration for each loop. It's really a test fail.
Any tutorials or articles, etc. that start out referring to someone as "newbie", "newb", etc. should be automatically labeled as worthless. Every team of programmers--it doesn't matter if it's a team of 2, 10, 50...--always has someone that can be called a "weak link" simply because you're always going to have someone who just isn't as fast or efficient as everyone else. And if programming is more of an art form (which I personally believe it is due to having millions of ways to skin cats), then it's apt to claim that nobody can be as good as everybody. Everything depends on what exactly is being done, what technologies are being used, frameworks, functions, who's involved, what business logic is required, what data is being worked with, etc., etc., etc. I've seen people who sucked at back-end stuff rock it out with front-end and others, vice-versa. I've seen people excel with certain technologies turn around and completely blow with others, but one reason some of these supposed GODS suddenly sucked has nothing to do with their understanding or capabilities but instead almost always had to do with how their resources were being used... There is no such thing as a newbie. It's an epiphany brought about by someone's nerdy boner culture. Constantly using the phrase "newbie" and all that other juvenile bullshit is old news and something someone does when they're bored and looking to put people down just to make themselves feel better about their world. Stop referring to people like this because some of the most seasoned people could be painted with that brush depending on how you view things...
An article that could have been astounding is on a site that got Slashdotted. Perhaps they need some experienced web programmers.
Sigs. We don't need no steenking sigs.
...redhead.
It's different because significantly more of the processing and display logic is distributed to the clients. Additionally clients have local storage and processing capacity which if programmed correctly may continue to operate in some fashion in the face of network failure.
Also, it's not freaking COBOL, not that I find Ruby or Python much preferable (Aaaaand there goes my good karma)
This was written about maintaining old code, but I'd argue its application is broader
Which has more power: the hammer, or the anvil?
View: http://www.youtube.com/watch?v=Jc8M9-LoEuo
Cache:
http://webcache.googleusercontent.com/search?q=cache%3Asynesthesiam.com%2F%3Fp%3D218&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
I didn't expect to get Slashdotted, and now my website is down! :)
Here's my friend's blog post that started this all off: http://blog.theincredibleholk.org/blog/2012/12/18/how-do-we-read-code/
Maybe his website will stay up
Mike just posted a mirror of his post on another server. Hopefully this will hold up better under load.
http://www.cs.indiana.edu/~mihansen/modeling-programmers.html
It'd be interesting to see if it's similar to novices and experts with non-programming languages (but using a non-native language of course). You will have to give some leeway for the more linear flow of non-programming languages (especially if the code has multiple threads).
It's also different if I'm trying to figure out a bug or if I'm trying to figure out the output (like this example). (This is assuming I don't have a compiler and can't run the code.) If I'm trying to figure out the output, I jump to main first and follow the breadcrumbs. If I'm trying to figure out a bug without a stack trace, then I'd either start at where I think the problem is occurring or gloss over the code quickly first and then go back over each line methodically. Usually if it's a novice's code, I can just gloss over the code quickly and grab the problem. I would say that this is similar to a chessmaster who has solved countless chess puzzles. They have thousands of patterns already memorized, so they can get a solution much quicker. Actually, I bet the difference in eye movement between novice and expert chess players is close to the difference between novice and expert programmers. If looking for checkmate, start at the king and follow the breadcrumbs. Gloss over the image to see if any neurons fire a memorized pattern.
The G
The algorithms required to make a mainframe work vs. a distributed system are vastly different. Your failure to recognize this is probably why your thoughts are outdated.
Mainframes are distributed systems in a big iron box. There's a reason you can hot swap CPUs and RAM in a mainframe. It's because everything is redundant, distributed, and aware of other components.
The cloud is mainframe that sacrifices speed for physical separation (to prevent failures due to loss of power, or disasters). The algorithms are only "vastly different" if you're too dumb to see past IPv4 and understand what the damned thing is actually doing and why.
I'd like it if I could adopt this tool as a pre-screen for interviews. It would give me some measure of experience based upon a known set of data, and might even help me understand an individual's programming style.
Someone who just took a programming course with no prior experience is most definitely a newbie...
And someone without a lot of experience in a given area is by definition a newbie in that area. (Or if you don't like the term, call it "inexperienced".)
I've been doing mostly Linux kernel hacking and low-level POSIX stuff for 10+ years. If I needed to do some database stuff I'd be a total newbie and likely to make all the usual mistakes (but I've got enough experience to know this and to at least try to find out what the common mistakes are first).
When I first started programming, I tried to write elegant functions.
Later, I tried to package that into elegant libraries or modules, and my code became clearer and more structured.
Along came object oriented languages, which made modularity natural and clarified the code further.
Nowadays a complex system looks like a template-driven fractal to me, a thing of rigorously standardized crystalline beauty.
I do not fail; I succeed at finding out what does not work.
The expert colleague actually got the *wrong* answer for xy_common. To me, that is the most interesting part of this experiment! What does that tell us? Maybe the "experts" could learn a thing or two from the novices, e.g., slow down a bit and verify results :)
When I worked as a COBOL programmer in the 1980s a co-worker sitting opposite me consistently ignored the habit we had of vertically aligning TOs in series of MOVE ... TO ... statements. For the younger ones: we worked with dumb 3270 terminals that didn't show much code at once, compiling code was slow because we had to submit jobs with a much lower priority than production jobs, so we would routinely print sources on fan-fold paper and work from there. Vertically aligning the TOs and the receiving variables in assignments was important to be able to quickly find the statement you were looking for. I tried to make it clear to this guy his code was difficult to read (which many complained about), but he kept insisting his way was better. At one point I started observing him while he was studying printed sources. His eye movements were a surprise. While I was used to scanning code (the TOs) vertically to quickly locate the assignment I was looking for, he would consistently read line after line from left to right, and he did that at an amazing speed. Aligning the TOs vertically created a lot of white space within statements, and it turned out he had trouble staying on the same line while skipping over it. It had never occurred to him that other people look at code differently, and he had never understood the criticism he got until we discovered this. He was a very experienced programmer, by the way, and while being very pigheaded he's also the most brilliant problem solver I've ever worked with.
Because of this I have been aware for decades that eye movement patterns can correlate to layout preferences. I would not be surprised at all to learn that preferences in where to put curly brackets correlate to similar differences in the way different people process visual information. One of the reasons I personally like Python is that the lack of visual clutter created by curly brackets and semicolons vastly improves readability for me. Obviously not everyone agrees. I think my preference can be explained by the fact that I'm very poor at shutting out distracting stimuli, even at that level. That would likely show up if my eye movements were tracked while studying code.
It seems to me that experience (knowing what to look at) is just one of the factors that influence eye movements.
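For anyone who never saw the aligned-TO layout described above, here is a small sketch of what it looks like. It is written in Python rather than COBOL, the function name `align_moves` and the variable names in the sample statements are invented for illustration, and it assumes every statement contains exactly the form MOVE source TO target on one line.

```python
# Hypothetical helper: pads each statement so all TOs share one column.
def align_moves(statements):
    """Pad each 'MOVE src TO dst' so every TO starts in the same column."""
    # Split once at the first ' TO ' into the MOVE part and the target.
    parts = [s.split(" TO ", 1) for s in statements]
    # The longest MOVE part decides the column where TO is placed.
    width = max(len(src) for src, _ in parts)
    return [f"{src.ljust(width)} TO {dst}" for src, dst in parts]

# Made-up statements; the data names are not from any real program.
moves = [
    "MOVE A TO B",
    "MOVE CUSTOMER-NAME TO OUT-NAME",
    "MOVE ZERO TO TOTAL",
]

for line in align_moves(moves):
    print(line)
```

The printed result has the TOs and receiving variables stacked in a column, which suits a vertical scanner, and also has the run of white space inside short statements that a left-to-right reader has to jump across.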
At least point to something relevant: Sorting algorithms as dances.
You want the taste of dried leaves boiled in water?
This student's work is interesting, but there is much more mature work on the subject, e.g.:
http://dx.doi.org/10.1145/2168556.2168642
http://dx.doi.org/10.1007/s10664-012-9201-4
http://dx.doi.org/10.1016/j.scico.2012.01.004
http://dx.doi.org/10.1109/ICPC.2012.6240505
Experienced sees top down design from code.
Novice sees bottom up design from code.
Casteism