It's Not About Lines of Code
Charles Connell writes: "What makes a programmer highly productive? Is it lines of code per
day? Lines of good code? In this article, I examine the concept of software
productivity. I look at some of the standard definitions for productivity
and show why they are wrong. I then propose a new definition that captures
what programming really is about." Read on for Connell's stab at a better way of evaluating the worth of programmer time.
CT Originally the contents of an article were here but there was
a communication problem resulting in us thinking we were given permission to
print the article here. Now that things have been cleared up, we've linked
the
original article which you can read instead.
Sorry about the inconvenience.
They just don't apply to this art/science. Would Michelangelo's boss have put him to task for square inches/day or pounds of statue/week output?
Dude, buy a copy of DeMarco/Lister's "Peopleware", original edition is circa 1985. Your "revelation" is old news and you offer no substantive recommendations for actually helping management measure or actuate programmer productivity. The Peopleware book is factful and entertaining and reaches no better conclusion than you have. After 17 years, don't you think your postulations should improve on previous work. Have you done any research on prior publications?
In a commercial setting, the awnser is obvious: how much money the software makes is how to measure the programmer's acheivment.
In a different setting, it's not as clear......
As far as "what makes a programmer productive", I know what makes a programmer unproductive... reading Slashdot all day. Back to work, all of you!
I don't care if it's 90,000 hectares. That lake was not my doing.
Phew, what a long winded way to say: KLOC in any form is a useless metric.
I was rather hoping for positive suggestions regarding better alternative, and especially some shiny references to back them up for when I take them to my boss.
The best metric I've found is simply "Time until feature complete". Just that. Elapsed time between a feature being requested and it going live in the field with no bug reports coming back. Anything else is largely bunk. Trouble with that is that it's very hard for twitchy bosses to deal with the interim stages. "Time to code complete" is easier for them to monitor and deal with, but as anyone who has actually supported a product will know, that's only the beginning of a piece of software's life. Push the time to code complete back by a week, and you can save yourself month of grief later. ;-)
If you were blocking sigs, you wouldn't have to read this.
Fortunately, you can tell a programmer's personality type by the code they write - it is all explained in this paper by Kevin Marks & Maf Vosburgh
There are various types of programmers around. We've certainly worked with a wide selection. Over the years, we've come to realize that programmers can be divided into various "personality types". You don't stay the same personality-type your whole life though -- as you develop and learn, your approach to programming changes and that change is visible in your code. We're going to look at various functions and how programmers with different personalities would write them.
MacHack attendees have normally been around the block a few times. That means they have learnt various things, like when you're going around the block, it helps to watch where you're going, and be driving a tank. We know that a function has important responsibilities. It needs to check every error code, keep track of every byte it allocates, and that function needs to know how to cope with anything that happens, cleaning up perfectly after itself and returning an error code which explains what went wrong. But in order to write code like this you have to have made mistakes and learned from them. We know we have...
; )
"If being a geek means being passionate about something, then I pity those who aren't geeks." - Pike65
I work on contracts for commercial software and it is amazing how much code people can write and not comment it. I had to change the functionality of some program once and it took me 5 days to write 3 lines of source. Why? Because I had to wade through code with variable names like "int32 data[7];". As a bonus there were hardcoded numbers to the variable. I had to do hex dumps at one point to see where the data was being used and how.
As I shouldn't even have to say... commmenting your code improves productivity A LOT. Some people say you shouldn't comment code in a commercial product because then you can easily be replaced. My response to that is, why don't you do good work then you won't have to worry about being fired?
If I had an employee who's not commenting his code, that means the next coder that tries to change something is going to spend a bunch of completely unproductive days just trying to figure out what's going on. I think I'd fire the employee because of his incompetence and the amount of time/money he going to make me waste.
Outdoor digital photography, mostly in New Engl
Don't get me wrong, commenting your code is a must.
However, I would rather have a programmer who writes easily understood code but doesn't document it well than one who writes well documented but overly complex code.
I've worked on large projects where there was nearly a 1:1 ratio of comments to code, but the comment didn't help you see the big picture because the parts of the application were too far abstracted from reality. And the code was written in strange ways that made it hard for other people to understand.
In summary, the code can and show be written so that most of it documents itself. If the application is well designed and the code is written well, the need for a lot of in-code commenting goes way down. This is assuming we're not talking about assembler, which in my opinion should have a nearly 1:1 ratio of code/comments.
I want to add another angle... I managed validation. I viewed our job as reducing 1-800 support calls to zero. In the end, support costs need to be rolled into productivity numbers for develpers also. A couple of support calls from a single user can easily make the gross margin for the sale to that user negative. (And for you "free beer" programmers -- same thing applies, wouldn't you rather write code than spend time supporting users?)
And a closing note: A wise manager once said: "Obviously you want smart, productive people on your project. Note that dumb, unproductive people are relatively harmless, because they are not productive enough to cause much damage. What you need to watch out for are dumb, productive people."
Having any defined metric is (IMHO) a Bad Thing in the long run, for the simple reason that people will sooner or later start gaming the metric. If you reward lines of code you get lots of lines of code. If you reward feature points you get lots of features. For a while I tried more abstract things like "user satisfaction," but that started drifting into the "The Customer Is Always Right" syndrom, with all the feature creep and bloat that goes with it. Using "my satisfaction as your manager" is even worse; brown-nosers are a danger to anyone undertaking a team effort with any element of risk.
So I started wondering: do I realy need to measure productivity at all? Why do I care? The bottom line was, I don't care. I'm not interested in "producivity" any more than I am in "attendence." At this point, I tell people if you want to know what your score is, play a game, open an on line stock market account, or post messages on a web page that keeps track of karma. In this team, the focus is on getting the job done, not on keeping score.
-- MarkusQ
Ummm, this appears to be a regurgitation of a segment from Triumph of the Nerds . With the Microsoft guys saying that productivity should be based on getting a problem solved vs. the IBM guys saying that productivity should be based on LOC or KLOC (thousands of lines of code) or MLOC (millions) etc.
Being a "Data Miner" myself, I can certainly agree with the problem-solving-as-productivity approach, rather than the "how many inner joins can I throw at this to make it look like I am busy" approach.
Actually, the LOC as productivity is so foreign to MY thought process that I can not comprehend why anybody in management or in direct labor would bother to think about it.
Eve Fairbanks says I drive a hybrid!LOL
Agreed. I think it was Dijkstra who argued that if Lines Of Code are counted, then the number should be viewed as a liability rather than an asset. That is, LOC are not something we produce, but something we spend.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
And when Ingrid Insightful finishes her day job, she heads off to her night job at the freak show as the bearded lady
Other useful metrics:
Spelling errors per line of documentation
Size of chopstick collection
Volume of spam on harddrive
How many years out of fashion clothes are
Months since last date
Weight of programming manuals in personal collection
Accumulation of fast food and junk food detritus on keyboard
How long to gnaw leg off to escape meeting
How many minutes can talk in jargon and acronyms alone
Number of hours will voluntarily work if just left alone to do the damn thing
Age of most out-of-date, yet essential, book and when it became out of date
Serverity of unintelligible handwriting because everything is usually typed
Increase in heartrate when new technical journal arrives
Depth of paper, notes, cans, wrappers, computer bits, et al piled on desk
Ability to quote from any Monty Python show, movie, recording, book, without error.
Proportion in size of editor macros relative to actual code
A feeling of having made the same mistake before: Deja Foobar
"I have made this letter longer than usual, because I lack the time to make it short."
-- Blaise Pascal
If anyone deserved to have a programming language named after him, it was the originator of this quote. I just wish it had been a more concise and expressive language.
This guy clearly hasn't read The Mythical Man-Month. He should.
About 2 years ago, I was working on my first major project and the project manager called me one day out of nowhere to ask where my progress was (normally we covered this in a weekly meeting). I started giving him percentage estimates based on feature completeness, structure completeness, etc.
So then he asks "how many lines of code do you have?" I tell him that I don't use that as a gauge, I use what I just told him for my progress. Also told him that I don't count lines. He persisted, so I came up with a rough count. He says "so if you say you're at 60%, and have X lines of code written, then you'll have Y lines when you're done, right?"
I had to reiterate (for the third time in that phone call) that LOC means nothing - it may very well be that I only had 100 lines left to put together, but it would tie up the remaining functionality needed (by gluing all my pieces together).
But he just kept coming back and harping on that LOC number, no matter how I tried to persuade him that it was meaningless. He was convinced that this was how he would know how much work went into the project. I guess the 3 weeks of writing very little code and charting out the logic of the app didn't mean much to him. He was taken aback when I told him "I don't just start writing code on day 1, I plan things out"
The article originally appeared here last week. Sheesh. Don't pretend it's an original Slashdot article if it's not.
I like commentParserDatabaseCursor. I also like i, j, p, and retval. The length of an identifier should be roughly proportional to the log of the size of its scope. A file scope "i" is an abomination. A loop scope "commentParserDatabaseCursor" is an idiocy.
Boss: How many lines of code did you do today?
Coder: 1
Boss: [next day], how many lines did you do today?
Coder: 1
Boss: [day 3] how many lines did you do today?
Coder: 1
Boss: how come you only do one line per day
Coder: Actually I'm working on the same line.
Boss: How many lines is the damn program?!?
Coder: 1
Boss: You're programming in Perl again arent you...
Number of lines of code written, highest score wins. In short, why write in 100 lines what you can write in 1,000?
Number of lines of code written, lowest score wins. You end up with your own obfuscated coding contest. You might also find people rewriting other people's work, redesigning APIs and other infrastructure components, for no reason other than to lower their own score. This can lead to total chaos, fights in the parking lot, etc.
Number of good lines of code written a month. What's the definition of "good," and how subjective is it? If it includes comments, then how is the usefulness of a comment determined? I've seen developers write more comments than code, and in the end the comments said nothing useful that would've helped a new developer maintain the code.
Number of bugs fixed in a month. The programmer who spent a month researching the sev 2 bug that was affecting system availability or data integrity for the last three months isn't recognized for his/her achievements, while the intern who fixed 100 bugs pertaining to typos on the website and in the documentation is rewarded.
Number of bugs created in a month, lowest score wins. Nice idea, punishing people for creating bugs, but people might get so paranoid about causing bugs that the turnaround time for code is obscenely high.
Code complexity metric, lowest score wins: All this proves is that the programmer's capable of writing non-complex code but says nothing about the documentation for the code, the overall design of the component or subsystem the programmer was working on, etc.
Number of tasks completed in a month. This screws the guy who's got a hefty task that cannot be divided any further or the programmer waiting on the sysadmin to install the necessary development tools so that he/she can actually get started.
Customer satisfaction. The customers are pissed because the website is unstable, but the rest of the back office system is running perfectly fine. In the end, everyone -- the back-office developers, the database guys -- is punished when only the website people should've been put on call center duty for a week.
Number of customer issues resolved. There's a great incentive here to solve issues with kludgy hacks which, in six months to a year, might leave the company with a very flaky, unmaintainable system.
360-degree input, or "Mutually Assurred Destruction". This was a system IBM used -- still uses? -- where your peers, some picked by you, some by your manager, would fill out a survey and offer opinions about you. The manager would then piece it all together and come up with a result. Like Dilbert called it, it's "mutually-assurred destruction," although I saw it work the exact opposite way many times.
There's so much more that goes into developing and delivering software than just writing lines of code. And the number of lines of code written isn't all that significant if the design sucks, if the documentation is unusable by the people who need it, if the call center people supporting the thing aren't trained properly, or if the systems supporting the website or the database are unstable. How do you put a score next to a name when many of the things contributing to that score are subjective or out of the control of the person being scored? We're not building CD players here!
This guy is "president of CHC-3 Consulting, teaches software engineering to corporate and university audiences, and writes frequently on computer topics". Still he failed to mention function points (an old measure of product size) or use cases (a more modern measure of product size).
He also fails to recognize that programming is a group activity, where one person can be seemingly unproductive, but in reality being vital for the productivity of the group. Typical such persons are mentors, which spends some of their time helping others. Mentors may not produce a single line of code, but still be the most valuable person in the group.
Alistair Cockburn does in his modern classic "Agile Software Development" state that software development is a "Cooperative Game of Invention and Communication". Therefore the productivity is best measured at the team level, since they are, in the end, cooperating.
Also, I think it is quite clear that use cases, or user stories, or whatever you wish to call them, are the best way to describe the wishes of the customer. Fulfilling these wishes are ultimately the only thing that matters.
So, I would say that the number of finished use cases per unit of time, for the whole team, is by far the most meaningful measure of productivity.
Mats
You should strive to make your design docs just good enough for the people who'll be reading them -- the maintenance programmers, who will also have the code. In other words, the design docs are the cliffnotes to the code. The code is always the authoritative design documentation.
BTW, I STRONGLY recommend reading Agile Software Development for anyone who's seriously interested in these issues.
The firm I'm working for hired a contractor to write the system software for their scientific instrument.
... other conditional code // to exit
...
They later hired me.
It is now my job to maintain, expand and rewrite his original code as the device gets further from the prototyping stage.
Here's some metrics for you:
4000 lines of C code
Avg. Variable name length: 3 to 4 characters
Avg. Function Name Length: 3 to 4 characters
Total number of functions (not including state-machine functions): 9
Amount of documentation: nearly 0. Comments are laughable.
Total number of functions, including state-machine functions labelled from stsws0 through stsws30 - 40.
Total number of constant values used without #defines or assigned names: Too many to count.
Amount of documentation of constant values that aren't obviously for buffer/scratch space, but actually do something important: Zero.
Amount of dead code: Current investigation indicates somewhere between 30% and 60%.
Amount of dead code interspered with live code, so it's really difficult to work out what's a dead function, and what's live: All of it.
Level of interweaving of dead code and live code: pretty damn high.
Use of pound-defines for code switching and giving alternate code paths: Zero.
Use of pound-defines to switch blocks on and off that really you want to keep on ALL THE TIME (particularly as the app crashes if you turn them off): 10
Interesting idioms:
-- Use of pound-if(0) and pound-endif to bracket (useless) comment sections, eg:
pound-if(0)
This is a comment.
pound-endif
-- Use of a while loop to do error handling. Eg;
while (TRUE) {
if (condition) { error = 5; break; }
break;
}
if (error)
Number of pound-includes that are actually totally unneeded:
5.
Number of Windows 3.1 programming idioms used: 2 found so far. In a piece of code that *requires* NT4 and as such is Win32 only.
Other interesting idioms: Massively nested if's EVERYWHERE. Very little modularisation. Grabbing an HDC at that start of the app and not letting it go until shutdown, without specifying a Class DC.
The guy REWROTE FROM SCRATCH button controls and edit controls, using WM_MOUSExxx message and WM_CHAR handling, as part of the main frame window. Each edit/button has a separate cut/paste if statement block to handle it. There are about 80 controls on the main screen. This code is cut and pasted for each single control.
And for the final piéce de resistance;
Total number of local variables used: ZERO. 0. Nada. Zilch. EVERYTHING IS A GLOBAL VARIABLE.
Coming soon - pyrogyra