It's Not About Lines of Code
Charles Connell writes: "What makes a programmer highly productive? Is it lines of code per
day? Lines of good code? In this article, I examine the concept of software
productivity. I look at some of the standard definitions for productivity
and show why they are wrong. I then propose a new definition that captures
what programming really is about." Read on for Connell's stab at a better way of evaluating the worth of programmer time.
CT Originally the contents of an article were here but there was a communication problem resulting in us thinking we were given permission to print the article here. Now that things have been cleared up, we've linked the original article which you can read instead. Sorry about the inconvenience.
They just don't apply to this art/science. Would Michelangelo's boss have taken him to task for square inches/day or pounds of statue/week output?
Dude, buy a copy of DeMarco/Lister's "Peopleware"; the original edition is circa 1985. Your "revelation" is old news and you offer no substantive recommendations for actually helping management measure or actuate programmer productivity. The Peopleware book is factual and entertaining and reaches no better conclusion than you have. After 17 years, don't you think your postulations should improve on previous work? Have you done any research on prior publications?
In a commercial setting, the answer is obvious: how much money the software makes is how to measure the programmer's achievement.
In a different setting, it's not as clear......
One of the things that makes a good programmer, full stop, is not worrying about trying to measure the immeasurable.
Cheers,
Ian
It compared the lines of code and number of bugs to the salaries. Of course it said it was cheaper to hire a super programmer. But, it found that the only difference between the average programmer and the incompetent programmer was the number of bugs generated, not the lines of code.
People need to be reminded of the high cost of debugging. It takes a long time to track down a bug.
Fight Spammers!
Your manager doesn't care how many lines of code you do or don't write. He doesn't care what those lines do, or how they work. That's because the client or customer doesn't care about those things. All they do care about is features: Did you add what we needed to add today? Did you finish ahead of schedule or behind it? Will we deliver on time or a week later?
Optimize on your own time. All the non-developers care about is what gets into the final product, and if you meet the list of desired features, then you're productive. End of story.
I work on contracts for commercial software and it is amazing how much code people can write and not comment it. I had to change the functionality of some program once and it took me 5 days to write 3 lines of source. Why? Because I had to wade through code with variable names like "int32 data[7];". As a bonus there were hardcoded numbers to the variable. I had to do hex dumps at one point to see where the data was being used and how.
As I shouldn't even have to say... commenting your code improves productivity A LOT. Some people say you shouldn't comment code in a commercial product because then you can easily be replaced. My response to that is, why don't you do good work? Then you won't have to worry about being fired.
If I had an employee who's not commenting his code, that means the next coder that tries to change something is going to spend a bunch of completely unproductive days just trying to figure out what's going on. I think I'd fire the employee because of his incompetence and the amount of time/money he's going to make me waste.
Don't get me wrong, commenting your code is a must.
However, I would rather have a programmer who writes easily understood code but doesn't document it well than one who writes well documented but overly complex code.
I've worked on large projects where there was nearly a 1:1 ratio of comments to code, but the comments didn't help you see the big picture because the parts of the application were too far abstracted from reality. And the code was written in strange ways that made it hard for other people to understand.
In summary, the code can and should be written so that most of it documents itself. If the application is well designed and the code is written well, the need for a lot of in-code commenting goes way down. This is assuming we're not talking about assembler, which in my opinion should have a nearly 1:1 ratio of code/comments.
I've always operated like Ingrid Insightful - I just can't convince managers to agree with me. If I could I'd make mid-day trips down to the Art Institute, or just go for a stroll while thinking about anything but the problem (strangely, the answers always come to me as soon as I get my mind far enough away from the problem that I can see the big picture clearly ... or maybe it's not so strange).
Unfortunately, I'm doing consulting work and there's something about the client preferring to pay for time on site. Suggestions for beating these concepts into management?
Having any defined metric is (IMHO) a Bad Thing in the long run, for the simple reason that people will sooner or later start gaming the metric. If you reward lines of code you get lots of lines of code. If you reward feature points you get lots of features. For a while I tried more abstract things like "user satisfaction," but that started drifting into the "The Customer Is Always Right" syndrome, with all the feature creep and bloat that goes with it. Using "my satisfaction as your manager" is even worse; brown-nosers are a danger to anyone undertaking a team effort with any element of risk.
So I started wondering: do I really need to measure productivity at all? Why do I care? The bottom line was, I don't care. I'm not interested in "productivity" any more than I am in "attendance." At this point, I tell people if you want to know what your score is, play a game, open an online stock market account, or post messages on a web page that keeps track of karma. In this team, the focus is on getting the job done, not on keeping score.
-- MarkusQ
Agreed. I think it was Dijkstra who argued that if Lines Of Code are counted, then the number should be viewed as a liability rather than an asset. That is, LOC are not something we produce, but something we spend.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
I work for IBM as a software engineer and one time in a meeting a mid-level manager actually uttered the phrase:
"It's been shown that code reviews increase quality by 94%".
I knew then that these people don't have a clue what programmers do. They want to quantify something that is qualitative. You just can't fight someone who lives in a contradiction.
Apoptosis
isn't who was more productive, Fred or Danny, in the situation above, but who was more productive if Fred wrote his 5000 lines in 5 days and got it done, and Danny wrote his 2500 lines, took him 10 days to get it done.
That's the dilemma facing project managers-- is it better to "get it working now" or "ease extensibility/maintenance later." There is no hard and fast solution. It's different for every project, and often misjudged, in part because you can't see into the future to determine its lifespan. Of course everyone wants two Dannys who get done in 5 days, but that's not the real world scenario.
Measuring the productivity of a software "designer/engineer/coder", or whatever you want to call them, is a very difficult thing to do. On our current project we are using a third-party tool that is riddled with bugs; unfortunately, due to management decisions, we are unable to ditch the product and search for a different tool. However, my team has remained highly productive during this past month while working with the vendor to solve the issues.
Have we produced many lines of code? Not really, probably around 9000 lines between the 6 of us over the past month, that is about 1500 lines per programmer... in over a month. According to any logic using "lines per day" or anything along those lines, we are in horrible shape. However, we have been solving many of the issues with the vendor, scouring over lines of code to ensure that the tools are working correctly, changing and tweaking our testing classes (Java) to ensure that everything is truly working the way that it is supposed to be.
Now, with about a month of time wasted according to typical programmer productivity analysis, we have a decent library of functionality built up (or easily migrated into a library), we are very familiar with the product that we are using and the APIs, and will probably come in on, or before schedule.
Was that time wasted? Were we unproductive? I would say no to both of those questions. Yes, working with a vendor with broken software was frustrating and time consuming; however, we now have an intimate knowledge of a third-party "black box" and we have, in the process of working with them, built up a suite of test cases that will help us immensely in the near-term future.
But, we only turned out 1500 lines per programmer in a month, you say. However, due to all of the debugging work and API "scouring" we have done, we will probably be able to turn out closer to 500-1500 lines of good, well documented and working code every day or so.
Well, my point is simply this: lines of code per day is simply not a good analysis. The best way to determine productivity is on a per-project basis. How is the project coming? Are the objectives being met? Are you solving the problems that are coming up in a timely fashion? There is no final answer; it must be evaluated per-project, per-team, per-company.
-ryan
Charles Connell is playing the part of Danny Designer, writing an article which restates ideas that have been stated before. You, on the other hand, are playing the part of Ingrid, finding the old information and simply re-using it. I'm sure this is what Chris was hoping you would do, in a nifty reality-hack kinda way...
Um, yeah.
This is yet another case of trying to quantify something that is qualitative. It is pointless to try to measure somebody's quality as a programmer (or as anything else for that matter) by using some numerical assessment. The examples above demonstrate that clearly, but here are a couple more examples:
Which is more valuable, a programmer who churns out 1000 lines of code/day but is very reclusive, or the one that does 500 but is also good at communicating project directions with others?
Which is more valuable, an inexperienced programmer who learns quickly or an experienced programmer who doesn't?
If you want to know how good a programmer is, ask them the right questions. I'm not sure exactly what those questions are; it depends on what you want out of them. But I've been on many interviews and the vast differences in interview quality have amazed me. People who are trying to measure the quality of a programmer by "lines of code" are setting themselves up for lots of problems.
I think I was asked once to estimate lines of code I've written and I had NO idea. Frankly, if somebody did know the answer to that question I'd be concerned. It sounds like somebody who's too busy keeping track of the metrics that imply their skill rather than actually doing good work. These are likely the same people who are staring at the clock for the last 15 minutes of the day, constantly estimating the minimum amount they need to do to get by.
This sig has been temporarily disconnected or is no longer in service
"I have made this letter longer than usual, because I lack the time to make it short."
-- Blaise Pascal
If anyone deserved to have a programming language named after him, it was the originator of this quote. I just wish it had been a more concise and expressive language.
Using problems solved per day wouldn't be good. For one, some problems and projects can be spread over days, weeks, and months. This means that the number of problems solved will be less for someone working on a harder project than someone who works on simpler projects (and problems).
Also, I would have to disagree with the article. The following comment really isn't true: "[h]is code is shorter and simpler, and simplicity is almost always better in engineering". The larger code could be contained in well thought out objects, or functions, that make the main objects, or functions, simpler to read, and the main bulk of the code, if well engineered, could also be reused easily.
In his above example, Danny's code may not be any more reusable than Fred's. Danny's smaller number of lines of code could be the result of not taking into account all necessary possibilities, or not thinking about possible future problems.
The size of the code just doesn't matter. What does matter is how well the code is thought out and commented (both insertions and deletions of code). Well thought out code usually can produce some reusable code and/or design patterns. Personally, as a rule of thumb, I don't like to write code more than three times for a given task or problem--unless I absolutely have to write it more times.
The writer is truly missing the point regarding the purpose of measuring performance regarding lines of code.
Source Lines of Code (LOC or SLOC) are used, by management, to get an understanding of the overall productivity of software engineers in general. It is not an end-all, be-all rule regarding software engineering.
If you take a sampling of 100 good programmers, given clear requirements, and measure their performance, you will be able to determine the overall productivity for a single engineer on a per day/week/month/year basis. This allows managers to make some determinations regarding project planning, enhancements, changes, and yes, to some degree, the performance of engineers.
For example, if I know that my engineering group of X people is capable of contributing 1000 LOC per person per month (per man-month) to a project, then I can better estimate the cost and schedule of a new project. The project's scope is determined by detailing the customer's desires, and developing a break-down of capability. (Things such as R&D, training, and new technologies are identified and have an appropriate risk factor associated with them).
A LOC estimate is associated with each capability, which consequently will produce a timeline and cost.
What the author should have really reflected upon was not how to refine the software productivity metric, but rather how to refine the application of that metric.
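The estimation approach this comment describes (a LOC estimate per capability, a risk factor, and a known rate per person-month) can be sketched roughly as follows. This is only an illustration: the capability names, LOC figures, risk multipliers, team size and monthly cost are all made-up assumptions, and the function names are hypothetical. Only the 1000 LOC per person-month rate comes from the comment.

```python
import math

# Hypothetical capability breakdown: (name, estimated LOC, risk multiplier).
# None of these numbers come from the thread; they only illustrate the method.
CAPABILITIES = [
    ("user login", 2_000, 1.0),
    ("report export", 3_000, 1.2),     # some unfamiliar technology
    ("new protocol R&D", 1_000, 2.0),  # research-heavy, so higher risk
]

def estimate_person_months(capabilities, loc_per_person_month=1_000):
    """Sum the risk-weighted LOC and convert it to person-months."""
    weighted_loc = sum(loc * risk for _, loc, risk in capabilities)
    return weighted_loc / loc_per_person_month

def estimate_schedule(capabilities, team_size, monthly_cost_per_person):
    """Return (person-months, calendar months rounded up, total cost)."""
    person_months = estimate_person_months(capabilities)
    calendar_months = math.ceil(person_months / team_size)
    cost = person_months * monthly_cost_per_person
    return person_months, calendar_months, cost

person_months, calendar_months, cost = estimate_schedule(
    CAPABILITIES, team_size=4, monthly_cost_per_person=10_000
)
print(f"{person_months:.1f} person-months, "
      f"{calendar_months} calendar months, cost {cost:,.0f}")
```

Note that the sketch treats people as perfectly interchangeable and divisible, which is exactly the kind of simplification other commenters in this thread object to.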
About 2 years ago, I was working on my first major project and the project manager called me one day out of nowhere to ask where my progress was (normally we covered this in a weekly meeting). I started giving him percentage estimates based on feature completeness, structure completeness, etc.
So then he asks "how many lines of code do you have?" I tell him that I don't use that as a gauge, I use what I just told him for my progress. Also told him that I don't count lines. He persisted, so I came up with a rough count. He says "so if you say you're at 60%, and have X lines of code written, then you'll have Y lines when you're done, right?"
I had to reiterate (for the third time in that phone call) that LOC means nothing - it may very well be that I only had 100 lines left to put together, but it would tie up the remaining functionality needed (by gluing all my pieces together).
But he just kept coming back and harping on that LOC number, no matter how I tried to persuade him that it was meaningless. He was convinced that this was how he would know how much work went into the project. I guess the 3 weeks of writing very little code and charting out the logic of the app didn't mean much to him. He was taken aback when I told him "I don't just start writing code on day 1, I plan things out."
I feel like some intrinsic part of programmer productivity has been overlooked here. A lot of development is done in teams, working with groups of people. Sometimes a person can be of immense support to a team by providing insight, direction, explaining an existing system, etc... without writing a single line of code. I've known some programmers who never wanted to be bothered and others who became so swamped with people asking them questions that they sometimes had trouble getting their own work done. If Rick asks Bill a question, and Bill's answer takes an hour to explain, but saves Rick a day in wasted implementation, how does that affect the perceived productivity of Bill and Rick? Furthermore, how does this make you look at the productivity of someone who never wants to be bothered or someone who rarely asks questions even when they should?
My personal favorite productivity measure: lines of code I've DELETED!
Yeah, I know this isn't any new revelation either, but I'm a believer in Refactoring: improving code without adding functionality. Refactoring improves efficiency and understandability, and removes code duplication.
Read Martin Fowler's awesome book, and/or practice Extreme Programming; it'll change the way you program.
----------
I can't spell. What else is new?
If Slashdot is where the spelling-challenged go when they die, I'm in heaven.
The article originally appeared here last week. Sheesh. Don't pretend it's an original Slashdot article if it's not.
This is off topic:
I don't understand why "if( 1==x )" is ugly. If the poster prefers "if( x==1 )", then I submit that some (experienced) programmers use the former because they know how easy it is to mistype the comparison operator. In that case, "1=x" causes a compiler error while "x=1" simply assigns 1 to x and continues.
Your Servant, B. Baggins
I don't think anyone has ever claimed lines of code per day is a useful or meaningful measure, except of course for pointy haired bosses.
who could do a 90% solution to a problem in days that would have taken any one else weeks. There were two problems with him: he knew he was that good and it took the rest of us weeks to finish his code.
Just another post below my threshold.
Getting away from a problem sometimes is a good way to solve it.
Works for debugging, anyway.
Once upon a time we were stuck for ages with a horrible bug, and were making no progress. I went home for a bath (don't remember at this distance why I thought that was necessary), and in the bath worked out what must be going wrong.
On returning to the office I got some coffee and wandered over to my desk, ignoring the group of people still huddled round the minicomputer. After a few minutes I looked up and said "You have fixed it, haven't you?". "No", they said.
So I told them which module to look in, and then which line of code was wrong, without the listing in front of me, and for no apparent reason they got quite cross when they discovered I was right.
Dude, that's completely ridiculous. Some of the women I work with would make mad cash if they were evaluated by their beards, even though they're just stupid bitches.
One of the guidelines I have to find code to refactor: look for comments. Comments should be in the checkin log of the file, in the changelog of the project, in the header files and in the naming of variables and functions. And in the diff between revisions.
If they are in the code you have a good chance to point out weaknesses in the code. Occasionally these are weaknesses that are very hard to avoid. Often not.
Which leads to guideline two: always try to upgrade the clarity of the code when you do a small mod (usually a fix). Small mods tend to degrade the code quality, the maintainability, even though the code works better than before.
Comments in the code also quickly become irrelevant, and even misleading, if they aren't already bad from the start. So take a weak fix and append a confusing comment, and you get a very recognisable frustration. Only, the solution for it is not 'better comments'. It's better code.
This guy is "president of CHC-3 Consulting, teaches software engineering to corporate and university audiences, and writes frequently on computer topics". Still he failed to mention function points (an old measure of product size) or use cases (a more modern measure of product size).
He also fails to recognize that programming is a group activity, where one person can be seemingly unproductive, but in reality be vital for the productivity of the group. Typical such persons are mentors, who spend some of their time helping others. Mentors may not produce a single line of code, but still be the most valuable people in the group.
Alistair Cockburn states in his modern classic "Agile Software Development" that software development is a "Cooperative Game of Invention and Communication". Therefore productivity is best measured at the team level, since the team members are, in the end, cooperating.
Also, I think it is quite clear that use cases, or user stories, or whatever you wish to call them, are the best way to describe the wishes of the customer. Fulfilling these wishes is ultimately the only thing that matters.
So, I would say that the number of finished use cases per unit of time, for the whole team, is by far the most meaningful measure of productivity.
Mats
being able to answer the question
"what are you doing to solve the problem?"
If you know, big picture, what your plan is, but you've spent 2 days figuring it out, that's probably better than just jumping in and writing a bunch of crap just for the sake of writing it. You're less likely to have to re-write that way.
In this case, you're probably going to write less code overall, but that's cool. This is a combo of Danny & Ingrid.
The problem is, you can't keep too many Ingrids around, because what if the required feature isn't already available? Then you have a staff of people who just laze around hoping an easier solution will manifest itself. It's all about balance.
Knowing how to proceed in solving the problem is a useful metric in an unfinished software product.
The process of software development is still a creative task, requiring the exercise of human imagination and judgement.
Few can properly measure the output of a programmer except in Yhellowstones (brown fluid in --> yellow fluid out) since this will remain a judgement call until this becomes a determinant rather than an emergent process.
A poet is still working even when gazing out the window.
Finally, usually communications/research time cuts down on code generation- and that's the real question. How can we have metrics unless there's a second opinion over whether code needs to be created?
-soup (GNUrd, Speaker to Machines) "Laugh at yourself- Why should everyone else have all the fun?" -Romanchek's 6th Ru
You should strive to make your design docs just good enough for the people who'll be reading them -- the maintenance programmers, who will also have the code. In other words, the design docs are the cliffnotes to the code. The code is always the authoritative design documentation.
BTW, I STRONGLY recommend reading Agile Software Development for anyone who's seriously interested in these issues.
Using peer reviews and feedback, therefore, allows a manager to qualify a particular developer's "productivity" by asking the only metric worth anything - the opinion of other knowledgeable developers. Trying to equate any of this to any metric so far uncovered is truly pointless.
Not that there may not be a real metric (or, more likely, a complex set of metrics) out there that someone will discover to adequately measure this stuff; in the meantime, though, I'll continue to let the team examine, develop, grow, and rate itself. In the end, I know I'll have a strong group of developers who respect each other, work well together, can understand each other's code and approach, and who are...productive.
Sorry, but that's utter crap. Simple code is every bit as straightforward as comments in English.
You claim that code cannot be tested unless it's documented. Then how do you test the comment that is supposed to be describing the code? It's just as likely to be wrong as the code is itself. Example:
// This code prints foo to the console
printf("bar");
Why should the comment be the true authority? It's no more an authority than the code. I've seen countless examples of the comments being far out of date compared to the code.
The point is that simple code *is* obvious and doesn't need any further comments.
I wasn't talking about the design document, I was talking about the inline documentation, aka comments. I think all code should be designed before it's built and that design should take the form of a document, but that is not the same discussion we were having.
I suppose there are rare disciplines where this would be an acceptable level. As much as measuring the lines of code for productivity is misleading and inaccurate, in some situations it could point out extremely low or extremely high productivity.
If the programmers are engaged in testing, debugging, or optimization, ten lines of code per day *might* be acceptable. If they're in the design stage, don't expect any lines of code. Ideally, not a single line would be written until the full design spec has been laid out and approved by management or the customer.
However, if they are in the middle of development, ten lines of code per day is abysmal in virtually every case I can imagine. As an example, say you have a small application that consists of 100,000 lines of code (measured by however your programmers are defining a line of code). 100,000 lines of code is not a very large package (consider that W2K is, I believe, on the order of tens of millions of lines). It would take your two programmers at 10 lines/day 1,000 weeks (19 years) to complete the software. That assumes they work 5 days a week without ever taking a single weekday off for holidays, sick leave, etc.
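For anyone who wants to check that arithmetic, here is a minimal sketch. The figures are the ones from the paragraph above (100,000 lines, two programmers, 10 lines per day each, 5-day weeks); the function name is made up for illustration, and the 52-weeks-per-year conversion is an assumption.

```python
# Back-of-the-envelope check of the schedule arithmetic in the comment.
# The function name is hypothetical; the inputs come from the comment itself.

def weeks_to_finish(total_lines, programmers, lines_per_day, workdays_per_week=5):
    """Return the number of working weeks needed to write total_lines."""
    team_lines_per_day = programmers * lines_per_day
    working_days = total_lines / team_lines_per_day
    return working_days / workdays_per_week

weeks = weeks_to_finish(total_lines=100_000, programmers=2, lines_per_day=10)
years = weeks / 52  # assumes 52 working weeks per year, no time off
print(f"{weeks:.0f} weeks, about {years:.1f} years")  # 1000 weeks, about 19.2 years
```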
I'm a programmer -- in fact the only one for the division of the company in which I work -- and the software that I wrote from scratch beginning last fall consists of over 60,000 lines of code. It worked out to an average of 500 lines of code per day. A lot of the time up front was spent solely on design without a single line of code being written, so in reality that number is actually a bit higher when you count only those days where development work occurred.
And I think I was slow. Either the two programmers who told you that 10 lines per day is the average were joking with you, or they're pitifully lazy. Or they're just not very good. But, before you get mad at them, take into consideration that maybe there are outside factors that cause considerable drain on their productivity. Maybe they have extremely poor documentation to work with or the processes they're using for development are inefficient.
The algorithm part can go two ways. Ideally, they should have included that in the design phase. They should have already mapped out the operation of the system -- unless the specs changed on them in a significant way as to thwart previous designs. It is also unavoidable in large, complex systems that issues will come up during development that cause the programmers to have to step back and think through things first. This can cause temporary lags in development as the design catches up to changes in the spec or unforeseen problems (maybe a piece of the core was relying on something promised from another vendor, but that vendor has disappeared). But if your programmers are constantly designing each little piece of the software as they reach it, without having mapped out the big picture, they're not very good.
As for the API, that could either be extremely poor documentation (which in most cases can be remedied by dropping by a local technical bookstore) or programmer laziness. It is a poor approach to only look up those specific API calls as you need them without having any understanding of the entire package you're using. Each time a new external package is needed and its API will be used, the programmers should be learning that package as a whole before they begin to use it. They may get the software to work without it, but they will lack the understanding of it to be able to use it efficiently, and many times correctly. For tiny packages you can get by easier, but when you start talking about using stuff like, say, Xlib directly you're making a big mistake to not learn as much of Xlib as you can before actually using it.
Long story short: 10 lines per day is pathetic in most cases where development is fully underway -- unless significant roadblocks have been created that work against the programmers. But, expect very few lines of code during design. In fact, encourage it. You want your programmers designing the software, not rushing headfirst into the development without creating a map. And if you go through extensive testing and there are very few lines of code being written or modified, that's probably a good thing because it usually means that there's not much to fix.
Definition: Productivity, '2 a : the act or process of producing b : the creation of utility; especially : the making of goods available for use.' from Merriam-Webster
Premise: Productivity cannot be measured except by creation of utility. Utility can be defined as a marginal increment of value. Value can be defined as a unit of production. Productivity then is a measure of increased value. Definitions of value have been attempted from Aristotle onward with varying degrees of acceptance. For business purposes value is found in the bottom line and is predicated upon Generally Accepted Accounting Principles.
If we put aside the idea of a programmer being made to be 'highly' productive as a pipe dream then increments of utility can be put forth as the only available measure of productivity. For example I find I'm more likely to be 'highly' productive the more people like me, do what I want them to do and give me what I want.
If we accept increased utility as a definition of productivity then the final product as it is defined becomes the final arbiter of value. This implies a goal-oriented approach to value based upon measurable increments to utility. This suggests any one programmer is capable of productivity only in so far as s/he is capable of adding to utility.
If this simple definition of productivity is looked at from the viewpoint of Open Source an interesting phenomenon arises in terms of the artistry of programmers. The Renaissance and post-Renaissance periods produced leaps in Science and the Arts something akin to what we're presently experiencing. It's been suggested the creation of perspective drawings birthed the industrial revolution by providing schematics that made possible the production of the machinery of industrialization. A critical aspect of the Renaissance and the eras that followed upon it allowed for free borrowing from the works of others. Those given to 'copyrighting' their material had little recourse; famed lutanists would hide behind curtains so no one could steal their chops. Bach, Shakespeare and others freely borrowed lines and more from their contemporaries and those who came before them. Bringing this rant to a close, it remains to postulate whether the Open Source movement, in multipart harmony, provides a more efficient model for productivity. Well doh!
heuristic algorithm seeks stochastic relationship
There's a whole literature on managing software projects. Look up terms like "Software Engineering" and "Software Management". For tracking progress, the usual approach is to divide the project into a series of steps, where each step can be unambiguously determined to be done or not (no "90% done" steps). Estimate the time that's required for each step and use a scheduling program to determine how long it will take; you'll also need separate management reserve time for the inevitable problems (but keep this separate from the steps, so that you'll know when you're using it up). Some people define dollar values for each step, resulting in earned value approaches.
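As a rough illustration of the step-based tracking described above, here is a minimal sketch. The step names, day estimates, dollar values and reserve figure are invented for the example, and the class and function names are hypothetical; real earned value tooling tracks considerably more than this.

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    estimated_days: float
    dollar_value: float
    done: bool = False  # binary on purpose: no "90% done" steps

def earned_value(steps):
    """Dollar value of the steps that are completely finished."""
    return sum(s.dollar_value for s in steps if s.done)

def percent_complete(steps):
    """Earned value as a percentage of the total planned value."""
    total = sum(s.dollar_value for s in steps)
    return 100.0 * earned_value(steps) / total if total else 0.0

def remaining_days(steps):
    """Estimated days left, counting only unfinished steps."""
    return sum(s.estimated_days for s in steps if not s.done)

# Made-up example project.
steps = [
    Step("parse config file", 2, 4_000, done=True),
    Step("load records", 3, 6_000, done=True),
    Step("generate report", 5, 10_000),
]
# Management reserve is kept apart from the steps, so that drawing on it
# is visible instead of being hidden inside padded step estimates.
MANAGEMENT_RESERVE_DAYS = 4

print(earned_value(steps), percent_complete(steps), remaining_days(steps))
```

Because each step is either done or not, progress only moves when something is actually finished, which avoids the "60% done, so how many lines is that?" conversation from earlier in the thread.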
By the way, I've used SLOC to estimate the effort needed to develop one of the GNU/Linux distributions (Red Hat); you can see the results in More than a Gigabuck: Estimating GNU/Linux's Size.
- David A. Wheeler (see my Secure Programming HOWTO)