The Futility of Developer Productivity Metrics
snydeq writes "Fatal Exception's Neil McAllister discusses why code analysis and similar metrics provide little insight into what really makes an effective software development team, in the wake of a new scorecard system employed at IBM. 'Code metrics are fine if all you care about is raw code production. But what happens to all that code once it's written? Do you just ship it and move on? Hardly — in fact, many developers spend far more of their time maintaining code than adding to it. Do your metrics take into account time spent refactoring or documenting existing code? Is it even possible to devise metrics for these activities?' McAllister writes. 'Are developers who take time to train and mentor other teams about the latest code changes considered less productive than ones who stay heads-down at their desks and never reach out to their peers? How about teams that take time at the beginning of a project to coordinate with other teams for code reuse, versus those who charge ahead blindly? Can any automated tool measure these kinds of best practices?'"
Refactor, refactor, refactor
KISS technology, nothing beats it.
If you don't have a use case for reuse, you shouldn't try to code for it. Too many 'interfaces' are single use, see 'servlet' vs. 'http servlet'.
The force that blew the Big Bang continues to accelerate.
Holistic coding is the only way to go.
Whatever you measure will be gamed. Measure bugs fixed, and you will find people wasting time listing each tiny variation of a bug. Measure lines of code, you will get spaghetti code.
It almost seems better to measure a bunch of things and use a secret formula to determine productivity.
Man, you really need that seminar!
But what happens to all that code once it's written? Do you just ship it and move on? Do your metrics take into account time spent refactoring or documenting existing code? Is it even possible to devise metrics for these activities? Are developers who take time to train and mentor other teams about the latest code changes considered less productive than ones who stay heads-down at their desks and never reach out to their peers? How about teams that take time at the beginning of a project to coordinate with other teams for code reuse, versus those who charge ahead blindly? Can any automated tool measure these kinds of best practices?
It bitrots. Yes. No. Maybe. Probably. Definitely. Possibly.
Unless it's your job to make up the metrics.
There are a lot of problems with rating developer productivity. First, if a system is that good, then managers won't be able to game it to play favorites. Second, writing code for future use is always harder than writing code specific to the problem. Third, almost any metric is going to penalize a simpler solution. (Keep in mind that once you see a simple solution it seems obvious and everyone thinks they'd think of it.) Fourth, evaluating developers well would require making the best coders managers, and that rarely happens for several reasons.
Wouldn't it be the project leader who monitors these on an individual basis? If a coder isn't pulling their weight, it's up to the project leader to address it, up to the point of termination. Above that you have a suit who monitors the project leader's team performance and decides how well the project leader is doing. Of all the places layered management doesn't work, coding is not one of them. It's a challenge to hold a developer accountable because there are so many different approaches to the same problem in coding and a lot have definitive pros and cons.
Can any automated tool measure these kinds of best practices?
No. The - for the sake of politeness, let's call them "people" - who invested time and effort into devising these schemes have actually built a complete chain of negative productivity by doing so. Remarkable.
OMG!!! Ponies!!!
Personally I am always happy with the guy who can get things done with one line of code instead of a hundred, but what I really care about is that the objective is met and we don't have a host of bugs that require 10 times the cost of the development just to maintain. It's not hard stuff, but it does require common sense and a hard-nosed attitude, both of which can be scarce commodities these days.
If they know their productivity is being measured, it becomes a contest to see who can cook the books the best anyway.
The idea that complex things cannot be measured is constantly thrown up by professionals who don't want to be measured unfairly or just don't want to be measured at all. However, doctors, teachers, and programmers can all have their output evaluated. I know that there is more to evaluating a doctor than survival rate and how often he remembers to wash hands between patients, but I know that hospitals that try to measure and improve doctors' performance do a better job of helping patients. Reviews by peers and management are a good place to start. Yes, that can devolve into a popularity contest or a blame game, but good management can guard against that. When I ran a software company we'd have meetings where we reviewed and discussed sections of code as a learning tool, as a small part of our QA process. The end result was better code and a more educated, engaged programming staff. When you combine subjective measures like these with easily quantifiable measures you can get a good idea of how competent a programmer is.
metrics provide little insight
If only we had some kind of.... metric metric.
What about all the time you spend not coding: meetings, documentation, training users on how to use the system, working out the design before you start coding, answering emails, sending status reports, filling out time sheets, coordinating work with other developers, coordinating things with others in IT so your program will have a server to run from, being the go-between for the IT server team and the customer when the server goes down, creating database layouts and writing SQL?
This all comes down to lazy, gutless management. Why take the time to get to know Dave and monitor the quality of Dave's work when we can just look at a spreadsheet at the end of the month? Managers prefer to tinker with automatic analysis software rather than manage.
Which is more fun, getting a better handle on what Dave is doing, or researching fancy new software tools that might get you all sorts of praise from metric-craving executives?
Dave's job, which was once about creating a quality product, now shifts to merely satisfying the metrics.
Coding is somewhere between art and engineering. You can hardly measure art.
Metrics are valuable if you do the same thing repeatedly. If you build a new building that is like the previous one, you can collect metrics and compare your performance against history. If you write the same search algorithm again and again, you can collect metrics and compare to see how your performance changes over time. Of course, with software, you never repeat. Somewhere around the third time, you move it into some form of library, reuse it, and start on a fresh problem. Perhaps metrics are helpful in some situations, such as when your team keeps repeating the same mistakes and you might find similarity in those mistakes (code smells). There are plenty of people working on these problems and tools. But, from a management point of view, if you keep doing the same thing, you are doing it wrong, and code metrics are not going to help much.
I don't get the concept of everything needing to be quantified. Does the team accomplish the goals it was formed to meet? Does the product get developed in a fair timeframe? Is everyone on the team pulling their own weight, or are there complaints of someone slacking off? In the end, if the product works then the team is doing well; if it doesn't, there should be at least one hybrid manager/coder who actually works with the team members, sees who is committing what, and can tell off the bat whether there is a weak link dragging the rest down. Actually putting a pen-and-paper number on a complex project is silly. Do authors get judged by the number of pages they write in a day? No, they get paid by the success or failure of the book. You can't judge by the number of lines of code, the bugs-per-line ratio or anything like that, because it is all subjective and has little to no bearing on the end product.
Not everything that matters can be measured; not everything that can be measured matters.
-- Einstein (or maybe it was Franklin)
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
People can certainly game any system that measures how productive they are by analyzing the quality of their code, and such a system certainly doesn't understand whether or not that piece of code, or that feature, actually needs to exist. For example, you may construct an office building which is 20 miles from anywhere. Does it really serve a purpose? Should it have been built? The building meets all the criteria for a good building and is functional, has structural integrity, etc., but it's in the middle of nowhere. Software is elastic in that the original code may be great, but over time the quality may degrade, or the additional code may have solid quality, but the cohesive feel of the code is lost. Indeed, the soul of the code becomes convoluted and difficult to maintain given the various signatures that people leave in their wake. Show me software that can measure that.
Measuring developer output/metrics effectively is a tough problem. Developers could solve it, but if they do, then they have to both change the way that they work and possibly work harder. Developers are smart enough to know that the metrics will be misused, even if the logic used to produce them is valid. Therefore, any solution will be ridiculed by the development community as insufficient, but the degree to which it is ridiculed will lessen as the solution improves. A solution, though, is inevitable if development continues.
Writing for reuse can be excessive, but there are a number of reasons to move in that direction even if you don't intend to reuse that specific code block:
1) Unit Tests. If you abstract your functionality in a way that allows reuse, it also abstracts it for extremely easy unit testing. And unit testing will save you an incredible amount of effort in code maintenance.
2) Consistency. If you follow the same design pattern for all of your abstractions, all of your developers should be familiar with it. This makes it significantly easier for different developers to step into projects, as they hopefully don't have to learn another person's style for abstraction.
3) Replacement and isolation. Need to implement a functional change? If your code is abstracted, like it would be for reuse, the functional change is limited to a single block, which is easily identifiable and, if you're doing it right, unit-testable.
4) Just in case. Most of the time abstracted code doesn't get reused, and even when it does get reused it's usually a copy and paste job instead of a reference. Even so, if anyone ever does need the same functionality, it allows them to quickly rip off the exact piece they are looking for as opposed to trying to strip out your program's logic to get the tiny bit they want.
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
Can somebody give an example of that IBM scorecard, please? Is it just another Balanced Scorecard system, adapted to software development? The article fails to give that information.
... you are already doomed. You've gone so far down the wrong path there's no hope of recovery.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
This sounds like some PHB who only knows how to be a manager, wanting numbers and having no idea how development works.
I suspect that if productivity measurement were so easy it could be automated, programming would be "done" as a profession.
Take a look at jobs where productivity measurement is easily automated, and you get assembly line worker, fruit picker, banker. These are all jobs for people that aren't very smart. :)
Code metrics are fine if all you care about is raw code production. But what happens to all that code once it's written?
Seriously, does the author have to ask? Or anyone who does software development, for that matter? Sadly, apparently they have to ask.
You never get to a point where you can say "code once it's written". Code changes after deployment, there are new releases, bug fixes, shit like that which crops up in real life.
So assuming you are gathering metrics the smart way and in a manner that suits your particular project and organization, you simply keep gathering metrics as you refactor your code, add new features and fix new bugs.
At the very least, one should be running McCabe and SLOC metrics (and LCOM metrics when doing OOP), and tracking the # of open bugs per component and per release version and the man-hours involved for a particular project/component/release. Identify and sort components by the # of open issues (fixed bugs, unresolved bugs, partially implemented features) and see where there is a relation between these issues and any or all of these metrics.
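A minimal sketch of that "sort by open issues, then look for a relation" step, assuming you already have per-component numbers from whatever tools you run. The component names and figures below are invented for illustration, and the plain Pearson correlation is just one simple way to check for a relation, not something the tools themselves prescribe.

```python
from math import sqrt

# Hypothetical per-component data: (name, McCabe complexity, SLOC, open issues).
components = [
    ("parser", 42, 3100, 17),
    ("scheduler", 18, 1200, 4),
    ("billing", 55, 4800, 23),
    ("ui", 12, 2600, 6),
    ("auth", 25, 900, 5),
]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Components with the most open issues first.
by_issues = sorted(components, key=lambda c: c[3], reverse=True)

issues = [c[3] for c in components]
r_mccabe = pearson([c[1] for c in components], issues)
r_sloc = pearson([c[2] for c in components], issues)

for name, mccabe, sloc, open_issues in by_issues:
    print(f"{name:10s} issues={open_issues:3d} mccabe={mccabe:3d} sloc={sloc}")
print(f"issues vs McCabe: r={r_mccabe:.2f}")
print(f"issues vs SLOC:   r={r_sloc:.2f}")
```

A high correlation on five data points proves nothing by itself; it only tells you which components deserve a closer look first.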
Project/department/solution stack specific metrics, when done intelligently, serve as structural red flags. They help prioritize.
You can track which components have the highest McCabe metric, for example. Though you should always refactor, metrics like this can flag which part of the software needs more urgent refactoring, that's another example. You can keep track of the # of SLOC changes (additions, modifications and deletions) in a component from release to release (a measure of entropy), as well as the # of bugs detected (also from release to release).
Then you compute a ratio of entropy/bugs detected per release. If you find that the ratio of your last release (named A) is considerably smaller than the previous ratio (named B), you can take that as an indication that there might be B-A undetected bugs in this release. And if, after a certain number of releases, the ratio is still small, you can take that as an indication that your quality control has improved.
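One way to sketch that release-to-release comparison, with several assumptions spelled out: the ratio is read here as bugs detected per thousand changed lines (the description above does not pin down the direction or units), the 50% drop threshold is arbitrary, and the "shortfall" estimate is this sketch's own extrapolation rather than a formula from the comment. The `Release` type and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Release:
    name: str
    changed_sloc: int   # lines added + modified + deleted ("entropy")
    bugs_detected: int


def bugs_per_kloc_changed(release):
    """Bugs detected per 1000 changed lines in a release."""
    if release.changed_sloc <= 0:
        raise ValueError("release has no recorded changes")
    return 1000 * release.bugs_detected / release.changed_sloc


def compare_releases(previous, latest, drop_threshold=0.5):
    """Return (dropped, shortfall).

    dropped   -- True if the latest ratio fell below drop_threshold * previous ratio
    shortfall -- bugs we would have expected at the previous rate, minus bugs found
                 (assumption: the previous detection rate is the baseline)
    """
    prev_ratio = bugs_per_kloc_changed(previous)
    latest_ratio = bugs_per_kloc_changed(latest)
    dropped = latest_ratio < drop_threshold * prev_ratio
    expected = prev_ratio * latest.changed_sloc / 1000
    shortfall = max(0.0, expected - latest.bugs_detected)
    return dropped, shortfall


# Hypothetical numbers for two consecutive releases.
r1 = Release("1.0", changed_sloc=8000, bugs_detected=40)
r2 = Release("1.1", changed_sloc=10000, bugs_detected=20)

dropped, shortfall = compare_releases(r1, r2)
print(f"ratio dropped sharply: {dropped}, possible undetected bugs: {shortfall:.0f}")
```

The flag cannot tell "we missed bugs" apart from "we got better"; as noted above, a ratio that stays low over several releases points to the second reading.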
Things like that are what you do with code during development and during deployment and maintenance. It is never over with software. Until you completely decommission a software system, code is never over.
I got my first job doing COBOL back in 1995. After the second code change I did, I got called into the manager's office.
He asked how I felt about the change. I stated it was easy, being only a 2-line change to 1 module.
That change saves us 2 million ukp a year, he says.
Ever since, I've tried to write as little as possible. Amazing what you can do when you understand the problem, understand the code and find the one change that does it.
Not sure my method of wandering through code, chatting to everyone and then submitting a 4-line commit would fit with IBM.
I think the idea of "productivity" is a hold-over of the Industrial Revolution that does not pertain to many of today's jobs; jobs where the unit of work is hard to define, and ultimately irrelevant. Are you telling me you pick your doctor by how many patients he can see in a day? Probably quite the opposite!
In terms of software development, I find that the *effectiveness* of a developer is more important, where effectiveness considers the following (not an exhaustive list):
- Appropriateness of solution
- Thoroughness of implementation (logging, exception handling, graceful failure, input validation, etc.)
- Well-written, parsimonious code that is easy to read and descriptive of what it does
- Works right the first time, no kickbacks from QA or end user
Give me someone who is effective but slow over someone who craps out junk quickly any day of the week and twice on Sunday! In the end, I don't care about productivity metrics, I care that the end users get a useful piece of software that does what they need with a minimum of headaches.
Insisting on "correct" English is like saying that there is only one, definitive recipe for chili.
Instead of trying to micromanage your devs ("Oh, Frank, you created 10 lines less than Bob, you slacker!"), focus on the project. Check whether your teams meet their deadlines. Pit the teams against each other. And rely on peer pressure within teams to keep the productivity up and slackers from slacking. Are you judging your beancounters by the number of bookkeeping lines they fill out per day? Do you gauge the productivity of your project managers by the number of meetings they go to? No, you measure them by the cost they produce and the benefit they create. Why the fuck should it be different for programmers?
Stop that petty micromanagement crap and start looking at what they accomplish!
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
I consider my most productive days to be those on which I delete the most lines of code.
Like paying a composer by the number of notes that he writes
I've generally assumed that standardized evaluation is pushed by upper management to provide a baseline for the direct managers to follow. Sure, a manager should know which of his direct reports are capable and which aren't, but how can you compare one manager's "adequate" to another's? A few numbers won't get you a complete picture, but combine those numbers with the manager's opinions and you can better calibrate performance than you could with either evaluation on its own. A PHB will be a PHB no matter what you do, but if you can get some objectivity into his evaluations, you can get a more accurate picture of what kind of talent (or lack thereof) he has under him*.
*Again, the metrics would need to be taken with a grain of salt, but using feedback from good managers you'll have a better idea what the metrics mean, and how to make the metrics more helpful.
Productivity is just the inverse of the amount of time spent on slashdot, with additional points taken off for posting to slashdot and even more for submitting articles.
Voila! Productive!
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
Yeah, that's an upper management failure in my book. They should be connected well enough to know what is going on at least two levels down, not one (enabling them to check the calibration of their direct reports). If that's not possible, the branching factor is probably too high.
Answering the question "Who should I fire/promote" is not the only use for metrics.
You also have the following:
"How long will this take."
"How many people do we need."
"Do our employees need more training/reviews"
This is the guy who first said it.
Stop listening to the MBA and metrics nutjobs. Don't try to manage your people like the machines they operate.
I completely agree, that's why I qualified my statement with the if.
Bugs per dev/yr.
If a dev doesn't "produce enough usable code" then they need to be fired.
I worked on a CMMI-5 project. Bugs were much worse than lower productivity there. Moved out of government software and worked on a team with almost zero process but a huge personal responsibility process. Memory leaks were considered critical bugs on that team. **Any** issue was handled quickly by the individual. It was a point of pride.
Every software project since those two has probably been what most people see: crap code, no responsibility for bugs. Perception of the manager mattered more than anything else at raise time. If you looked stressed and said "cross platform thread safety" enough, people thought you were brilliant. That's how I became "employee of the year" at that job.
We had a VP come in from IBM, and the first thing he instituted was daily timesheets for developers. I wrote a script to fill them with random but reasonable-looking numbers, then started shopping my resume. Now I work for Google and say: Hooray for PHBs from IBM, I didn't know developers could be treated well.
All I ever heard while at Sun was the six-sigma blackbelt one cube over talking on the phone all day about how he was a six-sigma blackbelt, a bunch of guys lamenting the passing of the "good old days" when they'd have magicians and jugglers in to entertain the employees at lunch, and a few engineers badmouthing open source software quality.
One of these companies is still very successful. The other one was acquired by Oracle.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
If a company needs metrics to evaluate its employees, then the company is barely functioning at the communication, collaboration and management levels; interpersonal knowledge and cohesive employees always produce better results than a company that is myopically focused on the numbers.
Metrics should only be used to make projections about reimplementation of a project, i.e. the highest level of estimation. Having said that, if the person responsible for that estimation cannot make it off the top of their head (relies on metrics alone), then the company has bigger problems...
Where a metric makes developers competitive, to the point where they are working against each other rather than together, the metric is actually damaging the project; how many of us have worked on a project like that?
Any good metric requires each objective to be weighted based on size and difficulty, factored by the experience of the resource with the domain and code-base. That is a lot of hard work, and if the initial estimations are inaccurate or deliberately favour certain resources, it collapses like the house of cards that it is...
My advice: take metrics with a grain of salt, especially on the macro level; it's rare for the keystone developer to be the best performing in the eyes of the metric...
I have never worked at a larger software company where they feed programmer metrics into a program and expect to come up with "answers".
I'm not saying I am against Scrum, Agile, TDD, XP, Kanban, pair programming, whatever. It's just that if, as a manager/leader, you don't talk to your team over time or observe their efforts, and you have so little idea of who contributes what that you need to use a fracking program like that, please step down or at least split yourselves up into smaller teams.
My recent experience and a previous experience with large corporate software efforts showed me how counter-productive such metrics are:
I used to work for a computer company with over 100,000 employees which used six-sigma to measure programming effectiveness. So what the developers did was create a lot of programs from templates rather than creating common libraries, and put as little code on one line as possible. Getting the number of lines of code up was the goal, so that the errors per line of code would look better. Completely counterproductive at the development level.
Recently, at the management level, in preparation for a presentation to top management, metrics and results are simply manipulated at the tool reporting (Jira) level to get the results you need.
To get good quality software, you have to have good technical people managing the process and promote people based mostly on that ability, but that's not the way corporate America works.
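The arithmetic behind that gaming is simple enough to show. The numbers below are invented: the same logic with the same defects, once written compactly and once padded out with template copies and one-token-per-line formatting.

```python
def errors_per_kloc(errors, lines):
    """Defect density: errors per 1000 lines of code."""
    return 1000 * errors / lines


# Hypothetical module: identical behaviour and identical defect count.
defects = 12
compact_lines = 2000   # shared library, normal formatting
padded_lines = 6000    # template copies, code spread over as many lines as possible

compact_density = errors_per_kloc(defects, compact_lines)  # 6.0
padded_density = errors_per_kloc(defects, padded_lines)    # 2.0

print(f"compact: {compact_density:.1f} errors/KLOC")
print(f"padded:  {padded_density:.1f} errors/KLOC")
```

Nothing about the software improved, yet the padded version scores three times "better", and it is the harder of the two to maintain.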
There is only one metric: supply and demand.
:)
The HR/performance appraisal "techniques", "metrics" and "S.M.A.R.T." commitments/goals are there only to shield you from the truth: that salary/benefits/promotions are a function of market supply/demand, influence and leverage. It has nothing to do with merit. NOTHING.
The sooner you realize that, the less you will suffer and the more you will gain.
This is the sort of role IBM has stamped out for itself in the 22nd century: that of the BS'ing bureaucrat.
The irony of using an almost completely unstable Rational Software Architect product, heavily-burdened bloatware, to develop robust software was not lost on me.
They do have a kick-ass online help system, though, I gotta give 'em that. Bureaucracy can work in open-loop systems.
is Lines Of Code, obviously.
The article's author and many of the posters would benefit from reading Capers Jones' books on the subject of software metrics. Yes, the software industry does need them, and yes, metrics will prove a lot of developers' ideas on how to build software wrong. Right now the software industry is flailing around, spitting out new languages like monkeys typing at keyboards hoping to reproduce Shakespeare.
I think the only real "metric" is, at any given point:
"How much of what I said I could get done, did I get done?"
This applies at the highest level of the project, from the team to the entity it's being delivered to, all the way down to the lowest level, from the developer to his lead.
I think this is the most important aspect of so called agile development.
1. Estimate how long it will take to complete x features.
2. Review progress at a non-annoying/non-production killing interval.
3. Revise the next estimate based on how accurate you were on the previous estimate.
This has to be done on a person-by-person basis. Every person gets things done at his/her own rate. You can't have a manager estimate the task hoping (or trying to enforce) that the creative person will meet that expectation. You can't have the person pitching a client making the total project estimate; it has to "trickle up" from the people doing the work.
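The estimate/review/revise loop above, tracked per person, could be sketched like this. It assumes each developer's history is kept as (estimated days, actual days) pairs and that the correction is a simple ratio of totals; the function names, developer names and numbers are all made up for illustration.

```python
def accuracy_factor(history):
    """Ratio of actual to estimated time over a person's past tasks.

    history is a list of (estimated_days, actual_days) pairs.
    With no usable history, assume estimates are accurate (factor 1.0).
    """
    total_estimated = sum(est for est, _ in history)
    total_actual = sum(act for _, act in history)
    if total_estimated == 0:
        return 1.0
    return total_actual / total_estimated


def revised_estimate(raw_estimate_days, history):
    """Scale a developer's own raw estimate by their past accuracy."""
    return raw_estimate_days * accuracy_factor(history)


# Hypothetical track records, one per developer.
histories = {
    "dev_a": [(4, 6), (8, 12)],   # tends to underestimate by half
    "dev_b": [(4, 4), (6, 4)],    # tends to finish a bit early
}

plan_a = revised_estimate(6, histories["dev_a"])  # dev_a says 6 days
plan_b = revised_estimate(5, histories["dev_b"])  # dev_b says 5 days
print(f"dev_a: plan for {plan_a:.1f} days, dev_b: plan for {plan_b:.1f} days")
```

The raw number still comes from the person doing the work; the factor only corrects for how that person's estimates have turned out so far.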
I've seen all these mistakes made. The company salesman/owner/whatever tells the client we can do it in 30 days. The project managers try to push this deadline onto the developers. The developers then go WTF there's no way we can do it that fast... dev a over here is a good programmer but he's kinda slow, and dev b is really fast but he sometimes makes mistakes, and dev c is really great at solving problems in general but isn't a great programmer or fast.. etc.
You have to go by the word of your developers, not bean count on their "output". Yes, this puts the dreaded task of Estimation upon your developers, but it's better than the stress of being told how much they are expected to churn out. It's up to the managers to then efficiently assign tasks to the right people and track estimation accuracy etc.
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
The real danger of using such systems is that developers will game them, which happens with any process or system anyway. For example, if writing more lines is seen as higher productivity by the metrics and manager's interpretation, that will happen; which in turn may lead to lower software quality.
Welcome to 1982! You can tell your boss I said so. ;-)
Bill Atkinson, of early Apple fame, also "struggled" (too strong a term, really) with the lines-of-code metric. "He thought his goal was to write as small and fast a program as possible, and that the lines of code metric only encouraged writing sloppy, bloated, broken code," as the story goes.
www.folklore.org/StoryView.py?story=Negative_2000_Lines_Of_Code.txt
"Good news, everyone!"