The Four Fallacies of IT Metrics
snydeq writes "Advice Line's Bob Lewis discusses an all-too-familiar IT mistake: the use of incidents resolved per analyst per week as a metric for assessing help-desk performance. 'If you managed the help desk in question or worked on it as an analyst, would you resist the temptation to ask every friend you had in the business to call in on a regular basis with easy-to-fix problems? Maybe you would. I'm guessing that if you resisted the temptation, not only would you be the exception, but you'd be the exception most likely to be included in the next round of layoffs,' Lewis writes. 'The fact of the matter is it's a lot easier to get metrics wrong than right, and the damage done from getting them wrong usually exceeds the potential benefit from getting them right.' In other words, when it comes to IT metrics, you get what you measure — that's the risk you take."
It's bad business planning, but it's also the way any big-name Linux distro works. Something not working on your Red Hat Linux? No problem, call us! And that's how they make money. They make money on the promise of fixing problems, and that includes saying that their OS is broken.
Losers realize this simple fact, instantly think of several ways to game the metric, then don't do it figuring that "obviously" the decisionmakers realize the metric is horribly broken. Then they get laid off. Winners spend hours, days, or weeks coming up with one way to game the metric, pat themselves on the back for being so clever, and do it. Then they get promoted, eventually to a position where they come up with metrics of their own.
I am going to get a little mean here, but if a company is doing this they are looking to outsource you because they don't understand.
So fuck em and dump em.
Anyone worth their salt will look at downtime, stability, and resolutions before they look at resolution time.
This problem was aptly portrayed in the classic Dilbert comic strip in 1995.
I'm going to code myself a minivan.
"No matter where you go, there you are." -- Buckaroo Banzai
Such metrics also disincentivize people from taking proactive steps to reduce the number of incoming tickets (i.e. making the system/environment more robust or your users more educated), and disincentivize managers from doing so, since that reduces the number of people needed to service incoming tickets (thus reducing the size of the empire and the pay grade of the manager).
I've seen both "disincentives" in action. It ain't pretty.
Everybody gets what the majority deserves.
Winners understand that tech support is a stepping stone and treat it as such. Which means that they move up as soon as possible.
Tech support managers are under pressure to keep their costs down. So unless you're okay with working for less money than the others there (but still solving as many problems / answering as many calls) you will be replaced with a new, cheaper person as soon as they can find one.
The metrics are just there to justify replacing you.
that it always sounds like a good idea when you're thinking of implementing it and few people go beyond the "this sounds like a good idea" phase to the "how can I game the metric I just thought up?" phase.
Metrics are great for some things. For making sure that your employees are working they are terrible. I used to work in a metric free environment and there was a great team atmosphere. Then metrics came along and it all went to hell. Now everyone is so focussed on making their numbers look good that the whole organisation is suffering from a weird sense of internal competitiveness. People no longer collaborate on difficult problems because there is no measure within the metrics system to reflect that this occurred. People who used to be innovative are no longer so, because they are not rewarded for spending time innovating. It has achieved nothing good that I can see.
..but I'm not so keen on /.'s article description here. "...the use of incidents resolved per analyst per week as a metric for assessing help-desk performance..." Having worked in this area for decades, I can tell you that I can't think of a single IT support org that uses this as a metric. It's a straw man, of which there are many when it comes to metrics.
The three most common metrics are:
Cost per incident
Customer Satisfaction
Resolution on First Contact (sometimes FC is defined as 'resolved at/within tier 1, even if it means')
There are usually two more, but those tend to vary on your business and priorities, if you have SLAs/OLAs, and what service channels you offer.
Average speed of answer/Time to Respond to Client is usually next.
Average Time to Resolution sometimes.
People sometimes care about Abandon Rate, but only within the context of the customer satisfaction metric. A nice place may poll for employee satisfaction. A nicer place does it more than 1-2/year. I've never even seen 'resolved/analyst/week' come up in discussions, forums or books going back to the early 90s.
And seriously - NOBODY running anything but a penny-ante 100 call/week call center would ever try to regularly cook the stats by having friends and family call in to boost the customer contacts. It's too much work for too little bang, and it's too easily caught. Any place with a real ACD system will eventually notice that a not-insignificant number of calls/emails are coming from the same 10 addresses/numbers. It's just not worth it. The description implies the exact opposite. If you don't have a real ACD system and real incident-management/ticket-tracking software, you're not really measuring anything anyway and you're probably working at a place that's not complicated enough to care about metrics in the first place.
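To make the "same 10 addresses/numbers" point concrete, here is a minimal sketch of the kind of check a reporting tool could run over a contact log. The function name, the phone numbers and the log itself are all made up for illustration; a real ACD system would have its own reports for this.

```python
from collections import Counter

def contact_concentration(contacts, top_n=10):
    """Return the top_n most frequent sources and their share of all contacts."""
    counts = Counter(contacts)
    top = counts.most_common(top_n)
    total = len(contacts)
    share = sum(n for _, n in top) / total if total else 0.0
    return top, share

# Hypothetical log: 8 ordinary callers, plus two "friends" who call 6 times each.
call_log = [f"555-01{i:02d}" for i in range(8)] + ["555-0199"] * 6 + ["555-0198"] * 6

top_sources, top_share = contact_concentration(call_log, top_n=2)
print(top_sources)
print(f"{top_share:.0%} of contacts come from the top 2 sources")
```

A high share from a handful of sources is only a flag to look closer, not proof of anything; a busy branch office would look the same.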
Metrics work if you are comparing two workers on an assembly line doing the exact same work - you can compare their widgets built-per-hour rate (offset by any QA problems).
But when you're dealing with a helpdesk team, the work is no longer homogeneous. The more senior helpdesk person usually gets the hard problems... and he spends more time mentoring his peers (at least he'll do that in a well run team). But tell him that his time-to-resolve metric will determine his bonus and suddenly he'll focus on solving tickets as quickly as possible and instead of volunteering to track down that intermittent printing problem reported by the finance team, he'll leave that for his cohorts and instead will jump on the fast easy tickets.
The quote above is from Jerry Weinberg, and it is true.
There's an entire brilliant, short book about this problem: Measuring and Managing Performance in Organizations by Robert Austin (1996). It's actually a fairly rigorous, somewhat philosophical work, but it is pretty unrelenting in documenting that, indeed, trying to manage by metrics almost always introduces distortions, which in turn are almost always counter-productive. The problem isn't just with IT, it's with any type of effort that seeks to reward or punish based on metrics.
The only metrics that I've found actually useful in IT are those that are predictive -- for example, helping to estimate the actual delivery date of a project under development. The metrics that seek to somehow measure "accomplishments to date" solely for the purpose of reward or punishment are always gamed and are almost always useless. ..bruce..
Bruce F. Webster (brucefwebster.com)
Even non-tech call centers suffer from this, and that's why so many times you get people who don't care and are quick to get you off the phone.
That's an example of the Dunning-Kruger effect.
http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
Anyone can design a metric that they themselves cannot figure out how to game.
Good. Glad to see that some VP did the smart thing for once and cut the middle managers instead of the people who actually get the work done.
It's not just the losers. Talented and rational technicians and engineers bend to the rules of the system too. Basically you get what you incentivize, what you reward. If you reward people for complying with some metric then they will generally comply. It does not matter what everyone agrees is right, and it does not matter if management says quality is important. If the metric decides whether you get to keep your job or get that raise, then the metric is what the company gets, regardless of what the company asks for or whether the company's goals are actually advanced.
In other jobs, poor metrics can do and have done worse, and unions are a big help in fixing bad metrics.
I saw an old post here, or it was linked from here, I think about a glass factory where one shift was doing more units than the others but the quality was slipping, and management was pushing the other shifts to do more. So the union pushed that one shift to slow down. In the end they did slow down and quality went up. There is more to it but I don't remember the full post.
Also I think there was a forklift job where people were bypassing safety locks to hit metrics.
Now maybe an IT union could step in and push back on poor metrics, so that people can do a better job and not just do what's needed to hit the numbers.
QA at our company would be rated better or worse by the number of bugs we filed that got fixed. So it made more sense to focus on small issues (e.g. typo in some text) than bigger issues (e.g. data corruption in a stress environment).
I could file 5-10 easy bugs that get fixed in a day, while someone else suffers trying to explain how to reproduce the data corruption issue.
This wasn't the absolute deciding factor keeping your job, but it had quite some weight to it.
I also wrote the report that the managers used to read this data, but before that I wonder what factors they used to judge performance.
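The QA example above is easy to put in numbers. A minimal sketch, with completely made-up severity weights and bug lists (nothing here comes from the poster's actual report), comparing a raw "bugs fixed" count with a severity-weighted score:

```python
# Hypothetical severity weights; picking real values would be a judgment call.
SEVERITY_WEIGHT = {"cosmetic": 1, "major": 5, "data_loss": 20}

def raw_count(fixed_bugs):
    """The metric as described: every fixed bug counts the same."""
    return len(fixed_bugs)

def weighted_score(fixed_bugs):
    """Alternative: each fixed bug counts according to its severity."""
    return sum(SEVERITY_WEIGHT[severity] for severity in fixed_bugs)

tester_a = ["cosmetic"] * 8   # eight typo-level bugs, all fixed
tester_b = ["data_loss"]      # one hard-to-reproduce data corruption bug, fixed

print(raw_count(tester_a), raw_count(tester_b))            # 8 1
print(weighted_score(tester_a), weighted_score(tester_b))  # 8 20
```

Weighting only moves the problem: once severity feeds the score, people have a reason to argue every typo up to "major".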
It's not just a problem in IT. It's a problem anywhere that managers are out of their depth. When they are very badly out of their depth poor attempts to observe what is going on can have very bad effects on an operation.
Some years ago I worked for a few months at a steelworks, and it was managed as badly as the most rabid Libertarians imagine that the worst of government is run. Management never saw the operation or the city it was in. They just saw numbers, and the most important number graphed on noticeboards everywhere around the plant was "tonnes of steel per man hour".
Now the hours were only counted for permanent employees, so contractors were shuffled in and out by the thousands to skew that number. Since they were not employees and were theoretically off the books there was no training for those that came in from outside of the industry, which of course given the number of people involved and the nature of the workplace resulted in some very serious accidents with multiple deaths. Quality suffered from untrained staff and a desire to increase the tonnage above all else. Revenue went down because a lot of material had to be sold as a lower grade of steel, in addition to increased amounts of scrap which still counted in that magical number even though it had to be remelted. Nobody on site had the authority to make any major changes and any reports beyond the interesting numbers were ignored.
It went from being a profitable operation to almost completely shut down within two years - of over 16,000 employees only 300 remained to operate a small rod rolling mill that could get steel shipped in from elsewhere. The losses exposed the company to a takeover bid and they are now owned by Swiss Bankers that have some odd remote control management quirks of their own (which has created a billionaire that picked up one of their discarded operations for just about nothing).
Performance metrics are just a simple model and you have to make sure that model actually fits the situation. Trying to change reality to fit an inappropriate model can result in the opposite to what is intended.
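The steelworks number is a ratio, and the gaming happened in both halves of it. A small sketch with invented figures (the real tonnages and hours are not in the post) showing how leaving contractor hours out of the denominator and leaving scrap in the numerator flatters the result:

```python
def tonnes_per_man_hour(tonnes, hours):
    return tonnes / hours

# All figures below are hypothetical, chosen only to show the effect.
tonnes_poured = 10_000      # everything that came out, scrap included
scrap_tonnes = 2_000        # had to be remelted, but still counted
employee_hours = 40_000     # the only hours the metric counted
contractor_hours = 60_000   # "off the books"

reported = tonnes_per_man_hour(tonnes_poured, employee_hours)
all_hours = tonnes_per_man_hour(tonnes_poured, employee_hours + contractor_hours)
saleable = tonnes_per_man_hour(tonnes_poured - scrap_tonnes,
                               employee_hours + contractor_hours)

print(f"reported on the noticeboard: {reported:.2f}")
print(f"counting contractor hours:   {all_hours:.2f}")
print(f"and excluding scrap:         {saleable:.2f}")
```

With these made-up inputs the noticeboard figure is roughly three times the figure that counts every hour and only saleable steel.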
Good. Glad to see that some VP did the smart thing for once and cut the middle managers instead of the people who actually get the work done.
It is deliciously ironic that you would take a swipe at "middle managers" in this conversation about metrics.
The only way to eliminate middle management is for upper management to utilize metrics in order to evaluate lower management. There is no time for hands-on management and evaluation with a keen eye in one of these vaunted "flat organizations" with no middle management. And so lower management quickly realizes that their jobs and bonuses depend on the metric, rather than on quality or long-range action.
After that, the company is humped... but by then, the "aggressive VP" who wiped out middle management has collected his bonus and moved on.
FATMOUSE + YOU = FATMOUSE
Stats by themselves will only ever be an indicator of what is happening. You really need trustworthy managers on the ground to give you feedback on how things are actually going.
Taking humans out of the loop when rating other humans is always a mistake
It said "windows 98 or better" so I installed Linux
...the young king's heroic character is shown by his decision to wander around the English camp at night, in disguise, so as to comfort his soldiers and determine what they really think of him...
I worked as an engineer at a company that professed to be "agile" (the quotes are because really, not so much). They started judging performance by "cards closed per week".
You'd be amazed at the number of cards that will be created and closed under those conditions. Our productivity *soared* (according to the graph that showed productivity as a measurement of cards closed per week ... ).
What, are you getting paid by the post? :-)
On Raymond Chen's site (The Old New Thing), a sporting goods store wanted to increase the upsells of "shoe protecting" sprays. They offered the staff a kickback since they had a ridiculously high margin on the spray (like 80%). Anyway, the salespeople were allowed to use discounts at their discretion and the store had coupons frequently. So ... smart employees gave away the spray and used coupons or "discounts" to make up the difference so they'd get the kickback. In general, any behavior you offer a reward for that isn't exactly what you want as a company will result in you getting what you are incentivizing, regardless of whether or not you get what you want out of the deal. The only solution: tie the reward directly to what you want, e.g. if you want more profit for the company then give profit sharing. You still have to work hard to remove disincentives, i.e. crappy employees that make the goal unachievable and so make even your good employees just take it easy since there is no reason to put in the extra effort, but at least you tie your employees' incentives to your organization-wide goals.
Be warned: my example is way off topic, but a pet statistic I keep track of.
There is no such thing as bad statistics, only bad layman statisticians who don't understand what the numbers actually measure.
Take lines of code, for example. Some people hate it because you can bloat the numbers by adding comments, neglecting to consider how useful those comments are for future maintenance, and thereby a useful application of a developer's time. If you use a consistent formatting style for two projects, you can get a fair grasp of their complexity from the line count, though that will gloss over details about how the code actually works.
The most interesting pattern I've noticed in line counts over the years is that the use of templates and other code abstraction facilities really hasn't decreased the size of code much at all, though it's improved readability, maintainability, and programmer API usability substantially. So line counts only give you an approximation of complexity with a language like Java, but do nothing to measure the quality of the code.
One other thing I've found is that complex code looks fat and heavy from its sheer size, but often compiles to very reasonable executable size and runs rings around supposedly "tight" code that makes heavy use of dynamic techniques like introspection. As only one image of an executable is loaded by a reasonably competent OS, a fat binary does not mean a fat application at runtime.
Big code is only scary if it's not following recognizable patterns and is instead a mishmash of different developers' pet syntax, algorithms, style conventions, naming conventions, and even preferred APIs. If you manufacture it predictably, fat source code becomes a joy to maintain, enhance, and use.
But back to the core topic: help desk performance.
The only help desk stat I care about is a low number on customer complaint reports about the quality of information and assistance provided by the tech team. If it's my company and my budget, I'd rather hire more technicians to handle the load and produce happy customers in the end than save money by overworking them and burning them out over useless numbers like "calls handled per week."
In the end, if you care about your business, the only thing that truly matters is happy customers who want more services or products in the future, and who will gladly tell others about their good experiences in dealing with you.
There is no substitute for a good word-of-mouth reputation and repeat business. No one ever got fired for buying IBM not because they're perfect, but because their people will go the extra mile to make things work.
I do not fail; I succeed at finding out what does not work.
It didn't say that our metrics are wrong but warns that not understanding them is very dangerous. You need to understand the value of what you're measuring and what your goals are.
Let's take the metric of issues resolved for an IT department.
Agent 1 - Issue 1 - Faulty keys on keyboard, Replaced keyboard - Resolved
Issue 2 - Faulty monitor, reconnected loose cable - Resolved
Issue 3 - User locked out of account, restored account - Resolved
Agent 2 - Inventory database inaccessible, troubleshoot servers, network connections, software, data corrupted, restored data from backup - Resolved
By using a per-issue metric you have granted equal value to any and all issues, so even though agent 2 is working on something much more important that takes much longer to fix, by the metrics defined, agent 1 has done 3 times the work of agent 2. Now if you base your rewards on this faulty metric, agent 1 receives a bonus and agent 2 gets laid off, despite the fact that agent 1 couldn't do what agent 2 did.
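The agent 1 / agent 2 comparison above can be sketched in a few lines. The issues are the ones listed; the hours per issue are my own guesses, added only so there is a second number to compare against the raw count.

```python
from collections import defaultdict

# (agent, issue, hours spent). The hours are hypothetical.
tickets = [
    ("Agent 1", "Faulty keys on keyboard", 0.5),
    ("Agent 1", "Faulty monitor", 0.25),
    ("Agent 1", "User locked out of account", 0.25),
    ("Agent 2", "Inventory database inaccessible", 12.0),
]

def summarize(tickets):
    """Return issues resolved per agent and hours worked per agent."""
    resolved = defaultdict(int)
    hours = defaultdict(float)
    for agent, _issue, spent in tickets:
        resolved[agent] += 1
        hours[agent] += spent
    return dict(resolved), dict(hours)

resolved, hours = summarize(tickets)
print(resolved)  # {'Agent 1': 3, 'Agent 2': 1}
print(hours)     # {'Agent 1': 1.0, 'Agent 2': 12.0}
```

By count agent 1 is ahead three to one; by hours it is the other way round, and neither number says anything about how much the inventory database mattered to the business.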
You also need to understand what goal you are trying to accomplish. If your only goal is to increase issues resolved then you can make that number go up, but people will find the fastest and easiest way to do so. On the surface that sounds great, but in reality everything else suffers because you have declared it unimportant. Costs go way up because parts are replaced instead of being fixed. Productivity goes down because problems get bandages and are declared resolved when everyone knows the problem will recur again and again. Great for IT because they get a ton more easy resolved issues, but it sucks for everyone who is trying to use those resources.
It should be remembered that efficiency and effectiveness generally are unrelated.
Efficiency is something that can be measured: responses to calls, forms processed, etc, the sort of thing you can count. It's pretty easy to do this sort of thing, and often the PHBs will take some metric and use it as a measure of activity. Because of this, one often sees things like performance indicators, and the process, and often salary, becomes connected to the indicator. The industry stops being what it is and starts producing 'red beans' for the bean counters. The indicator changes, and one produces blue beans.
Effect is about getting the right job done, both for the customer and for the system. It's not even about what the customer wants, since this supposes that it is the role of the customer to diagnose the problem and the solution, and simply ask for the solution to happen. One needs to think of what happened with the system that responded to Hurricane Katrina in New Orleans, where the response was based on customer wants rather than on pre-assessment by those who should have done it. A call for help is an indicator of a problem, not a proposed solution.
Of course, even though an indicator might be proportional to effect in the wild, when it is proportional to money, the indicator becomes more important than the effect. A doctor, who might have an indicator on consultations, will split several illnesses across several consultations. On a help desk, one is more intent on creating calls than on providing effect. A call that raises three problems would be terminated at the first, with new calls needed for the second and third. Also, the process might be extended to several calls to create extra indicator traffic.
In the main, help desk traffic is not a really good indicator of effect, since there are things that affect it. Response time, time to fix, etc, all serve to alter traffic; in some cases, a problem might be better served by the section guru rather than the help desk. The effectiveness of the guru's solutions may well impede the help desk's overall issues, since it might make matters worse.
One should also note that recording the help calls is also an impediment. It serves no effect, and in many cases might take as much time as the call itself. One might answer, say, 90% of the calls first up, yet spend more than 50% of the time making the necessary beans for the counter. A good deal of issues can be condensed into a few batch files (yes, I did this: system configuration is a good candidate for script files), so that while the call is terminated relatively fast, the actual recording might be tedious.
My experience of help desk is that Microsoft programs in particular (eg Word, Access, Windows) use common names, which makes them very hard to grep for in the system. This reduces the effectiveness of any sort of 'search the job tables' for help. To this end, I used Wart, Abcess, Windoze, much to the annoyance of the PHBs.
OS/2 - because choice is a terrible thing to waste.
At least give credit where it belongs for the idea that that is a bad metric:
-2000 lines of code
http://www.folklore.org/StoryView.py?story=Negative_2000_Lines_Of_Code.txt
-- Terry
"Seriously? That's the only way to evaluate? You can't think of a single other way?"
I take umbrage with this sort of response. Your whole purpose is trying to make the parent look stupid, without any sort of constructive input of your own.
By all means, if you know how to perform the job of 10 middle managers with just one top-level manager without using metrics, please tell us; otherwise I'm calling your bluff. The parent poster's whole point is that you need hands-on management to do proper evaluations. They need to observe you, talk to you, critically evaluate your work, etc. A single top-level manager managing 100 employees can't do that, and if you try to ask the other employees, most decent ones tend to not want to sell out their co-workers.
Metrics were invented for this exact purpose. But a number based on some work stats can't replace critical evaluation.
I worked in a helpdesk many years ago where we were all measured on the number of calls per week we closed. No consideration was given to the complexity of the call.
Our boss at the time started giving a $100 incentive to whoever closed the most calls. One of the guys in there consistently got the prize. One day, while looking up a call I fat fingered a digit and found myself looking at one of his tickets... it was a ticket, opened and closed, about receiving a phone call from X. $ticketnum +1 was the actual ticket for X.
In a nutshell, with some sorting/filtering I saw that the guy was not only gaming the system, but hiding the fact that he was grossly incompetent. I wrote everything up and showed it to our boss. Needless to say, he was less than happy not only with this guy, but with me. He was being pushed by his boss to generate metrics and basically was complicit.
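For what it's worth, the sorting/filtering described above might look something like the sketch below. The ticket fields, the two-minute threshold and the sample data are all assumptions for illustration; the post doesn't say what the ticketing system stored or how the search was actually done.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    number: int
    analyst: str
    caller: str
    minutes_open: float

def suspicious_pairs(tickets, max_minutes=2.0):
    """Find a near-instantly closed ticket followed by ticket number + 1
    from the same analyst for the same caller."""
    by_number = {t.number: t for t in tickets}
    pairs = []
    for t in sorted(tickets, key=lambda t: t.number):
        nxt = by_number.get(t.number + 1)
        if (nxt is not None
                and t.minutes_open <= max_minutes
                and nxt.analyst == t.analyst
                and nxt.caller == t.caller):
            pairs.append((t.number, nxt.number))
    return pairs

# Hypothetical ticket data.
tickets = [
    Ticket(1001, "bob", "X", 1.0),     # "received a phone call from X"
    Ticket(1002, "bob", "X", 45.0),    # the actual ticket for X
    Ticket(1003, "alice", "Y", 30.0),
    Ticket(1004, "bob", "Z", 1.5),
    Ticket(1005, "bob", "Z", 20.0),
]

pairs = suspicious_pairs(tickets)
print(pairs)  # [(1001, 1002), (1004, 1005)]
```

A pair like this is only a reason to open the tickets and read them, which is what the poster ended up doing by hand.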
Long story short, I went to his boss's boss, i.e. the CIO, and voiced my frustration. I pointed out the fallacy of this metric: me imaging a laptop (which back then took hours) and answering the phone both counted as the same measure of productivity, which made the metric useless. Not to mention the fact that it provided zero incentive to provide better support, just incentive to close tickets.
Obviously, this caused some huge changes. Not the least of which was a much more comprehensive analysis of what people were actually doing. This made quite a few people unhappy because it exposed them for being the incompetent hacks they were. Not the least of whom were my boss at the time and that employee.
Yes Francis, the world has gone crazy.