Ask Slashdot: Good Metrics For a Small IT Team?
First time accepted submitter shibbyj writes "I'm a member of a small 3 person IT team for a medium sized business (approximately 300-350 employees) that has multiple locations internationally. I have been tasked with logging our performance using the statistics from our ticket management system. I've also been tasked with comparing these stats and determining if we are performing above or below what is considered optimal. I'm wondering what people's opinions are on what good metrics should be in regards to mttr mtbf etc. I have had trouble finding information on this."
One of you is getting fired
Didn't we just have a story about how metrics suck?
Time to answer call, time to resolve ticket, abandoned tickets (unresolved).
If you google a few of those it will bring some more, but that's a simple start.
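As a rough illustration of the three numbers named above, here is a minimal Python sketch. It assumes the ticket system can export an opened time, a first-response time, and a resolved time per ticket; the field names and sample data are invented for the example, not taken from any particular product. The mean time to resolve is roughly what people mean by MTTR when they apply it to tickets.

```python
from datetime import datetime
from statistics import mean

# Hypothetical ticket export; the field names and timestamps are invented.
tickets = [
    {"opened": datetime(2012, 5, 1, 9, 0),
     "first_response": datetime(2012, 5, 1, 9, 30),
     "resolved": datetime(2012, 5, 1, 11, 0)},
    {"opened": datetime(2012, 5, 1, 10, 0),
     "first_response": datetime(2012, 5, 1, 10, 10),
     "resolved": datetime(2012, 5, 2, 10, 0)},
    {"opened": datetime(2012, 5, 2, 8, 0),
     "first_response": datetime(2012, 5, 2, 8, 20),
     "resolved": None},  # still open: counted as unresolved
]


def hours_between(start, end):
    """Elapsed time between two datetimes, in hours."""
    return (end - start).total_seconds() / 3600


def summarize(tickets):
    """Return mean response time, mean resolution time, and unresolved count."""
    response = [hours_between(t["opened"], t["first_response"])
                for t in tickets if t["first_response"] is not None]
    resolution = [hours_between(t["opened"], t["resolved"])
                  for t in tickets if t["resolved"] is not None]
    return {
        "mean_hours_to_first_response": mean(response) if response else None,
        "mean_hours_to_resolve": mean(resolution) if resolution else None,
        "unresolved": sum(1 for t in tickets if t["resolved"] is None),
    }


summary = summarize(tickets)
print(summary)
```

Note that unresolved tickets are left out of the resolution average, so a team that abandons its hard tickets will look faster than it is. That is one reason to report the unresolved count next to the mean.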
The price is always right if someone else is paying.
"what good metrics should be in regards to mttr mtbf etc"
Easy, there are no good metrics. Metrics don't lead to improved business outcomes, they rarely cover enough variables to tell the whole story, so all they lead to is people gaming the metrics, most likely leading to worse business outcomes.
Metrics are favoured by lazy management.
Simple... if you have a 3 person IT team at a 300 employee company and your site / IT infrastructure isn't in nuclear meltdown you're probably doing well. Looks like they are going out of house for IT. Welcome to the cloud-future, where your job is dissolved for magic.
You're always below optimal, by definition of optimal.
Talk about flow, about bottlenecks. Visualize workflow. Look at Henrik Kniberg's paper on kanban as applied to IT Ops. My guess is that your ticketing system will provide low-value data on volumes and resolution time. Gear up and visualize the pipeline. Check http://www.infoq.com/minibooks/priming-kanban-jesper-boeg and turn the conversation around to "business value". Don't get wrapped in the ropes of volumetrics. /peace
There is no metrics system that can't be gamed.
If you set it for "total tickets fixed" (higher=good): you just encourage people to report trivial problems you can fix easily.
If you set it for "total tickets" (higher=bad): you refuse to do things, add features etc, or you make it hard to contact IT to log a fault.
If you set it for "time taken per ticket" (higher=bad): you end up pushing kludge solutions
If you set it for "user rated response" (higher=good): you end up blackmailing the end users to rate you 10/10 otherwise their emails/logs/dirt etc get published and sent to boss/wife/etc
Ask your manager how their performance is evaluated? Then start suggesting ways they could bust their KPIs, and they should get the drift.
Metrics. Excellent, I hate when bosses use the Imperial system.
All jokes aside: If you care about your job in this economic climate, I suggest you do what your 2 other teammates are doing - picking through the stats that make YOU look the best. The company isn't going to look out for you. IT is an expense to be cut, remember. Boosts the temporary bottom line, promotes "growth" in this fiscal quarter, gets the investors going so the CEO can shuffle another fold into his golden parachute. If non-important metrics are selected that sacrifice your job, it's a brief victory lap straight into the unemployment line.
We can't answer your question, though. In the end, I recommend you watch a clip from "Office Space" - wherein the Bobs interview the employees:
Bob: "So tell me, what is it, exactly, that you do here?"
If you can't answer that question, you probably should be job hunting already. Or should have kept a copy of the job posting from when you applied.
There's a spot in User Info for World of Warcraft account names? Really?
Is shit broken?
Does shit get fixed fairly quickly?
Are your people busy, but not swamped?
- Percentage of staff with ITILv3 foundations certification: zero.
I know this because of your question. Watch these videos as a start: http://pmit.pl/en/it-management/free-itil-v3-course-collection-of-itil-v3-moviesdarmowe-szkolenie-itil-v3-zbior-filmikow-o-itil-v3/ -- and sell a formal training course to your management.
The people joking about how one of you is getting fired, or you're all getting outsourced... probably true. Learning ITIL is all about learning what's important to your business stakeholders, how to monitor/measure these things, and how to make sure you're always making the right decisions based on the business priorities.
If you can't convince them to pony up for you three to take the certification course, then pay for it out of your own pocket, you'll need it to find a new job.
Seriously. Once management falls under the belief that they can have reports automagically generated for them by measuring your working habits, you will never hear the end of it. Why does X have fewer keystrokes per hour than Y? You only committed 15 lines of code yesterday. You must not be working. Why did this junior employee fix 100 bugs, when this senior only fixed 10? And take a look at yesterday's topic too.
It's not unusual for management to be clueless about what exactly it is that their IT staff does on a daily basis, nor is it unnatural that they should take an interest. Often, it's a good sign when they actually ask the guys doing the work what the metrics should be... it indicates some degree of trust, and they haven't simply read an "IT Management for Dummies" book over the weekend laying out some arbitrary system that isn't going to fit your organization.
As a more cynical commenter points out, it also provides the opportunity to create a measurement system that you can game to make you look good. But I think it isn't a terrible sign that the bosses care what their employees are up to. It may represent an opportunity to explain what you think is important that perhaps they hadn't considered previously.
No relation to Happy Monkey
Your ticketing system needs, or needs to have added, an automatic follow-up to the customers. The system sends out an email after every ticket asking "Did the problem get resolved in a reasonable amount of time? Did the IT staff respond in a way that enabled you to get back to work?" Nothing more complex than that, though you can parse things out by ticket priority (though deciding what's a higher priority than other things is, just by itself, a major undertaking).
Your goal should be to increase the percentage of positive responses.
Why this touchy-feely stuff instead of a hard metric? Easy. There are no metrics that work in your situation. It's quite easy to argue that there are no metrics that work, period.
By adding this email feedback to your ticketing system, you have met the requirement to come up with a metric derived from the ticketing system.
Selling this to management can be simple, depending on how you handle it. Something along the lines of "Given that the IT staff is so idiotically understaffed, we must be given the agility to solve problems instead of meeting random metrics. Only our customers can know if we met their needs, considering all pertinent factors. Someday, when we actually have enough people and money to divide work more rigidly, we can add metrics like timeliness of ticket closure, etc." Then you hope they never notice that you never divvy up the work rigidly. All of this requires having an IT manager who is dedicated to the inescapable truth - that their function is to keep the MBAs off your ass and let you do your job.
I've worked where my performance was measured in this way. It can be heaven.
One more thing - if your upper management doesn't already have faith in you, they'll never go for it. They need to already appreciate your contribution to go along with this. The very fact that they're asking for metrics tends to suggest they don't sufficiently appreciate you now. If that's the case, then all I can say is that I've worked under those circumstances, too, and my heart goes out to you.
> determining if we are performing above or below what is considered optimal
Scenario 1: you are below optimal -> you are inefficient so they replace you
Scenario 2: you are above optimal -> you are overkill so they replace you
Bottom line, I would rent The Wire and learn how to "juke the stats" because that's the only way you won't get to jump on that grenade.
Been there, done that - my advice: be just under optimal so you have room to grow and show improvement, but don't be too low so they don't feel the need to consider a business case for outsourcing.
lucm, indeed.
I'm not with most IT management on this one, but I always thought the best metric was customer satisfaction. For instance every time I open a ticket with Cisco I get a survey at the end of like 5 questions. Was my problem resolved, was the person polite, etc.
The other metrics suggested are things to graph and look at trends. Are repair times getting worse or better? Is the average time per ticket going up or down? They are great in the aggregate. They break down quickly when divided. Only one guy on your team knows network devices, so he gets all the network device tickets, which include the 8 hour fiber cuts, so his times are always worse than those of the guys fixing printer problems, as an example. You have to be very careful as you start to divide them up.
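To make the aggregate-versus-divided point concrete, here is a small Python sketch. The techs, categories, and hours are all invented; the only assumption is that each ticket carries a month, an assignee, a category, and a resolution time.

```python
from collections import defaultdict
from statistics import mean

# (month, tech, category, hours_to_resolve) -- every value here is invented.
tickets = [
    ("2012-04", "alice", "network", 8.0),
    ("2012-04", "bob", "printer", 1.0),
    ("2012-04", "bob", "printer", 0.5),
    ("2012-05", "alice", "network", 6.0),
    ("2012-05", "bob", "printer", 1.0),
    ("2012-05", "bob", "printer", 0.5),
]


def mean_hours_by(rows, key_index):
    """Average resolution hours, grouped by the column at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3])
    return {key: mean(values) for key, values in groups.items()}


monthly = mean_hours_by(tickets, 0)   # team-wide trend over time
per_tech = mean_hours_by(tickets, 1)  # misleading without the category mix

print(monthly)
print(per_tech)
```

In this made-up data the monthly team average improves from April to May, which is the kind of trend worth graphing. The per-tech split makes alice look far slower than bob, but only because she gets every network ticket.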
At the end of the day though you're trying to make the customer happy. Track it, and see how your staff is doing. If people are happy with their IT support, your department will be seen in a more positive light.
1. make your numbers.
nobody actually cares what 'the numbers' are, or if they actually mean anything. but you have to make them.
you might ask yourself - isn't this a huge waste of time? isn't it completely counter productive? doesn't it actually decrease efficiency? aren't the metrics measuring completely the wrong thing? as the slashdot story the other day said, aren't bad metrics actually worse than no metrics, because they cause people to do inane, wasteful things to make their numbers?
well, your problem is that you are asking yourself. in a corporate environment, do not ask. just do.
just make the numbers.
hopefully, if you get good enough at 'making your numbers', you will have time left over to actually do some work.
2. but what about the theory of capitalism, the free market, efficiency, etc?
it's all bullshit. just like the theory of communism was bullshit. what statistics and 'numbers' were reported to the government were just flat out garbage. people somehow managed to make the system work through personal relationships and working around the assholes in charge. but most of the theories it was built on have no resemblance to reality. think about it - if efficiency really made for the best corporation, why would you be spending 4 hours a week filling out meaningless statistical performance reports that nobody will ever read, let alone understand?
the only difference between the soviet union and 'the west' is that 'the west' hasn't collapsed yet.
The only truly meaningful metric is customer satisfaction. After each ticket is closed send a survey email to the user. If your team plows through enough tickets you get a statistically significant success % per tech that you can compare to the other techs.
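Whether the per-tech percentages are actually comparable depends on how many survey replies each tech has. One common way to show the uncertainty is a Wilson score interval around each success rate. The sketch below is only an illustration: the reply counts are invented, and treating overlapping intervals as "no clear difference" is a rough heuristic, not a formal test.

```python
from math import sqrt


def wilson_interval(positive, total, z=1.96):
    """Approximate 95% Wilson score interval for a success proportion."""
    if total == 0:
        return (0.0, 1.0)  # no replies: nothing can be said
    p = positive / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (center - half, center + half)


def overlaps(a, b):
    """True when two (low, high) intervals share any range."""
    return a[0] < b[1] and b[0] < a[1]


# Invented survey tallies: (positive replies, total replies) per tech.
tech_a = wilson_interval(18, 20)
tech_b = wilson_interval(16, 20)

print(tech_a, tech_b, overlaps(tech_a, tech_b))
```

With only 20 replies each, a 90% tech and an 80% tech end up with overlapping intervals, so the gap between them may be noise rather than a real difference in service.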
Without knowing a lot more about the nature of tickets it is hard to give a better response. It might be very important to gauge the difficulty (trivial-hard) and novelty (common-rare) of tickets but it could also be a waste of time. Does one tech plow through dozens of tickets a day or a few? Are the techs specialized? Anything email related goes to joe, hardware issues to tom, etc.. Are the tickets auto-generated from emails, called in or do they fill out a web form?
Given the size of the team the company thinks 1) you aren't getting the job done, 2) one or more people on the team are dead weight or 3) you are overworked and need more people. I wouldn't bet on #3.
Falling back to metrics is a lazy manager's way of proving to her superiors that her drones are operating at peak efficiency. The most lazy of all will rely on utterly meaningless metrics such as the number of help tickets closed per day, per individual per day, etc. A metric such as this is completely useless as all tickets don't require an equal amount of effort to complete. Diagnosing a problem due to an intermittent hardware issue doesn't take the same amount of effort as helping a user change their password. Unfortunately these types of issues generally comprise the vast majority of tickets generated and therefore often end up being the ones that are 'measured.' This often leads to a drop in morale and thereby negatively impacts performance; ironically the opposite of what the whole exercise is attempting to accomplish.
Trouble ticket data is primarily useful for detecting trends, thereby helping an IT team appropriately focus their human capital on issues that will enable their users to be more efficient. Going back to the password issue above, the speed and alacrity with which the IT staff help users change their passwords isn't a useful metric at all. A more meaningful metric would be the frequency of password change requests before and after the installation of a self-service password reset solution that was put in place in response to the analysis of help ticket data that showed that this was one of the most frequent issues and one that could be easily solved with little effort and financial expenditure. Measuring a sharp drop in password reset requests would show that the solution worked and was therefore beneficial to the organization by enabling users to help themselves, resulting in their having more time to concentrate on their primary tasks, and also by allowing IT staff to allocate their resources on issues that are less amenable to resolution via automation.
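The before-and-after comparison described above can be sketched in a few lines of Python. Everything here is hypothetical: the rollout date, the four-week windows, and the request dates are invented, and a real analysis would pull the dates of password-reset tickets from the ticket system.

```python
from datetime import date, timedelta


def weekly_rate(request_dates, start, end):
    """Requests per week among dates falling in the half-open range [start, end)."""
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be after start")
    count = sum(1 for d in request_dates if start <= d < end)
    return count / (days / 7)


# Invented data: self-service reset goes live on the rollout date.
rollout = date(2012, 1, 29)
window = timedelta(days=28)

# 20 reset tickets in the four weeks before rollout, 4 in the four weeks after.
before_dates = [rollout - timedelta(days=1 + i) for i in range(20)]
after_dates = [rollout + timedelta(days=7 * i) for i in range(4)]
all_dates = before_dates + after_dates

before = weekly_rate(all_dates, rollout - window, rollout)
after = weekly_rate(all_dates, rollout, rollout + window)
change_pct = (after - before) / before * 100

print(before, after, change_pct)
```

Equal-length windows on either side of the rollout keep the comparison fair. Seasonal effects, such as everyone forgetting their password after a holiday, would still need to be considered separately.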
Unfortunately, in my experience, ticket systems get used to determine useless metrics such as the first example mentioned above, and therefore end up being the bane of IT staff, rather than a useful analytical tool.
Seriously I just have to say that this is the single funniest comment I've ever read on Slashdot. Laughing, pointing at the screen, drug my wife over here to have her read it funny. Brutal. Absolutely brutal.
From one cynical bastard to another, I salute you.
Weaselmancer
rediculous.
> Time to answer call, time to resolve ticket, abandoned tickets (unresolved).
In business school it is a common theme in various classes that you get what you reward, not what you ask for, not what is necessarily best for the organization. Here is a highly relevant Dilbert cartoon illustrating this point, http://dilbert.com/strips/comic/1995-11-13/.
The underlying problem is that metrics applied to humans lead to people working towards the metrics, not necessarily doing good work. It is a classic environment for unintended consequences. It's not even that the people are necessarily being opportunistic, there is also a certain amount of practicality. If you are being measured by some metric and keeping your job or getting a raise is dependent upon that metric you may quite rationally decide to act to that metric rather than what is necessarily in the best interest of customers.
Are you measured by resolved tickets? Then tickets will get resolved quickly. Not necessarily thoroughly, completely, or robustly resolved. Which leads to related followup tickets because of a minimal effort put into resolving the original ticket. I saw this in a programming environment where the tickets consisted of new features or bug fixes.
Are you measured by abandoned tickets? Then tickets will get resolved, even if they don't reasonably deserve to be considered resolved. You will get things unnecessarily classified as "unable to duplicate", "insufficient information", etc.
In these two examples, where is the difficulty of the task factored in? Not all tasks, or tickets, are equivalent. Furthermore, sometimes there are external dependencies, such as a part being shipped. Where is this factored in?
The metrics you offer are reminiscent of stats from call centers. There such metrics are a little more reasonable, not perfect but perhaps OK, given that the calls are somewhat equivalent in the amount of effort required, a small number of minutes not hours, and that they are randomly assigned. Over the period of say a month the large number of calls handled by any operator will resemble a normal curve with respect to effort required. For an IT organization the evaluation period may need to be some number of years to get to a normal curve with respect to effort required.
I got written up once because my ticket stats were radically different than the other people on my team. 15% lower "total time on tickets" but 20% more tickets closed. I was apparently fudging numbers and closing unresolved tickets.
Fortunately, a trip to HR with a ream of printouts from closed tickets proved otherwise.
Still left the company a few months later.
Might as well close the comments now. :-)
Go look up Robert Austin's book on measurements and management. Read it and recognize that you've been given a task that is at best counterproductive and at worst impossible. Dust off your resume, because it may be more than one of you that are getting fired. ..bruce..
Bruce F. Webster (brucefwebster.com)
Well, when they say small, I'm the sole IT worker and thus the IT Manager at a 4 server, 40 workstation business so my only metrics are: I didn't make them spend a bunch of money, nothing lit on fire, I didn't quit. That's seriously about it and this quarter, I got all but the middle one but I wasn't the one who ordered that HP workstation, nor would I have, so it's sort of a gray area lol.
Touch up your resume, go tell your boss to get bent, pound sand, etc. and look for a new place to work. Anyone who needs metrics on a three person team deserves anything they get, up to and including a swift kick in the ass. If the manager can't figure it out on his or her own, they should be the one being sent out the door with boxes.
"My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
Before you do anything else, read Robert Austin's book, "Measuring and Managing Performance in Organizations".
The points I got from the book:
1) Measuring the wrong thing or in the wrong way makes things much, much worse.
2) Good measurements are possible but take a lot of hard work.
3) Measuring things that are easy to measure is almost certainly wrong.
I also endorse BenEnglishAtHome's comment timestamped 8:55pm.
I manage a department roughly the same size for a company about the same size.
I use two main metrics to see how we're doing:
1. The IT budget should be about 3% of the total company revenue. That's pretty average for companies of this size. If you're significantly under or significantly over, you need to do some soul searching to figure out why.
2. Every year we put together a survey and ask the exact same questions. It's going on 5 years now so we can compare our performance year over year. We ask about 20 questions and score them on a scale of 1-5. Things like 'how's training?' to 'how well does your cell phone work?'
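Both of the metrics above are simple arithmetic. The following Python sketch shows one way to compute them, with invented revenue, budget, and survey scores; the question names are placeholders, and the roughly 3% benchmark is the figure stated above, not something the code checks.

```python
from statistics import mean


def budget_share(it_budget, revenue):
    """IT budget as a fraction of total company revenue."""
    return it_budget / revenue


def question_means(responses):
    """Average 1-5 score for each survey question."""
    return {question: mean(scores) for question, scores in responses.items()}


def year_over_year(previous, current):
    """Change in the average score for every question asked in both years."""
    prev, cur = question_means(previous), question_means(current)
    return {q: cur[q] - prev[q] for q in prev if q in cur}


# Invented figures for illustration only.
share = budget_share(300_000, 10_000_000)

survey_2011 = {"training": [3, 4, 3], "cell phone": [2, 3, 2]}
survey_2012 = {"training": [4, 4, 4], "cell phone": [2, 2, 1]}
changes = year_over_year(survey_2011, survey_2012)

print(f"IT budget share: {share:.1%}")
print(changes)
```

Asking the exact same questions every year is what makes the subtraction meaningful. A question that was reworded or dropped simply falls out of the comparison here.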
Counting trouble tickets is mostly a worthless exercise. Although, you can manipulate it to your advantage. Start closing 120 tickets this month, 140 next month, etc. When you get to 240 tickets a month you can take that graph to upper management and say, "we're working twice as hard as we were 7 months ago and need to hire someone."
In the end, you have two ways to view this: as a bullshit exercise (and possibly an excuse to fire someone as others have said) or as a way to attempt to objectively evaluate your department.
----- obSig
There are a number of ways you can do this:
1) For the next few weeks, only deal with issues in the ticketing system that can be resolved quickly. This shows how responsive you are on the "count of problems solved" and "time to resolution".
2) Always upgrade easy problems to "Extremely Urgent", so that they get picked up first (as per above).
3) Do NOT under any circumstances touch a complicated problem that requires consideration or actual work. Find someone to outsource it to. Then blame the outsourcing costs and lack of efficiency (obviously they do not have the same fast response time as you) for the problem.
Seriously: In a 3 man team, you and your manager should KNOW who is working and who is on Facebook all day. If you are all working hard, then it is not time to add more pressure by introducing metrics, it is time to hire more help. If on the other hand you are all on Facebook all day - well - then good luck to you in your new job at Walmart...
Meus subcriptio est nocens Latin quoniam bardus populus reputo is sanus callidus
Twice you used the phrase "been tasked with". This is not the English you learned in school. Is your wife 'tasked with' washing the dishes?
There is a certain kind of lizard who embraces corporate speak, perhaps to kiss the ass of management or impress co-workers. This is a soulless sort of individual who is doomed to a lifetime of servitude.
If you reclaim your dignity, speak correctly and stand up straight you will either be respected or fired. Either way you will have found a better path in life.
...omphaloskepsis often...