Ideal, and Actual, IT Performance Metrics?
An anonymous reader writes "Recently it was revealed that our company measures IT performance by the time it takes to close trouble tickets. I consider IT's primary goal to be as transparent to the user as possible, thus this metric was rather troubling to me. Shouldn't we be focused on reducing calls, rather than simply closing them quickly?
My question is: How is your IT performance measured, and how do you think it should be measured?"
I think the poster has a point.
A nice metric might be the count of tickets that are never opened.
An IT-department, IMHO, should be working on making itself obsolete.
I consider IT's primary goal to be as transparent to the user as possible, thus this metric was rather troubling to me. Shouldn't we be focused on reducing calls, rather than simply closing them quickly?
Not for "stupid" users, the ones you see on a day-to-day basis. Now, this all depends on who you are giving support to: competent IT professionals or the day-to-day office worker. If you are supporting fellow IT people, it should be a goal to be transparent. For the office worker the main job is productivity; that means fix the problem as soon as possible, or tell them there is no problem and have a good day.
Taxation is legalized theft, no more, no less.
But it's close. Of course, closed tickets are something a manager can measure. Needless to say, it measures nothing meaningful. For example, I tell a customer to reboot. Close the ticket. That takes little time and closes the ticket fast. In fact, I can improve my metrics by telling that same person to do this every 4 hours for several years. OR, I can get up, go to their desk, and solve the problem permanently. It takes longer, making my metrics look bad, but in reality-land (a land far, far away from management land), that person is doing productive work longer and more efficiently because the interruption and downtime have been removed.
Please do not read this sig. Thank you.
s/metrics/bullcrap
A good metric should be
1 - Enterprisy looking
2 - Easy to game by the interested parties
Your boss wants a number, give it to them quickly. It's all BS (or 99% of it at least. Don't agree? Do the job then) in the end.
So good metrics could be.
- Unplanned downtime
- Number of users, number of bytes used, etc (that plots a nice ascending graph, and ASCENDING IS GOOD, you can print that and put it on the wall)
If they stay on 'time to close the ticket', NEEDINFO and WORKSFORME are your friends.
how long until
Amount of service calls resolved: h
Server/network downtime (in hours): d
Use formula "(s / h) + 2d"
Use resulting number to chart IT support performance, assuming that the network + server uptime and stability is more important than user inconvenience. You could decide that anything above a certain threshold is too much, or use it to compare personnel with each other.
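For what it's worth, here is a rough Python sketch of that formula. The post never says what s stands for (the definition looks cut off), so the code just takes s, h and d as plain numbers and makes no claim about what s measures. The function name and the sample values are made up for illustration.

```python
def support_score(s: float, h: float, d: float) -> float:
    """Return (s / h) + 2d exactly as written in the post.

    s: quantity whose definition is missing from the post (assumed numeric)
    h: number of service calls resolved
    d: server/network downtime in hours
    """
    if h == 0:
        raise ValueError("h (calls resolved) must be non-zero")
    return (s / h) + 2 * d


# Hypothetical inputs, only to show the arithmetic: 120/40 + 2*1.5
print(support_score(s=120, h=40, d=1.5))  # 6.0
```

Note that downtime is weighted double, so with this formula lower is better, which matches the "anything above a certain threshold is too much" reading.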
Yet Another Tech Blog
(but so much more, including game and movie reviews)
http://yanteb.peasantoid.org
In my department, we have an agreement with the rest of the company outlining the level of service that must be performed within a pre-determined amount of time, based on incident priority. With the right tools, it's fairly easy to track the percentage of incidents resolved within the terms of the SLA.
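A minimal sketch of that kind of tracking, in Python. The priority names, the hour limits and the sample incidents are all invented for the example; a real SLA would define its own, and real data would come from the ticketing system.

```python
from dataclasses import dataclass

# Hypothetical SLA targets: hours allowed to resolve, keyed by priority.
SLA_HOURS = {"critical": 4, "high": 8, "normal": 24, "low": 72}


@dataclass
class Incident:
    priority: str
    hours_to_resolve: float


def sla_compliance(incidents: list[Incident]) -> float:
    """Percentage of incidents resolved within their priority's SLA window."""
    if not incidents:
        return 100.0  # nothing to miss
    met = sum(1 for i in incidents if i.hours_to_resolve <= SLA_HOURS[i.priority])
    return 100.0 * met / len(incidents)


# Made-up sample data.
incidents = [
    Incident("critical", 3.0),  # within 4h
    Incident("high", 10.0),     # misses 8h
    Incident("normal", 20.0),   # within 24h
    Incident("low", 48.0),      # within 72h
]
print(sla_compliance(incidents))  # 75.0
```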
"Ask not what your country can do for you." --John F. Kennedy
Yeah, that's pretty much it. Managers and executives can't handle anything that doesn't have a nice, neat, single number that tells them everything they need to know without having to actually know what's going on.
You can count calls, count time spent on calls, how long it takes between when a call is received and a tech is dispatched. You can count how many devices you have deployed in the field. All of these numbers tell you different things, and not one of them tells you much of anything by itself. Management needs to actually be in touch with the field and truly understand what's going on in their IT department, otherwise all those numbers are pretty meaningless.
I don't know about you, but my servers run on the power of cotton candy and happy thoughts. -Anonymous Coward
Management gets the behavior that it rewards, not necessarily the behavior that it pretends to ask for
Whenever I see a metric that measures quantity instead of quality, that tells me the manager gets a bonus. Hopefully, you're getting a piece of that bonus.
Sounds pretty normal for a call center. At my last job management got excited if a case was open for over two weeks regardless if the issue was resolved or not. That's what I call great customer service!
Here is the problem... you are trying to assign arbitrary numbers to something that cannot be measured. These are numbers for accountants, they want one number to be able to show them where to cut cost. Problem is that there is no way to quantify how much money an IT department saves a company. Metrics have gotten out of control in this country. We are always measuring the cost and never measuring the value. How do you assign a number to a person who is not a number? How do you quantify the guy who spent all weekend fixing the server? How do you quantify the accrued knowledge of a human being? It is impossible to do. The accountants never ask questions like, "How would my quality of life be affected if I couldn't get effective tech support?", "How much money would the company lose if these computers and programs didn't exist?". You need to measure the man and his work as a whole, person to person.
I thought IT got paid for the number of times they said 'No' to us during the day.
Here's a trick: if you want them to start saying 'yes', give them more of a budget, as most 'no's come from a lack of money.
The force that blew the Big Bang continues to accelerate.
For example, for every fax successfully sent via the fax server without IT intervention, the IT department gets one point.
For every fax that needs IT intervention to be sent, the IT department loses one point.
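Something like this, as a quick Python sketch of the scoring. The function name and the way outcomes are recorded (one boolean per fax) are my own assumptions, not anything the fax server actually provides.

```python
def fax_points(outcomes):
    """Score the IT department from a list of fax outcomes.

    outcomes: iterable of booleans, True if the fax needed IT intervention.
    Each fax sent without help earns +1, each one needing help costs -1.
    """
    score = 0
    for needed_it in outcomes:
        score += -1 if needed_it else 1
    return score


# Made-up day: three faxes went through on their own, one needed IT.
print(fax_points([False, False, True, False]))  # 2
```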
I like this idea, because it has the side-effect of forcing management to define exactly and in writing the services the IT-department / infrastructure is actually supposed to provide and also forces them to define some metrics and a mechanism to measure this. This enables the IT-department to respond to inappropriate requests in a nicely formal way. Also the management can prioritize on this to help IT fend off the odd jerk that thinks their particular problem is the most important in the world and should be taken care of ASAP. Such a system would also provide transparency to management and users as to wtf these IT-jerks are doing all day and why.
Actually, that's good - a proactive IT department would work on fixing issues that many users have difficulty with, even if that means replacing the copier contract with one that delivers a more user-friendly machine that has slightly worse "paper specs". As a random not-very-technical example.
You're special forces then? That's great! I just love your olympics!
I think the average time taken to close a trouble ticket is important, but it's not the only factor you want to look at.
The primary purpose of issuing unique trouble ticket numbers is to provide an easy "one stop" tracking mechanism for the issue. A customer (or employee) should always be able to reference a ticket # to support staff, and in turn, they should be able to pull up a fairly comprehensive history of what's been done so far to resolve the issue.
If you push too hard for closing tickets quickly, you'll see a tendency for new tickets to get issued on things which should REALLY be continuations of an existing ticket, held open longer.
(EG. I call in complaining that my inkjet printer won't print yellow. A ticket is created and they tell me my color cartridge is clogged up, so put a new one in and I should be fine. Ticket is closed. I switch cartridges with a new one, and discover it STILL doesn't print yellow. I call in and a new ticket is made for what's really the same issue. I'm told how to run the printer through cleaning cycles, and instructed that I may have to do it "up to 10 times" to see results. Ticket closed. I get around to trying that the next day when I get time, and even after 10 or 15 attempts, no yellow is coming out. I call back in, only to have ANOTHER new ticket opened, and the tech wastes my time asking me if I "tried a new cartridge yet?" and I have to interrupt him in the middle of re-explaining how to do a cleaning cycle. Problem is eventually determined to require a replacement printer ... but should obviously have all been filed under one ticket.)
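To put numbers on the printer story, here is a small Python sketch comparing the average time per ticket with the time per underlying issue. The issue label and the hours are invented; the only point is how splitting one problem across several tickets flatters the per-ticket figure.

```python
# (issue, opened_hour, closed_hour) - hypothetical data for the printer saga
tickets = [
    ("printer-yellow", 0.0, 0.5),    # "swap the cartridge", closed
    ("printer-yellow", 2.0, 2.5),    # "run cleaning cycles", closed
    ("printer-yellow", 26.0, 27.0),  # next day, printer replaced
]


def avg_time_per_ticket(tickets):
    """Mean open-to-close time, treating every ticket as independent."""
    return sum(closed - opened for _, opened, closed in tickets) / len(tickets)


def avg_time_per_issue(tickets):
    """Mean time from the first ticket opened to the last ticket closed, per issue."""
    first_open, last_close = {}, {}
    for issue, opened, closed in tickets:
        first_open[issue] = min(opened, first_open.get(issue, opened))
        last_close[issue] = max(closed, last_close.get(issue, closed))
    spans = [last_close[i] - first_open[i] for i in first_open]
    return sum(spans) / len(spans)


print(round(avg_time_per_ticket(tickets), 2))  # 0.67
print(avg_time_per_issue(tickets))             # 27.0
```

The per-ticket number looks great while the customer actually waited more than a day, which is the argument for keeping one ticket open instead.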
But how do you measure the success rate of a problem you solved proactively, thus ensuring it never becomes a measurable problem?
The New Tax Credits system in the UK used the same call-time metric - likely still does - it was able to get most calls averaging the artificial target of three minutes. Never mind that if you looked at the call logs you could see most callers indeed spent around 3 minutes on the phone, but never got their problem solved. The unlucky representative who got that caller when they were fuming mad, and determined not to hang up until the problem was solved, would get placed lower in internal league tables.
So it came down to politics - the terrible metric lets us satisfy tribal instincts by ranking participants. That was the real motivation. Call centers around the country ended up competing with each other on this metric too, and the directors of the most useless call centers were the ones who got promoted to run the whole show.
But this problem is beyond NTC or IT.. it's the defining issue of this backward planet.
Every time accounting asks "Why are we paying these guys, they don't seem to do anything," you get 5 points.
Only for so long. Eventually you burn them out and they just decide not to bother doing extra if the company can't be bothered to fund things.
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
I don't say "no" any longer. I ask them what their budget is for accomplishing the task they want.
me: "How much do you have budgeted for this project"
them: "Budget? You mean it costs money? I thought you could do this for free"
me: "We can't do that for free" (laughing to myself the whole time) .... later they come back ...
them: "We have $400 for the project"
me: "Does that include the licensing? Does that include ongoing support? Does that include setup, training, and installation of new infrastructure needed to support your project?"
them: "Uh, no. What do you mean?"
me: "Well, when you want a project ... say for a new building, do you just present $400 and say can you build the building for that?"
them: "Well, no, we have professional architects design the building, then we have professional contractors bid on the project, then we included additional maintenance in the budget for the new building and .... "
me: "So, what you are saying is that you don't view IT as being professional"
them: "No no no no! That's not what I mean at all."
me: "So, how come you just expected us to do what you wanted without asking us what it would take to do it?"
them: "Because it is too expensive when I do ask that"
me: "It is more expensive to do things right. If you want to do it wrong, any non-professional can quote you a lower price. You can get a building and have it built a lot less expensive if you don't hire Architects and Contractors to design and build a building, and it will get built, but it will be missing things you probably want and need. But you know this, and that is why you trust those professionals."
them: "yes, but you are too expensive"
me: "Then the answer is no"
---
Sometimes it is just easier to say "NO". The sad fact is, people don't respect IT professionals AS professionals. We often don't deserve it either, but that is another topic.
Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
[sales to IT] We need (something that is a huge security risk).
[IT to sales] No.
[sales to administration] waaaahhh.
[admin to IT] Do it.
[IT] Grumble grumble fuck you. *does it*
[sales] yaaay!
[Admin] Damn IT.
Shit hits the fan, IT is blamed. Goto 10.
if the answer isn't violence, neither is your silence / freedom of expression doesn't make it alright
We are big on SLAs. Department directors have to sign off on an SLA before IT will support their stuff. Actually this is how IT gets its budget.
For example, marketing comes to IT and asks for a service like sales tracking. After figuring out what they want we give them a quote with SLA and how much it will cost. After buildout there is a sign off and the service is available for use. To the users there is no concept of hardware or servers. They just know if their stuff is working or not. I mean they are marketing people. Any problems that occur are tracked by our ticketing system, and it's just a matter of tracking resolution time, incident severity and number of incidents. All of this is defined in the SLA. Resolution time usually comes into play when looking at service availability, and in the incident review process for high or critical outages.
For our team individual performance usually comes down to how well we contribute to the team. My review is not that much different from a kindergarten report card. "Plays well with others" is now "Maintains positive relationships with external partners"
Stop asking to do stupid things like
- run an internet server without a firewall
- Set up accounts without passwords
- Use 1-off proprietary software when we've selected the best solution for everyone in the company. Too bad our selection costs 3x more than the other stuff.
- Bring a 64-way server up without a fail over, test, dev, and DR instances too.
- Bring a 32-way server up this week, when your project hasn't been approved yet. These things take about a month to get delivered and another month to get installed, configured, connected to the SAN and ready for applications
- Allow an outsourced vendor unlimited access to internal networks with 10,000+ servers without a corp-2-corp VPN in place.
- Send and accept unlimited sized emails without any virus and malware checks.
- Demand something fast because YOU didn't schedule and budget properly - MARKETING, this is for you.
- Run a machine that will be hacked easily and turned into a torrent, porn, music, VoIP server a few months after it gets placed onto the network.
Stupid metrics are part of the problem. When I worked for Gateway, they wanted your call average to be between 7 and 11 minutes. If you went above for the week/month, you were too slow and bad at your job. If you went below, you were probably just getting people off the phone without solving their problems.
That metric worked for most people, because they talk slow and have to look up every single issue.
For me, it was killer. I was consistently getting 5 minutes averages, even with that inevitable once-a-day 1-hour phone call. I got reprimanded twice about it before I gave up and quit. Almost every caller was happy with how I helped them. The others couldn't be helped, or I made a mistake. (I told a guy he could clean his keyboard, once... They had switched to keyboards that fall apart if you try to open them, apparently. In my defense, I had offered to send one, but the guy thought cleaning it would be a lot faster.)
Also note that a certain percentage of calls were recorded and reviewed, and I -never- got talked to about any of my calls. The only complaint I had was the keyboard guy. And yet I still got yelled at for short call times.
Again, stupid metrics are stupid. Call-time has nothing to do with customer satisfaction.
"If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
It's IT management from a holistic point of view.
SLAs are only one aspect of IT management.
There is no point measuring something unless you are going to do something with the information. Are your metrics getting better because things are getting better, or are you just getting better at fighting the same old problems? Are you measuring a metric because it's easy to measure or because the business needs that metric to be good?
Ultimately the idea is to get incidents themselves to zero because that means a smoothly running infrastructure operating exactly as the users and business expect it to. Not exactly possible, but at least it provides a direction to move in... And if your incident management system is any good, it'll tell you where the problems are, and where money should be spent to fix them. That may be user training, education on the portfolio of services that IT provides, or replacing a critical application that falls over every 10 minutes or is too slow, etc etc.
[ADMIN] You failed to protect the data of the sales team. We were compromised, you bastards... It's your job to make sure of that...
What are we going to do tonight Brain?
[sales to IT] We need (something that is a huge security risk).
[IT to sales] No.
[sales to administration] waaaahhh.
[admin to IT] Do it.
[Competent IT with minimal people skills] No, and here's why
[Admin] Ok, it was a dumb idea.
Anyone can "stand up for what they believe", but it takes a very brave individual to change what they believe. - Loundry
Right, right. Because IT people are going to be more persuasive than SALES people who make their LIVING persuading people.
It's far easier to get someone to say "yes" than it is to tell them that what they're doing is fundamentally wrong or broken. You can only sugar coat so much.
"If a business service owner signs off then what is the problem?"
If they were rational beings, à la Spock, then sure, there would be no problem. But they are not quite like Spock. They will sign off and then still hold you responsible. You'll uncover problems, the business will state its priorities, you'll follow those priorities, and you'll still take the shit when it hits the fan.
"Just make sure your change management board includes them, and finance as well."
For this to happen you need to have a change management board in the first place. This kind of business doesn't have a CMB, and there is no way to make a business case for one (it means more bureaucracy, more upfront cost, and only "soft" advantages at the beginning. They don't want that kind of bureaucracy because of the cost and because it would put them in their place. And if they were able to see the soft benefits, they wouldn't behave this way to start with).
Been there, seen that.
Years ago I learned that most managers are so remarkably ignorant of what good IT workers do, you know preventative work that ensures users can do their jobs without interruption, that the only way to get ahead is to be a bad IT worker.
Meaning if you let all sorts of bad stuff happen and then rush in and be the savior of the day you will be rewarded with promotions and bonuses.
A few years ago, just before I left my last job, I demanded a job review having gone 6 years without one. I got to sit through my review by the VP in charge of the division I supported and my direct IT boss from Denton Tx, only to be criticized for not socializing with some techs who came up from Denton to help with a move of the office 80 miles to a new city. I had elected to do my appointed tasks for the move, baby the servers and double check backups prior to taking them down, packing them up and reinstalling them in the new site's server room.
Had I done the socializing I would have ignored my duty to the corporation but not been f*cked over during the review. If the servers had not come up I would likely have been heralded as a saint had I been able to resurrect them too.
It makes no sense but my advice is don't bother with looking at meaningful metrics unless it is to satisfy your own needs. Focus on the only metric management sees... Crisis frequency and crisis resolution... be the superhero!
[Sales to IT] We need (something that is a huge security risk).
[Good IT] Here's a slightly different solution that addresses your needs without creating a security risk.
[Sales] Great, thanks!
[Good Admin to IT] Good job understanding the client's needs and thinking outside the box to get it done.
I think that a large part of the problem with creating usable IT performance metrics begins with a basic problem with human nature: namely that we tend to notice flaws far more than we notice the absence of flaws. Things that run smoothly tend to get taken for granted and thus get forgotten, while the squeaky wheel gets the grease.
I see this in the quality of most of the big money "enterprise solutions" that I have to support and integrate on my job. When a software vendor relies on long term support contracts for their revenue, as most enterprise solutions do, there is a disincentive for such a vendor to ensure their software is either easy to use, deploy, or administrate, since such ease erodes the need for support. Likewise, in my experience, it is difficult to convince corporate IT customers to pay a premium for higher quality solutions, which is what forces the vendors into the support model in the first place.
Until the day that there are good, widely-used metrics for assessing the value of when things *don't* go wrong, I suspect the flawed metrics like the submitter's are going to prevail.
Momentarily, the need for the construction of new light will no longer exist.
There are three types of companies.
Generic
Brands with no real value
Brands with a good reputation, worthy of trust.
A Generic company honestly should not give a crap about user satisfaction. Your goal should be to spend as little $/user as possible without resulting in lawsuits. You are selling crap and people are buying it because it is cheap. They won't care about customer satisfaction, otherwise they would have bought a brand name.
You correctly describe a no real value brand. Middle satisfaction is the goal.
But a real brand (like Apple) needs HIGH satisfaction. They want it as high as possible. They sell their product at a premium, using their reputation as the reason. They need people to say it is worth spending more for their product.
Similarly, the same thing works for a company that wishes to maintain a good reputation as being on the cutting edge. If you are a law firm, like say the one I work in, and you are trying to attract the best lawyers, it helps a TON to be known as someone with the best stuff and 'it just works.' (to copy a certain slogan). You can't do that if people go around saying nothing works and IT doesn't get back to you. You need IT to solve your problems ASAP, keeping all employees happy and not thinking "This wouldn't happen if I took that job at XXXX"
excitingthingstodo.blogspot.com
[Sales to IT] We need (something that is a huge security risk).
[Good IT] Here's a slightly different solution that addresses your needs without creating a security risk.
[Sales] Great, thanks!
[Good Admin to IT] Good job understanding the client's needs and thinking outside the box to get it done.
No, it goes like:
[Sales to IT] We need (something that is a huge security risk).
[Good IT] Here's a slightly different solution that addresses your needs without creating a security risk.
[Sales] No, Dammit, we asked for (something that is a huge security risk). We already marketed (something that is a huge security risk). We'll lose $$$ if we don't get (something that is a huge security risk).
[Bad Administration to IT] ZOMG! $$$? Doit doit doit!
[Good IT to Bad Administration] But we'll likely lose $$$$$ when (something that is a huge security risk) is exploited.
[Bad Administration to IT] We _WILL_ lose $$$ if you don't do it. Forget it. FU. You're fired for being a recalcitrant douchebag. We'll hire someone who will do it for less money than we pay you.
How quaint. You think IT gets invited to meetings and is allowed to present documentation in their defense.
[IT] We requested $X for the job, you gave us $X/100 for the job, we couldn't. We warned you what the risks were in these 12 Emails, 2 Printed Letters and numerous phone calls. You didn't care because it cost too much at $X but was worth it for $X/100. Now you have to pay $X x 100 to fix your short sightedness.
Don't blame us if you didn't understand the risks.
Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.