Slashdot Mirror


Ask Slashdot: Good Metrics For a Small IT Team?

First time accepted submitter shibbyj writes "I'm a member of a small 3-person IT team for a medium-sized business (approximately 300-350 employees) that has multiple locations internationally. I have been tasked with logging our performance using the statistics from our ticket management system. I've also been tasked with comparing these stats and determining if we are performing above or below what is considered optimal. I'm wondering what people's opinions are on what good metrics should be with regard to MTTR, MTBF, etc. I have had trouble finding information on this."

14 of 315 comments

  1. Metrics suck by SJ2000 · · Score: 5, Informative

    Didn't we just have a story about how metrics suck?

    1. Re:Metrics suck by ScuzzMonkey · · Score: 4, Informative

      Bad metrics suck, good metrics are useful data.

      As folks have already mentioned, time to answer and time to resolve are both important, and I think you have to watch for re-opened as well to curb "how fast can I shove this under the bed?" resolution games.

      My favorite is average tickets per user, though. Particularly on a small team, what you really want to gear your measurements toward is preventing incidents in the first place. It is helpful to know what your overall ticket volume looks like, then, and to aim to decrease it over time in the same way you might try to decrease time to answer and time to resolve. That's important because, as the previous article suggests, you will get what you measure... and your overall goal should not simply be to answer tickets faster and resolve them more quickly, but to not have as many in the first place*. Every issue represents a waste of somebody's time and therefore corporate resources that could be put to more productive uses. Steadily decreasing mtta and mttr are nothing to cheer about if your ticket count is increasing.

      But you can keep it simple. You can drown yourself in metrics and lose sight of why you're tracking them and what you really want to accomplish. You may not really need any more than these few; better to start small and add what you need when you need it. I know there's always tension over getting a system in place that can capture what you need for historical purposes when you realize you need to know something new down the road, but resist the urge to over-collect. Half the time you won't need it all and are just wasting time getting it in the system.

      * There is a caveat to this; in some organizations, I actually DO want to see an increasing ticket/user count, at least for a time. This is something I shoot for when relations between IT and users have broken down badly enough that users have stopped reporting problems to IT because they feel it's useless, and their issues are never resolved anyway. In those cases, a rising ticket count can represent an increasing trust level, which is good. You generally won't fix issues you don't know about.
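For anyone who wants to try these four numbers on an export from their own ticketing system, here is a minimal sketch. The `Ticket` fields and the `summarize` helper are made-up names for illustration, not the schema of any real ticketing product, and it assumes every ticket in the period has been answered and resolved.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Ticket:
    # Hypothetical fields; map these to whatever your system exports.
    user: str
    opened: datetime
    first_response: datetime
    resolved: datetime
    reopened: bool = False


def summarize(tickets, user_count):
    """Return time to answer, time to resolve, reopen rate and tickets per user."""
    def hours(delta):
        return delta.total_seconds() / 3600

    return {
        "mtta_hours": mean(hours(t.first_response - t.opened) for t in tickets),
        "mttr_hours": mean(hours(t.resolved - t.opened) for t in tickets),
        "reopen_rate": sum(t.reopened for t in tickets) / len(tickets),
        "tickets_per_user": len(tickets) / user_count,
    }


# Two invented tickets in a period with four users.
t0 = datetime(2011, 12, 1, 9, 0)
tickets = [
    Ticket("alice", t0, t0 + timedelta(hours=1), t0 + timedelta(hours=5)),
    Ticket("bob", t0, t0 + timedelta(hours=3), t0 + timedelta(hours=7), reopened=True),
]
summary = summarize(tickets, user_count=4)
print(summary)
```

Run it per month and watch the trend rather than any single value; with a team of three, one ugly ticket can swing an average a long way.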

      --
      No relation to Happy Monkey
    2. Re:Metrics suck by CAIMLAS · · Score: 4, Informative

      Metrics is just a management word for "bad statistics".

      With a distribution of 3, it's not really possible to have statistics of meaningful nature. You've got shared responsibilities and bounce things off of each other. One person may open and close more tickets, have a shorter duration for tickets, etc. - but he's only doing the actual "work". The others may be giving him all the input necessary to complete the task(s).

      Ideally, your ticketing system will reflect, very vaguely, who's doing work and who is not, but even then it's not going to be well representative of what's actually going on.

      People do different types of work, of different levels of difficulty. For instance, I may do one ticket on Monday, three on Tuesday, and one for Wednesday through Friday. Why? Aside from the fact that I'm bad about actually doing tickets for my work (my god, I'd not have the time for work, and then there'd be more things that aren't getting done, making us -all- look bad), there's the reality that my tickets aren't terribly easy, often requiring hours of log perusal and research to try to fix problems. Meanwhile, the guy who knocks out 40 tickets a week - malware disinfection, workstation reinstalls, etc. - has fairly rote work of a repetitive nature, comparably. Also, he's following instructions or asking for advice on a regular basis, even if I'm not his boss.

      You said so yourself: you're a member of a 3-person IT team. The only use 'metrics' have here aside from what should be plainly obvious in a group of 3 (who's fucking up, who's not getting back to people, who's not doing work) is to keep track of what amounts to customer requests and problems. X workstation needs to be reinstalled, Y server has a crashing whatever, and so on. If you're working on and sharing a ticket queue, you are all mutually responsible for all of the tickets: if something isn't getting done, it's everyone's fault (or nobody's fault). You may consider presenting your metrics in the light of this reality (like statistics, metrics can lie, too).

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  2. Here's a few... by HerculesMO · · Score: 5, Informative

    Time to answer call, time to resolve ticket, abandoned tickets (unresolved).

    If you google a few of those it will bring some more, but that's a simple start.

    --
    The price is always right if someone else is paying.
    1. Re:Here's a few... by suutar · · Score: 4, Informative

      Number of calls back after initial call (measuring, in theory, how often the initial call resolved the issue).

      Number and duration of system outages (if you're doing sysadmin stuff as well as support stuff).
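Since the submitter asked about MTTR and MTBF specifically: given the number and duration of outages, both fall out of simple arithmetic. A rough sketch, using the common definitions MTBF = uptime / number of failures and MTTR = downtime / number of failures. The function name and the sample outage log are invented for illustration, and it assumes the outages do not overlap and all fall inside the reporting period.

```python
from datetime import datetime, timedelta


def mtbf_mttr(outages, period_start, period_end):
    """outages: list of (start, end) datetime pairs inside the period."""
    if not outages:
        raise ValueError("no outages recorded; MTBF and MTTR are undefined")
    downtime = sum((end - start for start, end in outages), timedelta())
    uptime = (period_end - period_start) - downtime
    failures = len(outages)
    return uptime / failures, downtime / failures


# Invented 30-day period with an 8-hour and a 4-hour outage.
start = datetime(2011, 11, 1)
end = start + timedelta(days=30)
outages = [
    (datetime(2011, 11, 5, 2, 0), datetime(2011, 11, 5, 10, 0)),
    (datetime(2011, 11, 20, 14, 0), datetime(2011, 11, 20, 18, 0)),
]
mtbf, mttr = mtbf_mttr(outages, start, end)
print("MTBF:", mtbf, "MTTR:", mttr)
```

With only a handful of outages a year these are very noisy numbers, so treat them as a trend line rather than a benchmark.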

    2. Re:Here's a few... by Anonymous Coward · · Score: 2, Informative

      The reference metric must be to have everything operated by IBM. My experience over the last 10 years or so:

      Every 2 years there will be a major power outage at the server room (for one reason or another), and everything will be down, and it will take them around 24 hours to reboot everything in the correct order, and longer to make sure everything is running smoothly.

      At least a few times per year, there will be no connection to critical servers. May be caused by many things.

      Every couple of years, the backup system can't keep up, so everything is slow until noon when the backup is finished.

      Once a year, their DNS server is down for at least 8 hours.

      Once every 5 years, multiple drives fail in succession in the SAN, and RAID is not able to recover. And IBM failed to react when one drive per day started to go down. Do not replace what is not broken.

      Then there are all the minor issues. Like Lotus Notes down, which happens at least every 3 months.

      All the above is considered normal stable operations if they outsource to IBM.

  3. optimal? by Anonymous Coward · · Score: 5, Informative

    Talk about flow, about bottlenecks. Visualize workflow. Look at Henrik Kniberg's paper on kanban as applied to IT Ops. My guess is that your ticketing system will provide low-value data on volumes and resolution time - gear up - visualize the pipeline. Check http://www.infoq.com/minibooks/priming-kanban-jesper-boeg - turn the conversation around to "business value" - don't get wrapped in the ropes of volumetrics. /peace

    1. Re:optimal? by bbutton · · Score: 3, Informative

      +1 for this answer. Kanban is a great way to generate actually useful metrics for a team, project, or department. You'll be able to calculate things like how long it takes the average ticket to work its way through your processes, where tickets tend to get stuck (cumulative flow diagram), and where the sources of waste are in your processes.

      In addition to the book mentioned above, I also like this one by David Anderson: http://www.amazon.com/Kanban-Successful-Evolutionary-Technology-Business/dp/0984521402/ref=sr_1_1?s=books&ie=UTF8&qid=1324000945&sr=1-1

      I've led several teams using Kanban as a way to visualize our workflows and measured the cycle times for each work item through our processes. By driving out common causes of variance between work items, it's possible to arrive at a consistent cycle time per item. You can then use any process improvement technique you like to show tangible improvements in cycle time.
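To make the cycle-time idea concrete, here is a small sketch of how it might be computed from a board's history. The stage names and the `(stage, entered_at)` history format are assumptions for illustration; a real board or ticketing export will look different.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def stage_hours(history):
    """history: ordered [(stage, entered_at), ...], ending with the final stage.

    Returns hours spent in each stage before the final one.
    """
    spent = defaultdict(float)
    for (stage, entered), (_, left) in zip(history, history[1:]):
        spent[stage] += (left - entered).total_seconds() / 3600
    return dict(spent)


def cycle_time_hours(history):
    """Hours from entering the first stage to entering the last."""
    return (history[-1][1] - history[0][1]).total_seconds() / 3600


def average_by_stage(histories):
    """Average hours per stage across cards; large values show where work sticks."""
    samples = defaultdict(list)
    for history in histories:
        for stage, hrs in stage_hours(history).items():
            samples[stage].append(hrs)
    return {stage: mean(values) for stage, values in samples.items()}


# Two invented cards moving across a hypothetical board.
day = datetime(2011, 12, 1)
card_a = [
    ("queued", day.replace(hour=9)),
    ("in_progress", day.replace(hour=11)),
    ("waiting_on_user", day.replace(hour=12)),
    ("done", day.replace(hour=20)),
]
card_b = [
    ("queued", day.replace(hour=9)),
    ("in_progress", day.replace(hour=13)),
    ("done", day.replace(hour=16)),
]
averages = average_by_stage([card_a, card_b])
print(averages, cycle_time_hours(card_a), cycle_time_hours(card_b))
```

In this toy data the time is going into waiting on the user, not into the actual work, which is the kind of thing a plain "time to resolve" number hides.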

      -- bab

  4. Re:Hahaha by Anonymous Coward · · Score: 5, Informative

    Yes, this.

    Management wants to eliminate someone and wants to do so in an "objective" way to hide the fact that they're firing someone while probably giving the CEO a fat Christmas bonus. You're tasked with figuring out which of the three of you gets fired and how you can cloak this in enough "objectivity" that no one can object to it. Your best bet is to make this shit up. Figure out who the weakest link besides yourself is, or who you like the least, and generate a system of metrics that's biased towards eliminating that person. Use lots of acronyms and jargon. Also, make sure no one at work reads Slashdot.

    totally irrelevant CAPTCHA: forgive

  5. Add something by BenEnglishAtHome · · Score: 3, Informative

    Your ticketing system needs (or needs to have added) an automatic followup to the customers. The system sends out an email after every ticket asking "Did the problem get resolved in a reasonable amount of time? Did the IT staff respond in a way that enabled you to get back to work?" Nothing more complex than that, though you can parse things out by ticket priority (though deciding what's a higher priority than other things is, just by itself, a major undertaking).

    Your goal should be to increase the percentage of positive responses.
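Tallying the answers is trivial. A sketch, assuming each reply is recorded as a `(priority, timely, helpful)` tuple and counting a reply as positive only when both questions were answered yes; that rule and the tuple layout are illustrative choices, not something a particular ticketing system provides.

```python
from collections import defaultdict


def positive_rate_by_priority(responses):
    """responses: iterable of (priority, timely, helpful) tuples.

    Returns the percentage of positive replies for each priority.
    """
    counts = defaultdict(lambda: [0, 0])  # priority -> [positive, total]
    for priority, timely, helpful in responses:
        counts[priority][1] += 1
        if timely and helpful:
            counts[priority][0] += 1
    return {p: 100.0 * pos / total for p, (pos, total) in counts.items()}


# Four invented survey replies.
responses = [
    ("high", True, True),
    ("high", True, False),
    ("low", True, True),
    ("low", True, True),
]
rates = positive_rate_by_priority(responses)
print(rates)
```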

    Why this touchy-feely stuff instead of a hard metric? Easy. There are no metrics that work in your situation. It's quite easy to argue that there are no metrics that work, period.

    By adding this email feedback to your ticketing system, you have met the requirement to come up with a metric derived from the ticketing system.

    Selling this to management can be simple, depending on how you handle it. Something along the lines of "Given that the IT staff is so idiotically understaffed, we must be given the agility to solve problems instead of meeting random metrics. Only our customers can know if we met their needs, considering all pertinent factors. Someday, when we actually have enough people and money to divide work more rigidly, we can add metrics like timeliness of ticket closure, etc." Then you hope they never notice that you never divvy up the work rigidly. All of this requires having an IT manager who is dedicated to the inescapable truth - that their function is to keep the MBAs off your ass and let you do your job.

    I've worked where my performance was measured in this way. It can be heaven.

    One more thing - if your upper management doesn't already have faith in you, they'll never go for it. They need to already appreciate your contribution to go along with this. The very fact that they're asking for metrics tends to suggest they don't sufficiently appreciate you now. If that's the case, then all I can say is that I've worked under those circumstances, too, and my heart goes out to you.

  6. Re:call / ticket time is bad metrics by DarthBart · · Score: 5, Informative

    I got written up once because my ticket stats were radically different than the other people on my team. 15% lower "total time on tickets" but 20% more tickets closed. I was apparently fudging numbers and closing unresolved tickets.

    Fortunately, a trip to HR with a ream of printouts from closed tickets proved otherwise.

    Still left the company a few months later.

  7. Here's what you should do... by certain+death · · Score: 3, Informative

    Touch up your resume, go tell your boss to get bent, pound sand, etc. and look for a new place to work. Anyone who needs metrics on a three-person team deserves anything they get, up to and including a swift kick in the ass. If the manager can't figure it out on his or her own, they should be the one being sent out the door with boxes.

    --
    "My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
  8. Re:Hahaha by WOOFYGOOFY · · Score: 3, Informative

    Your comment represents a type of fallacy I believe is called "because it's so, therefore it's so." Essentially you're saying because they are in management they must have earned it.

    First observe that no matter what, someone HAS to be in management. CEO is a position, unlike yours, which cannot stay empty for long.

    Second, all that has to happen to rise in a company is someone above you promotes you. People who bet on Skill and Hard Work taking them there are a dime a dozen and, what's more, become essential to their positions and thus to their superiors' well-being.

    They aren't going anywhere, except out the door, when they can be replaced with someone cheaper or their talents are no longer required.

    But people who are visibly ambitious (but not too!) and agreeable (to management) and willing to fuck their fellow employees over in private conversation (but not obviously) and have a lust for power, and meet the other requirements: male, white (or Brahmin, as the environment demands), taller than average and, uh, nice looking with an authoritative air... you know, the leader type. THOSE people are hard to come by and need to be kicked upstairs to those open positions ASAP. What a joke. In my multi-billion dollar company, the people above me did my job for a matter of months.

  9. Re:Done in one by nahdude812 · · Score: 4, Informative

    You cynical man. For all you know, upper management have a budget flush with cash and have singled out someone in the hard working but unacknowledged IT department for a raise and a promotion.

    Don't forget the free pony.