System Admin's Unit of Production?
RailGunSally writes "I am a (strictly technical) member of a large *nix systems admin team at a Fortune 150. Our new IT Management Overlord is a hardcore bean-counter from hell. We in the trenches have been tasked with providing 'metrics' on absolutely everything from system utilization to paper clip recycling. Of course, measuring productivity is right up there at the top of the list. We're stumped as to a definition of the basic unit of productivity for a *nix admin. There is a school of thought in our group that holds that if the PHBs are simple enough to want to operate purely from pie charts and spreadsheets, then we should just graph some output from /dev/random and have done with it. I personally love the idea, but I feel the need for due diligence, so I put the question to the Slashdot community: How does one reasonably quantify admin productivity?"
The best sys admins are the ones you never notice. If the productive workers in a company never see or need to talk to a sys admin it's been a productive day for the admins.
"How does one reasonably quantify admin productivity?"
If no one in the building but HR and your line report need to know your name, you're doing your job...
Other than that, it would be like a trash collector counting how many cans he emptied during the day or a wildfire firefighter how many burning bushes he chopped. If there weren't any fires or trash these people wouldn't be needed, would they?
You can't quantify SA productivity.
It's easy to quantify /my/ productivity as a support tech (at the U of CA) in number of tickets resolved per shift. But sysadmins have a number of duties which they are performing /continuously/, so how can you quantify that?
.:Semper Absurda:.
Since the real proof of actual productivity for network admins is negative: nothing goes wrong (no trouble tickets). Also, the PHB will get their wish: No one to pay is infinite productivity (measured as output per $ spent).
Do uptime. Unless your team has serious problems, those numbers should always look good. If you do any sort of work in response to in-house or outside tech support requests, you can measure how long it takes to resolve issues.
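For what it's worth, the uptime number is simple arithmetic once you have a list of outages. A minimal sketch in Python, assuming you track each outage as a duration; the function name and the sample figures are made up for illustration:

```python
from datetime import timedelta


def availability_percent(period: timedelta, outages: list[timedelta]) -> float:
    """Return the share of the reporting period the service was up, in percent."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    downtime = sum(outages, timedelta(0))
    if downtime > period:
        raise ValueError("total downtime exceeds the reporting period")
    # Dividing one timedelta by another yields a plain float ratio.
    return 100.0 * (1 - downtime / period)


# Hypothetical 30-day month with two outages (30 minutes and 90 minutes).
month = timedelta(days=30)
outages = [timedelta(minutes=30), timedelta(hours=1, minutes=30)]
pct = availability_percent(month, outages)
print(f"{pct:.3f}% available")
```

Whether planned maintenance windows count as downtime is a policy decision this sketch does not make for you.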
You aren't building automobiles or painting teapots. You are a support function and not a line function.
You should have business plan objectives. These things are usually annual; there can be longer strategic objectives. If the person who set these things did it right, they should be measurable.
What I'm trying to say is, if you're banging your head against the wall trying to figure out how your performance should be measured, your higher up didn't set your objectives correctly.
This doesn't apply anywhere and everywhere. When the organization is in the business of IT itself, you might be measured differently since you'd then be contributing directly to the organization's core business. But from the description provided, it sounds like you're not.
The Banjo Players Must Die!
firefighters spend a lot of their time these days preventing fires, doing stuff like controlled burns.
maybe you can't measure 'productivity', but at some point you have to make a budget of how many people you need to hire for the season, and to do that, you have to know how many people it takes to do certain activities in a given amount of time.
which means you need to measure those things.
---
Systems Administration falls into several categories.
:)
Projects, Service requests, Patching, and user satisfaction are a few.
Once you have an idea of what you do, define some SLAs with your customers and the metrics are easy from there.
Now compare your defined SLAs to the following.
Metrics:
Time to ticket close?
Were the requesters satisfied?
Projects completed in the expected time?
Resource allocation is at what percentage?
Don't forget to measure your ongoing education and professional development. How much should you get, are you getting it?
Patch schedule being met?
Availability metrics.
Resource loads on the systems are easy and provide management nice graphs, plus they can be automated.
My systems roll all this information up and e-mail it for me.
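A rough sketch of how a few of those metrics could be rolled up automatically, in Python. The ticket records, the 8-hour SLA target, and the `summarize` helper are all assumptions for illustration; a real shop would pull this from its own ticket system.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical ticket records: (opened, closed, requester_satisfied).
tickets = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 11, 0), True),
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 9, 10, 0), False),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 0), True),
    (datetime(2024, 1, 10, 8, 0), datetime(2024, 1, 10, 13, 0), True),
]

# Assumed SLA: every ticket closed within 8 hours of being opened.
SLA_TARGET = timedelta(hours=8)


def summarize(tickets, sla_target):
    """Compute mean time to close, SLA compliance, and satisfaction rate."""
    if not tickets:
        raise ValueError("no tickets to summarize")
    durations = [closed - opened for opened, closed, _ in tickets]
    hours = [d.total_seconds() / 3600 for d in durations]
    within_sla = sum(1 for d in durations if d <= sla_target)
    satisfied = sum(1 for _, _, ok in tickets if ok)
    return {
        "mean_hours_to_close": mean(hours),
        "sla_met_pct": 100 * within_sla / len(tickets),
        "satisfied_pct": 100 * satisfied / len(tickets),
    }


report = summarize(tickets, SLA_TARGET)
for name, value in report.items():
    print(f"{name}: {value:.1f}")
```

Reporting the satisfaction figure next to the closure time makes it harder to look good by closing tickets fast and badly.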
While none of this is really important to us, the management teams operate almost entirely on this data. Take this as an opportunity. In some shops I've worked, management defines the metrics and they mostly are irrelevant. In your case it seems you have the rope to hang yourself so take care to present the data that is important and will help you meet your goals. As always, a good admin will automate the task but not tell anyone.
--russ
I am sure that others could find much better ways of quantifying performance, but this is something that jumped out at me. I was part of a consulting team that was asked to improve performance in a company several years back, and they came up with something similar.
What you need to do is contact some other F150 companies and ask their senior IT admins/CTOs how they measure productivity. I work for a major investment firm and we have metrics for everything we do (even though we're private) for two primary reasons:
1. it's how you improve, and
2. it's what our competitors do too.
It's that simple.
pi=sigma{n:0-infinity}[(1/16)^n][(4/(8n+1))-(2/(8n+4))-(1/(8n+5))-(1/(8n+6))]
Assume for a second you had a perfect server farm. It's always up, backups are made, users are added and removed, etc. While we are at it, assume you have a staff of, say, two admins per shift, 24x7. That's at least 8 admins, probably more to cover holidays, vacation, etc. In this case, their productivity is zero; they have nothing to do. In reality, they are working their tails off, and deserve a nice bonus. So tell the PHB that productivity is not what is important; problems are. It's uptime, transactions delivered, average delay on transactions, etc. Get the users to define what the 'requirements' are, and have the sysadmins deliver it. That is the measure of what is important.
No. How many tickets were not opened in the first place because things
just work.
Yeah, I know.
emt 377 emt 4
The nice thing about the /dev/random solution is its easy automation, saving the ridiculous amount of time this sort of bean-counting takes away from real work, and ultimately leading to greater business efficiency.
You can't measure 'productivity'. Managers with this style of management are not worth working for. Find another job.
Clearly this person is an idiot, and whatever scheme they use to measure productivity will only result in reduced productivity.
It is your manager's job to measure productivity empirically. If he can't do that, he's not fit for his job. Your manager should be sufficiently involved to determine if you are productive. If he's not, he's not productive, and should be fired.
The whole culture of targets and 'management science' is ideologically bankrupt, and practiced only by morons.
Hours of productivity per day lost to productivity measuring?
but for a boss like that the only metric s/he deserves is "copies of resume circulated per week". Of course this does your boss no good until you turn in notice. That's the idea.
Simple answer is that you don't. Productivity in terms of IT and related fields has become a dirty little word, but more than that, it is a business term, not a technical one. If you aren't a director or higher in title, and your duties don't include justifying expenses and planning resources for solutions, then it isn't really your realm to measure something like productivity. If this guy has an MBA or similar qualifications, it is he who should know how to measure productivity. But alas, the word productivity has become corrupted by half-assed business journalists trying to write articles about overall productivity and how your employees waste too much time on Facebook. If this guy just wants a number and gives you no guidelines as to how to come up with the number, then my guess is that he just wants to kiss up to the CEO by saying that "productivity" is up 40%, or he wants a number to justify laying off people. Either way, if he can't tell you how he reached his number, I would suggest getting your resume ready.
Also, ideally, a CTO wouldn't be asking those in the trenches how to measure productivity, but rather how to improve it. As someone in the trenches, you probably know where the snags are in efficiency, or what software you would need to purchase to help smooth things along, or even where people are overworked or overlooked. This is the positive way to improve productivity. Basically he should be asking you what you need in order to get your job done, and he should get it for you (within reason, of course).
meep
Simple! If you get calls you aren't productive enough, if on the other hand you get no calls it's time to slash the workforce... At least that seems to be the thinking of most non-IT staffers where I work.
This pretty much says it all; your manager wants you to do HIS job. Shouldn't he develop his own metrics? He can ask you for ideas but he should do the work himself.
You are right, but you are also skating on thin ice here. Asking someone who has no clue what is happening to set metrics is just asking for trouble....
HA! I just wasted some of your bandwidth with a frivolous sig!
You know, that's a very noble thought. But the reality is, if you simply refuse, you allow the PHBs to define their OWN metrics, which can be orgasmically stupid:
- 1 point for every spam mail that the PHBs get that they don't want.
- 1 point for every time a website that they want to get to is blocked
- 1 point for every time they see anyone in the company surfing a non-business website
You let them define the measures, and you'll be looking for a job. It's a truism that they DON'T UNDERSTAND WHAT YOU'RE DOING. To let them measure and qualify your job would be nuts.
-Styopa
Counting support tickets is almost as bad an idea as counting lines of code. There are all kinds of ways to generate and close a huge volume of support tickets.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
To approach it from an entirely different angle, much of a system administrator's job (whether Unix or not) is to avoid things, much like a security guard.
Just for one example: How do you measure avoided data leaks that would have cost millions?
I don't read AC A human right
Of course the elephant repellent is working! You don't see any elephants around here, do you?
Seriously though, that's a problem in many fields. People don't appreciate the value of a good military until they're under attack. They don't appreciate the value of a well funded police department until the crime rate starts increasing exponentially. And they don't appreciate the value of a good fire department until their whole block has gone up in flames. Sysadmins are no different.
I knew a guy who was a millwright for GM at Hydromatic. He was paid $45.00 an hour and played Euchre all day, and management was fine with this, because whenever he had to go to work, the plant was losing $45,000.00 an hour. When a sysadmin is working, really working at his/her real job, the shit done hit the fan.
Apocalypse Cancelled, Sorry, No Ticket Refunds
That's a good list. I'd add a little more, though.
Personally, I split sysadmin work up into two categories: doing something and making it so you don't have to do anything. The second is much more important, but much harder to quantify.
For the first category, you can definitely count things for managers. E.g., X accounts created, Y support requests handled. Be very careful quantifying things like this, though, or you create perverse incentives. If I make a system that's hard to use, I can receive and satisfy a lot of support requests. Or if I concentrate power rather than distributing it, then I get to look busy and important.
The other category is much trickier. Long ago I worked for a financial trading company. About 80% of the working day, the head clerk would just loiter on the trading floor, reading the paper and shooting the shit with clerks and traders. And that was exactly what his bosses wanted: they correctly saw that as a sign he kept things running smoothly. And then when problems popped up, he could give them his full attention while the rest of the operation kept running.
So I'd add two items to your list: user satisfaction, measured through surveys, and crisis preparedness, measured by speed and quality of response during drills (and actual crises, of course, but you can't wait for those to find out how ready you are).
... in my opinion, is to be as bored as possible. Everything which is done on a regular basis should be as automated as possible, and as much effort and resources thrown at avoiding potential problems as the finances and customers will allow (data backups, spare or redundant equipment, etc.).
Much of a "good" sysadmin's time should be spent doing regular, but occasional spot checks on the automation (which can also be greatly automated) to ensure everything is running as smoothly as possible.
Obviously, not all problems can be avoided, especially hardware failures, but if everything else is in place, even recovering a dead, but critical server can be fairly painless.
Oh and you claim to know more than the EEs at your fortune 500 company? God, slashdot is full of you guys and frankly I think you're all full of shit.
No offense to you personally, I just hate seeing people kicking on college degrees like they don't mean anything.
I am an SA who became a bean counter. One of my primary motivations was that I saw f*ck-ups getting rewarded with less work and raises while hard-working SAs suffered with more work and dead end jobs.
I think management deserves to know what is good work and what isn't. If you leave it up to them, they are going to pick something like tickets resolved or customer satisfaction and you are going to see the a**-kissers move up while the hard-working straight-shooters get the shaft.
I think the metrics described here are good ones, but I'd change #4 to the ratio of load to capacity -- which is a measure of efficiency and good planning. Overall, a good SA should be able to maximize delivery of services. I'd also change #5 to security risk measured as ELV (expected loss value). I know a lot of security professionals who hate this and think it is meaningless, but so far none has given me any better metric to show management that security risks are actually getting better managed over time.
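To make those two numbers concrete, here is a small Python sketch. I am assuming ELV is computed the usual way for an expected value, as the sum of (probability of a loss event in the period) times (cost if it happens); the risk entries and dollar figures below are invented purely for illustration.

```python
def expected_loss_value(risks):
    """Sum probability * loss over (probability, loss_in_dollars) pairs."""
    for probability, _ in risks:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
    return sum(probability * loss for probability, loss in risks)


def load_to_capacity(load, capacity):
    """Ratio of observed load to provisioned capacity, in the same unit."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return load / capacity


# Hypothetical yearly risk register before and after a hardening project.
risks_before = [(0.10, 500_000), (0.02, 2_000_000), (0.30, 20_000)]
risks_after = [(0.05, 500_000), (0.01, 2_000_000), (0.30, 20_000)]

elv_before = expected_loss_value(risks_before)
elv_after = expected_loss_value(risks_after)
print(f"ELV before: ${elv_before:,.0f}  after: ${elv_after:,.0f}")

# Hypothetical peak load of 600 requests/s on a cluster sized for 1000.
ratio = load_to_capacity(600, 1000)
print(f"load/capacity: {ratio:.2f}")
```

The probabilities are estimates, which is exactly the objection security people raise, but the trend from one period to the next is still something management can follow.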
In short, think of what a good SA does for a company and propose metrics that reflect that. Do NOT leave it up to management like some have suggested. They are asking for your opinion as an expert. Step up and show that you are the expert by giving them an expert answer. Show them that you know the difference between a good job and a bad job.
My boss is happy with my work so far. Why is he happy? He tells me straight up: "Because I don't hear anything bad."
If the sys admin wasn't doing the job well, neither would anyone else.
Many people here commented that you can't measure productivity of a Sysadmin.
...
This sort of misses the point. And to keep maintaining this stance of "what we
are doing cannot be put in numbers" in a huge company will
ultimately lead to job cuts in the IT Operations departments if times get tougher,
money has to be saved, and heads will be counted. Because everybody else
(Marketing, Finance etc.) *will* have numbers at hand to show how "productive"
they are and how they cannot spare even one FTE.
Add to that that companies like IBM are knocking on the door of your CIO every day
with nice slides showing how IT Operations outsourcing will cut his costs and risk.
You cannot argue against that with the handwaving I'm reading around here.
I work for a big telecommunications provider, and their Service Assurance
department have strong KPIs and process cost numbers running.
The first thing you / your company will have to do is to have unified processes
for operations (look up ITIL Service Management on Google) - if something
like this isn't in place already.
Then define clearly, together with management, what you want to measure (and maybe
optimize as a result of your measurements).
Probably you want to measure total cost of ownership of your IT infrastructure,
based on standards, and compare that.
Also make clear that individual productivity is not what is really important,
but that measuring its result is. For example, the number of times you
solve a problem in production is not important;
the cumulative time needed for this is. The point is not to measure
your personal productivity, but to measure how much time you need for
fixing things compared to preventing things compared to just doing maintenance work
(in ITIL terms: Incident vs. Change vs. Problem Mgmt. time).
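A minimal Python sketch of that kind of breakdown, assuming the team logs hours
against one of the three ITIL categories. The log entries and the `time_share`
helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical team-wide work log for one week: (ITIL category, hours).
work_log = [
    ("incident", 6.0),
    ("change", 10.0),
    ("problem", 4.0),
    ("incident", 2.0),
    ("change", 8.0),
    ("problem", 10.0),
]


def time_share(entries):
    """Return each category's share of total logged hours, in percent."""
    totals = defaultdict(float)
    for category, hours in entries:
        totals[category] += hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: 100 * hours / grand_total for category, hours in totals.items()}


shares = time_share(work_log)
for category, pct in sorted(shares.items()):
    print(f"{category}: {pct:.0f}%")
```

Note that the figures are for the group as a whole, not per person, which is the point made above.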
Together with this initiative, your direct management needs to make blatantly
clear to upper management that the productivity / efficiency of the individual
is only measurable by them, i.e. by direct assessment of your personal skill etc.
That way, you show as a group that you are willing to work transparently, while
at the same time making your future existence within the company more secure.
Last word: We in IT really have to face the fact that it was our stance of
"trust us, you cannot understand our value for the company anyway" that has helped
make outsourcing so attractive for PHBs, because it gives them the ability
to replace us with something they (believe they) understand: a simple contract.
Let me summarize your assertion:
"If attributes do not have numerical quantifications, then they cannot be compared at all."
I hope you can spot the error.
-josh
Try to get him to understand that some deliverables are 'negative' deliverables. Uptime (lack of downtime) or security (lack of intrusions) are good examples. They are partly the expression of your due diligence, good practices, savoir faire, and flair. These will never be piechart-able. If he does not understand that or does not want to understand it, pack away and get working somewhere they deserve you better. A job is not just an exchange of money for work. You have to get some consideration and self-fulfillment out of it.
Ask your CEO if he also canceled his business insurance as well, since no one sued him last year. And ask him if he canceled his life insurance, since he didn't die last year.
This signature is a waste of 42 characters
"No offense to you personally, I just hate seeing people kicking on college degrees like they don't mean anything."
He probably DOES know more than the EEs... about being a sysadmin. As he pointed out, the EEs couldn't run their UNIX box, but likewise I wouldn't expect the grandparent poster to know how to design a circuit board. We all have our stations. People can't be expected to know everything about everything.
-R
I generally agree with the parent, with the possible exception of expecting the manager to come up with all of the metrics. The main things they are going to be looking at and held accountable for themselves are cost estimates and deadlines met. If you are using a trouble ticket system, then turnaround time would be a good, more granular measurement.
If you are experienced, try making a list of services that the system you are administering provides. Weight these according to importance to the consumers of the services and assign a dollar value to these. If providing services for developers, ask them for their top 10: "What is most important to you, system-wise, in performing your job?" Factor in how many users you will have to be assisting. Some things that come to mind are: database performance, system latency, network latency, package purchasing and maintenance, backups and the ability to recover from user screw-ups in a usable time and granularity, etc.
Sometimes it is useful to ask your manager in advance of all this to get their input on what they consider important, as this can save you the trouble of coming up with a meaningful system only to find your manager only wants to know how many hours you work! Be aware as well that many large organizations rotate managers, so all is not lost if the first one is a zero. If the zero has been in this position for years and their boss is their best buddy, get that resume out and remember to ask insiders about the environment before hiring in. Networking is invaluable. Good luck.
Be as you would have the world become.