Ask Slashdot: System Administrator Vs Change Advisory Board
thundergeek (808819) writes "I am the sole sysadmin for nearly 50 servers (win/linux) across several contracts. Now a Change Advisory Board (CAB) is wanting to manage every patch that will be installed on the OS and approve/disapprove for testing on the development network. Once tested and verified, all changes will then need to be approved for production. Windows servers aren't always the best for informing admin exactly what is being 'patched' on the OS, and the frequency of updates will make my efficiency take a nose dive. Now I'll have to track each KB, RHSA, directives and any other 3rd party updates, submit a lengthy report outlining each patch being applied, and then sit back and wait for approval. What should I use/do to track what I will be installing? Is there already a product out there that will make my life a little less stressful on the admin side? Does anyone else have to go toe-to-toe with a CAB? How do you handle your patch approval process?"
Microsoft System Center Configuration Manager?
They want bureaucracy, they make the paperwork. Tell them to track the Windows and distro security pages; the changes are listed there. I would be toast with that kind of red tape: I updated my servers in a pinch at 3 in the morning, immediately after the first news of Heartbleed. 0300, right. How about dusting off your resume and changing jobs? Let them play the report-shuffling game alone.
What we normally do is get a blanket approval if it's coming from the OS provider, with an understanding that patching will be done on a specific schedule.
I.e., if all the patches come from Red Hat, no approval is needed: it's necessary to keep them up to date for security purposes. The same is true for patches pushed out from Microsoft.
Then you're only dealing with 3rd party applications. Even there, we get the more common ones added to the blanket approval, e.g. Adobe. This way you are only telling them you are bringing the systems into line with the latest set of patches provided by the OS vendor, without having to list all the packages that are being updated. Then they only have to ask you whether a program does or does not have a certain bug.
If they have an entire board reviewing patches and micromanaging the system, AND you are the sole admin for 50 servers (and probably several hundred if not thousands of users), then I would say you should go find another job. Obviously they can afford a bunch of paper pushers, but no help in the trenches... I'm just sayin...
ethanol.
The new product your company requires is called: junior admin? Expensive stuff but does the job.
You know that stress reduces your life expectancy? You get the most stress from dumb supervisors/bosses. Go and quit. This also has the effect of making your position on it perfectly clear.
I have to do this and it's no problem at all, although our change management process doesn't sound quite as onerous as yours (I suspect yours will adapt over time -- the CAB will soon get bored if they have to approve every single OS patch).
I have to do a risk analysis for each change that gets made to a system (not just patches). Sometimes this risk analysis is fairly informal, for example if the change is to add more RAM to a VM, it's very unlikely to have a significant adverse impact and is easily reversible, so low risk. Other times the risk analysis (and processes that come out of that) may take a long time and require significant co-ordination with other parts of the organisation I work in.
A good example is if we make a change to a service that impacts the look and feel of that service. It will require co-ordinating with our communications, helpdesk, training and documentation teams as well as other parts of the technical group I work in and the CAB really acts as a check to make sure all of that has happened properly.
There are still a few people in our organisation who see the CAB as a barrier to getting work done, but for me it is really a check to make sure we're delivering changes in a proper way.
I can recommend you take a look at The Phoenix Project by Gene Kim, Kevin Behr and George Spafford. http://itrevolution.com/books/... - I had quite a few "this is where I work" moments whilst reading it :)
Are you working for Citi, by any chance?
Set up a WSUS server, you probably already have the licenses. From there you can pull the patches to it and then push them to the servers that need them as they are approved.
There are commercial products that can also do this in a nicer manner but they cost money.
Ask a simple question: will this patch cost lives if it is applied? If the answer is no, then apply the patch. Justification for applying the patch: no people will die if the patch is applied.
Don't like doing what you are told? Then leave. Life is too short. You've had it too cushy thus far, and clearly just apply patches by default. Welcome to the real world!
Well, welcome to the big leagues.
Any company of any reasonable size NOT doing something like this is stupid.
Not that I have any love for CABs, Change Management, quite the opposite.
However, when the shit hits the fan, someone is going to be doing a Root Cause Analysis, and having all that stuff available is useful/necessary/legally required in some cases.
You're not the only one out there that has to deal with this. Some places you need CAB approval via a Change Request in Remedy just to change port speeds.....
Some sort of Blanket Approval as mentioned earlier will solve a lot of the hassle, and let you minimize required Changes to a smaller subset of actions.
This is known as the change process in ITIL, and it does have a remedy. The remedy is pre-approved changes (standard changes), which should include patching the OS with patches approved by the vendor. It's meant for exactly this situation, and if your change process doesn't have them it's just a paper wall.
The ITIL change process is all about reducing risk. If there is a risk with patching your OS (there is, especially since you mention Windows, it's not that unheard of that a Windows patch makes your whole network inoperative) you have to weigh it against the risk of not patching it (meaning you leave known security holes in).
So, my advice is to get OS patches for your OSes pre-approved by the CAB, that is, when a vendor releases a set of patches you are allowed to patch your systems in the way and the order of that pre-approved change. Of course it's paper-pushing, but use it to your advantage and push some paper yourself. If a server gets compromised and you have the papers (changelog) to prove that you followed procedure, blame will be placed somewhere else. And things will be done differently from there on, since it has been proven that the procedure didn't work, and everybody wins.
Or you could go find another job (like some other posters recommended) where you are the sole *cowboy*-admin and nothing gets done properly. Your choice really.
I bet your CEO or upper level boss is the typical dimwit/jerk: knows nothing about the business, micromanager type of guy, stupid games of power, calls you on purpose once his secretary tells him you are out of the door. Small guy, stupid looking, maybe a goatee, cheap-looking suit. Tell him to sod off and change jobs...
Given your description, you're the sole sysadmin. This means you're the person who should make these decisions - nobody else. If the company disagrees with this, then either you've done a poor job previously, or they don't trust you to do your job for some strange reason.
Now, if it's you that have fscked up on previous occasions, then it's understandable that they want the red tape.
If you haven't, then it's time to put your foot down and say "Nope, that's my job". If they disagree with that - linkedin should be a relatively short distance away, and after you find yourself a new job - simply hand in your resignation pointing out that you have no interest in having babysitters.
"Rune Kristian Viken" - http://www.nwo.no - arca
- Explain to them that having a full board consideration of routine job tasks ain't gonna work and tell them to make some streamlined process (like just letting you do your job).
- Hire a second sysadmin to do your bitchwork.
- Find a new job with more dealable policies.
As a software developer I have multiple times had a development box screwed over by an IT department pushing unneeded drivers and patches that cause problems. I say prove they are good or needed before you waste other people's time. If you just want to push any random patch that comes along then you should be forced to resolve all issues without the traditional "reinstall the machine".
Just apply this dogma.
Follow the instructions diligently, put the administrative burden on them with tons of notices and emails, and don't forget to ask for a quick answer every now and then, because security is at stake.
Sounds like you want to be efficient; the CAB doesn't want that, they want this. If this is impacting other aspects of your job, that needs to be communicated. If you personally can't stand it, then as others have pointed out they will probably get tired of this. If they don't, it doesn't hurt to make sure your resume is up to date; there could be better opportunities out there for you, which others have also pointed out.
Scares the hell out of me how many SYS admins don't know what a Microsoft KB article is... And you are not paying attention not knowing about "patch Tuesday" and where Microsoft announces out-of-band patches... get WSUS and half the work done for you.
Simple really. Ask for another five - ten staff members to manage those servers.
In the voice of Nelson from the Simpsons: Ha-ha!
They want to make your work more transparent. Apparently, they think you have too much spare time, too. Or you are getting fired/outsourced, and this is a gentle reminder to document your work.
Since all the reports are similar, I would just create a script to handle the documentation needs. I would also do extra work: create a report on how much this affects the efficiency of patch / hotfix distribution and how much time all these process changes take (and maybe inflate that number a bit, just a bit).
This would also be a great time to ask for an assistant to ease the workload.
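A minimal sketch of the kind of documentation script suggested above, in Python. Everything here is an assumption for illustration: the `Patch` fields, the report layout, and the placeholder IDs are hypothetical, and a real version would pull its patch list from whatever WSUS, yum, or your patch tool actually reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    patch_id: str   # hypothetical identifier, e.g. a KB or RHSA number
    source: str     # who published it, e.g. "Microsoft" or "Red Hat"
    summary: str    # one-line description copied from the vendor advisory


def build_change_report(host: str, patches: list[Patch]) -> str:
    """Render one plain-text change request covering every patch for a host."""
    lines = [
        f"Change request: patch {host}",
        f"Patches included: {len(patches)}",
        "",
    ]
    # Sort by source, then ID, so the report is stable from run to run.
    for p in sorted(patches, key=lambda p: (p.source, p.patch_id)):
        lines.append(f"- {p.patch_id} ({p.source}): {p.summary}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder data only; these are not real advisory numbers.
    demo = [
        Patch("RHSA-EXAMPLE-2", "Red Hat", "placeholder advisory text"),
        Patch("KB-EXAMPLE-1", "Microsoft", "placeholder advisory text"),
    ]
    print(build_change_report("web01.example.com", demo))
```

One report per host per patch cycle keeps the paperwork proportional to the number of servers rather than the number of patches.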
If they want bureaucracy, give it to them. These people pay you and are entitled to tell you how to work.
Spend lots of time testing windows patches, let other things go. On no account increase your hours to do this.
Also, be sure to mention that testing windows patches on anything other than an exact simulacrum of your development environment is not going to be effective. Get them to allocate a few devs as test subjects. Watch as their efficiency drops too.
Once they've realised that things slow down, and you've stuffed so much paperwork down their throats that they choke, people may loosen up and realise that each patch doesn't need its own CR.
I do actually like change control, but I've got the situation set up where I don't have to ask for every patch.
I work in Change Management for a major telco, I chair the IT CAB, and I oversee server and client patching (amongst many other changes!). When we patch clients, we are patching up to around 30,000 real and virtual desktops - when we patch servers, they also number in the thousands.
There is no way we would allow a sysadmin to patch anything at any time without some level of oversight. An individual admin has no visibility of the other patches, hardware interventions, application releases, network upgrades, business campaigns, etc. that may be happening on our environment at any given moment (it isn't their job to keep track of all that info). For server and client patching the process is as light as possible, but we still maintain close oversight.
On the Wednesday following the second Tuesday of each month (for example), I sit down with the Windows server guys and the Windows client guys, and we review their proposals to patch - usually we have a fairly rapid timescale that we can meet to ensure that the patches are deployed (including pilot testing, etc to catch any issues before everyone's desktop is broken!), sometimes there are other major interventions that overlap, and then we need to make prioritisation decisions and decide which has priority. We have made similar agreements with the Linux teams, where they have a special process to patch, and we have close oversight on Unix patches, as upgrading these servers with a reboot can be a very big deal.
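For anyone wanting to put that review slot on a calendar automatically, here is a small sketch of the date arithmetic in Python. The function names are made up for illustration, and the "Wednesday after the second Tuesday" rule is simply the example given above, not a general standard.

```python
import calendar
from datetime import date, timedelta


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Tuesday (weekday(): Monday=0, Tuesday=1).
    offset = (calendar.TUESDAY - first.weekday()) % 7
    # The second Tuesday is one week after the first.
    return first + timedelta(days=offset + 7)


def review_day(year: int, month: int) -> date:
    """Return the Wednesday immediately after the second Tuesday."""
    return second_tuesday(year, month) + timedelta(days=1)


if __name__ == "__main__":
    print(second_tuesday(2014, 4))  # 2014-04-08
    print(review_day(2014, 4))      # 2014-04-09
```

Note that this is not the same as "the second Wednesday of the month": when the month starts on a Wednesday, the second Wednesday falls before the second Tuesday.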
The last thing you want is an application version release of a critical ordering application happening at the same time as a system software patch, and then to have an issue afterwards - is it the application version, is it the systems patch, was there some conflict with the activities being performed at the same time? Troubleshooting gets more difficult, teams point fingers at each other, and the whole time the business is screaming blue murder.
Of course in an Incident situation there is more flexibility to get things fixed fast, and with security issues I am keen to break open the S-CAB process to expedite a rapid approval flow to ensure that security holes are fixed as fast as possible - of course most changes are encouraged to follow the rules though, the change calendar is published, and everyone knows when the "standard" slots for deployment are, and if most people manage to schedule their changes within those windows, then it minimises potential conflict for everyone.
Change management are not your enemy, they are your friend - once you register your change with them, they have your back, they will guard from other interventions clashing with you, will stop you from inadvertently upsetting the business, and will decrease change related Incidents. However, with great power comes great responsibility, and Change Management need to find the right process for the right type of change - we cannot have a full in depth investigation into every configuration change, every patch, every bug-fix, every new server to be provisioned. A good Change Management team will guide changes to the appropriate flow, and grease the wheels for certain types of interventions - it seems that the CAB mentioned in the summary are still finding their feet a little, and I am sure they will evolve over time as they start to understand which changes are high risk, and which can be allowed to pass with a lighter touch.
-- Pete.
Monochrome - Probably the UK's largest internet BBS
Probably all you are missing over there are scheduled maintenance windows.
You give them a list once per month of what is about to change, get a confirmation, then proceed with them available on standby for fixes on the spot or a rollback.
Try to think of the big picture: how would you maintain the systems if they were life-supporting medical equipment? Why not give the same quality of service?!
System Administrator Vs Change Advisory Board
50 quatloos on the newcomer!
systemd is Roko's Basilisk.
1 of 3 possibilities:
1. You are perfect. You NEVER screw up. In this case, the CAB is just being a PITA.
2. You can make certain types of updates quickly, with little or no risk, and you never screw up. The CAB should agree to make these standard changes with very low overhead. The review of the other types of updates is likely to help YOU, not to mention everyone else in the company that depends on you.
3. It's hard to say in advance - most of the time, things work OK, but sometimes problems arise and there is unexpected downtime (it's NEVER your fault, however). Bite the bullet. You are not running a world class shop and you need help to improve. Anyway, downtime in production always takes more of your time than filing an RFC.
Posting as AC for obvious reasons. I had this situation. A change board was announced; I predicted productivity would take a nose dive; it has. Job satisfaction has taken a nose dive as well. Stuff that used to take hours is now wrapped in red tape and takes weeks instead. I'm currently looking for something else.
Turn this request proactive. Ask for a good vulnerability scanner, one that can perform authenticated scans. Qualys or Rapid7 would be good choices. The scanner will list out all of the vulnerabilities on each server including those that have patches available and those that don't. Let the scanner do the work and then present the report of both patchable and unpatchable vulnerabilities and let them work off that. This is how we do the CAB at our 300 server and 2,000 desktop bank.
After the patches are installed run the same scan again and now you have proof that the patches did in fact close the vulnerability. Both the "before" and "after" scan becomes part of the CAB documentation.
This in fact will seriously increase your workload for months because there are a whole lot more vulnerabilities that you know about and many of those will be configuration issues. But for fifty servers it should be less than six months and then you'll be in a good place. And the CAB will lighten up a lot as things show improvement. Too many sysadmins think that Windows Update and the RHN are the only tools they need for vulnerability management and that is not anywhere close to the truth.
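The before-and-after comparison described above can be sketched in a few lines of Python. This assumes each scan has already been reduced to a set of (host, vulnerability ID) pairs; that format, the names, and the sample IDs are hypothetical and are not the export format of Qualys, Rapid7, or any other scanner.

```python
Finding = tuple[str, str]  # (host, vulnerability ID); a hypothetical format


def compare_scans(before: set[Finding], after: set[Finding]) -> dict[str, list[Finding]]:
    """Split findings into closed, still open, and newly appeared."""
    return {
        "closed": sorted(before - after),     # present before, gone after patching
        "remaining": sorted(before & after),  # present in both scans
        "new": sorted(after - before),        # only seen in the later scan
    }


if __name__ == "__main__":
    # Placeholder findings, not real vulnerability identifiers.
    before = {("db01", "VULN-1"), ("db01", "VULN-2"), ("web01", "VULN-1")}
    after = {("db01", "VULN-2"), ("web01", "VULN-3")}
    for status, findings in compare_scans(before, after).items():
        print(status, findings)
```

The "closed" list is the proof that the patches worked; the "remaining" and "new" lists are what the board gets to prioritise next.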
There is genuine value in a well-run change management program. Organizations need to know what is going on in their infrastructure, and plan things properly. In many industries there is a growing regulatory requirement to have change management, and auditors are looking for these things more often too. Many smaller shops are bringing in change control, so rather than handing in your badge my advice would be to deal with it and learn the lessons.
One lesson is rather than fight it, use it to your advantage. Yes, there's paperwork, however if you follow the system correctly they cannot blame you if things go wrong. What you thought of as freedom was also a risk to your own position as you had sole responsibility - change control means less freedom, but you are covered. Also, you can get budget for better management systems which will make your life easier. Put together a realistic list of what you need and get involved with setting up the change control process. If you stay silent or fight it you won't get a say.
I used to work for a Fortune 100 company. I'm not sure how CAB works at other companies but I get the impression that their implementation was flawed. 1) You could easily go around the process. 2) I'm certain nobody reviews the code - They just kind of discussed it. In my opinion this is a half-baked solution to prevent things from getting pushed to production which could cause problems (errors, leak sensitive info, etc). I am 100% confident that I could have gotten CAB approval for nearly anything. I understand the idea behind CAB but in my experience it isn't effective.
I actually quit that job partially due to things like CAB. Increasingly control was taken away from people in the IT department, and handed to things like CAB or to 3rd parties who managed our systems, databases, etc. The jobs of myself and others in IT staff were being reduced from "actually doing the work" to "submitting tickets and following up on tickets." Nothing like being on hold when calling the 3rd party for a critical issue you yourself know how to fix in 5 minutes. It's also a blast when I had to tell the support guy what commands to run because he wasn't familiar.
And no we didn't fuck up anything to deserve this treatment. It was dictated to us from upper management.
Do exactly what they say to the letter. After the second "patch Tues" where they pound the ever lovin fuck out of Windows Server with updates and the CAB has a pile of paperwork big enough to roast a wild boar they'll suddenly regain a measure of common sense.
The paper trail for the process is the easy part. It's the part where some manager needs to be hand-guided through making a decision he is not qualified to make that adds cost and reduces productivity to the point where it might affect stability.
And that's why enterprise systems are always 2 years behind on patches, chronically unstable and ridden with security holes.
The end result of an underscored CAB process is always less patching, higher costs, and worse or at best similar stability, with the most common Root Cause for downtime being lack of due diligence in patching/maintaining systems.
There really should be some kind of predetermined rule-set for when a patch gets deferred and when it gets implemented. If there is one, you don't need a board to look at every patch, and if there is not, the board will always lead to worse results.
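As a rough illustration of what such a predetermined rule-set could look like, here is a sketch in Python. The three rules and the field names are invented for the example; a real rule-set would be whatever the admin and the board agree on in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchInfo:
    from_os_vendor: bool   # shipped by the OS vendor (hypothetical field)
    passed_test_env: bool  # installed on the test systems without issues


def decide(patch: PatchInfo) -> str:
    """Return 'implement', 'defer', or 'board' under the assumed rules."""
    if not patch.passed_test_env:
        # Assumed rule 1: nothing goes to production until it has passed testing.
        return "defer"
    if patch.from_os_vendor:
        # Assumed rule 2: tested OS-vendor patches are pre-approved.
        return "implement"
    # Assumed rule 3: everything else is the only thing the board looks at.
    return "board"


if __name__ == "__main__":
    print(decide(PatchInfo(from_os_vendor=True, passed_test_env=True)))
    print(decide(PatchInfo(from_os_vendor=False, passed_test_env=True)))
```

With rules like these written down, the board reviews the exceptions instead of every patch.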
Buy something like Tenable Nessus or Rapid7. They make reports very easy and work across Windows, Linux, Cisco, etc. If you get Security Center it will track changes over time and you can see patching trends.
Where I come from CAB stands for "Change Acceptance Board", they don't get to make dumb decisions...
Seems to me that you need to establish a list of pre-approved changes. For example, if you're running Windows and IIS, make sure there's a clause that says anything that comes down the pipeline via Windows Update does not need formal approval. That way you can offload the responsibility, and work, onto Microsoft. You can keep your core software up-to-date. Third party software, same thing for corporations. Student projects and your own shell scripts might need more examination; not a bad idea actually. But if there's a new version of Firefox, why in the world would a Change Advisory Board think it knows more than Mozilla?
fuck this site and popups
BYE
.. as the admin for a couple of hundred Windows servers, an efficient CAB is your friend. As another said, they have your back, and that of the business (and by extension, the poor guy who is up at 4am fixing any issues introduced). That said, I've also worked with companies and CABs that know how everything is written in the ITIL handbook, but with no clue of how to put it into (an efficient) practice. It sounds like your CAB just wants the paperwork done - did you bring on consultants recently? - and thinks/hopes it will mitigate the risks involved with patching. Change request for patching on a development environment? Routine change. Keep up with the news for any issues from this month's patches. You patch dev, or your pre-prod environment or whatever you have, monitor for a few days and if all is good you apply the same patches to your production machines. This is enough risk mitigation for most, and it gets the job done at the end of the day. Make up a nice RACI chart (Responsible, Accountable, Consulted, Informed) for the whole process - you are probably R/A for successful patching, but the CAB will provide the approval for you to go ahead. They won't allow you to do it if there's a big release, or some on-going issues. Then you only need to know how to push the patches and have a good engineer to fix anything that might occur on the night, and the accountability trail takes care of any finger-pointing and addresses any gaps in the process you might have noticed. Start slow, start small. Work your way up in volume as this becomes more like a routine change.
You need to join the modern world, your actions affect more than just the servers and need to be communicated throughout the organization. If you cannot speak to what a patch is going to do, then why the fuck would you apply it?
I'm the zOS Systems Programmer at a Fortune 500 company. When we do system maintenance cycles our CRB just wants to know when the system environment is changing, not what's changing.
If anyone ever does want to know I do have detailed logs and a before and after image of the maintenance management database (SMP/E Consolidated Software Inventory) for them to peruse. They never do; since they don't understand zOS Systems Programming, and they shouldn't have to. It's their job to manage system availability and to ensure that proper testing and system validation activities were performed. It's my job to manage the environmental change.
For anyone who's foolish enough to ask for detailed documentation of every module, macro, load module, dataset, file in the Unix System Services file system that's being modified, well enjoy yourself.
What I won't stand for, is for someone to have veto power over what maintenance goes on. That's my decision, and since I'm the best person in the organization to decide, I do so.
Yes, I know how they are thinking and the pain you are feeling. To accomplish the implementation of this change management process you will need a lot of people working for you. Use this to your advantage. Quickly study up on the subject so your experience with the systems will not leave you with a dog pile of new bosses to tell you how to do your job. Instead insist that you need to hire more people to manage the overhead.
In the end that probably won't work and you'll be kept "at the bottom" where you are now.
These changes are going to be enormously expensive and despite all you have done, it will be perceived that you created this mess by not having a change management system in place to begin with. Of course, they will also see that you don't know about change management and will prefer to hire someone who already knows about it.
Now I'm not going to down change management processes. They can prevent problems and identify people who would otherwise deflect blame and hide in the shadows. But from what I have seen, you're just getting the beginning of the tsunami of changes.
Push for testing systems and additional hardware to support it. Of course it will also require more space and other resources. Try to get ahead of this beast.
We got our CAB to agree to a certain class of routine changes that require minimum review. They don't need any more detail than: test servers updated on Tuesday, production one week later per maintenance windows.
Vermifax
...and necessary* but that doesn't stop some change management boards being needlessly obstructive.
Years back, I was working at a company where all of our servers got patched at build and then never patched again "in case it broke something". Myself and the rest of the ops team begged and pleaded for the business to allow us maintenance windows, allowed to reboot the OS outside of business hours, install patches... all to no avail.
Until the company lost a bid on a contract because they had no maintenance or patch management policy in place, so the business comes running at us screaming about why we don't patch our servers (they would listen to their potential clients about computer security and whatnot, but not to their own staff). Cue us showing them the dozen or so draft maintenance policies that we'd submitted over the years, all of which were rejected by the directors. Red faces all round in that meeting :)
So the latest draft gets pushed into force by a wheelbarrow full of cash and we go out and buy Shavlik, a really rather nice patch management solution... and then our change management board goes nuts when they see our report. Lots of w2k and w2k3 boxes had literally hundreds of service packs and patches outstanding, and, as in the OP's case, the board wanted an individual change raised for each patch going on each server. We then set up an email direct to the change board that gave them Shavlik's automated PDF thingy which gives a list of all the patches outstanding on a server along with a hyperlink to the MS KB or similar... but that wasn't good enough. They wanted a report on what each patch did, which files it altered, all the usual stuff. Now as another poster had pointed out, under ITIL this should all have been "standard change" without needing so much paperwork (seriously, they should be at least aware of ITIL even if they're not going to follow it to the letter) but we could sympathise with them that, even with our planned dependency-based staggered rollout over a 4 week period, this was both a radical shift in company culture and posed a significant opportunity for breakage... but still. Filing about 20,000 change requests it was to be.
So obviously, since we were dealing with obstructive officials, we did exactly that. Did a few dozen hacky shell scripts that took the PDFs that Shavlik made, CURLed down the contents of the link to the KB page and then posted it off into the change management system - one request per patch per machine. After about twenty minutes of this we'd submitted about 400 requests and the change management system (an in-house pile o' shite that wasn't so much written as congealed out of various bits of sharepoint and was universally hated) had slowed to a crawl enough that it took 10mins to open the page. It used funky whizz-bang ajax to load *all* of the pending change requests in the background ("who needs a LIMIT on this SQL parameter?! We're never going to have more than fifty open change requests!" The developer in question also seemed to think that using a LIMIT statement was akin to taking the go-fasta stripes off your car. Wonder if he's doing webscale development now). After some brief arguing where they actually suggested we should open a change request to submit changes - at which point we cackled at the prospect of submitting another 20,000 pre-change-request changes - and after finding their ITIL manual down the back of the sofa they finally agreed that yes, actually, they didn't need quite such a detailed report, and were prepared to accept our risk assessment report as a single change for the first weekend's rollout.
So about 20,000 patches/service packs were staged and installed over the next two months, and luckily we didn't have a single failure due to the patches (yes, I also thought this was miraculous considering the crufty applications). From then on, every patch cycle needed just four changes, one for each week. That's how it should be done.
* Yes, necessary! I've done more than my fair share of JFDI but that just do
Moderation Total: -1 Troll, +3 Goat
This makes no sense unless you also have a QA department where all these patches would be tested. Then the CAB would need to get a list of each patch's description, justification, and impact on existing enterprise applications. Based on this list they could select what can be applied immediately, bundled in a weekly/monthly release, scrapped or postponed until a remediation plan is completed. Without QA results the CAB is useless.
In my experience a CAB usually gets introduced in a small organization if something really got screwed up under the old process. There are exceptions - you could get a CTO who is gung-ho for ITIL, or you may have a new, important customer who insists on "process". But a CAB is an attempt to manage change and prevent problems in the working environment. So unless you have a better solution that will prevent negative impacts from your change process, go do the paperwork, with special attention to any risks or issues associated with the change (extended maintenance window, complex install or backout process, partial or incomplete fixes that still leave issues open). You can probably half-ass the CAB and get your work done almost like the old days, but when the next failed change occurs and they find out you hid risks or didn't do proper research, your ass could be out the door.
OTOH, if you really hate bureaucracy that much, hauling your ass out the door could be your best option - as long as you have a different career in mind besides sysadmin.
We are the 198 proof..
Posting as anon for obvious reasons. We run an estate of c.4-500 servers, 3500 pc's and we never patch, unless we absolutely have to. It solves all sorts of problems, so our XP estate is still running on SP2! We have a full ITIL change management process, but we don't patch! Go figure.
Sometimes lessons that aren't painful aren't learned. Not a big fan of unions, but I'm also not a big fan of management handing down off the cuff decisions that aren't well thought out.
Keep making your patches for 90% of what needs to be done and spoon feed them a few to mull over. Let them take their time. Slowly increase the number of patches for consideration until they get tired of looking at them and you get autonomy back.
CM is there for both preventing fuck ups and dealing with them when they occur. First things first: do you have a test environment? If not, build one. Do you have documented processes? If not, document them.
Proper change management ensures that: 1. people in the group know what is going on, 2. you have a second/third set of eyes to ensure that you have a plan, a backout plan (or plan B in case it can't be backed out) and a test methodology to ensure that a change hasn't broken things, 3. you think about the implications of what you are doing, and 4. business stakeholders are informed and know how to plan around any impact, both expected and unforeseen.
If you aren't doing all of those things already, sorry dude but you are just winging it. That's efficient, etc. until one day it all goes horribly wrong and you need to figure it out on the fly how to get back to normality, with unpredictable outage durations, etc. All of that should be worked out before going live with your changes.
Yes, it sounds like a lot of faffing about for no real benefit, but really, one day it will save your arse. And really, you will be surprised at just how many effects even a single change to a production system can have.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
CABs are not there to ensure quality or function of an organization's IT assets. They are there because they are power-hungry HOA types who think they know how to do your job better than you do, because you're "just a computer nerd" who likes to "play with computers," but they are the ones who really know best.
hi Uncle, I have been fired of yet another job...do you have something for me in your firm, as you told me when I left uni? For sure, we create a new department, the other guy does the work and fill up all the documentation, and you just have to show up every week with a suit and sign some papers...
In a previous life, we passed around virtual machines rather than doing paperwork. Paperwork is to be sure you have a plan to solve the explosion-and-revert problem. Managing machines instead of paper allowed us to include a process for doing an immediate revert on explosion (;-))
The VMs we passed around were Solaris zones, so they were very lightweight. If I wanted to apply an emergency patch to production, I first applied it to an image, put an instance on pre-prod, a physical machine, and varied it into test. After the smoke-test, I varied it into the pool on the load-balancer, and watched it closely. If it fixed the problem and didn't explode, I put lots of instances on the production physical servers and put them into the load-balancer, quiescing the un-patched instances but not erasing them. If the patch blew up after all, I could revert to the previous buggy release as fast as the load-balancer could disconnect people. Not quite as fast as doing an atomic change on a single server, but fast.
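The quiesce-but-don't-erase step is what makes the revert fast. A minimal Python sketch of that idea, assuming a toy in-memory pool rather than any real load-balancer API (the `Pool` class, its methods, and the instance names are all hypothetical):

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model of a load-balancer pool; not a real load-balancer API."""
    active: set = field(default_factory=set)    # instances receiving traffic
    quiesced: set = field(default_factory=set)  # kept around, taking no new traffic

    def roll_out(self, patched):
        """Put patched instances in service; quiesce (do not erase) the old ones."""
        self.quiesced = set(self.active)
        self.active = set(patched)

    def revert(self):
        """Swap the previous, un-patched instances back into service."""
        if not self.quiesced:
            raise RuntimeError("nothing to revert to")
        self.active, self.quiesced = self.quiesced, set()


pool = Pool(active={"zone-old-1", "zone-old-2"})
pool.roll_out(["zone-new-1", "zone-new-2"])
pool.revert()  # the patch blew up: the old instances serve traffic again
```

The point of the sketch is only that revert is a swap of set membership, not a reinstall, which is why it can happen as fast as the load-balancer can move connections.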
This is a minor variant on some old unix norms: 1) you aren't prohibited from doing even silly things, as prohibitions will keep you from doing something brilliant. 2) You can do anything, but you can't hide what you did, 3) you can change things atomically while running, and 4) if you do something dumb, you can revert it immediately.
The process is a variant/predecessor of ITIL, with pre-set apply and revert steps for emergency changes, which are the high-value part of the whole ITIL change process. Non-emergency changes were a little more heavy-weight, as we tested the patch in an instance in QA, then did a simulated UAT overnight (it was automated, but exceedingly slow), reviewed the results and then the de-facto board decided if we could release the image to production, QA and dev. Your paper-oriented CAB does approve all patches to QA and dev, right? I'll bet they missed that part (:-))
--dave
I did once have a customer where I had to do paper-based CAB approvals, but that was because we weren't funded to have a proper dev, and had no QA at all. As you might guess, we still had at least one fiasco. I shortened the contract as much as I could without doing a no-bid in the middle.
davecb@spamcop.net
Sysadmins will never be trusted again.
We are a 100,000+ user operation. Our patch tracking and approval process is a giant paperwork nightmare that does nothing useful. I would get Microsoft Baseline Security Analyzer, run the report after Patch Tuesday, and send it to the management types. Say look at this nice list of required patches. If there are no objections, we will roll them out :D
Replacing a server = One change
Reconfiguring some shared folders = One change
Replacing a whole bunch of printers = One change
There are a couple of advantages with a change process like this.. the first one is collective responsibility, so the poor sysadmin can pass at least some of the blame back to the CAB if it goes wrong. And then also there's the point that other people might have a legitimate input into the process, especially if there are things happening in the business on the same day as the proposed change that IT doesn't know about.
Never email donotemail@WeAreSpammers.com
Consuela, before you dust off anything or clean the latrine, please fill 6 paper forms for the CAB approval... oh, don't forget to do that before taking a piss too. And you know, before going to be with the old boss, it be form AA479.
So you told them it won't work and they didn't listen. Now show them it won't work. Script something to send them a request for each update for each server. When they get flooded with 100+ perfectly valid requests each day they will beg for mercy. Then file one request for 'ongoing ad-hoc security updates for systems' and watch how fast they approve that one.
Things like this always annoy me. Someone has decided either that you don't know your job or that they need more layers of bureaucracy. In my experience it is usually because they think you don't know your job as a system admin. Do I really need a 'paper trail' or make work for things I'm already tasked to manage the risk for? And why would a group of business people (generally) think they are somehow better at mitigating IT risks than the IT person?
Part of what they are supposed to be paying me for is to know that if patch X breaks on the test server it is probably not a good idea to go live with it and I should also know already how to revert the changes in said patch if they have an adverse effect on a live server when they did not on the test server.
Things like this, where they feel a need to micromanage things they don't really understand, just annoy the heck out of me.
we are all invisible unless we choose otherwise
ITIL is for shit. What an awful program, and let me wipe myself with my THREE certifications.
The CAB is where the otherwise unemployable go to die. It never gets streamlined, the castaways from other organizations will find their home there, and those most lacking in intelligence will go here.
As others have told OP, he should honestly quit. Lone admin, they can stand up a CAB but can't hire more help. Bad sign.
Swim away.
God I hate IT.
You are maintaining 50 servers for multiple contracts and they want to know what is patched.
To me this would be a completely fair and normal situation.
I haven't worked with Windows Server in 8+ years. But WSUS was great for telling you what patches were needed and approving them to be installed.
I know that RedHat has similar technologies. Though you can also roll your own as well.
At my own company, I attend a Change Control meeting, and one meeting a month has the Microsoft patch bundle as part of it. The patches get installed on a subset of the company on day X and then on day X+2 they get pushed to everyone. This allows testing.
For production servers that customers are using, to me it is a no-brainer that the customer wants to know and approve what changes are happening to their server.
Depending how busy you are, you might have a resource issue which is fair to complain about.
But that they want to track and approve patches / changes? Suck it up buttercup.
I have on occasion run into patches that break things, and they didn't break things in testing. It's very difficult in an enterprise environment to tease out every possible situation where a patch might cause failure. Now if you have a patch deployment system (like WSUS) you should be able to revoke the patch and pull it back, but the idea that nothing ever breaks if the admin is doing his job is bs. Jackasses like you are why admins are leaving the field in droves.
I've been an admin for a very long time. What I see is a lot of admins think the OS is the most important and fail to understand why the server even exists in the first place. If you patch simply because it was made available, you don't test or know what the application the server is hosting does at all, then are you really doing what is best? Yes, patches break things and often the patch "fixes" something that was low or no risk inside the corporate network to begin with. Too many admins fail to balance the risks with application uptime. ...and that's why you end up with a CAB - to keep everyone informed, to balance risk and to account for audit controls. These usually pop up after too many system outages or lack of information sharing. Admins have a bad habit of being too smart and too busy to keep others informed. I have worked with a lot of CABs in many companies and the best way to work with them is to be proactive in keeping them informed and to build a trust relationship in advance.
Retina http://www.eeye.com/Products.aspx does some of what you are asking. It will audit what patches/updates are missing from a system, generate reports that indicate the risk level, etc.
They keep an up to date list of patches/security bulletins for most products that you subscribe to, so you wouldn't have to do that yourself.
MS SCCM and RH Satellite are the two OS vendor specific patch management solutions. However, your licensing will end up being more expensive per server and could be cost prohibitive for a small company. Your cheapest option would be to script patch groups. You could do this in PowerShell and Bash. The CAB may not require you to list in great detail exactly what each patch modifies. They may only ask you to list out the patch numbers being applied. The point of a CAB is to slow down rapid, poorly thought-out changes and to bring stability and external oversight to IT changes. A CAB may also have a purpose in letting your greater organization know what is going on. You will find the new requirements painful and oftentimes annoying or illogical, however they will also make you and your organization stronger.
There is or can be built a machine that can simulate any physical object. -Church-Turing principle
At our organization (linux), we have "dev" "test" and "production" servers... We mirror the update repos from RHN and Debian (our two primary distros) on the 1st of the month, then week 1, we generate a list of pending patches for "dev", which gets submitted to the CAB. Once the change is approved, we update a file on our puppet configuration system which allows the patch to proceed. The patch process then sends us a report of how many patches were expected, and how many applied. We then use that information to close out the CAB request. Week 2 the same process happens on "test", week 3 is production, and week 4 is usually empty, unless we had a problem with a patch cycle.
It's a combination of tools from the web, a few in-house scripts, and all managed via Puppet.
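The week-to-environment schedule and the "expected vs. applied" close-out check could be sketched roughly like this in Python. This is not our actual tooling: the function names are made up, weeks are assumed to be 7-day blocks counted from the 1st, and the package lists are assumed to be plain name-version strings.

```python
# Week 1 = dev, week 2 = test, week 3 = production; week 4 is left as slack.
WEEK_TO_ENV = {1: "dev", 2: "test", 3: "production"}


def environment_for(day_of_month):
    """Return the environment patched on this day, or None in a slack week."""
    week = (day_of_month - 1) // 7 + 1
    return WEEK_TO_ENV.get(week)


def reconcile(expected, applied):
    """Compare the CAB-approved package list with what the patch run reported."""
    expected, applied = set(expected), set(applied)
    return {
        "missing": sorted(expected - applied),     # approved but not installed
        "unapproved": sorted(applied - expected),  # installed without approval
        "can_close": expected == applied,          # safe to close the request
    }


report = reconcile(["openssl-1.0.1g", "bash-4.2"], ["openssl-1.0.1g"])
print(environment_for(8), report["missing"], report["can_close"])
```

A non-empty "missing" list is what would push a cycle into the otherwise empty week 4.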
Obviously, it would be more difficult with Windows, but as someone else suggested, a WSUS server and some GPOs would go a long way towards ensuring your sanity.
If you are administering 50 servers and do no change control, you're doing it wrong. LANDesk, System Center, Altiris, etc. can track your patching.
It seems that the process is not that bad (even though your description does look a lot worse). Subscribe to the Microsoft Security Bulletins and they have a full description of each patch that they put out on Patch Tuesday (e.g., https://technet.microsoft.com/...). The same goes with RHSA. Subscribe to the updates that you are interested in; these will most likely be your OS, web servers, app servers, other software installed. Similarly, most vendors run security patch announcements. There will likely be a lot of noise but in a couple of months you will know how to extract the information the change advisory board needs. Here's the positive aspect of CAB: if you screw something up, you have someone else to blame! ;-)
For Windows, use WSUS. This can be configured on a Windows Server. Use GPO to have your systems get updates from your internal server. You can also have hierarchies of WSUS servers if your organization is so large that you need different systems to have different policies and administrators.
For Red Hat, there is a roughly equivalent product called Red Hat Satellite:
http://www.redhat.com/products/enterprise-linux/satellite/
Use Pulp. Run your own mirrored software repositories, run a copy of them for test servers and a third copy for prod. Do a diff on directory listings to see what updates are available and write some scripts to categorize them by potential impact (major version change, mission critical etc). Roll them out to test. Print your report and when it's approved, roll them out to prod. As a bonus, you can keep old versions around for downgrades or yum rollbacks.
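The "diff and categorize" step might look something like the following Python sketch. It does not use Pulp's API; it assumes you have already reduced each repository listing to a mapping of package name to version string, and it treats a change in the first dotted component as a major version change. The function name and the sample versions are made up for illustration.

```python
def pending_updates(mirror, installed):
    """Diff two {package: version} listings and tag each update by impact."""
    updates = []
    for name, new in sorted(mirror.items()):
        old = installed.get(name)
        if old is None or old == new:
            continue  # not present on this tier, or already current
        # Assumption: the first dotted component is the major version.
        major_changed = old.split(".")[0] != new.split(".")[0]
        updates.append((name, old, new, "major" if major_changed else "minor"))
    return updates


mirror = {"httpd": "2.4.6", "perl": "5.16.3", "php": "7.0.1"}
test_tier = {"httpd": "2.2.15", "perl": "5.16.3", "php": "5.4.16"}
for row in pending_updates(mirror, test_tier):
    print(row)
```

The resulting rows are the raw material for the printed report; anything tagged "major" is what you would flag for closer review.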
I would recommend the use of Windows Server Update Services (WSUS), either on its own or in combination with System Center Configuration Manager (SCCM). WSUS allows you to approve/unapprove all the updates you want to allow in your network. You can group specific computers to a specific set of approved updates if you would like. You can also use SCCM to manage the change control, what was approved, and what was installed. SCCM can also be used to deploy updates in certain circumstances.
Of both of the options, WSUS is free and can be installed on Windows 2k3 or newer. SCCM is now licensed through the System Center package which may or may not be worth looking into if you want to look at the other built in components to it.
Tell them to stick their change advisory board up their shiny rear end. For fifty servers, with updates applied separately for each, they'll never get anybody to come in and do that task voluntarily. They'd need a small team.
Microsoft do actually spend quite a bit of time ensuring that all the changes they apply are proper stable fixes or improve security. How could some advisory board know more about these proposed fixes than Microsoft's developers who are writing the damn things in the first place?
"Is the Chief Priest an Offlian? Do dragons explode in the wood?"
The way to succeed as a techie is not about being technically brilliant any more, it is about how you can talk people round to your way of thinking and use evidence to back up your points of view.
Meh, sometimes it truly is nice to see people suffer the consequences of disregarding one's expert advice. One time I was ordered to have our production system send a copy of all log warning output to a low-level exec's email (that was hooked to her blackberry).
I stressed this was a very bad idea due to the sheer volume of email this would produce (thousands of warnings per nightly run, from midnight to 3 AM). The order was reiterated. I believe my commit log message for the config change was, "Jodi reaps the whirlwind."
I heard she put her blackberry under a pillow in a different room so she could sleep thanks to the ~2,000 emailed log warnings she was getting each night. Apparently, the notification on the blackberry was getting queued, so it would constantly notify for hours. I have no clue why she couldn't just put it on silent... phone call rings, maybe?
Regardless, she had to put up with it for several days before the config change to disable her email cc could be deployed.
I spent a lot of years working for a company with a very structured tech environment. In all fairness to the company, they work in an industry that is heavily regulated. That said, it was a highly competent development team of SAs that decided what should be on the servers. A bunch of managers on a CAB will not be able to replicate that. With a single SA and only 50 servers, you have a pretty small shop. Sounds like maybe they have plans to grow the business? It sounds like there is no process in place right now except what is inside your head. Hope you never get hit by a bus! Servers are too important to the functioning of a modern business to leave things to that kind of chance. I think the company is doing the right thing but they are attempting too much too soon. Try to help them but start small; maybe define a standard build of each type of server and then use one of the automation tools to keep each server in conformance with the defined standard build. You might even then use one of the tools like Tripwire to notify you when someone or something makes an unauthorized change in your servers. Basically, work with your management to improve the situation. The upside of all this for you is that the management in your company will realize that your job is a lot more complicated than they ever imagined.
90% of your changes won't have any effect on production systems. Just lump those together under "Routine changes to UNIX/Linux production environments" and explain that you've tested those on your sandbox network.
10% of your changes will impact your production systems, even if it's just because it's upgrading Apache or some Perl module that your systems use. This can be as trivial as "updated Perl module; ran complete unit, load and regression tests, everything works fine." to "This is a kernel patch that requires us to power cycle each box. Here is our plan to do this in a way that generates no application downtime." Those are the changes CAB is meant to catch. Document each one in a different request. Document them clearly and thoroughly. Run them by people whom you trust to write good English. Make sure that your deployment, testing, and rollback plans are solid, and document them thoroughly in each request.
After a while, you'll get really good at this, and people will trust your requests.
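As a rough illustration of that 90/10 split, here is a Python sketch that bundles the routine changes into one request and gives each production-impacting change its own. The `impacts_production` flag is an assumption: something you would set yourself after testing, not something a tool tells you. The function and field names are hypothetical.

```python
ROUTINE_TITLE = "Routine changes to UNIX/Linux production environments"


def build_requests(changes):
    """Group tested changes into CAB requests.

    changes: list of dicts with 'name' and 'impacts_production' (bool).
    """
    requests = []
    routine = [c["name"] for c in changes if not c["impacts_production"]]
    if routine:
        # One blanket request covers everything with no production effect.
        requests.append({"title": ROUTINE_TITLE, "items": routine})
    for c in changes:
        if c["impacts_production"]:
            # Each impacting change gets its own request and its own plan.
            requests.append({"title": c["name"], "items": [c["name"]]})
    return requests


changes = [
    {"name": "man-pages update", "impacts_production": False},
    {"name": "tzdata update", "impacts_production": False},
    {"name": "kernel patch (reboot required)", "impacts_production": True},
]
for req in build_requests(changes):
    print(req["title"], len(req["items"]))
```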
Finding God in a Dog
You should start browsing dice.com now. It's escape-hatch time.
Part of the Second American Revolution!
Do you have any? Get one of them to attend the meeting in your stead.
I've done this before. For one particular big project, I had *two* project managers. I got one of them to go to the change review committee meeting for me. Afterwards I got asked why I didn't go. I replied that the project manager was perfectly capable of answering questions about the software that was going in. The person replied that I knew more about the software than the project manager. I said that of course I did, but the people in the change review committee weren't capable of understanding the things that I knew that the project manager didn't.
Seriously... tell them if they want paperwork and wasted time, instead of secure servers, they can get some administrative dork who doesn't know security to do a crappy job for them.
Shavlik Protect (used to be called HFNetchk) will do it too...
WSUS relies on the machines to poll WSUS and see what's available, with Shavlik, you scan the boxes, and push the patches that are missing to them. You can schedule when the patch installs start, stop services or reboot before/after, etc...
With the scan, you can get some decent reporting, including descriptions of the patches to hand to your CAB
That's how I'd handle it. If they want patch reports, that's reasonable. If they want you to patch the test environment a week ahead so that the devs can check for problems and alert you not to proceed, that's reasonable too.
If they want to micromanage the tiny components of your job they can get bent and good luck finding a replacement. No preapproval for routine systems administration activity.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
I'm not sure that CAB is necessarily the right solution, but patching really is a problem and can't be done blindly unless your business can take the occasional production hit.
Admin is outsourced at our company (I'm a former sysadmin who now does application admin, still local) and the contract apparently specifies "current minus one", which means we patch frequently on all platforms. The problem is, the offshore admins have no context, no idea what server provides what resources (and yes, we've tried to educate them -- the information gets "lost" within weeks or months), and no conception of the idea of patching first on dev, then test, then prod. They manage patches by version numbers, not by environments, which means a collection of patches that gets announced (to all and sundry, because they refuse to use the contact list) is a hodgepodge of development, sandbox and production servers. Information is commonly that the servers "will be patched" but not to what version, which has caused contractual support problems (where a server is running a more recent version of the OS than is supported by the app). Other joys have involved bricking prod servers with firmware patches because they didn't try them in test first, insisting on doing nonessential servers on the weekends instead of evenings (because, no context), and forgetting that when it's daytime over there, it's dark over here, and I'm probably not going to be at my desk at o'dark thirty to give some last-minute approval to take a server down.
It's a mess, and the CAB process, as obnoxious as it is (we sit through 150 -- 200 change descriptions every week) serves to catch many of the above issues. The outsourcing company is annoyed by this -- they just want to patch -- but we have the process as self defense against very real issues.
What I'd recommend to the OP is to hire someone to manage the CAB process. We did, and it worked out pretty good.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
If you're in a small company but still managing 50 servers, what is your role on the CAB? You should advocate to be part of the CAB, at the very least, so that you can coordinate processes to streamline critical and security patches, and keep management informed of the process. If you are approaching this with a hostile or obstructive attitude, it doesn't reflect well on you and it injures your ability to get management to listen to you when it counts. A CAB can be a rubber stamp, for the most part, that ensures that there is at least documentation and a modicum of thought going into the maintenance of the company's infrastructure. The creation of a CAB is pretty reasonable, but the key is to be involved in its creation and ongoing existence as well so that you can eliminate the red tape while still documenting the process to CYA in case something goes wrong. And something, sometime, will go wrong.
It's possible that moving to a more validated and controlled environment will increase your workload, but again, documenting the volume of changes, etc. is to your benefit. That is the data you need in order to provide management the justification for getting a second admin hired. This is also the data that you can use to justify a big raise or promotion at your next review, instead of people wondering what exactly you do all day.
Also, as mentioned previously, you really should have WSUS or a similar solution to deploy patches across your organization, both to make administration easier and to help enforce a consistent environment across all the systems. If you patch everything piecemeal, it's quite difficult to tell whether a patch will break one system with a different set of patches than this other one.
This post awaiting approval of Dice Holdings Change Control Management Department.
As the sysadmin you should have a seat on the CAB. That's it.
You're the person doing implementation, and you're the person most suited to evaluate the technical impact of the changes that you're making.
If you were not a lone sysadmin, it would be your director or their delegate who ought to be on the board; as a standalone sysadmin, though, including you is the sane thing to do.
-----
I assume you have various individuals/groups who have an interest in the systems you administer. Users, developers, etc. Also regulators. Don't forget the utility of a good documentation system when the auditors come around*. So you need a process to keep them informed of the upcoming system changes, so they can ensure that their product or process isn't going to be broken by a change.
If you have relatively few of these interested parties, the communications could be handled manually and by you. If that community is large, the procedures need to be formalized and possibly automated. Having a CAB to represent your user community can offload the communications task from you, at the expense of some paperwork.
On the other hand, I've worked in organizations where the CAB was a make-work task for a few layers of management. People whose only other job prospects are standing by an off-ramp with a cardboard sign*.
*At one of my previous jobs, this was the acid test of the utility of our CAB. I had to fill out stacks of paperwork and await their blessing to make a change. But strangely enough, whenever the FAA came around, they were nowhere to be found. I had to walk the auditors through our systems myself.
Have gnu, will travel.
The ITIL change control and CAB process are quite useful when used properly and facilitated by the appropriate staff. That IT changes cause the majority of datacenter outages is not up for debate; proper change control shows us this. However, a CAB's use is to get all changes in a room so impact and stacked changes can be weighed, to ensure these "change outages" do not occur or at least to minimize the risk.
Example: I need to push changes across a WAN but the network team also has a router upgrade planned. Someone is going to fail here. CAB resolves these conflicts.
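The scheduling side of that is simple enough to sketch. A minimal Python example, assuming each change request names the shared resource it touches and carries a start and end time (the tuple layout, the resource labels, and the function name are made up for illustration):

```python
from datetime import datetime
from itertools import combinations


def find_conflicts(changes):
    """Return pairs of changes whose windows overlap on the same resource.

    changes: list of (name, resource, start, end) tuples.
    """
    conflicts = []
    for a, b in combinations(changes, 2):
        same_resource = a[1] == b[1]
        # Two half-open windows overlap if each starts before the other ends.
        overlap = a[2] < b[3] and b[2] < a[3]
        if same_resource and overlap:
            conflicts.append((a[0], b[0]))
    return conflicts


changes = [
    ("WAN config push", "wan", datetime(2014, 4, 20, 22), datetime(2014, 4, 21, 0)),
    ("Router upgrade", "wan", datetime(2014, 4, 20, 23), datetime(2014, 4, 21, 1)),
    ("DB patching", "db", datetime(2014, 4, 20, 22), datetime(2014, 4, 21, 0)),
]
print(find_conflicts(changes))
```

A real CAB also catches conflicts that no resource label would show, which is the argument for putting people in a room rather than relying on a script alone.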
Where CAB is supposed to back off is when we have ITIL-defined standard changes, which don't require approval but are still notified so others can be aware systems WILL be rebooting. Our admins responsible for patching have a pre-approved and scheduled patching cycle of systems, and CAB is the method by which we notify other departments of these updates. We have a stage and production environment for most applications anyway, so we use these for patching also. We'll know if an OS patch or package update breaks because it broke in stage, not production. And CAB is only there to pause patching IF there's an issue or another scheduled maintenance window that is more critical.
We have anywhere from 10 to 30 people in each weekly CAB and it takes 15 minutes (video conf/dial-in bridge). We then have another 15 minute meeting the day of our major maintenance windows. Since doing this we've reduced unplanned outages by 75% simply due to proper scheduling between departments. It takes 30 minutes a week. I spend maybe 30 to 45 minutes a week writing my change requests, so in the grand scheme, the benefit outweighs the administrative overhead.
However, this improvement assumes ITIL was facilitated and maintained by people with enough technical knowledge (through director-level) to make appropriate decisions based on the changes presented. If your CAB is run by paper admins, you're in a world of hurt because they're going to make uninformed approvals regardless of the amount of data presented. In your case, the CAB should not require a write-up of every KB and you should not have to "prove" each update. Instead, if an update of a "test environment" goes fine, the CAB should only be there to make exceptions to your planned maintenance window.
I don't know the size of your contracts but my 30 minutes of CAB and pre-maintenance meetings allow us to maintain several hundred applications across multiple VM and physical server farms with additional AWS infrastructure. If ITIL is new to you and your CAB partners, it does take time to smooth out the workflow. It should take almost a year to get a nice flow because it is quite a shift to everyone's workflow.
Or, if you have a bunch of chuckleheads running the show now, like others said, dust off the resume.
First, as some other folks have said, give them a weekly list, not every day, or every time one's announced.
Actually, that might burn them out... or, they might decide to batch them on their own, and think they'll get to it eventually.
Here's one: give them a weekly list, AND INSIST on a weekly meeting to discuss it. EVERY WEEK, without fail, without cancellations. Tell them that you'll also want a spot meeting when you get critical updates (like yesterday's Java from Oracle, with its 4 that had a CVSS score of 10 on a scale of 10). Insist that if you get those, they need to meet that day, or the next, or give you the pre-approval to put those in without consulting them.
The weekly meetings will get to them in relatively short order, they being so busy and all....
Also, here's another pushback: do you have a testing group that runs regression tests before regular updates, and especially on emergency ones? If not, ask the committee how they expect you to regression test everything. Also, do you have test, as opposed to development, systems? If not, that's another budget item the board needs to approve.
Make them do that real job, professionally. See how much they hate doing it, and maybe it'll go away.
mark
Seriously.
If you're billing by the hour, this should be a GODSEND.
Otherwise, start updating your resume..
And you two are the best example of why a well-designed Change Process is needed.
With a Change Process the Admin (in his role as a Requester) files a Change Request to patch your system. He states which patches are to be installed, including a short description of each patch, when the change is planned to happen and whether there is downtime involved.
You, in your role as the Application Owner, get the chance to review the patches. If there is one which might conflict with your application, or the timeframe of the change happens to fall during your nightly automated, business-critical task (both things the requester might not be aware of), you have the chance to reject the change or shift it to a better-suited timeframe.
If you have questions you can raise them directly during the CAB meeting.
At the end of the day this improves the job for both of you.
I am working as a sysadmin in an enterprise environment and attend weekly CAB meetings. Before that I was a sysadmin in a small environment (250 users, about 15 servers) without a CAB, and in retrospect the first thing I would change in that small environment would be to establish a (simple) Change Process.
One of the things you can do with Puppet is get a change record of what WILL happen, specifically so you can show it to a CAB, get it approved, and then apply it during a scheduled maintenance period.
"Don't teach a man to fish, feed yourself. He's a grown man. Fishing's not that hard." - Ron Swanson
I'm a developer, but check it out:
Deploy WSUS, google it, but it manages updates for a network of servers or computers.
Under no circumstance should you explain what an update does, A. you don't really know B. Microsoft provides KB's, or... descriptions on the server, or maybe WSUS has them too.
If they're looking for risk assessment, assess on a per server basis, not on an update basis (service packs are a notable exclusion to this).
P.S. deploying WSUS means you'll be running 51 servers, but it's worth it.
Welcome to hell, son.
You need to come up with a number of how much time it takes to patch, evaluate and test, turn around time, and a testing environment.
Because you are going to need at least one other person, more likely 2 or 3. Now they will need to justify their CAB decision against actual money.
When I did it, I was at least able to get a whole slew of 'standard' changes that needed CAB notification, but not approval.
Don't let this force you into working a single second more than you already do. When something can't get done, just say "Sorry, I'm mandated to do all this extra work for the CAB, so I didn't have time to get to it." Also chase it up the chain, like "I'm trying to do X, but I can't because of Y; I need more people." Be the flag waver for more budget.
They need to be aware, and feel the impact of a decision this big.
The Kruger Dunning explains most post on
Try to convince the CAB to manage its work by a risk analysis. Configuration changes in a secure environment do tend to cause more problems than patches to workstations using OS vendor's default configuration.
If they want to approve every change, then just flood 'em with paperwork. 1 day spent automating your process should keep them busy for at least 6 months. Meanwhile you won't have any changes that have been approved, so you can get on with the interesting stuff.
Oh and if anything fails, dies, gets a virus (presumably security updates and virus scanner downloads count as changes) or lets the world and his/her dog steal your company's secrets then it's not your fault: the board hadn't approved the change you submitted weeks ago.
The good thing is that the change board are taking on responsibility for the changes. By approving them, provided you execute them exactly as described, then they are to blame for any problems - as they gave approval. Make sure you keep a paper trail and have a record of everything you do.
They will quickly tire of the burdensome, boring and ultimately futile work. So enjoy the honeymoon period. It won't last forever, but if you handle it properly, you can shed the blame for any problems for at least a year - even if the board disbands. The confusion and lack of clear indications of who should have approved what can be spun out for a long time - in the right hands.
Meantime, you will have plenty of opportunity to look for another job.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
And you, sir, are the reason why developers are NOT sysadmins or typically given admin privileges on servers. Sysadmins DO evaluate the patches and updates. That's a requirement before putting them on the machines. Developers however rarely review the latest security updates and changes required by vendors as relevant to the core OS functions - because they don't have to. So they rely on 5 year old driver implementations (which SUCK) and outdated security models (because that's not their job - to deal with security - they write code and new products!). FUCKING BULLSHIT. I have had more developers take down their own machines than I can count. The original comment is right. If you're working with such brittle fucking code that you can't deal with patch deployments - then go work in a VM environment where you can snapshot and rollback with a few clicks. Fucking developers always think they know everything about computers "because I make them dance!" Bullshit. I bet you never took one fucking class on OS development or kernel basics. Stupid fucking arrogance.
"The story so far: In the beginning the Universe was created. This has made a lot of people very angry and has been wide
Write a script to download the KB articles. wkhtmltopdf is extraordinary for this task. Write a cover page each week and you're golden. They won't understand most of what's in the document, but at least it will be documented. Lots of times when people ask for stupid things like this, they just want AN answer; they don't know or care enough about it being the correct answer.
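A rough sketch of such a script in Python, for what it's worth. It relies only on the basic `wkhtmltopdf <url> <output.pdf>` invocation; the KB URL pattern and the KB numbers are assumptions, so check them against whatever Microsoft currently serves. By default it only builds the commands (dry run) so you can eyeball them first.

```python
import shutil
import subprocess

# Assumption: KB articles are reachable at this URL pattern; adjust as needed.
KB_URL = "https://support.microsoft.com/kb/{number}"


def build_command(kb_number, out_dir="."):
    """Return the wkhtmltopdf invocation for one KB article."""
    url = KB_URL.format(number=kb_number)
    return ["wkhtmltopdf", url, f"{out_dir}/KB{kb_number}.pdf"]


def archive(kb_numbers, out_dir=".", dry_run=True):
    """Build (and optionally run) one command per KB number."""
    commands = [build_command(n, out_dir) for n in kb_numbers]
    # Only execute when asked to and when the binary is actually installed.
    if not dry_run and shutil.which("wkhtmltopdf"):
        for cmd in commands:
            subprocess.run(cmd, check=False)
    return commands


# Made-up KB numbers, dry run only: prints the commands without running them.
for cmd in archive(["1234567", "7654321"]):
    print(" ".join(cmd))
```

Call `archive(numbers, dry_run=False)` once the commands look right; the resulting PDFs can go behind the weekly cover page.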
We manage our patching process by exception. By that I mean, "bad" patches are held back and everything else goes through. I am responsible for about 1400 VMs running on 60 physical ESX hosts. We have a small subset of VMs that are a representative sample of the environment. Those get patched two weeks ahead of time. If nothing goes wrong with those servers, the corresponding patches are pushed into production.
We have an exception for the web tier. Those get patched the weekend after patch Tuesday. They are higher risk due to being public facing.
We have some verbiage in our documentation that states something to the effect of, "We expect that the vendors will properly test and QA their patches before releasing them. We do not have the time to fully vet every patch before deploying it. Therefore we take the following steps to mitigate the potential damage to the environment caused by a bad patch...."
Snapshots are taken of all VMs before patching. That way in case something slips through the cracks, we can quickly roll back to a known good state.
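The "by exception" rule above boils down to a small filter. Here is a minimal sketch in Python, assuming a log of when each patch hit the representative VMs and a set of patch ids flagged as bad; the ids and dates are made up, and the web-tier exception is left out.

```python
from datetime import date, timedelta

SOAK = timedelta(days=14)  # the sample VMs are patched two weeks ahead


def ready_for_production(patches, held_back, today):
    """Return ids of patches that soaked on the sample VMs and are not held back.

    patches: dict of patch id -> date it was applied to the sample VMs.
    held_back: set of patch ids flagged as "bad".
    """
    return sorted(
        pid for pid, applied in patches.items()
        if pid not in held_back and today - applied >= SOAK
    )


sample_log = {
    "patch-a": date(2014, 4, 8),
    "patch-b": date(2014, 4, 8),   # flagged as bad below
    "patch-c": date(2014, 4, 20),  # has not soaked long enough yet
}
print(ready_for_production(sample_log, {"patch-b"}, date(2014, 4, 22)))
```

Only the held-back set needs human attention; everything else flows through on schedule, which is the point of managing by exception.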
If you need to go toe to toe with the CAB, make them provide you with a business case justification that details the perceived risk(s) and danger of not mitigating the risk. If they cannot do that, they are completely worthless.
Your counter argument then becomes, "Mitigating your perceived risk is going to take xx hours of time. If the risk were to actually occur, we would lose xx hours of time cleaning up."
At the end of the day, if the risk absolutely has to be mitigated and you do not have enough time with all of your other responsibilities, then they need to provide resources. They can do that by either assigning the task to someone else, or hiring a new employee. Ultimately that is your supervisor's call to make the business case for needing more help. All you can do is quantify the time required to comply, and then make your supervisor make a decision on what you will stop doing because you will now be dealing with the new mandate.
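The counter-argument above is plain arithmetic, so it helps to put it in a form you can hand to a supervisor. A minimal sketch in Python; every number in the example call is made up and should be replaced with your own estimates.

```python
def compare(mitigation_hours_per_year, cleanup_hours_per_incident,
            incidents_per_year):
    """Compare yearly hours spent on the process with hours lost to bad patches."""
    expected_loss = cleanup_hours_per_incident * incidents_per_year
    return {
        "mitigation": mitigation_hours_per_year,
        "expected_loss": expected_loss,
        "net_cost_of_process": mitigation_hours_per_year - expected_loss,
    }


# Made-up inputs: 4 h/week of paperwork vs. one 16 h cleanup every two years.
print(compare(4 * 52, 16, 0.5))
```

A positive net cost means the process eats more hours than the risk it mitigates, which is the figure to bring when asking what you should stop doing.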
Try to understand where the CAB is coming from. They probably have a regulatory requirement, either because of the business that your company is in, or because of the business that your clients are in. They have to prove that they have a functional change management process. It seems like they are just going too far overboard with the process. A change management process just needs to show that people cannot make unauthorized changes to the environment whenever they feel like it. It also needs to show that changes that are made are documented. Potentially destructive changes that could impact application or service availability should be discussed, or at the very least, procedures should be developed to mitigate any potential impact of a destructive change.
Meet them half way. Suggest constructive solutions to address their concerns.
WSUS lets you track patches and installed software much more easily. It works as a pretty good gatekeeper for that sort of stuff. I'd recommend it.
As for dealing with CAB boards, just use logic and reason to destroy them and crush their spirits.
Student in Computer Security here.
Frankly, this is a horrifying example of how NOT to do change management. The whole idea about creating processes for these sorts of things is to SUPPORT your work, not make more of it: for example, the testing + verification lets you manage patches in a slightly more automated way, without worrying that you'll break functionality in the production environment. But CAB (and, from the size of the company you seem to be at, seriously, a whole stinkin' board?) should be inferior to the CIO (effectively, you). That said, there may be ways to handle this gracefully (and if these aren't at least considered, do what they say while quietly looking for a new job).
First, to work with them:
The following roughly outlines a good relationship between you and the CAB: Testing, Verification, and Approval of patches should be automated, as much as possible, with successful tests being given greenlights for deployment. When problems arise, you (or whoever runs the testing) should report on the specific issue: What work-flow is interrupted? How is it interrupted (from the user perspective)? How might this work-flow problem be solved? And which solution is recommended? This is an opportunity for you to offload accountability for decision-making onto the CAB.
If patches become issues in production, the CAB should grant you authority to rollback the changes under certain circumstances (this may be all occasions, or only if some core functionality is down). The guidelines they give you in this case should be flexible IN YOUR FAVOR. This is key - you shouldn't get in trouble for rolling back patches unless you're doing it because that one guy hates the new look of the window. However, if some sort of event occurs, you should be doing up a report (as in the previous paragraph).
Lastly, you ARE the technical security expert - you should be given leeway for rushed patching (i.e., they have to respond within an hour; Heartbleed is a good example of this) in certain circumstances, and those should come with a report.
Second, ammo against them: compile a short report of the incidents in the past, say, 5 years, where patches have caused an outage and approximate (roughly) how much that cost the business in wasted time and money. Include time and money costs to the organization; estimate these, and make sure to fudge the figures a little in favor of the CAB's proposal (i.e., that you need to do this). Once the cost of 'doing nothing' has been analyzed, find the costs of performing their proposal: new testing equipment, up-front and lifetime; hours spent by the CAB; hours spent by you; new employee (ranking and earning less than you, of course); and subtract the cost of any patches you think would NOT have been deployed as a result.
Finally, analyze the business-process I gave you above. What are the costs in man-hours, $$$, &c. for that (or a reasonable variant thereof)? Again, try not to make the estimates look too shiny, but they should be a reasonable compromise between do-nothing and drown-in-documents. Propose the best path you found (which should be do-nothing or do-change-smart); send it to the CAB and your immediate boss; CC'ing it to the boss(es) of the CAB may be called for, but use your own judgement on that, as surprise politics may come back to bite you. If the CAB tries to play politics, that may be the best time to chat with their supers - I assume they don't want to hear about the wasted budget.
Lastly, if everything goes through in their favor (and you haven't been convinced), start internally 'billing' the CAB and everyone else to show your value to whoever's financially responsible; remember that IT is a Supporter and Enabler for primary business functionality, but only do costs + some reasonable for-work pay (although whoever you hire to do the patch pre-reports will pretty much be billed entirely to the CAB). It's last-ditch, but documentation of costs associated with each department is wonderful CYA material.
CAB is your friend too. I don't know what so many people are moaning about here.
If you don't already understand ITIL, go learn it, and then you will realize why CAB is your friend, how it saves you, and why it lets you do what needs to be done without having to take on 100% of the responsibility when something fails.
Make patch recommendation to CAB.
Wait for CAB response.
CAB says ok.
Implement patch.
Patch fails.
Angry CIO wants to know what happened.
Point to CAB.
Done!
This sounds like change management gone wrong.
The idea of change management is to ensure that changes are tracked, but this sounds like bureaucratic crap. Set up WSUS so you can track what patches are applied where, and then talk to the CAB about approving monthly (or whatever schedule) patches en masse. Otherwise you'll end up not patching, and that's an even worse result.
I don't mind change management when it's done with some amount of sanity.
Jeremy Baumgartner
After many years of working with a CAB, my suggestion is to work with them but try to push for a Fast Track process that will allow you to apply lightweight, low-risk changes. It will cut your struggles with the bureaucracy considerably. Also, when appropriate, try to bundle changes together into larger block releases, rather than taking many small revisions through.
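One way to picture the Fast Track idea is as a simple routing rule. This is only a sketch in Python: the "os-vendor" source label, the low/high risk labels and the change ids are assumptions, and what actually qualifies is whatever you and your CAB agree on.

```python
from collections import defaultdict

# Assumption: sources covered by a pre-approved ("fast track") agreement.
FAST_TRACK_SOURCES = {"os-vendor"}


def route(changes):
    """Split changes into a bundled fast-track release and full-CAB items.

    changes: list of (change id, source, risk) tuples, risk in {"low", "high"}.
    """
    routed = defaultdict(list)
    for change_id, source, risk in changes:
        fast = source in FAST_TRACK_SOURCES and risk == "low"
        routed["fast_track_bundle" if fast else "full_cab"].append(change_id)
    return dict(routed)


# Made-up change ids for illustration.
pending = [
    ("chg-1", "os-vendor", "low"),
    ("chg-2", "os-vendor", "low"),
    ("chg-3", "third-party", "low"),
    ("chg-4", "os-vendor", "high"),
]
print(route(pending))
```

Everything in the fast-track bundle goes to the CAB as one block release; only the remainder needs individual review.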
-Bob-
OP starts with: "I am the sole sysadmin for nearly 50 servers (win/linux) across several contracts. ..."
This implies that he's paid hourly. Contracts implies that he's a consultant. If there's anything that a consultant craves, it's billable hours...
I have no problem with your religion until you decide it's reason to deprive others of the truth.
As someone who's worked for mom-n-pop shops and a fortune 500, there's a distinct difference in change management between the two. The key word in his argument is "sole" vs. CM processes at the fortune 500 that involved multiple teams of sys admins and dedicated CABs.
A sys admin who wants to keep their job and is knowledgeable of update systems like SCCM & WSUS will shrug this requirement off and handle it, but I can certainly empathize, as this adds more work with little to no benefit given the scope of the network and the organization. I'm willing to bet the company he works for is trying out new processes (running as much stuff through CAB as possible), will see them fail and then recede, so he just has to weather the BS.
Lastly, a fortune 500 CM probably has entirely different qualifications for reviewing technical patches and service packs than a small business CAB formed yesterday.
However where I have seen problems:
1). The risk estimation system can be 100% biased towards the idea that Change = Risk. If that is true (it's not) then doing nothing is the risk-free option. Which is a totally skewed perspective.
2). I don't think that CABs generally place enough value on the time they take up. If they actually put a $ value on the amount of time they suck and were forced to do a business justification, with ongoing reviews every year as part of the budgeting process, I suspect that an attitude change would be seen. Without that the CAB is inward focused on its own mandate and problems.
3). Risk estimation never seems to have any place for Reputation. When I admin a system it's highly relevant whether a given patch process has a history of going well or poorly. I alter my behaviour significantly based upon reputation. CAB risk estimation seems to go back to first principles all the time and only asks, "what could go wrong." Like, you mean in theory? Well in theory a helluva lot! If a specific patch process has proven reliable though you may, may decide to dial back the paranoia a bit and run fewer protection mechanisms. And before I hear the howls of protest consider this. The work expended costs the organization money. That's the bottom line. If you find a process dead reliable then maybe you only need 1 backup instead of two.
And that's my biggest beef with CAB. They look at everything as a risk and that's a flawed, limited viewpoint. In point of fact it's a Risk-Reward ratio and you don't get points for only looking at one side of that equation. You add value to your organization by correctly judging the Risk-Reward ratio. If you do less then you aren't doing a complete job.
Two types of system admins here; the kind I would fire, and the kind who deserve to be called professionals. When I took over the shop I manage, there were a number of cowboys. These three amigos typically caused at least an hour of downtime per week with their lax and unprofessional behavior. It has taken 8 years to clean that mess up, introduce a CAB, and get real processes into place. In that time, we've gone from a laughing stock of the organization to one of the more professional units.
You're kidding, right? In any company environment you must follow "Change Management" procedures and that usually involves getting written approval from all project managers that are responsible for each project that is installed on the particular machine. On a Production and/or machines it is usually good policy to be at least one month (possibly six) behind Testing.
I am well aware Redhat are very professional; however, you should never just update without appropriate testing and management approval. As for Microsoft, the same concepts apply. The "cowboy" approach may be ok for home use but put yourself in the shoes of someone who has to explain to a really pissed off management why something went wrong when you were not following "Change Management" procedures.
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
Easy!
I was in the same situation. Since they wanted to have control over what patches I installed, I just told them to tell me. Easy as pie! (the cake) A big plus for me, since the security of the software on the machines was no longer my responsibility... Well, for a week; then they realized that they didn't have the skill set needed for the job and we went back to our old ways.
We have this and it provides zero benefit to us. Keep track of how much time you spent doing CAB related things. If/when they come asking why you're not getting something done you can tell them you're wasting 1/2 your time on CAB stuff.
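If you want the number ready when they ask, a trivial log is enough. A minimal sketch in Python, assuming you note hours per category yourself; the categories and hours below are made up.

```python
def cab_share(entries):
    """Fraction of logged hours spent on CAB work.

    entries: list of (category, hours) tuples; "cab" marks CAB-related work.
    """
    total = sum(hours for _, hours in entries)
    cab = sum(hours for category, hours in entries if category == "cab")
    return cab / total if total else 0.0


# Made-up week: 12 h of CAB paperwork out of 40 h logged.
week = [("cab", 12.0), ("patching", 10.0), ("tickets", 18.0)]
print(f"{cab_share(week):.0%} of logged time went to CAB work")
```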
You don't say why a Change Advisory Board is wanting to manage every patch - is it over-zealous micro-management or is there a wider governance issue?
Really hard. Judgement calls, prayer shots, gut feelings. If you had my job, you'd go home crying the second day. That's why I'm paid more than you, and why the owners don't let you near the servers.
Bring up the liability of not patching in a timely fashion because of the process. Have them sign/approve language that will guarantee you will never be held accountable or liable for delays they create. Raise the issue above their heads to their management or the executive sponsor of the committee.
Or just find another job and then quit. Why are you bothering to care so much? Let them choke on it and take the organization down!
Take everything I say with a grain of salt: I'm not in management and don't have 20 years of system engineer or system administrator experience. We recently implemented a change advisory board and while it's not perfect, it seems to meet our needs without requiring too much. While I haven't read every comment here, many are cynical, but no matter how cynical you become, it's never enough to keep up. But there are also loads of very helpful and useful comments too. It’s been a good couple of hours well spent so far.
There was a time when we shot from the hip. A change would be made that would ultimately affect dozens/hundreds of users, resulting in loads of calls to the help desk. At some point management would be alerted to the ‘trend’ in all the calls, which would result in an investigation that often led to "Oh yeah, this 'tiny' change was made an hour ago." Now that the [potential] source was identified, the work was double checked by the responsible parties, often with a few managers standing nearby, until the problem was found & corrected or the change reverted. There was a lot of foot shooting going on.
We’re not idiots, but we’re not perfect either, which means that sometimes mistakes happen. And occasionally, even after having done all the research, risk & impact assessments, unexpected complications would arise.
I'll admit, there was something nice about operating autonomously, without being micromanaged, scrutinized and often provided anything but constructive criticism. And it was great not having to deal with the bureaucratic red-tape one often has to go through to get a simple change done. But as someone else pointed out, the catalyst that brought about this change was the perception of an unstable system due to ‘lower than acceptable’ success rates when changes were made.
When we adopted some form of change control, which later morphed into a change advisory board, trips to the ER for bullet wounds in the foot dropped dramatically. And when something did go wrong, we weren't fearful for having made an ‘unauthorized’ change.
I don’t think I’m one to resist change. More often than not, I'm the one trying to drive a change and am rarely affected by someone else's change. And when I am, it usually doesn't require a massive cultural, routine, behavioral etc. change on my part. So when it came time to implementing some form of change control, I could understand how it was beneficial and why it was necessary. I’ll admit, it wasn't easy and required some getting used to, but I have an appreciation for what it does for us.
But IMHO, it sounds like, for many, the real crux of the issue is *how* a CAB is implemented. I realize every organization is different, but it goes a little something like this on this side of the fence:
- Create your change request, which amounts to filling out an online form including things like who is doing the work, why are we doing it, how this affects our users, what’s the procedure to make the changes, what’s the testing process, what’s the back out plan etc. You’re encouraged to include as much detail here as possible. Strongly. Encouraged.
- Then you have to ‘socialize’ the changes with the [affected] departments/department heads. This is kind of a gray “wild card” area as it could be a number of individuals, and you could potentially find yourself repeating the same thing multiple times a day over several days. As such, I suggest holding a regular meeting a day or two before the CAB and inviting ‘the powers that be’ to go over your proposed changes. The ‘socialization’ step is arguably the most important one because if questions come up in the CAB, or if just one person isn't comfortable, it almost guarantees it’ll be denied until you work it out. Because of that, I personally think this is absurd and loathe the process, but I obey.
- Finally on CAB day it should be a slam dunk beca
You are looking at it from the technology perspective, not the business perspective. These processes exist to protect the business which is what you should be thinking about (what business services do your servers enable?).