Ask Slashdot: System Administrator Vs Change Advisory Board
thundergeek (808819) writes "I am the sole sysadmin for nearly 50 servers (win/linux) across several contracts. Now a Change Advisory Board (CAB) is wanting to manage every patch that will be installed on the OS and approve/disapprove for testing on the development network. Once tested and verified, all changes will then need to be approved for production. Windows servers aren't always the best for informing admin exactly what is being 'patched' on the OS, and the frequency of updates will make my efficiency take a nose dive. Now I'll have to track each KB, RHSA, directives and any other 3rd party updates, submit a lengthy report outlining each patch being applied, and then sit back and wait for approval. What should I use/do to track what I will be installing? Is there already a product out there that will make my life a little less stressful on the admin side? Does anyone else have to go toe-to-toe with a CAB? How do you handle your patch approval process?"
They want bureaucracy, they make the paperwork. Tell them to track the Windows and distro security pages; the changes are listed there. I would be toast with that kind of red tape: I updated my servers in a pinch immediately after the first news of Heartbleed, at 3 in the morning. 0300, right. How about dusting off your resume and changing jobs? Let them play the report-shuffling game alone.
What we normally do is get a blanket approval if it's coming from the OS provider, with an understanding that patching will be done on a specific schedule.
I.e., if all the patches come from Red Hat, no individual approval is needed, because keeping them up to date is necessary for security purposes. The same is true for patches pushed out by Microsoft.
Then you're only dealing with 3rd party applications. Even there, we get the more common ones added to the blanket approval, e.g. Adobe. This way you are only telling them you are bringing the servers into line with the latest set of patches provided by the OS vendor, without having to list all the packages that are being updated. Then they only have to ask you whether a program does or does not have a certain bug.
ethanol.
New product your company requires is called: junior admin. Expensive stuff, but does the job.
You know that stress reduces your life expectancy? Most stress comes from dumb supervisors/bosses. Go and quit. That also has the effect of making your position on it absolutely clear.
I have to do this and it's no problem at all, although our change management process doesn't sound quite as onerous as yours (I suspect yours will adapt over time -- the CAB will soon get bored if they have to approve every single OS patch).
I have to do a risk analysis for each change that gets made to a system (not just patches). Sometimes this risk analysis is fairly informal, for example if the change is to add more RAM to a VM, it's very unlikely to have a significant adverse impact and is easily reversible, so low risk. Other times the risk analysis (and processes that come out of that) may take a long time and require significant co-ordination with other parts of the organisation I work in.
A good example is if we make a change to a service that impacts the look and feel of that service. It will require co-ordinating with our communications, helpdesk, training and documentation teams as well as other parts of the technical group I work in and the CAB really acts as a check to make sure all of that has happened properly.
There are still a few people in our organisation who see the CAB as a barrier to getting work done, but for me it is really a check to make sure we're delivering changes in a proper way.
I can recommend you take a look at The Phoenix Project by Gene Kim, Kevin Behr and George Spafford. http://itrevolution.com/books/... - I had quite a few "this is where I work" moments whilst reading it :)
Set up a WSUS server, you probably already have the licenses. From there you can pull the patches to it and then push them to the servers that need them as they are approved.
There are commercial products that can also do this in a nicer manner, but they cost money.
Ask a simple question: will this patch cost lives if it is applied? If the answer is no, then apply the patch. Justification for applying the patch: no people will die if the patch is applied.
This is known as the change process in ITIL, and it does have a remedy. The remedy is pre-approved changes (standard changes), which should include patching the OS with patches approved by the vendor. It's meant for exactly this situation, and if your change process doesn't have them it's just a paper wall.
The ITIL change process is all about reducing risk. If there is a risk with patching your OS (there is, especially since you mention Windows, it's not that unheard of that a Windows patch makes your whole network inoperative) you have to weigh it against the risk of not patching it (meaning you leave known security holes in).
So, my advice is to get OS patches for your OSes pre-approved by the CAB, that is, when a vendor releases a set of patches you are allowed to patch your systems in the way and the order of that pre-approved change. Of course it's paper-pushing, but use it to your advantage and push some paper yourself. If a server gets compromised and you have the papers (changelog) to prove that you followed procedure, blame will be placed somewhere else. And things will be done differently from there on, since it has been proven that the procedure didn't work, and everybody wins.
Or you could go find another job (like some other posters recommended) where you are the sole *cowboy*-admin and nothing gets done properly. Your choice really.
I bet your CEO or upper level boss is the typical dimwit/jerk, knows nothing about the business, micromanager type of guy, stupid games of power, calls you on purpose once his secretary tells him you are out of the door. Small guy, stupid looking, maybe a goatee, cheap-looking suit. Tell him to sod off and change jobs...
Given your description, you're the sole sysadmin. This means you're the person who should make these decisions - nobody else. If the company disagrees with this, then either you've done a poor job previously, or they don't trust you to do your job for some strange reason.
Now, if it's you that have fscked up on previous occasions, then it's understandable that they want the red tape.
If you haven't, then it's time to put your foot down and say "Nope, that's my job". If they disagree with that - LinkedIn should be a relatively short distance away, and after you find yourself a new job - simply hand in your resignation, pointing out that you have no interest in having babysitters.
"Rune Kristian Viken" - http://www.nwo.no - arca
As a software developer I have multiple times had a development box screwed over by an IT department pushing unneeded drivers and patches that cause problems. I say prove they are good or needed before you waste other peoples time. If you just want to push any random patch that comes along then you should be forced to resolve all issues without the traditional reinstall the machine.
Just apply this dogma.
Follow the instructions diligently, put the administrative burden on them with tons of notices and emails, and don't forget to ask for a quick answer every now and then, because security is at stake.
oh god Remedy....I used that once.
But the concept is good - you need a 'bug tracker' where the requests for patches can be made to you, and you can then assign them to the CCB. Once they agree to it, they assign it back to you for implementation.
Any dev bugtracker will provide you with this kind of audit trail - think 'requirements' for the CCB authorisation, 'development' for the implementation, 'test' for the testing. You might want to rename these though.
I'd make it web based so access is simple for everyone involved - the last thing you need is an Excel-based solution. I've used Mantis and Redmine, but Bugzilla would work too, as would any number of web based bug/task tracker tools. Get one installed before someone on the CCB says "we'll use a spreadsheet", seriously.
Scares the hell out of me how many SYS admins don't know what a Microsoft KB article is... And you are not paying attention not knowing about "patch Tuesday" and where Microsoft announces out-of-band patches... get WSUS and half the work done for you.
In the voice of Nelson from the Simpsons: Ha-ha!
They want to make your work more transparent. Apparently, they think you have too much spare time, too. Or you're getting fired/outsourced, and this is a gentle reminder to document your work.
Since all the reports are similar, I would just create a script to handle the documentation needs. I would also do extra work: create a report on how much this affects the efficiency of patch / hotfix distribution and how much time all these process changes take (and maybe inflate that number a bit, just a bit).
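A minimal sketch of what such a report script could look like, in Python. Everything here is an assumption for illustration: the Patch record, its fields, the report layout and the KB-style identifiers in the usage note are made up, and a real script would read its input from whatever your patch tool exports.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Patch:
    """One pending patch on one host (hypothetical record layout)."""
    patch_id: str   # e.g. a KB or RHSA identifier
    host: str
    summary: str


def render_change_report(title: str, patches: Iterable[Patch]) -> str:
    """Group pending patches by identifier and render a plain-text report."""
    by_id = {}
    for patch in patches:
        by_id.setdefault(patch.patch_id, []).append(patch)

    lines = [title, "=" * len(title), ""]
    for patch_id in sorted(by_id):
        group = by_id[patch_id]
        hosts = ", ".join(sorted({p.host for p in group}))
        # One entry per patch, with every affected host listed underneath.
        lines.append(f"{patch_id}: {group[0].summary}")
        lines.append(f"  hosts: {hosts}")
    lines.append("")
    lines.append(f"Total: {len(by_id)} distinct patches")
    return "\n".join(lines)
```

Feeding it three records for two made-up identifiers (say KB0000001 on db01 and web01, KB0000002 on web01) gives one entry per patch with its hosts, which is the kind of thing you can attach to a single change request instead of writing it by hand.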
This would also be a great time to ask for an assistant to ease the workload.
I work in Change Management for a major telco, I chair the IT CAB, and I oversee server and client patching (amongst many other changes!). When we patch clients, we are patching up to around 30,000 real and virtual desktops - when we patch servers, they also number in the thousands.
There is no way we would allow a sysadmin to patch anything at any time without some level of oversight; an individual admin has no view of the other patches, hardware interventions, application releases, network upgrades, business campaigns, etc that may be happening on our environment at any given moment (it isn't their job to be keeping track of all of that info). For server and client patching the process is as light as possible, but we still maintain a close oversight.
On the Wednesday following the second Tuesday of each month (for example), I sit down with the Windows server guys and the Windows client guys, and we review their proposals to patch - usually we have a fairly rapid timescale that we can meet to ensure that the patches are deployed (including pilot testing, etc to catch any issues before everyone's desktop is broken!), sometimes there are other major interventions that overlap, and then we need to make prioritisation decisions and decide which has priority. We have made similar agreements with the Linux teams, where they have a special process to patch, and we have close oversight on Unix patches, as upgrading these servers with a reboot can be a very big deal.
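For anyone who wants to put that review slot into a calendar automatically, the date arithmetic can be sketched in a few lines of Python. The function names are my own, and the "Wednesday after the second Tuesday" rule is just the example schedule described above.

```python
import datetime as dt


def second_tuesday(year: int, month: int) -> dt.date:
    """Return the second Tuesday of the given month."""
    first = dt.date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + dt.timedelta(days=days_to_first_tuesday + 7)


def review_day(year: int, month: int) -> dt.date:
    """Return the Wednesday immediately after the second Tuesday."""
    return second_tuesday(year, month) + dt.timedelta(days=1)
```

For April 2014 this gives Tuesday the 8th and a review on Wednesday the 9th.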
The last thing you want is an application version release of a critical ordering application happening at the same time as a system software patch, and then to have an issue afterwards - is it the application version, is it the systems patch, was there some conflict with the activities being performed at the same time? Troubleshooting gets more difficult, teams point fingers at each other, and the whole time the business is screaming blue murder.
Of course in an Incident situation there is more flexibility to get things fixed fast, and with security issues I am keen to break open the S-CAB process to expedite a rapid approval flow to ensure that security holes are fixed as fast as possible - of course most changes are encouraged to follow the rules though, the change calendar is published, and everyone knows when the "standard" slots for deployment are, and if most people manage to schedule their changes within those windows, then it minimises potential conflict for everyone.
Change management are not your enemy, they are your friend - once you register your change with them, they have your back, they will guard from other interventions clashing with you, will stop you from inadvertently upsetting the business, and will decrease change related Incidents. However, with great power comes great responsibility, and Change Management need to find the right process for the right type of change - we cannot have a full in depth investigation into every configuration change, every patch, every bug-fix, every new server to be provisioned. A good Change Management team will guide changes to the appropriate flow, and grease the wheels for certain types of interventions - it seems that the CAB mentioned in the summary are still finding their feet a little, and I am sure they will evolve over time as they start to understand which changes are high risk, and which can be allowed to pass with a lighter touch.
-- Pete.
Monochrome - Probably the UK's largest internet BBS
Probably all you are missing over there are scheduled maintenance windows.
You give them a list once per month of what is about to change, get a confirmation, and proceed with them available on standby for fixes on the spot or a rollback.
Try to think about the big picture: how would you maintain the systems if they were life-supporting medical equipment? Why not give the same quality of service?!
System Administrator Vs Change Advisory Board
50 quatloos on the newcomer!
systemd is Roko's Basilisk.
1 of 3 possibilities:
1. You are perfect. You NEVER screw up. In this case, the CAB is just being a PITA.
2. You can make certain types of updates quickly, with little or no risk, and you never screw up. The CAB should agree to make these standard changes with very low overhead. Review of the other types of updates is likely to help YOU, not to mention everyone else in the company that depends on you.
3. It's hard to say in advance - most of the time, things work OK, but sometimes problems arise and there is unexpected downtime (it's NEVER your fault, however). Bite the bullet. You are not running a world class shop and you need help to improve. Anyway, downtime in production always takes more of your time than filing an RFC.
Turn this request proactive. Ask for a good vulnerability scanner, one that can perform authenticated scans. Qualys or Rapid7 would be good choices. The scanner will list out all of the vulnerabilities on each server including those that have patches available and those that don't. Let the scanner do the work and then present the report of both patchable and unpatchable vulnerabilities and let them work off that. This is how we do the CAB at our 300 server and 2,000 desktop bank.
After the patches are installed, run the same scan again and now you have proof that the patches did in fact close the vulnerabilities. Both the "before" and "after" scans become part of the CAB documentation.
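The before/after comparison is easy to automate once the scanner results are exported. A rough sketch in Python, assuming (my assumption, not any scanner's actual export format) that each finding has been reduced to a (host, vulnerability id) pair:

```python
def compare_scans(before: set, after: set) -> dict:
    """Compare two sets of (host, vulnerability_id) findings.

    Returns which findings were closed by the patch run, which are
    still open, and which only appeared in the second scan.
    """
    return {
        "closed": before - after,
        "remaining": before & after,
        "new": after - before,
    }
```

The "closed" set is your evidence for the CAB, "remaining" is the follow-up list (often configuration issues rather than missing patches), and anything in "new" deserves a look before the change is signed off.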
This in fact will seriously increase your workload for months because there are a whole lot more vulnerabilities that you know about and many of those will be configuration issues. But for fifty servers it should be less than six months and then you'll be in a good place. And the CAB will lighten up a lot as things show improvement. Too many sysadmins think that Windows Update and the RHN are the only tools they need for vulnerability management and that is not anywhere close to the truth.
That's not the big leagues, that's the short bus.
Yes, changes need to be documented. They should be deployed on a test server before going into production. The rest is just people who were presumably traumatized by falling out of a tree as a child seeking revenge.
Take the people in the CAB and replace them with extra admins who are bright enough to know what I said in the 2nd paragraph.
There is genuine value in a well-run change management program. Organizations need to know what is going on in their infrastructure, and plan things properly. In many industries there is a growing regulatory requirement to have change management, and auditors are looking for these things more often too. Many smaller shops are bringing in change control, so rather than handing in your badge my advice would be to deal with it and learn the lessons.
One lesson is rather than fight it, use it to your advantage. Yes, there's paperwork, however if you follow the system correctly they cannot blame you if things go wrong. What you thought of as freedom was also a risk to your own position as you had sole responsibility - change control means less freedom, but you are covered. Also, you can get budget for better management systems which will make your life easier. Put together a realistic list of what you need and get involved with setting up the change control process. If you stay silent or fight it you won't get a say.
sure, but how does that help with having to run the CAB through 102 patches?
I think go for the easy solution: introduce the patches to the board in batches ("Monday updates for week 32").
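Batching like that is a one-function job. A sketch in Python that groups patches by the ISO week of their release date, so each week becomes one submission; the (patch id, release date) input and the "2014-W32" label format are assumptions for illustration:

```python
import datetime as dt
from collections import defaultdict


def weekly_batches(patches):
    """Group (patch_id, release_date) pairs into one batch per ISO week."""
    batches = defaultdict(list)
    for patch_id, released in patches:
        iso_year, iso_week, _weekday = released.isocalendar()
        batches[f"{iso_year}-W{iso_week:02d}"].append(patch_id)
    # Sort the ids inside each batch so the report is stable between runs.
    return {label: sorted(ids) for label, ids in sorted(batches.items())}
```

With made-up ids, two patches released on 4 and 8 August 2014 end up together under "2014-W32", and one released on 11 August goes into "2014-W33".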
the fucking board will not care after 2 weeks anyways so just do lip service for two weeks.
world was created 5 seconds before this post as it is.
I used to work for a Fortune 100 company. I'm not sure how CAB works at other companies but I get the impression that their implementation was flawed. 1) You could easily go around the process. 2) I'm certain nobody reviews the code - They just kind of discussed it. In my opinion this is a half-baked solution to prevent things from getting pushed to production which could cause problems (errors, leak sensitive info, etc). I am 100% confident that I could have gotten CAB approval for nearly anything. I understand the idea behind CAB but in my experience it isn't effective.
I actually quit that job partially due to things like CAB. Increasingly control was taken away from people in the IT department, and handed to things like CAB or to 3rd parties who managed our systems, databases, etc. The jobs of myself and others in IT staff were being reduced from "actually doing the work" to "submitting tickets and following up on tickets." Nothing like being on hold when calling the 3rd party for a critical issue you yourself know how to fix in 5 minutes. It's also a blast when I had to tell the support guy what commands to run because he wasn't familiar.
And no we didn't fuck up anything to deserve this treatment. It was dictated to us from upper management.
Do exactly what they say to the letter. After the second "patch Tues" where they pound the ever lovin fuck out of Windows Server with updates and the CAB has a pile of paperwork big enough to roast a wild boar they'll suddenly regain a measure of common sense.
Buy something like Tenable Nessus or Rapid7. They make reports very easy and work across Windows, Linux, Cisco, etc. If you get Security Center it will track changes over time, so you can see patching trends.
Heck, we have a CR process for anything that touches a live server. I even had to go through the process to get details of a file as it would have resulted in an unexpected file write. By way of background, the server used to fill up during the day's processing and empty out overnight. It got very tight sometimes and when someone made a copy of a file without checking the size, it filled the filesystem and the server fell over. That particular outage cost several million given what the server did.
I want a list of atrocities done in your name - Recoil
Where I come from CAB stands for "Change Acceptance Board", they don't get to make dumb decisions...
Ug. Remedy is such a bitter pill.
My karma is not a Chameleon.
Seems to me that you need to establish a list of pre-approved changes. For example, if you're running Windows and IIS, make sure there's a clause that says anything that comes down the pipeline via Windows Update does not need formal approval. That way you can offload the responsibility, and the work, onto Microsoft. You can keep your core software up-to-date. Third party software, same thing for corporations. Student projects and your own shell scripts might need more examination; not a bad idea actually. But if there's a new version of Firefox, why in the world would a Change Advisory Board think it knows more than Mozilla?
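The pre-approved list can be as simple as a lookup that routes each update either to the lightweight path or to the board. A sketch in Python; the source names on the list and the two labels are placeholders, and what actually goes on the list is for you and the CAB to agree:

```python
# Hypothetical pre-approved list: update sources the CAB has agreed
# do not need an individual review.
PRE_APPROVED_SOURCES = frozenset({"microsoft", "red hat", "mozilla"})


def change_route(source: str) -> str:
    """Return 'pre-approved' for listed update sources, else 'needs review'."""
    if source.strip().lower() in PRE_APPROVED_SOURCES:
        return "pre-approved"
    return "needs review"
```

So a Windows Update rollup goes straight onto the schedule, while a student project or a home-grown shell script comes back as "needs review".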
fuck this site and popups
BYE
.. as the admin for a couple of hundred Windows servers, an efficient CAB is your friend. As another said, they have your back, and that of the business (and by extension, the poor guy who is up at 4am fixing any issues introduced). That said, I've also worked with companies and CABs that know everything that is written in the ITIL handbook, but have no clue how to put it into (an efficient) practice. It sounds like your CAB just wants the paperwork done - did you bring on consultants recently? - and thinks/hopes it will mitigate the risks involved with patching. Change request for patching on a development environment? Routine change. Keep up with the news for any issues from this month's patches. You patch dev, or your pre-prod environment or whatever you have, monitor for a few days and if all is good you apply the same patches to your production machines. This is enough risk mitigation for most, and it gets the job done at the end of the day. Make up a nice RACI chart (Responsible, Accountable, Consulted, Informed) for the whole process - you are probably R/A for successful patching, but the CAB will provide the approval for you to go ahead. They won't allow you to do it if there's a big release, or some on-going issues. Then you only need to know how to push the patches and have a good engineer to fix anything that might occur on the night, and the accountability trail takes care of any finger-pointing and addresses any gaps in the process you might have noticed. Start slow, start small. Work your way up in volume as this becomes more like a routine change.
I'm the zOS Systems Programmer at a Fortune 500 company. When we do system maintenance cycles our CRB just wants to know when the system environment is changing, not what's changing.
If anyone ever does want to know I do have detailed logs and a before and after image of the maintenance management database (SMP/E Consolidated Software Inventory) for them to peruse. They never do; since they don't understand zOS Systems Programming, and they shouldn't have to. It's their job to manage system availability and to ensure that proper testing and system validation activities were performed. It's my job to manage the environmental change.
For anyone who's foolish enough to ask for detailed documentation of every module, macro, load module, dataset, file in the Unix System Services file system that's being modified, well enjoy yourself.
What I won't stand for, is for someone to have veto power over what maintenance goes on. That's my decision, and since I'm the best person in the organization to decide, I do so.
Yes, I know how they are thinking and the pain you are feeling. To accomplish the implementation of this change management process you will need a lot of people working for you. Use this to your advantage. Quickly study up on the subject so your experience with the systems will not leave you with a dog pile of new bosses to tell you how to do your job. Instead insist that you need to hire more people to manage the overhead.
In the end that probably won't work and you'll be kept "at the bottom" where you are now.
These changes are going to be enormously expensive and despite all you have done, it will be perceived that you created this mess by not having a change management system in place to begin with. Of course, they will also see that you don't know about change management and will prefer to hire someone who already knows about it.
Now I'm not going to down change management processes. They can prevent problems and identify people who would otherwise deflect blame and hide in the shadows. But from what I have seen, you're just getting the beginning of the tsunami of changes.
Push for testing systems and additional hardware to support it. Of course it will also require more space and other resources. Try to get ahead of this beast.
We got our CAB to agree to a certain class of routine changes that require minimum review. They don't need any more detail than: test servers updated on Tuesday, production one week later per maintenance windows.
Vermifax
Logout
...and necessary* but that doesn't stop some change management boards being needlessly obstructive.
Years back, I was working at a company where all of our servers got patched at build and then never patched again "in case it broke something". Myself and the rest of the ops team begged and pleaded for the business to allow us maintenance windows, to be allowed to reboot the OS outside of business hours, to install patches... all to no avail.
Until the company lost a bid on a contract because they had no maintenance or patch management policy in place, so the business comes running at us screaming why we don't patch our servers (they would listen to their potential clients about computer security and whatnot, but not to their own staff). Cue us showing them the dozen or so draft maintenance policies that we'd submitted over the years, all of which were rejected by the directors. Red faces all round in that meeting :)
So the latest draft gets pushed into force by a wheelbarrow full of cash and we go out and buy Shavlik, a really rather nice patch management solution... and then our change management board goes nuts when they see our report. Lots of w2k and w2k3 boxes had literally hundreds of service packs and patches outstanding, and, like the OP's CAB, the board wanted an individual change raised for each patch going on each server. We then set up an email direct to the change board that gave them Shavlik's automated PDF thingy which gives a list of all the patches outstanding on a server along with a hyperlink to the MS KB or similar... but that wasn't good enough. They wanted a report on what each patch did, which files it altered, all the usual stuff. Now as another poster had pointed out, under ITIL this should all have been "standard change" without needing so much paperwork (seriously, they should be at least aware of ITIL even if they're not going to follow it to the letter) but we could sympathise with them that, even with our planned dependency-based staggered rollout over a 4 week period, this was both a radical shift in company culture and a significant opportunity for breakage... but still. Filing about 20,000 change requests it was to be.
So obviously, since we were dealing with obstructive officials, we did exactly that. Did a few dozen hacky shell scripts that took the PDFs that Shavlik made, CURLed down the contents of the link to the KB page and then posted it off into the change management system - one request per patch per machine. After about twenty minutes of this we'd submitted about 400 requests and the change management system (an in-house pile o' shite that wasn't so much written as congealed out of various bits of sharepoint and was universally hated) had slowed to a crawl enough that it took 10mins to open the page. It used funky whizz-bang ajax to load *all* of the pending change requests in the background ("who needs a LIMIT on this SQL parameter?! We're never going to have more than fifty open change requests!" The developer in question also seemed to think that using a LIMIT statement was akin to taking the go-fasta stripes off your car. Wonder if he's doing webscale development now). After some brief arguing where they actually suggested we should open a change request to submit changes - at which point we cackled at the prospect of submitting another 20,000 pre-change-request changes - and after finding their ITIL manual down the back of the sofa they finally agreed that yes, actually, they didn't need quite such a detailed report, and were prepared to accept our risk assessment report as a single change for the first weekend's rollout.
So about 20,000 patches/service packs were staged and installed over the next two months, and luckily we didn't have a single failure due to the patches (yes, I also thought this was miraculous considering the crufty applications). From then on, every patch cycle needed just four changes, one for each week. That's how it should be done.
* Yes, necessary! I've done more than my fair share of JFDI but that just do
Moderation Total: -1 Troll, +3 Goat
This makes no sense unless you also have a QA department where all these patches would be tested. Then the CAB would need to get a list of the patch descriptions, justifications, and impact on existing enterprise applications. Based on this list they could select what can be applied immediately, bundled in a weekly/monthly release, scrapped or postponed until a remediation plan is completed. Without QA results the CAB is useless.
In my experience a CAB usually gets introduced in a small organization if something really got screwed up under the old process. There are exceptions - you could get a CTO who is gung-ho for ITIL, or you may have a new, important customer who insists on "process". But a CAB is an attempt to manage change and prevent problems in the working environment. So unless you have a better solution that will prevent negative impacts from your change process, go do the paperwork, with special attention to any risks or issues associated with the change (extended maintenance window, complex install or backout process, partial or incomplete fixes that still leave issues open). You can probably half-ass the CAB and get your work done almost like the old days, but when the next failed change occurs and they find out you hid risks or didn't do proper research, your ass could be out the door.
OTOH, if you really hate bureaucracy that much, hauling your ass out the door could be your best option - as long as you have a different career in mind besides sysadmin.
We are the 198 proof..
"That particular outage cost several million given what the server did."
The problem is not the admin's actions, it's "what the server did".
The application the business was dependent on to generate millions of dollars was designed in such a fragile way that it could fail as a result of whatever happened to just one server....
You see... this is bad architecture. Servers are prone to failure, even when designed with redundant components.
It is improper for a business application that generates revenue to be sensitive to a single or double server failure. Critical applications should be architected with a level of robustness that reflects their level of importance.
CM is there for both preventing fuck ups and dealing with them when they occur. First things first: do you have a test environment? If not, build one. Do you have documented processes? If not, document them.
Proper change management ensures that: 1. people in the group know what is going on; 2. you have a second/third set of eyes to check that you have a plan, a backout plan (or plan B in case it can't be backed out) and a test methodology to confirm that a change hasn't broken things; 3. you think about the implications of what you are doing; and 4. business stakeholders are informed and know how to plan around any impact, both expected and unforeseen.
If you aren't doing all of those things already, sorry dude but you are just winging it. That's efficient, etc. until one day it all goes horribly wrong and you need to figure it out on the fly how to get back to normality, with unpredictable outage durations, etc. All of that should be worked out before going live with your changes.
Yes, it sounds like a lot of faffing about for no real benefit, but really, one day it will save your arse. And really, you will be surprised at just how many effects even a single change to a production system can have.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
In a previous life, we passed around virtual machines rather than doing paperwork. Paperwork is to be sure you have a plan to solve the explosion-and-revert problem. Managing machines instead of paper allowed us to include a process for doing an immediate revert on explosion (;-))
The VMs we passed around were Solaris zones, so they were very lightweight. If I wanted to apply an emergency patch to production, I first applied it to an image, put an instance on pre-prod, a physical machine, and varied it into test. After the smoke-test, I varied it into the pool on the load-balancer, and watched it closely. If it fixed the problem and didn't explode, I put lots of instances on the production physical servers and put them into the load-balancer, quiescing the un-patched instances but not erasing them. If the patch blew up after all, I could revert to the previous buggy release as fast as the load-balancer could disconnect people. Not quite as fast as doing an atomic change on a single server, but fast.
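The keep-the-old-instances-quiesced idea can be modelled in a few lines. This is only a toy sketch in Python of the bookkeeping, with made-up names; the real thing was Solaris zones behind a load-balancer, not a Python class:

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model of a load-balancer pool that keeps a revert target."""
    active: list = field(default_factory=list)
    quiesced: list = field(default_factory=list)

    def roll_forward(self, patched_instances):
        """Serve from the patched instances; quiesce, but keep, the old ones."""
        self.quiesced = self.active
        self.active = list(patched_instances)

    def revert(self):
        """Put the previous instances back into service and drop the patched ones."""
        if not self.quiesced:
            raise RuntimeError("no quiesced instances to revert to")
        self.active = self.quiesced
        self.quiesced = []
```

The point of the model is that revert() never has to rebuild anything: the previous instances are still sitting there, so going back is only a change of pool membership.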
This is a minor variant on some old unix norms: 1) you aren't prohibited from doing even silly things, as prohibitions will keep you from doing something brilliant. 2) You can do anything, but you can't hide what you did, 3) you can change things atomically while running, and 4) if you do something dumb, you can revert it immediately.
The process is a variant/predecessor of ITIL, with pre-set apply and revert steps for emergency changes, which are the high-value part of the whole ITIL change process. Non-emergency changes were a little more heavy-weight, as we tested the patch in an instance in QA, then did a simulated UAT overnight (it was automated, but exceedingly slow), reviewed the results and then the de-facto board decided if we could release the image to production, QA and dev. Your paper-oriented CAB does approve all patches to QA and dev, right? I'll bet they missed that part (:-))
--dave
I did once have a customer where I had to do paper-based CAB approvals, but that was because we weren't funded to have a proper dev, and had no QA at all. As you might guess, we still had at least one fiasco. I shortened the contract as much as I could without doing a no-bid in the middle.
davecb@spamcop.net
We are a 100,000+ user operation. Our patch tracking and approval process is a giant paperwork nightmare that does nothing useful. I would get Microsoft Baseline Security Analyzer, run the report after Patch Tuesday, and send it to the management types. Say, "Look at this nice list of required patches. If there are no objections, we will roll them out" :D
Replacing a server = One change
Reconfiguring some shared folders = One change
Replacing a whole bunch of printers = One change
There are a couple of advantages to a change process like this. The first is collective responsibility, so the poor sysadmin can pass at least some of the blame back to the CAB if it goes wrong. The second is that other people might have legitimate input into the process, especially if there are things happening in the business on the same day as the proposed change that IT doesn't know about.
Never email donotemail@WeAreSpammers.com
So you told them it won't work and they didn't listen. Now show them it won't work. Script something to send them a request for each update for each server. When they get flooded with 100+ perfectly valid requests each day they will beg for mercy. Then file one request for 'ongoing ad-hoc security updates for systems' and watch how fast they approve that one.
The nightmare of change control existed long before Snowden leaked anything. This is normal big company stuff.
HOA?
I find burying these people with more requests than they can handle causes them to back off.
Things like this always annoy me. Someone has decided either that you don't know your job or that they need more layers of bureaucracy. In my experience it is usually because they think you don't know your job as a system admin. Do I really need a 'paper trail' or make work for things I'm already tasked to manage the risk for? And why would a group of business people (generally) think they are somehow better at mitigating IT risks than the IT person?
Part of what they are supposed to be paying me for is to know that if patch X breaks on the test server it is probably not a good idea to go live with it and I should also know already how to revert the changes in said patch if they have an adverse effect on a live server when they did not on the test server.
Things like this, where they feel a need to micromanage things they don't really understand, just annoy the heck out of me.
we are all invisible unless we choose otherwise
ITIL is for shit. What an awful program, and let me wipe myself with my THREE certifications.
The CAB is where the otherwise unemployable go to die. It never gets streamlined, the castaways from other organizations will find their home there, and those most lacking in intelligence will go here.
As others have told OP, he should honestly quit. Lone admin, they can stand up a CAB but can't hire more help. Bad sign.
Swim away.
God I hate IT.
You are maintaining 50 servers for multiple contracts and they want to know what is patched.
To me this would be a completely fair and normal situation.
I haven't worked with Windows Server in 8+ years. But WSUS was great for telling you what patches were needed and approving them to be installed.
I know that RedHat has similar technologies. Though you can also roll your own as well.
At my own company, I attend a Change Control meeting, and one meeting a month has the Microsoft patch bundle as part of it. The patches get installed on a subset of the company on day X, and then on day X+2 they get pushed to everyone. This allows testing.
For production servers that customers are using to me it is a no brainer that the customer wants to know and approve what changes are happening to their server.
Depending how busy you are, you might have a resource issue which is fair to complain about.
But that they want to track and approve patches / changes? Suck it up buttercup.
I've been an admin for a very long time. What I see is a lot of admins think the OS is the most important and fail to understand why the server even exists in the first place. If you patch simply because it was made available, you don't test or know what the application the server is hosting does at all, then are you really doing what is best? Yes, patches break things and often the patch "fixes" something that was low or no risk inside the corporate network to begin with. Too many admins fail to balance the risks with application uptime. ...and that's why you end up with a CAB - to keep everyone informed, to balance risk and to account for audit controls. These usually pop up after too many system outages or lack of information sharing. Admins have a bad habit of being too smart and too busy to keep others informed. I have worked with a lot of CABs in many companies and the best way to work with them is to be proactive in keeping them informed and to build a trust relationship in advance.
MS SCCM and RH Satellite are the two OS vendor specific patch management solutions. However, your licensing will end up being more expensive per server and could be cost prohibitive for a small company. Your cheapest option would be to script patch groups. You could do this in PowerShell and Bash. The CAB may not require you to list in great detail exactly what each patch modifies. They may only ask you to list the patch numbers being applied. The point of a CAB is to slow down rapid, poorly thought out changes and to bring stability and external oversight to IT changes. The CAB may also have a purpose in letting your greater organization know what is going on. You will find the new requirements painful and often annoying or illogical, but they will also make you and your organization stronger.
There is or can be built a machine that can simulate any physical object. -Church-Turing principle
It seems that the process is not that bad (even though your description does look a lot worse). Subscribe to the Microsoft Security Bulletins and they have a full description of each patch that they put out on Patch Tuesday (e.g., https://technet.microsoft.com/...). The same goes with RHSA. Subscribe to the updates that you are interested in; these will most likely be your OS, web servers, app servers, other software installed. Similarly, most vendors run security patch announcements. There will likely be a lot of noise but in a couple of months you will know how to extract the information the change advisory board needs. Here's the positive aspect of CAB: if you screw something up, you have someone else to blame! ;-)
I would recommend the use of Windows Server Update Services (WSUS), either alone or in combination with System Center Configuration Manager (SCCM). WSUS allows you to approve/unapprove all the updates you want to allow in your network. You can assign specific groups of computers to a specific set of approved updates if you would like. You can also use SCCM to manage the change control, what was approved, and what was installed. SCCM can also be used to deploy updates in certain circumstances.
Of the two options, WSUS is free and can be installed on Windows 2k3 or newer. SCCM is now licensed through the System Center package, which may or may not be worth looking into if you want the other built-in components.
Tell them to stick their change advisory board up their shiny rear end. For fifty servers, with updates applied separately for each, they'll never get anybody to come in and do that task voluntarily. They'd need a small team.
Microsoft do actually spend quite a bit of time ensuring that all the changes they apply are proper stable fixes or improve security. How could some advisory board know more about these proposed fixes than Microsoft's developers who are writing the damn things in the first place?
"Is the Chief Priest an Offlian? Do dragons explode in the wood?"
Yeah, after a couple of weeks of having to run through a few hundred patches at a time (make sure you write at least a page for each patch!) they'll get the hint that this is fucking retarded and back off.
I think you and your parent underestimate the ability of committees to do work that is fucking retarded. I can't count the number of fucking retarded processes at my company that people have been happily doing for years.
Enigma
The way to succeed as a techie is not about being technically brilliant any more, it is about how you can talk people round to your way of thinking and use evidence to back up your points of view.
Meh, sometimes it truly is nice to see people suffer the consequences of disregarding one's expert advice. One time I was ordered to have our production system send a copy of all log warning output to a low-level exec's email (that was hooked to her blackberry).
I stressed this was a very bad idea due to the sheer volume of email this would produce (thousands of warnings per nightly run, from midnight to 3 AM). The order was reiterated. I believe my commit log message for the config change was, "Jodi reaps the whirlwind."
I heard she put her blackberry under a pillow in a different room so she could sleep thanks to the ~2,000 emailed log warnings she was getting each night. Apparently, the notification on the blackberry was getting queued, so it would constantly notify for hours. I have no clue why she couldn't just put it on silent... phone call rings, maybe?
Regardless, she had to put up with it for several days before the config change to disable her email cc could be deployed.
I spent a lot of years working for a company with a very structured tech environment. In all fairness to the company, they work in an industry that is heavily regulated. That said, it was a highly competent development team of SAs that decided what should be on the servers. A bunch of managers on a CAB will not be able to replicate that. With a single SA and only 50 servers, you have a pretty small shop. Sounds like maybe they have plans to grow the business? It sounds like there is no process in place right now except what is inside your head. Hope you never get hit by a bus! Servers are too important to the functioning of a modern business to leave things to that kind of chance. I think the company is doing the right thing but they are attempting too much too soon. Try to help them but start small; maybe define a standard build of each type of server and then use one of the automation tools to keep each server in conformance with the defined standard build. You might even then use one of the tools like Tripwire to notify you when someone or something makes an unauthorized change on your servers. Basically, work with your management to improve the situation. The upside of all this for you is that the management in your company will realize that your job is a lot more complicated than they ever imagined.
90% of your changes won't have any effect on production systems. Just lump those together under "Routine changes to UNIX/Linux production environments" and explain that you've tested those on your sandbox network.
10% of your changes will impact your production systems, even if it's just because it's upgrading Apache or some Perl module that your systems use. This can range from something as trivial as "updated Perl module; ran complete unit, load and regression tests, everything works fine" to "This is a kernel patch that requires us to power cycle each box. Here is our plan to do this in a way that generates no application downtime." Those are the changes CAB is meant to catch. Document each one in a separate request. Document them clearly and thoroughly. Run them by people whom you trust to write good English. Make sure that your deployment, testing, and rollback plans are solid, and document them thoroughly in each request.
After a while, you'll get really good at this, and people will trust your requests.
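If it helps, the 90/10 split above can be sketched in a few lines of Python. This is just an illustration under my own assumptions: `split_changes`, the idea of keeping a set of packages your applications depend on, and the package names in the usage example are all hypothetical, and your own rule for what counts as "impacting" will differ.

```python
def split_changes(updates, app_dependencies, reboot_required=frozenset()):
    """Split pending updates into one routine bundle and individual requests.

    updates          -- iterable of package names pending on a host
    app_dependencies -- packages the hosted applications actually use
    reboot_required  -- packages whose update forces a reboot
    """
    routine, individual = [], []
    for pkg in updates:
        if pkg in app_dependencies or pkg in reboot_required:
            # Touches the application or needs downtime: gets its own request.
            individual.append(pkg)
        else:
            # Goes under the single "routine changes" request.
            routine.append(pkg)
    return routine, individual


routine, individual = split_changes(
    ["tzdata", "httpd", "kernel", "perl-DBI"],
    app_dependencies={"httpd", "perl-DBI"},
    reboot_required={"kernel"},
)
print("Routine bundle:", routine)
print("Separate requests:", individual)
```

Everything in the first list goes into the one lumped request; each entry in the second list still needs its own deployment, testing and rollback write-up.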
Finding God in a Dog
You should start browsing dice.com now. It's escape-hatch time.
Part of the Second American Revolution!
It got very tight sometimes and when someone made a copy of a file without checking the size, it filled the filesystem and the server fell over. That particular outage cost several million given what the server did.
At this point, or better yet several months before you get to this point, it's a good idea to volunteer the information that additional disk storage would cost only several thousand and prevent these kinds of problems.
Depending on what the server really does, a complete spare system ready to take over in the event of even the smallest failure also looks like a good investment. Don't let the business wait until they lose several million before spending a hundred thousand to prevent it.
Why write anything? Include the full expanded content from the MS KB article for the update, they generally run 1-5 pages each if printed on 8.5x11/A4
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Just include the link. Don't bother with the expanded content. Make them feel like they are doing real work by having to click the link.
Bureaucrats need jobs too! They are a help to the organization in the same way that leeches are a help to their host organism.
I'll see your senator, and I'll raise you two judges.
When something gets royally F-ed up, and eventually it will, who is going to get the blame?
I'll see your senator, and I'll raise you two judges.
I would suggest writing a PHP lookup page where all you need to do is copy and paste the requisite KB patch number, and have it scrape the http://support.microsoft.com/k... article for related information and paste it into a pre-canned letter.
Patch Request for KB
This patch is critical to maintain a stable and up-to-date systems environment. Failure to approve and install this patch will leave your systems vulnerable to
Please note that after applying this patch
Please sign off as approved or rejected
Approved by
Printed name
Signature:
Date:
Rejected by
Printed name
Signature:
Date:
Sincerely your system admin,
Copy and paste your KBs, then proofread each letter and make small adjustments where needed. Most KB description articles are close enough that, with proper PHP scripting, you should have no trouble pulling the relevant info from the page into variables and customizing a script with the info they want to see.
Print and repeat. Hand them hard copies, drink beer.
After they sign off on about 30 of them they will get tired and just say just do what you think is best and go back to doing your job.
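The letter-filling half of this is easy to sketch. Below is a rough version in Python rather than PHP, using the standard library's `string.Template`. It does not scrape anything; the `render_letter` function and the `$kb`, `$risk` and `$side_effect` fields are my own guesses at what goes in the blanks of the canned letter above, and you would paste those values in yourself.

```python
from string import Template

# Canned letter; the $-fields are hypothetical placeholders for the blanks.
LETTER = Template("""\
Patch Request for $kb

This patch is critical to maintain a stable and up-to-date systems
environment. Failure to approve and install this patch will leave your
systems vulnerable to $risk.

Please note that after applying this patch $side_effect.

Please sign off as approved or rejected.

Approved by
Printed name:
Signature:
Date:

Rejected by
Printed name:
Signature:
Date:

Sincerely your system admin,
""")


def render_letter(kb, risk, side_effect):
    """Fill in one letter; raises KeyError if a field is left out."""
    return LETTER.substitute(kb=kb, risk=risk, side_effect=side_effect)


if __name__ == "__main__":
    # Example values only; not taken from any real KB article.
    print(render_letter("KB0000000", "the issue described in the article",
                        "a reboot is required"))
```

Using `substitute` rather than `safe_substitute` means a forgotten field fails loudly instead of printing a letter with a raw placeholder in it.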
What I've observed at the customer site that I've been at for 6+ years is that some mid-level administrator will realize his/her job is becoming obsolete so he/she writes a white-paper for management outlining a huge problem that is waiting to bite the company (that upper management might be personally liable for should it ever come to pass)... Management has a bit of a freakout and generates new policy. Voila, author of white-paper is seen to be the resident expert on this problem and is put in charge of handling it. A new department is born and people are hired/transferred. It used to be ISO-900x that drove this stuff, then SOCS came around and provided even better fodder...
That's how I'd handle it. If they want patch reports, that's reasonable. If they want you to patch the test environment a week ahead so that the devs can check for problems and alert you not to proceed, that's reasonable too.
If they want to micromanage tiny components of your job, they can get bent, and good luck finding a replacement. No preapproval for routine systems administration activity.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
I'm not sure that CAB is necessarily the right solution, but patching really is a problem and can't be done blindly unless your business can take the occasional production hit.
Admin is outsourced at our company (I'm a former sysadmin who now does application admin, still local), and the contract apparently specifies "current minus one", which means we patch frequently on all platforms. The problem is, the offshore admins have no context, no idea what server provides what resources (and yes, we've tried to educate them -- the information gets "lost" within weeks or months), and no conception of the idea of patching first on dev, then test, then prod. They manage patches by version numbers, not by environments, which means a collection of patches that gets announced (to all and sundry, because they refuse to use the contact list) is a hodgepodge of development, sandbox and production servers. The information is commonly that the servers "will be patched" but not to what version, which has caused contractual support problems (where a server is running a more recent version of the OS than is supported by the app). Other joys have involved bricking prod servers with firmware patches because they didn't try them in test first, insisting on doing nonessential servers on the weekends instead of evenings (because, no context), and forgetting that when it's daytime over there, it's dark over here, and I'm probably not going to be at my desk at o'dark thirty to give some last minute approval to take a server down.
It's a mess, and the CAB process, as obnoxious as it is (we sit through 150 -- 200 change descriptions every week) serves to catch many of the above issues. The outsourcing company is annoyed by this -- they just want to patch -- but we have the process as self defense against very real issues.
What I'd recommend to the OP is to hire someone to manage the CAB process. We did, and it worked out pretty good.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
If you're in a small company but still managing 50 servers, what is your role on the CAB? You should advocate to be part of the CAB, at the very least, so that you can coordinate processes to streamline critical and security patches, and keep management informed of the process. If you are approaching this with a hostile or obstructive attitude, it doesn't reflect well on you and it injures your ability to get management to listen to you when it counts. A CAB can be a rubber stamp, for the most part, that ensures that there is at least documentation and a modicum of thought going into the maintenance of the company's infrastructure. The creation of a CAB is pretty reasonable, but the key is to be involved in its creation and ongoing existence as well so that you can eliminate the red tape while still documenting the process to CYA in case something goes wrong. And something, sometime, will go wrong.
It's possible that moving to a more validated and controlled environment will increase your workload, but again, documenting the volume of changes, etc. is to your benefit. That is the data you need in order to provide management the justification for getting a second admin hired. This is also the data that you can use to justify a big raise or promotion at your next review, instead of people wondering what exactly you do all day.
Also, as mentioned previously, you really should have WSUS or a similar solution to deploy patches across your organization, both to make administration easier and to help enforce a consistent environment across all the systems. If you patch everything piecemeal, it's quite difficult to tell whether a patch will break one system with a different set of patches than this other one.
This post awaiting approval of Dice Holdings Change Control Management Department.
As the sysadmin you should have a seat on the CAB. That's it.
You're the person doing implementation, and you're the person most suited to evaluate the technical impact of the changes that you're making.
If you were not a lone sysadmin, it would be your director or their delegate who ought to be on the board; as a standalone sysadmin, though, including you is the sane thing to do.
-----
I assume you have various individuals/groups who have an interest in the systems you administer. Users, developers, etc. Also regulators. Don't forget the utility of a good documentation system when the auditors come around*. So you need a process to keep them informed of the upcoming system changes, so they can ensure that their product or process isn't going to be broken by a change.
If you have relatively few of these interested parties, the communications could be handled manually and by you. If that community is large, the procedures need to be formalized and possibly automated. Having a CAB to represent your user community can offload the communications task from you, at the expense of some paperwork.
On the other hand, I've worked in organizations where the CAB was a make-work task for a few layers of management. People whose only other job prospects are standing by an off-ramp with a cardboard sign*.
*At one of my previous jobs, this was the acid test of the utility of our CAB. I had to fill out stacks of paperwork and await their blessing to make a change. But strangely enough, whenever the FAA came around, they were nowhere to be found. I had to walk the auditors through our systems myself.
Have gnu, will travel.
The ITIL change control and CAB process are quite useful when used properly and facilitated by the appropriate staff. That IT changes cause the majority of datacenter outages is not up for debate; proper change control shows us this. The CAB's purpose is to get all changes in one room so impact and stacked changes can be weighed, to ensure these "change outages" do not occur or at least to minimize the risk.
Example: I need to push changes across a WAN but the network team also has a router upgrade planned. Someone is going to fail here. CAB resolves these conflicts.
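The WAN-versus-router clash is really just an overlap check on maintenance windows. Here is a minimal Python sketch of it, assuming each change request lists a start time, an end time and the resources it touches. The tuple layout, the `find_conflicts` name and the sample changes are made up for illustration and are not part of ITIL or any ticketing tool.

```python
from datetime import datetime
from itertools import combinations


def find_conflicts(changes):
    """Return pairs of changes whose windows overlap on a shared resource.

    changes -- list of (name, start, end, resources) tuples, where
               resources is a set of strings.
    """
    conflicts = []
    for a, b in combinations(changes, 2):
        windows_overlap = a[1] < b[2] and b[1] < a[2]
        shared = a[3] & b[3]
        if windows_overlap and shared:
            conflicts.append((a[0], b[0], sorted(shared)))
    return conflicts


night = lambda h, m: datetime(2014, 7, 19, h, m)  # arbitrary example date
changes = [
    ("wan-push", night(22, 0), night(23, 0), {"wan"}),
    ("router-upgrade", night(22, 30), night(23, 30), {"wan", "core-router"}),
    ("db-patch", night(22, 0), night(23, 0), {"db"}),
]
for first, second, shared in find_conflicts(changes):
    print(f"{first} clashes with {second} on {', '.join(shared)}")
```

The database patch runs in the same window but shares nothing with the other two, so it is not flagged; only the two changes that both touch the WAN are.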
Where CAB is supposed to back off is when we have ITIL-defined standard changes, which don't require approval but are still announced so others are aware that systems WILL be rebooting. Our admins responsible for patching have a pre-approved and scheduled patching cycle, and CAB is how we notify other departments of these updates. We have a stage and a production environment for most applications anyway, so we use these for patching too. We'll know if an OS patch or package update breaks something because it broke in stage, not production. And CAB is only there to pause patching IF there's an issue or another scheduled maintenance window that is more critical.
We have anywhere from 10 to 30 people in each weekly CAB and it takes 15 minutes (video conf/dial-in bridge). We then have another 15 minute meeting the day of our major maintenance windows. Since doing this we've reduced unplanned outages by 75% simply due to proper scheduling between departments. It takes 30 minutes a week. I spend maybe 30 to 45 minutes a week writing my change requests, so in the grand scheme, the benefit outweighs the administrative overhead.
However, this improvement assumes ITIL is facilitated and maintained by people with enough technical knowledge (through director level) to make appropriate decisions based on the changes presented. If your CAB is run by paper admins, you're in a world of hurt because they're going to make uninformed approvals regardless of the amount of data presented. In your case, the CAB should not require a write-up of every KB and you should not have to "prove" each update. Instead, if an update of a "test environment" goes fine, the CAB should only be there to make exceptions to your planned maintenance window.
I don't know the size of your contracts, but my 30 minutes of CAB and pre-maintenance meetings allow us to maintain several hundred applications across multiple VM and physical server farms with additional AWS infrastructure. If ITIL is new to you and your CAB partners, it does take time to smooth out the workflow. It should take almost a year to get a nice flow because it is quite a shift to everyone's workflow.
Or, if you have a bunch of chuckleheads running the show now, like others said, dust off the resume.
First, as some other folks have said, give them a weekly list, not every day, or every time one's announced.
Actually, that might burn them out... or, they might decide to batch them on their own, and think they'll get to it eventually.
Here's one: give them a weekly list, AND INSIST on a weekly meeting to discuss it. EVERY WEEK, without fail, without cancellations. Tell them that you'll also want a spot meeting when you get critical updates (like yesterday's Java from Oracle, with its 4 that had a CVSS rating of 10 on a scale of 10). Insist that if you get those, they need to meet that day, or the next, or give you the pre-approval to put those in without consulting them.
The weekly meetings will get to them in relatively short order, they being so busy and all....
Also, here's another pushback: do you have a testing group that runs regression tests before regular updates, and especially on emergency ones? If not, ask the committee how they expect you to regression test everything. Also, do you have test systems, as opposed to development systems? If not, that's another budget item the board needs to approve.
Make them do that real job, professionally. See how much they hate doing it, and maybe it'll go away.
mark
Seriously.
If you're billing by the hour, this should be a GODSEND.
Otherwise, start updating your resume.
It was a very old and very complex system that was midway through having its replacement built. The system was not something you could easily add resources to. Yes, it was a disaster waiting to happen (although it had DR) but, as is often the case, trying to persuade the suits that they needed to spend millions on a system so that they'd get a new one that did the same thing isn't very easy.
I want a list of atrocities done in your name - Recoil
Bah, a pageful of links doesn't have the same weight as a ream of paper dropped on the table in front of each participant =)
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
One of the things you can do with Puppet is get a change record of what WILL happen, specifically so you can show it to a CAB, get it approved, and then apply it during a scheduled maintenance period.
"Don't teach a man to fish, feed yourself. He's a grown man. Fishing's not that hard." - Ron Swanson
I've heard that story before. I once started a job and was told "Sure, this system is a bit touchy, but don't worry about a thing, it's being replaced and won't be needed after next month." My new coworker pointed out that he had been told the same story a year and a half earlier when he joined.
A year later the power supply on the single main server running dontworryitwillbereplacednextmonth blew out and we needed to replace it. Only it ran a very specific version of Netware with a highly specialized database product that nobody knew how to work with any more and even though we had backups it was still next to impossible to even reinstall the base operating system on anything but the original machine. We wound up checking eBay for an exact duplicate of the dead server, raced out to the next city to buy it and then just swapped the hard drives, powered it up and hoped for the best.
It worked, and a quick meeting was held in which my team stressed just how lucky we were to be able to recover from this and that we needed a real solution that didn't involve waiting until next month. The CIO listened to this, nodded his head and suggested that we should try to buy two more replacement servers just in case it happened again.
Another year later, when the company was finally bought up and the office closed down, dontworryitwillbereplacednextmonth was still happily running in the corner of the server room, waiting for a new application that would do everything that it did.
Any month now.
Welcome to hell, son.
You need to come up with a number of how much time it takes to patch, evaluate and test, turn around time, and a testing environment.
Because you are going to need at least one other person, more likely 2 or 3. Now they will need to justify their CAB decision against actual money.
When I did it, I was at least able to get a whole slew of 'standard' changes that needed CAB notification, but not approval.
Don't let this force you into working a single second more than you already do. When something can't get done, just say "Sorry, I'm mandated to do all this extra work for the CAB, so I didn't have time to get to it." Also chase it up the chain, like "I'm trying to do X, but I can't because of Y, I need more people." Be the flag waver for more budget.
They need to be aware, and feel the impact of a decision this big.
The Kruger Dunning explains most post on
If they want to approve every change, then just flood 'em with paperwork. 1 day spent automating your process should keep them busy for at least 6 months. Meanwhile you won't have any changes that have been approved, so you can get on with the interesting stuff.
Oh and if anything fails, dies, gets a virus (presumably security updates and virus scanner downloads count as changes) or lets the world and his/her dog steal your company's secrets then it's not your fault: the board hadn't approved the change you submitted weeks ago.
The good thing is that the change board are taking on responsibility for the changes. By approving them, provided you execute them exactly as described, then they are to blame for any problems - as they gave approval. Make sure you keep a paper trail and have a record of everything you do.
They will quickly tire of the burdensome, boring and ultimately futile work. So enjoy the honeymoon period. It won't last forever, but if you handle it properly, you can shed the blame for any problems for at least a year - even if the board disband. The confusion and lack of clear indications of who should have approved what can be spun out for a long time - in the right hands.
Meantime, you will have plenty of opportunity to look for another job.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
And you, sir, are the reason why developers are NOT sysadmins or typically given admin privileges on servers. Sysadmins DO evaluate the patches and updates. That's a requirement before putting them on the machines. Developers however rarely review the latest security updates and changes required by vendors as relevant to the core OS functions - because they don't have to. So they rely on 5 year old driver implementations (which SUCK) and outdated security models (because that's not their job - to deal with security - they write code and new products!). FUCKING BULLSHIT. I have had more developers take down their own machines than I can count. The original comment is right. If you're working with such brittle fucking code that you can't deal with patch deployments - then go work in a VM environment where you can snapshot and roll back with a few clicks. Fucking developers always think they know everything about computers "because I make them dance!" Bullshit. I bet you never took one fucking class on OS development or kernel basics. Stupid fucking arrogance.
"The story so far: In the beginning the Universe was created. This has made a lot of people very angry and has been wide
We manage our patching process by exception. By that I mean, "bad" patches are held back and everything else goes through. I am responsible for about 1400 VMs running on 60 physical ESX hosts. We have a small subset of VMs that are a representative sample of the environment. Those get patched two weeks ahead of time. If nothing goes wrong with those servers, the corresponding patches are pushed into production.
We have an exception for the web tier. Those get patched the weekend after patch Tuesday. They are higher risk due to being public facing.
We have some verbiage in our documentation that states something to the effect of, "We expect that the vendors will properly test and QA their patches before releasing them. We do not have the time to fully vet every patch before deploying it. Therefore we take the following steps to mitigate the potential damage to the environment caused by a bad patch...."
Snapshots are taken of all VMs before patching. That way in case something slips through the cracks, we can quickly roll back to a known good state.
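For anyone who wants to hand the CAB a calendar instead of a per-patch report, here is a minimal sketch of the "by exception" schedule described above. The assumptions are mine, not the poster's: the sample VMs get a patch on its release day, production follows two weeks later, the web tier goes on the first Saturday after release, and the names `Patch`, `plan` and `held_back` are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SAMPLE_LEAD = timedelta(days=14)  # sample VMs are patched two weeks ahead


@dataclass(frozen=True)
class Patch:
    patch_id: str   # hypothetical identifier, e.g. a KB or RHSA number
    released: date


def patch_tuesday(year, month):
    """Second Tuesday of the given month."""
    first = date(year, month, 1)
    days_to_tuesday = (1 - first.weekday()) % 7  # Monday is 0, Tuesday is 1
    return first + timedelta(days=days_to_tuesday + 7)


def first_saturday_after(day):
    """First Saturday strictly after the given day."""
    days_ahead = (5 - day.weekday()) % 7 or 7
    return day + timedelta(days=days_ahead)


def plan(patch, held_back):
    """Return tier -> install date, or None if the patch is held back."""
    if patch.patch_id in held_back:
        return None  # the exception: known-bad patches never go out
    return {
        "sample": patch.released,
        "web": first_saturday_after(patch.released),
        "production": patch.released + SAMPLE_LEAD,
    }
```

Calling `plan(Patch("KB-EXAMPLE-1", patch_tuesday(2014, 4)), held_back=set())` gives the three dates for one patch. Snapshots before each install are still a separate step.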
If you need to go toe to toe with the CAB, make them provide you with a business case justification that details the perceived risk(s) and danger of not mitigating the risk. If they cannot do that, they are completely worthless.
Your counter argument then becomes, "Mitigating your perceived risk is going to take xx hours of time. If the risk were to actually occur, we would lose xx hours of time cleaning up."
At the end of the day, if the risk absolutely has to be mitigated and you do not have enough time with all of your other responsibilities, then they need to provide resources. They can do that by either assigning the task to someone else, or hiring a new employee. Ultimately that is your supervisor's call to make the business case for needing more help. All you can do is quantify the time required to comply, and then make your supervisor make a decision on what you will stop doing because you will now be dealing with the new mandate.
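To put numbers on that counter argument, a back-of-the-envelope sketch like the following may help. Every figure you feed it is your own estimate; the function and parameter names are hypothetical, and it only compares hours per year, nothing more sophisticated.

```python
def compare_hours(review_hours_per_cycle, cycles_per_year,
                  cleanup_hours_per_incident, incidents_per_year):
    """Compare yearly hours spent on CAB paperwork with yearly cleanup hours."""
    mitigation = review_hours_per_cycle * cycles_per_year
    cleanup = cleanup_hours_per_incident * incidents_per_year
    return {
        "mitigation_hours": mitigation,
        "cleanup_hours": cleanup,
        # positive means the process costs more time than it saves
        "net_cost_of_mitigating": mitigation - cleanup,
    }
```

With made-up inputs of 8 hours of paperwork per monthly cycle and 2 bad patches a year at 16 hours of cleanup each, you would call `compare_hours(8, 12, 16, 2)` and take the resulting dictionary to your supervisor.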
Try to understand where the CAB is coming from. They probably have a regulatory requirement, either because of the business that your company is in, or because of the business that your clients are in. They have to prove that they have a functional change management process. It seems like they are just going too far overboard with the process. A change management process just needs to show that people cannot make unauthorized changes to the environment whenever they feel like it. It also needs to show that changes that are made are documented. Potentially destructive changes that could impact application or service availability should be discussed, or at the very least, procedures should be developed to mitigate any potential impact of a destructive change.
Meet them half way. Suggest constructive solutions to address their concerns.
WSUS lets you track patches and installed software much more easily. It works as a pretty good gatekeeper for that sort of stuff. I'd recommend it.
As for dealing with CAB boards, just use logic and reason to destroy them and crush their spirits.
This sounds like change management gone wrong.
The idea of change management is to ensure that changes are tracked, but this sounds like bureaucratic crap. Set up WSUS so you can track what patches are applied where, and then talk to the CAB about approving monthly (or whatever schedule) patches en masse. Otherwise you'll end up not patching, and that's an even worse result.
I don't mind change management when it's done with some amount of sanity.
Jeremy Baumgartner
After many years of working with a CAB, my suggestion is to work with them but push for a Fast Track process that will allow you to apply lightweight, low-risk changes. It will cut your struggles with the bureaucracy considerably. Also, when appropriate, try to bundle changes together into larger block releases, rather than taking many small revisions through one at a time.
-Bob-
OP starts with: "I am the sole sysadmin for nearly 50 servers (win/linux) across several contracts. ..."
This implies that he's paid hourly. Contracts implies that he's a consultant. If there's anything that a consultant craves, it's billable hours...
I have no problem with your religion until you decide it's reason to deprive others of the truth.
I won't comment on MS Windows, although I don't think what you have said will work very well, since I have never seen Production MS centric machines updated on a weekly basis.
Providing information for Linux (Redhat example) is very easy if you have the RPMs. All you need to do is run "rpm -qp --changelog " on every package associated with a particular kernel release/update and provide that information to the Change Advisory Board (CAB), which may mean changelogs for 100's of packages. This is extremely easy to automate and should only take you a few minutes.
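A minimal sketch of that automation, assuming the downloaded .rpm files sit in one directory and the `rpm` binary is on the PATH. The function names and the `run` parameter (there only so the command can be swapped out for testing) are my own invention, not part of any tool.

```python
import subprocess
from pathlib import Path


def changelog_for(rpm_path, run=subprocess.run):
    """Return the changelog text of one package file via `rpm -qp --changelog`."""
    result = run(
        ["rpm", "-qp", "--changelog", str(rpm_path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_report(rpm_paths, run=subprocess.run):
    """Concatenate the changelogs of all given package files into one report."""
    sections = []
    for rpm_path in rpm_paths:
        header = f"== {Path(rpm_path).name} =="
        sections.append(f"{header}\n{changelog_for(rpm_path, run)}")
    return "\n".join(sections)
```

Something like `build_report(sorted(Path("updates").glob("*.rpm")))` would then produce a single text blob for the CAB, where `updates` is a hypothetical directory name.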
If you provide the above type of info to the CAB then I am quite sure they will do one of two things. 1) Throw a "hissy-fit" (grin) and never want to speak to you again, or 2) Thank you for the information and get back to you in a few months. Of course, to keep on their good side you could just give them the changelog of the kernel you are going to install and explain that this is your reference. In the case of a Redhat distribution, which the company has to pay for, this should be enough, although you may want to list all the packages that will be updated and let the CAB decide if they need their changelogs as well.
I do have to state that in a production, QA, development and to a lesser extent a test and/or "crash and burn" environment you should have appropriate software contracts in place whether it be for a Linux, Microsoft or Unix solution or even some other OS. Having an appropriate software contract in place should save you a lot of problems, and you actually look good with management, especially if you can give the CAB the info they require (not necessarily want) that will get the job done quickly and efficiently.
In the case of Linux it is fairly easy to set up (approx. a day's work) an "in-house" repo "jump" server, keeping in mind that your network people need to get involved here, since all target machines will need network access to this machine (or to multiple machines if you have separate networked environments). On your "repo" server (approx. 100GB+ needed) make sure the appropriate distributions are kept current (within a week), then create links to a staging area that the software updater programs (i.e. yum or apt-get) on the target machines can see. The staging area contains the packages that will be updated against a kernel (changelog provided to the CAB) that they will be referenced against. It must be noted that emergency (i.e. security) patches should always (you need to check) have the kernel that the patch came out with, which means you should update all packages associated with that kernel. Google and your software provider are your best friends here.
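The "create links to a staging area" step could look roughly like this. It is only a sketch under my own assumptions: the mirror and staging directories live on the same repo server, the CAB-approved package file names arrive as a plain set, and `stage_approved` is a made-up name. For a yum repo you would still need to regenerate the repository metadata afterwards (for example with createrepo); that part is not shown.

```python
from pathlib import Path


def stage_approved(mirror_dir, staging_dir, approved_names):
    """Symlink approved package files from the mirror into the staging area.

    Returns the approved names that were not found in the mirror.
    """
    mirror = Path(mirror_dir)
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)

    # Drop links to packages that are no longer on the approved list.
    for link in staging.iterdir():
        if link.is_symlink() and link.name not in approved_names:
            link.unlink()

    missing = []
    for name in sorted(approved_names):
        source = mirror / name
        if not source.is_file():
            missing.append(name)  # approved, but not mirrored yet
            continue
        target = staging / name
        if target.is_symlink() or target.exists():
            continue  # already staged
        target.symlink_to(source.resolve())
    return missing
```

The target machines only ever see the staging directory, so anything the CAB has not approved simply is not there to install.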
Obviously in the case of a company you must follow "Change Management" procedures, and if they don't have one (yes, some companies don't have this) make sure one is put in place, since this covers you if things don't go as planned. Then you would need to fall back to the appropriate part of the company's "Disaster Recovery Plan" (your company does have one that is tested, I hope).
Sounds complicated? Well, it is and it isn't. Basically no company that is serious should be without "Change Management" procedures, and an appropriate tested "Disaster Recovery Plan" should contain a section for backups and recovery processes and the policies covering them. I am aware that some people will disagree with me, but put yourself in the shoes of the System Admin who has to explain to Management why the Production machine crashed and/or data was corrupted or lost because procedures were not followed.
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
You're kidding, right? In any company environment you must follow "Change Management" procedures and that usually involves getting written approval from all project managers that are responsible for each project that is installed on the particular machine. On a Production and/or machines it is usually good policy to be at least one month (possibly six) behind Testing.
I am well aware Redhat are very professional; however, you should never just update without appropriate testing and management approval. As for Microsoft, the same concepts apply. The "cowboy" approach may be ok for home use, but put yourself in the shoes of someone who has to explain to a really pissed off management why something went wrong when you were not following "Change Management" procedures.
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
You don't say why a Change Advisory Board is wanting to manage every patch - is it over-zealous micro-management or is there a wider governance issue?
Take everything I say with a grain of salt: I'm not in management and don't have 20 years of system engineer or system administrator experience. We recently implemented a change advisory board and while it's not perfect, it seems to meet our needs without requiring too much. While I haven't read every comment here, many are cynical, but no matter how cynical you become, it's never enough to keep up. But there are also loads of very helpful and useful comments too. It’s been a good couple of hours well spent so far.
There was a time when we shot from the hip. A change would be made that would ultimately affect dozens/hundreds of users, resulting in loads of calls to the help desk. At some point management would be alerted to the ‘trend’ in all the calls, which would result in an investigation that often led to "Oh yeah, this 'tiny' change was made an hour ago." Now that the [potential] source was identified, the work was double checked by the responsible parties, often with a few managers standing nearby, until the problem was found & corrected or the change reverted. There was a lot of foot shooting going on. We’re not idiots, but we’re not perfect either, which means that sometimes mistakes happen. And occasionally, even after having done all the research, risk & impact assessments, unexpected complications would arise.
I'll admit, there was something nice about operating autonomously, without being micromanaged, scrutinized and often provided anything but constructive criticism; and it was great not having to deal with the bureaucratic red-tape one often has to go through to get a simple change done. But as someone else pointed out, the catalyst that brought about this change was the perception of an unstable system due to ‘lower than acceptable’ success rates when changes were made.
When we adopted some form of change control, which later morphed into a change advisory board, trips to the ER for bullet wounds in the foot dropped dramatically. And when something did go wrong, we weren't fearful for having made an ‘unauthorized’ change. I don’t think I’m one to resist change. More often than not, I'm the one trying to drive a change and am rarely affected by someone else's change. And when I am, it usually doesn't require a massive cultural, routine, behavioral etc. change on my part. So when it came time to implement some form of change control, I could understand how it was beneficial and why it was necessary. I’ll admit, it wasn't easy and required some getting used to, but I have an appreciation for what it does for us.
But IMHO, it sounds like, for many, the real crux of the issue is *how* a CAB is implemented. I realize every organization is different, but it goes a little something like this on this side of the fence:
- Create your change request, which amounts to filling out an online form including things like who is doing the work, why are we doing it, how this affects our users, what’s the procedure to make the changes, what’s the testing process, what’s the back out plan etc. You’re encouraged to include as much detail here as possible. Strongly. Encouraged.
- Then you have to ‘socialize’ the changes with the [affected] departments/department heads. This is kind of a gray “wild card” area as it could be a number of individuals, and you could potentially find yourself repeating the same thing multiple times a day over several days. As such, I suggest holding a regular meeting a day or two before the CAB and inviting ‘the powers that be’ to go over your proposed changes. The ‘socialization’ step is arguably the most important one because if questions come up in the CAB, or if just one person isn't comfortable, it almost guarantees it’ll be denied until you work it out. Because of that, I personally think this is absurd and loathe the process, but I obey.
- Finally on CAB day it should be a slam dunk beca