Ask Slashdot: IT Staff Handovers -- How To Take Over From an Outgoing Sys Admin?
Solar1ze writes "I've just started a role in an IT services firm. I'm required to take over from an incumbent who has been in the position for three years. What are some of the best practices for knowledge transfer you have used when you've taken over from another IT staff member? How do you digest the thousands of hosts, networks and associated software systems in a week, especially when some documentation exists, but much of it is still in the mind of the former worker?"
Hope to Christ he took good notes.
Ask to see the last year's quarterly reports that went to Department Supers with signing power over budgets.
Run like hell....
Did this recently. Started with core network topology documentation, moved on to DNS. Foundational stuff. Documenting subnets, figuring out what documentation and systems should be deprecated. Made lots of diagrams. Reviewed monitoring tools. Prioritized systems by importance to review for best practices. Got a network security audit to find holes. Bam.
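For the subnet-documentation step, a short script can sanity-check the list as you write it down. This is only a sketch: the subnet names and address ranges below are made up for illustration. It uses Python's standard ipaddress module to flag documented subnets that overlap and to show which documented subnets a given host falls in.

```python
import ipaddress
from itertools import combinations

# Hypothetical documented subnets; replace with what you actually find.
DOCUMENTED = {
    "office-lan": "10.0.0.0/24",
    "servers": "10.0.1.0/24",
    "legacy-lab": "10.0.1.128/25",  # deliberately overlaps "servers"
}

def find_overlaps(subnets):
    """Return sorted pairs of subnet names whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    )

def subnets_for_host(subnets, host):
    """Return the names of every documented subnet containing the host."""
    addr = ipaddress.ip_address(host)
    return sorted(
        name for name, cidr in subnets.items()
        if addr in ipaddress.ip_network(cidr)
    )

if __name__ == "__main__":
    print(find_overlaps(DOCUMENTED))
    print(subnets_for_host(DOCUMENTED, "10.0.1.200"))
```

An overlap in the output usually means either the documentation is stale or someone carved a subnet out of another one without recording it; both are worth asking the outgoing admin about.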
hope they have documents
Unfortunately, if he/she hasn't already done due-diligence documentation, there won't be time.
I suggest you spend 50% of your new job for the first 6 months documenting or searching for documents.
Then document as if you might be killed in a car accident on the way home from work and your manager has to take over.
One rule to rule them all: get the passwords!
Primarily, you'll want to build an honest rapport with the other person. Get inside their head a little and allow them to brag A LOT. Ask how they found the place and what they did to change it. You'll want to breeze through all of the high level and important documentation first so you'll have a baseline. Take as many notes as you can. Ask what websites/resources they use to make it easier to follow in their tracks. Explain your situation to them. It will humanize you in their mind and you might be able to engage their compassion for you. Perhaps they would be available to answer questions after they leave! Is there budget money for them to be used as a compensated resource? Hopefully they like the idea of helping others and putting some scratch in their pocket.
Bonne chance!
Steve
--- rapper/producer/bachelorette party stripper
On my current gig, I got one day...
Solving Unix problems since 1989...
Who the hell manages to become responsible for 1000s of systems and networks without being forced to document them as part of their job ?
BTW, is this a voluntary or involuntary job move by the outgoing person ? That's going to affect the quality of data you are going to receive.
They will be out the door. You can ask for the world, but really you'll get nothing. Get what you can and plan a path forward. I've done it three times; after six months or so it dies down. Start documenting things you find: port maps, cubes, etc. Create what you don't have. Switch your hours so you're working 1-2 hours per day when your users are not. During that time you will get more done than in the other 6-7 hours of the day. Good luck.
>> How do you digest the thousands of XXXs in a week?
Dude, step away from SlashDot RIGHT NOW.
I would start by writing your own manuals and have the outgoing person review them.
Some people die at 25 and aren't buried until 75. -Benjamin Franklin
Find out which systems or processes break the most and learn them; this is where most of the support will come from.
Learn the other systems later when needed.
Give up now - one week is useless. Just kick them to the curb and take over - about as good!
Take over right away. Don't let him do anything. Ask lots and lots of questions. Take notes.
1. Get da passwords. Verify them. :)
2. Support contracts.
3. "What are common problems"
4. "Can I get your email"
Make sure you have thick gloves and sanitize everything. Check for booby traps. Never push any buttons till you've traced the wires back to their origins.
http://bofh.ntk.net/BOFH/
Where did this "in a week" concept come from? You or management?
Depends on how big of a company this is too - are you in a team? - are you tier 2 with a helpdesk underneath?
Anyway, slow down and be realistic. Focus on the users of the systems (your customers) and not the systems themselves and you'll be fine.
2c.
Eat his brain. Just be careful of kuru.
If they don't have them documented... good luck.
1) Need passwords... immediately change them. Exiting person should have no further access except through you.
2) Require exiting person to produce network diagram. Make it their last duty if one doesn't exist.
3) Now starts the pain... audit devices and systems for rogue accounts.
4) document as you go.
5) turn in passwords to supervisor.
Good Luck
The only way to address an information void is to fill it with good information. Hopefully everything is standardized.
The incumbent will know what to teach you if you only have a week. If they are leaving on good terms, they probably won't be averse to having some questions asked by email after they leave, so it's not the end of the world after 7 days.
I've left jobs almost a decade ago and still get an occasional question once or twice a year. It's not that it wasn't documented or couldn't be solved through a few hours of investigation, it's just that a 2 minute email and a 2 minute response later you get your answer and move on. Much more efficient. Sooner or later, the questions stop, and they are self sufficient.
Keep them on as a consultant, and *pay* that $$$ per hour when you need to.
(This assumes they are quality folk in the first place, of course.)
http://rocknerd.co.uk
Secret Server (or something like it) is very cool. Get the outgoing person to put all the access passwords, locations, etc. for every bleeping system in it.
Then change the master password after he leaves.
If possible, build a relationship with the outgoing administrator. Accept that a lot of his head knowledge will dissipate soon after he leaves the company but it wouldn't hurt to have him as an information resource for the first few weeks, just don't abuse the privilege. If he's moving on to another job, his loyalties and focus will be to them not his old company.
Get his permission and comb through his corporate inbox and home directory -- dump it to an offline location. I know you don't need his permission but a little humble pie goes a long way. Again, don't abuse the privilege. Burn a little overtime and construct your own documentation from whatever you find. Let your new supervisor know you're going to bill overtime and why it is necessary. A little work now will make your holiday season a lot smoother.
I did this in my first LAN Admin job. Within four months I was able to take off to the Caribbean for a week and I never received one phone call.
Only the dead have seen the end of War. - Plato
Ask about the problem children and squeaky wheels (regarding servers); that will get you down to the one-off fixes that are held up by bits of string and expect scripts that rely on chaining ssh across 5 machines to touch a file that doesn't exist. Ask about the oldest equipment, and spend some extra time getting to know your world. Leverage the time you have with him: when he goes home for the day, scour the network and start looking at boxes and going over notes, and when you run into problems, write them down and ask him about those.
And you don't have a week, you have 4 days, because on the last day he may show up but you most likely won't get much out of him unless it's a quick Q and A. That, and keep your resume up to date, because you may want to use it again soon.
Good leaders run toward problems, bad leaders hide from them.
Now if the old guy worked for the IT services firm, then you should not have the outsourcing roundabout in the way.
But if they worked at the place that the firm took over, then paperwork and other things may get in the way: some things the old IT guy used to work on may not be part of your IT firm's systems, or of what you control, or of your pricing.
Or other stuff, like the non-managed box that users do not log in to but that runs, say, the phone switch or the door systems. If such boxes get put under the domain or your overall IT plan for users, they will fail, or the box will end up needing to be tied to a user, or it may get kicked off the network or out of the office.
NO ONE here suggested the most expedient and obvious solution - the Vulcan mind-meld!
Aside of that, ask LOTS of questions, take LOTS of notes. You only have 5 days.
The first thing is to figure out what the most mission-critical systems are, and cover them in order of priority; really try to press on the criticality of each system.
Top Priority: Systems where downtime has an immediate impact. There is NO workaround; it needs to run
High Priority: Systems where there is a workaround for downtime and they can tolerate being down for a few hours while you mess with it
Medium Priority: Systems that can be down for a Day
Low Priority: Systems that can be down longer than a day
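The four tiers above can be sketched as a small classifier that sorts systems into handover order. This is a rough illustration only: the system names are hypothetical, the 8-hour cutoff for "a few hours" is my assumption, and whether a workaround exists is folded into the tolerable-downtime estimate rather than modelled separately.

```python
from dataclasses import dataclass

# Assumed cutoff: the tiers only say "a few hours"; 8 is a placeholder.
FEW_HOURS = 8
TIER_ORDER = ["top", "high", "medium", "low"]

@dataclass
class System:
    name: str                    # hypothetical system name
    tolerable_downtime_h: float  # hours the business can live without it

def priority(system):
    """Map a system onto one of the four tiers by tolerable downtime."""
    hours = system.tolerable_downtime_h
    if hours <= 0:
        return "top"      # downtime has an immediate impact
    if hours < FEW_HOURS:
        return "high"     # can be down for a few hours
    if hours <= 24:
        return "medium"   # can be down for a day
    return "low"          # can be down longer than a day

def handover_order(systems):
    """Sort systems so the most critical ones are covered first."""
    return sorted(
        systems,
        key=lambda s: (TIER_ORDER.index(priority(s)), s.tolerable_downtime_h, s.name),
    )

if __name__ == "__main__":
    systems = [
        System("wiki", 72),
        System("phone-switch", 0),
        System("file-server", 4),
        System("build-box", 24),
    ]
    for s in handover_order(systems):
        print(priority(s), s.name)
```

The downtime numbers have to come from the outgoing admin and the users, not from guesswork; the script only keeps the resulting order consistent.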
Try to get the passwords, or make sure you have passwords and rights to all the systems; work in order of priority.
Create a network map, inventory every system, switch and router... Make sure you have access to them.
Find the Power Users in the area, they may be able to help you out later on, they may not know everything the sysadmin does but they know their little section and sometimes have tips and tricks that don't get passed on. If there is an issue after he leaves you have contacts.
Get the vendor support numbers if available.
Working in order of priority, find the custom stuff (programs, scripts, etc.). Do an overview of what they do, what language they are in, and what systems they affect...
On the second to last day, shadow the old admin, on the Last day do everything, he should only mentor.
After he leaves. CHANGE ALL THE PASSWORDS he knows, and check for back doors in the network to prevent him from entering the system.
Due to short time of transition you will probably stumble a bit, but you should have enough to hit the ground running.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Make sure the person leaving knows you'll buy him or her a beer or two when you have a question you can't figure out on your own in reasonable time.
Circle the wagons and fire inward. Entropy increases without bounds.
Step 1. Kidnap his wife/girlfriend
Step 2. Tell him the only way he'll see his Significant Other is if he gives you all the information and passwords you need.
Step 3. Torture his SO, and then call him up and let him listen to her screaming.
Step 4. Tell him that if he tries to call the cops you'll torture her to death.
Step 5. Wait for him to give you the information and passwords.
Step 6. Kill him and his SO. Don't leave any evidence. Don't keep any trophies. Drop the bodies off in a predominantly African-American ghetto to reduce the likelihood of a serious investigation.
Step 7. ???
Step 8. Profit.
fastest way to learn
If there is nothing else, you can Dig your network with some Autodiscovery software, Whatsup Gold for example will do the trick.
It can use Layer 2 and Layer 3 tools to map your entire network, and you can probably see or reset SNMP passwords on your devices, which helps a lot.
Your predecessor should hand you two letters.
Open the first at the first crisis, it should read something like: "Blame it on your predecessor"
At the second crisis, you should open the second letter, which should read "Write two letters..."
And just give me your paycheck that you would have collected, and we will pretend you didn't ask this question.
Obviously your new boss is an idiot.
I was thinking the same thing as soon as I read the article summary. If the incoming system administrator has to ask the question either the organization has no written policies or the manager is a fool for hiring the new employee. The simplest answer to the person's question...ask the current system administrator whom you are replacing.
It is the only way to be sure.
"To those who are overly cautious, everything is impossible. "
Every conversation you have with the outgoing admin, record (with permission, of course). When they're showing you something on a workstation, screen capture it. Write notes up for all of this first, while it's still fresh. Have them walk you through each server, each device, and all the issues with it. You won't remember everything they've said, and they aren't going to do as great a job documenting things as you'd like during their last week, as their head's already out the door.
Mod point free since 2001
I may be reading too much into the OP's statement but I get the impression that the sysadmin may be the walking repository of knowledge for IT in that organisation.
That is A Bad Thing.
Why? Because everything that sysadmin knows should exist in a system accessible to the successor. Infrastructure should be documented. Business value and use should be described. Change records since implementation should be available. The successor should be able to recreate and describe the current state (or at least a recent last known good) from the body of knowledge.
It's understandable that some organisations don't go the full ITIL, but some kind of record keeping would be expected. That means that any handover could be limited to training the successor in the systems and processes unique to the role and a walk through of a transition document outlining what is where.
If such a body of knowledge is not readily available the successor should make it a first priority and an agreed short-term objective to baseline the systems while detailing to their line manager why this is required and the risks arising from effectively building a picture of the business' IT from scratch.
One good way to learn the new environment is to test any existing disaster recovery procedures, you will find out quickly what's important and things that don't work.
Get up!
1. Know the BAU issues, and how to fix those.
2. Know the architecture and at least 4 day-to-day work.
3. Get the passwords and check them.
4. Get the names of the highest-priority systems and understand the pain areas.
When I quit my last job, I was there for 5 weeks after saying, "I'm gonna go ahead and leave." I said I could stay as long as they needed to have a smooth transition but that was clearly a mistake. 2 weeks in, absolutely nothing had been done to transfer my tasks so I set a firm date for 3 weeks later. Had all my tasks documented but no direction on who would take over. Another week goes by. "Who is taking over these tasks?" And another. "Who is taking over these tasks? I would really feel more comfortable walking them through their first week." [cricket_chirps] A couple days before I left, I emailed my documentation to the remaining department employees with one last reminder that, even if everything else is ignored, backups and archiving are very important and require daily attention. I assume they figured things out without catastrophic failures because they're still around.
I don't belong in a sysadmin role because I can't use available software tools to document the system setups myself. Help me, slashdot! Do my job!
A lot of the posts assume you're replacing the only sys admin there is, but if you're joining a team and only going to be responsible for part of it then there are different questions to ask.
I have been doing this for the last 18 months, since our sys admin was terminated. Write stuff down. Find a secure place (or two) on the network to store an Excel spreadsheet with IP addresses, DNS names, and credentials for servers, databases, routers, printers. Encryption keys, vendor support websites. Save root, administrator, and sys passwords, and any other administrivia, in some sort of order you can decipher in 3 months at midnight. I use worksheets to identify categories of information. It's probably more secure to not keep this stuff all in one spreadsheet, but the fact is the document becomes a corporate asset. You can be the keeper of it, and the central answer person--lots of parties need that kind of information. Back it up, encrypt it, whatever. Where I work, only the CIO, two database admins, and the network admin have read permissions on it. Do not print it out, or carry it on a USB stick that can be misplaced. It's an admirable gesture, but probably masochistic to try and store this information in a secure database, because that may run on the server that goes down at midnight when you most need that list. Plus it's freeform-- we keep different columns of data for OS's, servers, cert keys, routers, databases, etc. It's also nice to have it handy and organized, so you can paste it into vendor inquiries. Saves money and consternation next time you don't have to look up the info ad hoc. It's easy enough to find out the MySQL version, but when there are 10+ servers, you will be glad you've got it in one spreadsheet.
Save model numbers, sales staff information, customer contacts, warranty information, service contracts. Also record server software versions. It's easy to remember if you just bought it, but in two years, you will be glad you know It's Oracle 10.1.0.5 and not just 10g. All the big IT suppliers-- Oracle, Microsoft, HP, Dell, NetApp, SAP-- have their own twisted bureaucracies, ticket tracking systems, incident reporting and escalation, and lines of communication. Put as much of that info in the spreadsheet as you can. You can even embed links to support sites in Excel.
Try and figure out which servers talk to each other, which have dependencies and would be affected by an issue with another server. It's good to learn the network topology-- which equipment and services are in which segment and why. Where does the internet come in? Try not to work too late. Don't carry a gun to work. Be nice to the users. That's about all I've got.
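One way to keep track of which servers talk to each other is a plain dependency map that you can query when something breaks. A minimal sketch, assuming made-up server names and a map you fill in by hand as you learn the environment:

```python
from collections import deque

# Hypothetical map: each server lists the servers it depends on.
DEPENDS_ON = {
    "web01": ["app01"],
    "app01": ["db01", "ldap01"],
    "reports01": ["db01"],
    "db01": ["san01"],
    "ldap01": [],
    "san01": [],
}

def affected_by(depends_on, failed):
    """Return every server that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a server to its dependents.
    dependents = {}
    for server, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(server)

    # Breadth-first walk; `seen` also guards against dependency cycles.
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for server in dependents.get(current, ()):
            if server not in seen:
                seen.add(server)
                queue.append(server)
    return sorted(seen - {failed})

if __name__ == "__main__":
    print(affected_by(DEPENDS_ON, "san01"))
```

Asking "what else goes down if this box dies?" for each server is also a decent way to structure questions for the outgoing admin.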
Everything I've ever learned the hard way was based on a statistically invalid sample.
Seriously, there's no hope you'll actually be able to cram everything you need to know into your brain and make it stick. You need runbooks.
Here's a Technet Article on how to put together a Windows server run book. You'll also be able to google for Linux or Unix examples, although you'll find mostly snippets focused on how to write a runbook section for one specific product or another.
A high-level runbook should document overall systems architecture: network layering, external and important internal connections, service agreements, contacts, roles and responsibilities. The per-system runbooks should focus on configuration details and functional description (why the server is in the architecture). Per-service runbooks crosscut servers and describe how a particular service is deployed, started, stopped, upgraded, etc.
It's a lot. If you don't already have a lot of this, start now. If you do, get it current and updated now.
Welcome to the Panopticon. Used to be a prison, now it's your home.
There is a Linux tool called netmap (package netmapr on RHEL systems, in the rpmforge repository). It will help you build a graph of the hosts and connections on the network. That's a good place to start. Also, look at the configuration of the local hosts in your local DNS server (assuming you have one, right?).
Look over the available documentation and talk with the guy if he is available. A lot of times I ended up reverse-engineering everything, though, so be ready for that possibility.
Everything I write is lies, read between the lines.
I had to do this after a company ran for 6 months without an admin and with no documentation whatsoever left for others. From when the sysadmin left to when I arrived, most things went to rack and ruin with only some developers doing a little sysadmin on the side. In the end it took nearly the same amount of time as the time that the company was without an admin to find out the information or rebuild what couldn't be salvaged.
I have to say that it gave me a lasting impression of what a company can lose when handover fails.
Luckily, I've also had some good handovers and the best way I've found to do it would be to book the whole week as a workshop. Nothing the outgoing staff member can do is more important than the handover and it can often create a level of goodwill in that you are asking for their assistance and making them realise how important they have been.
However, there are also some rules to the week. Some apply to you - you need to check what you are being told. Anything that starts to look hinky or just plain wrong needs to be constructively challenged. If it still doesn't add up after a challenge and you know they are lying then you need to get them on garden leave as soon as you have the keys and passwords.
My approach for general sysadmin has been to try to understand the systems from the ground up very quickly and I've found it useful to have the following as general headings:
1) Passwords - where they are, how they are kept, what policies are in place? Generally find out how it has been managed in the past. Most important - verify them.
2) Network diagrams - use network scanning and mapping tools to verify what you are finding
3) Infrastructure services - understand the setup for anything important to the infrastructure of the systems. Things like DNS/DHCP/NIS/Kerberos/Pam/LDAP/AD/Certificate Authorities/Identity Management/etc/etc.
4) Storage services - SANs/Makes/models/Where to find support contracts/BACKUPs/Data replications/File stores/etc.
5) Core end user services - File/Print/Core Databases/Core Apps.
6) Cloud services/domain registration accounts/3rd party supplier access
There will always be more to find out but hopefully having a list of what you need can stop your company wasting a lot of time and money in having to rebuild what it can't support.
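For the "verify what you are finding" part of the network diagram heading, the comparison itself is simple once you have both lists. A minimal sketch, assuming you have already exported the documented hosts and the scan results as IP-to-hostname mappings (the addresses and names below are invented, and how you produce the scan output depends on whatever mapping tool you use):

```python
import ipaddress

def compare_inventory(documented, discovered):
    """Compare documented hosts with hosts seen on the network.

    Both arguments map an IP address string to a hostname.
    """
    doc_ips, seen_ips = set(documented), set(discovered)

    def by_ip(ips):
        # Sort numerically by address rather than as plain strings.
        return sorted(ips, key=ipaddress.ip_address)

    return {
        # On the wire but not in the docs: rogue or forgotten boxes.
        "undocumented": by_ip(seen_ips - doc_ips),
        # In the docs but not answering: decommissioned or simply down.
        "missing": by_ip(doc_ips - seen_ips),
        # Same address, different name: the docs are stale.
        "renamed": by_ip(
            ip for ip in doc_ips & seen_ips if documented[ip] != discovered[ip]
        ),
    }

if __name__ == "__main__":
    documented = {"10.0.0.10": "dns01", "10.0.0.20": "files01", "10.0.0.30": "old-print"}
    discovered = {"10.0.0.10": "dns01", "10.0.0.20": "nas01", "10.0.0.99": "unknown"}
    print(compare_inventory(documented, discovered))
```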
What if you never see the guy like I did, oh, the last three positions I've had? Interviewed, got the job, never saw the guy but inherited his mess. The hiring manager has his contact info but depending on the way he left and the culture of the organization it may be impossible to get much out of the departed. It's been months of finding lovely little "projects" that were turned into production systems. Ugh. I am now gutting the place little by little and instituting proper processes and procedures, as well as detailed documentation. Sysadmins aren't the best at this in most cases. There are usually gaping holes in the docs...sometimes miles wide! Grin and bear it until you can get things righted.
Win his trust, become his friend. (However annoying or stupid or malevolent he is.)
That's the basis, without it you will not get any sensible information, let alone light in the dark corners.
(working for an IT Support firm)
Most of the above posts pretty much have it nailed... The point made about WHY things were done is very astute - not just technically but politically.
I would like to add one thing though - do your very best not to slag off the departing sysadmin or the environment – it is a sad fact that IT people get a bad rep and many of them do not deserve it (See the point about WHY things were done) – often this ‘bad mouthing’ starts from the client – but it should not readily be agreed with, unless there are very obvious and serious failings!
Try to think about your profession and the industry as well as your current role and DOCUMENT! & COMMUNICATE!
This is the service you are providing. Get to work.
Most of the stuff was 'glued together' with Perl. So one of the job requirements for my replacement was an understanding of Perl. So I sit down with him on a simple task and have him look over my shoulder as I patched one of the Perl CGI scripts. After about 5 minutes of his silence, he asked, "What language is this written in?" And there we were, staring at the "#! /bin/perl" line at the top of the script. Things were not going to go well.
As my end date approached, I gave him a copy of a configuration file that managed sending out e-mail/paging on error conditions and suggested that upon my departure, he put his own e-mail address and pager number in there. One of the addresses was my home e-mail account. For the next three years, I continued to get, "The server is up/The server is down" messages.
Some time later, the company outsourced the whole system to an offshore company (Strange, I thought, for a DoD contractor). They found my name in the headers of all the files I had revised (it's a rather unique name, easy to Google) and hired me as a well paid consultant to assist them in maintaining the system.
Have gnu, will travel.
First step is plan a maintenance window and shutdown everything and back it up, preferably before they depart and bring it back online. Video tape if possible the procedure the outgoing SysAdmin performs. Assume you will never ever have contact with them beyond this encounter.
Day 1 - get the passwords and test them all.
Day 2-5 - off site knowledge handover at the pub. Buy the cnut 1 million beers and listen to all his bitching. You'll learn *everything* from the bitching.
Switch your hours so you're working 1-2 hours per day when your users are not. During that time you will get more done than in the other 6-7 hours of the day.
I do exactly that. It's the best productivity advice ever. I also try to work weekends instead of weekdays some of the time just to get better throughput of work.
The short answer is: you don't.
You fake it until you make it, or can corner the guy in a dark alley and beat it out of him.
In all seriousness, I've been there probably more than most. I've worked most of my career for managed service providers (of varying quality) where there is no environmental documentation to speak of, in most cases, and almost invariably things are a complete goddamn mess because they're going from in-house to outsourced for a reason. More often than not, you're expected to replace 3-5 times as many people as you are - all while handling many other customers at the same time, in similar scenarios.
I was blessed several years ago by replacing 6 in-house IT staff in an academic environment by myself; I took over their full role, to the exception of a guy who came to do desktop stuff several times a week. Almost exclusively Linux systems, many with fairly exceptional setups, and lots of stupid interdependency: the previous 'administrators' were more developers than they were admins (and yes, that is an insult for a competent admin).
There were fewer than 80 physical servers and a couple hundred additional workstations (Windows/Mac/Linux). Half a dozen different distros, no virtualization. Half a dozen subnets, most carved up poorly with little forethought (or brought over from 'very legacy' environments and not migrated using best practices). Lots of one-off solutions were difficult to support and had few tenable options for migrating away from - situations where you just pray that hardware gets good enough, fast enough, to make what you need to do tenable to the users.
It took me a full year to get everything to the point where I was comfortable with the environment enough to know what was happening without needing the monitoring systems I'd put in place to tell me. This seems about right in my experience, at least for me. It's taken roughly a year, consistently, to get comfortable with the full scope of tasks and requirements of an environment from the users up through the ranks.
Until then, you fake it until you make it.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
Prepare three envelopes
... today wasn't your first day at Bluehost.
which is under a suitable jurisdiction. Then waterboard them until they have told you everything. If somebody asks: terrorists attacked your network.
I've done both, a couple times. Two weeks notice is always a bad idea, and a generous leave notice is always favorable to future employers. It's a pretty good sign that the dude isn't too worried about HIS situation or theirs.
1) I left with two weeks notice, taking an international job, so there wasn't much wiggle room. It sucked, and I kept good rapport to this day assisting them as much as I could by building documentation and working with my team for those two weeks and random calls thereafter. The team replaced me.
2) Left the international job, gave them 7 months. Took 5 to find the new guy. Politics and teaching him where to find information were the killers. He shadowed me for a few weeks, then we started splitting assignments, and then I took a mutually accepted garden leave for the last two weeks handling only critical tasks that shouldn't carry forward. Rapport is strong with these companies still as well.
3) Started the new job. Hired on Friday, started Monday, didn't even have a place to live. This one was a mess and I HAD to start on Monday because the bugger was GTFO. No documentation or best practices. Went through every server, what OS it was running, what applications, and the access. Started to canvas the network, only to realize one week wasn't enough. Focused on access and design realizing quickly it was a big ball of duct tape with VLANs and different OSPF styles.
When it really came down to it, I gauged my skillset, supplemented my tools and information with whatever could carry the biggest impact, and prepared myself to crash and burn. Now, less than a year later, I have started rolling out best practices based on my documentation and there's been a stark change as the enhancements and simplification have begun winning the battle. But it was a big uphill battle. Your armor will be more important than your weapons for a while, but every chance you get to use your weapons to correct something for win-win, do it and don't look back.
I actually was in this position back in 1999. I was just hired at this midsized company... The Network Admin was stepping down to a programmer position... He went on a 2 week vacation a week after I was hired then turned in his 2 weeks notice on the first day of his vacation. It was tons of fun. The way I did it? Changed all the passwords and went thru everything and charted everything out, then started making changes just in case. Good luck with your experience with this. I know it sucks.
Learn to juggle with chainsaws... NOW!
I'm in the middle of such mayhem right now: Old "team leader" left two weeks ago, new "director it operations" is still two weeks out, and the team was cut down from 4 sysadmins to... just me.
Learnings:
* Get access to his business mail account. Most of the information you need is buried in there and many obscure mails are still delivered to it.
* Get each and every account/password he knows of before he leaves, use administrative punishment if need be.
* Get someone from higher up the ladder to sort/prioritize the incoming tickets.
* Relax! Thousands have gone this path before you, and they all lived to tell their story.
The very best advice I have ever found:
(I forget who wrote this but I am posting it knowing that this advice will certainly help you!)
The EVIL Lecture
It's really, really, really hard. It requires a very complete audit. If you're very sure the old person left something behind that'll go boom, or require their re-hire because they're the only one who can put a fire out, then it's time to assume you've been rooted by a hostile party. Treat it like a group of hackers came in and stole stuff, and you have to clean up after their mess. Because that's what it is.
Audit every account on every system to ensure it is associated with a specific entity.
Accounts that seem associated to systems but no one can account for are to be mistrusted.
Accounts that aren't associated with anything need to be purged (this needs to be done anyway, but it is especially important in this case)
Change any and all passwords they might conceivably have come into contact with.
This can be a real problem for utility accounts as those passwords tend to get hard-coded into things.
If they were a helpdesk type responding to end-user calls, assume they have the password of anyone they worked with.
If they had Enterprise Admin or Domain Admin to Active Directory, assume they grabbed a copy of the password hashes before they left.
If they had root access to any *nix boxes assume they walked off with the password hashes. Also reset any public-key SSH keys that may be in use for root-login SSH (don't do that at all, but if you have it, clear 'em).
If they had access to any telecom gear, change any router/switch/gateway/PBX passwords. This can be a really royal pain.
Fully audit your perimeter security arrangements.
Ensure all firewall holes trace to known authorized devices and ports
Ensure all remote access methods (VPN, SSH, BlackBerry, ActiveSync, Citrix, SMTP, IMAP, WebMail, whatever) have no extra authentication tacked on, and fully vet them for unauthorized access methods.
Ensure remote WAN links trace to fully employed people, and verify it. Especially wireless connections. You don't want them walking off with a company paid cell-modem or smart-phone. Contact all such users to ensure they have the right device.
Fully audit internal privileged-access arrangements. These are things like SSH/VNC/RDP access to servers that general users don't have, or any access to sensitive systems like payroll.
Start hunting for logic bombs.
Check all automation (task schedulers, cron jobs, or anything that runs on a schedule) for signs of evil. By "All" I mean all. Check every single crontab. Check every single Windows Task Scheduler. Even workstations.
Validate key system binaries on every server to ensure they are what they should be. This is tricky.
Start hunting for rootkits. By definition they're hard to find, but there are scanners for this.
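For the "check every single crontab" step above, a small sketch that scans crontab text for entries a human should read closely. The patterns are assumptions I picked for illustration (piping a download into a shell, opening a listener, decoding a base64 payload); a clean result proves nothing, it only sorts the pile.

```python
import re

# Illustrative assumptions: things that often deserve a human look in a
# scheduled job. A match is a prompt to investigate, not proof of evil.
SUSPICIOUS = {
    "downloads and runs code": re.compile(r"\b(curl|wget)\b.*\|\s*(ba)?sh\b"),
    "opens a network listener": re.compile(r"\b(nc|ncat|netcat)\b.*\s-\w*l"),
    "decodes hidden payload": re.compile(r"\bbase64\s+(-d|--decode)\b"),
}


def review_crontab(text, source="crontab"):
    """Return (source, line number, reason, line) for each flagged entry."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments; everything else is a job or a variable.
        if not stripped or stripped.startswith("#"):
            continue
        for reason, pattern in SUSPICIOUS.items():
            if pattern.search(stripped):
                findings.append((source, lineno, reason, stripped))
    return findings
```

Feed it the contents of each user crontab and each file under the system cron directories, passing the path as source so the findings say where they came from. Windows scheduled tasks need a separate export.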
Not easy in the least. Justifying the expense of all of that can be really hard without definite proof that the now-ex admin was in fact evil. The entirety of the above may not even be doable with company assets, which will require hiring security consultants to do some of this work.
If actual evil is detected, especially if the evil is in some kind of software, trained security professionals are the best to determine the breadth of the problem. This is also the point when a criminal case can start being built, and you really want people who are trained in handling evidence to be doing this analysis.
But, really, how far do you have to go? For routine admin departures where expectation of evil is very slight, the full circus is probably not required; changing admin-level passwords and re-keying any external-facing SSH hosts is probably sufficient. Again, corporate security posture determines this.
For admins who were terminated for cause, or evil cropped up after their otherwise normal departure, the circus becomes more needed. The worst-case scenario is a paranoid BOFH-type who has been notified that their position will be mad
Sodium Pentothal
Hello,
Stay in touch with him, try to get all his knowledge that you can. Good luck!
My experience taking over was about the worst-case scenario you can have. I was hired to replace the lead System Administrator at a newspaper after he was fired for a number of reasons (e.g. he told reporters to fuck off when they needed his help). He was a true BOFH. He was gone by the time I started, and everything was in chaos. There was no documentation for anything. The expensive robotic tape backup unit that the IT Director thought was being used to do backups was actually in the original box in the corner of the guy's office. The newspaper didn't have any backups! It also turned out that he had been running his own side business on company time. The thing that helped me understand everything the most was doing an audit/inventory of every server, computer, router, etc. in the building. If a sysadmin leaves on bad terms your first priority has to be securing and updating everything. For me that meant forcing password resets, making sure every account in LDAP was needed by someone still working there, making sure there were no local accounts on any computers, updating virus signatures and running virus scans, making sure the latest software patches were applied, etc. I found a few computers with modems on them, so I removed those. I enabled remote desktop on computers so I could help people from my office. An audit/inventory of everything helped me meet a lot of people too. There were many hostile users... and they treated me like shit at first because of their experience with the previous guy. You have to have thick skin and just kill 'em with kindness. At a newspaper there's a period of time when everyone is rushing to make their deadlines to get the newspaper out the next day, so I blocked out an hour of each day during the crunch to spend in the newsroom. If anyone had any problems they didn't have to call me or try to find me; I was right there to help fix it. If there weren't any problems I'd just work from one of the empty desks.
It garnered a lot of good will, which comes in very handy to any sysadmin.
Documentation is good, but rare.
Get all the passwords, and take the guy out for a beer.
Spend a year documenting the network with the guy present; it's really the only way to do it. Anything else and you are basically screwed: get all the login codes, pray he doesn't forget any, and start firing up network scanners like nmap, logging in and documenting everything.
1099 and a prayer.
How do you digest the thousands of hosts, networks and associated software systems in a week, especially when some documentation exists, but much of it is still in the mind of the former worker?
My recommendation is you demand that your boss get the old guy on retainer to be available by phone and e-mail for at least 30 days.
You really need 14 days worth of acclimation.
Get as much as you can out of him, bring a notepad and pencil, and a voice recording device.
I recommend you start with him getting you the keys to whatever he considers to be all the most critical systems; usernames, passwords, IP addresses. What the systems do, what software they run, the basics of how they're configured.
Find out what all the software vendor contract numbers, license keys, and software installation media are; where those are being kept; config file locations; etc.
Get through all those. Next, make sure you get all the keys to the firewall, routers, and switches.
And miscellaneous hosts.
Run nmap scans across the assigned IP subnets to make sure you didn't miss any devices. Start building a spreadsheet of devices if he didn't have one, or complete it if things are missing.
Make sure that some time during the week, the two of you go through a physical audit together of the "server room" or "server closets", and wherever else there is IT infrastructure.
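As a starting point for that spreadsheet, here is a sketch that turns nmap's grepable output (the file written by something like `nmap -oG scan.gnmap 192.168.1.0/24`) into CSV rows. The subnet, the file name, and the extra "owner" and "notes" columns are placeholders I made up; fill those columns in during the walkthrough with the outgoing admin.

```python
import csv
import io


def parse_grepable(text):
    """Map each IP to {'hostname': str, 'ports': [(port, proto, service)]}."""
    hosts = {}
    for line in text.splitlines():
        if not line.startswith("Host:"):
            continue  # skip the "# Nmap ..." comment lines
        fields = line.split("\t")
        # First field looks like: "Host: 192.168.1.1 (gw.example.lan)"
        _, ip, name = fields[0].split(" ", 2)
        entry = hosts.setdefault(ip, {"hostname": name.strip("()"), "ports": []})
        for field in fields[1:]:
            if not field.startswith("Ports:"):
                continue
            # Each port entry: port/state/protocol/owner/service/rpc/version/
            for item in field[len("Ports:"):].split(","):
                parts = item.strip().split("/")
                if len(parts) >= 5 and parts[1] == "open":
                    entry["ports"].append((int(parts[0]), parts[2], parts[4]))
    return hosts


def to_csv(hosts):
    """Render the inventory as CSV text, one row per host, in scan order."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ip", "hostname", "open_ports", "owner", "notes"])
    for ip, info in hosts.items():
        ports = " ".join(f"{p}/{proto}({svc})" for p, proto, svc in info["ports"])
        writer.writerow([ip, info["hostname"], ports, "", ""])
    return out.getvalue()
```

Rerunning the scan later and comparing the CSVs is a cheap way to spot devices that appear or vanish.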
The physical audit should include a verification that you didn't miss getting keys or IP address/config details to any equipment.
You need to know where all the equipment is, circuit breakers, cable management, etc
If this is a large site, you might need more than 7 days. Anything you can't get during the time before the guy leaves, YOU get to figure out; probably at the worst possible time, because people are screaming at you about an issue caused by the sudden non-operation of a device you don't know exists.
On the bright side, you'll improve your forensic skills.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
Don't worry, I think they have good cell phone reception in Sheremetyevo. Just give him a call.
you know the rest.
Ask for an inventory with parts-count list, including a network diagram, with services locations visible; the complexity here is often to understand how the IP level (Layer 3) configuration relates to Layer 2 and Layer 1, plus what interdependencies there are among services placed on systems (high availability features etc).
That kickstarts the technical handover, but you need the organizational one, too. The Australians seem to get it right, carry on reading.
ps. :)
Yes, everything, yes: full shutdown, up to the electrical supply. You can't cheat that a lot; you'll miss info, but only secondary info. That will be a good base for stellar (reverse) engineering.
Here is my favorite hard-to-bend systems hand-over test: shut down and restart everything. If you can
First, congratulations: you've hit the proverbial jackpot if you don't have any kind of documentation. A wonderful number of hours is guaranteed to come your way. Second, assume that any docs that are available are untrustworthy to some degree. Third, with how things are going in the world, assume everything has been compromised. What this means is that everything is headed for a rework, though most stakeholders in a place that has been running for years will definitely not be excited about any prospect of change at all. The disgruntled sysadmin scenario is really a massive pain, so number one on the list is to find out whether there are processes in place that deal with user accounts -- proper aging, disabling, change management, etc. If there isn't a policy in place that deals with this specifically, and the folks haven't developed one yet, slowly try to gain visibility into this attack vector. Depending on whether you can breathe down the necks of the folks who use the system until they give you a list of accounts you can purge, this is going to bug you no end. So you'll be hit with the question of whether you should fit in by following the lax user addition process, or try to beat in a new account usage policy. Those stupid exploits would have a smaller attack surface if folks couldn't log in and fuzz the crap out of the app.
It's amazing what holes in the docs show up when things really go wrong. You can pre-empt this by acting as if things have gone wrong and trying to find a solution. This should remind the outgoing guy of what is important to pass on.
Nmap kills some things like HP JetDirect external boxes, crashes some Samsung office phones, and has other consequences, so beware of using it on an unknown network during working hours. It's not that there's anything wrong with nmap; it's just that some network hardware is a fragile piece of shit that will fall over (or, in the case of that HP stuff, break) if you hit some ports with packets.
Not an admin but this is what I would consider doing with management backing.
Install a pair of SSH relay servers with full logging of everything going in/out of it to a write only filesystem. Configure production boxes to only accept connections via this relay server.
There are good reasons to have a full audit of everything admins do; and suddenly you know absolutely everything the admins are doing today.
If a ticket is closed and the process isn't documented, give your technical writer the log snippet so they can document the process.
Either you've got 2 weeks worth of work fully documented or evidence that the previous guy wasn't working.
Rod Taylor
I did that once and it had the opposite effect since upper level management assumed they would have a lot of time for the changeover. My replacement arrived some time after I left the state and my assistant with almost no experience was left swimming in deep shit for a week or two. That was after giving a couple of months notice and then at the last minute agreeing to delay my departure for another month. There's a middle ground somewhere that probably works.
Sorry, the company has a "no assholes" policy.
Pay the departing sysadmin for their time, by any legal means, to provide additional information. I've had to work with companies where a core admin had just departed, and had to help hide that we then hired one such admin as part of our company with a different title in another group, partly so we could tap them legally for information about their old company's environments. We got a good engineer, they got a good contract to help out while they looked for a permanent role, and were able to factor in undocumented aspects of the old company's security practices and backup systems which they were flat-out lying about.
Find out why that admin is leaving, without their manager in the room or any witnesses. Don't take "no" or "we'll get that to you" as an answer: go behind the company's back if you have to, because if they're hiding it, it's probably _vital_ to know about.
Do a complete hardware inventory, both of material they're directly responsible for and of devices _connected_ to those. Include the names of the people responsible for services, and who need to be contacted for issues, for every single system.
Verify that the backups are complete and that they do in fact work. This is a very good time to get that backup server, or that failover switch, that has been waiting for the right time to be installed, and ideally to perform the restorations onto those.
Warn the managers that there are likely to be service interruptions, and ensure that the monitoring system works well to report them.
Do not change the default scripting language or configuration management system or source control system or account management tools until you have had the opportunity to learn at least 80% of the old one.
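On verifying that the backups "do in fact work": one way to check a test restore is to hash every file in the original tree and in the restored copy and compare. This is only a sketch under the assumption that both trees are mounted on the machine running it; the function names are mine, and it says nothing about databases or open files that need an application-level restore test.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_restore(original, restored):
    """Report files missing from, unexpected in, or changed by the restore."""
    src, dst = tree_digests(original), tree_digests(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "unexpected": sorted(set(dst) - set(src)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Files that changed on the live system after the backup ran will show up as "changed", so run it against a quiet directory or expect some noise.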
You'll want to keep the old sysadmin's email address alive and forwarded to you or someplace you can see.
In a year or two, some server that you don't know about will have an SSL cert from someone like Verizon expire. About 30 days earlier, they will have sent an email to two people in your organization who are no longer there. Also, some vendor support contract you now need will have expired because you didn't get the renewal reminder email.
Naturally, in a sensible organization, these things would not go to a corporate email address assigned to just one person, but it happens, and you have no way of knowing if it's been done until after something has gone away.
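One hedge against the expired-cert surprise is to keep your own list of certificate expiry dates instead of relying on reminder emails. A minimal sketch, assuming you have collected each certificate's notAfter string (the format Python's ssl module reports for a peer certificate); the inventory dict and the 60-day threshold are made up for illustration.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days left before a notAfter date like 'Jun  1 12:00:00 2030 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    now_ts = (now or datetime.now(timezone.utc)).timestamp()
    return int((expires - now_ts) // 86400)


def expiring_soon(inventory, warn_days=60, now=None):
    """Return sorted (days_left, host) pairs at or under the warning threshold.

    inventory is a hypothetical dict of hostname -> notAfter string.
    """
    results = []
    for host, not_after in inventory.items():
        days_left = days_until_expiry(not_after, now)
        if days_left <= warn_days:
            results.append((days_left, host))
    return sorted(results)
```

For hosts you can reach, the notAfter value can be read from the dict returned by getpeercert() on a verified ssl socket. The servers you don't know about are still the hard part, which is where the inventory scan comes in.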
Once dead drag the body to a 'restroom' stall with a shower.
Dismember the body, be careful to drain blood through the stall drain.
Using a small Poulin Chain Saw, further dismember the body parts into small 'fist-size' pieces.
Be careful: brain, heart, pancreas and intestines are special cases.
Gather the pieces into a Hefty Trash bag and render them to a room with a blender.
Toss into the blender a 'fist-sized' piece, add some water and a handful of salt.
Why salt? Salt in the blender will aid dismemberment of the tissues.
Render the 'remains' from the blender to a sink or toilet, whichever is more convenient.
Flush.
Repeat process.
That was the 'hard part': congratulations.
The easy part is now figuring out how the prick ass wipe SysAdmin bungie-trapped the OS and filesystems.
Give it about 48 hours to secure the system, rebooting parts as needed.
Done.
First thing to do is check that backups are being taken, then set up something to automatically go round all the servers and gather as much information on them as possible. Then do what the others say, but if the previous incumbent has not done this, then there isn't much hope. Enk.
I work on cruise ships as an IT Officer and we have 6-month contracts. At the end someone comes in and takes over. Sometimes we have time for a handover, sometimes we say hi and bye at the gangway. No matter what the scenario, we complete our handover notes. There is no other way to do it than to have EVERYTHING documented.
...should be the ultimate goal. Understand the design, get it under basic control and then work with a team (largest you can muster) of diligent specialists to design replacement systems that are firewalled off from the original. The reasons for this are twofold:
1) No matter how well documented, well designed, etc. the system is, your knowledge of it will never be perfectly complete and you'll never be able to turn around changes with the same degree of confidence and alacrity as the original admin.
2) Your Career -- If you bend over backwards to make the system work perfectly the original designer will still get most of the credit. If you try your best but the system falls short of expectations, you will take the blame as the new "owner" of the system. It's a lose-lose proposition. Building something new, something that you can demonstrate is supported by more than just one person (unlike the original) will be a feather in your cap.
Book a flight to North Korea, or at least Russia.
I have taken over systems in both scenarios, friendly and very unfriendly. I agree with another poster that the ultimate goal is to reimplement most of the systems yourself. My last takeover was downright hostile. I had to do an audit of the systems and, for instance, although I had a list of passwords, many were swapped, and I had to find passwords for MySQL servers in logs or in scripts. I also found a couple of *very obvious* backdoors. The first thing I did when taking over, after documenting all system passwords and running services, was to create a control server with SSH keys, and a central syslog server. The second was disabling all root passwords and allowing access only via sudo, so that all accesses by the team are documented. The third was to create SSH RSA logins. The following step was to deactivate all the unnecessary services, like X or file sharing daemons on machines not sharing drives. After a year, I had already reengineered 80%-90% of the services, as they were rather old and unsupported implementations. The documentation/automation phase proved invaluable for being able to answer ongoing requests. The middle of a crisis is not when you want to find out that you don't have a password to a system, or to work out how it works. Nowadays we monitor most of the services in Nagios, with extensive scripting to adapt it to our environment, have service recovery on most of our servers, and also have a page that does automatic audits much more complete than the original audit, minus the passwords (for obvious reasons). We have also defined responsibilities more clearly in the (new) team (Linux admin, Windows admin, etc.) and are starting to invest in internal training. For starters, we ask for volunteers to talk to the other members of the team, in an hour-long format, about a technology they are most comfortable with.
Get whatever you can from the person, but just assume that even if he/she tells you EVERYTHING that you won't retain it. Bottom line, regardless of how well the outgoing admin preps you, nothing will prepare you for the first thing that goes wrong. Just suck it up, struggle through it and learn the system as best you can. It will be painful for you and for your users for a few months, but after that you'll have the hang of it. Just don't give up, and don't blame your predecessor. In time, you'll realize that some of those decisions were wise and some were dumb. There is no such thing as a "smooth transition". Your attitude and your willingness to dig will be what makes the difference, not the "documentation" your predecessor leaves behind. Make the system yours. Treat it as yours and you will be successful. The longer it's "someone else's fault", the longer your users will be in pain and curse your name.