Ask Slashdot: IT Staff Handovers -- How To Take Over From an Outgoing Sys Admin?
Solar1ze writes "I've just started a role in an IT services firm. I'm required to take over from an incumbent who has been in the position for three years. What are some of the best practices for knowledge transfer you have used when you've taken over from another IT staff member? How do you digest the thousands of hosts, networks and associated software systems in a week, especially when some documentation exists, but much of it is still in the mind of the former worker?"
Hope to Christ he took good notes.
Ask to see the last year's quarterly reports that went to Department Supers with signing power over budgets.
Run like hell....
Did this recently. Started with core network topology documentation, moved on to DNS. Foundational stuff. Documenting subnets, figuring out what documentation and systems should be deprecated. Made lots of diagrams. Reviewed monitoring tools. Prioritized systems by importance to review for best practices. Got a network security audit to find holes. Bam.
Primarily, you'll want to build an honest rapport with the other person. Get inside their head a little and allow them to brag A LOT. Ask how they found the place and what they did to change it. You'll want to breeze through all of the high level and important documentation first so you'll have a baseline. Take as many notes as you can. Ask what websites/resources they use to make it easier to follow in their tracks. Explain your situation to them. It will humanize you in their mind and you might be able to engage their compassion for you. Perhaps they would be available to answer questions after they leave! Is there budget money for them to be used as a compensated resource? Hopefully they like the idea of helping others and putting some scratch in their pocket.
Bonne chance!
Steve
--- rapper/producer/bachelorette party stripper
On my current gig, I got one day...
Solving Unix problems since 1989...
Who the hell manages to become responsible for 1000s of systems and networks without being forced to document them as part of their job?
BTW, is this a voluntary or involuntary job move by the outgoing person? That's going to affect the quality of data you are going to receive.
>> How do you digest the thousands of XXXs in a week?
Dude, step away from SlashDot RIGHT NOW.
I would start by writing your own manuals and have the outgoing person review them.
Some people die at 25 and aren't buried until 75. -Benjamin Franklin
Take over right away. Don't let him do anything. Ask lots and lots of questions. Take notes.
1. Get da passwords. Verify them. :)
2. Support contracts.
3. "What are common problems"
4. "Can I get your email"
Make sure you have thick gloves and sanitize everything. Check for booby traps. Never push any buttons till you've traced the wires back to their origins.
http://bofh.ntk.net/BOFH/
Where did this "in a week" concept come from? You or management?
Depends on how big of a company this is too - are you in a team? - are you tier 2 with a helpdesk underneath?
Anyway, slow down and be realistic. Focus on the users of the systems (your customers) and not the systems themselves and you'll be fine.
2c.
Eat his brain. Just be careful of kuru.
If they don't have them documented... good luck.
1) Need passwords... immediately change them. The exiting person should have no further access except through you.
2) Require exiting person to produce network diagram. Make it their last duty if one doesn't exist.
3) Now starts the pain... audit devices and systems for rogue accounts.
4) Document as you go.
5) turn in passwords to supervisor.
Good Luck
The only way to address an information void is to fill it with good information. Hopefully everything is standardized.
The incumbent will know what to teach you if you only have a week. If they are leaving on good terms, they probably won't be averse to having some questions asked by email after they leave, so it's not the end of the world after 7 days.
I've left jobs almost a decade ago and still get an occasional question once or twice a year. It's not that it wasn't documented or couldn't be solved through a few hours of investigation, it's just that a 2 minute email and a 2 minute response later you get your answer and move on. Much more efficient. Sooner or later, the questions stop, and they are self sufficient.
Keep them on as a consultant, and *pay* that $$$ per hour when you need to.
(This assumes they are quality folk in the first place, of course.)
http://rocknerd.co.uk
Secret Server (or something like it) is very cool. Get the outgoing person to put all the access passwords, locations, etc. for every bleeping system in it.
Then change the master password after he leaves.
Lol good luck!! A few years ago I left the company I worked for, when a certain very large bank was taking over the very large bank I worked for. I took the package. They sent a Unix (AIX) admin to come learn about my 400 (ish) Solaris systems, most of which were Oracle RAC and VCS clusters. This poor AIX guy knew nothing of Solaris, Oracle or VCS. All the documentation in the world wasn't going to help him. So, I did what any self-respecting UNIX guy would do. I told the very large bank what my very large hourly rate would be going forward after I left, and gave them all the help I could for the next six months while they found a Solaris-knowledgeable UNIX admin.
If possible, build a relationship with the outgoing administrator. Accept that a lot of his head knowledge will dissipate soon after he leaves the company but it wouldn't hurt to have him as an information resource for the first few weeks, just don't abuse the privilege. If he's moving on to another job, his loyalties and focus will be to them not his old company.
Get his permission and comb through his corporate inbox and home directory -- dump it to an offline location. I know you don't need his permission but a little humble pie goes a long way. Again, don't abuse the privilege. Burn a little overtime and construct your own documentation from whatever you find. Let your new supervisor know you're going to bill overtime and why it is necessary. A little work now will make your holiday season a lot smoother.
I did this in my first LAN Admin job. Within four months I was able to take off to the Caribbean for a week and I never received one phone call.
Only the dead have seen the end of War. - Plato
Ask about the problem children and squeaky wheels (as far as servers go); that will get you down to the one-off fixes that are held up by bits of string and expect scripts that rely on chaining ssh across 5 machines to touch a file that doesn't exist. Ask about the oldest equipment, and spend some extra time getting to know your world. Leverage the time you have with him: when he goes home for the day, scour the network and start looking at boxes and going over notes; when you run into problems, write them down and ask him about those.
And you don't have a week, you have 4 days, because on the last day he may show up but you most likely won't get much out of him unless it's a quick Q and A. That, and keep your resume up to date, because you may want to use it again soon.
Good leaders run toward problems, bad leaders hide from them.
Now, if the old guy worked for the IT services firm, then you should not have the outsourcing roundabout in the way.
But if they worked at the place that the firm took over, then there may be paperwork or other things that get in the way. For example, some things the old IT guy used to work on may no longer be part of your IT firm's systems, or part of what you control or what your pricing covers.
Or other stuff, like the non-managed box that users do not log in to but that runs, say, the phone switch or the door systems. If such boxes get put under the domain or your overall IT plan for users, they may fail, or the box may need to be tied to a user, or it may get kicked off the network or out of the office.
NO ONE here suggested the most expedient and obvious solution - the Vulcan mind-meld!
Aside of that, ask LOTS of questions, take LOTS of notes. You only have 5 days.
The first thing is to figure out what the most mission-critical systems are, and cover them in order of priority. Really press him on the criticality of each system.
Top Priority: Systems where downtime has an immediate impact. There is NO workaround; it needs to run.
High Priority: Systems where there is a workaround for downtime and they can tolerate it being down for a few hours while you mess with it.
Medium Priority: Systems that can be down for a day.
Low Priority: Systems that can be down longer than a day.
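To make the tiers above concrete, here is a minimal sketch of sorting an inventory into them. The `System` fields, the example hosts and the exact cutoffs (anything under a day counts as "a few hours") are my assumptions, not something from the post; adjust them to whatever the outgoing admin tells you.

```python
from dataclasses import dataclass
from datetime import timedelta

# Tier names in the order they should be covered during the handover.
TIERS = ["top", "high", "medium", "low"]


@dataclass
class System:
    name: str
    tolerable_downtime: timedelta  # how long the business can cope without it
    has_workaround: bool = True    # False means "it needs to run"


def priority(system: System) -> str:
    """Map a system onto one of the four tiers (cutoffs are assumptions)."""
    if not system.has_workaround or system.tolerable_downtime == timedelta(0):
        return "top"
    if system.tolerable_downtime < timedelta(days=1):
        return "high"
    if system.tolerable_downtime <= timedelta(days=1):
        return "medium"
    return "low"


def handover_order(systems):
    """Sort systems so the most critical ones are reviewed first."""
    return sorted(systems, key=lambda s: (TIERS.index(priority(s)), s.name))


if __name__ == "__main__":
    # Hypothetical systems, for illustration only.
    inventory = [
        System("wiki", timedelta(days=3)),
        System("phone-switch", timedelta(0), has_workaround=False),
        System("mail", timedelta(hours=4)),
    ]
    for s in handover_order(inventory):
        print(priority(s), s.name)
```

The point is only to force an explicit answer to "how long can this be down?" for every box, so the limited handover time goes to the top of the list first.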
Try to get the passwords, or make sure you have your own passwords and rights to all the systems. Work in order of priority.
Create a network map, inventory every system, switch and router... Make sure you have access to them.
Find the Power Users in the area; they may be able to help you out later on. They may not know everything the sysadmin does, but they know their little section and sometimes have tips and tricks that don't get passed on. If there is an issue after he leaves, you have contacts.
Get the vendor support numbers if available.
Working in order of priority, find the custom stuff: programs, scripts, etc. Do an overview of what they do, what language they are in, and which systems they affect.
On the second-to-last day, shadow the old admin. On the last day, do everything yourself; he should only mentor.
After he leaves. CHANGE ALL THE PASSWORDS he knows, and check for back doors in the network to prevent him from entering the system.
Due to short time of transition you will probably stumble a bit, but you should have enough to hit the ground running.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Make sure the person leaving knows you'll buy him or her a beer or two when you have a question you can't figure out on your own in reasonable time.
Circle the wagons and fire inward. Entropy increases without bounds.
Step 1. Kidnap his wife/girlfriend
Step 2. Tell him the only way he'll see his Significant Other is if he gives you all the information and passwords you need.
Step 3. Torture his SO, and then call him up and let him listen to her screaming.
Step 4. Tell him that if he tries to call the cops you'll torture her to death.
Step 5. Wait for him to give you the information and passwords.
Step 6. Kill him and his SO. Don't leave any evidence. Don't keep any trophies. Drop the bodies off in a predominantly African-American ghetto to reduce the likelihood of a serious investigation.
Step 7. ???
Step 8. Profit.
It is the only way to be sure.
"To those who are overly cautious, everything is impossible. "
Every conversation you have with the outgoing admin, record (with permission, of course). When they're showing you something on a workstation, screen capture it. Write notes up for all of this first, while it's still fresh. Have them walk you through each server, each device, and all the issues with it. You won't remember everything they've said, and they aren't going to do as great a job documenting things as you'd like during their last week, as their head's already out the door.
Mod point free since 2001
One good way to learn the new environment is to test any existing disaster recovery procedures; you will quickly find out what's important and what doesn't work.
Get up!
When I quit my last job, I was there for 5 weeks after saying, "I'm gonna go ahead and leave." I said I could stay as long as they needed to have a smooth transition but that was clearly a mistake. 2 weeks in, absolutely nothing had been done to transfer my tasks so I set a firm date for 3 weeks later. Had all my tasks documented but no direction on who would take over. Another week goes by. "Who is taking over these tasks?" And another. "Who is taking over these tasks? I would really feel more comfortable walking them through their first week." [cricket_chirps] A couple days before I left, I emailed my documentation to the remaining department employees with one last reminder that, even if everything else is ignored, backups and archiving are very important and require daily attention. I assume they figured things out without catastrophic failures because they're still around.
I have been doing this for the last 18 months, since our sys admin was terminated. Write stuff down. Find a secure place (or two) on the network to store an Excel spreadsheet with IP addresses, DNS names, and credentials for servers, databases, routers, printers. Encryption keys, vendor support websites. Save root, administrator, and sys passwords, and any other administrivia, in some sort of order you can decipher in 3 months at midnight. I use worksheets to identify categories of information. It's probably more secure to not keep this stuff all in one spreadsheet, but the fact is the document becomes a corporate asset. You can be the keeper of it, and the central answer person--lots of parties need that kind of information. Back it up, encrypt it, whatever. Where I work, only the CIO, two database admins, and the network admin have read permissions on it. Do not print it out, or carry it on a USB stick that can be misplaced. It's an admirable gesture, but probably masochistic to try and store this information in a secure database, because that may run on the server that goes down at midnight when you most need that list. Plus it's freeform-- we keep different columns of data for OS's, servers, cert keys, routers, databases, etc. It's also nice to have it handy and organized, so you can paste it into vendor inquiries. Saves money and consternation next time you don't have to look up the info ad hoc. It's easy enough to find out the MySQL version, but when there are 10+ servers, you will be glad you've got it in one spreadsheet.
Save model numbers, sales staff information, customer contacts, warranty information, service contracts. Also record server software versions. It's easy to remember if you just bought it, but in two years, you will be glad you know it's Oracle 10.1.0.5 and not just 10g. All the big IT suppliers-- Oracle, Microsoft, HP, Dell, NetApp, SAP-- have their own twisted bureaucracies, ticket tracking systems, incident reporting and escalation, and lines of communication. Put as much of that info in the spreadsheet as you can. You can even embed links to support sites in Excel.
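For anyone who would rather start the sheet from a script, here is a rough sketch that writes the kind of inventory described above as CSV, which Excel opens directly. The column names and the sample row (host name, address, URL) are made up for illustration; I have deliberately left credentials out of this plain-text version, on the assumption that they live in the encrypted copy.

```python
import csv
import io

# Assumed column set, loosely following the fields mentioned above.
FIELDS = [
    "category",
    "dns_name",
    "ip_address",
    "model",
    "software_version",
    "vendor_support_url",
    "warranty_expires",
]


def write_inventory(rows, out):
    """Write inventory rows as CSV; missing fields become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({field: row.get(field, "") for field in FIELDS})


def rows_missing(rows, field):
    """Names of the entries where a given field has not been filled in yet."""
    return [row.get("dns_name", "?") for row in rows if not row.get(field)]


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    rows = [
        {
            "category": "database",
            "dns_name": "db01.example.internal",
            "ip_address": "192.0.2.10",
            "software_version": "Oracle 10.1.0.5",
            "vendor_support_url": "https://support.example.com",
        },
        {"category": "router", "dns_name": "gw01.example.internal"},
    ]
    buffer = io.StringIO()
    write_inventory(rows, buffer)
    print(buffer.getvalue())
    print("No version recorded for:", rows_missing(rows, "software_version"))
```

The `rows_missing` helper is the useful part during a handover: it gives you a list of questions to ask while the outgoing admin is still in the building.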
Try and figure out which servers talk to each other, which have dependencies and would be affected by an issue with another server. It's good to learn the network topology-- which equipment and services are in which segment and why. Where does the internet come in? Try not to work too late. Don't carry a gun to work. Be nice to the users. That's about all I've got.
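One way to keep track of which servers depend on which is a small dependency map you can query. This is only a sketch: the server names are invented, and the `depends_on` structure (each server mapped to the things it needs) is my own assumption about how you might record it.

```python
from collections import deque


def affected_by(failed, depends_on):
    """Return every server that directly or indirectly depends on `failed`.

    `depends_on` maps a server name to the set of servers it needs.
    """
    # Invert the map: for each server, who depends on it?
    dependents = {}
    for server, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(server)

    # Breadth-first walk outward from the failed server.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for server in dependents.get(current, ()):
            if server not in seen and server != failed:
                seen.add(server)
                queue.append(server)
    return seen


if __name__ == "__main__":
    # Hypothetical environment, for illustration only.
    depends_on = {
        "web": {"app"},
        "app": {"db", "ldap"},
        "db": {"san"},
        "print": {"ldap"},
        "ldap": set(),
        "san": set(),
    }
    print(sorted(affected_by("san", depends_on)))
```

Even a rough map like this answers the midnight question "what else breaks if I reboot this box?" faster than memory will.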
Everything I've ever learned the hard way was based on a statistically invalid sample.
Seriously, there's no hope you'll actually be able to cram everything you need to know into your brain and make it stick. You need runbooks.
Here's a TechNet article on how to put together a Windows server runbook. You'll also be able to google for Linux or Unix examples, although you'll find mostly snippets focused on how to write a runbook section for one specific product or another.
A high-level runbook should document overall systems architecture: network layering, external and important internal connections, service agreements, contacts, roles and responsibilities. The per-system runbooks should focus on configuration details and functional description (why the server is in the architecture). Per-service runbooks crosscut servers and describe how a particular service is deployed, started, stopped, upgraded, etc.
It's a lot. If you don't already have a lot of this, start now. If you do, get it current and updated now.
Welcome to the Panopticon. Used to be a prison, now it's your home.
Review the available documentation, and talk with the guy if he is available. A lot of times I ended up reverse-engineering everything, though, so be ready for that possibility.
Everything I write is lies, read between the lines.
I had to do this after a company ran for 6 months without an admin and with no documentation whatsoever left for others. From when the sysadmin left to when I arrived, most things went to rack and ruin with only some developers doing a little sysadmin on the side. In the end it took nearly the same amount of time as the time that the company was without an admin to find out the information or rebuild what couldn't be salvaged.
I have to say that it gave me a lasting impression of what a company can lose when handover fails.
Luckily, I've also had some good handovers and the best way I've found to do it would be to book the whole week as a workshop. Nothing the outgoing staff member can do is more important than the handover and it can often create a level of goodwill in that you are asking for their assistance and making them realise how important they have been.
However, there are also some rules to the week. Some apply to you - you need to check what you are being told. Anything that starts to look hinky or just plain wrong needs to be constructively challenged. If it still doesn't add up after a challenge and you know they are lying then you need to get them on garden leave as soon as you have the keys and passwords.
My approach for general sysadmin has been to try to understand the systems from the ground up very quickly and I've found it useful to have the following as general headings:
1) Passwords - where they are, how they are kept, what policies are in place? Generally find out how it has been managed in the past. Most important - verify them.
2) Network diagrams - use network scanning and mapping tools to verify what you are finding
3) Infrastructure services - understand the setup for anything important to the infrastructure of the systems. Things like DNS/DHCP/NIS/Kerberos/Pam/LDAP/AD/Certificate Authorities/Identity Management/etc/etc.
4) Storage services - SANs/Makes/models/Where to find support contracts/BACKUPs/Data replications/File stores/etc.
5) Core end user services - File/Print/Core Databases/Core Apps.
6) Cloud services/domain registration accounts/3rd party supplier access
There will always be more to find out but hopefully having a list of what you need can stop your company wasting a lot of time and money in having to rebuild what it can't support.
(working for an IT Support firm)
Most of the above posts pretty much have it nailed... The point made about WHY things were done is very astute - not just technically but politically.
I would like to add one thing though - do your very best not to slag off the departing sysadmin or the environment – it is a sad fact that IT people get a bad rep and many of them do not deserve it (See the point about WHY things were done) – often this ‘bad mouthing’ starts from the client – but it should not readily be agreed with, unless there are very obvious and serious failings!
Try to think about your profession and the industry as well as your current role and DOCUMENT! & COMMUNICATE!
Most of the stuff was 'glued together' with Perl. So one of the job requirements for my replacement was an understanding of Perl. So I sit down with him on a simple task and have him look over my shoulder as I patched one of the Perl CGI scripts. After about 5 minutes of his silence, he asked, "What language is this written in?" And there we were, staring at the "#! /bin/perl" line at the top of the script. Things were not going to go well.
As my end date approached, I gave him a copy of a configuration file that managed sending out e-mail/paging on error conditions and suggested that upon my departure, he put his own e-mail address and pager number in there. One of the addresses was my home e-mail account. For the next three years, I continued to get, "The server is up/The server is down" messages.
Some time later, the company outsourced the whole system to an offshore company (Strange, I thought, for a DoD contractor). They found my name in the headers of all the files I had revised (it's a rather unique name, easy to Google) and hired me as a well paid consultant to assist them in maintaining the system.
Have gnu, will travel.
hope they have documents
If they don't just ask your local NSA guy, I'm sure he'll be able to help out with some diagrams and backdoors to your systems.
I'm not a complete idiot... Some parts are missing.
Then document as if you might be killed in a car accident on the way home from work and your manager has to take over.
I've never understood why any admin would care about this. If the employer is too cheap to realise they need support in depth to actually be supported, why should I care about the operation going tits up if I get taken out by a bus? They gambled knowing the risks and lost. Suck it up.
Real support is more than one over-worked wizard who knows and controls everything (cf. San Francisco). I want to be training a PNG into the position who can learn, who I can bounce ideas off, who comes in with a different perspective and history from mine, helps with the drudge work, and takes over when I'm not there (sick, recovering from an outage, holidays, bus error).
Any employer who can't see this can go fsck themselves. You get what you pay for. You gamble wrong, you ought to lose your shirt.
"Tongue tied and twisted, just an Earth bound misfit
Some companies can't afford 2 sysadmin people. It's not that they are deliberately gambling, they are doing the best they can with limited money.
Obviously this only applies to tiny or failing companies.
Switch your hours so you're working 1-2 hours per day when your users are not. During that time you will get more done than in the other 6-7 hours of the day.
I do exactly that. It's the best productivity advice ever. I also try to work weekends instead of weekdays some of the time just to get a better throughput of work.
You can learn a lot in a week, if you really want to.
The short answer is: you don't.
You fake it until you make it, or can corner the guy in a dark alley and beat it out of him.
In all seriousness, I've been there probably more than most. I've worked most of my career for managed service providers (of varying quality) where there is no environmental documentation to speak of, in most cases, and almost invariably things are a complete goddamn mess because they're going from in-house to outsourced for a reason. More often than not, you're expected to replace 3-5 times as many people as you are - all while handling many other customers at the same time, in similar scenarios.
I was blessed several years ago by replacing 6 in-house IT staff in an academic environment by myself; I took over their full role, to the exception of a guy who came to do desktop stuff several times a week. Almost exclusively Linux systems, many with fairly exceptional setups, and lots of stupid interdependency: the previous 'administrators' were more developers than they were admins (and yes, that is an insult for a competent admin).
There were fewer than 80 physical servers and a couple hundred additional workstations (Windows/Mac/Linux). Half a dozen different distros, no virtualization. Half a dozen subnets, most carved up poorly with little forethought (or brought over from 'very legacy' environments and not migrated using best practices). Lots of one-off solutions were difficult to support and had few tenable options for migrating away from - situations where you just pray that hardware gets good enough, fast enough, to make what you need to do tenable to the users.
It took me a full year to get everything to the point where I was comfortable with the environment enough to know what was happening without needing the monitoring systems I'd put in place to tell me. This seems about right in my experience, at least for me. It's taken roughly a year, consistently, to get comfortable with the full scope of tasks and requirements of an environment from the users up through the ranks.
Until then, you fake it until you make it.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
Some companies can't afford 2 sysadmin people. It's not that they are deliberately gambling, they are doing the best they can with limited money.
I don't agree. If admin is critical to the future of the business, either they're cheaping out or they shouldn't be in the business in the first place as they're incapable of estimating the real cost of doing that business.
If something fails when I'm home sick and the business suffers, they should be wearing a "Kick me!" sign on their back. They've no right to blame anyone but themselves. I'm human, not a perfect machine or a robot. Expecting otherwise is just wishful thinking on their part. They deserve the consequences.
"Tongue tied and twisted, just an Earth bound misfit
Prepare three envelopes
... today wasn't your first day at Bluehost.
which is under a suitable jurisdiction. The waterboard them until they have told you everything. If somebody asks: terrorists attacked your network.
'Incapable of estimating the real cost of their business.' pretty much sums up most small companies. They cheap out in a lot of areas but in a lot of cases that makes the difference between getting wiped out by a debt collector on a bill they can't afford to pay and making a small profit.
Of course the upper management types never cheap out when it comes to their new company iphones, macbook air's, company cars, or giving the fat secretary they are screwing a huge pay rise.
I've done both, a couple times. Two weeks notice is always a bad idea, and a generous leave notice is always favorable to future employers. It's a pretty good sign that the dude isn't too worried about HIS situation or theirs.
1) I left with two weeks notice, taking an international job, so there wasn't much wiggle room. It sucked, and I kept good rapport to this day assisting them as much as I could by building documentation and working with my team for those two weeks and random calls thereafter. The team replaced me.
2) Left the international job, gave them 7 months. Took 5 to find the new guy. Politics and teaching him where to find information were the killers. He shadowed me for a few weeks, then we started splitting assignments, and then I took a mutually accepted garden leave for the last two weeks handling only critical tasks that shouldn't carry forward. Rapport is strong with these companies still as well.
3) Started the new job. Hired on Friday, started Monday, didn't even have a place to live. This one was a mess and I HAD to start on Monday because the bugger was GTFO. No documentation or best practices. Went through every server, what OS it was running, what applications, and the access. Started to canvas the network, only to realize one week wasn't enough. Focused on access and design realizing quickly it was a big ball of duct tape with VLANs and different OSPF styles.
When it really came down to it, I gauged my skillset, supplemented my tools and information with whatever could carry the biggest impact, and prepared myself to crash and burn. Now, less than a year later, I have started rolling out best practices based on my documentation and there's been a stark change as the enhancements and simplification have begun winning the battle. But it was a big uphill battle. Your armor will be more important than your weapons for a while, but every chance you get to use your weapons to correct something for win-win, do it and don't look back.
I actually was in this position back in 1999. I was just hired at this midsized company... The Network Admin was stepping down to a programmer position... He went on a 2 week vacation a week after I was hired then turned in his 2 weeks notice on the first day of his vacation. It was tons of fun. The way I did it? Changed all the passwords and went thru everything and charted everything out, then started making changes just in case. Good luck with your experience with this. I know it sucks.
No organization has the budget to pay for full time "understudies" for every role in the company.
Nor did I suggest that. Some roles certainly do deserve to be thought of in this way. If your IT infrastructure is as important to your business as we around here think it generally is, companies ought to be putting a lot more thought into this. IT is not just an accounting cost centre when it's storing critical business records, serving as the business' marketing face on the Internet, and making day to day work for thousands of employees even possible.
What you're, in fact, saying is that you feel you have no particular professional responsibility to document your work ...
No, I'm not saying that. I take documentation duties seriously; as seriously as I wish companies would take their obligations. Modern companies don't think they should have to, and that makes them shallow and brittle and makes my job vastly more difficult. You can't cheap out on critical infrastructure and expect to get away with it for too long.
"Tongue tied and twisted, just an Earth bound misfit
My experience taking over was about the worst-case scenario you can have. I was hired to replace the lead System Administrator at a newspaper after he was fired for a number of reasons (e.g. he told Reporters to fuck off when they needed his help). He was a true BOFH. He was gone by the time I started, and everything was in chaos. There was no documentation for anything. The expensive robotic tape backup unit that the IT Director thought was being used to do backups was actually in the original box in the corner of the guy's office. The newspaper didn't have any backups! It also turns out that he had been running his own side business on company time. The thing that helped me understand everything the most was doing an audit/inventory of every server, computer, router, etc. in the building. If a sysadmin leaves on bad terms your first priority has to be securing and updating everything. For me that meant forcing password resets, making sure every account in LDAP was needed by someone still working there, making sure there were no local accounts on any computers, updating virus signatures and running virus scans, making sure the latest software patches were applied, etc. I found a few computers with modems on them, so I removed those. I enabled remote desktop on computers so I could help people from my office. An audit/inventory of everything helped me meet a lot of people too. There were many hostile users... and they treated me like shit at first because of their experience with the previous guy. You have to have thick skin and just kill 'em with kindness. At a newspaper there's a period of time when everyone is rushing to make their deadlines to get the newspaper out the next day, so I blocked out an hour of each day during the crunch to spend it in the newsroom. If anyone had any problems they didn't have to call me or try to find me; I was right there to help fix it. If there weren't any problems I'd just work from one of the empty desks.
It garnered a lot of good will, which comes in very handy to any sysadmin.
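The "every account in LDAP was needed by someone still working there" check mentioned above boils down to comparing two lists. A minimal sketch, assuming you can export the directory's account names and get a current staff list from HR; the function name, the case-insensitive matching and the separate list of known service accounts are all my own assumptions:

```python
def stale_accounts(directory_accounts, current_staff, service_accounts=()):
    """Accounts in the directory that match neither current staff nor a
    known service account. Names are compared case-insensitively."""
    staff = {name.lower() for name in current_staff}
    services = {name.lower() for name in service_accounts}
    accounts = {name.lower() for name in directory_accounts}
    return sorted(accounts - staff - services)


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    directory = ["alice", "bob", "oldadmin", "backup-svc", "Carol"]
    staff = ["Alice", "Bob", "Carol"]
    print(stale_accounts(directory, staff, service_accounts=["backup-svc"]))
```

Anything the function returns is a candidate for review, not automatic deletion; someone may still depend on an account nobody remembers creating.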
Spend a year documenting the network with the guy present; it's really the only way to do it. Anything else and basically you are screwed. Get all the login codes, pray he doesn't forget any, and start firing up the network scanners like nmap, logging in and documenting everything.
How do you digest the thousands of hosts, networks and associated software systems in a week, especially when some documentation exists, but much of it is still in the mind of the former worker?
My recommendation is you demand that your boss get the old guy on retainer to be available by phone and e-mail for at least 30 days.
You really need 14 days worth of acclimation.
Get as much as you can out of him, bring a notepad and pencil, and a voice recording device.
I recommend you start with him getting you the keys to whatever he considers to be all the most critical systems; usernames, passwords, IP addresses. What the systems do, what software they run, the basics of how they're configured.
Also get all the software vendors' contract numbers, license keys, and installation media; find out where those are kept, where the config files live, etc.
Get through all those. Next make sure you get all the keys to the firewall and routers and switches.
And miscellaneous hosts.
Run nmap scans across the assigned IP subnets to make sure you didn't miss any devices. Start building a spreadsheet of devices if he didn't have one, or complete it if things are missing.
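To cross-check a scan against the spreadsheet, something like the following can help. It is only a sketch: it assumes the scan was saved in nmap's grepable format (for example from a ping scan with `-sn -oG`), and the sample text, addresses and function names are made up for illustration.

```python
import ipaddress
import re

# Grepable nmap output reports live hosts as: "Host: <ip> (<name>)<TAB>Status: Up"
HOST_LINE = re.compile(r"^Host:\s+(\S+)\s+\(([^)]*)\)\s+Status:\s+Up")


def hosts_from_grepable(text):
    """Map IP -> reverse-DNS name (possibly empty) for every host reported up."""
    found = {}
    for line in text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


def undocumented(found, inventory_ips):
    """Return scanned IPs that are missing from the inventory, in numeric order."""
    known = set(inventory_ips)
    missing = [ip for ip in found if ip not in known]
    return sorted(missing, key=ipaddress.ip_address)


SAMPLE = (
    "# Nmap scan (sample data)\n"
    "Host: 10.0.0.1 (gw.example.lan)\tStatus: Up\n"
    "Host: 10.0.0.23 ()\tStatus: Up\n"
    "Host: 10.0.0.9 (printer.example.lan)\tStatus: Up\n"
)

if __name__ == "__main__":
    live = hosts_from_grepable(SAMPLE)
    for ip in undocumented(live, inventory_ips=["10.0.0.1"]):
        print(ip, live[ip] or "(no reverse DNS)")
```

Every address it prints is a device to ask the outgoing admin about while he is still around.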
Make sure that some time during the week, the two of you go through a physical audit together of the "server room" or "server closets", and whatever other places there are IT infrastructure.
The physical audit should include a verification that you didn't miss getting keys or IP address/config details to any equipment.
You need to know where all the equipment is, circuit breakers, cable management, etc.
If this is a large site, you might need more than 7 days. Anything you can't get during the time before the guy leaves, YOU get to figure out; probably at the worst possible time, because people are screaming at you about an issue caused by the sudden non-operation of a device you don't know exists.
On the bright side, you'll improve your forensic skills.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
Ask for an inventory with parts-count list, including a network diagram, with services locations visible; the complexity here is often to understand how the IP level (Layer 3) configuration relates to Layer 2 and Layer 1, plus what interdependencies there are among services placed on systems (high availability features etc).
That kickstarts the technical handover, but you need the organizational one, too. The Australians seem to get it right, carry on reading.
ps. :)
Yes, everything, yes: full shutdown, up to the electrical supply. You can't cheat that a lot; you'll miss info, but only secondary info. That will be a good base for stellar (reverse) engineering.
Here is my favorite hard-to-bend systems hand-over test: shut down and restart everything. If you can
It's amazing what holes in the docs show up when things really go wrong. You can pre-empt this by acting as if things have gone wrong and trying to find a solution. This should remind the outgoing guy of what is important to pass on.
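If you do attempt the full shutdown-and-restart test, the service interdependencies mentioned above decide the order. A rough sketch of deriving that order with a topological sort, using Python's standard `graphlib` (3.9+); the service names and the dependency map are invented for illustration, and the real ones have to come out of the handover:

```python
from graphlib import CycleError, TopologicalSorter


def startup_order(depends_on):
    """depends_on maps each service to the services that must be up before it."""
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Shut down in the reverse of the startup order."""
    return list(reversed(startup_order(depends_on)))


if __name__ == "__main__":
    # Hypothetical dependency map written down during the handover.
    deps = {
        "dns": [],
        "ldap": ["dns"],
        "nfs": ["dns", "ldap"],
        "web": ["nfs", "ldap"],
    }
    try:
        print("start:", startup_order(deps))
        print("stop: ", shutdown_order(deps))
    except CycleError as err:
        # A cycle means two services each claim to need the other first,
        # which is exactly the kind of hole in the docs this test is meant to find.
        print("circular dependency:", err.args[1])
```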
Nmap kills some things like HP jetdirect external boxes, crashes some Samsung office phones and has other consequences, so beware of using it on an unknown network during working hours. It's not that there's anything wrong with nmap, it's just that some network hardware is a fragile piece of shit that will fall over (or in the case of that HP stuff, break) if you hit some ports with packets.
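One way to act on that warning is to keep the known-fragile boxes out of the target list before anything gets scanned. A small sketch with Python's `ipaddress` module; the subnet and the "fragile" addresses are placeholders. (nmap also has an `--exclude` option that serves the same purpose.)

```python
import ipaddress


def safe_targets(subnet, fragile):
    """List every usable host address in subnet except the known-fragile ones."""
    skip = {ipaddress.ip_address(addr) for addr in fragile}
    return [str(host) for host in ipaddress.ip_network(subnet).hosts() if host not in skip]


if __name__ == "__main__":
    # Placeholder addresses: pretend .1 is a print server that falls over when probed.
    for target in safe_targets("192.0.2.0/29", fragile=["192.0.2.1"]):
        print(target)
```

This only protects devices you already know are fragile, so the advice about staying outside working hours on an unknown network still stands.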
Not an admin but this is what I would consider doing with management backing.
Install a pair of SSH relay servers with full logging of everything going in/out of it to a write only filesystem. Configure production boxes to only accept connections via this relay server.
There are good reasons to have a full audit of everything admins do; and suddenly you know absolutely everything the admins are doing today.
If a ticket is closed and the process isn't documented, give your technical writer the log snippet so they can document the process.
Either you've got 2 weeks' worth of work fully documented or evidence that the previous guy wasn't working.
Rod Taylor
I did that once and it had the opposite effect since upper level management assumed they would have a lot of time for the changeover. My replacement arrived some time after I left the state and my assistant with almost no experience was left swimming in deep shit for a week or two. That was after giving a couple of months notice and then at the last minute agreeing to delay my departure for another month. There's a middle ground somewhere that probably works.
Pay the departing sysadmin for their time, by any legal means, to provide additional information. I've had to work with companies where a core admin had just departed, and had to help hide that we then hired one such admin as part of our company with a different title in another group, partly so we could tap them legally for information about their old company's environments. We got a good engineer, they got a good contract to help out while they looked for a permanent role, and were able to factor in undocumented aspects of the old company's security practices and backup systems which they were flat-out lying about.
Find out why that admin is leaving, without their manager in the room or any witnesses. Don't take "no" or "we'll get that to you" as an answer: go behind the company's back if you have to, because if they're hiding it, it's probably _vital_ to know about.
Do a complete hardware inventory, both of material they're directly responsible for and of devices _connected_ to those. Include the names of the people responsible for services, and who need to be contacted for issues, for every single system.
Verify that the backups are complete and that they do in fact work. This is a very good time to get that backup server, or that failover switch, that has been waiting for the right time to be installed, and ideally perform the restorations on those.
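"They do in fact work" can be checked by restoring to a scratch location and comparing checksums against the live data. A minimal sketch, assuming both trees are reachable as ordinary directories; the function names and the demo files are hypothetical:

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; fine for a sketch, chunk it for huge files.
            digests[path.relative_to(root).as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_restore(original, restored):
    """Report files that are missing from, extra in, or different in the restored tree."""
    a, b = tree_digests(original), tree_digests(restored)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "corrupt": sorted(name for name in a.keys() & b.keys() if a[name] != b[name]),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        live, restored = Path(tmp, "live"), Path(tmp, "restored")
        live.mkdir()
        restored.mkdir()
        (live / "a.txt").write_text("hello")
        (live / "b.txt").write_text("world")
        (restored / "a.txt").write_text("hello")
        print(compare_restore(live, restored))
```

Files that legitimately changed since the backup ran will show up as "corrupt", so compare against a snapshot or a quiet directory.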
Warn the managers that there are likely to be service interruptions, and ensure that the monitoring system works well to report them.
Do not change the default scripting language or configuration management system or source control system or account management tools until an opportunity to learn the old one is at least 80% completed.
First thing to do is check that backups are being taken, then set up something to automatically go round all the servers and gather as much information on them as possible. Then do what the others say, but if the previous incumbent has not done this, then there isn't much hope. Enk.
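For the "go round all the servers and gather information" part, here is a sketch of the per-host half: a small script that prints basic facts as JSON, to be run on each machine by whatever remote execution you already have (an SSH loop, cron, config management; that part is assumed and not shown). The chosen fields are just an illustration.

```python
import json
import platform
import shutil
import socket


def collect_facts(mount="/"):
    """Gather a few basic facts about the machine this runs on."""
    usage = shutil.disk_usage(mount)
    return {
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "os_release": platform.release(),
        "arch": platform.machine(),
        "python": platform.python_version(),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }


if __name__ == "__main__":
    # One JSON document per host is easy to collect centrally and diff over time.
    print(json.dumps(collect_facts(), indent=2, sort_keys=True))
```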
Having worked for a smaller company that could only afford a one person IT department, I agree that it doesn't make sense paying for a second full time person to sit around and stare at the primary sysadmin while they do their job. Frankly, in most companies with one person IT departments, work load is somewhat inconsistent to begin with - oftentimes there's just enough work to do where it would cost more to bring in a part-time consultant to do 4 hours of work that day than to pay the full time sysadmin to do 4 hours of work and then read Slashdot for the other 4. That problem would only get worse if you had to pay for an "understudy".
Thankfully, there's a really easy way to manage that, even for smaller companies. Bring in a part-time consultant periodically (say, for a couple hours every month) as an insurance policy. For the brief period they're there, have them focus on documentation and chatting up the sysadmin. As an added bonus, maybe have them check backups, server logs and the like to ensure the sysadmin isn't falling asleep at their desk. Another bonus is that, if the sysadmin has a large project planned that they could use some additional temporary headcount on, you have someone else with some institutional knowledge lying around.
...should be the ultimate goal. Understand the design, get it under basic control and then work with a team (largest you can muster) of diligent specialists to design replacement systems that are firewalled off from the original. The reasons for this are twofold:
1) No matter how well documented, well designed, etc. the system is, your knowledge of it will never be perfectly complete and you'll never be able to turn around changes with the same degree of confidence and alacrity as the original admin.
2) Your Career -- If you bend over backwards to make the system work perfectly the original designer will still get most of the credit. If you try your best but the system falls short of expectations, you will take the blame as the new "owner" of the system. It's a lose-lose proposition. Building something new, something that you can demonstrate is supported by more than just one person (unlike the original) will be a feather in your cap.
Having worked for a smaller company that could only afford a one person IT department, I agree that it doesn't make sense paying for a second full time person to sit around and stare at the primary sysadmin while they do their job.
I've worked for a few of those too, yet I've never seen that. There's always been a lineup of requested features, systems due to be replaced/enhanced/re-worked, unexpected fires to be fought, and things to be learned or re-learned. Sitting around staring at someone else who's doing the work is what managers and ditch diggers do, not IT people, and I wish those people confined themselves to Facebook instead of trying to muddy the waters on tech sites. Why aren't they ruining HR's day instead of mine? HR's not doing anything useful.
"Tongue tied and twisted, just an Earth bound misfit
No, I'm not saying that.
You did say that ...
No, I didn't. When your nuclear physicist or rocket scientist takes a day off, do you expect your receptionist to dig into their notes to fill in? I produce good documentation and well commented code that those with an average knowledge or skill in the area can use to enlighten themselves to carry on my work. I don't expect managers to understand it nor do I expect them to want to. Writing it so they could would bore to death those who actually could fill in.
"Tongue tied and twisted, just an Earth bound misfit
I have taken over systems in both scenarios, friendly and very unfriendly. I agree with another poster that the ultimate goal is to reimplement most of the systems yourself. My last takeover was downright hostile. I had to do an audit of the systems, and although I had a list of passwords, many were swapped, and I had to find passwords for MySQL servers in logs or in scripts. I also found a couple of *very obvious* backdoors.
The first thing I did when taking over, after documenting all systems, passwords, and running services, was to create a control server with SSH keys, and a central syslog server. The second was disabling all root passwords and allowing access only via sudo, so that every access is documented for the team. The third was to set up SSH logins with RSA keys. The following step was to deactivate all the unnecessary services, like X or file sharing daemons on machines not sharing drives. After a year, I had reengineered something like 80%-90% of the services, as they were rather old and unsupported implementations.
The documentation/automation phase proved invaluable for answering ongoing requests. The middle of a crisis is not when you want to discover that you don't have the password to a system, or to have to figure out how it works. Nowadays we monitor most of the services in Nagios, with extensive scripting to adapt it to our environment, have service recovery on most of our servers, and also have a page that does automatic audits much more complete than the original audit, minus the passwords (for obvious reasons). We have also defined clearer responsibilities in the (new) team (Linux admin, Windows admin, etc.) and are starting to invest in internal training. For starters, we ask volunteers to give the other members of the team an hour-long talk on a technology they are most comfortable with.
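Hunting for passwords left in scripts, as described above, is easy to semi-automate. A rough sketch; the two patterns are illustrative guesses rather than a complete list, and it reports only file and line number so the secrets themselves don't end up in yet another log:

```python
import re
from pathlib import Path

# Illustrative patterns only; extend them for whatever the environment actually uses.
PATTERNS = [
    re.compile(r"(pass(word|wd)?|secret)\s*[=:]\s*\S+", re.IGNORECASE),
    re.compile(r"\bmysql\b.*\s-p\S+"),  # e.g. "mysql -u root -pSECRET"
]


def scan_text(text):
    """Return the 1-based line numbers that look like they contain a credential."""
    return [
        number
        for number, line in enumerate(text.splitlines(), 1)
        if any(pattern.search(line) for pattern in PATTERNS)
    ]


def scan_tree(root, suffixes=(".sh", ".pl", ".py", ".conf", ".cnf")):
    """Yield (path, line number) for suspicious lines under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            for number in scan_text(path.read_text(errors="replace")):
                yield str(path), number


if __name__ == "__main__":
    sample = "#!/bin/sh\nmysql -u root -phunter2 db < dump.sql\necho done\nDB_PASSWORD=abc\n"
    print(scan_text(sample))
```

Expect false positives; the point is a short list to review by hand and then rotate.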