Seven Habits of Highly Effective Unix Admins
jfruh writes: "Being a Unix or Linux admin tends to be an odd kind of job: you often spend much of your workday on your own, with lots of time when you don't have a specific pressing task, punctuated by moments of panic where you need to do something very important right away. Sandra Henry-Stocker, a veteran sysadmin, offers suggestions on how to structure your professional life if you're in this job. Her advice includes setting priorities, knowing your tools, and providing explanations to the co-workers whom you help."
What habits have you found effective for system administration?
The issue with #6 is that users almost never accept an answer here. And a lot of the time it may be something you can't adequately explain, which they like even less. Especially if you know the problem wasn't the result of something you did.
I discovered tmux (terminal multiplexer) a while back, and it is a very potent replacement for screen. It supports splitting windows, having multiple sessions, sharing windows between sessions, customizable status bars, etc. Try it out!
When working on a problem, I usually have two or more shells open. I don't mean multitasking, but with more than one open, I can issue commands from one and use the others to monitor logs, etc.
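For anyone who wants the "one shell for commands, one for logs" setup in a single terminal, here is a rough sketch using tmux. The function name, the session name and the log path are all made up for illustration, so adjust them to your own box.

```shell
#!/bin/sh
# Sketch: one tmux session with a command pane on the left and a log tail
# on the right. The session name and log path are hypothetical defaults.

open_debug_session() {
    session="${1:-debug}"
    logfile="${2:-/var/log/syslog}"

    # Build the layout only if the session does not exist yet.
    if ! tmux has-session -t "$session" 2>/dev/null; then
        tmux new-session -d -s "$session"      # detached session with one pane
        tmux split-window -h -t "$session"     # second pane to the right, now active
        tmux send-keys -t "$session" "tail -f $logfile" C-m   # start the log tail there
        tmux select-pane -L -t "$session"      # move focus back to the left pane
    fi

    tmux attach-session -t "$session"
}
```

Calling `open_debug_session web /var/log/messages` (both arguments are placeholders) should drop you into the left pane with the log scrolling on the right. Running it again just reattaches to the existing session.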
Physics is like sex. Sure, it may give some practical results, but that's not why we do it
i thought they were
sloth, gluttony, pride,...
What habits have I found effective for system administration? BOFH spring to mind ...
I know them all. They all work in Marketing.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
The first time a task comes up, deal with it manually; it may or may not be related to a problem.
The second time this task occurs deal with it manually.
The third time this task occurs, it's time to start scripting.
It may take you a day or more to write the script, test, debug, etc., or even longer for complex tasks, but this behavior tends to be a winner. The script is already some degree of documentation: it records the steps, etc. If it's robust enough it can be used by your support techs to resolve issues, expanding the number of people who can resolve an issue and freeing the admin for other tasks. Scripts tend not to make typos (yes, I know your command line skills are legendary) and can save a lot of time and effort in the long run.
If you are not doing active improvements, planning for failover, and using good configuration management techniques, then your slow time is adding to the number of hurry-up-and-fix-all-the-things times. There are always external matters like heartbleed that will come along, as a sysadmin's job is not to review the memory allocator in the SSL library regularly. However, if your web services or mail services are down because a single system went offline, then you have only yourself to blame.
The reason there are more fat people in IT isn't because we want to be. It is because the GOOD IT people get fat because they know that the best IT people never need to leave their seats. If you have to leave your seat to do something as an admin, you are doing something wrong and not using the technology that is available to you to be able to fix everything but physical hardware failure or installation from your seat.
We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
Did you try turning it off then on again?
From TFS, I really don't get why that applies only to Unix admins. That describes the years I've spent as a Windows admin as well.
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Everyone knows real programmers code in C, and in C you count from zero. Counting from one? that is so FORTRAN. Retire already, old chap.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
"What habits have you found effective for system administration?"
Carrying an Uzi.
Need Mercedes parts ?
As someone who's managed a team of sysadmins that moved to the Linux world from Windows, I have this tip: "Reboot does not fix anything, it just hides things".
For some reason, Windows admins have been trained to reboot immediately when things don't work well rather than to figure out why something is failing. I'm sure this was a valid "fix" in older versions of Windows, but Windows has been stable for quite some time, and things shouldn't mysteriously stop working for no reason. Take a bit of time to figure out *why* the CPU is suddenly spiking on the database server, since if you reboot it, you will have lost most of the evidence for why it's happening, and it's likely to happen again. If it's a production server and you can't spend much time, run a few diagnostics (ps, "top", lsof, etc) and save to a file for the postmortem, but don't just go in and reboot before looking around.
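A minimal sketch of the "save a few diagnostics to a file before you reboot" step could look like the following. The function names, the output directory and the file names are my own assumptions, and `top -b -n 1` is the procps (Linux) batch form, so other Unixes will need different flags.

```shell
#!/bin/sh
# Sketch: save a quick evidence snapshot before rebooting a sick server.
# The directory layout and file names are hypothetical choices.

# capture <outfile> <command> [args...]: run a tool if present, keep its output.
capture() {
    capture_file="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$capture_file" 2>&1 || true   # a failing tool must not stop the rest
    else
        echo "$1: not installed" > "$capture_file"
    fi
}

take_snapshot() {
    snapshot_dir="${1:-/tmp/postmortem-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$snapshot_dir" || return 1

    capture "$snapshot_dir/uptime.txt" uptime       # load averages
    capture "$snapshot_dir/ps.txt"     ps -ef       # full process list
    capture "$snapshot_dir/top.txt"    top -b -n 1  # one batch-mode sample
    capture "$snapshot_dir/lsof.txt"   lsof -n      # open files and sockets, no DNS lookups

    echo "$snapshot_dir"                            # tell the caller where it went
}
```

Run `take_snapshot` with no arguments to get a timestamped directory under /tmp, or pass a path of your own. It only takes a few seconds, which is usually affordable even on a production box.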
Only three things are necessary for a highly effective unix admin:
To crush your userbase
To see their accounts deleted before you
To hear the lamentations of the salesmen
... works like a charm for me.
rgb
Even when the experts all agree, they may well be mistaken. --- Bertrand Russell.
Don't waste time reading slashdot.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
Fact is - a good SysAdmin can play well anywhere on the pitch. Typically a 'system' comprises more than one discipline.
Basic time management, inter-personal skills and some grasp of hygiene are pretty much must-haves.
Knowledge of the tools required to perform your duties and save the planet is a gimme, surely, once you are in that position. I am sure a 'nix admin can't avoid other disciplines, the same as a wintel admin can't avoid *nix. The difference perhaps is that a decent multi-discipline admin won't throw their toys out of the pram when they find out they have to interact with 3rd parties. (personally, physically or programmatically - take your pick)
By definition a system is a "complex whole" and should not, in 2014 be defined by OS...unless you are going to be specific about that OS, in which case you are not a sysadmin you are a *nix admin shirly.
90% of the job: "Have you tried turning it on and off again?" https://www.youtube.com/watch?...
Using anything like puppet or chef under version control to do all server ops will not only leave you with a full timestamped documentation, but will allow you to easily horizontally scale servers, rebuild them should disaster strike and protect you from stupid upstream package updates that b0rk your config files.
Have a staging and production environment? Pushing your chef/puppet scripts to production after they're proven to work ensures you have the same changes applied on both sides, and avoids manual operations on production.
You really need to have a beard to get it. Do you have a beard? You don't sound like you have a proper beard.
I have determined that my sig is indeterminate.
With Linux/Unix just say, "Well it's an antiquated operating system." - and if Linux add "and with this F/OSS operating system, well, you get what you paid for and with the addition of [insert package name like systemd] 'blah blah blah blah' caused our problem. I need a raise to work with this shit!"
If we were on Windows, this wouldn't happen!
It works the other way around if you are a Windows admin, too. Just replace F/OSS with "closed undocumented source and [insert money hungry profit driven drivel screwing over customers stuff here] 'blah blah blah'. If this were Linux, this wouldn't have happened! I need a raise to work with this shit!"
The exception is Oracle. Everybody throws their hands up, shakes their heads, and gets the knee pads and KJ and LIKE IT!
Have a mini-fridge under your cubicle desk for constant snacking. The constant snacking would be the habit. Really though, there are too many fat bastards in IT.
Obligatory video: Valve Snack Bar.
I think the most useful talent I've developed is the ability to go to sleep fast and to wake up fast and alert. When the phone rings or pager goes off, the faster you can reach "full on", find and fix the problem, and get back to sleep, the more sleep you get in the long run. Cohorts who have trouble getting to sleep after a late night emergency tend to be seriously dragging by the end of their oncall time.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
If you are in a shop that chains together MacMini servers, when there's a problem, start crying, ask for a hug and just say "I don't know what happened?! *sobbing* Mac is Go...Job's gift to mankind! *sob* I'm such a failure! *snot runs from nose*"
You'll get a couple of days off, they'll power down everything, turn it back on, and it'll all be working fine when you get back.
First time for this task ;-)
Rule #8 would be not to fix problems too quickly (and let some that you can see coming, happen).
If you fix every problem before it gets serious and avert the other 90%, your bosses will think they have a highly reliable IT infrastructure. They will then cast their eyes about for cost savings - and the biggest target will be the most highly paid admins - the most senior ones - YOU!!!
So keep the problems coming, as all that management have to assess you on are the number of fixes and the time to fix. Nobody ever got promoted for solving problems that never happened.
Finally: 60 hours a week? Don't be daft. If you're really an effective administrator you should have your work finished well inside 30 hours and/or 4 working days.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Find the people on your team who can be trusted to do the job well. Encourage them to do it. Work with them to build their skills as well as yours.
Find the people on your team who can not be trusted to do the job well, and replace them with shell scripts.
Yes... getting out of "the tedious low end job that sysadmin is" just so that you can sit in painfully dull meetings all day. Great plan that.
A Pirate and a Puritan look the same on a balance sheet.
For habit #2 Nagios comes in really handy (could watch MRTG et al as well).
Set up all hosts in Nagios, sending alerts to an email address for a couple of weeks. Figure out which hosts show recurring patterns.
You really need to have a beard to get it. Do you have a beard? You don't sound like you have a proper beard.
Ehhh, of course you need a beard. But the article also says, to be successful you should remove spaghetti once in a while:
Habit 7: Make time for yourself
[... ]Taking care of yourself is an important part of doing a good job."
For example, I have some processes that involve visual basic scripts that run on a windows virtual server and send data files to a Unix server that reformats the files using Perl, preparing them to be ingested into an Oracle database.
I guess that answers the question of how many times one can curse in one sentence.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
And as a zOS Systems Programmer too.
Hello IT,
Have you tried turning it off and back on again?
No problem mate.
I am Bennett Haselton! I am Bennett Haselton!
OK fatass.
I like to shut all the ports in the firewall. The sense of calm that descends on the servers is downright pleasant. Of course, then the phone begins to ring...
I've fallen off your lawn, and I can't get up.
These days all that face time in front of a PC (DOS, Unix, OS/390, Linux, years and years of Windows from 2.0 to 8.1) is worthless. What used to bring 100.00 per hour, when 100.00 per hour meant something, is now lucky to get 10.00. Let them google it themselves. Heck, we learned with no google.
Be honest and candid with your teammates. If you tripped over a power cable, let the other admins know so they don't waste time analyzing the unscheduled reboot. (And, of course, secure that cable.)
But never, ever let it outside your team. While your fellow techies will generally appreciate your ability to admit fault, it'll only come back to bite you later if you admit fault to anyone outside your group.
The reason there are more fat people in IT isn't because we want to be. It is because the GOOD IT people get fat because they know that the best IT people never need to leave their seats. If you have to leave your seat to do something as an admin, you are doing something wrong and not using the technology that is available to you to be able to fix everything but physical hardware failure or installation from your seat.
This is why my office chair is a toilet. Actually my entire desk is in a toilet cubicle with the rest of the IT Team 'just in case of emergencies'. Curiously though the sound of urination is no different from the sound of people pissing on things to mark their territory but they can't because we are already pissing on everything.
It's sometimes very odd when someone urgently bursts in during one of our meetings, but they usually leave feeling relieved.
My ism, it's full of beliefs.
You should try to become replaceable. Make most of your tasks automatic or trivial, so that systems try to heal themselves when known problems arise, and so that anyone else can understand exactly how the systems work based on your documentation, or see that a problem is about to happen based on your monitoring.
That will make your work easier, let you take appropriate vacations, and make you irreplaceable when (not if) things change.
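As a rough illustration of a system healing itself from a known problem, here is a tiny watchdog sketch. It assumes `pgrep` is available (without it the restart command always runs), and `ensure_running` is a name I made up, not a standard tool.

```shell
#!/bin/sh
# Sketch: restart a process when a known failure (it is gone) is detected.
# ensure_running and its arguments are hypothetical names for illustration.

# ensure_running <process-name> <restart-command> [args...]
ensure_running() {
    proc_name="$1"; shift

    # pgrep -x matches the exact process name; nothing to do if it is alive.
    if pgrep -x "$proc_name" >/dev/null 2>&1; then
        return 0
    fi

    # Leave a trace so the fix is visible later, then run the restart command.
    echo "$(date '+%Y-%m-%d %H:%M:%S') $proc_name not running, restarting" >&2
    "$@"
}
```

Called from cron with the process name and whatever restart command your platform uses, this covers the boring case and leaves a log line behind for the documentation part.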
Sad but true. The better you get at this job, the more weight you put on. What we need is an augmented reality device that lets us work while jogging. Or something ...
The Seven Deadly Sins
Most sysadmins are 6-wing 5
Type 5 on enneagram, the sin is greed.
Type 4 it is envy
Type 2 Pride
Type 7 Gluttony
Type 8 Lust
Type 9 Sloth
Type 1 Anger
Notice that the core types 3 and 6 do not map directly. Modern mappings add the traits of Type 6 Cowardice and Type 3 Deceit, and these can be seen as variants of the Sloth at point 9 since they are all sins of omission: not being available, not committing to action, and not supporting truth.
It has become my personal superpower. There is nothing more important as a tool. If you are not using one, start today. I can't recommend it enough.
lay off the donuts. snack on crudites.
Star Trek transporters are just 3d printers.
Tell that to all the linux based copiers around here. Even the dollar bill changer in one of our coke machines stops working until it is rebooted. Granted those items aren't "fixed", but replacing everything that a reboot resolves would be rather expensive.