Seven Habits of Highly Effective Unix Admins
jfruh writes: "Being a Unix or Linux admin tends to be an odd kind of job: you often spend much of your workday on your own, with lots of time when you don't have a specific pressing task, punctuated by moments of panic where you need to do something very important right away. Sandra Henry-Stocker, a veteran sysadmin, offers suggestions on how to structure your professional life if you're in this job. Her advice includes setting priorities, knowing your tools, and providing explanations to the co-workers whom you help."
What habits have you found effective for system administration?
The issue with #6 is that users almost never accept the answer. And a lot of the time it may be something you can't adequately explain, which they like even less. Especially if you know the problem wasn't the result of something you did.
For an inter-OS flame war ? (Again)
I discovered tmux (terminal multiplexer) a while back, and it is a very potent replacement for screen. It supports splitting windows, having multiple sessions, sharing windows between sessions, customizable status bars, etc. Try it out!
When working on a problem, I usually have two or more shells open. I don't mean multitasking, but with more than one open, I can issue commands from one and use the others to monitor logs, etc.
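The "monitoring" shell is usually just running something like tail -f on a log. As a rough sketch of the same idea in Python (the function name read_new_lines and the offset bookkeeping are my own illustration, not anything from the article), each call returns only the complete lines appended since the last call:

```python
import os


def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended since offset."""
    size = os.path.getsize(path)
    if size < offset:
        # File shrank: assume it was truncated or rotated and start over.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Keep only complete lines; a partial last line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Call it in a loop with a short time.sleep() between polls, passing the returned offset back in each time, and print whatever lines come back.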
Physics is like sex. Sure, it may give some practical results, but that's not why we do it
Have a mini-fridge under your cubicle desk for constant snacking. The constant snacking would be the habit. Really though, there are too many fat bastards in IT. (and everywhere else, but especially IT). And no, on a plane, I do not prefer sitting next to a fat person who needs two seats, who is sweaty and smelly, versus a person of normal weight.
i thought they were
sloth, gluttony, pride,...
What habits have I found effective for system administration? BOFH springs to mind ...
I know them all. They all work in Marketing.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
The first time a task comes up deal with it manually, it may or may not be related to a problem.
The second time this task occurs deal with it manually.
The third time this task occurs, it's time to start scripting.
It may take you a day or more to write the script, test, debug, etc., or even longer for complex tasks, but this behavior tends to be a winner. The script is already some degree of documentation: it records the steps, etc. If it's robust enough it can be used by your support techs to resolve issues, expanding the number of people who can resolve an issue and freeing the admin for other tasks. Scripts tend not to make typos (yes, I know your command line skills are legendary) and can save a lot of time and effort in the long run.
If you are not doing active improvements, planning for failover, and using good configuration management techniques, then your slow time is adding to the number of hurry-up-and-fix-all-the-things times. There are always external matters like Heartbleed that will come along, as a sysadmin's job is not to review the memory allocator in the SSL library regularly. However, if your web services or mail services are down because a single system went offline, then you have only yourself to blame.
Did you try turning it off then on again?
Everyone knows real programmers code in C, and in C you count from zero. Counting from one? That is so FORTRAN. Retire already, old chap.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
"What habits have you found effective for system administration?"
Carrying an Uzi.
Need Mercedes parts ?
As someone who's managed a team of sysadmins that moved to the Linux world from Windows, I have this tip: "Reboot does not fix anything, it just hides things".
For some reason, Windows admins have been trained to reboot immediately when things don't work well rather than to figure out why something is failing. I'm sure this was a valid "fix" in older versions of Windows, but Windows has been stable for quite some time, and things shouldn't mysteriously stop working for no reason. Take a bit of time to figure out *why* the CPU is suddenly spiking on the database server, since if you reboot it, you will have lost most of the evidence for why it's happening, and it's likely to happen again. If it's a production server and you can't spend much time, run a few diagnostics (ps, "top", lsof, etc) and save to a file for the postmortem, but don't just go in and reboot before looking around.
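Here's roughly what "run a few diagnostics and save to a file" could look like as a script, so it's one command when you're in a hurry. This is only a sketch: the snapshot function, the DEFAULT_COMMANDS table, and the output directory are names I made up, and the top flags assume the procps version found on most Linux boxes.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical default command set; adjust to what is installed on the host.
DEFAULT_COMMANDS = {
    "ps": ["ps", "aux"],
    "top": ["top", "-b", "-n", "1"],  # batch mode, one iteration (procps top)
    "lsof": ["lsof", "-n"],           # -n skips hostname lookups
}


def snapshot(out_dir, commands=None, timeout=30):
    """Run each command and save its output under a timestamped directory."""
    commands = DEFAULT_COMMANDS if commands is None else commands
    target = Path(out_dir) / time.strftime("%Y%m%d-%H%M%S")
    target.mkdir(parents=True, exist_ok=True)
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True,
                errors="replace", timeout=timeout,
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung tool should not stop the other captures.
            body = f"could not run {argv!r}: {exc}\n"
        (target / f"{name}.txt").write_text(body)
    return target
```

Something like snapshot("/var/tmp/postmortem") before the reboot leaves one text file per command to look through afterwards. A tool that is missing or hangs gets noted in its own file instead of aborting the rest.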
have a Windows computer on hand for video games and Facebook.
Only three things are necessary for a highly effective unix admin:
To crush your userbase
To see their accounts deleted before you
To hear the lamentations of the salesmen
... works like a charm for me.
rgb
Even when the experts all agree, they may well be mistaken. --- Bertrand Russell.
Don't waste time reading slashdot.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
90% of the job: "Have you tried turning it on and off again?" https://www.youtube.com/watch?...
Using anything like Puppet or Chef under version control to do all server ops will not only leave you with full, timestamped documentation, but will also allow you to easily scale servers horizontally, rebuild them should disaster strike, and protect you from stupid upstream package updates that b0rk your config files.
Have a staging and production environment? Pushing your Chef/Puppet scripts to production after they're proven to work ensures you have the same changes applied on both sides, and avoids manual operations on production.
If you really acquired good work habits, one would expect that you would be able to get out of the tedious, low end job that sysadmin is. That so much of the job can be automated should tell you a lot about it too.
With Linux/Unix just say, "Well it's an antiquated operating system." - and if Linux add "and with this F/OSS operating system, well, you get what you paid for and with the addition of [insert package name like systemd] 'blah blah blah blah' caused our problem. I need a raise to work with this shit!"
If we were on Windows, this wouldn't happen!
It works the other way around if you are a Windows admin, too. Just replace F/OSS with "closed undocumented source and [insert money hungry profit driven drivel screwing over customers stuff here] 'blah blah blah' If this were Linux, this wouldn't have happened! I need a raise to work with this shit!"
The exception is Oracle. Everybody throws their hands up, shakes their heads, and gets the knee pads and KJ and LIKE IT!
I think the most useful talent I've developed is the ability to go to sleep fast and to wake up fast and alert. When the phone rings or pager goes off, the faster you can reach "full on", find and fix the problem, and get back to sleep, the more sleep you get in the long run. Cohorts who have trouble getting to sleep after a late night emergency tend to be seriously dragging by the end of their oncall time.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
If you are in a shop that chains together MacMini servers, when there's a problem, start crying, ask for a hug and just say "I don't know what happened?! *sobbing* Mac is Go...Jobs' gift to mankind! *sob* I'm such a failure! *snot runs from nose*"
You'll get a couple of days off, they'll power down everything, turn it back on, and it'll all be working fine when you get back.
Seriously Slashdot, please don't summon up that old seven habits tripe again. It takes a special type of ignorance to subscribe to this garbage. Just like the "100 ways to" maximize self indulgence.
First time for this task ;-)
Rule #8 would be not to fix problems too quickly (and let some that you can see coming, happen).
If you fix every problem before it gets serious and avert the other 90%, your bosses will think they have a highly reliable IT infrastructure. They will then cast their eyes about for cost savings - and the biggest target will be the most highly paid admins - the most senior ones - YOU!!!
So keep the problems coming, as all that management have to assess you on are the number of fixes and the time to fix. Nobody ever got promoted for solving problems that never happened.
Finally: 60 hours a week? Don't be daft. If you're really an effective administrator you should have your work finished well inside 30 hours and/or 4 working days.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Find the people on your team who can be trusted to do the job well. Encourage them to do it. Work with them to build their skills as well as yours.
Find the people on your team who cannot be trusted to do the job well, and replace them with shell scripts.
On the other hand. . . make sure whatever changes you make will survive a reboot. . .
For habit #2 Nagios comes in really handy (could watch MRTG et al as well).
Set up all hosts in Nagios, sending alerts to an email address for a couple of weeks. Figure out which hosts show certain patterns.
Realistically, all that managers know these days is outsourcing. That's their response to anything that comes up. So what difference does it make what you do when you're going to get fired anyway? They're going to fire you, outsource what you're doing to someone who does it cheaper, and give themselves a bigger bonus. If you're an employee, you're going to get fired because you cost too much.
You really need to have a beard to get it. Do you have a beard? You don't sound like you have a proper beard.
Ehhh, of course you need a beard. But the article also says, to be successful you should remove spaghetti once in a while:
Habit 7: Make time for yourself
"[...] Taking care of yourself is an important part of doing a good job."
For example, I have some processes that involve visual basic scripts that run on a windows virtual server and send data files to a Unix server that reformats the files using Perl, preparing them to be ingested into an Oracle database.
I guess that answers the question of how many times one can curse in one sentence.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
Hello IT,
Have you tried turning it off and back on again?
No problem mate.
I am Bennett Haselton! I am Bennett Haselton!
I like to shut all the ports in the firewall. The sense of calm that descends on the servers is downright pleasant. Of course, then the phone begins to ring...
I've fallen off your lawn, and I can't get up.
All that face time in front of a PC, years and years' worth of DOS, Unix, OS/390, Linux, and Windows 2.0 to 8.1, is now worthless. It used to bring 100.00 per hour, when 100.00 per hour meant something. Now you're lucky if you get 10.00. Let them google it themselves. Heck, we learned with no Google.
"you often spend much of your workday on your own, with lots of time when you don't have a specific pressing task". We are the opposite. We spend every day with all departments blaming the network and us telling everyone how to do your jobs and stop blaming the network for your own problems or mistakes or lack of planning. If only we had that extra time to actually pay attention to the tasks we have rather than dealing with your problems. =]
A growing number of companies aren't hiring Unix admins or database admins or network admins anymore. These responsibilities are now being put on the shoulders of the software developers, who are expected to know how to do everything.
Be honest and candid with your teammates. If you tripped over a power cable, let the other admins know so they don't waste time analyzing the unscheduled reboot. (And, of course, secure that cable.)
But never, ever let it outside your team. While your fellow techies will generally appreciate your ability to admit fault, it'll only come back to bite you later if you admit fault to anyone outside your group.
You should try to become replaceable. Make most of your tasks automatic or trivial, so that systems try to heal themselves when known problems arise, anyone else can understand exactly how the systems work based on your documentation, and anyone can see that a problem is about to happen based on your monitoring.
That will make your work easier, let you take appropriate vacations, and make you irreplaceable when (not if) things change.
The Seven Deadly Sins
Most sysadmins are 6-wing 5
Type 5 on enneagram, the sin is greed.
Type 4 it is envy
Type 2 Pride
Type 7 Gluttony
Type 8 Lust
Type 9 Sloth
Type 1 Anger
Notice that the core types 3 and 6 do not map directly. Modern mappings add the traits of Type 6 Cowardice and Type 3 Deceit, and these can be seen as variants of the Sloth at point 9, since they are all sins of omission: not being available, not committing to action, and not supporting truth.
It has become my personal superpower. There is nothing more important as a tool. If you are not using one, start today. I can't recommend it enough.
lay off the donuts. snack on crudites.
Star Trek transporters are just 3d printers.
You poop in your seat?
Tell that to all the Linux-based copiers around here. Even the dollar bill changer in one of our Coke machines stops working until it is rebooted. Granted, those items aren't "fixed", but replacing everything that a reboot resolves would be rather expensive.