Disempowering the Singular Sysadmin?

Posted by kdawson on Monday January 10, 2011 @03:39AM from the trust-but-verify dept.

An anonymous reader writes "Practically every computer system appears to be at the mercy of at least one individual who holds root (or whatever other superuser identity can destroy or subvert that system). However, making a system require multiple individuals for any root operation (think of the classic two-key process to launch a nuke) has shortcomings: simple operations sometimes require root, and would be enormously cumbersome if they needed a consensus of administrators to execute. There is the idea of a Distributed Administration Network, which is like a cluster of independently administered servers, but this is a limited case for deployment of certain applications. And besides, DAN appears still to be vaporware. Are there more sweeping yet practical solutions out there for avoiding the weakness of a singular empowered superuser?"

21 of 433 comments

In other news... by Anonymous Coward · 2011-01-10 03:45 · Score: 5, Insightful

Rule by a benevolent dictator has certain advantages, and rule by committee has certain opposite advantages. It was ever thus.
There is a well tested method for that by arivanov · 2011-01-10 03:46 · Score: 5, Insightful

It is called: "Change Control" and usually goes along with "Revision Control" on configs.
If you make a change without recording the reason for it, and without checking in the result so that the two versions can be compared and analysed, you get a pink slip. Voila. Problem solved.
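A minimal sketch of the idea above, assuming a single config file kept in memory. The class name `ConfigHistory`, the users, the ticket number, and the sshd setting are all hypothetical and only there for illustration; a real shop would use an actual revision control system. It shows the two halves working together: a check-in is refused unless a reason is recorded, and any two checked-in versions can be compared.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConfigHistory:
    """Hypothetical in-memory revision store for one config file."""
    revisions: list = field(default_factory=list)  # (timestamp, author, reason, text)

    def check_in(self, author: str, reason: str, text: str) -> None:
        # Change control: refuse any change that arrives without a stated reason.
        if not reason.strip():
            raise ValueError("change rejected: no reason recorded")
        self.revisions.append((datetime.now(timezone.utc), author, reason, text))

    def diff(self, old: int, new: int) -> str:
        # Revision control: compare any two checked-in versions.
        a = self.revisions[old][3].splitlines(keepends=True)
        b = self.revisions[new][3].splitlines(keepends=True)
        return "".join(difflib.unified_diff(a, b, f"rev{old}", f"rev{new}"))


history = ConfigHistory()
history.check_in("alice", "initial import", "PermitRootLogin yes\n")
history.check_in("bob", "ticket 1234: disable root SSH login", "PermitRootLogin no\n")
print(history.diff(0, 1))
```

The printed diff is what a reviewer would look at when deciding whether the recorded reason matches what was actually changed.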

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
1. Re:There is a well tested method for that by Anon-Admin · 2011-01-10 03:52 · Score: 4, Insightful
  
  What an amazing idea. Now tell me, who does this? I have worked for 4 Fortune 10 companies and 1 financial institution. Not a single one has used revision control, and only one has used change control. That is, if you consider a meeting of 20 non-technical managers who can nix a change without explaining why to be change control.
2. Re:There is a well tested method for that by vlm · 2011-01-10 04:02 · Score: 5, Insightful
  
  Works, although excruciatingly slowly for planned work.
  The collision of excruciatingly slow proactive planned work and reactive trouble tickets is always a source of utter hilarity. Usually the end result is that you only do planned proactive paper shuffling for meaningless stuff ("let's change the background color to be 0.001% darker") and ram development through as part of a trouble ticket with no oversight at all ("well, to make our big customer happy, we've decided to completely redo our database schema and stored procedures this afternoon as part of the ticket").
  Another example: if it takes a month and endless meetings to replace a failing drive during scheduled maintenance, and a half hour to replace a failed drive at any time, this simply eliminates all proactive maintenance. It is much easier and cheaper to burn the power supply out, have a nice long outage, and then replace the whole device than to get permission to blow dust out of the air filter.
  The end result is usually much worse than it was at the beginning.
  
  --
  "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
3. Re:There is a well tested method for that by JWSmythe · 2011-01-10 04:47 · Score: 5, Interesting
  
  Another example, if it takes a month and endless meetings to replace a failing drive during scheduled maintenance, and a half hour to replace a failed drive at any time, ...
  Sadly enough, I've had a simple drive replacement tied up in meetings and other office politics for months. Write up a proposal for change, sit in meetings where various department heads without a clue discuss the potential hazards, write up the rollback process (for changing a drive?). Your plans are torn apart and put back together. Departmental announcements, customer notifications, etc, etc. Accounting wants numbers, and proposals from 3 sources for the cost of a replacement drive (which you have 5 of in the datacenter, and a regular supplier). You're sitting there with the mind numbing noise flowing past. All you can think is "the array was set up with no hot spare. It's running in a degraded mode. Change the damned drive." Of course, complaints of slow drive performance are scattered throughout the meeting.
  Two months and more meetings than you can remember later, they slate it for an arbitrary window: Saturday at 3am. Not only change it, but you are required to stay while it rebuilds, "just in case...". Just in case? You have me working 8 to 7 Monday through Friday, weekends on demand (which are every weekend) AND you want me to blow off Saturday night to do the change? Ah who cares, I don't need sleep.
  Then Thursday afternoon, before the scheduled change is done, a second drive in the array fails, and the whole thing is down. All the same people who were in on the meetings start screaming "How could you let this happen?!"
  Thursday afternoon becomes Thursday night, and by Friday morning you have the array back up and working, through some dumb luck. (crossing fingers, praying to whatever gods may be listening, and tapping the drive with a screwdriver at boot time to make it spin up). The only planning that helped is that you keep a change of clothes and a toothbrush in the car, since you don't have time to go home once you're done. In doing the work, you notice the same thing happening to a neighboring machine. Damned aging hardware. So you just change it without the mess that accompanied the first change. Not only are you bitched out for not fixing the first array in time, but you get it twice as bad for fixing the other one before it became a problem. How could you have independent thought? How could you make a change without proper authorization?
  The only thoughts still in your head are "I hate this job", "my car keys are in my pocket, and I could just leave." Is this the day you quit? Maybe, just maybe. Just one more thing, and that'll be it. I don't need this shit.
  Friday afternoon, not sleeping since Wednesday night, you are told "Do [some other task] after hours tonight." No, you won't get paid any overtime since you're on salary. The task will take at least 8 hours, and they need it done before Saturday morning. Do you scratch out a resignation with a sharpie on the CEO's wall at 2am, or do you just walk out?
  I really hated that job.
  
  --
  Serious? Seriousness is well above my pay grade.
4. Re:There is a well tested method for that by sglewis100 · 2011-01-10 05:05 · Score: 4, Interesting
  
  Sadly enough, I've had a simple drive replacement tied up in meetings and other office politics for months. Write up a proposal for change, sit in meetings where various department heads without a clue discuss the potential hazards, write up the rollback process (for changing a drive?).
  Not that I don't agree that some companies make change management more than it needs to be (mine does it OKAY), but I bet the guy I knew years ago who changed a drive on a RAID-5 array had thought about testing and rollback. You see, he received the replacement drive late in the day, ran into the data center, popped out a drive, popped in the new drive, and went home. Sadly, he had pulled the wrong drive.
how do they design nuclear missile systems? by circletimessquare · 2011-01-10 03:48 · Score: 5, Interesting

look at programs where there is a lot of technical activity and communication activity for time sensitive work
you can't have a nuclear missile system where one guy can invoke the bombs to go off. at the same time, the system has to be quick and responsive
so you need to engineer administrative systems where not fewer people are involved but MORE: you can't do this function or that function without also involving this guy over there turning a key, etc.: all admin functions invoke more than one person. that's the best way to have a system where power can't be abused. it's about redundancy and layers of admins, not fewer admins
and if people are pursuing this question because they don't want to pay an admin or can't trust someone else with their system, then such idiots get the system they deserve: a broken one and no one willing to fix it at the money you want to pay

--
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Eventually, you have to trust someone. by Rogerborg · 2011-01-10 03:49 · Score: 5, Funny

Oh, the jobs people work at! Out west, near Hawtch-Hawtch, there's a Hawtch-Hawtcher Bee-Watcher. His job is to watch... is to keep both his eyes on the lazy town bee. A bee that is watched will work harder, you see.
Well...he watched and he watched. But, in spite of his watch, that bee didn't work any harder. Not mawtch.
So then somebody said, 'Our old bee-watching man just isn't bee-watching as hard as he can. He ought to be watched by another Hawtch-Hawtcher! The thing that we need is a Bee-Watcher-Watcher!'
Well... The Bee-Watcher-Watcher watched the Bee-Watcher. He didn't watch well. So another Hawtch-Hawtcher had to come in as a Watch-Watcher-Watcher!
And today all the Hawtchers who live in Hawtch-Hawtch are watching on Watch-Watcher-Watchering-Watch, Watch-Watching the Watcher who's watching that bee.
You're not a Hawtch-Watcher. You're lucky, you see.

--
If you were blocking sigs, you wouldn't have to read this.
Reinventing history by vlm · 2011-01-10 03:51 · Score: 4, Interesting

would be enormously cumbersome if they needed a consensus of administrators to execute.
That's why you leave changes to the 24x7 onsite operations team, not one lone admin doin' his thing in the cube. They're the ones monitoring the systems, so it seems most sensible if they "push the buttons" on the things they watch. Ideally you have one team that does nothing but watch and one team that does nothing but do, and theoretically they cooperate.

And besides, DAN appears still to be vaporware.
DAN appears to be a poor reinvention of flight control software for aerospace from the 70s/80s. Those who don't know their history are doomed to repeat it, poorly.
Next up, we'll reinvent the concept of the security officer from AS/400, or maybe the idea of hard realtime control.
Maybe someone out there could reinvent the concept of the watchdog timer so the "DAN" cluster doesn't go into deadlock? Naah, we'll let them "discover" it themselves, the hard way.
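For anyone who hasn't met one, a watchdog timer can be sketched in a few lines. This is a software illustration under stated assumptions, not anything from DAN itself: the class name `Watchdog`, the 5-second timeout, and the injected fake clock are all hypothetical. The monitored node must "kick" the timer regularly; if it deadlocks and stops kicking, the timer expires and a supervisor can reset it.

```python
import time


class Watchdog:
    """Software watchdog: must be kicked at least every `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock            # injectable so the example needs no sleeping
        self.last_kick = clock()

    def kick(self):
        # Called by the monitored node while it is still making progress.
        self.last_kick = self.clock()

    def expired(self):
        # True once the node has been silent for longer than the timeout.
        return self.clock() - self.last_kick > self.timeout


now = [0.0]                           # fake clock, advanced by hand
dog = Watchdog(timeout=5.0, clock=lambda: now[0])
now[0] = 3.0
dog.kick()                            # node is alive at t=3
now[0] = 7.0
print(dog.expired())                  # False: only 4 seconds of silence
now[0] = 8.5
print(dog.expired())                  # True: 5.5 seconds of silence
```

In hardware the expiry would trigger a reset line; here the caller decides what "reset" means.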

--
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
There are Safeguards Already by BooRadley · 2011-01-10 03:55 · Score: 5, Insightful

Mostly, except in very small organizations, there are several implicit safeguards to keep any one person from doing evil with the systems. They are subtle, but effective.
Peer review: Most sysadmins are hired by other sysadmins, or at the very least a technical manager. This means that you are hired based on your skills, reputation, track record, and demonstrated attitude. This means that ideally, you wouldn't even *think* about intentionally subverting a system, because that would mean breaking it or compromising it in some way, and most professional SAs are simply too OCD to allow it.
Business continuity: Most organizations have several layers of continuity in place, such as disaster recovery scenarios, system snapshots, monitoring, and auditing. This means that unless you are VERY subtle, or work for an entirely incompetent team, you WILL get caught, and the damage will be minimized as you are being put into a police car, never to work in IT again.
There are no "indispensable people:" If you are a sysadmin, and you are the only one who knows your systems, you have not done your job. Every system and app should be documented, and there should be accountability for every change and decision.
No technical solution will ever replace good management and planning, and a design that eliminates a system's vulnerability to rogue sysadmins will also eliminate its flexibility. It's just a lot cheaper and easier to try and run a good shop.

--
-- lk t lv ll th vwls t f wrds. T svs lts f tm t wrt bt ts pn n th ss t rd nd mks m lk lk cmplt dpsht.
Re:sternobread by goofy183 · 2011-01-10 03:57 · Score: 4, Interesting

That is how all of our servers are set up. I'm just a "developer" that uses them but I believe no one knows the root password for our systems. It is a *big* random string that is printed out by the sysadmin who sets up the machine, sealed in an envelope with that person's signature on both sides and stuck in a safe. In the event that a machine is so hosed that the root password is needed, it is used and then a new one is generated and sealed away again.
Everyone uses sudo for everything. All sudo access is logged.
The system isn't perfect of course, nothing is, but it goes a long way toward easing the worry of one person having root keys for things.
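A minimal sketch of generating that kind of *big* random string, assuming Python's standard `secrets` module is acceptable as the randomness source. The function name, the 64-character length, and the choice of alphabet are assumptions for illustration, not the poster's actual procedure.

```python
import secrets
import string

# Assumed alphabet: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_root_password(length: int = 64) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Printed once for the envelope; nobody is expected to memorise it.
print(generate_root_password())
```

After an emergency use of the envelope, the same function would be run again and the new output sealed away.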
Powerbroker & logging by Doc Hopper · 2011-01-10 04:01 · Score: 4, Informative

We have several solutions which work together to minimize the risk of root at my company:
1. Powerbroker. It's in use on every single UNIX system administered by our Global IT teams. Every user has a role (or several roles), and that allows them to execute a variety of commands with elevated privileges. Once Powerbroker is invoked, however, every single keystroke is logged and can be played back. These logs are stored indefinitely; access is very restricted.
2. Automated, centralized root password management. One of the steps to setting up a UNIX machine here is ensuring the root password and remote console admin passwords match that dictated by our automated provisioning system. Then every 30-90 days (depending on policy for this type of system) the root password is changed to a very long, apparently very random string. I can look this password up if my role allows it, but the lookup is also logged.
3. A good Change Request (CR) process. Every system that exists in a data center should have a record in our systems database. Once a system has passed through the phases of deployment (Warehouse -> Data Center Install -> Sysadm Configure -> Deployed) any change made to the system must be requested and approved by the owners of the system. This approval is logged, and the date/time of the work is also logged. Sysadms must close service requests within the time window specified by the CR, or apply for an extension or reschedule if they're unable to complete it within the allotted time.
The downside to this is that you lose quite a bit of system administrator work hours filing and managing change requests. However, this loss of efficiency -- IMHO -- is better than the mayhem that ensues without an organized change process.
4. Automated forensic tools to monitor changes. Information overload is a real risk with any Tripwire-style system, though. We're still working out some of the kinks on this part of the system. Once we ensure that all normal changes due to operation of the system and scheduled maintenance get excluded, this will be the fourth leg to reduce the risk of super-user privileges.
At any company, IT must find a balance between controlling user actions and monitoring those actions. In most cases, the easiest approach is to prohibit by policy only those things that might typically result in lawsuits, but monitor everything else to the best of your ability. Combining a Powerbroker-like product with automated root password management -- both with fascistic logging -- is a reasonable approach that works well for many large companies. Combine this with a change management system, and a forensic tool to automatically monitor and notify of unauthorized changes, and super-user isn't really all that big of a concern.
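The Tripwire-style piece (item 4) can be sketched roughly as follows. This is an illustration only, assuming SHA-256 file digests and an exact-path exclusion list; the function names `snapshot` and `changes` and the example paths are hypothetical, and real forensic tools track far more than file contents. The exclusion list is where "normal changes due to operation of the system" would go, so they don't drown out the real alerts.

```python
import hashlib
from pathlib import Path


def snapshot(root, excluded=()):
    """Map each file's relative path to its SHA-256 digest, skipping excluded paths."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        rel = path.relative_to(root).as_posix()
        if path.is_file() and rel not in excluded:
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changes(baseline, current):
    """Return sorted paths that were added, removed, or modified since the baseline."""
    return sorted(
        p for p in baseline.keys() | current.keys()
        if baseline.get(p) != current.get(p)
    )
```

A typical use would be to take `snapshot("/etc", excluded={"mtab"})` after an approved change, store it away from the machine, and later alert on any non-empty result from `changes(baseline, snapshot(...))`.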

--
Matthew P. Barnson
I learn what I think when I read what I write
1. Re:Powerbroker & logging by Doc Hopper · 2011-01-10 05:30 · Score: 4, Informative
  
  You've tossed out a few red herrings and a couple of valid points. I'll try to address them in order.
  
  this tells me that there is somebody that holds access above the other users, basically missing the point here.
  No, I haven't missed the point at all. The point is to distribute the responsibility with sufficient checks in order to ensure that misbehavior will be caught and dealt with in a timely fashion. Is it possible someone could scheme up a way to slide abuses past the admins? Of course it is. But between good backups, fascistic logging, role-based access control, and routine audits by the change control committee, the risk is minimized.
  There's no one person who holds the "keys to the kingdom". No critical data is stored on the machines themselves; it's all stored on centralized storage. The folks who admin the automated root password changes don't have any access to storage; the storage folks typically don't have any access to the systems.
  
  Again, that means that there's somebody administering the logging system. and I almost assure you that even if their logins are listed somewhere: they have full access to remove those entries and make it look like it never happened.
  Incorrect. I didn't cover this in my original post, but logs are (and should be) stored on write-once media. You can designate volumes on modern storage media so that, once written, it can never be altered without destroying the entire volume. We use this extensively.
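Write-once storage is a hardware or volume-level feature, so it can't be shown in a few lines of code. A related but different software technique is a hash-chained log, sketched below: it does not prevent edits, it only makes them detectable. The function names and log messages are hypothetical, and this is not a description of the poster's setup.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def append_entry(log, message):
    """Append (message, digest), where the digest also covers the previous digest."""
    prev = log[-1][1] if log else GENESIS
    digest = hashlib.sha256((prev + message).encode()).hexdigest()
    log.append((message, digest))


def verify(log):
    """Recompute the chain; any edited or removed interior entry breaks it."""
    prev = GENESIS
    for message, digest in log:
        if hashlib.sha256((prev + message).encode()).hexdigest() != digest:
            return False
        prev = digest
    return True


log = []
append_entry(log, "alice: sudo service httpd restart")
append_entry(log, "bob: looked up root password for db01")
print(verify(log))                       # True
log[0] = ("alice: nothing to see here", log[0][1])
print(verify(log))                       # False
```

Note the limitation: chopping entries off the end goes unnoticed unless the latest digest is also kept somewhere the log's administrator cannot reach, which is the gap write-once media closes.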
  
  say I have a machine that stores credit card numbers on a DSS approved network that's locked down in the ways you describe above. at the admin level, it would take me minutes to provision a machine to replicate the target. I don't mean replicate as in contents, I mean replicate to the network view.
  Once again, distributed access can prevent this. The network team and the sysadm team aren't the same teams. Every port on your switch is disabled until it's enabled by the network team. Even once enabled, that port must be on the same VLAN as the hypothetical credit-card storage system.
  That's once again where fascistic logging and automated reporting come into play. If a port is disconnected, unless a host has been blacked out with an appropriate change control ticket filed, the port disconnection generates an immediate Priority 1 service request to investigate.
  If a drive is removed from centralized storage, that also generates an immediate P1 ticket. The sysadm's access would have been logged the moment he swiped his badge, and cameras throughout the data center capture the switch-over.
  A corrupt admin can do a lot of damage, I admit. There's no getting around it. But with sufficient logging -- and yes, I include physical surveillance as "logging" too -- they're not going to get away with it.
  
  the replicated machine can be tunneled into place and act as if it was the machine in question.
  Now this is the red herring. If you've ever done ANYTHING major with credit cards in a data center, you are aware that you're subject to yearly audits of your infrastructure by Payment Services. They do a deep-dive of your systems to enforce a huge number of requirements. I can't go into it here. It literally fills a large book, and they go over it line-by-line with all the admins involved, every single year. I've been through several of these, and each year it gets broadened to cover more potential issues.
  Chief among these requirements? A separate admin/management network from the front-end/back-end network. You can't "tunnel in" to that network and make it "act like" another system. The network is an unroutable private VLAN or fibre-channel connection.
  
  at this point, I can reverse firewall the unit preventing it for calling for help or reporting the changes I make. I can snapshot the drive and move it offsite
  Ye
  
  --
  Matthew P. Barnson
  I learn what I think when I read what I write
Re:why? by somersault · 2011-01-10 04:05 · Score: 5, Insightful

Not really. It's fun to think I could do anything I wanted, but I don't want to. I like my job, I like the people I work with, I don't want to screw them over. It's nice to have an employer that trusts you too. If I wasn't trusted, I would probably just leave. If they want me to be able to administer and troubleshoot everything, I obviously need full access.

--
which is totally what she said
Re:Yes by ByOhTek · 2011-01-10 04:10 · Score: 4, Interesting

A subset of administrative applications requiring multiple administrators may not be such a bad compromise.
ex:
* Changing the root password (or the password to any "wheel" account) requires multiple administrators to enter the same password.
* su/sudo'ing to a "wheel" account, or changing said account's privileges, requires the authorization of at least one other wheel'ed user.
* Altering an active network interface, shutting down, and restarting require authorization by other administrative users.
Stuff like that, which are things that shouldn't be done often, anyway, and could allow one admin to take over the whole system, seem like good candidates for multiple-approvals. Everything else could be left alone.
The approval process is basically: the root user needs to take the action, and then 2+ non-root (but wheel) users must approve it.
I'm using 'wheel' as that is the group in FreeBSD that is typically allowed access to sudo/su. Not sure how other systems typically work.
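The approval process described above could be modelled roughly like this. It is a sketch under stated assumptions, not an existing su/sudo feature: the class name `PendingAction`, the user names, and the default threshold of 2 (taken from the "2+" in the post) are all illustrative.

```python
class PendingAction:
    """A privileged action that stays blocked until enough other admins approve it."""

    def __init__(self, description, requester, wheel, required=2):
        self.description = description
        self.requester = requester
        self.wheel = set(wheel)       # users allowed to approve (hypothetical list)
        self.required = required      # assumed threshold: the "2+" from the post
        self.approvals = set()

    def approve(self, user):
        if user == self.requester:
            raise PermissionError("requester cannot approve their own action")
        if user not in self.wheel:
            raise PermissionError(f"{user} is not in wheel")
        self.approvals.add(user)      # a set, so repeat approvals count once

    @property
    def authorized(self):
        return len(self.approvals) >= self.required


action = PendingAction("change root password", "root", wheel={"alice", "bob", "carol"})
action.approve("alice")
action.approve("alice")               # duplicate, still only one approval
print(action.authorized)              # False
action.approve("bob")
print(action.authorized)              # True
```

Keeping the list of gated actions short, as the post suggests, is what stops this from turning every routine task into a committee meeting.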

--
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
Re:sternobread by s4ltyd0g · 2011-01-10 04:24 · Score: 4, Informative

sudo logs are almost useless for system audit. Run sudo su - and have at it. There are no logs to follow what actions you perform. Go ahead and craft a sudoers file that eliminates all the ways to load up a shell. Have fun with that...
Re:Yes by jijacob · 2011-01-10 04:31 · Score: 5, Insightful

If you don't trust your sysadmin, they shouldn't be your sysadmin. Just like the accounting department probably has the ability to steal a certain sum of money before anyone will notice, your sysadmin is given responsibilities that could potentially cause grief if they are on the wrong team.
Re:sternobread by Phreakiture · 2011-01-10 04:33 · Score: 5, Insightful

Run sudo su - and have at it.

The solution here is to follow a reasonable security protocol in writing the sudoers file. Specifically, the default action is to prohibit. Permitted actions are then whitelisted. On a high-security system, no entry should allow a user to sudo su -. Problem solved.
Incidentally, I see no point in locking down users who have physical access to the DC.

--
www.wavefront-av.com
Re:Yes by jijacob · 2011-01-10 04:54 · Score: 4, Insightful

The tricky part comes in at the point that, while most CEOs have at least a basic understanding of accounting and other departments under their watch, IT departments are *typically* a foreign land to the understanding of those in charge. Even if they wanted to audit proper usage of root it would be difficult or impossible. Small businesses have it hardest. At least in the larger ones there's a layering system so you can have higher-ups in IT auditing the lower guys.
Re:Too many cooks... by BrokenHalo · 2011-01-10 04:57 · Score: 5, Interesting

...spoil the soup.

The submission seems to presume that the system in question is some sort of *nix or Windows box. If we look into the world of mainframe operating systems, we'll see that this has already been fully addressed, and any number of individuals with discrete UIDs may have superuser access. This has evolved out of a history where sysadmins worked shifts, so sharing a single privileged UID/password was/is a bad idea.

The way such access is administered needs a proper policy within the organisation, though. Back in the '90s, I worked at one outfit (an insurance company) where the vice-CEO demanded superuser privileges despite having no knowledge of system administration or any other computing background. He just wanted to act as overlord as to what staff had access to on their signons. I was very tempted to tell him to get fucked, phrased in more professional terms. Like "Go get professionally fucked".

My immediate boss was (wisely) more inclined to a diplomatic approach, however, so he persuaded me to install a dummy program for him that was enough to convince him that he had what he wanted, without granting him any kind of command line access, or ability to change system configuration.
Superuser by fyngyrz · 2011-01-10 07:18 · Score: 4, Funny

Are there more sweeping yet practical solutions out there for avoiding the weakness of a singular empowered superuser?
No. Now just hang on a second while I delete your user account and all your data, you presumptuous bitch.

--
I've fallen off your lawn, and I can't get up.