Ask Slashdot: Linux Login and Resource Management In a Computer Lab?

Posted by timothy on Tuesday July 22, 2014 @05:40AM from the explain-your-system dept.

New submitter rongten (756490) writes: I am managing a computer lab composed of various kinds of Linux workstations, from small desktops to powerful workstations with plenty of RAM and cores. The users' $HOME is NFS mounted, and they access the machines via console (no user switching allowed), ssh or x2go. In the past, the powerful workstations were reserved for certain power users, but now even "regular" students may need access to high-memory machines for some tasks. Is there a sort of resource management that would allow the following tasks? To forbid the same user from logging in graphically more than once (like UserLock); to limit the number of ssh sessions (i.e. no user using distcc and spamming the rest of the machines, or even worse, running in parallel); to give priority to the console user (i.e. automatically renicing remote users' jobs and restricting their memory usage); and to avoid swapping and waiting (i.e. all the users trying to log into the latest and greatest machine, so have a limited number of logins proportional to the capacity of the machine). The system being put in place uses Fedora 20 and LDAP PAM authentication; it is Puppet-managed and NFS based. In the past I tried to achieve similar functionality via cron jobs, login scripts, ssh and nx management, and a queuing system, but it is not an elegant solution, and it is hacked a lot. Since I think these requirements should be pretty standard for a computer lab, I am surprised that I cannot find something already written for it. Do you know of a similar system, preferably open source? A commercial solution could be acceptable as well.

Trust your users by Anonymous Coward · 2014-07-22 05:48 · Score: 5, Funny

Trust your users.
Is this all necessary? by Sycraft-fu · 2014-07-22 06:00 · Score: 5, Insightful

Seems like you are trying to work out a solution to a problem you don't have yet. Maybe first see if users are just willing to play nice. Get a powerful system and let them have at it. That's what we do. I work for an engineering college and we have a fairly large Linux server that is for instructional use. Students can log in and run the provided programs. Our resource management? None, unless the system is getting hit hard, in which case we will see what is happening and maybe manually nice something or talk to a user. We basically never have to. People use it to do their assignments and go about their business.
Hardware is fairly cheap, so you can throw a lot of power at the problem. Get a system with a decent amount of cores and RAM and you'll probably find out that it is fine.
Now, if things become a repeated problem then sure, look at a technical solution. However don't go getting all draconian without a reason. You may just be wasting your time and resources.
1. Re:Is this all necessary? by MerlynEmrys67 · 2014-07-22 06:38 · Score: 4, Interesting
  
  This is hilarious. So, I was in college several decades ago. Large computer labs and lots of SSH/X forwarding to do work. The only time I remember getting in "trouble" was when we were on a LISP module as freshmen. Their resource management only allowed a few LISP interpreters on the machine - otherwise it would deny them for resource reasons. I quickly got sick of typing $lisp and waiting for my session to actually start - so I created a shell script that ran an infinite loop asking for a lisp interpreter...
  15 minutes later, someone tapped on my shoulder and asked me what I was doing - I had taken the full processing capabilities for a while. I showed my script - gasp horror, and a 1 second pause was added to the script and I was good to go. Learned a lesson too.
  The year before I got there - enough people were learning how to hack the system to crash it that they were having trouble keeping the system up. Their solution - install a button next to each keyboard that when pushed would crash the system. No work was accomplished for a week - then it didn't go down again. We were told about the button, it was rough for a couple days - and then the systems were rock solid.
  Kids will be kids - good kids will create a nightmare for you - work to focus that energy in a positive way and good things will result.
  
  --
  I have mod points and I am not afraid to use them
Did you look at the PAM modules on your system? by Anonymous Coward · 2014-07-22 06:00 · Score: 4, Informative

Some of what you're asking for are ulimit settings - total number of processes, for example. That's pam_limits. Some could also be handled with pam_tally2. Or, since you're already using LDAP, you could use a simple web-based reservation system which specifies allowed login hosts in the LDAP server for however long someone wants to "check out" a machine; that's how I've done it when I've needed to control access to cluster resources.
When you talk about controlling other resources beyond logins, it's generally better to handle it at the application level rather than the OS level if you can. But using ulimits (and again, this can be integrated into LDAP pretty easily), you can restrict resources and apply process priority (ionice and nice are your friends) based on membership in a specific group or another LDAP attribute.
You could, for example, create a "highpower" group per set of machines / per machine (highpower_serverA) and add users to that group based on a checkout system, then define limits on the number of processes they can use, amount of memory they can use, total CPU time they can use, etc. in limits.conf based on being in that group or not being in that group.
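As a rough illustration of the group-based limits described above, here is a minimal Python sketch that renders pam_limits entries for the "highpower_serverA" group plus tighter defaults for everyone else. The helper name `render_limits` and every numeric value are assumptions made up for the example; the item names (nproc, as, cpu, priority, maxlogins) are the ones documented in limits.conf(5), which is also the place to check how wildcard and group entries interact on your distribution.

```python
# Sketch only: the helper name and all numeric values are illustrative
# assumptions, not a recommended policy.

def render_limits(domain, limits):
    """Return limits.conf lines for one domain (a user, @group, or *).

    `limits` maps an item name to a (type, value) pair, where type is
    "soft", "hard", or "-" (both), following limits.conf(5).
    """
    lines = []
    for item, (limit_type, value) in limits.items():
        lines.append(f"{domain:<22} {limit_type:<5} {item:<10} {value}")
    return lines


# Hypothetical policy for users who have "checked out" serverA.
HIGHPOWER = {
    "nproc": ("hard", 256),        # maximum number of processes
    "as": ("hard", 16777216),      # address space in KB (16 GiB)
    "cpu": ("hard", 240),          # CPU time in minutes
    "priority": ("-", 5),          # nice value for the user's processes
}

# Hypothetical tighter defaults for everyone else.
DEFAULT = {
    "nproc": ("hard", 64),
    "as": ("hard", 4194304),       # address space in KB (4 GiB)
    "maxlogins": ("-", 2),         # concurrent logins per user
}


def render_policy():
    """Combine the default and group-specific entries into one list."""
    return (render_limits("*", DEFAULT)
            + render_limits("@highpower_serverA", HIGHPOWER))
```

Writing `"\n".join(render_policy())` to a file under /etc/security/limits.d/ (for example via Puppet) is one way to deploy it, assuming pam_limits is enabled in the PAM session stack for the relevant services.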
I'll send you my bill tomorrow.
A lot of this seems superfluous ... by dougmc · 2014-07-22 06:08 · Score: 4, Interesting

If you're giving your users access to the machines, they should be able to use them. And if you can't trust them to use them responsibly, don't give them access.
If it were me, I'd secure the boxes normally, set up some resource usage rules (guidelines?) and see what happens. If problems happen often, then maybe look into something automated to enforce the rules, but if not, then you're done.
As for renicing stuff done by remote users, I'm not sure this is a good idea, but if you want to do it you can renice sshd itself, and to be thorough you can also renice crond (if you give them access to cron/at.) But do keep in mind that nice (and ionice) can't do magic with an overloaded system -- they help, but they don't do magic.
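A minimal Python sketch of the "renice sshd itself" idea, assuming it runs as root and that `ps -eo pid=,comm=` is available (procps). The function names and the default nice value of 10 are assumptions for illustration. Because the nice value is inherited at fork time, this only affects processes started from sshd afterwards; jobs already running in existing sessions keep their current priority.

```python
import os


def sshd_pids(ps_output):
    """Extract PIDs of processes named exactly 'sshd' from the text
    produced by `ps -eo pid=,comm=` (one "PID COMMAND" pair per line)."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip() == "sshd":
            pids.append(int(parts[0]))
    return pids


def renice(pids, niceness=10):
    """Set the nice value of each PID and return the PIDs that failed.

    Changing processes owned by other users requires root; a PID that
    has already exited or cannot be changed is reported, not raised.
    """
    failed = []
    for pid in pids:
        try:
            os.setpriority(os.PRIO_PROCESS, pid, niceness)
        except OSError:
            failed.append(pid)
    return failed
```

A typical call would feed it live data, e.g. `renice(sshd_pids(subprocess.check_output(["ps", "-eo", "pid=,comm="], text=True)))`. The same pattern would apply to crond by matching a different command name.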
As for commercial systems, I haven't really seen this as being a big problem outside academia. Multiuser *nix systems where different people are competing for resources are kind of rare in the commercial sector, as it seems like the trends lately are to have enough hardware, often dedicated, and to enforce limits through voluntary compliance (and have their boss talk to them if it's still a problem.)
That "have their boss talk to them" bit may not work so well for students, but still, I would wait for a problem to appear before I put too much effort into solving it.
Instead, put your efforts into proper sysadmin stuff -- stay up to date on patches, look for problems (especially security ones), make sure backups work, help users with problems, etc. If there are any troublemakers, talk to them, and if they don't shape up after a few warnings, kick them out. (And make sure the policies permit that!)
You can enforce limits on specific users through pam and sshd_config and some other mechanisms, but I'd suggest leaving that for later. Anything you do that will limit what people can do will eventually keep them from doing what they legitimately need to be doing.
Technical solution to a social problem. by Vellmont · 2014-07-22 06:34 · Score: 4, Insightful

If your users can't play nice together, the solution isn't to treat the place like a prison with automated systems enforcing a hard and fast set of rules.
The solution is for users to create their own enforcement. If some guy tries to take all the resources across your network with distcc, then the people affected should be able to notice that and tell the guy to knock that the fuck off.
In other words, give the users the freedom to break stuff, but also the knowledge to find out who's breaking their stuff. It'll serve them far better than creating a walled garden where someone else has the responsibility to enforce social rules.
Slashdot and reddit work this way. Neither goes around trying to enforce how people behave; they give the users the power to do that themselves.
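One way to give users that knowledge is a small report of who is using the machine. The following is a minimal Python sketch, assuming the input comes from `ps -eo user=,pcpu=,pmem=`; the function name `usage_by_user` is made up for the example. Note that ps reports %CPU as an average over each process's lifetime, so this is a coarse picture rather than an instantaneous one.

```python
from collections import defaultdict


def usage_by_user(ps_output):
    """Sum %CPU and %MEM per user from `ps -eo user=,pcpu=,pmem=` text.

    Returns (user, cpu_total, mem_total) tuples, heaviest CPU user first.
    Lines that do not have exactly three fields are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        user, cpu, mem = parts
        totals[user][0] += float(cpu)
        totals[user][1] += float(mem)
    rows = [(user, cpu, mem) for user, (cpu, mem) in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Any user can run this without special privileges, since ps already shows other users' processes on a default setup.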

--
AccountKiller
Re:I would write my own with LDAP by mrvis · 2014-07-22 06:42 · Score: 4, Insightful

I would be terrified if you were my co-worker.