Linux Implementation For 2500 Workstations?
Jeff Kwiatkowski asks: "We are looking to roll out Linux to over 2500 desktops and could use any advice that we can get. We need security info, implementation suggestions and any other advice that you would care to offer. We are currently evaluating Debian, Caldera and Red Hat. I also want a minimalist desktop, so I have been leaning toward WindowMaker as the window manager. In addition, we currently have machines with 32 meg of RAM (fast processors, though) and would like to keep the upgrade to 64 meg, only, if possible. Lastly, do any of you have any thoughts on Word Perfect vs. Applixware?" If you think the claim 'Linux is not ready for the desktop' is a falsehood, this story is for you. As you can see, people are looking at deploying Linux on the desktop, and suggestions from you guys could make this process a lot easier.
My suggestion: Leave the terminals with 32 meg, as that is plenty for the X server. Your standard terminal should have a single hard drive with a minimal installation of some distribution. You will want a standard way of setting up such a machine when a disk dies and gets replaced, but backups aren't needed (as there is no user data on the terminal) and you won't have to care much about updates either (as there should be no daemons running on the terminals).
One big benefit: As the terminals do not hold data, it doesn't matter if they are stolen. Terminals are not trusted.
The X protocol is made for networks, and a 10 MBit/s hose to each terminal would be just great. However, it's not encrypted, so you should at least consider how physically secure your network is, and what the requirements would be.
Then set up one server for each N users. If they are doing web access and text editing, your average ``high end but not that high'' server should be able to run 15-40 users. Maybe more, but I haven't tried this type of workload myself so I can't say. Anyone ?
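To put rough numbers on that estimate, here is a small sizing sketch. It assumes all 2500 desktops are in use at once and takes the 15-40 users-per-server guess above at face value; the function name is only illustrative.

```python
import math


def servers_needed(users: int, users_per_server: int) -> int:
    """Round up: a partially filled server is still a whole server."""
    return math.ceil(users / users_per_server)


TOTAL_USERS = 2500  # from the original question

# The 15-40 range is a guess from the post above, not a measurement.
pessimistic = servers_needed(TOTAL_USERS, 15)
optimistic = servers_needed(TOTAL_USERS, 40)
print(f"{optimistic} to {pessimistic} servers")
```

That works out to somewhere between 63 and 167 servers, so the real per-server capacity is worth measuring on a pilot group before buying hardware.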
You will end up with a server farm. Each server should hold a home filesystem locally, and preferably the users whose home directories are on that local filesystem should log in on that server. You can choose to let each server export its home filesystem to the other servers as well and share user accounts with NIS, which would let any user log in anywhere. If a terminal is tied to a user and vice versa, there should be no need for a terminal to be able to choose other servers, but if they're not, then the need will be there.
I've done a few such setups, but at a *much* smaller scale. I can tell you that it is a relief to _only_ have to update software on the server(s).
Back at University, we had a very hacked up Slackware distribution which did nothing special, except on bootup, when it would download and install any packages that sat on the upgrade server.
The same principle is absolutely essential for anything more than 100 or so machines (even if upgrades aren't a priority, bug fixes and security fixes will be).
In truth, I can't imagine any distribution would be better suited than any other here, especially if you are willing to write a boot up script which can download any new RPMs or DEBs and install them. The only problem is making sure they are not "interactively installed". Lots of Debian packages are, but this is easily remedied. In fact, if you used Debian, adding apt-get update && apt-get dist-upgrade to your boot script and setting up your own package repository (a simple FTP folder) would do that for you. You may need to tweak the odd package to force some settings, but that's what your network of 5 machines reserved for testing is for, right...
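A minimal sketch of such a boot-time upgrade step, for the Debian case. It assumes a Debian release with debconf, where setting DEBIAN_FRONTEND=noninteractive and passing -y keeps apt-get from stopping to ask questions; the function names are made up for illustration, and it defaults to a dry run so it can be tried safely on a test machine.

```python
import os
import subprocess


def upgrade_commands():
    """The two apt-get steps named above, with -y to answer prompts."""
    return [
        ["apt-get", "update"],
        ["apt-get", "-y", "dist-upgrade"],
    ]


def noninteractive_env(base=None):
    """Copy the environment and ask debconf not to prompt (assumption:
    the installed Debian release uses debconf)."""
    env = dict(os.environ if base is None else base)
    env["DEBIAN_FRONTEND"] = "noninteractive"
    return env


def run_upgrade(dry_run=True):
    env = noninteractive_env()
    for cmd in upgrade_commands():
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True aborts the sequence if a step fails
            subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    run_upgrade(dry_run=True)
```

Packages that still insist on asking questions would need their answers preseeded or their configuration tweaked on the test machines first.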
I'd also go with Sawfish/Sawmill instead of Window Maker. While I'm a huge fan of WM, I think sawfish has a much more desktop friendly future ahead. It can also look pretty identical to WM, and some of the other themes are very practical for desktop use. Its memory footprint on my machine is just under 4MB with half of that as shared libs which lots of other programs are using. Perhaps a choice at login would be useful, especially if offered with something pretty like GDM.
The major issue will probably be support, although that's more likely to be for specific applications than the whole system. I take it that to be entrusted to install 2500 desktops, you know your greps from your seds and are pretty capable of writing some scripts to manage upgrades. If not, find someone who is and pick their brains.
Rather than bogging down the network with remote X apps, *please* investigate Debian's apt-get tool. In some ways, the Debian distribution is a distributed cluster of 10,000+ homogeneous systems.
For my home Debian box, all I have to do is run apt-get update; apt-get upgrade once a day, and then my system is in sync with the official Debian distribution.
If you put those two commands into your users' init scripts (probably with an option such as -y so that apt-get doesn't stop to ask questions), then lock down the /etc/apt/sources.list, then...
The biggest disadvantage of using apt-get is that your network will probably get bogged down after you change a large package. The next morning, at 8am, when 1000 people turn on their computers, they'll all be trying to download the same package at the same time, which could be a mini nightmare. If you're a good sysadmin, you'll figure out a good way around it.
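One possible way around the 8am stampede (my assumption, not something spelled out above) is to have each machine wait a different, fixed amount of time before contacting the package server. The sketch below derives that delay from the hostname, so the load spreads across a window without any central coordination; the one-hour window and the function name are arbitrary choices for illustration.

```python
import hashlib
import socket


def stagger_delay(hostname: str, window_seconds: int = 3600) -> int:
    """Map a hostname to a stable delay in [0, window_seconds)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap it
    # into the window. The same host always gets the same slot.
    return int.from_bytes(digest[:8], "big") % window_seconds


if __name__ == "__main__":
    delay = stagger_delay(socket.gethostname())
    print(f"would sleep {delay}s before contacting the package server")
```

A local mirror or caching proxy per subnet would attack the same problem from the server side, and the two approaches can be combined.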
The other big advantages:
If you don't need to roll out this installation tomorrow, I'd recommend that you install a copy of Debian (Debian 2.1 is stable, but out of date, Debian 2.2 is not quite released yet).
Once you install Debian 2.1, hang out for a while talking to people on the irc channels (irc.debian.org), and get all your stuff configured, then run the command apt-get update; apt-get dist-upgrade, and your distribution will automatically be upgraded from 2.1 to 2.2 (hopefully with almost no user intervention).
This message turned out to be a lot longer than I expected, but there's a lot to consider in your situation. Good luck!
--Robert
Debian is easy to install and update through the network. And there is a package, replicator, available to replicate a machine installation through the network. It's really great. The replicator maintainer is also very responsive. I exchanged lots of mail with him and he helped me whenever I needed it. Go to http://www.ens-lyon.fr/~schaumat/replicator/
For a rollout of that size, I'd say that you need two key things: first, either a network or CDR-based install from a cut-down release tailored to your business environment, with all options pre-selected, and secondly, the seemingly trivial but massively important separation of system and user areas, each in their own filestore.
The first is important because one of your major costs is going to be support --- this will skyrocket if you use standard distro CDs because they're all based on interactive user choice in varying degrees, and corporate handholding costs money.
The second is important because without the separation, upgrading will become a nightmare over time --- again, this will increase your support costs. In fact, consider seriously the possibility of not holding any user data on the workstations at all, but on a central filestore instead. That simplifies data backup as well as workstation upgrading, because then you can regard workstation state as throwaway.
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
... I'm a Sys/Net Architect, so guess where my biases are? :-)
Anyway, what you have on the desktop matters (esp the mechanism you use for clone workstations (you are planning to clone workstations, right?)), but I'll concentrate on something else equally important, and which will affect how you set up the desktops: Network and Backend System Design
First off, you don't want any data locally. That's right. I don't care who has the workstation, the only thing sitting on the local disk should be the OS. All user files and major applications should be sitting on a remote filesystem. Otherwise, you end up with a completely intractable backup and upgrade problem. Trust me on this.
As a corollary to the last statement, you don't want to use NFS as your file sharing method. Hell, even SMB would be better. You want to look at either AFS or Coda. I would recommend the latter, as it's nowhere near as nasty to set up.
As part of Coda/AFS, you are going to have to think about how you design your file server setup. A central bank of servers is tempting, but this tends to be really harsh on the campus backbone, as it puts the workstation relatively "far" from the server, and all traffic has to traverse the backbone. Consider local file servers which may cache user data for replication back to the master server(s) later.
Printing is also a bit of a problem. I heartily recommend the CUPS system talked about here a couple of days ago. Have all your workstations spool to dedicated print servers. They don't have to be powerful, but make them dedicated. You won't regret it.
As far as security and other mishmash goes, do the usual /etc/inetd.conf edit, and comment EVERYTHING out. Don't run ANY daemons on the clients (other than what is absolutely necessary for Coda). Have all mail blindly forwarded to a central mail server. As a corollary, use IMAP (preferably IMAP-over-SSL) for mail access. Stay away from local UNIX mail and POP. And look at running postfix or exim instead of sendmail.
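The inetd.conf edit is easy to script when you are building a master image. The sketch below is one way to do it, with a made-up function name and sample entries: it comments out every active line and leaves blank lines and existing comments alone. It works on text rather than on the file itself, so you can check the result before writing anything back.

```python
def comment_out_all(conf_text: str) -> str:
    """Prefix every active (non-blank, non-comment) line with '#'."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            out.append("#" + line)
        else:
            out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


# Sample entries for illustration only; real files vary by distribution.
sample = (
    "# existing comment\n"
    "ftp     stream  tcp  nowait  root  /usr/sbin/tcpd  in.ftpd\n"
    "\n"
    "telnet  stream  tcp  nowait  root  /usr/sbin/tcpd  in.telnetd\n"
)
print(comment_out_all(sample), end="")
```

inetd only rereads its configuration when it is restarted or sent a HUP signal, so the edit does nothing to a running machine until then.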
You can think about using application servers (i.e. run X apps remotely) if you want, but realize that this will up the bandwidth requirement, and honestly, you probably can't run more than two dozen major X apps over a LAN before it bogs down completely. That is, you need a local app server with 100Mbit connections to about 25 machines so each can run 1 or 2 X apps remotely.
If you can afford it, and have the time, use LDAP as your user info directory - avoid NIS and NIS+ (the first is horribly insecure, and the second is nasty).
This is a first approximation of what you might do. If you want a serious proposal, I'm available nights and weekends (for a modest fee, of course... heehee)
Good luck!
-Erik
There are always four sides to every story: your side, their side, the truth, and what really happened.
There's been about 80 posts already and not one refers to a Beowulf cluster. Come on! He has 2500 machines here. Keep on your toes!