Linux Clustering w/Bootable CD-ROMs?
Cameron asks:
"Has anyone tried to make a Linux cluster on a typical
company/school network? I am trying to make a Linux cluster by
taking bootable CDs and putting them in computers on an existing
network. Red Hat (or another distro if it's better suited) would
boot and run off the CD without needing any (or much) HD space. This
way the computers aren't changed and I can have a virtual supercomputer
for a while. Mmm, 18GHz. Anyway, has anyone tried this before? Also
I'd appreciate any suggestions on which distro to use and what
cluster software/daemons I'd need, e.g. Beowulf or something like
it." While this is an interesting approach to clustering, unless each node
has quite a bit of RAM, I would think that you might need at least a
small swap partition. What problems do you see with this idea, and
would they be surmountable? Might such an idea be useful for quickly
converting a computer lab to a cluster when it's not being used for
other things?
You should probably look at people who have set up diskless Linux systems. The root directory is NFS. You can even use a network device for swap space. You'll probably need some sort of network file system for doing your work anyway.
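For anyone who hasn't seen a diskless setup, here is a minimal sketch of the kernel command line such a node typically boots with. It assumes a kernel built with NFS-root and IP autoconfiguration support; the server address and export path are made-up placeholders, and the helper function is just for illustration.

```python
def nfs_root_cmdline(server_ip, export_path, nfs_options="ro"):
    """Build a kernel command line that mounts the root filesystem over NFS.

    server_ip and export_path are placeholders for your own NFS server
    and exported root directory (assumptions, not values from this thread).
    """
    nfsroot = f"{server_ip}:{export_path}"
    if nfs_options:
        nfsroot += f",{nfs_options}"
    # root=/dev/nfs tells the kernel to use an NFS root;
    # ip=dhcp asks it to configure the network interface via DHCP first.
    return f"root=/dev/nfs nfsroot={nfsroot} ip=dhcp"


if __name__ == "__main__":
    print(nfs_root_cmdline("192.168.1.1", "/export/node-root"))
```

You would pass the resulting string to the kernel from whatever boots the node (boot floppy, CD, or network boot loader).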
Yes, NFS would be needed for something distributed like a Beowulf, since you need shared storage for the applications being run. However, between that NFS traffic and the Beowulf's own communication, you'll already have plenty of cross chatter even on a decent network, so serving the OS over NFS as well would have a severe impact on performance.
The advantage of booting off of NFS is that you don't have to burn new CDs when you update the cluster.
Everything in life is a trade-off: if you don't want the work of burning the disks, you accept waiting extra long for a job to complete because of the network bottleneck.
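To put a rough number on that bottleneck, here is a back-of-the-envelope sketch. It assumes every node pulls the same amount of OS data through a single shared server link; the image size, node count, link speed, and efficiency factor are all made-up inputs, not measurements.

```python
def transfer_seconds(image_mb, nodes, link_mbit, efficiency=0.7):
    """Rough time for `nodes` clients to each pull `image_mb` megabytes
    through one shared server link of `link_mbit` megabits per second.

    `efficiency` is an assumed fraction of the raw link rate that is
    actually usable after protocol overhead and contention.
    """
    total_megabits = image_mb * 8 * nodes       # megabytes -> megabits, all nodes
    usable_mbit_per_s = link_mbit * efficiency  # derate the raw link speed
    return total_megabits / usable_mbit_per_s


if __name__ == "__main__":
    # Hypothetical lab: 20 nodes, 200 MB of OS files each, 100 Mbit/s server link.
    secs = transfer_seconds(image_mb=200, nodes=20, link_mbit=100)
    print(f"about {secs / 60:.1f} minutes of pure OS traffic")
```

With those assumed numbers the OS traffic alone ties up the server link for several minutes, before any job data moves. Caching on the nodes would cut that down, which this sketch ignores.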
Wheeeee
Slackware (http://www.slackware.com) has a product called ZipSlack (http://www.slackware.com/zipslack/) which (as its name implies) runs off of a Zip drive. That might work for you.
I looked into doing something like this with a web server cluster a while back. The goal was to eliminate as many single points of failure as possible. The design goes something like this: Buy 4 machines, each with a CD-ROM drive and a gig or so of RAM. Put a stripped-down and hardened version of your favorite distro on a CD, and set it to create a ramdisk on bootup so the distro runs entirely from RAM. Put them all behind a firewall. Set up an internal gigabit Ethernet network connecting all of these machines together, and put your backend database with all of your web pages on the internal network. Now, put SysKonnect gigabit Ethernet cards in them (the ones with two ports on each card and drivers for Linux/FreeBSD/Solaris/almostanyothernix which do automatic failover if you yank a network cable). Make the firewalls, routers, and database backend fail over properly, and assuming you have two completely independently routed incoming network connections, the only single points of failure are incoming power and the fact that all your machines are in one place, so if you have a fire, you're kinda screwed.
Get F5 BIG-IP firewall thingies. They do 900Mb/s of real-world traffic (USA Today's web page, the Supreme Court website (remember Florida?), etc.). They don't pay me, and I've never worked for them; I'm just really impressed with their product. I don't even know if they make these anymore, but if they do, they're way better than anything Cisco makes (or, to be more precise, they were way better than any Cisco firewall product at the time I looked into this).
Anyway, the main point, as it relates to the poster's question, is that if anyone manages to get past your firewall and invade your box, he's only modifying what's going on in RAM. If you detect him there and plug the security hole, you don't have to worry about what things he changed on what disks, and so on. All you have to do is burn 4 more CDs, take the machines down one by one (the F5s do load balancing as well as firewalling, so you aren't going to have any downtime upgrading your machines -- did I mention they're the shiznit?) and reboot off the new CD. Yes, I know that the database server still has disks and an attacker could get in and mess that up (again, this is assuming he got past the firewall, which is a pretty big assumption if you're blocking everything but port 22 and port 80), but by and large, it's a pretty interesting way of dealing with possible break-ins. If you're really hardcore, you can run OpenBSD on the boxen (if OpenBSD allows that whole foofy "boot into ramdisk" thing -- I have no idea whatsoever if it does. You could run it on the database backends tho.)
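The "take the machines down one by one" part can be sketched as a simple loop. This is only an illustration of the ordering, not anything F5-specific: the node names and the `reboot` callback are hypothetical stand-ins for "swap the CD and reboot", and the function just tracks how many boxes stay in service at the worst moment.

```python
def rolling_upgrade(nodes, reboot):
    """Reboot nodes one at a time so the others keep serving traffic.

    `reboot` is a hypothetical callback standing in for the manual step
    (insert the new CD, reboot, wait until the box is back up).
    Returns the smallest number of nodes that were in service at once.
    """
    in_service = set(nodes)
    fewest_in_service = len(in_service)
    for node in nodes:
        in_service.remove(node)   # pull this box out of the pool
        fewest_in_service = min(fewest_in_service, len(in_service))
        reboot(node)              # boots from the freshly burned CD
        in_service.add(node)      # back in the pool before touching the next one
    return fewest_in_service


if __name__ == "__main__":
    rebooted = []
    fewest = rolling_upgrade(["web1", "web2", "web3", "web4"], rebooted.append)
    print(rebooted, "fewest in service:", fewest)
```

With 4 boxes, 3 are always up, so the load balancer has somewhere to send requests throughout the upgrade.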
For various social and political reasons, we had to make this design work for either Linux servers or Solaris servers; hence the use of SysKonnect gigabit Ethernet cards, which work in both Solaris and Linux. However, I've heard that Solaris now has netcard failover built in; I'm not sure if the Linux kernel has the infrastructure to do that or not (I don't think it would be too terribly hard to write it tho, if you had to.)
See http://labs.psoftware.org. It's a bootable CD-ROM distro that can turn your school/office Windows PC into a perfect MOSIX client!
Check out scyld.com for their Beowulf distribution. It does exactly what you need (though it does require a dedicated head node system). You can run all of the slave nodes entirely diskless, and control booting into the Beowulf stuff via a floppy, CD-ROM, or the hard disk.
You can pick up the CDs at cheapbytes (I think), so it's only a few $$$ for a basic install. Support or buying the professional edition will cost you bigger $$$.
The advantage of this is that all of your files, state, and management live on the head node. The slave nodes boot up, pick up their configurations from the head node, and go. Scyld Beowulf is also significantly easier to install/maintain than rolling your own.