Ask Slashdot: Building a Cheap Computing Cluster?
New submitter jackdotwa writes "Machines in our computer lab are periodically retired, and we have decided to recycle them and put them to work on combinatorial problems. I've spent some time trawling the web (this Beowulf cluster link proved very instructive) but have a few reservations regarding the basic design and air-flow. Our goal is to do this cheaply but also to do it in a space-conserving fashion. We have 14 E8000 Core2 Duo machines that we wish to remove from their cases and place side-by-side, along with their power supply units, on rackmount trays within a 42U (19", 1000mm deep) cabinet." Read on for more details on the project, including some helpful pictures and specific questions.
jackdotwa continues: "Removing them means we can fit two machines into 4U (as opposed to 5U). The cabinet has extractor fans at the top and the PSUs and motherboard fans (which pull air off the CPU and remove it laterally; see images) face in the same direction. Would it be best to orient the shelves (and thus the fans) in the same direction throughout the cabinet, or to alternate the fan orientations on a shelf-by-shelf basis? Would there be electrical interference with the motherboards and CPUs exposed in this manner? We have a 2 ton (24000 BTU) air-conditioner which will be able to maintain a cool room temperature (the lab is quite small), judging by the guide in the first link. However, I've been asked to place UPSs in the bottom of the cabinet (they will likely be non-rackmount UPSs as they are considerably cheaper). Would this be, in anyone's experience, a realistic request (I'm concerned about the additional heating in the cabinet itself)? The nodes in the cabinet will be diskless and connected via a rack-mountable gigabit ethernet switch to a master server. We are looking to purchase rack-mountable power distribution units to clean up the wiring a little. If anyone has any experience in this regard, suggestions would be most appreciated."
A beowulf cluster of these! FP
Seriously, it isn't worth your effort - especially if you want something reliable. People who set out to make homemade clusters find out the hard way about design issues that reduce the life expectancy of their cluster. There are professionals who can build you a proper cluster for not a lot of money if you really want your own, or even better you can rent time on someone else's cluster.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
throwing gear away or giving it away. Just because you have it doesn't mean you have to, or should, use it. If energy and space efficiency are important, you need to carefully consider what you are reusing. Sure, what you have now may have already fallen off the depreciation books, but if it's going to draw twice the power and take double the space that newer used kit would, it may not be the best option, even when the other options involve purchasing new or newer-used gear.
Not saying you need to do this, just recommending you keep an open mind and don't be afraid to do what needs to be done if you find it necessary.
I work for the Department of Redundancy Department.
1. buy malware at a shady virus exchange to create a beowulf botnet
2. ???
3. profit!!!
Is your second mistake. How much memory is available and what will your interconnects be?
I thought some folks had switched to GPUs for heavy number-crunching... Though the custom hardware setups no doubt render this a moot point.
:\
Glad I could help
You'll need to consider how you're going to provision and maintain a collection of systems.
Our company currently uses the ROCKS cluster distribution, which is a CentOS-based distribution that provisions, monitors and manages all of the compute nodes. It's very easy to have a working cluster set up in a short amount of time, but it's somewhat quirky in that you can't fully patch all pieces of the software without breaking the cluster.
One thing that I really like about ROCKS is their provisioning tool which is called the "Avalanche Installer". It uses bittorrent to load the OS and other software on each compute node as it comes online and it's exceedingly fast.
I installed ROCKS on a head node, then was able to provision 208 HP BL480c blades within an hour and a half.
Check it out at www.rocksclusters.org
Don't anthropomorphize computers, they don't like it.
Slashdotters only imagine building Beowulf clusters. This is the first time anyone's been serious about it.
Besides the cost of electricity and cooling (which you will either pay yourself or share with others), the hassle of maintaining your own cluster is not worth it. I set up a purpose-built 50-blade cluster as a grad student and it ate up my time like nothing else. Not a good idea.
Trust me, why not try asking in the LQ forum? I am sure someone will come up with something good. Here on /. you have to filter hundreds of replies and comments to find an answer to your original question :)
Some comments will leave you wondering whether the poster was fucken drunk or asleep while posting.
I've been working in academic HPC for over a decade. Unless you are building a simple 2-3 node cluster to learn how a cluster works (scheduler, resource broker and such things), it's not worth your time. What you save in hardware, you'll lose in lost time, electricity, cooling, etc.
If you're interested in actual research, take one computer, install an AMD 7950 for $300, and you will almost certainly blow the doors off a cluster cobbled from old Core 2 Duos, and you'll save more than $300 in electricity.
It's 2013. Don't build your own cluster, just use AWS EC2 spot instances.
I'm routinely mounting things in 42U cabinets that ought not be mounted in them, so I've got *some* insight.
The standard for airflow is front to back and upwards. Doing some sticky note measurements, I think you could mount 5 of these vertically as a unit. I'd say get a piece of 1" thick plywood and dado cut channels 1/4" top and bottom to mount the motherboards. This would also give you a mounting spot where you could line up the power supplies in the back. This would also put the Ethernet ports at the back. Another thing this would allow would be easy removal of a dead board.
Going on this idea, you could also make these as "units" and install two of them two deep in the cabinet (if you used L rails).
Without doing any measuring, I'm suspecting this would get you 5 machines for 7U or 10 machines if you did 2 deep in 7U.
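To compare the layouts on rack space, here is a rough sketch of the arithmetic. It only uses figures quoted in this thread (2 boards per 4U in the submitter's plan, 5 or 10 boards per 7U for the vertical units, 14 machines in total) and assumes a partly filled unit still takes up its full height. The function name is made up for illustration.

```python
import math


def rack_units_needed(machines, boards_per_block, block_height_u):
    """Rack units used when boards are grouped into fixed-height blocks.

    Assumption: a partly filled block still occupies its full height.
    """
    blocks = math.ceil(machines / boards_per_block)
    return blocks * block_height_u


if __name__ == "__main__":
    machines = 14  # from the original submission
    layouts = {
        "submitter, 2 boards per 4U": (2, 4),
        "vertical, 5 boards per 7U": (5, 7),
        "vertical two deep, 10 boards per 7U": (10, 7),
    }
    for name, (per_block, height) in layouts.items():
        used = rack_units_needed(machines, per_block, height)
        print(f"{name}: {used}U")
```

None of this accounts for the switch, PDUs or UPSs, which need their own space in the cabinet.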
Yes Francis, the world has gone crazy.
What do you intend to use for inter-node communication? Gigabit ethernet? You need to realize that latency in inter-node communication can cause *extremely* poor scaling for non-trivial parallelization. Scientific computing clusters typically use infiniband or something like it, which has extremely low latency, but the equipment will cost you a pretty penny. If you are interested in doing computations across multiple computing nodes, you should really set up just two nodes and benchmark what kind of speed increase there is between running the job on a single node and on two nodes. My guess is that you are going to get significantly less than a 2x speedup. It is entirely possible that the calculation will be *slower* on two nodes than on just one. Of course, if you are just running a massive number of unrelated calculations, then inter-node communication becomes much less important, and this won't be an issue.
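A minimal sketch of how the one-node versus two-node benchmark could be turned into numbers. The timings in the example are hypothetical, and the function names are made up for illustration. The serial-fraction estimate is the Karp-Flatt metric, which lumps communication overhead together with genuinely serial work.

```python
def speedup(t_single, t_parallel):
    """Measured speedup: how many times faster the parallel run was."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, nodes):
    """Fraction of ideal (linear) scaling actually achieved."""
    return speedup(t_single, t_parallel) / nodes


def serial_fraction(t_single, t_parallel, nodes):
    """Karp-Flatt estimate of the share of the job that does not scale.

    Covers serial work plus communication overhead. Needs nodes >= 2.
    """
    s = speedup(t_single, t_parallel)
    return (1.0 / s - 1.0 / nodes) / (1.0 - 1.0 / nodes)


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, for illustration only.
    t_one_node, t_two_nodes = 600.0, 400.0
    print(f"speedup:         {speedup(t_one_node, t_two_nodes):.2f}x")
    print(f"efficiency:      {efficiency(t_one_node, t_two_nodes, 2):.0%}")
    print(f"serial fraction: {serial_fraction(t_one_node, t_two_nodes, 2):.0%}")
```

A speedup below 1 means the two-node run was slower than the single node, which is the case warned about above.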
It may initially seem like a good idea, but if the population isn't homogeneous, you could find your time eaten up looking for spares. With a single type of PC, a node can be sacrificed to keep others running. But these are systems near the end of their design lifetime (and loaded with dust -- and who knows what else?) so components (fans, HDDs, power supplies) are going to be starting to fail more frequently. And the rats' nest of power cables! Perhaps a bunch of multiprocessor, multicore server blades would be a better choice? They go pretty cheaply, and you'd get more cores per power supply, and use less floor space to boot, by rack mounting them.
Scientific American article: http://www.scientificamerican.com/article.cfm?id=the-do-it-yourself-superc
Your solution will take 14 servers, connect them with ancient 1GbE interconnect and hope for the best. The interconnect for clusters REALLY matters, many problems are network bound - and not only network bound but latency bound as well. Look at the list of fastest supercomputers and you will barely see Ethernet anymore (especially at the high end) and definitely not 1GbE. Your new boxes will probably come with 10GbE that will definitely help... Especially since there will be fewer nodes to have to talk to (only 2, maybe 4)
The other problem that you will run into is that your system will take about 20x the power and 20x the air conditioning bill (yeah - that is a LOT of power there). The modern new system will pay for itself in 9-12 months (and that doesn't include the tax deduction for donating the old systems and making them Someone Else's Problem).
Recycling old hardware always seems like fun. At the end of a piece of hardware's life cycle look at what it will actually cost to keep it in service - Just the electricity bill will bite you hard, then you have the maintenance, and fun reliability problems.
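The payback claim is easy to check against your own numbers. The sketch below is only an illustration: the wattages, electricity price, cooling overhead and hardware cost in the example are all hypothetical placeholders, and the function names are invented here. Plug in measured values before drawing conclusions.

```python
HOURS_PER_MONTH = 730  # average month, assuming the cluster runs 24/7


def monthly_power_cost(watts, price_per_kwh, cooling_overhead=0.0):
    """Monthly electricity cost of a constant load.

    cooling_overhead is the extra fraction of energy spent on air
    conditioning, e.g. 0.5 means 50% on top of the IT load.
    """
    kwh = watts / 1000.0 * HOURS_PER_MONTH * (1.0 + cooling_overhead)
    return kwh * price_per_kwh


def payback_months(new_hw_cost, old_watts, new_watts,
                   price_per_kwh, cooling_overhead=0.0):
    """Months until power savings cover the price of the new hardware."""
    saving = (monthly_power_cost(old_watts, price_per_kwh, cooling_overhead)
              - monthly_power_cost(new_watts, price_per_kwh, cooling_overhead))
    if saving <= 0:
        return float("inf")  # new hardware never pays for itself on power
    return new_hw_cost / saving


if __name__ == "__main__":
    # Hypothetical figures: 2000 W old cluster, 100 W replacement (20x),
    # $0.12 per kWh, 40% cooling overhead, $1500 of new hardware.
    months = payback_months(1500, 2000, 100, 0.12, cooling_overhead=0.4)
    print(f"payback in about {months:.1f} months")
```

If electricity does not come out of your own budget, the saving is effectively zero from your point of view and the function returns infinity, which matches the objection raised elsewhere in this thread.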
I have mod points and I am not afraid to use them
You're wasting your time with UPS if you don't have a Cabinet sized supply. The MTBF, maintenance, efficiency etc just doesn't make sense.
Put real money into making your cluster redundant or don't have a UPS at all.
You ought to consider what the cost of doing this with Amazon EC2 or similar services might be.
I have a feeling you don't have any specific computation goals though, so it will be difficult to measure success.
( former builder of a 80 node 2 way Pentium III 1GHz cluster back in the day.)
Clusters are still very valuable, but be sure to accurately describe the computational cost of what you have planned, because as you're building your cluster, prices of current tech keep getting so cheap that you might be able to just sell your equipment and lease time on someone else's HPC for half the money.
SPECfp2006 rate results:
e8600 34
i7-3770 130
x4 the performance
...sell the E8xxx series PCs in boxes for $100 apiece with Windows licence and use the $1400 towards buying Qty. 4 LGA1155 motherboards (4x$80), 4 unlocked K series i7s (4x$230), 4x8GB of DDR3 RAM (4x$40) and 4x ~300-400W budget power supplies (4x$30) = $1520
Use a specialized clustering OS (Linux) and have a smaller, easier to manage system, with lots more DDR3 memory and a lower electricity (and AC electricity) bill....
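For anyone who wants to check the sums, here is a small sketch using only the prices and SPECfp2006 rate scores quoted in this comment. Adding rate scores across separate machines is a rough assumption (it ignores the interconnect entirely), so treat the throughput comparison as a ballpark figure.

```python
# SPECfp2006 rate scores as quoted in the comment.
SPEC_RATE = {"e8600": 34, "i7-3770": 130}

# (quantity, unit price in dollars) as quoted in the comment.
PARTS = {
    "LGA1155 motherboard": (4, 80),
    "K series i7": (4, 230),
    "8GB DDR3": (4, 40),
    "budget power supply": (4, 30),
}


def build_cost(parts):
    """Total price of the replacement parts."""
    return sum(qty * price for qty, price in parts.values())


def resale_income(machines, price_each):
    """Money raised by selling the old PCs."""
    return machines * price_each


def aggregate_rate(cpu, count):
    """Naive combined throughput: per-machine score times machine count."""
    return SPEC_RATE[cpu] * count


if __name__ == "__main__":
    cost = build_cost(PARTS)
    income = resale_income(14, 100)
    print(f"parts: ${cost}, resale: ${income}, shortfall: ${cost - income}")
    print(f"14 x e8600:  {aggregate_rate('e8600', 14)}")
    print(f"4 x i7-3770: {aggregate_rate('i7-3770', 4)}")
```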
I know that as a win7 desktop / office use those machines will still work fine. And I am fairly sure a local Boys & Girls Club / YMCA / choose your charity would take them, even if for re-donation to their clients.
Not sure if .edu's need tax write offs, but at least they will go to a better use.
Then get a modern high end video card for less than this will cost to build, use it for compute, and have a faster end solution.
Unless you have a large number of identical machines capable of PXE booting and the necessary network hardware to wire them all together, you are really just building a maintenance nightmare. It might be fun to play with a cluster, but you'd do better to buy a couple of machines with as many cores as you can. It will take less space, less power, less fumbling around with configurations, less time and likely be cheaper than trying to cram all the old stuff into some random rack space.
If you insist on doing this, I suggest the following. 1. Only use *identical* hardware. (Or at least hardware that can run on exactly the same kernel image, modules and configurations) with the maximum memory and fastest networks you can. 2. Make sure you have well engineered Power supplies and cooling. 3. PXE boot all but one machine and make sure your cluster "self configures" based on the hardware that shows up when you turn it on because you will always have something broken. 4. Don't use local storage for anything more than swap, everything comes over the network... 5. Use multiple network segments, split between storage network and operational network.
By the way... For the sake of any local radio operations, please make sure you don't just unpack all the hardware from its cases and spread it out on the work bench. Older hardware can be really big RFI generators. Consider keeping it in a rack that offers at least some shielding.
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Would it be best to orient the shelves (and thus the fans) in the same direction throughout the cabinet, or to alternate the fan orientations on a shelf-by-shelf basis?
Keep them all the same, so that the system works as one big fan, pulling cool air from one side of the cabinet and exhausting hot air from the other. It's easiest to visualize if you imagine the airflow with a simple scenario. Imagine you had all of the even numbered shelves facing backward, blowing hot air to the front of the rack, while all the odd numbered shelves were trying to suck cool air from the front. That would totally fail because the odd numbered shelves would be sucking in hot air blown out from the even ones and vice-versa. You'd just be blowing hot air around the rack, not moving air through the rack. The same generally applies to other less simple configurations - if different units are arranged differently, they'll work against each other to some extent, rather than working as one team.
We have a 2 ton (24000 BTU) air-conditioner which will be able to maintain a cool room temperature (the lab is quite small)
1 BTU is about 0.29 watt-hours, which means 1 watt is about 3.4 BTU per hour. So take your total power usage in watts and multiply by roughly 3.4. That's how many BTU per hour of heat the rack will dissipate (all power eventually turns to heat). That's how much ADDITIONAL cooling you'll need beyond what's already used to keep the room cool.
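A quick sketch of that conversion, using the standard factor of about 3.412 BTU per hour per watt. The 24000 BTU figure is the submitter's air conditioner rating. The per-node wattage in the example is a hypothetical placeholder, so measure yours at the wall, and remember the AC is already carrying the rest of the room.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # 1 W of continuous load is about 3.412 BTU/h


def heat_load_btu_per_hour(total_watts):
    """Heat the equipment dumps into the room, in BTU per hour."""
    return total_watts * BTU_PER_HOUR_PER_WATT


def cooling_headroom(ac_btu_per_hour, total_watts):
    """AC capacity left over after the rack; negative means undersized.

    Ignores every other heat source in the room (people, other PCs, sun).
    """
    return ac_btu_per_hour - heat_load_btu_per_hour(total_watts)


if __name__ == "__main__":
    nodes = 14               # from the original submission
    watts_per_node = 150     # hypothetical, measure your own
    rack_watts = nodes * watts_per_node
    print(f"rack heat load: {heat_load_btu_per_hour(rack_watts):.0f} BTU/h")
    print(f"headroom on a 24000 BTU unit: "
          f"{cooling_headroom(24000, rack_watts):.0f} BTU/h")
```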
Don't build it, rent it. For the cluster size (number of cores) you are proposing, it will be much faster, easier, and cheaper to rent the resources you need from Amazon Web Services. Then use MIT StarCluster to build the software infrastructure, run your cluster jobs, and shut the whole thing down. If you want to learn about building small clusters, that's a fun academic exercise. If you want to get work done, rent a cluster by the hour.
Messing with old hardware to try and make it rack mountable? Pfft. Save the effort. Buy a few mid-range servers and you'll get similar compute performance compared to that energy hog of a cluster. If you really want to use that hardware, don't remount it. Just stack the servers in a corner, plug them in, and install ROCKS. It's still gonna be an energy hog and have crappy performance though.
A few people are saying don't bother. I'd like to extrapolate on that a little.
If your problem is embarrassingly parallel, you might get some good mileage out of your cluster. If not, don't bother.
It would be cheaper and faster to replace those 14 E8000s with 4 I7-3900s with DDR3. Old hardware should be retired; it is a pain to maintain and, worse yet, no one carries those IDE/PATA parts anymore.
Raspberry Pi.
http://www.tomshardware.com/news/Raspberry-Pi-Supercomputer-Legos-Linux,17596.html
Why not give them away and buy 2 i7 26xx or better CPUs for the same performance? You could fit that in 1U instead of a 42U rack. No switch required, smaller UPS required, less aircon load, less electricity.
Check out the Microwulf work. It's not necessarily what you're looking for, but the community has produced some creative custom cases/racks. It might give you some fresh ideas.
We have a cluster at my lab that's pretty similar to what the submitter describes. Over the years, we've upgraded it (by replacing old scavenged hardware with slightly less old scavenged hardware) and it is now a very useful, reasonably reliable, but rather power-hungry tool.
Thoughts:
- 1GbE is just fine for our kind of inherently parallel problems (Monte Carlo simulations of radiation interactions). It will NOT cut it for things like CFD that require fast node-to-node communication.
- We are running a Windows environment, using Altair PBS to distribute jobs. If you have Unix/Linux skills, use that instead. (In our case, training grad students on a new OS would just be an unnecessary hurdle, so we stick with what they already know.)
- Think through the airflow. Really. For a while, ours was in a hot room with only an exhaust fan. We added a portable chiller to stop things from crashing due to overheating; a summer student had to empty its drip bucket twice a day. Moving it to a properly ventilated rack with plenty of power circuits made a HUGE improvement in reliability.
- If you pay for the electricity yourself, just pony up the cash for modern hardware, it'll pay for itself in power savings. If power doesn't show up on your own department's budget (but capital expenses do), then by all means keep the old stuff running. We've taken both approaches and while we love our Opteron 6xxx (24 cores in a single box!) we're not about to throw out the old Poweredges, or turn down less-old ones that show up on our doorstep.
- You can't use GPUs for everything. We'd love to, but a lot of our most critical code has only been validated on CPUs and is proving very difficult to port to GPU architectures.
(Posting AC because I'm here so rarely that I've never bothered to register.)
1. "As a commentator I reject your premise."
2. "You shouldn't want what you state you want."
3. "You should spend additional money to pay for more efficient machines rather than the computer you already have which are paid for, because money grows on trees, I place no value on your learning exercise, and I assume the electricity comes right out of your departmental budget exactly the same way purchase hardware would."
4. "I will ignore your very specific and detailed description of your setup, because screw you, that's why."
Go ask the guys over at Microwulf. They appear to have licked this particular challenge and link to others who have as well.
"A person is smart. People are dumb, panicky dangerous animals and you know it." - K
Racks are built for air flow from front to back, so you'll need to turn the boards 90 degrees unless you remove the side panels... No, you do not want to alternate airflow; you want a hot side and a cool side, which makes cooling easier. If you can, try to vent the hot air out instead of cooling it down; it is cheaper. Btw, did you consider putting 4 or 5 boards vertically in 2 rows behind each other?
I set up a cluster using Beowulf several years ago as it came from my home state. But latency is a HUGE issue with it. Besides, all cables have to be the same length; you have to make sure you have 0 latency, otherwise you just have a bunch of computers connected to each other rather than one cluster.
and it wants its 'ask slashdot' story back!
Why not use PicoPSU (160W version is available). And a large 12VDC power supply to drive all of the PSUs at once? It would save space and let you consolidate several large components into one beefy component.
For interconnect, 4Gb Fibre Channel is only about $750 each (if you count the card + switch ports). Myrinet used to be the cheap fast way (faster than Ethernet), but the cards aren't really available anymore. The Intel chipset on your motherboard might be capable of GAMMA or DET, which is probably just as good as Myrinet and a whole lot cheaper.
Ok, slow computers that probably lack memory. You could run Erlang to scale a system and experiment with fault-tolerance. Virtualization setups are probably not worth spending time on. Experimenting with MPI can be fun. To keep the situation simple, use a network boot over TFTP, using a PXE binary so that you have a single node responsible for the image (a lean Linux distribution); all the other nodes can be diskless. You can get the cheap machines up to 1GB probably, and that is enough for experimentation. You will waste money on electricity and cooling, but knowledge is priceless, and you don't necessarily need to run them continuously... especially if you use your cluster for load testing (with Tsung) like I do.
http://linux.softpedia.com/get/System/Operating-Systems/Linux-Distributions/Cluster-Live-23558.shtml
I've done this. Starting with a couple of racksful of PS/2 55sx machines in the late '90s and continuing on through various iterations, some with and some without budgets. I currently run an 8-member heterogeneous cluster at home (plus file server, atomic clock, and a few other things), in the only closet in the house that has its own AC unit. It's possible I know something about what you're doing.
Some of what I'll mention may involve more (wood) shop or electrical engineering than you want to undertake.
My read of your text is that there is a computer lab that will be occupied by people that will also contain this rack with dismounted Optiplex boards and P/Ss. This lab has an A/C unit that you believe can dissipate the heat generated by new lab computers, occupants, these old machines in the rack, and the UPSs. I'll take your word, but be sure to include all the sources of heat in your calculation, including solar thermal loading if, like me, you live in "the hot part of the country". Unfortunately, this eliminates the cheapest/easiest way of moving heat away from your boards -- 20" box fans (e.g. http://www.walmart.com/ip/Galaxy-20-Box-Fan-B20100/19861411 ) mounted to an assembly of four "inward pointing" boards. These can move somewhat more air than 80 mm case fans, especially as a function of noise. One of the smartest thermal solutions I've ever seen tilted the boards so that the "upward slope" was along the airflow direction -- the little bit of thermal buoyancy helped air arriving at the underside of components to flow uphill and out with the rest of the heated air. I.e., this avoided a common problem of unmodeled airflow systems of having horizontal surfaces that trapped heated air and allowed it to just get hotter and hotter.
Nevertheless, the best idea is to move the air from "this side" to "that side" on every shelf. Don't alternate directions on successive shelves. If you're actually worried about EMI, then you must have an open sided rack (or you shouldn't be worried). One option is to put metal walls around it, which will control your airflow. Another option that costs $10 is to make your own Faraday cage panels however you see fit. (I've done chicken wire and I've done cardboard/Al foil/cardboard sandwiches. Both worked.)
You should probably consider dual-mounting boards to the upper *and* lower sides of your shelves. Another layout I've been very happy with is vertical board mounts (like blades) with a column of P/Ss on the left or right.
A *really* good idea for power distribution is to throw out the multiple discrete P/Ss and replace them with a DC distribution system. There's very little reason to have all those switching power supplies running to provide the same voltages over 6 feet. The UPSs are the heaviest thing in your setup; putting them at the bottom of the rack is probably a good idea. They generate some heat on standby (not much) and a lot more when running. Of course, when they're running, the AC is (worst case) also off and at least one machine should have gotten the "out of power" message and be arranging for all the machines to "shutdown -h now".
You only plan on having two cables per machine (since your setup seems KVM-less and headless), so wire organization may not be that important. (Yes, there are wiring nazis. I'm not one.) Pick Ethernet cables that are the right length (or get a crimper, a spool, and a bag of plugs and make them to the exact length). You'll probably get everything you need from 2-sided Velcro strips to make retaining bands on the left and right columns of the rack. Label both ends of all cables. Really. Not kidding. While you're at it, label the front and back of every motherboard with its MAC(s) and whatever identifiers you're using for machines.
Functioning computer systems are rarely useless; the E8000 systems the OP has will run software just like they did a few years ago when they were purchased. The most important question is: what do you want this cluster to do? If you want the experience of building it, including solving the HW issues of racking and stacking, and the software issues of cluster management software, job scheduling and resource management, then don't throw the equipment away. There are many opportunities for making decisions that require problem-solving and resourcefulness. Plenty of FOSS solutions, even while using only the built-in network connections for an interconnect. If you have some HPC or scientific cluster-aware software in mind that you want to run, tailor your software configuration to run that. The folks who built Beowulf clusters in the early 2000s had a goal in mind; often, that goal was to provide an environment to develop their own MPI software to simulate some phenomena they were interested in. Are you a programmer, or want to learn parallel programming? Are you offering your cluster to folks who are learning parallel programming? http://www.open-mpi.org/ has good information, and FOSS implementations for Linux distributions. There are also Windows clustering solutions, if that's what your user base requires; not free, obviously. So, what do you want this cluster to do?
If you have datasets of more than trivial size you don't want to be spending time waiting while you are shuffling them back and forth over the internet.
Dude, this kind of question is what the beowulf mailing list is all about. Post your questions there and you'll get lots of answers. www.beowulf.org
Bear in mind that because of Moore's law and related phenomena, clustering old computers is usually a bad deal compared to buying a single new computer that's faster and draws a lot less power. However, it's a great way to learn about clustering, about running MPI, and about all sorts of cluster management things.
You're at the sort of nice spot to learn. A 4 node cluster is too small. A 100 node is too big.
That none of these posts are helpful. I would bet the majority of submitters cut their teeth on similar setups.
And my question: why not do it? Personally, I originally built an ISP out of 45 486s, designed a similar rack and used box fans for circulation, and it worked wonderfully for 7 years. I see no reason it couldn't work for you. I actually built a box, installed a rack into it and, using 2x4s, mounted L brackets to hold the motherboards in place, with a box fan on top and UPSs in the bottom. The horrifying part? I had no air conditioner to cool it, and yet I still had minimal hardware issues.
http://openssi.org/cgi-bin/view?page=features.html:
All of the machines in the cluster see one cluster wide filesystem, one cluster wide process space, cluster wide IPC space, etc. Processes can be migrated around the cluster. TCP/IP is cluster-wide (connections migrate with processes, and the initnode can load balance connections around).
About the only thing missing is shared memory, where all nodes have direct read/write access to all of the other nodes. For that, you'd need hardware set up for "NUMA", such as the high end stuff sold by HP, Cray, IBM, etc.
As much as Slashdot loves Beowulf, it's just an implementation of a 70's era design, of writing software explicitly for a library that passes messages. OpenSSI, like a NUMA cluster, looks like one big Unix machine to software. Processes that fork() can migrate or be migrated to other nodes and can continue talking over pipes they already have open, continue reading/writing files they already have open, talking over sockets they have open, etc.
I agree with those who propose a new CPU to get some mips/flops: less power, less space, less maintenance. I see some usefulness in building a farm for kvm/xen virtual machines. Having lots of motherboards lets you better distribute load and cope with hw failure. You could use sheepdog for storage, and the 1Gb should be enough. The virtual farm gives you a tool to quickly deploy different servers for test/experiment. You can still do some number crunching just to test your software and, when it is stable, run it on better performing (flops/watt) hw.
Unless you're specifically undertaking this project to learn more about building a cluster, don't build a cluster. Over time it would be cheaper in terms of power, cooling, manpower and space to toss the old equipment and replace it with something more powerful, or better yet just toss everything and spin up cluster resources on a cloud platform as needed. AWS, for example, has very good support for cluster computing and can put you in or very near supercomputer territory for $1,000/hr.
If possible, it's always better to extract the hot air instead of warming up the room and then cooling it down. Do you have any air extractor that you can use? Have you considered isolating hot and cold areas? A bunch of extractors and tubes with plastic panels will be sufficient.
Keep in mind that if this is commodity hardware, it may as well be possible that if you get rid of the heat, you can actually run the cluster at ambient temperature. That will save you a good amount of money every year.
About other settings, I'd go for an NFS with DRBD and HA for /home and /scratch (if you use one). Perhaps a controller node with WOL and a few modifications on the job scheduler may allow you to boot/shutdown the nodes on demand. SLURM+MUNGE would be my queue manager/scheduler of choice: it's extremely simple and powerful.
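For the WOL part, a minimal sketch of what the controller node would send. A Wake-on-LAN magic packet is six 0xFF bytes followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The function names, the MAC address in the usage example, and the choice of port 9 are illustrative assumptions, and the nodes need WOL enabled in the BIOS and on the NIC for any of this to work. Hooking it into the scheduler is left out.

```python
import socket


def magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex
    # Six 0xFF bytes, then the MAC repeated 16 times: 102 bytes in total.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))
```

Usage would be something like `wake("00:11:22:33:44:55")` for each node the scheduler wants online, which is one more reason to label every board with its MAC address.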
X.
Pulling the system out of the case seems... odd. Are you that short on space that you can't have another rack?
Several reasons:
1. dust
2. static
3. a. cooling: real servers have plastic shrouds to guide the air from the fans through the heat sinks. Without that, the cooling won't be anywhere near as good, and possibly not good enough to keep them from shutting down when they're being run hard.
    b. DO NOT ALTERNATE directions. In data centers, in server rooms, etc, you have all in a row facing the same way, and blow your cool air towards the front, and let it get somewhat warm behind. This is how they're designed to be used.
UPSes on the bottom: sure. I've put some in the middle of the rack, but those are rack-mount. MAKE SURE that you leave clearance to open 'em up when you need to replace the batteries.
NOTE: when you buy replacement batteries for these UPSes, UNDER NO CIRCUMSTANCES BELIEVE ANY MANUFACTURER OR RESELLER. TELL THEM THAT IF THEY DON'T SEND YOU HR - HIGH RATE - BATTERIES, YOU WILL SEND THEM BACK. APC rackmounts WILL NOT ACCEPT *A*N*Y*T*H*I*N*G* but an HR battery, and continue to tell you that you need to replace it, forever.
I'm assuming you'll be running Linux. I'm also assuming that you're using this for heavy duty computing, not load balancing or H/A (high availability).
For clustering, also check out Torque, which is a standard clustering package, though it does need the jobs to be parallel processing aware.
For the person who mentioned "time" as a cost: I'd assume that the OP was asked to do this "as time permitted", and is certainly something to do that's useful, as opposed to playing solitaire, waiting for something to need work....
mark
Whatever happened to Plan9?
If you are serious about this project I would use Plan 9 because it is designed to use all of your hardware transparently. They can always use more members in this small community. You might find this underrated platform quite delightful:
http://plan9.bell-labs.com/plan9/
Ignore all the naysayers. Just having experience with Plan 9 makes this experiment worth it.
I have never seen diskless clusters work very well. As soon as they run close to using the system memory (very easy) they go sideways down a GigE link. I have been wrong before, so sure -- go try it.
You will notice that all motherboards have mounting holes in about the same place. Bolt them together and have at it. Airflow is a royal pain to control without a chassis. Memory and other large packages in the system often have air flow/cooling issues once the board leaves a standard chassis. The closer together the boards the higher the air flow velocity can be so that may prove to be a plus.
Also look at ROCKS clustering.
You can just stack what fits on shelves and benches then get started. That will tell you if the performance is even close to your needs.
As you benchmark your codes you may find that more than N boxes no longer provides any speedup and compacting boards together to gain a couple U has little value.
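One way to find that N from benchmark runs is sketched below. The timings are hypothetical, and both the function name and the 10% cutoff are arbitrary choices made for illustration.

```python
def useful_node_count(timings, min_gain=0.10):
    """Largest node count that is still worth using.

    timings maps node count -> wall-clock seconds for the same job.
    Walks up through the measured node counts and stops as soon as the
    next step cuts the run time by less than min_gain (a fraction).
    """
    counts = sorted(timings)
    best = counts[0]
    for prev, nxt in zip(counts, counts[1:]):
        gain = (timings[prev] - timings[nxt]) / timings[prev]
        if gain < min_gain:
            break
        best = nxt
    return best


if __name__ == "__main__":
    # Hypothetical measurements: the job stops scaling after 4 nodes
    # and actually gets slower at 14.
    measured = {1: 1000.0, 2: 520.0, 4: 290.0, 8: 270.0, 14: 300.0}
    print(f"worth using up to {useful_node_count(measured)} nodes")
```

If the answer comes out well below 14, squeezing every board into the cabinet buys you nothing for that workload.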
Measure and Track AC power as well as cooling. These are not free and if you can jump a couple generations the cost of upgraded hardware vanishes when AC and cooling are considered. It may be that AC+cooling is a fixed cost to you and thus a don't care. But measure...
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
http://helmer3.sfe.se/
Most of the posts say throw them away, they're not worth the cost of operation. They assume there is $$$ for buying something new.
Some are saying it's worthwhile as a learning exercise.
Power: At most places I've worked, my department did not pay for electricity. So from my dept's point of view, it's not a concern. If I was running it at home or in a colo or had to account for power, it would be a concern. So paying for the new with savings isn't a solution.
Cooling: If the room isn't overloaded, you don't need to buy more AC. It sounds like the OP had enough extra. If you have to worry about the cooling because you get charged or have to add more, it's a concern.
CPU Cycles: Anything modern, desktop will blow this away. So will GPU cards. But they have to be budgeted, approved and purchased. I've been places where that is a minimum of 6 months. Lots of things got expensed on a credit card. As a learning exercise, the speed isn't the issue. You want it to work like a real system.
Rack space: many places I worked had extra real estate for spare machines.
Labor: There are *many* managers that will spend $5000 in labor to save a $1000 PC. For DIY or student work, labor is free.
If learning is the only need you could simulate a cluster with a virtual environment with multiple VMs as nodes on a modern CPU. I wonder if a new desktop system w/ a quad core CPU, lots of RAM (enough for each node, plus the host) and disk to put images on would work for student programmers.
It does sound like there is some $$ here for a UPS, a rack, a network switch in the rack and labor to build it. If $$ is tight, I'd compare that $$ against a VM host w/ multiple virtual nodes. You'd lose the learning experience of physical building, but you'd save future labor on maintaining old hardware.