Ask Slashdot: Clusters On the Cheap?
First time accepted submitter serviscope_minor writes "A friend of mine has recently started a research group. As usual with these things, she is on a shoestring budget and has computational demands. The computational task is very parallel (but implementing it on GPUs is an open research problem and not the topic of research), and very CPU bound. Can slashdotters advise on a practical way of getting really high bang for buck? The budget is about £4000 (excluding VAT/sales tax), though it is likely that the system will be expanded later. The computers will probably end up running a boring Linux distro and Sun GridEngine to manage batch processing (with home directories shared over NFS)."
Why waste money on building a cluster when you can rent the best in the world *by the hour*?
Actually, that's a good question... Assuming no time constraints, at what point does it make sense to buy hardware rather than use the cloud? Take that budget above (roughly US$6K) and the best hardware you can get for that price: How many months would you need to run it, flat out, to equal the number of floating-point ops EC2 would give you for that cost?
Many universities/consortia have supercomputers available on which researchers can apply for (or buy) time. For example, my university is a member of VPAC, which has a big-arse cluster shared between a number of institutions. She might get much better bang for buck if she uses the money for that, rather than splashing out for dedicated hardware.
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Why buy your own when you can use existing GRID infrastructure? For 4k you can't do much more than get a few decent desktops for yourself and a few grad students and/or postdocs. Rather than blow it on a massively underpowered cluster, use the grid. I know the UK has massive clusters available to researchers, so find out how to get an account and resources on them and use those. For test jobs, interactive analysis and other low-latency tasks, use your desktop.
You can get a SuperMicro reseller to sell you one workstation with 4 sockets of CPUs and a bunch of RAM. UK£4,000 = US$6,299.20.
That buys you a box with 4 x Opteron 6134 (32 cores) and 128GB RAM (32 x 4GB sticks). And some hard disks.
OP hasn't mentioned a lot except budget. Since you are on such a tight budget, I would highly recommend doing some theoretical analysis first. Do you have a serial code? How much parallelism exists in the code? You say the task is 'very parallel', but Amdahl's law (which is really common sense) will tell you that even a small serial fraction of the code limits your speedup. You should also consider how long the code actually runs. Achieving a speedup of 2 for a serial code that runs for one minute is near worthless.
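To put numbers on the Amdahl's law point, here is a minimal Python sketch. The function name `amdahl_speedup` and the example fractions are made up for illustration; the parallel fraction of the actual research code isn't given in the question and would have to be measured by profiling.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of the single-core runtime that can be
    spread across workers (0.0 to 1.0). This is an assumed input; you
    would estimate it by profiling the real code.
    workers: number of cores/processes running the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many workers
    # you add; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Illustrative values only, not measurements of the OP's code.
    for p in (0.90, 0.99, 0.999):
        for n in (8, 32, 128):
            print(f"parallel={p:.3f} workers={n:3d} "
                  f"speedup<={amdahl_speedup(p, n):.1f}")
```

The takeaway is that as the worker count grows, the speedup can never exceed 1 / (serial fraction), so a code that is 90% parallel tops out below 10x however many cores you buy.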
After you estimate speedup, do some rough calculations based on the average cost of a processor and the number of processors required. This should give you an estimate of the hardware cost. Compare that with the CPU cycles per dollar you get from a cloud service such as Amazon.
$1.60 / hour for the largest non-GPU cluster instance. This also provides you with rather fast interconnects and scalability with multiple instances.
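The buy-versus-rent comparison can be roughed out in a few lines of Python. This is a back-of-the-envelope sketch, not a pricing tool: the budget (US$6,299.20) and hourly rate (US$1.60) are the figures quoted in this thread, while `throughput_ratio` is a hypothetical knob for how much work the purchased box does per hour relative to one rented instance. Nobody here has measured that, so it defaults to 1.0. Power, cooling and admin time are ignored.

```python
def rental_hours(budget_usd: float, hourly_rate_usd: float) -> float:
    """Instance-hours the budget buys at a flat hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return budget_usd / hourly_rate_usd


def break_even_days(budget_usd: float, hourly_rate_usd: float,
                    throughput_ratio: float = 1.0) -> float:
    """Days the owned hardware must run flat out to match the rental.

    throughput_ratio is an ASSUMED value: work per hour of the owned
    box divided by work per hour of one rented instance.
    """
    if throughput_ratio <= 0:
        raise ValueError("throughput_ratio must be positive")
    hours = rental_hours(budget_usd, hourly_rate_usd) / throughput_ratio
    return hours / 24.0


if __name__ == "__main__":
    budget = 6299.20   # figure quoted in the thread for GBP 4000
    rate = 1.60        # hourly rate quoted in the thread
    print(f"rented hours: {rental_hours(budget, rate):.0f}")
    for ratio in (0.5, 1.0, 2.0):  # hypothetical ratios
        print(f"ratio {ratio}: break-even after "
              f"{break_even_days(budget, rate, ratio):.0f} days")
```

At those figures the budget buys about 3937 instance-hours, or roughly 164 days of one instance running non-stop. If the job would keep owned hardware busy for longer than the break-even period, buying starts to look better on raw cycles alone.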
Only £4,000 in hardware would be a waste of money. You wouldn't have all that much computing power, and it would be obsolete immediately.
She could also consider creating a BOINC project. She could then do some publicity locally and on forums, to get people to choose her project. I've never tried creating a BOINC project, so I don't know how hard this is. However, I do run the client as a background task, and I imagine many other people do as well.
Enjoy life! This is not a dress rehearsal.
Hmm, I wonder if the college has a datacenter already. You know, so ya don't have to worry about the overhead.
Ohh and man hours to configure the thing? You know I'm sure there are a load of students there working for free.
This is NOT some company trying to spin up a new service. This is a school doing a project. That means a shoestring budget, and the people get paid with good grades in their class.
4000 pounds won't get you very far at all on Amazon.
I love it when people take the question given and then insist that the problem can only be solved by changing the specifications to use "cloud" computing or some other nonsense. It makes you guys look like paid shills.
You are entitled to your own opinions, not your own facts.
The problem is the constraints. The cheap cluster in my old department cost £100k. £4k does not buy you a lot of hardware. You will probably find a lot more lying around in the undergrad labs. For some of my work as a PhD student, that's exactly what I used - each lab had 40 machines on a GigE network and closed overnight, and for work that wasn't that latency sensitive, I could distribute it across the machines there and run it at night without anyone minding.
If you're serious about needing a cluster, then you need to spend a lot more than £4K. If you only need a cluster for a short time, then £4K can buy you a chunk of time on someone else's hardware. Since this is the UK, they should contact the Manchester Supercomputing Centre, which provides this kind of service to UK universities at quite a reasonable price (and will also lend you people who are good at optimising code for their systems). If the university doesn't already have some clusters lying around, then you should get in contact with a few other research groups. £4K won't go very far, but if half a dozen research groups each put in £4K then that gives you enough for a reasonable cluster to share between the various users.
I am TheRaven on Soylent News
Just get with the guy who did the Ask Slashdot the other day who didn't know what to do with his supercomputer.