Ask Slashdot: Best Bang-for-the-Buck HPC Solution?
An anonymous reader writes: We are looking into procuring a FEA/CFD machine for our small company. While I know workstations well, the multi-socket rack cluster solutions are foreign to me. On one end of the spectrum, there are companies like HP and Cray that offer impressive setups for millions of dollars (out of our league). On the other end, there are quad-socket mobos from Supermicro and Intel, for 8-18 core CPUs that cost thousands of dollars apiece.
Where do we go from here? Is it even reasonable to order $50k worth of components and put together our own high-performance, reasonably-priced blade cluster? Or is this folly, best left to experts? Who are these experts if we need them?
And what is the better choice here? 16-core Opterons at 2.6 GHz, 8-core Xeons at 3.4 GHz? Are power and thermals limiting factors here? (A full rack cupboard would consume something like 25 kW, it seems?) There seems to be precious little straightforward information about this on the net.
Unless you need to transfer A LOT of data from your cluster, Amazon AWS will probably be cheaper than dedicated hardware. Especially if you can use spot instances (that are 5-10 times cheaper than the regular Amazon EC2 instances).
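One rough way to sanity-check the cloud-versus-hardware claim is to work out how many instance-hours the hardware budget would buy. This is only a sketch: the $50k figure and the 5x spot discount come from the thread, but the $2.00/hour on-demand rate is a made-up placeholder, not a real AWS price, and the calculation ignores data transfer, power, and admin time.

```python
def break_even_hours(hardware_cost, on_demand_rate, spot_discount=1):
    """Instance-hours you could rent for the price of the hardware.

    spot_discount is how many times cheaper spot pricing is than the
    on-demand rate (1 means no discount).
    """
    return hardware_cost * spot_discount / on_demand_rate


HARDWARE_COST = 50_000   # budget mentioned in the question
ON_DEMAND_RATE = 2.00    # hypothetical $/hour, placeholder only

on_demand_hours = break_even_hours(HARDWARE_COST, ON_DEMAND_RATE)
spot_hours = break_even_hours(HARDWARE_COST, ON_DEMAND_RATE, spot_discount=5)

print(f"on-demand: {on_demand_hours:.0f} instance-hours")
print(f"spot (5x cheaper): {spot_hours:.0f} instance-hours")
```

If the expected workload over the life of the hardware is far below those hour counts, renting is likely cheaper. If the machines would be busy around the clock for years, the picture can flip.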
i did this before, on a very small scale, for GBP 1,000 about 10 years ago. sales teams kept offering me 2ghz dual-core machines at GBP 300 each and i had to tell them this:
"look, i have a budget of 1,000 GBP. you're offering me a 2ghz system for 300. so i can only buy 3 machines, right? so that's a total of 6 ghz of computing power. on the other hand, if i buy this GBP 125 machine which has only a 1ghz processor, i can get 8 of those, which gives a total of 8 ghz of computing power. so _why_ would i want FASTER?"
so i bought qty 8 of motherboard, CPU, 128mb RAM, low-cost case containing a PSU already, and accidentally included a 3com network card because i didn't realise that the built-in ethernet on the motherboard could do PXE boot..... but still, all-in that was 125 GBP and each one took 15 minutes to assemble so it was no big deal. got myself 8ghz of raw computing power, which was the best that i could get for the money that i had.
and that's the question that you have to ask yourself. what's the highest performance / price metric that can be achieved?
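The arithmetic behind that comparison can be written down in a few lines. This is a minimal sketch using the prices and clock speeds quoted above. Summing clock rate is the same crude proxy used in the comment and says nothing about memory, I/O, or interconnect. The function name is just illustrative.

```python
def aggregate_ghz(budget, unit_price, ghz_per_machine):
    """Return (machines affordable, total GHz) for a fixed budget."""
    machines = budget // unit_price  # whole machines only
    return machines, machines * ghz_per_machine


BUDGET_GBP = 1000

fast = aggregate_ghz(BUDGET_GBP, 300, 2.0)  # the 2 GHz machines at GBP 300
slow = aggregate_ghz(BUDGET_GBP, 125, 1.0)  # the 1 GHz machines at GBP 125

print("fast option:", fast)  # (3, 6.0)
print("slow option:", slow)  # (8, 8.0)
```

The same function works for any other per-machine figure (GFLOPs, IOPS, memory bandwidth) as long as the workload actually scales across machines.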
the highly specific problem that i was endeavouring to parallelise was a very small memory footprint non-I/O-bound task: running the NIST.gov Statistical Test Suite. i booted all 8 machines off of my laptop, over PXE boot with an NFS read-only root filesystem. had to wait 30 seconds between each because my 800mhz P3 laptop with 256mb of RAM reaaallly couldn't cope with 8 machines hammering it... not over a 100mbit/sec link, anyway.
once started, i wrote a script that ssh'd into each and left them running the STS for a day at a time. very little actual data was generated: a report.
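The original script is not shown, so the following is only a guess at its shape: loop over the nodes, ssh into each, and start the job in the background. The hostnames `node1` to `node8` and the wrapper `./run_sts.sh` are hypothetical, and it assumes key-based ssh login is already set up.

```python
import subprocess

NODES = [f"node{i}" for i in range(1, 9)]  # hypothetical hostnames
REMOTE_CMD = "./run_sts.sh"                # hypothetical wrapper around the test suite


def build_ssh_command(host, remote_cmd):
    """Build an ssh invocation that starts remote_cmd detached on host."""
    # nohup plus redirected stdio lets ssh return while the job keeps running
    background = f"nohup {remote_cmd} > sts.log 2>&1 < /dev/null &"
    return ["ssh", "-o", "BatchMode=yes", host, background]


def launch_all(nodes, remote_cmd, dry_run=True):
    """Start remote_cmd on every node; with dry_run, only build the commands."""
    commands = [build_ssh_command(host, remote_cmd) for host in nodes]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


# Dry run: show what would be executed without touching the network.
commands = launch_all(NODES, REMOTE_CMD)
for cmd in commands:
    print(" ".join(cmd))
```

Calling `launch_all(NODES, REMOTE_CMD, dry_run=False)` would actually start the jobs. Collecting the reports afterwards could be a similar loop using `scp`.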
but the issue that you're solving may involve huge amounts of disk I/O, it may involve huge amounts of inter-connectivity (inter-dependence between the parallel tasks). you may even have to use a GPU (OpenCL) if it's that computationally expensive... ... and that's where anyone's advice really ends, because unless you know exactly what it is you need to do - in real, concrete terms of I/O per second, GFLOPs/sec, GMACs/sec, inter-communication/sec, you really can't and shouldn't even remotely consider spending any money.
so please consider writing a spreadsheet, based on the performance/price metric, extending it to the domain(s) that you're interested in optimising. then the answer about what to buy should be fairly self-evident.
oh and don't forget to include the power budget (and cooling) because i think it will shock the hell out of you. remember you need to include the maximum specs, not the "average" or "scenario design power".
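A rough power budget along those lines might look like this. It is a sketch only: the node count, the 500 W maximum draw, and the 50% cooling overhead are placeholder numbers, not measurements. Take the maximum draw from the PSU or vendor spec sheet for the real hardware.

```python
def power_budget_watts(nodes, max_watts_per_node, cooling_overhead=0.5):
    """Return (IT load, total load including cooling) in watts.

    Uses the maximum rated draw per node, not an average.
    cooling_overhead is the extra fraction of the IT load spent on cooling
    (an assumed figure; it depends heavily on the room).
    """
    it_load = nodes * max_watts_per_node
    return it_load, it_load * (1 + cooling_overhead)


# Placeholder figures: 8 nodes at 500 W maximum each.
it_load, total_load = power_budget_watts(8, 500)

print(f"IT load: {it_load} W")
print(f"with cooling: {total_load:.0f} W")
```

Compare the total against what the circuit and the room's cooling can actually deliver before ordering anything.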
this this this!!!!
For example, the work I do with an HPC would need a monster DB able to handle millions of inserts a day. That needs bottom-tier Intel video chips but monster data interconnects (think 40gb per sec and up). But someone doing oil topographical analysis or making a movie may want top-of-the-line Nvidia Quadro cards, lots of memory, and minimal disk space.
HPC runs the gamut of what is out there.
For about 50k I am sure you could build something from HP or Cisco that is in the 100-200 CPU range. But what are you going to do with it? What sort of network interconnects are you looking at? What sort of storage do you need? If you need, say, 500k sustained IOPS, 50k will not cut it (start thinking in the 400k to 1 million range).
That just gets you the hardware. Do you need a particular bit of software? What is that going to cost up front, and what is the ongoing cost? For example, something like Splunk can cost several hundred k per month in the right environment.
Without the specs of what you are doing I would be randomly guessing what you need.
My advice? Start with a prototype on bottom-rung 'crap' 'Costco special' hardware. Work your way up and decide what you need. Perhaps hire someone who knows how to plug this all together. Having done this a few times, I can say it is a challenge just to manage 5000 bits of hardware all showing up one day and getting it all put together. Finding a location and power sources can actually be a challenge. Depending on how big it is, you may not be able to plug it into your building's mains. I suggest starting with a high-level design, then working your way down to the detailed designs. But most of all HIRE SOMEONE who knows this stuff. There are thousands of people out there who need a job and can do EXACTLY this sort of thing.