Ask Slashdot: Best Bang-for-the-Buck HPC Solution?

Posted by Soulskill on Saturday July 18, 2015 @09:03AM from the 14,000-raspberry-pis dept.

An anonymous reader writes: We are looking into procuring a FEA/CFD machine for our small company. While I know workstations well, the multi-socket rack cluster solutions are foreign to me. On one end of the spectrum, there are companies like HP and Cray that offer impressive setups for millions of dollars (out of our league). On the other end, there are quad-socket mobos from Supermicro and Intel, for 8-18 core CPUs that cost thousands of dollars apiece.

Where do we go from here? Is it even reasonable to order $50k worth of components and put together our own high-performance, reasonably-priced blade cluster? Or is this folly, best left to experts? Who are these experts if we need them?

And what is the better choice here? 16-core Opterons at 2.6 GHz, 8-core Xeons at 3.4 GHz? Are power and thermals limiting factors here? (A full rack cupboard would consume something like 25 kW, it seems?) There seems to be precious little straightforward information about this on the net.

Look for other users of the S/W for advice by peterjt · 2015-07-18 09:13 · Score: 5, Insightful

Why not start by looking at what S/W you plan to run, and then see what advice is available from its vendor (and from other users) as to what H/W they would recommend?
1. Re:Look for other users of the S/W for advice by JamesTRexx · 2015-07-18 09:22 · Score: 5, Insightful
  
  Precisely this. Do not look at the hardware for hardware's sake; look at what the software needs in order to run as well as it can. Does it benefit from parallelism? Throw tons of Opteron cores at it. Does it benefit from speed? Get the fastest Intels. Can it do everything in RAM? Stuff the servers with it, etc. Also, if it is built to scale, start with one or two servers, then see what kind of load it causes and base the next nodes you add on that data. You might even want to consider starting off with a virtual environment for portability to other hardware or cloud providers.
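  One rough way to frame the "parallelism or clock speed" question is Amdahl's law. The sketch below is illustrative only: the parallel fractions are made-up values, not measurements of any FEA/CFD package, and `amdahl_speedup` is just a name chosen here.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `parallel_fraction` of the runtime parallelises."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions; measure your own application instead.
    for p in (0.50, 0.90, 0.99):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.1f}x" for n in (8, 16, 64))
        print(f"parallel fraction {p:.2f} -> {row}")
```

  If the measured parallel fraction is low, extra cores buy little and faster cores win; if it is close to 1, core count matters far more.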
  
  --
  home
2. Re:Look for other users of the S/W for advice by Anonymous Coward · 2015-07-18 10:37 · Score: 4, Interesting
  
  this this this!!!!
  For example, the work I do on an HPC system would need a monster DB able to handle millions of inserts a day. That needs only bottom-of-the-rack Intel video chips but monster data interconnects (think 40 Gb per sec and up). But someone doing oil topographical analysis or making a movie may want top-of-the-line Nvidia Quadro cards, lots of memory, and minimal disk space.
  HPC runs the gamut of what is out there.
  For about $50k I am sure you could build something from HP or Cisco that is in the 100-200 CPU range. But what are you going to do with it? What sort of network interconnects are you looking at? What sort of storage do you need? If you need, say, 500k sustained IOPS, $50k will not cut it (start thinking in the $400k to $1 million range).
  That just gets you the hardware. Do you need a particular bit of software? What is that going to cost up front, and what is the ongoing cost? For example, something like Splunk can cost several hundred k per month in the right environment.
  Without the specs of what you are doing I would be randomly guessing what you need.
  My advice? Start with a prototype of bottom-rung 'crap' 'Costco special' hardware. Work your way up and decide what you need. Perhaps hire someone who knows how to plug this all together. Having done this a few times, I can say it can be a challenge just to manage 5000 bits of hardware all showing up one day and getting it all put together. Finding a location and power sources can actually be a challenge. Depending on how big it is, you may not be able to plug it into your building's mains. I suggest a high-level design, then work your way down to lower-level designs. But most of all HIRE SOMEONE who knows this stuff. There are thousands of people out there that need a job that can do EXACTLY this sort of thing.
3. Re:Look for other users of the S/W for advice by Anonymous Coward · 2015-07-18 11:41 · Score: 2, Insightful
  
  I will third this. I will also state that I was directly involved in building a home grown cluster that was highly ranked in the Top500 List a little over a decade ago.
  You MUST begin with needs analysis and that goes WAY beyond just looking at research domains, in this case FEA and CFD. You have to know what software you want to run. You must also research and find out if there are alternatives to what software you currently run (or are initially planning to run) that may have modern competitors that run more efficiently.
  I will also note that FEA and CFD have different resource needs, and therefore different hardware configurations that would be optimally suited for those tasks (I think someone else in this thread has already brought this point up below), so if you do want to run both types of software packages on the same machine you will be making some compromises on efficiencies and configuration to do that. Most of the FEA code that our researchers ran was run on single-system image, shared memory machines (SGI), not an MPI-based, distributed memory cluster where the CFD and MD/QD folks get their best bang for the buck. I don't know how much that has changed in the last few years, but I would imagine, not much.
  I will keep an eye on this thread over the next couple of days. If the OP wishes to contact me I'd be happy to help them work on this challenge. We can figure out how to connect if I get a reply to this post.
4. Re:Look for other users of the S/W for advice by sumdumass · 2015-07-18 12:40 · Score: 2
  
  Just wanted to add, don't stop at the recommendations the software vendor suggests.
  I had a client who decided to go with the hardware recommendations provided by the software vendor against my objections. Six months after we were up and running, the software which was the entire point of the ordeal released an update that slowed everything down enormously. Turns out, their "recommended" hardware specs were slightly better than their minimum specs on the new version of the software and the server had also been purposed to do a few other minor things that ran in conjunction with the software. You might as well say it was the minimum.
  So by stopping at the recommendations of the software vendor, they were presented with a setback no one was really thinking about. They could either roll back the software version negating the support and upgrade purchase plan, suffer the slow speeds and hope the vendor doesn't slow it any more, or replace good hardware that they really didn't have a use for outside of the specific software. They eventually let me completely overkill a server to replace it.
  So unless the software vendor says there is a limit, reasonably increase the power and memory beyond their recommendations for future proofing. Just keep in mind you will want to eventually replace the hardware anyway, or else risk suffering downtime from the inevitable failures.
5. Re:Look for other users of the S/W for advice by KGIII · 2015-07-19 00:34 · Score: 2
  
  I made use of their services quite some years ago. This is a seconding for them. They became our go-to for hardware and hardware recommendations even while we were mostly a Sun shop. The reps were knowledgeable and polite. The service was top-notch. The after-sale support was surprisingly good. I have been out of the loop for about eight years as I am now retired but I keep my ear to the ground a little bit and have not heard anything that would make me inclined to believe they have changed.
  
  --
  "So long and thanks for all the fish."
Supercomputers are very workload specific by mdtiemann · 2015-07-18 09:15 · Score: 2

You mention you are interested in CFD. Intel Phi processors have been known to do well here: http://www.cfd-online.com/Foru... . In that linked story, a single Intel Phi processor beats a 1024 core cluster. Moreover, Thinkmate is literally giving away Intel Phi processors: http://www.thinkmate.com/syste... . But not all workloads fit the Phi, so you really need to do some benchmarking before you buy.
Get some quotes by hawkeyeMI · 2015-07-18 09:18 · Score: 2

Disclosure: I have worked for Penguin Computing in the past, though I currently have only a customer relationship with them (we use their Penguin on Demand service). I strongly recommend you talk to a few of the HPC vendors out there about your needs and get a few quotes. Obviously Penguin is one I recommend. I'm not sure who else is still in the business; I think at least one of the major ones I've gotten a quote from in the past went under. Just do a little googling. They are probably familiar with your applications and can get you a turnkey solution that's well-suited for your application.

--
Error 404 - Sig Not Found
Why not rent the time? by plopez · 2015-07-18 09:24 · Score: 3, Informative

You haven't said anything about your application. Do you run it continuously? Sporadically? Will the machine be sitting idle much of the time? Do you have the staff to support it? What about networking and storage? Do you have the ability to rapidly move and store data? The actual computing is only part of the story.
It may make more sense to rent the time, due to lower storage and maintenance costs, than to actually buy and maintain the infrastructure.

--
putting the 'B' in LGBTQ+
Haswell-EP Xeons by coats · 2015-07-18 09:24 · Score: 2

I would go with Haswell-EP Xeons -- probably 2697v3 (14 cores @ 2.6-3.6 GHz): a two-socket motherboard gives you 28 physical cores per board, for prices in the $12K range. Just one of these is quite a powerful system. If you can get by with a 2-node system, then 10GE interconnect is good enough (AND MUCH CHEAPER); for more nodes, you will need InfiniBand (since 10GE does not scale well). The 4-node/IB cluster will be on the order of $60K, and will offer more performance than a $160K solution of a couple of years ago.
These will offer far better performance than the Opteron solution.
Can you compile your own application? If so, use the Intel compilers, and make sure you compile targeting the Haswell instruction set (-O3 -xHost, or -O3 -march=core-avx2 -mtune=core-avx2, if I recall correctly): the full AVX2 Haswell instruction set is rather more powerful for your app than the predecessor "AVX" SandyBridge/IvyBridge instruction set, which is far more powerful than the previous Nehalem/Westmere SSE4.2 instruction set, which is somewhat more powerful than a simple "-O3". If you can't compile on your own, try to make sure the vendor's executables target AVX2; the right compile flags can roughly double your performance over a plain "-O3"...
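To put rough numbers on the instruction-set comparison, here is a back-of-the-envelope sketch in Python. The per-cycle figures are the commonly quoted theoretical double-precision peaks per core, and the calculation assumes (unrealistically) the same core count and clock for every generation so that only the vector width changes. Real FEA/CFD codes are often memory-bandwidth bound and clocks drop under heavy AVX load, so treat this as an upper bound, not a benchmark.

```python
# Theoretical peak double-precision FLOPs per cycle per core (commonly quoted figures).
DP_FLOPS_PER_CYCLE = {
    "SSE4.2 (Nehalem/Westmere)": 4,
    "AVX (SandyBridge/IvyBridge)": 8,
    "AVX2 + FMA (Haswell)": 16,
}


def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x FLOPs per cycle."""
    return cores * ghz * flops_per_cycle


if __name__ == "__main__":
    # 28 cores at the 2.6 GHz base clock, as in the two-socket board described above.
    cores, ghz = 28, 2.6
    for isa, per_cycle in DP_FLOPS_PER_CYCLE.items():
        print(f"{isa:30s} {peak_gflops(cores, ghz, per_cycle):8.1f} GFLOP/s peak")
```

The factor-of-two steps between generations are where the "double your performance" estimate comes from, provided the code actually vectorises.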

--
"My opinions are my own, and I've got *lots* of them!"
Amazon AWS by Cyberax · 2015-07-18 09:30 · Score: 4, Interesting

Unless you need to transfer A LOT of data from your cluster, Amazon AWS will probably be cheaper than dedicated hardware. Especially if you can use spot instances (that are 5-10 times cheaper than the regular Amazon EC2 instances).
1. Re:Amazon AWS by kimanaw · 2015-07-18 10:05 · Score: 2
  
  This. AWS has a GPU tier (kinda pricey, but probably cheaper than standing up an equivalent on your own). I'm guessing your FEA/CFD will probably need GPUs. $50K will rent a lot of GPU time. Not sure how available the spot instances for them are.
  otoh, if you're looking to use regular CPUs, Azure has an InfiniBand tier that may be a better interconnect for HPC purposes than AWS's 10 Gbps VPCs.
  
  --
  007: "Who are you?"
  Pussy: "My name is Pussy Galore."
  007: "I must be dreaming..."
See what you can do with leasing or cloud by garyisabusyguy · 2015-07-18 09:32 · Score: 2

There are plenty of costs beyond the actual computer, including power, power conditioning, battery backup, heat removal, etc... that make up most of the cost.
If you still decide to build your own hardware, then pay close attention to
1. Compatibility with your chosen software, i.e. the best system in the world is worthless if it does not run the software that you want. If you are building your own software, you will still need to consider OS, compiler, libraries, etc.
2. Ability of the operating system to provide enough resources to your software. In the 'good old days' Windows only provided a limited amount of RAM to processes; even in today's world Windows systems swap aggressively and may not give you the RAM performance that you may see in the enterprise *nixes.
3. Internal bus structure of the system you choose. The biggest growth in PC hardware has been the internal bus width and speed. Look around, but for cost's sake, you will probably be using some variety of PCIe from Intel. You will probably also see better integration with the PCIe bus with Intel chips. If you are using GPU accelerators, that is a whole 'nother kettle of fish that will affect your other decisions above and below.
4. Methods provided for disk access. It used to be that Fibre Channel was king, but times have changed, with iSCSI making inroads, and local disk architecture provides the greatest bang for the buck, with SATA starting to edge out SCSI. If you go the SAN or iSAN routes, there will be additional costs for rack space, power and cooling.
5. Disk system that you choose. Most people would suggest butt-loads of local SSD; after RAM, solid state drives will probably be your highest cost.
Just my two bits, plus I completely ignored tape systems vs spinning-disk hard drives for backup, which would add more rack space, power supply and cooling to anything that you try to put together. Try to put together a realistic estimate for purchasing and supporting your hardware for a couple of years and compare it to cloud cost for similar resources.
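A minimal sketch of that own-versus-cloud estimate, in Python. Every number in the example run is a placeholder (hardware price, power draw, overhead factor for cooling and power conditioning, electricity price, support cost, cloud hourly rate), and the function names are hypothetical. It assumes the owned machine stays powered all year while cloud nodes are billed only for busy hours.

```python
HOURS_PER_YEAR = 24 * 365


def owned_cost(hardware_usd, avg_power_kw, overhead_factor,
               usd_per_kwh, support_usd_per_year, years):
    """Purchase price plus energy (including cooling overhead) plus yearly support."""
    energy_kwh = avg_power_kw * overhead_factor * HOURS_PER_YEAR * years
    return hardware_usd + energy_kwh * usd_per_kwh + support_usd_per_year * years


def cloud_cost(usd_per_node_hour, nodes, busy_hours_per_year, years):
    """Pay only for the node-hours actually used."""
    return usd_per_node_hour * nodes * busy_hours_per_year * years


if __name__ == "__main__":
    # Placeholder inputs; substitute real quotes and measured utilisation.
    own = owned_cost(hardware_usd=50_000, avg_power_kw=2.0, overhead_factor=1.5,
                     usd_per_kwh=0.10, support_usd_per_year=5_000, years=2)
    rent = cloud_cost(usd_per_node_hour=1.0, nodes=4,
                      busy_hours_per_year=2_000, years=2)
    print(f"own:  ${own:,.0f}")
    print(f"rent: ${rent:,.0f}")
```

The comparison swings mostly on utilisation: the closer the busy hours get to a full year, the better owning looks.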

--
Wherever You Go, There You Are
small cluster: performance/price metric by lkcl · 2015-07-18 09:35 · Score: 5, Interesting

i did this before, on a very small scale, for GBP 1,000 about 10 years ago. sales teams kept offering me 2ghz dual-core machines at GBP 300 each and i had to tell them this:
"look, i have a budget of 1,000 GBP. you're offering me a 2ghz system for 300. so i can only buy 3 machines, right? so that's a total of 6 ghz of computing power. on the other hand, if i buy this GBP 125 machine which has only a 1ghz processor, i can get 8 of those, which gives a total of 8 ghz of computing power. so _why_ would i want FASTER?"
so i bought qty 8 of motherboard, CPU, 128mb RAM, low-cost case containing a PSU already, and accidentally included a 3com network card because i didn't realise that the built-in ethernet on the motherboard could do PXE boot..... but still, all-in that was 125 GBP and each one took 15 minutes to assemble so it was no big deal. got myself 8ghz of raw computing power, which was the best that i could get for the money that i had.
and that's the question that you have to ask yourself. what's the highest performance / price metric that can be achieved?
the highly specific problem that i was endeavouring to parallelise was a very small memory footprint non-I/O-bound task: running the NIST.gov Statistical Test Suite. i booted all 8 machines off of my laptop, over PXE boot with an NFS read-only root filesystem. had to wait 30 seconds between each because my 800mhz P3 laptop with 256mb of RAM reaaallly couldn't cope with 8 machines hammering it... not over a 100mbit/sec link, anyway.
once started, i wrote a script that ssh'd into each and left them running the STS for a day at a time. very little actual data was generated: a report.
but the issue that you're solving may involve huge amounts of disk I/O, it may involve huge amounts of inter-connectivity (inter-dependence between the parallel tasks). you may even have to use a GPU (OpenCL) if it's that computationally expensive... ... and that's where anyone's advice really ends, because unless you know exactly what it is you need to do - in real, concrete terms of I/O per second, GFLOPs/sec, GMACs/sec, inter-communication/sec, you really can't and shouldn't even remotely consider spending any money.
so please consider writing a spreadsheet, based on the performance/price metric, extending it to the domain(s) that you're interested in optimising. then the answer about what to buy should be fairly self-evident.
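a minimal sketch of that spreadsheet in python, using the two quotes from the story above. aggregate GHz is only the crude proxy that suited this one workload (it mirrors the arithmetic in the story, which counts clock speed per machine), so swap in whatever metric fits yours: GFLOPs, memory bandwidth, I/O per second. the `Option` class and `rank_by_total` function are names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    unit_price_gbp: float
    ghz: float  # stand-in performance metric; replace with one that fits your workload


def rank_by_total(options, budget_gbp):
    """Return (name, machines affordable, total metric), best total first."""
    rows = []
    for opt in options:
        count = int(budget_gbp // opt.unit_price_gbp)
        rows.append((opt.name, count, count * opt.ghz))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    quotes = [Option("2 GHz box", 300, 2.0), Option("1 GHz box", 125, 1.0)]
    for name, count, total in rank_by_total(quotes, budget_gbp=1000):
        print(f"{name}: {count} machines, {total:.0f} GHz total")
```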
oh and don't forget to include the power budget (and cooling) because i think it will shock the hell out of you. remember you need to include the maximum specs, not the "average" or "scenario design power".
*LOTS* of info on the net by gavron · 2015-07-18 10:41 · Score: 2

The problem is that you don't know what you're looking for so you're not asking the right questions.
- Power is a factor. You mention 25 kW. Wrong units. You should look for kVA. You'll never know what the wattage is until you know the power factor (PF) and you won't know that until you populate the device with spindles and fans (which have a different PF than CPUs, GPUs, PSUs,) and then run it under load and measure.
- 25 kVA is a medium rack. 35-50 kVA is a dense rack. How many racks you choose to have is up to you, but the "25" number is not a good random one to shoot for. If you search for "30 kVA" and "High density rack" you'll get an idea of what servers do populate such things.
- You won't be running anything of this magnitude at your deskside, unless you are in Alaska or Siberia and have no other source of heat. Also most businesses don't like running 4 30A 3-phase 208VAC to employees' desksides. Just sayin'... And again, if you're not in Alaska or Siberia with an open door and window, you won't move enough air through your office to cool that beast. (Air mass is directly related to cooling, and unless you're doing dielectric-immersion cooling, the sheer amount of air requires massive fans and lots of space.)
- Two other responses said "See what your software vendor says." Software is abstracted by compilers. The real question is "how much CPU, GPU, DISK, or other IO does it do" and plan for that. That will also change the PF and the kW and the heat load.
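For anyone who wants to play with these numbers, here is a rough Python sketch of the standard conversions: real power is apparent power times power factor, a balanced three-phase circuit carries sqrt(3) x line voltage x current, and each watt of load becomes about 3.412 BTU/hr of heat. The 0.95 power factor is a placeholder, not a measurement, and no breaker derating is applied.

```python
import math

BTU_PER_HR_PER_WATT = 3.412


def real_power_kw(apparent_kva: float, power_factor: float) -> float:
    """Real power (kW) from apparent power (kVA) and power factor."""
    return apparent_kva * power_factor


def three_phase_kva(line_voltage: float, amps: float) -> float:
    """Apparent power of one balanced three-phase circuit, before any derating."""
    return math.sqrt(3) * line_voltage * amps / 1000.0


def heat_btu_per_hr(kw: float) -> float:
    """Heat that has to be removed, assuming all electrical power ends up as heat."""
    return kw * 1000.0 * BTU_PER_HR_PER_WATT


if __name__ == "__main__":
    circuit = three_phase_kva(208, 30)   # one 30A 3-phase 208VAC circuit
    total_kva = 4 * circuit              # the four circuits mentioned above
    kw = real_power_kw(total_kva, 0.95)  # placeholder power factor
    print(f"per circuit: {circuit:.1f} kVA, four circuits: {total_kva:.1f} kVA")
    print(f"real power at PF 0.95: {kw:.1f} kW, heat: {heat_btu_per_hr(kw):,.0f} BTU/hr")
```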
There's a reason nobody builds deskside compute servers with today's technology. Density, power, and cooling.
Keywords to google: kVA, PF, kW, high density rack server, PUE (PUE, power usage effectiveness, is total facility power divided by IT equipment power; it is applied to an entire data center, which includes cooling.)
Other places to look: look up abstracts for talks at Data Center World.
Re:I advise against it by lenart · 2015-07-18 11:17 · Score: 3, Funny

Save your money and use it to move somewhere without Fag Marriage.
You can marry a cigarette where you live?
Here is what I did... by Taz1672 · 2015-07-19 14:06 · Score: 2

My company decided it wanted a new FEA machine. They decided to stay with the existing software company, so I called up the company, explained the situation, and asked for the department that provided pre-sales support, specifically hardware recommendations. Turns out they had a strong bench of people ready to help with that and detailed Known Good configurations for each major hardware company. We simply looked at the software licensing costs, the hardware costs and how long our average scenario would take under various software/hardware configs, and sized it to handle the existing number of jobs plus expected growth.
We decided what we could live with in terms of how long an average job would take (we decided we could live with 24 hours as an average). We then decided what sort of tradeoffs we could make in terms of hardware (an up front sunk cost amortized over many years) versus the annual software license fees. A little more spent on hardware up front meant we could save on software licensing costs by taking a step down in numbers of processors permitted. We then presented this decision to their presales people to get it vetted and asked for suggestions. In our case we took the suggestion that we archive saved results to our enterprise grade disk array and put the money into RAID 0 SSDs to speed up the overall job time. As always, RAM was the cheapest upgrade so we maxed that out.
Everyone signed off, we took the specs to a local system builder with a good reputation, told them no changes of ANY sort to any component, negotiated a price that included acceptance testing to ensure compliance and made sure they had enough of a profit margin so as to discourage shortcuts. They delivered, we tested, it gave expected performance results, we accepted it and paid them. That system is installed today and delivers the results it was designed for.
I would suggest a similar course for you.
1. Decide on the software first. Make darn sure it will do what you want.
2. Decide on how fast jobs need to be finished and how many per week/month/year to prevent over-speccing.
3. Call presales support to get hardware recommendations
4. Make the decision on hardware cost versus software licensing cost versus number of jobs to be done
5. Do your homework! Understand what you are speccing, talk to others, particularly customers.
6. Take your newfound knowledge back to presales a few times to make sure you did not miss any improvements and you truly understand what you are doing. You are betting your job on this.
7. Find a builder, local or the hardware manufacturer, negotiate terms. Make sure you leave a decent profit margin to avoid the temptation to skimp.
8. Test, test, test. Confirm all configuration decisions with presales support.
9. Pay 'em and install the machine.
10. Don't forget to follow up to ensure it continues to work as designed and that procedures are being followed. In my case, we checked that all runs were backed up on our enterprise disk array.
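Step 2 can be turned into a quick capacity check. The Python sketch below is only a rough illustration: it assumes each job occupies one whole machine for its full runtime, and the job counts, growth rate and 70% target utilisation are made-up inputs. Only the 24-hour average job time comes from the account above, and the function names are hypothetical.

```python
import math

HOURS_PER_WEEK = 168


def jobs_after_growth(jobs_per_week: float, annual_growth: float, years: int) -> float:
    """Projected weekly job count after compound yearly growth."""
    return jobs_per_week * (1.0 + annual_growth) ** years


def machines_needed(jobs_per_week: float, hours_per_job: float,
                    target_utilization: float = 0.7) -> int:
    """Machines required if each job ties up one machine for hours_per_job."""
    busy_hours = jobs_per_week * hours_per_job
    usable_hours_per_machine = HOURS_PER_WEEK * target_utilization
    return math.ceil(busy_hours / usable_hours_per_machine)


if __name__ == "__main__":
    today = machines_needed(jobs_per_week=10, hours_per_job=24)
    later = machines_needed(jobs_after_growth(10, annual_growth=0.2, years=3), 24)
    print(f"machines needed today: {today}, in three years: {later}")
```

Running the same check with the job times presales quotes for each candidate configuration shows quickly whether a cheaper config still keeps up with the workload.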