Ask Slashdot: Best Use For a New Supercomputing Cluster?

← Back to Stories (view on slashdot.org)

Ask Slashdot: Best Use For a New Supercomputing Cluster?

Posted by Unknown on Tuesday September 13, 2011 @09:32AM from the reclaim-heat-for-silicon-diner dept.

Supp0rtLinux writes "In about 2 weeks time I will be receiving everything necessary to build the largest x86_64-based supercomputer on the east coast of the U.S. (at least until someone takes the title away from us). It's spec'ed to start with 1200 dual-socket six-core servers. We primarily do life-science/health/biology related tasks on our existing (fairly small) HPC. We intend to continue this usage, but to also open it up for new uses (energy comes to mind). Additionally, we'd like to lease access to recoup some of our costs. So, what's the best Linux distro for something of this size and scale? Any that include a chargeback option/module? Additionally, due to cost contracts, we have to choose either InfiniBand or 10Gb Ethernet for the backend: which would Slashdot readers go with if they had to choose? Either way, all nodes will have four 1Gbps Ethernet ports. Finally, all nodes include only a basic onboard GPU. We intend to put powerful GPUs into the PCI-e slot and open up the new HPC for GPU related crunching. Any suggestions on the most powerful Linux friendly PCI-e GPU available?"

19 of 387 comments (clear)

Lost some funding? by turkeyfeathers · 2011-09-13 09:35 · Score: 5, Funny

Start with the cheapest backend that'll get the system up and running, then use your supercomputer to mine Bitcoins for a few days, then use all the money you'll make to buy the InfiniBand backend (you'll probably have enough money left over to buy Monster cables to hook everything up).
1. Re:Lost some funding? by Anonymous Coward · 2011-09-13 10:30 · Score: 3, Informative
  
  Maybe the mods are a little more aware than you of the engineering and scientific FACTS about Monster Cable. Some things that you said:
  
  Monster cables are only worth the investment for speakers and line-level / mic stuff (i.e. analogue signals). [...] But 44.1KHz 16-bit sound, converted to analogue in the transport and sent to the amp via line leads WILL benefit from Monster / premium cables, as will speaker cables of any kind.
  are, I'm afraid, complete nonsense. Counterfactual, in fact. And yes, there's real science to support that. Let me gloss over it...
  A 44.1 kHz sample rate before the DAC means the maximum frequency component the cables need to handle is 22 kHz. (This is due to the Nyquist limit, as in the Nyquist-Shannon Sampling Theorem.) 22 kHz is low. Really low. Practically any old piece of wire can carry audio frequencies with perceptually flat response across the audible range and nearly no loss as long as the cable lengths are as short as they are in a typical home stereo system. The only thing you need is large diameter wire for your speaker cables to ensure they're very low resistance so that the higher currents involved in powering a speaker don't cause resistive loss in the cable.
  As for low-power line level signals (such as CD player to amp), the most likely source of problems is actually ground loops, where the source equipment has a different ground reference than the destination. (A lesser concern is interference.) The pros don't solve this with stupid Monster Cable, they solve it by using pro equipment with balanced (differential) signaling, which both eliminates the need for the source and destination to have a common ground and provides some noise immunity.
  For home stereo systems, however, making sure that everything is grounded to the same point (3 prong plugs all plugged into a single grounded power strip) is generally good enough, and noise is rarely (if ever) a significant problem.
I call Shenanigans!!! by sconeu · 2011-09-13 09:36 · Score: 5, Insightful

No way in hell a project that big gets approved without a rationale.
And no way in hell the administrator of such a project would ask Slashdot what to do with it.

--
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Ummm two things by Sycraft-fu · 2011-09-13 09:36 · Score: 3, Insightful

1) Something with 10gb really isn't a "supercomputer" it is a cluster. Fine, but call it what it is. I really wouldn't call a cluster with Infiniband a supercomputer either.
2) You really should maybe get someone who knows more about your project and someone who knows more about clusters/supercomputers. The questions you are asking are not ones I would want to see form the guy making the choices on a multimillion dollar project.
Uh oh.. by joib · 2011-09-13 09:36 · Score: 4, Insightful

Shouldn't you have figured out answers too all these (simple) questions before ordering several million $$$worth of hardware? Sheesh.. As for you specific questions: - IB vs. 10GbE: IB hands down. Much better latency and more mature RDMA software stacks (e.g. for MPI and Lustre). Cheaper and higher BW as well. - GPU: NVidia Fermi 2090 cards. CUDA is far ahead of everything else at the moment.
Re:Riiiiight by PPH · 2011-09-13 09:43 · Score: 3, Funny

Happens to me when I visit Costco all the time.

--
Have gnu, will travel.
EPIC TROLLING by jpedlow · 2011-09-13 09:44 · Score: 4, Insightful

Wow, he just TROLLED THE CRAP out of slashdot. We mad, bros!
What we do ... by Anonymous Coward · 2011-09-13 09:47 · Score: 4, Informative

Similar size setup in bio-informatics in Europe. We run redhat 6.1, was centos 5 and LSF. single 1gbit to each server (blades). No need for 10gb or IB unless huge mpi which no one uses. 32GB to 2TB per node - some people like enormous R datasets. All works well for our ~500 users.
Re:So little detail by webmistressrachel · 2011-09-13 09:47 · Score: 3, Interesting

No it's not, some really ugly, nerdy guy out there has a big cock and nobody is interested in him - he can't just flop it out in public, so that might be a very real problem for him! Or maybe he does, and girls only want him for that?
Back on topic, it's not like that at all because the computer is probably real, and if not, it's just another hypothetical "Ask Slashdot" for us to fantasize over. "What would you do if you had...". What's wrong with that? Just my 2 pence!

--
This tagline was transcoded to result in at least one smirk. If you experience failure to smirk, please consult your Gen
Totally believable. by khasim · 2011-09-13 09:50 · Score: 3, Interesting

I totally believe the submitter's question.
Next up on Ask Slashdot:
I just got permission to buy the biggest fleet of trucks on the east coast ... and I was wondering if anyone on Slashdot had any ideas what I should do with them.
Followed by,
The company I work for just purchased 10,000 acres of land on the east coast and I was wondering if anyone on Slashdot had any idea what we should do with it.
Happens all the time!
1. Re:Totally believable. by blair1q · 2011-09-13 10:47 · Score: 3, Interesting
  
  Actually, it does.
  I remember taking possession of a spanking-new Thinking Machines cluster some <mumble> years ago.
  The principal investigator got it to do one particular calculation, and promised the excess would be put to good use.
  We spent our time trying to figure out what "good use" meant in that context.
  It hasn't got much easier.
  I say if you run out of numbers to crunch of your own, these days, just hook it up to some lucky grid-computing project and let it swamp the stats.
While I find this highly doubtful.... by xzvf · 2011-09-13 10:01 · Score: 3, Interesting

I've seen government institutions have unallocated money at the end of some budget cycle, that was so micro-managed that it could only be spent on a certain type of widget. I can see a university get a late grant, that had to be spent in 30 days, could only be spent on technology, that can only come out of a pre-approved catalog, and some administrative type that just saw a Top 500 super-computer list with competing university names on it, bring up in a meeting that we should build a super computer, and some grad assistant saying how easy it would be. They found a room with a window in it and ordered a bunch of parts, and will walk prospective students and their parents by it saying "This is the largest super-computer on the east coast".
1. Re:While I find this highly doubtful.... by robotkid · 2011-09-13 18:00 · Score: 3, Insightful
  
  Ever wonder why the option at the end of every damn Government spending cycle to NOT spend the money is never an option to choose? Like we have to wonder how the hell we ended up trillions of dollars in debt.
  Sad to say, I've seen Government "last-minute" spending like this too, but not exactly to this level of magnitude. This is a shitload of money "left over". This may have come from somewhere, but "budget" obviously had nothing to do with it.
  Yeah, I used to wonder that too. Then my wife got a job in state government. And the answer became painfully obvious judging by the maximum pace at which stuff gets done even when you have people willing to work hard and important problems sitting right in front of you. If you allowed unspent money to roll over indefinitely, that would create an irresistible incentive to do the cheapest job that won't get you in trouble and then hoard, hoard that money. Heck, you could stretch that 3-year project into a 5-year one by doing it very slowly. You could build up a war chest and use it on pet projects that noone approved. Or you could wait till no-one even remembers the project existed anymore and then embezzle it.
  So as inefficient as it is, the blanket rule that all money must be spent the year in which it is allocated is a simple way to increase transparency and accountability across the board. It may even be one of the driving forces anything gets done remotely on schedule in an environment where purchasing a USB cable requires 2 requisition forms, 3 vendor quotes, the signature of your boss (who is in an all-day meeting), your boss's boss (who is talking with legislators today and can't be disturbed), and pre-approval from someone in accounting (who just went on vacation yesterday).
  Of course, it would be great if getting the job done on time and under-cost were somehow rewarded. But that's incentivizing success, that's the profit maximizing, the corporate bottom line, whereas the the Gub'ment bottom line is minimizing "embarrassment" (be it from the media, the voting public, and especially legislators on the appropriations committee). You use a Gub'ment bureaucracy for things you can't trust the for-profit world to do on their own, so the service provided has to be somewhat divorced from the revenue stream if you want to ensure more reliable results than just contracting out to a private company. (I'm sure Ron Paul would beg to differ, but then again he also probably enjoys being able drink water out of the tap without getting sick). You wouldn't pay a health inspector, for example, just based on the number of sites inspected per day because that encourages as cursory a job as possible on as many sites as possible. Instead, you set a minimum quota they have to fulfill, and then make it known you'll have their head on a platter if a restaurant shows up in the news for salmonella poisoning the week after you've signed off on it. That's the Gub'ment way. .. .
Cluster software & GPU experence by PAPPP · 2011-09-13 11:11 · Score: 5, Informative

I assume this is an epic troll, but am going to give an honest answer anyway, because there are some legitimate questions buried in there.

I work with a aggregate.org a university research group which has a decent claim to having built the very first Linux PC Cluster, set some records with them (KLAT2 and KASY0 were both ours), and still operates a number of Linux clusters, including some containing GPUs, so I feel like I have some idea of the lay of cluster technology. It is *way* overdue for an update (and one is in progress, we swear!), but we also maintain TLDP's widely circulated Parallel Processing HOWTO, which was the goto resource for this kind of question for some time.

In a cluster of any size, you do _not_ want to be handling nodes individually. There are several popular provisioning and administration systems for avoiding doing so, because every organization with a large number of machines needs such a tool. The clusters I deal with are mostly provisioned with Perceus with a few ROCKS holdovers, and I'm aware of a number of other solutions (xCat is the most popular that I've never tinkered with). Perceus can pass out pretty much any correctly-configured Linux image to the machines, although It is specifically tailored to work with Caos NSA (Redhat-like), or GravityOS (a Debian derivative) payloads. Infiscale, the company that supports Perceus, releases the basic tools and some sample modifiable OS images for free, and makes their money off support and custom images, so it is pretty flexible option in terms of required financial and/or personnel commitment. The various provisioning and administration tools are generally designed to interact with various monitoring tools (ex. Warewulf or Ganglia) and job management systems (see next paragraph).
Accounting and billing users is largely about your job management system. Our clusters aren't billed this way, so I can't claim to have be closely familiar with the tools, but most of the established job management systems like Slurm, and GridEngine (to name two of many) have accounting systems built in.
The "standard" images or image-building tools provided with the provisioning systems generally provide for a few nicely integrated combinations of tools, which make it remarkably easy to throw a functioning cluster stack together.

As for GPUs... be aware that the claimed performance for GPUs, especially in clusters, is virtually unattainable. You have to write code in their nasty domain-specific languages (CUDA or OpenCL for Nvidia, just OpenCL for AMD) and there isn't really any concept of IPC baked in to the tools to allow for distributed operations. Furthermore, GPUs are also generally extroridnarly memory and memory bandwidth starved (remember, the speed comes from there being hundreds of processing elements on the card, all sharing the same memory and interface), so simply keeping them fed with data is challenging. GPGPU is also an unstable area in both relevant senses: the GPGPU software itself has a nasty tendency to hang the host when something goes wrong (which is extra fun in clusters without BMCs), and the platforms are changing at an alarming clip. AMD is somewhat worse in the "moving target" regard - they recently deprecated all 4000 series cards from being supported by GPGPU tools, and have abandoned their CTM, CAL, and Brook+ environments before settling on OpenCL, and only OpenCL. Nvidia still supports both their C
Yes, this is legit and no, we're not idiots by Supp0rtLinux · 2011-09-13 11:28 · Score: 5, Informative

For everyone that thinks I trolled slashdot... here's the quick backstory behind my question(s): Our organization received a grant to pay for this from a private philanthropist that has a medical issue that is currently being researched by one of our labs (this happens to us not to infrequently). We have an existing HPC of roughly 300 nodes and 1200 cores that's all 1Gbps connected and running Rocks 5.1. The grant money came in in two different payments. We used the first payment to buy the nodes (which are in route to arrive in 2 weeks or so). The second payment was going to pay for the GPU's and the extra infrastructure (storage is one thing we currently have plenty of... both SAN and NAS). Unfortunately, we hit two issues: 1) one of our more seasoned enterprise admins took a new job at Apple's new NC datacenter and 2) our cluster admin passed away from a heart attack about a week after the purchase was made. This put us into a bit of a holding pattern. We're in the process of replacing both of them, but in the meantime we A) have the equipment arriving soon and B) have the second round of the grant money in hand now. We're smart enough to know that we lost two very valuable resources and we decided to step back, pause, and re-evaluate. The servers are already bought. The infrastructure, interconnects, and GPU's are not. The old admin knew which GPU's he wanted; unfortunately we haven't found his research anywhere to know what and why. He had also planned to go with the latest release of Rocks, but only because he was very familiar with it. We know there are other options out there and we've no idea how well Rocks can scale. Additionally, I don't see an option for chargeback with Rocks (at least not from a Google search), plus we've heard they recently lost a core developer. Thus, we went to the Slashdot community for advice. So I've already seen some good info on the IB versus 10GbE question and its much appreciated. We're still looking for info on which Linux distro and which GPU to go for. We want to make the best decision we can and use the money as wisely as possible. But we also realize that we know what we don't know and thought the Slashdot community could provide some experience to help us make the right decisions.
1. Re:Yes, this is legit and no, we're not idiots by Anonymous Coward · 2011-09-13 12:31 · Score: 5, Funny
  
  "I've got 1200 servers shipping to me and my two best engineers are gone and we're not sure what to do with them when they get here."
  Best. IT horror story. Ever.
2. Re:Yes, this is legit and no, we're not idiots by Sgs-Cruz · 2011-09-13 16:08 · Score: 3, Interesting
  
  Are you at MIT and is your benefactor David Koch? Because in that case, we have some researchers up at the Plasma Science and Fusion Center that do simulation work that could definitely use access to a bigger cluster. As long as you can compile FORTRAN on it, the TRANSP runs and GYRO simulations that we do are already run on a (smaller) cluster. This falls under "energy research" and is way cool to boot.
  I'm not joking, if you are at MIT, please get in touch with Martin Greenwald (contact info on the PSFC staff page).
  
  --
  Karma: pi (Mostly due to circular reasoning in posts).
Re:SETI ! by Jarik+C-Bol · 2011-09-13 13:49 · Score: 3, Insightful

screw SETI, run folding@home and find the cure for cancer. We need that a little more than we need to stare at the sky, wishing someone would call from alpha centauri or some such place.

--
I've decided to Diversify my Holdings. I've divided my cash between my left and right pockets, instead of all in one.
OS, duh! by ThurstonMoore · 2011-09-13 14:17 · Score: 3, Funny

The obvious answer is Windows Server 2008 HPC.