Three-Mile-High Supercomputer Poses Unique Challenges
Nerval's Lobster writes "Building and operating a supercomputer at more than three miles above sea level poses some unique problems, the designers of the recently installed Atacama Large Millimeter/submillimeter Array (ALMA) Correlator discovered. The ALMA computer serves as the brains behind the ALMA astronomical telescope, a partnership between European, North American, and South American agencies. It's the largest such project in existence. Based high in the Andes mountains in northern Chile, the telescope includes an array of 66 dish-shaped antennas in two groups. The telescope correlator's 134 million processors continually combine and compare faint celestial signals received by the antennas in the ALMA array, which are separated by up to 16 kilometers, enabling the antennas to work together as a single, enormous telescope, according to Space Daily. The extreme altitude makes it nearly impossible to maintain on-site support staff for significant lengths of time, with ALMA reporting that human intervention will be kept to an absolute minimum. Data acquired via the array is archived at a lower-altitude support site. The altitude also limited the construction crew's ability to actually build the thing, requiring 20 weeks of human effort just to unpack and install it."
Drove three of my friends over Tioga Pass in the Sierra Nevada in the north of Yosemite. A couple of them had never been out of Louisiana. Between 8,000 ft and the summit of the pass at ~10,000 ft, I did the driving while everyone else suffered from altitude sickness. The only cure is to move to a lower elevation. Having grown up in the Sierras, I was used to the elevation, but if you're not acclimated, you're going to walk 20 feet and have to sit down to rest for 10 minutes.
There are three kinds of people in the world. Those that can count, and those that can't.
134 million processors, 140 kilowatts?!?
1 milliwatt per processor?
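A quick sanity check of that arithmetic, as a minimal sketch. It assumes both quoted figures are right and that the power is spread evenly across the processors, which ignores cooling, networking, and any other overhead. The 140 kW figure is the one quoted in this comment, not in the summary, and the function name is just for illustration.

```python
# Figures as quoted in the thread (assumed correct for this estimate).
PROCESSORS = 134_000_000      # "134 million processors" from the summary
TOTAL_POWER_WATTS = 140_000   # "140 kilowatts" as quoted in the comment


def power_per_processor_mw(total_watts: float, processors: int) -> float:
    """Average power per processor in milliwatts, assuming an even split."""
    return total_watts / processors * 1000


per_proc = power_per_processor_mw(TOTAL_POWER_WATTS, PROCESSORS)
print(f"{per_proc:.2f} mW per processor")  # prints "1.04 mW per processor"
```

So "about 1 milliwatt per processor" holds up as a rough average under those assumptions.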
I don't think the article mentioned redundancy either way... but consider what they did: they took pre-manufactured components, hauled them up 15,000 feet and installed them... not set them up. I'm sure somewhere in this process, short of hiring ALL first-year grads, they most likely introduced typical datacenter redundancies... load balancing, failover, arrays, etc...
The article is about the challenges of operating the components at such a high altitude. People who aren't used to high altitudes really can't work effectively up there, which significantly limits the pool of tech support personnel you can send.
More than one fiber would be needed. There are 50 antennas each with multiple fibers connected to the correlator. A lot of thought went into it and despite the complications it was simpler to put the correlator there than 'down the road'.
Clusters don't do load balancing in the sense that a datacenter would, and the nodes don't fail over (but the switches probably should in this case). If a node fails, it gets turned off and the cluster is slightly less powerful until it is replaced.
The only IT-related things from the actual article are:
Use SSDs.
Use bigger fans.
Seems kind of a waste to not put that in the blathering summary.
There are two types of people in the world: Those who crave closure
Great, now I need a new skill:
IT sherpas needed for new datacenter.
Or if they're having trouble finding a sysadmin, I'm available. My Spanish is decent and I have extensive experience in avoiding physical activity, thus reducing the need for oxygen. Email address above!
"When information is power, privacy is freedom" - Jah-Wren Ryel