Stanford Uses Million-Core Supercomputer To Model Supersonic Jet Noise
coondoggie writes "Stanford researchers said this week they had used a supercomputer with 1,572,864 compute cores to predict the noise generated by a supersonic jet engine. 'Computational fluid dynamics simulations test all aspects of a supercomputer. The waves propagating throughout the simulation require a carefully orchestrated balance between computation, memory and communication. Supercomputers like Sequoia divvy up the complex math into smaller parts so they can be computed simultaneously. The more cores you have, the faster and more complex the calculations can be. And yet, despite the additional computing horsepower, the difficulty of the calculations only becomes more challenging with more cores. At the one-million-core level, previously innocuous parts of the computer code can suddenly become bottlenecks.'"
Pfft. I can simulate supersonic jet noise just by overclocking my Radeon 7970.
I don't know. The word just popped into my head.
everything is in the subject
http://Lenny.com
4 great justice!
Fwoooooooooooosh. Fwoooooooooooooooooooooooooosh. KWEEEOW. Fwooooooooooooooosh.
Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
Pfft is my simulation of jet noise
That sounds amazingly similar to the sound I hear when slashdaughters make programming jokes.
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
Have gnu, will travel.
Slashdotters don't have sex, and so they cannot have slashdaughters. Ergo, slashdaughters do not exist. QED.
In Soviet Russia, Jesus asks: "What Would You Do?"
There's that sound again.
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
But searching for "5-d torus interconnect" gets you nothing on Wikipedia. Here's the explanation of the 2-dimensional version: http://en.wikipedia.org/wiki/Torus_interconnect
and the K computer by Fujitsu at RIKEN uses a 6-d (six-dimensional) torus network. So how does the 5-d torus interconnect lead to the 2**19 + 2**20 cores or possibly 2**17+2**18 CPUs? I'm not seeing it clearly in my head. Off to a paper napkin to sketch it out!
Each core connects to its neighbors in five dimensions: going forward or back in each dimension gives 10 links from one core to the 10 five-dimensional neighbors one hop away. But the number of cores factors only into twos and a single three (number of cores = 3 * 2^19), so I'm not seeing the construct...
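One way the numbers can work out, sketched in Python. Two things here are assumptions for illustration and are not from the article: that the torus links nodes rather than individual cores, with 16 compute cores per node, and the particular torus shape in `DIMS`. The point is only that torus dimensions need not be powers of two, so one dimension of length 12 is enough to supply the factor of three.

```python
from math import prod

# Assumed for illustration (not from the article): the torus links nodes,
# each node carries 16 compute cores, and the torus has this shape.
CORES_PER_NODE = 16
DIMS = (16, 16, 16, 12, 2)  # hypothetical 5-D torus shape


def neighbors(coord, dims):
    """Return the distinct nodes one hop away on a torus with wraparound."""
    result = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            shifted = list(coord)
            shifted[axis] = (coord[axis] + step) % size
            result.add(tuple(shifted))
    result.discard(tuple(coord))
    return result


nodes = prod(DIMS)
cores = nodes * CORES_PER_NODE
origin = (0, 0, 0, 0, 0)

print(nodes, cores)
print(cores == 3 * 2**19 == 2**19 + 2**20)
print(len(neighbors(origin, DIMS)))
```

With this shape the product comes to 98,304 nodes and 1,572,864 cores. Each node still has 10 links, but in a dimension of length 2 the forward and backward links reach the same node, so the sketch counts 9 distinct neighbors rather than 10.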
simulate the Matrix?
I can get it by flipping the switch on my amp and running a pick down the low E string on my Ibanez. They seriously needed a million processors? Sounds like government had a hand in that one.
*Repent!Quit Your Job!Slack Off!The World Ends Tomorrow and You May Die!
I believe Wikipedia still lists this as the world's fastest... which is in fact false. That title currently goes to ORNL's Titan.
I do not respond to cowards. Especially anonymous ones.
But what was the question?
You get some pretty interesting problems, when you increase the number of cores in your computer.
A couple of years ago, we replaced a 4-core IBM P5 with a 32-core HP DL 580. We tested it for a couple of months with just a user, or two, at a time. Then, we took a day and tested with the entire company (roughly 250 users). Thank goodness we did before we put it into production because, for some people, it was actually slower than the P5. It looked like it was going to be a disaster.
Fortunately, I had seen this problem before (on a Sequent Symmetry, of all things). I ran "strace" on the offending process, and sure enough, we were having problems with lock contention. We talked to our software vendor and, while it took a while for them to admit it was their problem (and probably cost us multiple thousands of dollars to have them fix it), they rewrote the code to use fewer locks. Problem solved.
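The vendor's actual change isn't described, so this is only a generic Python sketch of what "use fewer locks" can look like: instead of taking a shared lock on every update, each thread accumulates privately and takes the lock once to merge. All names are made up for illustration, and in CPython the interpreter lock hides most of the timing difference, so the sketch shows the structure rather than the speedup.

```python
import threading

N_THREADS = 8
N_ITEMS = 10_000


def run(worker_factory):
    """Run N_THREADS workers against one shared counter and return the total."""
    state = {"total": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker_factory(state, lock))
        for _ in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["total"]


def contended(state, lock):
    """Worker that takes the shared lock once per item (80,000 acquisitions)."""
    def work():
        for _ in range(N_ITEMS):
            with lock:
                state["total"] += 1
    return work


def batched(state, lock):
    """Worker that counts privately and takes the lock once (8 acquisitions)."""
    def work():
        local = 0
        for _ in range(N_ITEMS):
            local += 1
        with lock:
            state["total"] += local
    return work


print(run(contended), run(batched))
```

Both versions produce the same total. The second simply asks for the lock thousands of times less often, which is the kind of change that matters once 32 cores are all queuing for it.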
Sit, Ubuntu, sit. Good dog.
I was able to calculate the noise from the jet *inside the cabin* without so much as a calculator...
Atheist: Buddhist in a Prius
"The waves propagating throughout the simulation require a carefully orchestrated balance between computation, memory and communication."
This statement seems to imply the outcome of the simulation depends somehow on the tuning of the system hardware. That has dire implications for whatever method they are using.
If a simulation becomes non-deterministic depending on how the hardware communicates, and gives different solutions to the same problem because of that, then I would say it is not a good approach to computational bogodynamics.
Most of these CFD problems are time-marching problems governed by hyperbolic differential equations. Basically, the state of the fluid at some point x at time t is influenced only by the state of the fluid prior to that time. So when they march from t to t+delta(t), only the solution at the previous time step matters. Even in space, only a small region at t-delta(t) affects any given point at t. Such problems are inherently parallel in their data dependencies, which lends them to parallelism. This is not to minimize what they have achieved; if it were that easy, someone would have done it a long time ago. Physics governed by elliptic (and to some extent parabolic) equations is not that lucky.
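A toy illustration of that locality, in Python. This is not the researchers' scheme: it is 1-D linear advection with a first-order upwind step on a periodic domain, chosen as a stand-in because it is about the smallest hyperbolic problem there is. Each block needs only one "halo" cell from its left neighbour per time step, and the split run reproduces the single-domain run exactly.

```python
def step(block, left_halo, c):
    """One upwind step on a block; left_halo is the cell just left of block[0]."""
    padded = [left_halo] + block
    return [
        padded[i + 1] - c * (padded[i + 1] - padded[i])
        for i in range(len(block))
    ]


def run_single(u, c, steps):
    """March the whole periodic domain as one block."""
    for _ in range(steps):
        u = step(u, u[-1], c)  # periodic: left of cell 0 is the last cell
    return u


def run_split(u, c, steps, parts):
    """March the same domain as `parts` blocks with a one-cell halo exchange."""
    size = len(u) // parts  # assumes len(u) is divisible by parts
    blocks = [u[k * size:(k + 1) * size] for k in range(parts)]
    for _ in range(steps):
        # Halo exchange: each block reads the last cell of its left neighbour.
        halos = [blocks[k - 1][-1] for k in range(parts)]
        blocks = [step(b, h, c) for b, h in zip(blocks, halos)]
    return [x for b in blocks for x in b]


u0 = [1.0 if 4 <= i < 8 else 0.0 for i in range(32)]
print(run_single(u0, 0.5, 50) == run_split(u0, 0.5, 50, 4))
```

Because every cell sees the same inputs and performs the same arithmetic in both runs, the decomposition changes how fast the answer arrives, not what the answer is. The hard part at scale is keeping that halo exchange from dominating the run time.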
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
I imagine that this was done using direct numerical simulation, which is considerably more accurate than any other method. Since turbulent fluid flow is inherently transient, and also involves very tiny wiggles, the only way to fully resolve what is happening is by massive computer simulations such as this one. There is no point to a mathematical approximation, since they are trying to gain some insight into the foundations of the physics of sound. This wasn't just some for-the-hell-of-it simulation.
At the one-million-core level, previously innocuous parts of the computer code can suddenly become bottlenecks.
When they say this, they mean it. To put this in perspective: with 1,572,864 cores, an application which is 99.9999% scalable will use LESS THAN HALF of the hardware! Over 60% of the hardware will be tied up waiting for that 0.0001% of serial code to execute.
This problem is explained by Amdahl's law, an important (yet depressing) observation which shows just how difficult writing an effective parallel algorithm actually is -- even when you're only writing for 4 cores.
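For anyone who wants to check the arithmetic, here is Amdahl's law as a few lines of Python. The formula is the standard one; the function and variable names are mine, and the 0.0001% serial fraction is the figure from the comment above.

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's law: speedup = 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


CORES = 1_572_864
SERIAL = 1e-6  # 0.0001% of the work is serial, i.e. "99.9999% scalable"

speedup = amdahl_speedup(SERIAL, CORES)
efficiency = speedup / CORES

print(f"speedup    ~ {speedup:,.0f}x")
print(f"efficiency ~ {efficiency:.1%}")
print(f"idle share ~ {1 - efficiency:.1%}")
```

The efficiency comes out at roughly 39 percent, so a little over 60 percent of the machine is effectively waiting on the serial part.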
It's cooler. Look up "compute server".
Is there a system that can handle a 3000 ship EVE Online battle with no lag?
Slashdotters don't have sex, and so they cannot have slashdaughters. Ergo, slashdaughters do not exist. QED.
Slash has had sex with many, many women over the years.
I'm sure he has at least a few Slashdaughters.
Re: I'm more interesting in how the headline writer got from "1,572,864 cores" to "million core". Rounding down to the nearest million? ;>)
I think the achievement was surpassing the arbitrary limit of "one million cores" in a cluster or parallel environment. The same way that people like to celebrate milestones of 10^3 somethings or multiples of {365,365,365,366} added together in ratios of approximately 4 to 1. And yes, that does (or should) make you "more interesting"! (You said "I'm more interesting..." rather than "I'm more interested in".)
in ratios of approximately 4 to 1
Shouldn't that be in ratios of 3 to 1 approximately? Responding to myself to catch the error of leap year frequency!
My first tower desktop left dark dust-marks against the wall where the fans were. Told my parents that I forgot to turn the after-burners off after take-off.
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Such simulations use double precision for accuracy. You get precision problems if you use only 32-bit floating point, because the tiny differences between approximated values amplify over every time step. The goal of this project was to model turbulence and how it could be reduced by adding grooves to the engine exhausts. Turbulence is almost fractal in nature - the closer you look at any volume in space, the smaller the vortex tubes get, right down to atoms spinning round each other. Because there is more turbulence closer to the surface of the aircraft, they use multi-grid methods where the volume of space right next to the aircraft has the highest grid resolution, down to the nearest centimetre or lower.
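A small Python demonstration of how rounding error builds up over many steps. It is not a fluid simulation, just a clock advanced by a fixed time step 100,000 times, with 32-bit arithmetic emulated by rounding through `struct`. The step size and step count are arbitrary choices for illustration.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(dt, steps, single):
    """Advance t by dt `steps` times, optionally rounding to 32 bits each step."""
    if single:
        dt = to_f32(dt)
    t = 0.0
    for _ in range(steps):
        t += dt
        if single:
            t = to_f32(t)
    return t


DT, STEPS = 1e-4, 100_000  # exact answer would be 10.0
t64 = accumulate(DT, STEPS, single=False)
t32 = accumulate(DT, STEPS, single=True)
err64 = abs(t64 - 10.0)
err32 = abs(t32 - 10.0)

print(f"64-bit error: {err64:.3e}")
print(f"32-bit error: {err32:.3e}")
```

The 32-bit error ends up many orders of magnitude larger than the 64-bit one, and that is just from adding a constant. A turbulence code feeds each step's error into the next step's physics, which is worse.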
So they needed over 1 million processing cores to model a volume of space that would contain the entire airframe down to the engines and wheels (250 metres x 100 metres x 20 metres) at centimetre resolution. You just wouldn't be able to get a desktop PC to store all that data: a uniform grid at that resolution has about 5 x 10^11 cells, so even a single double-precision value per cell comes to roughly 4 terabytes.
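A back-of-envelope check of the storage for that box, in Python. It assumes a uniform 1 cm grid over the whole volume (no refinement near the surface) and 8-byte doubles. The count of five stored values per cell (density, three momentum components, energy) is my assumption for a compressible-flow solver, not something from the article.

```python
BOX_M = (250, 100, 20)   # box dimensions quoted in the comment, in metres
CELLS_PER_METRE = 100    # 1 cm grid spacing
BYTES_PER_DOUBLE = 8
FIELDS = 5               # assumed: density, 3 momentum components, energy
TB = 10**12

cells = 1
for side in BOX_M:
    cells *= side * CELLS_PER_METRE

one_field_tb = cells * BYTES_PER_DOUBLE / TB
all_fields_tb = one_field_tb * FIELDS

print(f"cells:            {cells:,}")
print(f"one field:        {one_field_tb:.0f} TB")
print(f"{FIELDS} fields:         {all_fields_tb:.0f} TB")
```

That is 500 billion cells, about 4 TB for a single double-precision field and about 20 TB for five, before counting any extra copies a time-stepping scheme keeps around.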