NASA Runs Competition To Help Make Old Fortran Code Faster (bbc.com)
NASA is seeking help from coders to speed up the software it uses to design experimental aircraft. From a report on BBC: It is running a competition that will share $55,000 between the top two people who can make its FUN3D software run up to 10,000 times faster. The FUN3D code is used to model how air flows around simulated aircraft in a supercomputer. The software was developed in the 1980s and is written in an older computer programming language called Fortran. "This is the ultimate 'geek' dream assignment," said Doug Rohn, head of NASA's transformative aeronautics concepts program that makes heavy use of the FUN3D code. In a statement, Mr Rohn said the software is used on the agency's Pleiades supercomputer to test early designs of futuristic aircraft. The software suite tests them using computational fluid dynamics, which makes heavy use of complicated mathematical formulae and data structures to see how well the designs work.
I'll pass.
If this was written in COBOL, the replacement code would be in C#.
But a bit of googling shows that there's still more than enough justification to call it the best programming language for physics simulations.
So... there will be Fortran programmers out there. I'd suspect, though, given that it's maintained a niche in high-end physics simulation, that anyone who would program in Fortran at the level required here currently has a job doing just that, and won't have time for a major side project with an unknown probability of paying off.
VS
Given the popularity of Fortran these days amongst 'geeks' (whatever they mean by that), this challenge is essentially limited to people already working on it.
I understand why the BBC may want to explain what FORTRAN is, but for Slashdot to spell it out reveals clumsy copy-pasting and lousy editing.
What's with the "up to"? If I make it only twice as fast, will I get anything? What if I make it 20,000 times faster, will my entry be disqualified for exceeding the specified maximum improvement?
In Soviet Washington the swamp drains you.
Hahahahahahaha.
What compiler is used on Pleiades?
"I don't know, therefore Aliens" Wafflebox1
"If you can make my simulation code run 10,000 times faster, I'll give you Fifty Five Thousand dollars!"
No, you don't need a geek for this. Geeks are the comparatively artsy-fartsy kids playing with their electronics playthings. You need nerds for this. The obsessive and pedantic old code trawlers who're adept at math and code, and who can do math-y and algorithmic optimisation. If their teachers were any good they learned to work with slide rules and Fortran, not Java and graphing calculators. This isn't even just about age, but about outlook, mentality, and picking the right tools for the job. FORTRAN isn't just "an older language", it ended up being the fastest way to do math short of assembly. So it's not merely a historic fart, it's the thing to use if you're doing the heavy math lifting, and working on that needs the right people.
Which NASA's management still doesn't understand, those perennial idiots with their failure to go metric, their failure to go to space, and their failure to do it safely. It's a wonder they had Engineers worth anything running around for so long. Apparently no longer. Serves them right. Stupid management gonna be stupid.
1. Run it on better hardware.
2. Re-write the compiler to optimize this code in the best way possible.
3. Re-write the code so it provides optimal input to the compiler.
4. Come up with a new algorithm.
5 and beyond: Left as an exercise to the reader.
Assuming any improvements from #1 and #2 don't "count" for this contest, that leaves you with 3 and 4.
Unless the code is brain-dead there is no way you'll get anywhere close to 10,000 improvement JUST by #3. You MIGHT get it with a combination of #3 and #1 and/or #2 vs. just #1 and #2 alone. That is to say, changes in hardware and compilers may give an opportunity to re-factor the code to get huge improvements vs. un-modified code on new compilers and new hardware.
The big win will be in #4, but only if there are better algorithms out there or someone can come up with one. As with re-factoring the code, changes in hardware and corresponding changes in compilers may turn an algorithm that was inefficient in the 1980s into something that is best-in-class today.
5 and beyond are open-ended and the sky is the limit.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
"This is the ultimate 'geek' dream assignment,"
Actually it sounds like what I call "work."
"First they came for the slanderers and i said nothing."
I entered the contest and made some modifications but it runs well over 10,000 times faster. Disqualified again! >:(
Anons need not reply. Questions end with a question mark.
FORTRAN is so good for this problem domain because it is so brain dead, far-from-orthogonal, ancient, and has all these odd specifications (one of them, at least, used to be that a DO loop could not be executed zero times).
Compilers can optimize the "stuff" out of FORTRAN; C, C++, not quite as much. The clever equivalency between pointers and array references confounds optimization "tricks" used in FORTRAN compilers.
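To make the aliasing point concrete, here is a minimal sketch in Python (chosen only for readability; it is not how any Fortran or C compiler is implemented). The function names are hypothetical. It shows why an optimizer may only hoist a load out of a loop when it knows the two arrays do not overlap: the "reload" and "hoisted" versions agree on distinct arrays but disagree when the same array is passed twice.

```python
def add_first_reload(dst, src):
    """Re-read src[0] on every pass, as a compiler must when dst and src may overlap."""
    for i in range(len(dst)):
        dst[i] = dst[i] + src[0]


def add_first_hoisted(dst, src):
    """Read src[0] once before the loop; only valid if writing dst cannot change src."""
    first = src[0]
    for i in range(len(dst)):
        dst[i] = dst[i] + first


# Distinct arrays: hoisting the load is a safe optimization, both versions agree.
a, b = [1, 2, 3], [10, 20, 30]
add_first_reload(a, b)
c, d = [1, 2, 3], [10, 20, 30]
add_first_hoisted(c, d)
print(a == c)  # True

# Same array passed as both arguments (aliasing): the two versions now disagree,
# because the first write to dst[0] also changes src[0].
x = [1, 2, 3]
add_first_reload(x, x)
y = [1, 2, 3]
add_first_hoisted(y, y)
print(x, y)  # [2, 4, 5] [2, 3, 4]
```

A language that forbids such overlap in its argument rules lets the compiler always pick the hoisted form; with unrestricted pointers it has to assume the reload form unless it can prove otherwise.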
Java may even be better for optimization than C or C++, and there are at least some micro benchmarks where the JIT can do things that C doesn't facilitate.
I'm not dead, yet!
Optimize the algo, then profile, rewrite the part of the code that uses the most cpu in assembly. Good luck getting a 10K improvement, unless the algo or implementation is brain dead.
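As a rough illustration of the "profile first, then rewrite the hot spot" workflow, the sketch below uses Python's standard `cProfile` and `pstats` modules on a toy workload. The functions `cheap_setup`, `hot_kernel` and `simulate` are made up for the example and have nothing to do with FUN3D; on a real Fortran code you would use a native profiler instead.

```python
import cProfile
import pstats


def cheap_setup(n):
    """Build the input data; a small one-off cost."""
    return [float(i) for i in range(n)]


def hot_kernel(values, sweeps):
    """The inner loop that dominates the run time."""
    total = 0.0
    for _ in range(sweeps):
        for v in values:
            total += v * v
    return total


def simulate():
    values = cheap_setup(1000)
    return hot_kernel(values, 200)


# Collect timing data for one run of the toy simulation.
profiler = cProfile.Profile()
profiler.enable()
result = simulate()
profiler.disable()

# stats.stats maps (file, line, function name) -> (cc, nc, tottime, cumtime, callers).
stats = pstats.Stats(profiler)
own_time = {key[2]: entry[2] for key, entry in stats.stats.items()}

# The function with the largest own time is the candidate for a rewrite.
hottest = max(own_time, key=own_time.get)
print(hottest)
```

Only the function reported as hottest is worth hand-tuning; speeding up the setup code here would change the total run time by very little.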
The 10,000 times faster is this clearly unattainable goal, but just like the NP-Complete problems used in cryptography, no one knows for sure if P == NP or if there is some clever hack.
If you submit a solution anywhere near a 10,000 speedup, these guys with HKs wearing Ninja suits will come to your house, slap a bag over your head, and you will wake up on this island where you will be assigned a number and where this menacing beach-ball device will prevent you from ever returning home.
I would like to see more of this sort of thing in the open source world. Something that operates like kickstarter but in reverse where the community puts money into features and software bounties for developers to work on to get paid to support the community in things people actually want.
Found a long standing bug that's driving everyone nuts but nobody bothers to fix? Put some money into a pot and if enough other people feel the same way a developer can collect that bounty.
Am I crazy to think this is a good idea?
Fortran is fast on problems where sophisticated algorithms don't exist. But most good algorithms can't be implemented in Fortran.
Here is the solution. Take the code. Rip out all the goto statements and replace them with real function calls. That should at least increase the speed by 75%. You're welcome.
Fortran has been a staple of high performance computing applications for decades and will continue to be one. As a result, there are several off-the-shelf tools available for profiling, optimization and vectorization, many of them from hardware vendors and tuned to their architectures. This task would normally be better accomplished in-house, but the contest also makes a clever and probably lower-cost recruiting tool.
I am not a Fortran programmer, but I know how to make some code run faster (on the very same hardware).
One is a better compiler with machine code optimizations that lead to faster code execution on average.
Two, automated source code analysis, refactoring and optimization.
Three, hire better programmers, provided that Fortran can allow for higher effectiveness.
Four, move to a language that can take advantage of multithreading. Like Fortran 2008+ or, better, C.
Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
Unless the existing code is totally serial, and they just want someone to add MPI support or parallelise it over FPGA or GPU cores, a 10000x speedup is just silly. Or I suppose the code could just be utter crap, resulting in crazy slowness. I work for a CFD vendor, and I know they can't expect much for such a small sum of money. I'd say that converting a serial code to parallel might be possible for this sum, but who is going to spend a year doing that unless they are guaranteed payment? This will be far less than a typical CFD developer salary, so they'd have to be looking to low cost countries. If serial, it will require splitting the mesh, then passing around boundary conditions over MPI. So, you need to write tools to split the mesh for computation, and then to re-join the mesh for post-processing - that is in addition to parallelising the solver. Likewise with FPGAs or GPUs, except the skillsets are even less available, and the entry barrier even higher. While C++ can achieve a small edge in performance over 'modern' F2008 these days, re-writing parts of the solver's kernel in C++ is going to achieve nothing like this speedup - maybe a 5-15% in my experience, if you have a good developer - possibly even the opposite, if it is some physicist who doesn't even know how to write C++ properly.
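The split / exchange-boundaries / re-join steps described above can be sketched on a toy 1D mesh. This is a single-process Python illustration under loud assumptions: the "mesh" is just a list of values, the solver is one Jacobi-style smoothing sweep, and the halo values that a real code would send over MPI are simply copied from neighbouring chunks. All function names are hypothetical.

```python
def jacobi_step(u):
    """One smoothing sweep; the first and last values are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def split_mesh(u, parts):
    """Cut the mesh into contiguous chunks of near-equal size."""
    size, extra = divmod(len(u), parts)
    chunks, start = [], 0
    for p in range(parts):
        stop = start + size + (1 if p < extra else 0)
        chunks.append(u[start:stop])
        start = stop
    return chunks


def step_decomposed(u, parts):
    """Same sweep as jacobi_step, computed chunk by chunk and then re-joined."""
    chunks = split_mesh(u, parts)
    updated = []
    for p, chunk in enumerate(chunks):
        # Halo exchange: in an MPI code these two values would arrive as
        # messages from the neighbouring ranks; here they are just copied.
        left = [chunks[p - 1][-1]] if p > 0 else []
        right = [chunks[p + 1][0]] if p < parts - 1 else []
        new = jacobi_step(left + chunk + right)
        # Drop the halo cells again so only the locally owned cells remain.
        updated.append(new[len(left):len(new) - len(right)])
    # Re-join the chunks for post-processing.
    return [value for chunk in updated for value in chunk]


mesh = [float(i * i) for i in range(10)]
serial = jacobi_step(mesh)
parallel_style = step_decomposed(mesh, 3)
print(serial == parallel_style)  # True
```

The check at the end is the important part: a decomposed solver has to reproduce the serial answer, and every chunk's work in the loop is independent once its halo values are in place. The sketch assumes `parts` is no larger than the number of mesh points.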
I love the fact they feel they have to explain what Fortran is.
Nowadays? When dealing with a complex piece of software about which nobody else will probably care? When trying to optimise very specific parts which were most likely developed by applying well-known theories?
Come on! This is the worst kind of discrimination! This is discrimination against me! LOL
Custom Solvers 2.0 = Alvaro Carballo Garcia = varocarbas.
$55,000 is much cheaper than hiring developers to do it. It's akin to companies having a contest for a new marketing logo, and the winner gets their work used. The compensation is a line on a resume.
I don't think that the solution to the problem has anything to do with what language is used to solve the problem. From the article:
"The software suite tests them using computational fluid dynamics, which make heavy use of complicated mathematical formulae and data structures to see how well the designs work."
When I was an undergraduate I wrote a FORTRAN program for a genetics professor to calculate the distribution of butterfly markings in a wild butterfly population (I also caught butterflies, tagged them, recorded their markings, and released them). The statistical problem was solved by solving partial differential equations by approximation. The equation had two complicated halves. I guessed a number to be the solution and then plugged it into both sides of the equation getting two different answers. I then plugged the difference of the two answers into another equation which gave me a guess as to where the final answer lay. I plugged the difference equation result into the first two equations and came up with a new narrower answer spread. I kept doing this loop until the difference result was less than the number of significant digits I wanted in the answer. This program was a real CPU burner, enough so that Dr. West had to have a serious discussion with the computer center about his computer budget.
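The guess-and-refine loop described above can be sketched as follows. The actual equations and update rule of the butterfly program are not given, so everything here is an assumption: `lhs` and `rhs` are stand-in "halves" of a made-up equation, and the secant method is used as one plausible way to turn the difference between the two sides into the next guess.

```python
def lhs(x):
    """Stand-in for the left half of the equation (hypothetical)."""
    return x ** 3


def rhs(x):
    """Stand-in for the right half of the equation (hypothetical)."""
    return 2.0 * x + 5.0


def solve_by_iteration(guess_a, guess_b, tolerance=1e-10, max_iterations=100):
    """Refine two starting guesses until both sides agree to within tolerance."""
    gap_a = lhs(guess_a) - rhs(guess_a)
    gap_b = lhs(guess_b) - rhs(guess_b)
    for iteration in range(1, max_iterations + 1):
        if gap_b == gap_a:
            raise ZeroDivisionError("guesses give identical gaps; pick new starting values")
        # Secant update: use the two most recent gaps to estimate where the gap is zero.
        next_guess = guess_b - gap_b * (guess_b - guess_a) / (gap_b - gap_a)
        guess_a, gap_a = guess_b, gap_b
        guess_b = next_guess
        gap_b = lhs(guess_b) - rhs(guess_b)
        if abs(gap_b) < tolerance:
            return guess_b, iteration
    raise RuntimeError("did not converge within max_iterations")


root, iterations = solve_by_iteration(2.0, 3.0)
print(root, iterations)
```

The iteration count returned alongside the root is the quantity that matters for CPU time: a better update rule or better starting guesses reduce it directly, whatever language the loop is written in.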
Years later I was the manager of a mainframe computer center. I once toured the much larger McDonnell Douglas computer center in St Louis to see what I could learn about managing a computer center. One of the large applications at McDonnell Douglas was solving partial differential equations by approximation to measure airflow across a wing. Each cross section on a wing had a different answer so the more cross sections they solved the more accurate their answer was. Therefore they divided up the problem across several computers and ran several calculations in parallel.
I think that the problem described by NASA is similar to my population genetics problem or McDonnell Douglas's wing air flow problem. I can easily see that by cutting down the number of iterations needed to arrive at a significant answer you can save large amounts of CPU time. From the article:
"Significant improvements could be gained just by simplifying a heavily used sub-routine so it runs a few milliseconds faster, said Nasa on the webpage describing the competition. If the routine is called millions of times during a simulation this could "significantly" trim testing times, it added."
So if I were working on the problem I would look for an answer by speeding up the approximation calculation rather than speeding up the hardware or programming language.
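To put a number on the quoted claim about a heavily used sub-routine, here is a small worked calculation. The call count and the per-call saving are hypothetical values picked for illustration; the article only says "a few milliseconds" and "millions of times".

```python
def time_saved_seconds(calls, milliseconds_saved_per_call):
    """Total wall-clock time saved when every call gets faster by a fixed amount."""
    return calls * milliseconds_saved_per_call / 1000.0


calls = 5_000_000          # hypothetical: "millions of times"
saving_per_call_ms = 2.0   # hypothetical: "a few milliseconds"

saved_seconds = time_saved_seconds(calls, saving_per_call_ms)
saved_hours = saved_seconds / 3600.0
print(saved_seconds)          # 10000.0
print(round(saved_hours, 2))  # 2.78
```

Under those assumed numbers a 2 ms saving per call removes close to three hours from a single simulation, which is why the routine that is called most often is the first place to look.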
--------------
Steve Stites
1. They're almost assuredly already using ifort.
2. Hopefully they've run VTune against it, but if they're engineers/physicists, who knows.
3. Yes. Programmers that really know how to write HPC applications are expensive and hard to come by, though. The main reason why Fortran is the language of choice for HPC is because Fortran does not allow aliasing, which enables deeper compiler optimization. However, C (but not C++) can replicate some of this behavior with the restrict keyword.
4. Usually multithreading creates performance problems. Massively parallel programs running on general purpose cores almost always run faster with one process per core (I guess I'm assuming that it's already an MPI program), unless you can eliminate mutexes. This is not true when running on MIC, since the per-process overhead of MPI itself becomes a bottleneck.
The answer is probably rewrite the inner loop for CUDA. Maybe also some kind of adaptive mesh refinement if it's not already doing it.
Fortran compilers are already highly optimized. It is very unlikely there is significant performance gain to be found there.
Also, Fortran is so simple (relatively) that refactoring yields little gain. Mind, most heavy-duty Fortran code is already very much optimized.
And, given that Fortran is almost trivially parallelized (most large scale simulations run on vector computers, thereby executing many iterations of DO loops concurrently), multithreading isn't the solution either.
Fortran isn't as backwater as many seem to think. It is fast and highly efficient. It is never fast enough, though: making it faster makes larger simulations possible.
System Architecture
Manufacturer: SGI
161 racks (11,472 nodes)
7.25 Pflop/s peak cluster
5.95 Pflop/s LINPACK rating (#13 on November 2016 TOP500 list)
175 Tflop/s HPCG rating (#9 on November 2016 HPCG list)
Total CPU cores: 246,048
Total memory: 938 TB
2 racks (64 nodes total) enhanced with NVIDIA graphics processing units (GPUs)
184,320 CUDA cores
0.275 Pflop/s total
1 rack (32 nodes total) enhanced with Intel Xeon Phi co-processors (MICs)
3,840 MIC cores
0.064 Pflop/s total
Operating Environment
Operating system: SUSE® Linux®
Job scheduler: Altair PBS Professional®
Compilers: Intel and GNU C, C++ and Fortran
MPI: SGI MPT
Full specs here.
Sounds like stone soup to me. CUDA cores, Phi coprocessors, SGI interconnects, Linux OS because nothing else in the whole wide world could talk to all of that...
Ick. Keep your prize money.
Weaselmancer
ridiculous.
In 1986 the state-of-the-art CPU generation was the i386 (Motorola and other makers had similarly powered CPUs available), but it was new and the i286 was much more common. The Pentium 200s were about 79x faster than those (based on the popular NSI). After that, the improvements were mostly in clock rate, with the latest i7s clocking the CPU roughly 20x faster than the Pentium 200.
So that means CPU-bound Fortran code should be executing roughly 1600x faster just by recompiling it on an I7. That's before any parallelization (note that modern Fortran compilers have parallel loop constructs, so that wouldn't be tough to add, if the algorithm allows for it).
So it's tempting to think you could get most (perhaps all) of the way to this $55,000 prize with a $400 CPU and a copy of the Intel Fortran compiler.
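The back-of-the-envelope estimate above can be checked in a few lines. The two factors (79x and 20x) are taken from this comment as stated, not independently verified, and the variable names are just labels for the example.

```python
# Factors as claimed in the comment above.
pentium200_vs_1986_cpu = 79   # benchmark ratio quoted for the Pentium 200
i7_vs_pentium200 = 20         # rough clock-rate ratio quoted for a recent i7

hardware_speedup = pentium200_vs_1986_cpu * i7_vs_pentium200
target_speedup = 10_000

# What would still have to come from parallelism, better code or better algorithms.
remaining_factor = target_speedup / hardware_speedup

print(hardware_speedup)            # 1580
print(round(remaining_factor, 2))  # 6.33
```

So even if the hardware estimate is taken at face value, roughly another 6x would have to come from somewhere else, and that assumes the baseline really is 1980s hardware rather than the current Pleiades machine.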
Porting this creaky code to node.js running in Docker should give at least a 10x performance boost.
-O3.
$$pls.
Why not just call it Fortran? Any geek worth his/her salt should know what Fortran is - even those who never used or studied it themselves.
Now it's like 'the AIDS' and 'the diabetes'!
Oh no, he's got 'the C++' on his resume, don't touch it.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
I thought: ah, a fun challenge.
Then I read the last sentence in the BBC article:
>The sensitive nature of the code means the competition is only open to US citizens who are over 18.
They just don't get it.
Wow so many hipster jokes at the expense of foolish aging geezer programmers ... you know the guys who got a spaceship to the moon while you were um making a mobile dating app mashup
When you try to download their software, you are taken to this page which at the bottom contains the following text:
By accessing and using this computer system, you are consenting to system monitoring, including the monitoring of keystrokes. Unauthorized use of, or access to, this computer system may subject you to disciplinary action and criminal prosecution. [emphasis mine]
A keylogger for using your website? Microsoft hasn't even thought of that yet!
-- Political fascism requires a Fuhrer.
That's all we ever used in Physics and Blender uses it too, to give you an idea.
Fortran is more restrictive than C, which makes it easier to optimise.
But Fortran's main advantage hasn't changed in fifty years: Fortran is the native language of people whose careers revolve around numerical simulations.
IOW, it's the same as any other language: the language itself is less relevant than the ecosystem which surrounds it: people, tools, libraries, books, etc.
I was a happy man for a very short time :(
based on the pulled out of the arse number 10000 times faster I think they are paying fifty, five thousand dollars or $250000, or is it two hundred, fifty thousand dollars - they really need to correct the title, it sounds like they are paying 10 million bucks for this speed up.
I KNOW How to do this... write it in BRAINFUCK!!
I'm probably being too sensitive here - not usually a trait that describes me - but is anyone else sick to freaking death of being called a 'coder'? I kind of hate to say this, but 'coder' sounds rather like 'data entry technician' - someone doing a mindless repetitive job.
So this speedup is so vital to NASA that they decide on a contest with a tiny potential payoff that no professional would bother with. Really making yourselves look good there, NASA!
There are (justifiably expensive) experts out here that can do this kind of work but, shockingly, we need to feed our families and send our children to colleges. Rather than, I don't know, hire them, the clever folks at NASA instead decide to waste time on a PR exercise that makes your organization look incompetent. By trying to crowdsource the work for near-free, this sends the message to our brightest students that expertise in scientific software is no longer a viable career in the US. When this contest fails they will have wasted time and money and further eroded the future talent pool they'll need in 10 years. Brilliant!