A C++ Library That Brings Legacy Fortran Codes To Supercomputers

Code... by gigaherz · 2013-09-21 04:39 · Score: 3, Informative

...like rice, is not countable. At least not since I learned the word.

Re:Code... by Nerdfest · 2013-09-21 04:40 · Score: 2

It really does lower one's opinion of the author. If I read TFAs, I wouldn't read this one.
Re:Code... by john.burton1765 · 2013-09-21 05:05 · Score: 3, Insightful

I couldn't agree more. Although the word "codes" is usually a red flag not to bother reading any more in any article or question.
Re:Code... by jeremyp · 2013-09-21 05:28 · Score: 2

Not to mention the fact that the author has erased history (well, the summary implies the author has erased history - I haven't read TFA) because the Cray 1 had a vector processing unit and a specially designed compiler to make use of it, and the compiler was for Fortran. This was in 1978 when C++ didn't even exist.

--
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
Re:Code... by Livius · 2013-09-21 06:27 · Score: 2

Code is a mass noun, and its number is indeterminate, neither singular nor plural.
Re:Code... by boristhespider · 2013-09-21 07:03 · Score: 3, Insightful

Oh now, that's a bit harsh. Programming in Fortran isn't something done because people are afraid of work. I genuinely get tired of the incessant Fortran-bashing by people who -- in my experience, at least -- have almost never, if ever, actually used the language or seen why other people do. In most cases they seem to be repeating jokes their lecturers made about the language, jokes that were first written back when FORTRAN 77, with those stupid capitals and all, was the dominant form.
Now, I'm very much not a fan of F77. In fact, I hate the language. It's clunky and decrepit and not suited for modern programming practices. But it's easy to call from later Fortran standards, and each one has vastly improved the situation. Fortran 2008 is a genuinely nice language. True, it's not OO - though you can force it to act almost as if it is - but not everything has to be forced into OO. What it is is extremely good for numerical work, and dealing with arrays in particular in Fortran is a dream after, say, C, or even C++11. The fact it calls F77 routines without effort or pain also helps, since there genuinely is a vast body of code still in F77. (The oldest I came across was F66, ported directly from Fortran IV. Now that really did need to be rebuilt in something approaching a sane language.)
I'm not saying F2008 is "better" than either C or C++11 -- that's a meaningless statement. But there are things that make it a very nice language to use, and other things -- character strings, I'm looking at you -- that make it distinctly unpleasant. Same as any other language, really.
Re:Code... by Thomasje · 2013-09-21 08:06 · Score: 2

I studied math in college, and many numerical algorithms textbooks refer to software as "codes". It seems to be common practice in the computational mathematics world. I assume it goes back to the days before Fortran, before high-level languages in general, when source code literally consisted of a series of codes.
Re:Code... by cas2000 · 2013-09-21 12:47 · Score: 3, Informative

actually, "codes" is common usage amongst researchers and has been since at least the 1970s.
most of them are not programmers or geeks or computer scientists, they're researchers or academics or post-grad students who happen to do a little programming or simply use someone else's "codes".
it used to make me cringe every time i heard it when working with academics and researchers and on HPC clusters, but then i got used to it and stopped caring.
and, really, they're not interested in a lecture or why it's a dumb usage of the word. they've got stuff they want to get done ("codes to run") and don't give a damn.

Re:It never ceases to amaze me by Mitchell314 · 2013-09-21 04:50 · Score: 3, Insightful

In old codes, you're already familiar with the existing quirks and bugs, and the base is heavily patched up from years of debugging.

--
I read TFA and all I got was this lousy cookie

Very limited scope by RoverDaddy · 2013-09-21 04:51 · Score: 2

I took a look at TFA and followed up by reading the description of LibGeoDecomp:

If your application iteratively updates elements or cells depending only on cells within a fixed neighborhood radius, then LibGeoDecomp may be just the tool you've been looking for to cut down execution times from hours and days to minutes.

Gee, that seems like an extremely limited problem space, and doesn't measure up at all to the title of this Slashdot submission. It might really be a useful tool, but when I clicked to this article I expected to read about something much more general purpose, in terms of 'bringing Legacy Fortran to Supercomputers'.

By the way, regarding the use of the word 'codes': I don't think English is the first language of this developer. Cut some slack.

--
RETURN without GOSUB in line 1050

Re:Very limited scope by Mitchell314 · 2013-09-21 05:01 · Score: 2

AFAIK a lot of simulation problems are centered around 'update node based on neighbors', like particulate dispersal or flux.

--
I read TFA and all I got was this lousy cookie

Re:It never ceases to amaze me by Anonymous Coward · 2013-09-21 04:55 · Score: 3, Insightful

Fortran is by no means outdated. Seriously, check out the new Fortran 2008 standard and its state-of-the-art compilers (e.g. the NAG one).
You'll be blown away by its speed and clean-looking code. C++ might have features that Fortran lacks (complex template usage seems rather popular), but that doesn't always reduce the development time. At least that's my experience.
As long as you're working on scientific projects, Fortran is practically unmatched.

Re:It never ceases to amaze me by mjwalshe · 2013-09-21 05:21 · Score: 3, Insightful

Have you any idea how much it would cost to port it to kludgy C++ (which lacks a lot of things needed for scientific computing)? You then have to re-qualify all of your models, which is both time- and resource-intensive.

Author here. by gentryx · 2013-09-21 05:36 · Score: 3, Informative

The IEEE and Los Alamos National Laboratory seem to have a different opinion on this. And even the Oxford dictionary knows the use of codes. But surely those guys can't even spell gigahertz.

--
Computer simulation made easy -- LibGeoDecomp

Re:Author here. by boristhespider · 2013-09-21 06:42 · Score: 2

I'd suggest you don't be so pious. I'm for protecting the language as much as anyone else but ultimately it evolves. I don't think this is really about the people employing numerical techniques in science becoming "more poorly educated"; I think it's about your field branching out and attracting new jargon and new uses for the old jargon. It's just what happens.
As it happens, I've spent close to ten years in academia where we build "codes" (typically in Fortran -- 90 or more recent if you were lucky; 77 if you weren't) to solve problems. I have since moved into professional development, chiefly in C++ and occasionally in C#. The use of the spoken language changes and the two fields have different ways to express the same concepts. Ultimately I don't really see a major problem with this.
All that said, misuse the word "corn" and I begin to get extremely irritated so I probably shouldn't be so pious myself... :)
[In British English "corn" doesn't mean maize, it means wheat, or occasionally barley -- more accurately, it means the chief arable crop of an area. When English speakers settled North America, that chief arable crop was maize, hence the American usage. What annoys me isn't Americans calling maize "corn" -- which is entirely valid in both North America and in their dialect of English -- but rather the *British* thinking that corn is maize. It isn't, it's wheat. I also wish they'd get off my God damned lawn.]

Re:Modern Fortran by rubycodez · 2013-09-21 05:45 · Score: 2

More than adequate. Fortran is still the most optimizable language for high-performance numeric computation, more so than C and derived languages.

Very limited indeed by gentryx · 2013-09-21 06:03 · Score: 4, Informative

I took a look at TFA and followed up by reading the description of LibGeoDecomp:

If your application iteratively updates elements or cells depending only on cells within a fixed neighborhood radius, then LibGeoDecomp may be just the tool you've been looking for to cut down execution times from hours and days to minutes.

Gee, that seems like an extremely limited problem space, and doesn't measure up at all to the title of this Slashdot submission. It might really be a useful tool, but when I clicked to this article I expected to read about something much more general purpose, in terms of 'bringing Legacy Fortran to Supercomputers'.

Correct. We didn't try to come up with a solution for every (Fortran) program in the world. Because that would either take forever or the solution would suck in the end. Instead we tried to build something which is applicable to a certain class of applications which is important to us. So, what's in this class of iterative algorithms which can be limited to neighborhood access only?

cellular automata
stencil codes
Lattice Boltzmann methods for computational fluid dynamics (technically a subclass of stencil codes)
Particle in cell codes
Short-ranged n-body simulations

It's interesting that almost(!) all computer simulation codes fall in one of the categories above. And supercomputers are chiefly used for simulations.
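
To make the "fixed neighborhood radius" idea concrete, here is a minimal sketch of a stencil code in plain C++. It is not LibGeoDecomp's API; the function names and the diffusion coefficient `alpha` are assumptions made up for illustration. It shows a 1D three-point update (radius 1) in which every new cell value depends only on the old values of the cell and its two direct neighbors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One time step of a 1D three-point stencil (explicit heat diffusion).
// Each interior cell is updated from itself and its two direct neighbours,
// i.e. a neighbourhood radius of 1. The two boundary cells are held fixed.
// `alpha` is a hypothetical diffusion coefficient chosen for illustration.
std::vector<double> stencil_step(const std::vector<double>& old_grid, double alpha)
{
    assert(old_grid.size() >= 3);
    std::vector<double> new_grid(old_grid);  // the copy keeps the boundary values
    for (std::size_t i = 1; i + 1 < old_grid.size(); ++i) {
        new_grid[i] = old_grid[i]
            + alpha * (old_grid[i - 1] - 2.0 * old_grid[i] + old_grid[i + 1]);
    }
    return new_grid;
}

// Iterate the update; every step reads only the grid of the previous step.
std::vector<double> simulate(std::vector<double> grid, double alpha, int steps)
{
    for (int t = 0; t < steps; ++t) {
        grid = stencil_step(grid, alpha);
    }
    return grid;
}
```

Because a cell never reads anything beyond its neighbors from the previous step, the grid can be cut into chunks that only need to exchange their edge cells, which is presumably what makes this class of codes a good fit for automatic parallelization.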

By the way, regarding the use of the word 'codes': I don't think English is the first language of this developer. Cut some slack.

Thanks :-) You're correct, I'm from Germany. I learned my English in zeh interwebs.

--
Computer simulation made easy -- LibGeoDecomp

Old and kludgy makes it harder to port. by Ungrounded+Lightning · 2013-09-21 06:31 · Score: 2

Not only does it cost a LOT to port this stuff and risk errors in doing so, but the cruftier it is the harder (and more expensive and error-prone) it is to port it.

If, instead, you can get the new machines to run the old code, why port it? Decades of Moore's Law made the performance improve by orders of magnitude, and the behavior is otherwise unchanged.

If you have an application where most of the work is done in a library that is largely parallelizable, and with a few tiny tweaks you can plug in a modern multiprocessor-capable library and run it on a cluster, you get another factor of almost as-many-processors-as-I-decide-to-throw-at-it, with small effort and negligible chance of breaking the legacy code.

What a deal!

And it's one less reason to touch the tarbaby of the rest of the working legacy code.

Let the COMPUTER do the work. People are for setting it up - with as little effort as practical - and moving on to something else that is important and can't yet be automated.

Eventually somebody will teach the computers to convert the Fortran to a readable and easily understandable modern language - while both keeping the behavior identical and highlighting likely bugs and opportunities for refactoring. Until then, keeping such applications in the legacy language (unless there's a really good reason to pay to port them) is often the better approach - both for economy and reliability.

--
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way

Fortran works fine with MPI by poodlediagram · 2013-09-21 06:48 · Score: 5, Informative

...and has done for years.

We write a scientific code for solving quantum mechanics for solids and use both OpenMP and MPI in hybrid. Typically we run it on a few hundred processors across a cluster. A colleague extended our code to run on 260 000 cores sustaining 1.2 petaflops and won a supercomputer prize for this. All in Fortran -- and this is not unusual.

Fortran gets a lot of bad press, but when you have a set of highly complex equations that you have to codify, it's a good friend. The main reason is that (when well written) it's very easy to read. It also has lots of libraries, it's damn fast, the numerics are great and the parallelism is all worked out. The bad press is largely due to the earlier versions of Fortran (66 and 77), which were limited and clunky.

In short, the MPI parallelism in Fortran90 is mature and used extensively for scientific codes.
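
As a rough illustration of what distributed-memory parallelism usually involves, here is a sketch of the index arithmetic a process might use to find its share of a 1D domain. This is plain C++, not MPI and not taken from the code described above; `block_range` and its parameters are hypothetical names. In a real MPI program the rank and process count would come from the MPI runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Half-open index range [first, second) owned by `rank` when `n` cells are
// split as evenly as possible across `ranks` processes. The first `n % ranks`
// ranks receive one extra cell. All names here are hypothetical.
std::pair<std::size_t, std::size_t> block_range(std::size_t n,
                                                std::size_t ranks,
                                                std::size_t rank)
{
    assert(ranks > 0 && rank < ranks);
    const std::size_t base = n / ranks;    // cells every rank gets
    const std::size_t extra = n % ranks;   // leftover cells to hand out
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t size = base + (rank < extra ? 1 : 0);
    return {begin, begin + size};
}
```

For example, 10 cells over 3 ranks gives the ranges [0, 4), [4, 7) and [7, 10), so the ranges are contiguous and cover the domain exactly once.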

Codes code as peoples people by amaurea · 2013-09-21 06:57 · Score: 2

I, too, work in HPC, and while I found "codes" very jarring to begin with, I've learned to live with it. I am not sure the "code" vs. "codes" issue is more grammatically problematic than "people" vs. "peoples". A people (countable) is made up of people (uncountable). Similarly "a code" (countable, but nonstandard) is made up of code (uncountable). Personally I would use "a program" or "a library" instead of "a code", though.

Another related issue is whether "data" is countable or not. I'm used to it being uncountable, with there being more or less of it, but not "several data". But scientific journals in my field prefer the countable version "a datum", "several data", which is arguably more historically correct. That, too, took some getting used to.

Re:It never ceases to amaze me by boristhespider · 2013-09-21 07:19 · Score: 4, Funny

Not at all. It might be a bit more monocultured than, say, C++ but there are still more than enough ways to skin the same cat that you end up with a ton of cat parts and a mass of confusion.

You only have to rewrite it a *little* bit by msobkow · 2013-09-21 07:26 · Score: 4, Insightful

You don't have to rewrite your code entirely, just a little bit.

You only have to restructure the subroutines and change the syntax.

Well, that sounds like rewriting to me. Just because there is a library that might implement the same semantics as FORTRAN's math does not mean that it isn't a rewrite, coming with all the risks for new errors and gotchas that that implies.

--
I do not fail; I succeed at finding out what does not work.

interesting tool, misleading summary by excelsior_gr · 2013-09-21 08:02 · Score: 4, Interesting

It is true that there are a lot of legacy Fortran codes in scientific computing, but chances are that they are already parallel, so this tool won't be of much use to those supporting them. OpenMP and MPI have been in use in Fortran codes for decades. The summary seems to think that legacy Fortran codes need saving and porting. They don't. They are just fine, number crunching faster than you can say DO CONCURRENT.

Having said that, LibGeoDecomp seems quite nice if you find a piece of serial code and you want to make a rough parallel version of it without much hassle. But if you are writing new code, you can parallelize it natively. Nevertheless, I believe that we must focus our resources on developing the current compilers. The Compaq compiler died in the hands of HP and people moved mostly to the Intel compiler, since the open-source community was focused on C++ at the time and GCC was stuck with the obsolete g77. Then g95 came along, which brought us all the cool stuff of Fortran 90/95, while gfortran was being developed. Now gfortran seems decent, but it still has to match the speed of ifort in order to sit at the cool kids' table. Also, we need the features of the latest Fortran standards. I would gladly use a compiler that is feature-complete, even if the executables are relatively slow, because I will be able to switch into the mindset of the Fortran 2008 standard and stop doing things the Fortran 95 way while coding. The compiler developers will then have all the time they need to make it more efficient.

In my personal experience... by tlambert · 2013-09-21 08:10 · Score: 2

In my personal experience...

Most of the physics code in FORTRAN that I've dealt with are things like relativistically invariant P-P and N-P particle collision simulations in order to test models based on the simultaneous solution to 12 or more Feynman-Dyson diagrams. It's what was used to predict the energy range for the W particle, and again for the Higgs Boson, and do it rather reliably.

The most important part of this code was reproducibility of results, so even though we were running Monte Carlo simulations of collisions, and then post-constraining the resulting pair productions by the angles and momentum division between the resulting particles, the random number stream had to be reproducible. So the major constraint here was that for a reproducible random stream of numbers, you had to start with the same algorithm and seed, and the number generation had to occur linearly - i.e. it was impossible to functionally decompose the random number stream to multiple nodes, unless you generated and stored a random number stream sufficient to generate the necessary number of conforming events to get a statistically valid sample size.

So, it was linear there, and it was linear in several of the sets of matrix math as it was run through the diagrams to filter out non-conforming pair production events.

So we had about 7 linearity choke-points, one of which could probably be worked around by pre-generating a massive amount of PRNG output far in excess of what would be eventually needed, and 6 of which could not.
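
The random-stream choke-point can be sketched in C++ under one strong assumption: that each worker knows in advance which positions of the stream it needs. With post-constrained events that is often not the case, so this is only an illustration. The sketch uses the standard `std::mt19937_64` engine and its `discard` member, which advances the engine as if that many values had been drawn; `stream_chunk` is a hypothetical helper name. Raw engine output is used because the engine's sequence is fixed by the C++ standard, whereas the standard distributions may differ between library implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Return `count` values of the seeded stream, starting at position `offset`.
// Every worker seeds an identical engine and skips ahead to its own slice,
// so the slices concatenate to exactly the stream a serial run would see.
std::vector<std::uint64_t> stream_chunk(std::uint64_t seed,
                                        unsigned long long offset,
                                        std::size_t count)
{
    std::mt19937_64 engine(seed);
    engine.discard(offset);  // skip the values that belong to earlier chunks
    std::vector<std::uint64_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(engine());
    }
    return out;
}
```

Skipping ahead this way avoids storing the pre-generated numbers, but the skip itself is typically not free: for a Mersenne Twister it generally costs time proportional to the offset, so it trades storage for repeated generation work.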

The "add a bunch of PCs together and call it a supercomputer" approach to HPC only works on highly parallelizable problems, and given that we've had that particular capability for decades, the most interesting unsolved problems these days are not subject to parallel decomposition (at least not without some corresponding breakthroughs in mathematics).
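
The limit described here is usually quantified with Amdahl's law: if a fraction p of the runtime parallelizes perfectly over n processors and the rest stays serial, the speedup is at most 1 / ((1 - p) + p / n). A small C++ sketch follows; the function name is made up for illustration and the numbers in the note below are not measurements from the code discussed above.

```cpp
#include <cassert>

// Amdahl's law: upper bound on the speedup when a fraction `p` of the
// runtime parallelises perfectly over `n` processors and the remaining
// fraction (1 - p) stays serial.
double amdahl_speedup(double p, double n)
{
    assert(p >= 0.0 && p <= 1.0 && n >= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}
```

With p = 0.9 the bound stays below 10 no matter how many machines are added, which is why serial choke-points dominate once the parallel part has been spread thin.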

I converted a crap-load of FORTRAN code to C in order to be able to optimize it for Weitek vector processors plugged into Sun hardware, including the entire Berkeley Physics package, since that got us a better vector processor than was on the Cray and CDC hardware at Los Alamos where the code was running previously, but adding a bunch of machines together would not have improved the calculation times.

Frankly, it seems to me that the available HPC hardware being inherently massively parallel has had a profound effect on constraining the problems we try to solve, and that there are huge areas that remain unexplored for what amounts to the equivalent of someone looking for their contact lens under the streetlight, rather than in the alley where they lost it, "because the light's better".

Re:It never ceases to amaze me by mbkennel · 2013-09-21 08:12 · Score: 2

" I can't imagine what it would be like trying to bolt together a dozen or more different utility libraries each using their own favorite blend of parallel processing API's."

In Fortran you don't. Fortran has the mathematically expected parallel constructions built into the language, and the compiler directives commonly used before things are entirely in the language were reasonably standard.

I think Fortran is very good for quantitative programming and I regret that in my commercial enterprise it is essentially forbidden as alien.

Re:Its code not codes FFS by oursland · 2013-09-21 08:30 · Score: 2

Then you're likely a waste of time and detriment to your team.

I have had conversations with some of my friends who work on the peta-scale clusters and thought much the same as you. But, it turns out, when you're working with that level of system, you're probably addressing some small part of a much, much larger problem that has been largely solved. The existing code that performs 99.9% of your task is written in Fortran and actively developed by a very successful team of researchers. Attempting to rewrite the working, debugged, code so you can work in your favorite language today is not only impossible, but would likely get you removed from the team.

Re:Its code not codes FFS by jythie · 2013-09-21 09:46 · Score: 2

Not really. When projects are already dominated by a particular language, esp projects that can have decades or more of legacy design to them, programmers who want to come in and rewrite perfectly good subsystems in their preferred language are not all that well looked upon.

Re: Its code not codes FFS by cwebster · 2013-09-21 10:00 · Score: 3, Informative

Please don't learn FORTRAN, learn Fortran instead. (For the pedantic, all caps is F77. Normal caps is F90 and later.)

Re:FUD by stenvar · 2013-09-21 10:47 · Score: 3, Informative

Care to back up those claims with actual code/numbers?

You claim to be writing high performance code and you don't understand the difference between Boost multi-array and Fortran arrays? I'm sorry, but if you do any kind of high performance computing, you should at least have a decent understanding of one of the major tools used for it, namely modern Fortran. Once you do, you can then make an informed choice, instead of behaving like an immature language zealot.

Here are two places you should start looking:

http://en.wikipedia.org/wiki/Fortran_95_language_features#Arrays_2

http://en.wikipedia.org/wiki/High_Performance_Fortran

(The Fortran code on libdecomp.org is cringe-inducing and inefficient.)

And, FWIW, I'm primarily a C++ programmer, because that's what the market demands, not a Fortran programmer, but at least I know my tools and their limitations.

My experience is that if you use C++ correctly, you get code which at least matches Fortran code.

If you use C, assembly, or Java "correctly", you can usually match Fortran code. That is entirely not the point.

Re:Its code not codes FFS by Khashishi · 2013-09-21 14:10 · Score: 2

That's because you aren't doing development on computationally expensive simulation codes that run on supercomputers. Because then you would use FORTRAN. C++ is such a memory hog, and the memory overhead scales with the number of processors. In FORTRAN, you only allocate what you need to use, and that's important when working with large arrays. Java and Ruby are out of the question.

FORTRAN is not obsolete, because there are currently no other languages that can fill the role. When running simulations that take 100000+ cpu-hours, it's worth the extra coding effort to write it in FORTRAN. Assembly language isn't being considered because generally, these codes need to run on different supercomputers which all have unique architectures. Therefore, optimizing compiling scripts exist for each supercomputer for use with FORTRAN.

Slashdot Mirror
