Why Scientists Are Still Using FORTRAN in 2014
New submitter InfoJunkie777 (1435969) writes "When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s. Meaning FORmula TRANslation, no language since has been able to match its speed. But three new contenders are explored here. Your thoughts?"
A: Legacy code.
Scientists work in formulas. Fortran was designed to do things naturally that don't fit into C/C++, Python, whatever.
Why not?
Actually that is a serious question, for these sorts of applications there seems to be no significant downside.
1) Modern Fortran is not all uppercase ....
2) Modern Fortran does not have to start on column 7
3) Modern Fortran has dynamic memory allocation
4) Modern Fortran can use the same types as C (maximizes interoperability), hence can be called where C might be called
5) Modern Fortran has an objects, polymorphism, etc.
6) Modern Fortran has (a limited form of) pointers
7) Modern Fortran has concise array/vector/matrix operations
8) Modern Fortran has dynamically allocatable, multidimensional arrays that can be indexed starting with any integer
8) Modern Fortran supports the complex type without higgery-jiggery
9) Modern Fortran doesn't *need *pointers *in *all *the *places *that &C does, pass by reference is the norm
10) Modern Fortran is blazingly fast and designed for sciene
Some folks still write in Fortran 77, and the tired tales of woe that are bound to come from a language specification that is many decades old.
But, that code/style still works, and who am I to judge how you want to get your work done?
When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s.
Perhaps it's still the best tool for the job. Why is that strange? Old(er) doesn't necessarily mean obsolete -- and new(er) doesn't necessarily mean better.
It must have been something you assimilated. . . .
No, not just "legacy code." Fortran (yes, that's how it's spelt now, not "FORTRAN") was designed to be highly optimizable. Because of the way Fortran handles such things as aliasing, it's compilers can optimize expressions a lot better than other languages.
At least Slashdot seems to encourage re-use of commonly used responses when a question is asked.
Have gnu, will travel.
This. I have many friends in the physics dept and the reason they're doing Fortran at all is that they're basing their own stuff off of existing Fortran stuff.
What amused me about the article was actually the Fortran versions they spoke about. F95? F03? F08? Let's be real: just about every Fortran code I've heard of is still limited to F77 (with some F90 if you're lucky). It just won't work on later versions, and it's deemed not worth porting over, so the entire codebase is stuck on almost 40 years old code.
If the language accomplishes the task efficiently and effectively with no apparent downside then why attempt to switch languages simply for the sake of switching?
Furthermore, an ability to run legacy code should be sustained especially in science where being able to use that code again after many years might save scientists from having to reverse engineer past discoveries.
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
A: Legacy code, and because Fortran 2003+ is a very good modern language for scientific computation and maps very naturally to problems. As it turns out, the language semantics (both legacy and modern constructs) make it very good to parallelize. And it runs fast, as in, equalling C++ level of performance is considered a weak showing.
If you haven't seen or used modern Fortran and think it's anything like Fortran 66/77 then you're mistaken. Except for I/O, which still tends to suck.
In addition there are still some seemingly trivial but actually important features which make it better than many alternatives (starting from Fortran 90).
There's some boneheaded clunkers in other languages which Fortran does right: obviously, built-in multi-dimensional arrays, AND, arrays whose indices can start at 0, 1 (or any other value) and of course know their size. Some algorithms are written (on paper) with 0-based indexing and others with 1-based and allowing either one to be expressed naturally lowers chance of bugs.
Another one is that Fortran distinguishes between dynamically allocatable, and pointers/references. The history of C has constrained/brain-damaged people to think that to get the first, you must necessarily take the second. That doesn't happen in Fortran, you have ALLOCATABLE arrays (or other things) for run-time allocation of storage, and if you need a pointer (rarer) you can get that too. And Fortran provides the "TARGET" attribute to indicate that something *may be pointed to/referenced*, and by default this is not allowed. No making pointers/references to things which aren't designed to be referred to multiple times. This also means that the aliasing potential is highly controlled & language semantics constructed to make Fortran able to make very aggressive, and safe, optimization assumptions.
The more parallel you want, the more of these assumptions you need to get fast code, and naturally written Fortran code comes this way out of the box than most other languages.
Legacy code that has been carefully checked to give correct results under a wide range of conditions.
If it ain't broke - don't fix it.
#DeleteChrome
Huge libraries of FORTRAN code have been formally proven. New FORTRAN code can be formally proven. Due the limitations of the language, it is possible to put the code through formal processes to prove the code is correct. In addition, again, as a benefit of those limitations, it is very easy to auto-parallelize FORTRAN code.
"To those who are overly cautious, everything is impossible. "
A: Legacy code.
AKA battle hardened libraries that work as advertised.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Large scale models handling huge arrays, though - like climate or weather modeling - I think that's where Fortran has always been king of the roost.
The whole point is speed. No one's working in Python if they're interested in speed.
#DeleteChrome
no, used because Fortran is the high level language that produces the fastest code for numeric computation, it is by far the most optimizable. Yes, it blows away C.
Precision is important in scientific discourse. Latin isn't a language with creeping grammar and jargon. It's sorta what Esperanto only wished it could ever be.
I am sticking with Visual Basic 6
A big problem is that C and C++ don't have real multidimensional arrays. There are arrays of arrays, and fixed-sized multidimensional arrays, but not general multidimensional arrays.
FORTRAN was designed from the beginning to support multidimensional arrays efficiently. They can be declared, passed to subroutines, and iterated over efficiently along any axis. The compilers know a lot about the properties of arrays, allowing efficient vectorization, parallization, and subscript optimization.
C people do not get this. There have been a few attempts to bolt multidimensional arrays as parameters or local variables onto C, (mostly in C99) but they were incompatible with C++, Microsoft refused to implement them, and they're deprecated in the latest revision of C.
Go isn't any better. I spent some time trying to convince the Go crowd to support multdimensional arrays properly. But the idea got talked to death and lost under a pile of little-used nice features.
Firstly... 10^-15 is WAY beyond what most scientific codes care about. Most nonlinear finite-element codes generally shoot for convergence tolerances between 1e-5 and 1e-8. Most of the problems are just too hard (read: incredibly nonlinear) to solve to anything beyond that. Further, 1e-8 is generally WAY beyond the physical engineering parameters for the problem. Beyond that level we either can't measure the inputs, have uncertainty about material properties, can't perfectly represent the geometry, have discretization error etc., etc. Who cares if you can reproduce the exact same numbers down to 1e-15 when your inputs have uncertainty above 1e-3??
Secondly... lots of the best computational scientists in the world would disagree:
http://www.openfoam.org/docs/u...
http://libmesh.sourceforge.net...
http://www.dealii.org/
http://eigen.tuxfamily.org/ind...
http://trilinos.sandia.gov/
I could go on... but you're just VERY wrong... and there's no reason to spend more time on you...
APL-style languages should be even more optimizable, since they use higher-order array operators that make the control flow and data flow highly explicit without the need to recover information from loopy code using auto-vectorizers, and easily yield parallel code. By this logic, in our era of cheap vector/GPU hardware, APL-family languages should be even more popular than Fortran!
Ezekiel 23:20
Also "legacy training". Student learns from prof. Student becomes prof. Cycle repeats.
Also Fortran didn't stagnate in the 60s, it's been evolving over time.
Other languages are highly optimizable too. However most of the new and "cool" languages I've seen in the last ten years are all basic scripting languages, great for the web or It work but awful for doing lots of work in a short period of time. It's no mystery why Fortran, C/C++, and Ada are still surviving in areas where no just-in-time wannabe will flourish.
Well, we live in a somewhat different world today, given that suitable HW for that is virtually everywhere. But just to be clear, I'm not suggesting anyone should adopt APL's "syntax". It's more about the array language design principles. Syntax-wise, I'd personally like something along the lines of Nile, with math operators where suitable, and with some type inference and general "in-language intelligence" thrown into the mix to make it concise. I realize that depriving people of their beloved imperative loops might seem cruel, but designing the language in a way that would make obvious coding styles easily executed on vector machines seems a bit saner to me than allowing people to write random loops and then either hope that the vectorizer will sort it out (they're still very finicky about their input) or provide people with examples what they should and shouldn't be writing if they want it to run fast.
Ezekiel 23:20
People using existing Fortran code are interested in the RESULTS of the computation, not whether the code is modern or has the latest bells and whistles. Programmers forget that the ultimate goal is for someone to USE the program. I wrote a program in CDC Fortran 77 in 1978 that's still being used, Why? Because it does the job.
For years and years and years the Gnu G95 compiler was only a partial implementation of the language. This made it impossible to use without buying a complier from intel or absoft or some other vendor. It chokes the life out of it for casual use.
Personallyt I really like a combination of F77 and python. Whats cool a bout it is that F77 compiles so damn fast that you can have python spit out optimized F77 for your specific case sizes. Then for the human interface and dynamic memory allocation and glue to other libraries you can use python.
Some drink at the fountain of knowledge. Others just gargle.
I would also hazard a guess that Fortran tends to be a tad easier to read than C... Especially for scientists...
This is absolutely right. It's also easier to write, in many cases. Most scientific applications don't need things like lambda expressions or derived classes. Many people who write applications as tools in their research don't want to spend time learning esoteric aspects of languages.
Please... it's spelt "c" now.
Dark Reflection
Blabbing on and on about vector based GPUs is idiocy, because not everything uses trig where vector based processing is beneficial. I have no confidence you have ever seen math intensive code based on what you are talking about. Nile from their own page is # The Nile Programming Language ## Declarative Stream Processing for Media Applications and NOT a language for Math.
I find this view quite amusing, given that the whole scope of the VPRI project (of which Nile has been of the intermediate results) is to reduce everything in common personal computing into mathematics. And Nile in particular was designed precisely and explicitly to allow the VPRI people to express as wide an array of graphical operations using as short a high-level description as possible - a mathematical description, in equational form, to allow them to express the majority of Cairo (or any other Cairo-like 2D library) in a few hundred lines of these equations. Furthermore, nowhere have I made the claim that the semantics of Nile in its current form is a perfect replacement for any language for scientific computation, as opposed to the thought that there could be some lessons to be learned.
And why don't you log in? There seem to be quite a few anonymous psychotic individuals running around here recently. It makes the conversation feel quite disingenuous.
Ezekiel 23:20
Also "legacy training". Student learns from prof. Student becomes prof. Cycle repeats.
Not really - even when I was a student we ditched F77 whenever we possibly could and used C or C++. The issue is more legacy programmers. Often the person in charge of a project is a older person who knows FORTRAN and does not want to spend the time to learn a new language like C (or even C++!). Hence they fall back into something more comfortable.
However by now even this is not the case. The software in particle physics is almost exclusively C++ and/or Python. The only things that I am aware of which are still FORTRAN are some Monte-Carlo event generators which are written by theorists. My guess is that as experimentalists even older colleagues have to learn C++ and Python to use and program modern hardware. Theorists can get by using any language they want and so are slower to change. Certainly it has probably been at least 15 years since I wrote any FORTRAN myself and even then what I wrote was the code needed to test the F77 interface to a rapid C I/O framework for events which was ~1-200 times faster than the F77 code it replaced.
My previous supervisor decided to fork our Fortran code for performing quantum mechanical calculations. We'd worked on it for more than half a decade and it was world-class.
.xml .xml), and it only compiles after a lot of work
... and I feel the same way about CS graduates and Fortran. They have no idea about the physics or maths involved (which is the difficult part), so the do the only thing they know which is to 'modernize' everything, making it into an incomprehensible, ungodly mess.
He handed it over to a computer science graduate (i.e. a non-physicist) who really liked all the modern trends in CS. Now, five years later:
1. the tarball is an order of magnitude larger
2. the input files are now all impenetrable
3. the code requires access to the outside (not possible on many superclusters)
4. he re-indented everything for no apparent reason
5. the variable names were changed, made into combined types and are much longer
6. as a result, the code is basically unreadable and nearly impossible to compare to the original formulae
7. code is duplicated all over the place
8. it now depends on unnecessary libraries (like the ones required to parse
9. it's about four times slower and crashes randomly
10. it generates wrong results in basic cases
To quote Linus Torvalds: "I've come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with."
Fortran, apart from being a brilliant language for numerical math, has the added benefit of keeping CS graduates at bay. I'd rather have a physicist who can't program, than a CS type who can.
(Apologies to any mathematically competent computer scientists out there)
The big thing Fortran has over C is proper support for multidimensional arrays, with powerful slicing operations built into the language. It was the inspiration for numpy arrays. My first languages were C++ and C, but when I do scientific programming, my languages of choice are now python and fortran (with f2py making it very easy to glue them together). Fortran is horrible at text processing, and has an almost absent standard library, but for scientific use, good arrays make up for that - especially when you can use python in the non-performance-critical parts.
C++ has some multidimensional array classes, but none of them are as convenient as fortran arrays. Especially when it comes to slicing. At least that's how it was the last time I checked.
Fortran has been an "APL style language" since Fortran 95, with most of the APL operations present. That was done both for optimization and for convenience. And other APL-style languages are very popular as well, foremost MATLAB.
ALL CAPS has been optional since 1990, at least.
Fortran has had modularisation, structured code since 1990, Classes and object-orientated since 2003. Please update your prejudices.
Anyone who believes exponential growth can go on forever in a finite world is either a madman or an economist
Wow, faster AND more accurate. They must use some mystical floating-point instructions that only Fortran compiler writers know about.
On PPC implementations, head-tail floating point is typically used for "long double"; this leads to inaccuracies in calculations. 80 bit Intel floating point is also inaccurate. So are SSE "vector" instructions, since denormals, NaNs, INFs, and -0 are always suspect unless you compiler emits an extra instruction in order to trigger the "next instruction after" signalling of the condition, and for NaNs, you are still somewhat suspect there.
If it isn't IEEE-754 compliant, you pretty much can't trust it. FORTRAN goes way the heck out of its way, including issuing additional instructions and introducing pipeline stalls, in order to force IEE-754 compliance.
Pretty much this accuracy only matters if you are doing Science(tm); if you are doing graphics, you are generally willing to eat the occasional FP induced artifact, because what you typically care about is the frame rate in your game, rather than being 100% accurate.
So, in closing, they're not using "some mystical floating-point instructions", they are just using accurate floating point, rather than approximate floating point.
Isn't the main performance benefit that Fortran has always claimed over C/C++ the fact that an array is guaranteed to only be used from one thread at a time, and thus you don't have to re-read from memory to registers each time you want to do something with the data in the array? A capability that was formally added to C in C99 (and pretty much universally informally added to C++) with the restrict keyword?
Correct me if I'm wrong here, as I'm not a Fortran programmer.
By a scallop's forelocks!
He handed it over to a computer science graduate (i.e. a non-physicist) who really liked all the modern trends in CS.
Why was a graduate fresh out of university put in charge of architecture decisions? You wouldn't put an apprentice in charge of a mechanical workshop and expect them to keep it tidy and efficient, this is no different.
It's my general experience that it takes 5-10 years of commercial experience before someone is capable of making wise architecture choices about small standalone apps, and 15+ before they'll have a hope in hell of doing anything non-destructive with a large legacy application.
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
In the cutting edge world of biological research scientists are still using the ancient language latin to name, classify and describe species of organism. The thing is why would they change?
Korma: Good
Yes, FORTRAN sucks, but it is stable, fast and well understood. It runs on a number of supercomputer architectures. It is way easier to program in FORTRAN than in C for non-CS people. So what is the issue? Oh, maybe that this is not a "modern" language? Here is news for you: Java sucks a lot more than FORTRAN.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.