Why Scientists Are Still Using FORTRAN in 2014
New submitter InfoJunkie777 (1435969) writes "When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s. Meaning FORmula TRANslation, no language since has been able to match its speed. But three new contenders are explored here. Your thoughts?"
A: Legacy code.
Scientists work in formulas. Fortran was designed to do things naturally that don't fit into C/C++, Python, whatever.
Why not?
Actually that is a serious question, for these sorts of applications there seems to be no significant downside.
1) Modern Fortran is not all uppercase ....
2) Modern Fortran does not have to start on column 7
3) Modern Fortran has dynamic memory allocation
4) Modern Fortran can use the same types as C (maximizes interoperability), hence can be called where C might be called
5) Modern Fortran has an objects, polymorphism, etc.
6) Modern Fortran has (a limited form of) pointers
7) Modern Fortran has concise array/vector/matrix operations
8) Modern Fortran has dynamically allocatable, multidimensional arrays that can be indexed starting with any integer
8) Modern Fortran supports the complex type without higgery-jiggery
9) Modern Fortran doesn't *need *pointers *in *all *the *places *that &C does, pass by reference is the norm
10) Modern Fortran is blazingly fast and designed for sciene
Some folks still write in Fortran 77, and the tired tales of woe that are bound to come from a language specification that is many decades old.
But, that code/style still works, and who am I to judge how you want to get your work done?
When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s.
Perhaps it's still the best tool for the job. Why is that strange? Old(er) doesn't necessarily mean obsolete -- and new(er) doesn't necessarily mean better.
It must have been something you assimilated. . . .
No, not just "legacy code." Fortran (yes, that's how it's spelt now, not "FORTRAN") was designed to be highly optimizable. Because of the way Fortran handles such things as aliasing, it's compilers can optimize expressions a lot better than other languages.
This. I have many friends in the physics dept and the reason they're doing Fortran at all is that they're basing their own stuff off of existing Fortran stuff.
What amused me about the article was actually the Fortran versions they spoke about. F95? F03? F08? Let's be real: just about every Fortran code I've heard of is still limited to F77 (with some F90 if you're lucky). It just won't work on later versions, and it's deemed not worth porting over, so the entire codebase is stuck on almost 40 years old code.
A: Legacy code, and because Fortran 2003+ is a very good modern language for scientific computation and maps very naturally to problems. As it turns out, the language semantics (both legacy and modern constructs) make it very good to parallelize. And it runs fast, as in, equalling C++ level of performance is considered a weak showing.
If you haven't seen or used modern Fortran and think it's anything like Fortran 66/77 then you're mistaken. Except for I/O, which still tends to suck.
In addition there are still some seemingly trivial but actually important features which make it better than many alternatives (starting from Fortran 90).
There's some boneheaded clunkers in other languages which Fortran does right: obviously, built-in multi-dimensional arrays, AND, arrays whose indices can start at 0, 1 (or any other value) and of course know their size. Some algorithms are written (on paper) with 0-based indexing and others with 1-based and allowing either one to be expressed naturally lowers chance of bugs.
Another one is that Fortran distinguishes between dynamically allocatable, and pointers/references. The history of C has constrained/brain-damaged people to think that to get the first, you must necessarily take the second. That doesn't happen in Fortran, you have ALLOCATABLE arrays (or other things) for run-time allocation of storage, and if you need a pointer (rarer) you can get that too. And Fortran provides the "TARGET" attribute to indicate that something *may be pointed to/referenced*, and by default this is not allowed. No making pointers/references to things which aren't designed to be referred to multiple times. This also means that the aliasing potential is highly controlled & language semantics constructed to make Fortran able to make very aggressive, and safe, optimization assumptions.
The more parallel you want, the more of these assumptions you need to get fast code, and naturally written Fortran code comes this way out of the box than most other languages.
Legacy code that has been carefully checked to give correct results under a wide range of conditions.
A: Legacy code.
AKA battle hardened libraries that work as advertised.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Precision is important in scientific discourse. Latin isn't a language with creeping grammar and jargon. It's sorta what Esperanto only wished it could ever be.
I am sticking with Visual Basic 6
A big problem is that C and C++ don't have real multidimensional arrays. There are arrays of arrays, and fixed-sized multidimensional arrays, but not general multidimensional arrays.
FORTRAN was designed from the beginning to support multidimensional arrays efficiently. They can be declared, passed to subroutines, and iterated over efficiently along any axis. The compilers know a lot about the properties of arrays, allowing efficient vectorization, parallization, and subscript optimization.
C people do not get this. There have been a few attempts to bolt multidimensional arrays as parameters or local variables onto C, (mostly in C99) but they were incompatible with C++, Microsoft refused to implement them, and they're deprecated in the latest revision of C.
Go isn't any better. I spent some time trying to convince the Go crowd to support multdimensional arrays properly. But the idea got talked to death and lost under a pile of little-used nice features.
Firstly... 10^-15 is WAY beyond what most scientific codes care about. Most nonlinear finite-element codes generally shoot for convergence tolerances between 1e-5 and 1e-8. Most of the problems are just too hard (read: incredibly nonlinear) to solve to anything beyond that. Further, 1e-8 is generally WAY beyond the physical engineering parameters for the problem. Beyond that level we either can't measure the inputs, have uncertainty about material properties, can't perfectly represent the geometry, have discretization error etc., etc. Who cares if you can reproduce the exact same numbers down to 1e-15 when your inputs have uncertainty above 1e-3??
Secondly... lots of the best computational scientists in the world would disagree:
http://www.openfoam.org/docs/u...
http://libmesh.sourceforge.net...
http://www.dealii.org/
http://eigen.tuxfamily.org/ind...
http://trilinos.sandia.gov/
I could go on... but you're just VERY wrong... and there's no reason to spend more time on you...
People using existing Fortran code are interested in the RESULTS of the computation, not whether the code is modern or has the latest bells and whistles. Programmers forget that the ultimate goal is for someone to USE the program. I wrote a program in CDC Fortran 77 in 1978 that's still being used, Why? Because it does the job.
This is absolutely right. It's also easier to write, in many cases. Most scientific applications don't need things like lambda expressions or derived classes. Many people who write applications as tools in their research don't want to spend time learning esoteric aspects of languages.
My previous supervisor decided to fork our Fortran code for performing quantum mechanical calculations. We'd worked on it for more than half a decade and it was world-class.
.xml .xml), and it only compiles after a lot of work
... and I feel the same way about CS graduates and Fortran. They have no idea about the physics or maths involved (which is the difficult part), so the do the only thing they know which is to 'modernize' everything, making it into an incomprehensible, ungodly mess.
He handed it over to a computer science graduate (i.e. a non-physicist) who really liked all the modern trends in CS. Now, five years later:
1. the tarball is an order of magnitude larger
2. the input files are now all impenetrable
3. the code requires access to the outside (not possible on many superclusters)
4. he re-indented everything for no apparent reason
5. the variable names were changed, made into combined types and are much longer
6. as a result, the code is basically unreadable and nearly impossible to compare to the original formulae
7. code is duplicated all over the place
8. it now depends on unnecessary libraries (like the ones required to parse
9. it's about four times slower and crashes randomly
10. it generates wrong results in basic cases
To quote Linus Torvalds: "I've come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with."
Fortran, apart from being a brilliant language for numerical math, has the added benefit of keeping CS graduates at bay. I'd rather have a physicist who can't program, than a CS type who can.
(Apologies to any mathematically competent computer scientists out there)
The big thing Fortran has over C is proper support for multidimensional arrays, with powerful slicing operations built into the language. It was the inspiration for numpy arrays. My first languages were C++ and C, but when I do scientific programming, my languages of choice are now python and fortran (with f2py making it very easy to glue them together). Fortran is horrible at text processing, and has an almost absent standard library, but for scientific use, good arrays make up for that - especially when you can use python in the non-performance-critical parts.
C++ has some multidimensional array classes, but none of them are as convenient as fortran arrays. Especially when it comes to slicing. At least that's how it was the last time I checked.