Why Scientists Are Still Using FORTRAN in 2014
New submitter InfoJunkie777 (1435969) writes "When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s. Meaning FORmula TRANslation, no language since has been able to match its speed. But three new contenders are explored here. Your thoughts?"
A: Legacy code.
Scientists work in formulas. Fortran was designed to do things naturally that don't fit into C/C++, Python, whatever.
Why not?
Actually that is a serious question, for these sorts of applications there seems to be no significant downside.
At work in the recent past (the 2000s) we were still supporting FORTRAN on the SGI machines we had running. The SGI compilers would optimize the hell out of the code and get it all parallelized, ready to eat up all the CPUs.
Newer isn't always better.
Trolling is a art,
1) Modern Fortran is not all uppercase ....
2) Modern Fortran does not have to start on column 7
3) Modern Fortran has dynamic memory allocation
4) Modern Fortran can use the same types as C (maximizes interoperability), hence can be called where C might be called
5) Modern Fortran has objects, polymorphism, etc.
6) Modern Fortran has (a limited form of) pointers
7) Modern Fortran has concise array/vector/matrix operations
8) Modern Fortran has dynamically allocatable, multidimensional arrays that can be indexed starting with any integer
9) Modern Fortran supports the complex type without higgery-jiggery
10) Modern Fortran doesn't *need *pointers *in *all *the *places *that &C does, pass by reference is the norm
11) Modern Fortran is blazingly fast and designed for science
Some folks still write in Fortran 77, and with it come the tired tales of woe that are bound to arise from a language specification that is many decades old.
But, that code/style still works, and who am I to judge how you want to get your work done?
When you go to any place where 'cutting edge' scientific research is going on, strangely the computer language of choice is FORTRAN, the first computer language commonly used, invented in the 1950s.
Perhaps it's still the best tool for the job. Why is that strange? Old(er) doesn't necessarily mean obsolete -- and new(er) doesn't necessarily mean better.
It must have been something you assimilated. . . .
No, not just "legacy code." Fortran (yes, that's how it's spelt now, not "FORTRAN") was designed to be highly optimizable. Because of the way Fortran handles such things as aliasing, it's compilers can optimize expressions a lot better than other languages.
I have a PhD in engineering, and my dissertation involved writing lots of code. Now I work at a national lab in the US, and I and nearly all of my coworkers work on scientific or engineering codes of some sort. Although there are significant amounts of legacy code written in Fortran lying around (a project I work on uses a Fortran library written in 1973), very little development is done in that language. It's all C++ or Python.
At least Slashdot seems to encourage re-use of commonly used responses when a question is asked.
Have gnu, will travel.
The biggest reason of interest is that it helps non-computer-science scientists write computational codes without having to devote excessive amounts of time to memory management or deviate from the classic imperative programming model. And it is also important for a purely non-technical reason: a generation of domain experts in engineering and scientific fields were trained on FORTRAN codes.
... there is a reason :)
As managers of High Performance Computing platforms, we generally take an a-religious approach and deliver to the users all possible permutations of language types that a given community may need. The following is a very common setup, containing both GNU & Intel compilers: https://hpc.uni.lu/users/softw...
btw. I'm not defending Fortran in any kind of way; ask any Fortran fan which language his compilers are written in
After all, it was "For Tran".
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
This. I have many friends in the physics dept and the reason they're doing Fortran at all is that they're basing their own stuff off of existing Fortran stuff.
What amused me about the article was actually the Fortran versions they spoke about. F95? F03? F08? Let's be real: just about every Fortran code I've heard of is still limited to F77 (with some F90 if you're lucky). It just won't work on later versions, and it's deemed not worth porting over, so the entire codebase is stuck on almost 40-year-old code.
If the language accomplishes the task efficiently and effectively with no apparent downside then why attempt to switch languages simply for the sake of switching?
Furthermore, an ability to run legacy code should be sustained especially in science where being able to use that code again after many years might save scientists from having to reverse engineer past discoveries.
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
Seconded. And the legacy isn't necessarily just the source code. Many of the engineering industries using such codes have a relatively low turnover rate, meaning an older group of engineers and researchers with the most experience stick around for decades. Most of these folks used Fortran since college. It works for them, and they aren't concerned with any "new-fangled" languages that offer more features. Another reason I hear from these folks is that Fortran has powerful array slicing and indexing syntax not found in C, making big data manipulation simpler. Newer programming languages like Python have packages like NumPy which offer similar capabilities, but it's often a nightmare to translate hundreds of thousands of legacy code lines simply to "escape" Fortran. And there are decent bindings to Fortran that can be leveraged for many parallel computing packages (MPI), which means even less incentive to move up.
Newer folks entering the field often work under the tutelage or mentoring of these folks, and Fortran sticks around. Python is gaining usage in the scientific communities, and it's often coupled with mixed-language wrapping code like f2py or SWIG to access any legacy Fortran code for heavy number-crunching work. I've seen this recipe used successfully in parallel computing to detach some of the "administrative" aspects of scientific code into newer languages.
A: Legacy code, and because Fortran 2003+ is a very good modern language for scientific computation and maps very naturally to problems. As it turns out, the language semantics (both legacy and modern constructs) make it very good to parallelize. And it runs fast, as in, equalling C++ level of performance is considered a weak showing.
If you haven't seen or used modern Fortran and think it's anything like Fortran 66/77 then you're mistaken. Except for I/O, which still tends to suck.
In addition there are still some seemingly trivial but actually important features which make it better than many alternatives (starting from Fortran 90).
There's some boneheaded clunkers in other languages which Fortran does right: obviously, built-in multi-dimensional arrays, AND, arrays whose indices can start at 0, 1 (or any other value) and of course know their size. Some algorithms are written (on paper) with 0-based indexing and others with 1-based and allowing either one to be expressed naturally lowers chance of bugs.
Another one is that Fortran distinguishes between dynamically allocatable, and pointers/references. The history of C has constrained/brain-damaged people to think that to get the first, you must necessarily take the second. That doesn't happen in Fortran, you have ALLOCATABLE arrays (or other things) for run-time allocation of storage, and if you need a pointer (rarer) you can get that too. And Fortran provides the "TARGET" attribute to indicate that something *may be pointed to/referenced*, and by default this is not allowed. No making pointers/references to things which aren't designed to be referred to multiple times. This also means that the aliasing potential is highly controlled & language semantics constructed to make Fortran able to make very aggressive, and safe, optimization assumptions.
The more parallel you want, the more of these assumptions you need to get fast code, and naturally written Fortran code comes this way out of the box more than code in most other languages.
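A rough sketch of the custom-bounds point above, written in Python rather than Fortran. The class name `BoundedArray` and its methods are invented for illustration; it only mimics what a Fortran declaration such as `real :: a(-2:2, 0:3)` provides natively. Storage is column-major, as in Fortran.

```python
class BoundedArray:
    """2-D array with arbitrary inclusive index bounds, stored column-major.

    Hypothetical helper for illustration only; not part of any library.
    """

    def __init__(self, rows, cols, fill=0.0):
        # rows and cols are (lower, upper) inclusive bounds.
        self.rlo, self.rhi = rows
        self.clo, self.chi = cols
        self.nrows = self.rhi - self.rlo + 1
        self.ncols = self.chi - self.clo + 1
        self.data = [fill] * (self.nrows * self.ncols)

    def _offset(self, i, j):
        if not (self.rlo <= i <= self.rhi and self.clo <= j <= self.chi):
            raise IndexError(f"({i}, {j}) is outside the declared bounds")
        # Column-major: consecutive row indices sit next to each other.
        return (i - self.rlo) + (j - self.clo) * self.nrows

    def __getitem__(self, ij):
        return self.data[self._offset(*ij)]

    def __setitem__(self, ij, value):
        self.data[self._offset(*ij)] = value

    def shape(self):
        # The array knows its own size, so callers never pass it separately.
        return (self.nrows, self.ncols)


a = BoundedArray(rows=(-2, 2), cols=(0, 3))
a[-2, 0] = 1.0   # first element in storage
a[2, 3] = 9.0    # last element in storage
print(a.shape())
```

The algorithm on paper can keep whatever index base it was written with; the translation to a flat offset happens in one place instead of being scattered through the loops.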
Legacy code that has been carefully checked to give correct results under a wide range of conditions.
If it ain't broke - don't fix it.
#DeleteChrome
Huge libraries of FORTRAN code have been formally proven. New FORTRAN code can be formally proven. Due to the limitations of the language, it is possible to put the code through formal processes to prove the code is correct. In addition, again, as a benefit of those limitations, it is very easy to auto-parallelize FORTRAN code.
"To those who are overly cautious, everything is impossible. "
A: Legacy code.
AKA battle hardened libraries that work as advertised.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Large scale models handling huge arrays, though - like climate or weather modeling - I think that's where Fortran has always been king of the roost.
The whole point is speed. No one's working in Python if they're interested in speed.
#DeleteChrome
Someone told me (in 1986, I think) "It's amazing. You just write this documentation, and it runs!"
no, used because Fortran is the high level language that produces the fastest code for numeric computation, it is by far the most optimizable. Yes, it blows away C.
Precision is important in scientific discourse. Latin isn't a language with creeping grammar and jargon. It's sorta what Esperanto only wished it could ever be.
If you're using C++ for scientific math, then you deserve to have whatever credentials you may possess revoked immediately. No language should be used for scientific math that can produce different results based upon the version of library or platform it is compiled against.
You also cannot prove C++ code is good. You just can't. C++ is not deterministic, again, because the outcome depends on platform/library versions, compiler options, time of day, alignment of the planets, and many other factors. There is no way to say for certain that "Yes, this code will produce the correct results under all conditions."
The big name modern codes that are getting run on the biggest machines are generally done in C and C++ and producing incorrect results.
I have PoC code that I have used to prove that C++ can produce incorrect results based on factors other than the code itself, and at the level of significance as high as 10^-15. That is a completely unacceptable level of inaccuracy for scientific exploration.
I am sticking with Visual Basic 6
A big problem is that C and C++ don't have real multidimensional arrays. There are arrays of arrays, and fixed-sized multidimensional arrays, but not general multidimensional arrays.
FORTRAN was designed from the beginning to support multidimensional arrays efficiently. They can be declared, passed to subroutines, and iterated over efficiently along any axis. The compilers know a lot about the properties of arrays, allowing efficient vectorization, parallelization, and subscript optimization.
C people do not get this. There have been a few attempts to bolt multidimensional arrays as parameters or local variables onto C, (mostly in C99) but they were incompatible with C++, Microsoft refused to implement them, and they're deprecated in the latest revision of C.
Go isn't any better. I spent some time trying to convince the Go crowd to support multidimensional arrays properly. But the idea got talked to death and lost under a pile of little-used nice features.
Firstly... 10^-15 is WAY beyond what most scientific codes care about. Most nonlinear finite-element codes generally shoot for convergence tolerances between 1e-5 and 1e-8. Most of the problems are just too hard (read: incredibly nonlinear) to solve to anything beyond that. Further, 1e-8 is generally WAY beyond the physical engineering parameters for the problem. Beyond that level we either can't measure the inputs, have uncertainty about material properties, can't perfectly represent the geometry, have discretization error etc., etc. Who cares if you can reproduce the exact same numbers down to 1e-15 when your inputs have uncertainty above 1e-3??
Secondly... lots of the best computational scientists in the world would disagree:
http://www.openfoam.org/docs/u...
http://libmesh.sourceforge.net...
http://www.dealii.org/
http://eigen.tuxfamily.org/ind...
http://trilinos.sandia.gov/
I could go on... but you're just VERY wrong... and there's no reason to spend more time on you...
APL-style languages should be even more optimizable, since they use higher-order array operators that make the control flow and data flow highly explicit without the need to recover information from loopy code using auto-vectorizers, and easily yield parallel code. By this logic, in our era of cheap vector/GPU hardware, APL-family languages should be even more popular than Fortran!
Ezekiel 23:20
Also "legacy training". Student learns from prof. Student becomes prof. Cycle repeats.
Also Fortran didn't stagnate in the 60s, it's been evolving over time.
Other languages are highly optimizable too. However most of the new and "cool" languages I've seen in the last ten years are all basic scripting languages, great for the web or IT work but awful for doing lots of work in a short period of time. It's no mystery why Fortran, C/C++, and Ada are still surviving in areas where no just-in-time wannabe will flourish.
They both generate machine code. But they get there in different ways and produce very different output. It would be more correct to say FORTRAN (compilers) blows away any C compilers. (esp. gcc)
F77+extensions, usually DEC extensions. Very very few people ever used strict F77 with no extensions.
Some of the issues this causes are irritating bordering on unnerving. This week we discovered that g77 didn't care for treating INTEGER as LOGICAL. It used to be that there was no other way to specify bit operations; now it is precluded. Everybody's code has that, and there's really nothing intrinsically wrong or difficult to understand about it, but it was technically non-standard (although everyone's extensions permitted it) and it won't work on g77 - maybe only with the infamous -fugly flag.
If you get hung up on floating point truncation errors, then I have bad news for you: Fortran won't protect you from that. You seem to be under the delusional impression that this invalidates the results for some reason. This is utter bullshit. One example is molecular dynamics simulations. An MD simulation is a chaotic system. The _exact_ trajectory is not the relevant result. The phase space that is sampled is. Trajectories of systems with identical initial conditions are bound to diverge on different machines due to a change in floating point operation order and the resulting truncation errors. But the phase space that is sampled is _equivalent_ in each run. If for some reason machine precision is important to you, you'd be much better off using a library such as GMP (https://gmplib.org/).
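To make the operation-order point concrete, here is a minimal sketch in Python (any language using IEEE 754 doubles behaves the same way, Fortran included). The variable names are only for illustration. The same three numbers are summed in two different orders, which is exactly what a different compiler, optimization level, or parallel reduction can do to a loop.

```python
# Floating point addition is not associative: the grouping changes the
# rounding, so two mathematically identical sums can differ in the last bit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

difference = abs(left - right)

print(left == right)   # False
print(difference)      # tiny, around the 1e-16 level
```

A discrepancy of this size is rounding noise from reordering, not evidence that either result is wrong.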
Yeah, I used to hear that argument a lot in 1978...
OP here. This is what the article said. Compilers are the key. They have been around a long time. Another key is that commercial compilers (like Intel for example) further increase the speed, as the manufacturers know how to optimize the code for the specific CPU at hand.
Don't explain computers to laymen. Simpler to explain sex to a virgin. -- Robert A. Heinlein
Well, we live in a somewhat different world today, given that suitable HW for that is virtually everywhere. But just to be clear, I'm not suggesting anyone should adopt APL's "syntax". It's more about the array language design principles. Syntax-wise, I'd personally like something along the lines of Nile, with math operators where suitable, and with some type inference and general "in-language intelligence" thrown into the mix to make it concise. I realize that depriving people of their beloved imperative loops might seem cruel, but designing the language in a way that would make obvious coding styles easily executed on vector machines seems a bit saner to me than allowing people to write random loops and then either hope that the vectorizer will sort it out (they're still very finicky about their input) or provide people with examples what they should and shouldn't be writing if they want it to run fast.
Ezekiel 23:20
This. I have many friends in the physics dept and the reason they're doing Fortran at all is that they're basing their own stuff off of existing Fortran stuff.
The types of people who haven't heard about collaborative development or, dare I say, version control.
I've been there. You end up with a zillion diverging (and never merging) forks, people reinventing various wheels over and over, and of course adding their own bugs.
This is a terribly unproductive and sad way of developing code. Unfortunately most scientists I know (knew) don't give a crap because they are _completely_ oblivious to what they are missing out on.
> it's[sic] compilers can optimize expressions a lot better than other languages.
No. Wrong. C has had similar aliasing rules for a very long time. It's only recently that they got turned on in GCC by default. This is what all the -f(no-)strict-aliasing nonsense is all about. The problem is, people know that once, before C89, Fortran was faster. So of course it's still faster. Why would anyone ever modify C to be faster? That's just dumb.
Also Blitz++ uses template metaprogramming and a DSL (that looks just like normal c-vector math) to optimize BLAS far beyond what fortran can do. A fairly minor example is rewriting the expression tree for (A+B)[2] to (A[2]+B[2]) at compile time.
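For readers who haven't seen the trick, a sketch of the idea behind that rewrite, in Python. This is not Blitz++'s API: the class names `Vec` and `Sum` are made up, and Blitz++ does the equivalent at compile time with C++ templates, whereas this toy does it at run time. The point is only that `A + B` builds an unevaluated expression node, so indexing it computes one element without ever materializing the full temporary vector.

```python
class Vec:
    """Plain vector of numbers (hypothetical illustration class)."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __getitem__(self, i):
        return self.values[i]

    def __add__(self, other):
        # Do not add anything yet; just record the operation.
        return Sum(self, other)


class Sum:
    """Unevaluated elementwise sum of two vector-like operands."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __len__(self):
        return len(self.left)

    def __getitem__(self, i):
        # (A + B)[i] is computed as A[i] + B[i], with no temporary vector.
        return self.left[i] + self.right[i]

    def __add__(self, other):
        return Sum(self, other)

    def evaluate(self):
        # One fused pass over the whole expression tree.
        return Vec(self[i] for i in range(len(self)))


A = Vec([1.0, 2.0, 3.0])
B = Vec([10.0, 20.0, 30.0])

expr = A + B        # a Sum node, nothing computed so far
third = expr[2]     # touches only A[2] and B[2]
print(third)        # 33.0
```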
People using existing Fortran code are interested in the RESULTS of the computation, not whether the code is modern or has the latest bells and whistles. Programmers forget that the ultimate goal is for someone to USE the program. I wrote a program in CDC Fortran 77 in 1978 that's still being used. Why? Because it does the job.
My thought at reading the summary was "Do older languages have some sort of expiration date I don't know about?"
That is because you aren't a hipster or fad brogrammer. These idiots probably expect them to be using Node.js or some such bullshit.
In some cases, the client side is language-limited, and everything has to be translated to one language before it can be deployed. In the case of iPhone OS (now iOS) during the second quarter of 2010, this was Objective-C++. In the case of Windows Phone 7 apps and Xbox Live Indie Games, this is the subset of verifiably type-safe CIL accepted by the .NET Compact Framework (which in practice means C#). In the case of web applications, this is JavaScript. In order to ensure that the client-side prevalidation in your web application has provably the same behavior as the validation in the server side, they need to be written in the same language in the same source code files. If your client is in JavaScript, having a server also in JavaScript will let you write the application logic once and not repeat yourself. Hence Node.js.
A: Legacy code, and because Fortran 2003+ is a very good modern language for scientific computation and maps very naturally to problems
See.... Fortran 2003 is more modern than ISO 1999 C.... Now that that's settled... How come people are still programming in languages like C/C++/Java, when Fortran2003 is available?
For years and years and years the Gnu G95 compiler was only a partial implementation of the language. This made it impossible to use without buying a compiler from Intel or Absoft or some other vendor. It choked the life out of it for casual use.
Personally I really like a combination of F77 and Python. What's cool about it is that F77 compiles so damn fast that you can have Python spit out optimized F77 for your specific case sizes. Then for the human interface and dynamic memory allocation and glue to other libraries you can use Python.
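A minimal sketch of what that workflow might look like, assuming nothing about the poster's actual setup. The function name `emit_dot_f77` and the generated routine names (`DOT3` and so on) are hypothetical. Python writes out a fixed-form Fortran 77 dot product fully unrolled for one specific length; compiling and linking the result is left out.

```python
def emit_dot_f77(n):
    """Return fixed-form Fortran 77 source for a dot product unrolled to length n.

    Illustrative only: a real generator would also handle loops that are
    too long to unroll, and continuation lines.
    """
    name = f"DOT{n}"
    if len(name) > 6:
        raise ValueError("strict Fortran 77 names are limited to 6 characters")

    indent = " " * 6  # fixed form: statements start in column 7
    lines = [
        f"{indent}DOUBLE PRECISION FUNCTION {name}(A, B)",
        f"{indent}DOUBLE PRECISION A({n}), B({n})",
        f"{indent}{name} = 0.0D0",
    ]
    # One statement per term keeps every line well under 72 columns.
    for i in range(1, n + 1):
        lines.append(f"{indent}{name} = {name} + A({i})*B({i})")
    lines.append(f"{indent}RETURN")
    lines.append(f"{indent}END")
    return "\n".join(lines) + "\n"


source = emit_dot_f77(3)
print(source)
```

The generated text would then be written to a file and handed to whatever Fortran compiler is available.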
Some drink at the fountain of knowledge. Others just gargle.
The reason this is true is that Fortran compiler output, when appropriate optimizations are turned on, is not actually concerned with producing correct output... and although it is certainly faster, the resulting code will be less robust than code output by most modern C compilers. Modern compilers generally focus on producing correct output at all times, and may make compromises in efficiency for correctness.
The main differences involve pointer aliasing... and a C or C++ compiler with the appropriate optimization switches to ignore pointer aliasing possibilities, as Fortran does, will produce code that is just as fast as that which a Fortran compiler can produce.
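A small illustration of why aliasing matters to an optimizer, sketched in Python so the semantics are easy to follow; the function names are invented. The first function re-reads `src[0]` on every iteration, which is what a compiler must do if `dst` and `src` might be the same storage. The second reads it once, which is the transformation a compiler is free to make only when it may assume the two arguments do not overlap.

```python
def add_first(dst, src):
    """dst[i] += src[0], re-reading src[0] each time (safe under aliasing)."""
    for i in range(len(dst)):
        dst[i] = dst[i] + src[0]


def add_first_hoisted(dst, src):
    """Same loop with src[0] read once, as a no-alias optimizer would emit."""
    first = src[0]
    for i in range(len(dst)):
        dst[i] = dst[i] + first


# Distinct arrays: both versions agree.
x, y = [1, 2, 3], [1, 2, 3]
add_first(x, y)
print(x)   # [2, 3, 4]

# Same array passed twice: the two versions disagree.
p = [1, 2, 3]
add_first(p, p)
print(p)   # [2, 4, 5], because p[0] changed after the first iteration

q = [1, 2, 3]
add_first_hoisted(q, q)
print(q)   # [2, 3, 4]
```

When the arguments overlap, hoisting the load changes the answer, so a compiler that cannot rule out overlap has to keep the slower form.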
File under 'M' for 'Manic ranting'
I would also hazard a guess that Fortran tends to be a tad easier to read than C... Especially for scientists...
Blitz++ is hardly the pinnacle of what should be possible with proper array languages. Think of what you could do with higher-order operators - for example, interprocedural loop fusion becomes trivial, and one could probably come up with many other operations optimizable across procedure/function/subroutine (whatever you want to call it) boundaries as well. Blitz++ was neat but it can't beat a dedicated compiler for an array language (by which I most certainly don't mean stateful loopy Fortran). Although I agree that the C++/C interoperability is a huge plus.
Ezekiel 23:20
This is absolutely right. It's also easier to write, in many cases. Most scientific applications don't need things like lambda expressions or derived classes. Many people who write applications as tools in their research don't want to spend time learning esoteric aspects of languages.
Please... it's spelt "c" now.
Dark Reflection
No, you are wrong by being terribly incomplete. People who use Fortran (or any language for that matter) are interested in getting the correct answer, in the fastest reasonable time, with the shortest amount of developer work (and perhaps cost matters). In the scientific domain, Fortran remains competitive. The precise cost function that balances these considerations depends on many variables and is precisely why we have so many languages today.
Missing out on what, the wonders and simplicity of using git? (/sarcasm)
Many Fortran programmers are academics, and academics are often times (not always) concerned with one-off programs that demonstrate a computer-based solution to a novel phenomenon. Often times, the investigator works alone. Once that is done, sometimes the code is never visited again. In these cases, anything more than VMS-style versioning is total overkill.
I agree that version control is important and often ignored. But this is not specific to Fortran -- rather, version control adds overhead both in using and learning -- and academics are often putting all of their efforts into the science.
APL-family languages should be even more popular than Fortran!
Probably would be if it wasn't a write only language.
When our name is on the back of your car, we're behind you all the way!
Blabbing on and on about vector based GPUs is idiocy, because not everything uses trig where vector based processing is beneficial. I have no confidence you have ever seen math intensive code based on what you are talking about. Nile from their own page is # The Nile Programming Language ## Declarative Stream Processing for Media Applications and NOT a language for Math.
I find this view quite amusing, given that the whole scope of the VPRI project (of which Nile has been one of the intermediate results) is to reduce everything in common personal computing into mathematics. And Nile in particular was designed precisely and explicitly to allow the VPRI people to express as wide an array of graphical operations using as short a high-level description as possible - a mathematical description, in equational form, to allow them to express the majority of Cairo (or any other Cairo-like 2D library) in a few hundred lines of these equations. Furthermore, nowhere have I made the claim that the semantics of Nile in its current form is a perfect replacement for any language for scientific computation, as opposed to the thought that there could be some lessons to be learned.
And why don't you log in? There seem to be quite a few anonymous psychotic individuals running around here recently. It makes the conversation feel quite disingenuous.
Ezekiel 23:20
Latin was the one language that all academics shared.
...you mean, those who didn't speak Hebrew, Greek, or Arabic?
Ezekiel 23:20
Most people learned Fortran in a class intended to teach scientific programming. I have never, ever, ever seen a course catalog that lists CS 201 FORTRAN PROGRAMMING. It has always been about getting the scientific result -- it just so happens that Fortran has been pretty good at it and actually pretty simple -- and so it has often times been used.
That being said, C almost killed Fortran (77) because they waited so damned long to bring out Fortran 90. People were sick and tired of waiting for dynamic memory allocation and free form input, and so many people who were past entry-level programming started jumping to C (...which I would never relish to teach to a beginner who has no solid interest in computer programming).
Several years ago (in 1992) I was involved in the rewrite of a large fortran program, which had been around for decades and had become unmaintainable and slow. The physicists who had been working with the program were absolutely convinced that no software could replace fortran for speed. I did not believe that. Operating systems at the time were written in C, and the only thing that beats C for speed is assembly. I ran several tests with bare scientific calculations, using fortran, C, and C++, and fortran came out (surprise surprise!) the slowest. Then the physicists rewrote the fortran application in C++, with some help from me on the object-oriented design. Not only was the replacement program faster, more maintainable, and easier to read, but we also found out that the main problem of the original program (written in times when RAM was scarce, and never updated) was that it used files instead of in-memory arrays, a more modern data structure, or a database, which would be the first choice nowadays. The application was slow because it was designed with constraints that were no longer realistic. In the end, the replacement program was slightly faster in the number crunching, but immensely faster in the data handling. While the original program required one day or more of very expensive supercomputer time to complete its work, the new program ran in just a few hours. The sad truth is that, even now, many fortran programs are fossils of an era where hardware constraints dictated the choice of data structure and algorithm, and they were never reviewed.
"Capable of precision other languages lack"? There's nothing to support that claim in modern linguistics. If Latin has an upper hand, for example in medicine, it's because most, if not all of the terminology was developed in that language, but nothing prevents you from saying "medulla oblongata" in English with zero loss in meaning. If doctors still use Latin today, it's because writing the remaining 20% of non-teminology in a medical text in the same language that the terminology is written in makes it international, not because it magically gets some "extra precision". (Do purely Latin longer texts get still published in medical journals anyway?)
Ezekiel 23:20
Also "legacy training". Student learns from prof. Student becomes prof. Cycle repeats.
Not really - even when I was a student we ditched F77 whenever we possibly could and used C or C++. The issue is more legacy programmers. Often the person in charge of a project is an older person who knows FORTRAN and does not want to spend the time to learn a new language like C (or even C++!). Hence they fall back into something more comfortable.
However by now even this is not the case. The software in particle physics is almost exclusively C++ and/or Python. The only things that I am aware of which are still FORTRAN are some Monte-Carlo event generators which are written by theorists. My guess is that as experimentalists even older colleagues have to learn C++ and Python to use and program modern hardware. Theorists can get by using any language they want and so are slower to change. Certainly it has probably been at least 15 years since I wrote any FORTRAN myself and even then what I wrote was the code needed to test the F77 interface to a rapid C I/O framework for events which was ~1-200 times faster than the F77 code it replaced.
Actually, you can get one for mere $35! ;) But no, the surface syntax is definitely something I didn't have in mind.
Ezekiel 23:20
My previous supervisor decided to fork our Fortran code for performing quantum mechanical calculations. We'd worked on it for more than half a decade and it was world-class.
He handed it over to a computer science graduate (i.e. a non-physicist) who really liked all the modern trends in CS. Now, five years later:
1. the tarball is an order of magnitude larger
2. the input files are now all impenetrable
3. the code requires access to the outside (not possible on many superclusters)
4. he re-indented everything for no apparent reason
5. the variable names were changed, made into combined types and are much longer
6. as a result, the code is basically unreadable and nearly impossible to compare to the original formulae
7. code is duplicated all over the place
8. it now depends on unnecessary libraries (like the ones required to parse .xml), and it only compiles after a lot of work
9. it's about four times slower and crashes randomly
10. it generates wrong results in basic cases
To quote Linus Torvalds: "I've come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with."
... and I feel the same way about CS graduates and Fortran. They have no idea about the physics or maths involved (which is the difficult part), so they do the only thing they know, which is to 'modernize' everything, making it into an incomprehensible, ungodly mess.
Fortran, apart from being a brilliant language for numerical math, has the added benefit of keeping CS graduates at bay. I'd rather have a physicist who can't program, than a CS type who can.
(Apologies to any mathematically competent computer scientists out there)
The big thing Fortran has over C is proper support for multidimensional arrays, with powerful slicing operations built into the language. It was the inspiration for numpy arrays. My first languages were C++ and C, but when I do scientific programming, my languages of choice are now python and fortran (with f2py making it very easy to glue them together). Fortran is horrible at text processing, and has an almost absent standard library, but for scientific use, good arrays make up for that - especially when you can use python in the non-performance-critical parts.
C++ has some multidimensional array classes, but none of them are as convenient as fortran arrays. Especially when it comes to slicing. At least that's how it was the last time I checked.
None of that is good programming, is the thing.
I really wish people would stop blaming the tools when the problem is people who are tools. Maybe that's endemic to "CS types"? But those of us who code for a living in the real world recognize what you describe as a noob stunt, not a language problem.
The main reason stuff stays in fortran is the general best practice of not messing with working shipped code. If the code needs regular work, for goodness sake use a maintainable language. But lots of fortran code has been stable for decades, and only a madman would go changing it.
Socialism: a lie told by totalitarians and believed by fools.
Reproducibility is part of science. So is identifying and fixing errors. But perhaps the most important aspect of science is being able to continue it.
I've worked in science labs where non-software engineers write code. They fall victim to the same problem software engineers fall victim to when they work without version control: they lose it, they overwrite it, they make mistakes and want to go backwards, they end up with 50 copies and can't remember which one was used to compile their postdoc work. And when it comes time to publish, they (may) archive it and never look at it again - despite the fact that good science should necessitate they release the code, if for no other reason than to reproduce results and ensure they are error free.
Version control is a tool. When used properly, it solves many of the above problems, all of which sap productivity. In an academic setting, particularly where peer reviewed papers are being released about computationally intensive science, version control almost certainly saves more time than it costs. People just aren't willing to put in the initial investment it takes to learn to use the tool.
Fortran has been an "APL style language" since Fortran 95, with most of the APL operations present. That was done both for optimization and for convenience. And other APL-style languages are very popular as well, foremost MATLAB.
ALL CAPS has been optional since 1990, at least.
Fortran has had modularisation and structured code since 1990, and classes and object orientation since 2003. Please update your prejudices.
Anyone who believes exponential growth can go on forever in a finite world is either a madman or an economist
Latin is even more terse than English and the words can be placed where the fuck you want. It can be well ambiguous enough! But that's because I studied poetic Latin a bit in high school. Declensions on every word save it, but mean you can really abuse it. Then, being a dead language, no one knows how to say "yes", "no", "hello", "thanks", "how are you doing?" and such little things.
In the 17th/18th century French replaced it, with fewer declensions and more grammar. (and you had other such artificial national languages such as German, Italian and English)
English goes a bit too far, it's a pain to write a sentence with the word "could" which can both be read as being in the past tense or in present conditional, and no way to tell between the two (similar thing with "would"). And you can't even write "I will can".
Wow, faster AND more accurate. They must use some mystical floating-point instructions that only Fortran compiler writers know about.
On PPC implementations, head-tail floating point is typically used for "long double"; this leads to inaccuracies in calculations. 80 bit Intel floating point is also inaccurate. So are SSE "vector" instructions, since denormals, NaNs, INFs, and -0 are always suspect unless your compiler emits an extra instruction in order to trigger the "next instruction after" signalling of the condition, and for NaNs, you are still somewhat suspect there.
If it isn't IEEE-754 compliant, you pretty much can't trust it. FORTRAN goes way the heck out of its way, including issuing additional instructions and introducing pipeline stalls, in order to force IEEE-754 compliance.
Pretty much this accuracy only matters if you are doing Science(tm); if you are doing graphics, you are generally willing to eat the occasional FP induced artifact, because what you typically care about is the frame rate in your game, rather than being 100% accurate.
So, in closing, they're not using "some mystical floating-point instructions", they are just using accurate floating point, rather than approximate floating point.
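For anyone who hasn't met the special values mentioned above, here is a small sketch in C of how NaN and negative zero behave under IEEE-754 comparison rules. It assumes a platform whose `double` is a 64-bit IEEE-754 type and a compiler that isn't running in a relaxed mode such as `-ffast-math`. The helper names are invented for the example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Assumption of this sketch: double is the 64-bit IEEE-754 format. */
static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

/* NaN is the only value that compares unequal to itself. */
bool is_nan_by_compare(double x)
{
    return x != x;
}

/* -0.0 compares equal to 0.0, so the sign bit has to be checked
 * separately to tell the two zeros apart. */
bool is_negative_zero(double x)
{
    return x == 0.0 && signbit(x);
}
```

This is why code that tests results with a plain `==` can silently miss both cases: a NaN never matches anything, and the two zeros always match each other.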
No. The main reason we program in fortran is that the libraries are known, have known error bars and known behavior, and are "provable". We *DO* reprogram every time we come up against a new problem which needs to be translated. Chances are there is no standard code for what you want to simulate for your own specific problem. There are some rare cases, like QM programs (Gaussian, Molpro etc...) or some engineering programs, but those are the exception, not the rule.
C. Sagan : A demon haunted world:
http://www.amazon.com/gp/product/0345409469/
visit randi.org
Pretty much all scientific computing benefits from vectorization. All that's needed to use it is that the code performs the same operation on multiple values of data (SIMD).
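A minimal sketch of the kind of loop meant here, written in C: every iteration applies the same operation to a different element and no iteration depends on another. The name `axpy` follows the usual BLAS naming, but this is a hand-written illustration, not the library routine. Whether a compiler actually emits SIMD instructions for it depends on the compiler, the flags, and the target.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for every element.
 * Same operation on each value, no dependence between iterations,
 * so the loop is a natural candidate for vectorization. */
void axpy(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```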
Fortunately, (1) more and more scientists (also non-CS) are using github repos for development, as most of our stuff is public anyway; and (2) it is becoming quite common to release a github page as part of a publication, which is also a selling point to the editors/reviewers.
Isn't the main performance benefit that Fortran has always claimed over C/C++ the fact that an array is guaranteed to only be used from one thread at a time, and thus you don't have to re-read from memory to registers each time you want to do something with the data in the array? A capability that was formally added to C in C99 (and pretty much universally informally added to C++) with the restrict keyword?
Correct me if I'm wrong here, as I'm not a Fortran programmer.
By a scallop's forelocks!
He handed it over to a computer science graduate (i.e. a non-physicist) who really liked all the modern trends in CS.
Why was a graduate fresh out of university put in charge of architecture decisions? You wouldn't put an apprentice in charge of a mechanical workshop and expect them to keep it tidy and efficient, this is no different.
It's my general experience that it takes 5-10 years of commercial experience before someone is capable of making wise architecture choices about small standalone apps, and 15+ before they'll have a hope in hell of doing anything non-destructive with a large legacy application.
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
Doctors don't use Latin. They use their native languages and a bunch of proper nouns that happen to be Latin (or Greek) words or phrases.
Latin (or Greek) used to be the language of academia because it's what the Greeks used and the Romans translated the Greeks into. And the church loved Aristotle and ironically, the church pretty much defined western academia.
Most scientists like to use tools that work and that they are proficient in.
FORTRAN falls under both categories.
---- Booth was a patriot ----
FORTRAN was -- for some still is-- the 'Perl' of scientific computing. Get it in and get it done... and it doesn't always compile down very tight, but always fast because for mainframe developers getting this language optimized for a new architecture was first priority.
At 15, the first real structured program I ever de-constructed completely, while teaching myself the language, was the FORTRAN IV source for Crowther and Woods' Colossal Cave Adventure, widely regarded as 'the' original interactive text adventure, a genre which would later go multi-user to become the MUD. Read about it here, or play it in Javascript.
Crowther's PDP-11 version was running on the 36-bit GE-600 mainframes of GEISCO (General Electric Information Services) Mark III Foreground timesharing system... this is in the golden age of timesharing and no one did it better than GE. It took HOURS at 300bps and two rolls of thermal paper to print out the source and data files, and I laid it out on the floor and traced the program mentally, keeping a notebook of what was stored in what variable... I had far more fun doing this than playing the game itself.
FORTRAN IV and Dartmouth BASIC (I'll toss in RPG II also) were the 'flat' GOTO-based languages, an era of explicit rather than implicit nesting -- a time in which high level functions were available to use or define but humans needed to plan and implement the actual structure in programs mentally by using conditional statements and numeric labels to JUMP over blocks of code. Sort of "assembly language with benefits".
When real conditional nesting and completely symbolic labeling appeared on the scene, with good string handling, it was a walk in the park.
<blink>down the rabbit hole</blink>
In the cutting edge world of biological research, scientists are still using the ancient language Latin to name, classify and describe species of organism. The thing is, why would they change?
Korma: Good
I would also hazard a guess that Fortran tends to be a tad easier to read than C... Especially for scientists...
The way scientists write code, it doesn't matter what language they use, it will still come out an undecipherable mess. You'll have a quintuply nested loop populating an array called tlnb1 in a function called abn that takes 8 arguments named t1 to t8. The only documentation will be a single comment just before the loop that says "This should work now".
If you're going to be working with scientific code you'll need at least a Master's degree in software archaeology and software anthropology.
~X~
Your problem began when you turned the code over to an inexperienced CS graduate. You don't need to be a good software engineer or even a good programmer to get a degree in computer science. I wish people would stop conflating the two, especially the people in the HR department. :P
You needed a SOFTWARE ENGINEER. Worse, you needed an experienced software engineer familiar with the domain. Instead you handed it over to a fresh graduate who maybe had one or two courses on engineering. What exactly did you expect to have happen?
And what about you? Did you give them a good set of requirements? Did you have frequent reviews? Or did you just drop it in his/her lap and say "make it better"? There's a lot of missing details on exactly how you participated in the process. Again, if you weren't actively giving feedback into the process then you shouldn't be surprised when it isn't what you wanted.
The failure in this case is yours. Next time you need work done, I suggest actually getting the right person for the job and not some cheap fresh grad who happens to have "computer" in their degree description.
~X~
Yes, FORTRAN sucks, but it is stable, fast and well understood. It runs on a number of supercomputer architectures. It is way easier to program in FORTRAN than in C for non-CS people. So what is the issue? Oh, maybe that this is not a "modern" language? Here is news for you: Java sucks a lot more than FORTRAN.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
No one knows how to spell it. In these threads I've seen Fortran, FORTRAN, ForTran, etc. Who the hell can keep track? That's why C is winning! One letter, upper case, C. No muss no fuss. Not to worry, PL\I had the same issue (Forward or backward slash? Better go look it up!) Same with LISP, LisP, Lisp, (or, these days, scheme, Racket, Clojure).- too many letters, too many ways to misspell them. D avoided falling into this trap, but which would you rather have on a paper? A D or a C? And C++ doesn't even look like a real grade? Python? Are we naming a language or a comedy troupe? Same with Ruby - I don't need a bunch of freaking geologists telling how to use a computer.
Jeez, everyone, C just got it right where it counted - its name. Now can we just all agree to use that and move on?
That is all.
> Fortran code ignores the very possibility that pointer content can overlap. Modern compilers do not.
Fortran's language specification doesn't allow pointers to overlap. Inhibiting programmer freedom in this way ironically gives the compiler greater freedom to perform optimizations.
In contrast, C & co. give the programmer this freedom, resulting in the compiler having to be more conservative.
Fortran is still used because it works. Because it is fast. Because libraries are optimized and well understood. Fortran is still used because, gasp, it has evolved since FORTRAN 66 and FORTRAN IV. Maybe you and the other language nannies always forcing the latest greatest buzz on the rest of us should take the time to actually read about some of the most recent versions?
Restrict keyword is not related to threading. C/C++ compilers have always assumed that data is not accessed from several threads without synchronization. It just wasn't standardized until the new memory model in C11 and C++11. So if you don't use mutexes, memory barriers etc, the compiler is allowed to assume a single thread of execution.
What restrict does is guarantee that two pointers do not point to the same area in memory (aliasing). Let's say a function takes two pointers (char* a, char* b). If you write to the data pointed to by a, then the compiler has to emit code to re-read the data pointed to by b, because a and b might refer to the same location. With restrict pointers the compiler doesn't have to do this.
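A small sketch of that in C99, using `double` rather than `char` pointers. The function names are made up for the illustration. The two bodies are identical; only the promise made to the compiler differs.

```c
#include <assert.h>
#include <stddef.h>

/* No restrict: scale might point into out[], so *scale has to be
 * re-read after every store to out[i]. */
void scale_may_alias(size_t n, double *out, const double *scale)
{
    assert(out != NULL && scale != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] *= *scale;
}

/* restrict: the caller promises out[] and *scale do not overlap,
 * so the compiler is free to load *scale once before the loop. */
void scale_no_alias(size_t n, double *restrict out,
                    const double *restrict scale)
{
    assert(out != NULL && scale != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] *= *scale;
}
```

The overlap case is not hypothetical: calling `scale_may_alias(3, w, &w[0])` with `w = {2, 3, 4}` gives `{4, 12, 16}`, because the scale factor itself changes after the first store. Making the same call to `scale_no_alias` would break the promise and is undefined behaviour.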
C/C++ have also always had the concept of strict aliasing, which basically says that pointers with different types may not be used to access the same memory location (char pointers are exception). It allows the same optimizations as the restrict keyword. However most compilers don't enforce the rule because programmers are stupid and use all kinds of noncompliant hacks with pointers.
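As an illustration of the kind of pointer hack meant here, and a compliant alternative: reading the bit pattern of a float. The cast version is shown only in a comment because its behaviour is undefined under the strict aliasing rule. The sketch assumes a 32-bit `float`, and the function name `float_bits` is invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Noncompliant hack (accesses a float object through a uint32_t lvalue):
 *     uint32_t bits = *(uint32_t *)&f;
 *
 * Compliant version: copy the bytes instead of reinterpreting the pointer.
 * Compilers commonly turn this memcpy into a plain register move. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a platform with IEEE-754 single precision, `float_bits(1.0f)` is `0x3F800000`.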
I was really pleased that InfoJunkie777 raised this issue and followed with great interest the ensuing discussion. I suspect that not everyone who contributed comments has actually had much first hand experience with FORTRAN or even Fortran and some therefore are thinking out of the box. That's not a bad thing of course, since new ideas frequently spring from thinking outside the box. However, I would like to offer a few comments from someone who has been firmly inside the box.
Take the issue of fast computing. When I started programming in FORTRAN IV back in 1967, processing speed was the least relevant issue because we were required to take part in a procedural loop from which, occasionally, there was no escape. This consisted of punching cards, submitting the cards to an operator who ran the program in batch mode, collecting a printout an hour, or even a day, later, correcting punching and programming errors, resubmitting, and going around the loop until, with luck, finally escaping with real results. Computing speed in this regime was effectively instantaneous.
About the same time some bright spark (who would hate to be named now) invented a version of FORTRAN called PORTRAN (Poor man's Fortran) which was used by kids in rural schools with no direct access to a computer. This involved programming by using a paper clip to remove chads in blank punched cards and posting them to a computer operator in a city. I know it sounds dreadful now, but this was a wonderful thing in its time, inspiring many young minds to go further in their education.
I notice that many of the contributors to the discussion remarked that Fortran is still considered useful for scientists and mathematical manipulations, but nobody seemed to understand quite why. I think the reason is simply that it remains fit for purpose. All the criticisms of Fortran are perfectly true (horrible at text processing, dreadful with graphics, inconvenient process of compiling, linking etc.), but for number crunching Fortran remains just fine. Right from the earliest days, FORTRAN IV made it really simple to do complex arithmetic (with i imaginary), matrix algebra, and advanced statistical analysis. In this respect access to the freely available IBM SSP made numerical analysis a breeze.
Less than 10 years after PORTRAN came the advent of widely available make-your-own microcomputers with the S100 bus, allowing one to simply plug in whatever card was relevant to your needs. This was an instant opportunity for someone bright to upstage FORTRAN forever. But it did not happen. Instead, FORTRAN-80 was born, which ran on the Z80 chip. The entire software library of mainframes became available in your own private room.
I don't find it in the least surprising that Fortran is still relevant today. It began life with a strong supporting library and has evolved and adapted to remain fit for purpose. Code portability is still an attractive feature. I am still programming 47 years later, and there is little need for me to write any new Fortran code because it is already all written and running perfectly. The only need is minor manipulation of new data for the same old programs.
Finally, it would be plainly absurd for me to think of using Fortran for other aspects of my academic work and research, such as interactive graphics, programming hardware, and writing simulation software. Fortran is not fit for purpose here, so I use Pascal and, reluctantly, C.
So, Fortran issues extra instructions and pipeline stalls for accuracy, yet manages to be faster.
That is amazing!
It's faster in areas not involving floating point, and in floating point on hardware that has a good floating point implementation.
It's easier to branch-predict fortran code, and the lack of pointer support makes the boolean algebra a lot simpler for the compiler. Given the calling conventions and limits, it's actually a lot easier to optimize fortran, and given that most matrix math involves linear loops, it's easier to autovectorize things like the Berkeley Physics package. E.g. if you are attempting to do Monte Carlo simulations of P-P and N-P collisions in a relativistically invariant reference frame while simulating pair production events constrained by the matrix solution to the intersection of multiple Feynman-Dyson diagrams, it will almost universally run faster if the code is in FORTRAN.