Is FORTRAN Still Kicking?
Algorithm wrangler queries: "I'm beginning to wonder if I should invest the time in learning FORTRAN. Although it is arcane, it seems to be the best tool when it comes to demanding optimization tasks and heavy computations. C/C++ does not cut it for me - it is simply too easy to make mistakes and I find myself spending half of my time hunting bugs unrelated to the problem at hand. Additionally, although tools like Matlab exist, they don't provide the power that would justify the huge price tag they carry. I find any script-based language (Matlab, Numeric Python, Scilab) to be inadequate as soon as it is necessary to use loops to describe a problem, and using such tools for recursive systems can be a real pain. As another data-point, the Netlib repository seems to be very FORTRAN oriented, and it is a true gold mine when it comes to free routines for solving almost any computing task. What bothers me though is that FORTRAN code is really ugly and the language lacks almost any modern day language feature (I know about Fortran 90 but it is not much nicer than F77, and no one seems to use it). Can it really be true that the best tool we have for heavy duty computing is a 25 year old language, or have you found anything better - free or non-free?"
FORTRAN is used in high performance scientific computing. The language allows for a high degree of parallelization.
Fortran 90 has plenty of structured programming features to make maintainable code. Equally, if not more important, is that Fortran code can be much better optimized than C/C++ code for numerics. IBM did a good job on Fortran, and it's still a major player today.
Care about electronic freedom? Consider donating to the EFF!
You can become a passable FORTRAN programmer in a couple of hours if you already know another language, such as C or Pascal. There are a couple of gotchas (predeclared variables & COMMON statements IMHO).
If you are going to touch any heavy simulation code (such as statistics, physics & biology) learn FORTRAN. It works very well for those problems. Yes, it is old, but that doesn't mean it's bad. It's not modern, but it works surprisingly well.
I find myself teaching FORTRAN to budding scientists, and they are able to write complex stuff very quickly because they don't trip all over the language (e.g. '==' vs '=' in C).
to your C code. You just have to know how FORTRAN arrays are held in memory and how long the FORTRAN types are. Then you need to know what standard FORTRAN libs you need to link, so that your numerical libs will work. I did that for lots of my numerical work and it worked fine. You have to test it of course and it takes a while to work out the kinks.
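The point about how FORTRAN arrays are held in memory can be made concrete with a small sketch. FORTRAN stores arrays column by column (first index varies fastest) and indexes from 1 by default, while C stores them row by row and indexes from 0. The helper names below are made up for illustration and belong to no library; Python is used only to keep the arithmetic visible.

```python
def fortran_offset(i, j, nrows):
    """Flat 0-based offset of element A(i, j) of a FORTRAN array A(nrows, ncols).

    Assumes the default lower bound of 1 and column-major storage.
    """
    return (i - 1) + (j - 1) * nrows


def c_offset(i, j, ncols):
    """Flat 0-based offset of element a[i][j] of a C array a[nrows][ncols]."""
    return i * ncols + j


# The 2x3 matrix  [[11, 12, 13],
#                  [21, 22, 23]]
# as it sits in memory in each language.
fortran_memory = [11, 21, 12, 22, 13, 23]   # columns laid end to end
c_memory = [11, 12, 13, 21, 22, 23]         # rows laid end to end

# FORTRAN A(2, 3) and C a[1][2] name the same element.
assert fortran_memory[fortran_offset(2, 3, nrows=2)] == 23
assert c_memory[c_offset(1, 2, ncols=3)] == 23
```

In practice this means a C routine handed a FORTRAN matrix sees its transpose unless it indexes the data the FORTRAN way.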
***Quis custodiet ipsos custodes***
Common Lisp is a very high level language with a tremendous amount of expressiveness, and it is suited towards academia in that in general, functionality is not sacrificed for performance.
Check out http://www.lisp.org, and http://cmucl.cons.org/cmucl for a really good implementation (there are even Debian packages of it).
CL is not known for its parallelization abilities, but if you need a language that lets you describe mathematics, CL is useful.
Lisp is actually based around something called the Lambda Calculus, which is a way of expressing concepts by transforming data into other data using data which is expressed as a "function". Because of this, Lisp has a lot of abilities that other languages lack, such as extremely simple and powerful function composition, even at run-time. CL also has a massive core library with OO facilities, basic mathematical primitives, good FFT support in most implementations, windowing system support, and good commercial vendors like Franz. Check it out; it's almost as old as Fortran, but has evolved in a much more elegant manner.
Speaking of meteorological programming, ALL the major atmospheric models are written in FORTRAN. The ETA, AVN, NGM, MM5, WRF, and scores of lesser-known models...all of them written in FORTRAN (most of them FORTRAN-90 now, but some of the older ones are FORTRAN-77). The MM5 & WRF may be found here and here. The source code to several others is readily available as well if you're so inclined, for instance the ETA and the ARPS. Anyone wanting to run them may do so fairly easily on a PC running Linux (any new PC will be able to run a fairly hi-res model real-time); I do so myself.
Different languages have different strengths and weaknesses. I use Fortran, C, Ada95, and Ocaml interchangeably for different tasks, often linking the object files into a single executable.
Fortran, designed for mathematics and engineering, obviously excels at that job. You might want to consider writing the "intensive" parts of your application in Fortran and then linking it with modules written in another language such as C or Ada.
I've found that C is perfect for handling the I/O routines for such apps, but my Ada libs are ideal for memory management and for when the code outgrows the practical limitations imposed by Fortran. (Note: Interfaces.C and Interfaces.Fortran.)
Likewise Ocaml tends to fit around anything with a minimum of hassle.
Of course, this is just a subjective evaluation derived from my own experiences. However, I would encourage you to experiment to find the combination that works best for you. As we all should know, "There's more than one way to do it."
I'm sorry if this post seems somewhat vague, but it would be rather hypocritical of me to outright prescribe a certain language or tool when I personally have a tendency to float around and use whatever tool is most convenient.
NiCad
C's libraries may not be (and may never be due to compiler pointer aliasing issues), but C++'s are. One in particular is Blitz++.
Not to take away from Fortran. Language in general means far less to performance than an experienced programmer and good algorithms.
- I don't need to go outside, my CRT tan'll do me just fine.
Honestly, your objection to C++ is unclear to me...you say you spend more time fixing bugs than approaching the task at hand? Is this because you don't know the language that well? Perhaps because you're not taking advantage of the many excellent libraries available to you? Keep in mind that C++ library design requires a great deal of skill, but using a well-designed library is actually easier than coding in other languages.
C++ is my own personal choice for anything but the most demanding of high-performance computing applications. Is there an overhead to the language? Debatably, yes. Does it matter, in 99.9% of applications? No. And with only a little bit of forethought, even the "inherent" performance hits can be avoided in the places where it matters. It's just that you have to rely on a profiler to tell you where those places are...
There is a significant community of researchers and developers working on scientific and high-performance computing in C++. Check out some of these:
These are just a few good starting points. Do a google search for 'high performance c++' to find many more. Just, please, for the love of Deity, don't code in FORTRAN. ick....
Let's try not to let fact interfere with our speculation here, OK?
It's obvious that the story's poster didn't really look into FORTRAN much past the aging F77.
I currently use F77 to do research in magneto-hydrodynamics simulations of neutron stars on Cornell's Velocity Cluster (which has been featured on slashdot before). Fortran, due to its lack of things like pointers, etc, is ridiculously efficient, and almost completely cross platform (because, surprise surprise, it's very difficult to attempt to do anything remotely platform specific). The language is much simpler than something like C, with pointers, etc, that must be messed with. Sure it's ugly as hell, but once again the newer versions of Fortran take care of most of these issues.
I would suggest that anyone interested in high performance computing should check out High Performance Fortran. It's a set of extensions to the F90 language to allow the seamless integration of large-scale parallelization in your code. It also has several other performance advancements.
I strongly disagree with the poster of the story. Fortran 90 is much more modern than F77, including things like objects, safe pointers, better recursion, better array sharing, and generic routines (a type of function overloading). The language syntax is also much more lenient than F77 (which was designed to work with punchcards). It also has some really great array operations (things like slices, etc) that are ridiculously fast. While I absolutely hate F77, if I was going to write a computationally intensive simulation, I'd probably do so in F90 or HPF.
A lot of people still use Fortran, especially computational physicists and meteorologists... Many of these people don't have time to learn new programming languages, and Fortran works very well for what they need, better in most situations than almost any other language. It's something to consider.
Cheers
Justin
I work as a controls engineer at the NASA Ames Research Center. Most of the nonlinear aircraft simulations are still written in FORTRAN. FORTRAN provides very robust mathematical libraries while making it very easy to parse text files. In other words, FORTRAN is very good at taking a text document of flight data, and crunching it into a useful simulation. The main thing is that so many compilers and languages talk to FORTRAN. I do a lot of work in Matlab and C, and both can link to my FORTRAN code. I can pull up an old simulator from the early 90s, slap on an s-function or C-wrapper, and use the code in my new code. Of course, the question is: is new FORTRAN code being generated for reasons OTHER than to be compatible with the old code, or because it is the only language the crusty engineer knows? Well, it's a toss-up. Matlab seems to be making a lot of headway, especially since its code is very C-like and can link to old code. But, the gnu g77 compiler means I can distribute my FORTRAN work to anybody with a Unix box. Not everyone has put out the cash for Matlab. My recommendation is to learn enough FORTRAN to understand the math and logic loop functions. This will be enough to be able to read old code, and to be able to write math subroutines to be linked to more modern code. I still have to write in FORTRAN, but it's uncommon that I ever write a stand-alone FORTRAN program with an interface or anything. It is mostly text-file and math subroutines for Matlab or C.
Fortran is truly ugly. If you need to write fortran, at least do it in ratfor (Rational Fortran), described by Kernighan and Plauger in Software Tools.
Ratfor adds "normal" structured programming constructs to fortran to make it readable by somebody less than 40 years old.
You write code that looks like:
for(i=1;i<=100;i=i+1) {
fortran code here
}
Ratfor generates:
i=1
23002 if(.not.(i.le.100))goto 23004
fortran code here
23003 i=i+1
goto 23002
23004 continue
I don't know about Linux, but ratfor is included in the FreeBSD ports.
S.E.S.S.D.E.N.E.E.NW from west end of hall of mists
I'll get flamed for this but a nice choice, particularly if you want recursive computing, is Lisp. Lisp was not really designed for heavy computational use but it did find its niche in AI which is heavily mathematical in its algorithms. There is a lot of numerical library code floating around out there though you will find none at netlib. clmath is a nice math library that can be found here and there out there. Lisp is slow you say? BS. It is fast for development, fast for testing/debugging, and yes...it is fast to execute. Lisp can be both interpreted and compiled. Interpreted means fast testing. Compiled means fast execution. One implementation of lisp (cmucl) is actually faster than C or fortran in many instances. Personally, I use clisp but that is because it compiles on OS X out of the box for the most part. Well, that's my advice. It's nearly as old as fortran but still a great great language.
I hate to sound trollish, but Python seems to handle loops quite well, and I find them intuitively easier to implement than in C/C++. The next time the author uses Numeric Python maybe he should give for or while a try. Plus there are other modules such as sci.py and scientific python that offer other tools, and Python integrates well with R, gnuplot, GRASS, and other computational tools. And I find C++ to be pretty zippy speed-wise.
As a matter of fact, Craig Burley (original author of "g77") had quite a fight with RMS over optimizations that "gcc" did not provide but that are necessary for performance in a language where multi-dimensional arrays are first-class citizens.
There are still a couple of "compilers" that translate into C (Gnu "fort", which is basically both obsolete and dead from a development point of view, and NAG "f90", which is free and is OK if you are only doing development work that doesn't involve real number crunching).
Neither is used for serious computational work; having to live within the C aliasing rules doesn't permit the optimizations necessary for high performance computing problems.
"My opinions are my own, and I've got *lots* of them!"
There are some fundamental weaknesses in the C language that make it less-than-optimal for writing numerical codes.
1) C arrays are nothing more than pointers in drag. "Aliasing" of multiple pointers pointing into the same region of memory can cause optimizations to introduce bugs. Because all array accesses are done as if by pointer arithmetic, it is hard to deal with multi-dimensional arrays where more than one dimension can vary (think rectangular MxN matrices). There is a bias in the language toward manual pointer movement (*p++, etc.) to efficiently stride through arrays.
2) C always "wants" to compute with doubles. (E.g. the usual trig libraries all return doubles, and the default function call rules cast float arguments to doubles.) Serious number crunching code may want to use single precision floats to conserve memory and, more importantly these days, cache and memory bandwidth.
3) No built-in exponentiation operator. (Important so that the compiler can optimize small integer powers as combinations of multiplications.) No built-in, transparent complex number support. E.g. trig functions with complex arguments.
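To illustrate the third point: when the exponent is a small integer, a power can be computed with a handful of multiplications instead of a call to a general pow routine. The sketch below shows the square-and-multiply idea in Python. The function name is made up, and this is only a picture of the kind of rewrite a compiler might perform for something like x**3, not a description of what any particular Fortran compiler actually emits.

```python
def int_power(x, n):
    """Raise x to a non-negative integer power n using only multiplications."""
    if n < 0:
        raise ValueError("this sketch only handles n >= 0")
    result = 1
    base = x
    while n > 0:
        if n & 1:          # low bit set: fold the current power of x in
            result *= base
        n >>= 1
        if n:
            base *= base   # square for the next bit of the exponent
    return result


# x**10 costs a few multiplications instead of nine.
assert int_power(2.0, 10) == 1024.0
assert int_power(1.5, 3) == 3.375
```

Without a built-in operator the programmer either writes the multiplications by hand or pays for the general library call.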
Well, I can't speak for other agencies, but here at NASA Glenn Fortran is very much alive. A huge amount of the thermodynamic cycle/turbomachinery analysis that gets done around here is done using legacy Fortran code. Though they are no longer developing new codes in Fortran (at least in my office), it still lives. Rather than rewriting Fortran code, the effort (mine anyways) currently goes into writing generic GUIs for Windows to interface with those programs.
Yeah I don't think it's a troll as it seems to be coming from reliable sources. Apparently his family sent an email to the faculty at cs.utexas.edu which has been forwarded around. I would imagine it would show up in the news within a day. Here's a link to the email, on the perl5-porters list.
The poster says:
This isn't true, and I don't think it has ever been true. Below is a quote from a g77 page:
No commercial compiler I'm aware of does anything similar, either. Obviously, one should be wary of taking language advice from someone who is this ill-informed about compilers.
As for unrelated bugs, this can be an issue. If all one wants to do is a Fourier transform, or a singular-value decomposition, or something similar, on some data, it's clearly ridiculous to have to learn the C++ STL, or similar libraries in other languages, to just mess with some matrices. FORTRAN, for all its problems, Will Just Work as long as you're doing something simple.
On the other hand, if you're just doing some small stuff and you don't want to deal with more complicated languages, the best bet is probably to use Matlab/IDL/Maple/Mathematica and not worry about computer programming at all. Even if you're planning on doing big calculations at this point, prototyping your algorithm and methods in these interpreted special-purpose tools can be a very good way to get your code up and running.
Just curious, does all your knowledge about programming languages date from 1975, or just your prejudices about Lisp? Lisp has had arrays since about then, those arrays have the same O(1) access time as anyone else's arrays, and the performance of code using them is tuneable to FORTRAN speed or better. Whoever taught your "survey of programming languages" course did you a real disservice - maybe you should get them brought up on educational malpractice charges.
That said, FORTRAN can probably outrun Lisp on supercomputers, because of the effort put into parallel and vector optimizations on those platforms. I love Lisp, it's my preferred hacking environment, but I wouldn't propose it as the language of choice for big numerical applications unless there was a chance that hairy data structures might improve performance.
To a Lisp hacker, XML is S-expressions in drag.
Try O'Caml (caml.inria.fr); it's a modern language that's compiled very efficiently (independent benchmarks) and is suitable for heavy crunching. O'Caml has lots of features that you won't find in many languages, like algebraic data types, higher order functions, etc., but is intended for real general purpose programming. Most importantly, it's type-safe (statically) so you probably won't spend as much time tracking down bugs unrelated to the problem at hand. (That has certainly been my experience with SML, a language from the same family.)
I'm not going to wade in on a lame language war, but Fortran IS very portable. I have worked on code that was written in 1967 for a CDC mainframe. It was then ported to a:
PDP-11, then a
Vax, then a
486-class PC. The code ran much faster on the PC than the Vax.
Then I discovered that I needed a routine from the original CDC implementation, which had not been touched since. So I typed in the routine FROM CDC PUNCH CARDS. Compiled perfectly.
Your issue with extra ints on unformatted writes of Fortran file io... I've worked on Fortran development on 2 platforms:
1) DEC/Compaq Alphas running OpenVMS with DEC compilers
2) Windows NT4/2K with MS Powerstation v4 and Compaq Visual Fortran v6 compilers.
The DEC compilers on OpenVMS did *not* do those extra ints on unformatted file io. My C code to read the output file worked with no extra steps, and could read data structures with few problems. The MS/Compaq compilers *did* write extra ints on the Windows platform. Drove me buggy when I was trying to port some software from VMS to Windows. (Don't ask why, I was ordered to do it.)
Incidentally, the MS Powerstation v4 compiler wrote a 16-bit int before and after, and the Compaq Visual Fortran v6 compiler wrote a 32-bit int before and after. That change also drove me nutty. This had some extra issues... an array declared as
integer(4) MYVAR(1000)
was *larger* than the 16-bit int could specify... so the compiler broke it up into 128-byte chunks. Yes, a 4000-byte array was written as a series of 31 128-byte chunks (each with its own leading and trailing 16-bit ints), followed by a 32-byte chunk with its own leading and trailing 16-bit ints. Making C code to read this mess hurt my brain. At least switching to the Compaq v6 compiler took that issue away.
I didn't look up the Fortran language spec to see which one was actually complying with the spec. Having seen all three methods, I decided none were correct.
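For anyone facing the same job, here is a minimal sketch of reading the simpler layout described above, where each record carries a 32-bit byte count before and after the data. It is in Python rather than C for brevity. The function names and the little-endian byte order are my assumptions, not something taken from the compilers in question, and the chunked 128-byte scheme is not handled.

```python
import io
import struct


def write_record(stream, payload, marker="<i"):
    """Write one record framed by a leading and a trailing byte count."""
    length = struct.pack(marker, len(payload))
    stream.write(length + payload + length)


def read_record(stream, marker="<i"):
    """Read one framed record and return its payload, or None at end of file."""
    size = struct.calcsize(marker)
    head = stream.read(size)
    if not head:
        return None
    if len(head) < size:
        raise ValueError("truncated leading marker")
    (length,) = struct.unpack(marker, head)
    payload = stream.read(length)
    if len(payload) < length:
        raise ValueError("truncated payload")
    tail = stream.read(size)
    if len(tail) < size or struct.unpack(marker, tail)[0] != length:
        raise ValueError("leading and trailing markers disagree")
    return payload


# Round trip: 1000 four-byte integers, like integer(4) MYVAR(1000).
values = list(range(1000))
buffer = io.BytesIO()
write_record(buffer, struct.pack("<1000i", *values))
buffer.seek(0)
record = read_record(buffer)
assert len(record) == 4000
assert list(struct.unpack("<1000i", record)) == values
```

Passing marker="<h" would switch the sketch to 16-bit counts; checking that the two counts agree is a cheap way to notice when the guess about the marker width is wrong.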
Incidentally, when doing unformatted writes of structures where one language is writing, and another language is reading the file... Make sure both compilers are using the same memory/data alignment rules. My Fortran compiler was doing align=byte, and my C compiler was doing align=word, and my structures with some logical*1 and integer*2 variables were messing up my read routines.
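The alignment mismatch is easy to reproduce. The sketch below uses Python's struct module to compare a byte-packed layout with the machine's native C alignment for one logical*1 followed by one integer*2. Mapping those Fortran types to the format codes "b" and "h" is my assumption for illustration.

```python
import struct

# One logical*1 followed by one integer*2, as in the structures described above.
packed = struct.calcsize("=bh")    # no padding: 1 + 2 bytes
aligned = struct.calcsize("@bh")   # this machine's native C alignment

print("packed size: ", packed)
print("aligned size:", aligned)

# The same two fields serialize to different byte strings in the two layouts
# whenever the native layout inserts padding.
packed_bytes = struct.pack("=bh", 1, 2)
aligned_bytes = struct.pack("@bh", 1, 2)
assert len(packed_bytes) == 3
assert len(aligned_bytes) == aligned
```

On common platforms the native layout usually comes out one byte larger because a pad byte is inserted before the 2-byte field, which is exactly the kind of shift that garbles a reader written against the other layout.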
Ahh... the dangers and joys of multi-language development projects.
This is my sig. There are many like it but this one is... Oops. Frank, I've got your sig again! Where's mine?
Fortran does not force you to write spaghetti code, any more than c forces you to generate buffer overflows or perl forces you to write unreadable code.
Design and structure your application.
If you are used to objects and methods, just use subroutine modules and entry points to the same effect.
Fortran was where I learned to use multiple entry points into one sequential file for recursive processing.
I agree. Especially for research, Matlab is a very powerful tool (in particular with the special purpose toolboxes; Simulink is great!). You will have to think in matrices and vectors to get the speed. I am regularly called by colleagues to fix speed problems with Matlab applications. Usually it is a for loop that could be vectorised (mostly by the use of boolean arrays).
Why is Matlab so fast? Well, it's nothing more than an easy frontend for the Fortran libraries. That's where the speed is: FORTRAN.
Cu, Hans
UT-Austin has an obituary.
I do use Fortran (F95). It runs beautifully both on my Linux desktop and on a remote Alpha cluster on Tru64. It also runs fine on the Solaris and SGI workstations. The beauty of F95 is not just that it is quite portable, but that it is _easy_. You can use loads of time-saving constructs which are a pain to deal with in C (I also do C for some other things, mind you). If you use Windows, you get some pretty good debuggers (the ones for Linux are quite ugly, and since gcc does not yet support F95, gdb does not particularly care for it).
The thing is that if you try F9x, you'll find writing the code easy. Learning the language is a no-brainer. You can then interface the numerical stuff with things such as SciPy to display it and so on.
Eventually it will. Look at www.scipy.org. They have a module called weave. With that module you can mix C-code into Python-code. The C-code is compiled when you run the program the first time and then you have a fast module.
There are also some modules which allow the linking of FORTRAN subroutines into Python code.
Sure, the authors of NR took on a big task (I wouldn't call it mammoth, because they only skim over things like PDEs), but they weren't up to it.
The authors are scientists, not specialists in numerical computing. The appearance of "complete" does not equal accurate or correct. Writing robust and accurate numerical codes is difficult work, and there are journals dedicated to the topic.
Even their code for special functions is pretty lousy, often just taken from Abramowitz & Stegun, which is a source from the 1950s!!!
I'll freely admit that Netlib is not uniformly good; often, you have to find the most up-to-date solution to your particular problem from among the 3 or 4 solutions you find there. Also "old" does not mean "incorrect," or "untested," although it often does mean "probably inferior to some later work."
Real production-quality matrix codes, for instance, are not easy-to-read like NR. They are total mazes of special cases and tests and branches, but all of those things were put in for very good reasons, and the stuff that survives in high-quality libraries has been thoroughly tested and peer-reviewed. Don't expect to read a few pages of chatty prose and a couple pages of Fortran and feel totally informed. Expect it to be a black box that you can use with confidence, but inside is basically incomprehensible without careful study.
NR is a danger because it is not as good as readers think, and because it causes readers to not look any further for better solutions to their problems.
I found ifc produced faster code, typically 2x my g77 code, all on fortran 77 code. Not all optimizations work reliably for me, and some core dump. Try turning some off. Most of the speed boost for me is prefetch optimization, in my case. I also have Lahey's fortran express, which at US$250 is the cheapest commercial fortran available for linux. I bought the one before they added prefetch. g77 is the slowest out of all the above. It is also slower than f2c+gcc in my case. But coding for f2c is a pita because it's stricter than g77.
There is basically no free lunch with regard to f95, well except ifc. There is a project called g95 but it's nowhere near done.
I never ever claimed using floating point was the best idea, merely that it wasn't an inherently bad idea.
But it is an inherently bad idea. You're better off using strings, for crying out loud. You don't need the sin, sqrt, 1E-100 stuff that floating point offers you. You do need exactness to the cent, no matter how large the numbers are - something floating point doesn't offer you.
Depending on the processor and library, there can be distinct advantages to using floating point, like overflow and underflow exceptions, support for infinities and not-a-number values, and so on.
Which are worthless. You don't want infinity or not-a-number - you want it to raise an exception where you screwed up, which integer types in better languages do. You don't want an underflow exception - you want it silently round to zero. And you can get overflow exceptions on integers just as easy as on floats.
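The "exactness to the cent" argument is easy to check for yourself. Below is a minimal sketch in Python comparing binary floating point dollars with integer cents. The helper name format_cents is made up for illustration.

```python
# Two charges, 10 cents and 20 cents, tracked two ways.
float_total = 0.10 + 0.20          # binary floating point dollars
cent_total = 10 + 20               # integer cents

assert float_total != 0.30         # rounding error has already crept in
assert cent_total == 30            # exact

# Python integers do not overflow, so large sums stay exact to the cent.
big_total = 10**30 + 1
assert big_total - 10**30 == 1


def format_cents(cents):
    """Render a non-negative integer number of cents as dollars.cents."""
    dollars, remainder = divmod(cents, 100)
    return f"{dollars}.{remainder:02d}"


assert format_cents(cent_total) == "0.30"
```

In a language with fixed-width integers the same approach works as long as overflow is checked, which is the point made above about overflow exceptions on integers.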
Lots of things to like about Python, but NumPy is one of the better ones, IMHO. There is a largish and growing community of numerical jocks coalescing around Python in scientific computation. NumPy makes Python into an "array language" (like Matlab, S-Plus/R, APL, etc. etc.) where the crunching is heavily optimized C code. Links to LAPACK et al., and Fortran wrappers exist in f2py and PyFort, for that old still-running-after-all-these-years code.
(Just another enthusiast, dabbling in the religious-war du jour. Flamage to /dev/null ;-)
Fortran is still alive and well in the high energy physics (HEP) community... though it is fading away slowly (not as slowly as some people would like though). Up until very recently, FORTRAN was *THE* language for data analysis, but it is slowly being replaced by C++ in newer experiments such as BaBar at SLAC, and C++ is also replacing FORTRAN for data analysis at a few older experiments such as H1 at DESY. The reason why FORTRAN is fading away so slowly is mainly CERNLIB, a FORTRAN library that contains many useful functions (random numbers, matrix manipulation, data fitting etc...). As most particle physicists "grew up" using CERNLIB, it will be a while yet before FORTRAN well and truly disappears (in the HEP community anyway). Also of note, CERNLIB has now been released under the GPL, so anyone can use it. Nice.
But I'm astonished at the range of misinformation in earlier replies in this thread. Most seem to think that Fortran means Fortran77, though just a few mention Fortran90. Most users of the language that I know have switched to Fortran95. It has just about everything C++ has, and more in a few ways: for example, in Fortran you can define your own operators, and overload them. In C++ you can only overload an existing operator symbol, which leaves you with a rather small choice. Suppose you want to implement a "like" operator for string matching along the lines of that in SQL: you can define .like. to do it in Fortran; what obscure symbol are you going to choose in C++?
Fortran95 isn't fully compatible with object-oriented programming, but for scientific applications that's often irrelevant.