Programming As a Part of a Science Education?
An anonymous reader writes "I'm a fairly new physics professor at a well-ranked undergraduate university. When I arrived, I was surprised to discover there were no computer programming requirements for our majors. This has led to a series of fairly animated faculty curriculum conversations, driven by the question: to what extent should computer programming be a part of an undergraduate science education (in particular, physics)? This is a surprising line of questioning to me because in my career (dominated by research), I've never seriously even questioned the need. If you are a physics major, you learn to program. The exact language isn't so important as is flow control, file handling, basic methods/technique, basic resource management, and troubleshooting. The methods learned in any language can then be ported over to just about any numerical or scientific computational problem.
Read on for the rest of the reader's questions and his experiences dealing with faculty who have their own ideas.
The reader continues, "I'm discovering the faculty are somewhat divided on the topic. There is even a bizarre camp that actually acknowledges the need for computer programming, but turns my 'any language' argument on its head to advocate the students do 'scientific programming' using Excel because it is 'easy,' ubiquitous, and students are familiar with it. They argue Excel is 'surprisingly powerful' with flow control and allows you to focus on the science rather than syntax. I must admit that when I hear such arguments I cannot have a rational discussion and my blood nearly boils. In principle, as a spreadsheet with simple flow control in combination with Visual Basic capabilities, Excel can do many things at the cartoon level we care about scientifically. But I'm not interested in giving students toys rather than tools. As a scientist raised on a heavy diet of open source software and computational physics, I'll hang my head in shame if our majors start proudly putting Excel down on their resumes. However, in the scientific spirit, perhaps I'm missing something. So I ask Slashdot, to what extent do you feel computer programming should be a part of an undergraduate science education? As a follow-up, if computing is important, what languages and software would best serve the student? If there are physics majors out there, what computing/programming requirements does your department have? My university is in the US, but how is this handled in other parts of the world?"
While a lot of computational physics requires speed, I find myself on a daily basis needing to write simple programs to collect, filter, transform and plot data. I have found no better language for writing these quickie tools in than Python.
Once they know Python, they can pick up C++ or Java as context requires. And if they never have to deal with really huge amounts of computation, then Python + SciPy might get them by for most everything. (And if not, Python has bindings for practically everything.)
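To make the "collect, filter, transform" kind of quickie tool concrete, here is a minimal sketch using only the Python standard library. The data, column names, and the -999 sentinel are all made up for illustration; plotting is left out, though a package such as Matplotlib could take the final list directly.

```python
import csv
import io
import statistics

# Hypothetical lab data: time (s) and voltage (V); -999 marks a dropped reading.
RAW = """time_s,voltage_v
0.0,1.02
0.1,0.97
0.2,-999
0.3,1.05
0.4,0.99
"""

def load_readings(text):
    """Collect: parse CSV text into a list of (time, voltage) floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["time_s"]), float(row["voltage_v"])) for row in reader]

def drop_bad(readings, sentinel=-999.0):
    """Filter: discard rows flagged with the sentinel value."""
    return [(t, v) for t, v in readings if v != sentinel]

def to_millivolts(readings):
    """Transform: convert volts to millivolts."""
    return [(t, v * 1000.0) for t, v in readings]

if __name__ == "__main__":
    data = to_millivolts(drop_bad(load_readings(RAW)))
    mean_mv = statistics.mean(v for _, v in data)
    print(f"{len(data)} good readings, mean = {mean_mv:.1f} mV")
```

Each step is a small function, so swapping the in-memory string for a real file or adding another filter does not disturb the rest.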
I was in the class taught by the authors of the textbook mentioned above. We had to write the code to model simulations of such things as magnetism and inertia/momentum. I enjoyed the labs that involved the programming, but I'm not sure what effect it had on other people who aren't as into computers.
I'm not suggesting that all physics students must learn C++ and Matlab, but they should be taught a grown-up computer language so that they at least understand the concept of C++ objects, or how to begin solving the problem of communicating with a machine via a Matlab environment.
My examples are very specific, but you get the idea. Physicists need to be aware of certain computer programming concepts (which cannot be gleaned from experience with spreadsheets) otherwise they will fall flat on their face when faced with a real research environment.
The specific language is not very important, but physics tends to be dominated by C/C++ and Fortran, so these would definitely be a good place to start.
Even BASIC is better than Excel.....
All of the libraries and programs of interest are in C and FORTRAN. C++ is interesting and used but the other two still dominate. If you had to choose between the two for teaching people to program, take C. For utility, the two are about equal.
http://aps.arxiv.org/abs/0803.1838
Yes, Excel is surprisingly powerful. It's also surprisingly cumbersome; you have to fight with it too much. Every time I give in to the "just do it in Excel" impulse, I waste many hours before I get disgusted, throw away my efforts, and start over in a real programming language. On a recent project, I needed some data reduction, smoothing, and publication-quality graphs. Perfect for Excel, right? Wrong! Couldn't get decent-looking graphs out of it. In the end, it was easier to download a graphing package for Python (MatPlotLib), figure out how to use it, recode everything -- and get exactly the output I wanted. I've been programming since 1966. I've used Fortran, PL/I, many flavors of assembly code, Pascal, Lisp, APL, C, C++, VB, Java, Python, Maple, etc. I can probably make Excel do whatever I want. Hell, I can probably make a Turing Machine do what I want. But it's not worth the hassle.
There are no strictly mandatory courses that teach programming. Even so, taking a lecture course on a programming language is recommended, as is taking one on numerical computation, which touches on numerical accuracy issues as well as basic numerical algorithms (interpolation, equation solving, splines...). Most students follow that recommendation and end up taking a course in C/C++ for the most part. There is also a lecture course on computational physics, which introduces a wide set of useful tools for those who feel inclined to take that road. This course is purely optional and is usually attended by only a couple of students.
I am not aware of any diploma thesis that was completed lately that did not include some basic scripting or programming by the author. Most topics require the students to do either considerable numerical computations (theoretical topics, mostly) or some heavy data processing, often in the form of image processing (experimental topics). Most students seem to be prepared well enough for these tasks by the time they start with the thesis work. I have never heard of any problems that arose because a topic required more elaborate programming skills than the student had or was able to develop during the thesis work (which lasts for a whole year).
Bottom line: the teachers here are quite aware that basic programming skills are necessary and although they do not force the students to acquire them, they still end up learning them.
You're right, it's mostly the program flow/control issues that are needed in programming. Most interesting problems in physics require computers to solve, and so a good understanding of LOGIC is needed; once you know how to iterate with loops and use if-thens, it's somewhat trivial to do so in any language you see fit. After that, you can focus on numerical routines for integration, differentiation, etc. Once they understand the basics, helping them learn to use libraries (GSL? keep with the open source!) to tackle more difficult problems may be the best approach for undergrads.
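As a small illustration of how far one loop and one if-test go, here is a sketch of a numerical integration routine using the composite trapezoid rule. The function name and the default of 1000 panels are illustrative choices, not anything from a particular course or library.

```python
import math

def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f from a to b with n equal-width trapezoids."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    # The two endpoints carry half weight; every interior point carries full weight.
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

if __name__ == "__main__":
    # The exact integral of sin(x) over [0, pi] is 2.
    print(trapezoid(math.sin, 0.0, math.pi))
```

The same loop carries over almost line for line to C or Fortran, which is the point about logic mattering more than language.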
My university required an introductory software design course, which was one of those more theoretical things that covered the logic and design methods, and not so much a specific language. While not required, it was strongly suggested you then took introductory computational physics, which focused more on solving differential equations and setting up general problems, as well as little things to watch out for (preventing divide by zero, etc). Again, no specific language, and the professor even allowed you to use whatever language you wanted. FORTRAN (90/95) was the default, but if you had more C experience, that was ok too, for instance.
It was also required we get some UNIX/Linux experience (apparently most of the supercomputers they used ran Linux and so they felt it was nice for us to have some familiarity with alternate OSes), so we had a computer lab with some Red Hat machines we were required to compile code on and make scripts for. In fact, it was funny, we had to scp our programming code (= homework) to the prof's computer for review, so that was another skill we learned.
Those experiences and skills really come in handy a lot for me, and I'm glad we had that requirement. The numerical methods have come in handy for certain projects in my graduate courses. Somewhat off-topic, but that class was also my first experience with Linux, and after using it and seeing it wasn't hard to use, I got the itch to try it at home when Windows started flaking out. Also, colleagues of mine, when they got jobs at various research centers, were given laptops and said "Here, use ssh to get data", and so had we not had some experience, it may have been harder to start off and get used to.
We recently had a talk here with faculty at my university about better integrating this type of stuff into the curriculum, even starting with the very first mechanics course. There are interesting equations of motion for somewhat simple problems that are tough to solve by hand, so instead of simply finding Lagrangians and calling it quits, why not ask students to use numerical routines (integrate, invert a few matrices, etc.) to solve those equations and do some sort of animation of the motion? It was suggested we use Mathematica for this reason; apparently the new version 6.0 has nifty easy-to-use animation functions. I haven't used it yet, but if that's the case, nothing is better for developing some intuition and understanding than seeing things happen in real time.
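A sketch of what such an exercise might look like, written in Python rather than Mathematica: the simple pendulum, whose Lagrangian gives theta'' = -(g/l) sin(theta), has no elementary closed-form solution at large amplitude but is easy to integrate numerically. The velocity Verlet scheme, step size, and parameter values below are illustrative assumptions; the returned lists could be fed to any plotting or animation tool.

```python
import math

def simulate_pendulum(theta0, omega0=0.0, g=9.81, length=1.0,
                      dt=1e-3, steps=10000):
    """Integrate theta'' = -(g/length) * sin(theta) with velocity Verlet.

    Returns lists of times (s), angles (rad) and angular velocities (rad/s).
    """
    def accel(theta):
        return -(g / length) * math.sin(theta)

    t, theta, omega = 0.0, theta0, omega0
    a = accel(theta)
    times, thetas, omegas = [t], [theta], [omega]
    for _ in range(steps):
        theta += omega * dt + 0.5 * a * dt * dt   # position update
        a_new = accel(theta)                      # acceleration at new position
        omega += 0.5 * (a + a_new) * dt           # velocity update, averaged accel
        a = a_new
        t += dt
        times.append(t)
        thetas.append(theta)
        omegas.append(omega)
    return times, thetas, omegas

def energy_per_mass(theta, omega, g=9.81, length=1.0):
    """Kinetic plus potential energy per unit mass; conserved by the exact motion."""
    return 0.5 * (length * omega) ** 2 - g * length * math.cos(theta)

if __name__ == "__main__":
    times, thetas, omegas = simulate_pendulum(theta0=2.0)
    e_start = energy_per_mass(thetas[0], omegas[0])
    e_end = energy_per_mass(thetas[-1], omegas[-1])
    print(f"energy drift over {times[-1]:.1f} s: {e_end - e_start:.2e}")
```

Checking that the energy stays nearly constant is a cheap sanity test students can run before trusting the trajectory.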
Besides Mathematica, other faculty use MATLAB and FORTRAN and C/C++ for their actual research, so if you had to choose some programming environment, those seem the most common to me, and thus probably the best choices. Even if Excel isn't a bad choice in principle, there are so many other environments that are more powerful that I can't really see a good reason to use it.
I love all of these "reasons" for not using Excel that seem to boil down to "When I learned to program, we didn't have ones and zeros, just zeros, and we were glad to have 'em." I use Excel occasionally for fast, free-form calculations and even exploration, but never for real research. The reason is simple: it gives incorrect answers. Check out the links here for details: http://scholar.google.com/scholar?hl=en&lr=&q=On+the+accuracy+of+statistical+procedures+in+Microsoft+Excel+97&btnG=Search This one is easy to read and gives a short, detailed list of some of the problems: http://www.amstat.org/sections%5CSRMS%5CProceedings%5Cy2001%5CProceed%5C00470.pdf
I'm surprised that nobody has spoken up yet to recommend Ruby as a scientific programming language. While it does not have built-in higher math packages (neither, for that matter, do C, or C++, or Perl, or Python, or Java), it has a very friendly syntax, and it is a proven, solid, and stable language that has been undergoing continuous user-supported improvement. It is fully object oriented down to its roots, and it is easier to use than most all of those languages already mentioned. It is a dynamically-typed language but it is nevertheless strongly typed with robust type checking. Ruby is completely open source.
As for math packages, Ruby has a very simple interface for including existing C libraries, which removes the objection that "there is not enough written for it yet". There are enough math libraries written in C that any physicist should have no trouble finding what is needed, whatever the specialty.
I am a software engineer and I have been using Ruby professionally as my primary language for over 2 years now, and it would be my language of choice for almost any high-level programming. While specialty tools can certainly beat it at certain tasks (it is no MatLab or Mathematica), for general-purpose or custom programming it is a wonderful tool. And as mentioned, the ability to EASILY include just about any C-language library makes it more versatile than any of the others of which I am aware.
The "downside" of Ruby is that it is relatively new (only about 12 years old or so now), and did not become popular until very recently. So it is distrusted in some circles because (to them) it is a brand-new thing, and also as a "newcomer" some of its features are vilified by those who have not bothered to actually learn how they work... just like every other language when it was a new experience.
The chair of the physics department here at Augusta State University is definitely all for us physics majors graduating with plenty of MATLAB experience. Several months ago, I overheard him commenting on how disappointed he was with the lack of programming skills among the majority of the majors. So now, MATLAB is a staple of the computational physics and electronics II courses. Prior to this change, I myself had very little experience with programming. But I got on to a research project this past March and had to learn MATLAB very quickly. Now, two months later, I'm working every day with MATLAB on an Ising Model program. Needless to say, from an undergraduate standpoint, it seems that the need to code and implement algorithms will be of vital importance to future physicists.
The best single document would be the classic Goldberg paper, "What Every Computer Scientist Should Know About Floating-Point Arithmetic" http://docs.sun.com/source/806-3568/ncg_goldberg.html (originally published as an ACM paper; kindly corrected and republished by the Sun Floating Point Group; Goldberg worked at Xerox). It should be required reading.
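Two short Python experiments show the kind of surprise that paper prepares you for. The first is the familiar 0.1 + 0.2 comparison. The second compares a one-pass "calculator" formula for the sample variance with a two-pass formula on data with a large offset; the data set and function names are invented for this sketch, and it is not a claim about how any particular spreadsheet computes its statistics.

```python
def variance_one_pass(xs):
    """Sample variance via (sum(x^2) - n*mean^2) / (n - 1).

    Algebraically correct, but it subtracts two huge, nearly equal numbers.
    """
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum(x * x for x in xs)
    return (sum_sq - n * mean * mean) / (n - 1)

def variance_two_pass(xs):
    """Sample variance via the sum of squared deviations from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

if __name__ == "__main__":
    # Decimal fractions are not exactly representable in binary floating point.
    print(0.1 + 0.2 == 0.3)  # False

    # Deviations from the mean are -6, -3, 3, 6, so the true variance is 30.
    data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
    print("two-pass:", variance_two_pass(data))  # 30.0
    print("one-pass:", variance_one_pass(data))  # far from 30 in double precision
```

The squares are near 1e18, where adjacent doubles are more than a hundred apart, so the one-pass subtraction has no correct digits left to report.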
Beyond understanding the differences between conventional mathematical arithmetic and what computers actually do, the student really should have a formal introduction to data structures.
Lastly, and why I hung the comment on this one, Numerical Recipes is well known to not be a good numerical choice. Making it the foundation of a class would be a real crime against computing.
Abramowitz and Stegun are the Table People! The Numerical Recipes people are Press et al.
Physicists here at Cambridge, UK use Excel in first year for a mathematical methods exercise, and then in second year we use Excel again for another mathematical methods exercise and C++ (formerly Fortran) for a programming in physics course. I personally think Excel IS useful for seeing the flow of iterated algorithms, numerical integration and so on - but learning a proper language is important too if you want people to mature into programming at all. Plus programming seems so important in research environments that it'd be negligent not to. In addition, as part of the first year natural science course you can do half the first year computer scientist's courses, including ML and Java - so if you're interested you do learn "real" languages.
There's an OSS Matlab clone called GNU Octave, see http://www.octave.org/. It's mostly compatible with Matlab. I've been using it for handling larger datasets or maths-heavy stuff. Works fine. --Bud
If you haven't already, check out Sage: http://www.sagemath.org/ Here's a review of sorts from a heavy Matlab user: http://vnoel.wordpress.com/2008/05/03/bye-matlab-hello-python-thanks-sage/
To the troubled new faculty member at a well-guarded institution: you are not alone. Try reading the article below. In fact the entire issue is devoted to the theme of computation in undergraduate physics. R. G. Fuller, 'Numerical computations in U.S. undergraduate physics courses,' Computing in Science and Engineering 8 (5), 16-21 (2006).
But Octave supports little more than the basic syntax of Matlab and omits most of its basic functions and all of its GUI (e.g. no graphing).
As a better alternative to Matlab, I'd suggest SciLAB (http://www.scilab.org/). Not only does it support a much larger fraction of Matlab than Octave (including a lot of functions from the toolboxes), it's actively under development and actually has a GUI.
In fact, SciLAB might be exactly what the Original Poster wants -- a flexible matrix-based programming language with some visualization primitives, that's 98% compatible with a standard programming language to boot.
Randy
We've been going through a process of evaluating this very question as we start to set up a general access course for "Programming for Science" as part of our graduate student access courses. This would be in addition to our Java and C++ specific courses, along with XML/XSL and SPSS/Matlab (we've also just started up a LaTeX course which is heavily oversubscribed - validating a 3 year long campaign!).
So far we've come up with the following generally applicable core:
Software Engineering (software lifecycle, reqs/specs, quite high level and generic)
Web programming (PHP / MySQL)
Procedural Programming (Perl - possibly some python - VPython is very useful)
Maths / Stats (Matlab / SPSS)
Declarative Programming (XSL)
This covers the most commonly requested ground and the specs have had good feedback from Heads of Schools and Student reps. It'll probably be a 20 class course - still finessing details
I have also had programming (not math methods) classes in Java and IDL from a physics department. While I enjoyed them, most of the students who did not have prior programming experience found those classes difficult and uninteresting. Most physics students aren't interested in learning the details of why or how a programming language works, simply what they can do with it. The students who had taken a math methods/programming course found "regular" programming courses much more useful.
I've also had (and taught) classes with LabView. While theorists find LabView totally useless, it is by far the most common programming tool used in experimental labs. You can learn structure and flow with LabView, but it's not a very useful learning "language". However, it can (and should!) be taught in advanced lab classes to make things like temperature controllers, timed electronic measurements, instrument control, etc. The people who say experimentalists don't need to program are dead wrong. If you can't rig up a simple temperature controller, basic e-beam writing system or digital oscilloscope in software, you're going to be wasting money on hardware you didn't need to buy.
I've had to teach lab classes where the students were forced to present the data in Excel, and that was bad enough. Good luck finding a graduate student in a physics department who's willing to teach VBA and Excel well enough to do anything useful. Use of Excel as a programming platform is not as common as your peers think. I would try very hard to get them to move to something like Origin or Igor, which are much more powerful, produce better graphs and are actually ubiquitous.
There is a fairly well organized group of physicists in the states that is promoting computational methods as an integral part of a physics undergraduate curriculum. There will be a working session at the Winter AAPT meeting in Chicago next February. Also take a look at the special issue "Computation in Physics Courses" of Computing in Science and Engineering (Sept/Oct 2006) to see what some other programs are doing (disclaimer: I wrote one of the articles). Also, the American Journal of Physics recently (April/May 2008) released a double issue in "Computation and Computer Based Instruction" with lots of information in it (especially the resource letter by Landau). As it happens, Landau, Jan Tobochnik (editor of AJP), and Norm Chonacky (editor of CiSE) are all part of the AAPT working group.
My school requires a course in computer methods as a part of the physics major, and I have been afforded a very recent opportunity to observe traditional-aged fellow students while taking this course.
Such a course is essential. While students at our school are introduced to Mathematica as part of the calculus sequence, this exposure is cursory and is only interactive; it does not require actual programming.
My computer methods course was a first presentation for its lecturer, and arguably was a bit too ambitious in breadth. We had programming assignments in Mathematica (both procedural and rule-based), and interactive and programmed use of GNU Octave (Matlab clone). We also were required to write a program in C/C++.
Throughout the class there was an emphasis upon learning and writing documents in LaTeX, including a considerable amount of equation-writing. We also used the old Bell Labs Graphviz "dot" program, both as an arcane and slightly bizarre introduction to programming per se, and to generate graph figures for inclusion into the reports written in LaTeX.
Adding to the confusion from breadth is the fact that one professor, for his optics class, prefers and supplies his examples in Mathcad.
I believe that it's fair to say that most students found this class to be a nearly excruciating experience, which is an indication that it had worthwhile and challenging content. The breadth and pace of presentation needed to be reduced a bit.
Out of the 10 or so students in the class, only I had any previous programming experience. I found the work load challenging, more in the underlying problems than in the technical aspects of writing code. By contrast, at most one of the other students had previously used any text editor apart from Word, or was aware of TeX/LaTeX.
The course had some challenging conceptual content (e.g. generating attractor maps by iterative solution of cubic equations with starting points gridded across a region of the complex plane) as well as fairly extensive technical content.
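For readers who have not seen this kind of exercise, here is a rough sketch of one way it can be done. It assumes the iteration is Newton's method and the cubic is z^3 - 1 = 0, neither of which is stated above; the grid size, extent, and text rendering are likewise arbitrary choices for illustration.

```python
import cmath

# The three cube roots of unity: the solutions of z**3 - 1 = 0.
ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def newton_basin(z, max_iter=50, tol=1e-9):
    """Return the index of the root that Newton's method reaches from z, or -1."""
    for _ in range(max_iter):
        denom = 3 * z * z
        if denom == 0:
            return -1  # derivative vanishes, so the Newton step is undefined
        z = z - (z * z * z - 1) / denom
        for k, root in enumerate(ROOTS):
            if abs(z - root) < tol:
                return k
    return -1  # did not converge within max_iter

def basin_map(n=21, extent=2.0):
    """Classify an n x n grid of starting points over [-extent, extent]^2."""
    step = 2 * extent / (n - 1)
    return [[newton_basin(complex(-extent + i * step, extent - j * step))
             for i in range(n)]
            for j in range(n)]

if __name__ == "__main__":
    symbols = "012."  # index -1 picks the final '.', meaning no convergence
    for row in basin_map():
        print("".join(symbols[k] for k in row))
```

Raising `n` and colouring each cell by root index instead of printing characters gives the familiar fractal basin boundaries.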
I will admit a pathological allergy to Excel, with the exception that it is somewhat handy for graphing and doing simple regression fits. I am glad that this course did not spend time on Excel.
Again, this course needed to be recalibrated downwards a bit in breadth of topics. Numerical stability and precision effects were only briefly mentioned, but did benefit from a good example or two.
There is no way that a semester course could approach both the extensive technical topics, and actual numerical analysis. That needs to be a separate course.
Also missing from this course was any coverage of experiment automation (which usually means LabView). This is another important topic, but there simply is not time in a semester to cover it alongside everything else.
At least two of the students gained some enthusiasm for LaTeX as a result of this class, and have used it to write assignments for other classes.