Open Source Software For Experimental Physics?
jmizrahi writes "I've recently started working in experimental physics. Quite a few programs are used in the lab for assorted purposes — Labview, Igor, Inventor, Eagle, to name just a few. They are all proprietary. This seems to be standard practice, which surprised me. Does anybody know of any open source software intended for scientific research? Does anybody work in a lab that makes an effort to use open source software?"
Does anybody know of any open source software intended for scientific research? Does anybody work in a lab that makes an effort to use open source software?
We discussed R which describes itself as:
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.
R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering,
One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
While it's not geared specifically towards experimental physics, that's probably going to be your most fruitful endeavor.
Then there's the Matlab knock-off of Octave which describes itself as:
GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language.
Octave has extensive tools for solving common numerical linear algebra problems, finding the roots of nonlinear equations, integrating ordinary functions, manipulating polynomials, and integrating ordinary differential and differential-algebraic equations. It is easily extensible and customizable via user-defined functions written in Octave's own language, or using dynamically loaded modules written in C++, C, Fortran, or other languages.
I'm surprised you're surprised that you only find proprietary software in the highly specialized realm of "experimental physics." I mean, you have to be like a PhD in physics with a good deal of programming knowledge to make something accurate & useful (and there's probably gotta be like 50 failed projects before you get a good successful one).
... I have no idea what Labview, Igor, Inventor, or Eagle do. Ask yourself why these programs are standards and then maybe add to Octave's wish list or contribute to it even! Unfortunately, this isn't easy--I myself started to implement proper handling of sparse matrices in Octave but gave up as I was trying to form low level requirements ... You probably already know though that this is going to have to be done in C or another very low level, very quick language.
You're probably wondering why there's not a project of Firefox or OO.o quality for experimental physics but I'll tell you why: it's too specialized and your user base is ridiculously small. You're not going to find a company that is going to benefit greatly (or at all maybe) by releasing their product into the wild for a community to grow. There's probably not a community for it to grow in.
You should tell us what specifically you are looking for something to do
If you're looking for something specific or outline some high level
My work here is dung.
.. but the specialized libraries aren't as mature. http://www.scilab.org/
Labview has the advantage of being easy to learn for non-computer-savvy people that want to control and read special hardware (electrical monitoring equipment, servo motors, etc.). Unfortunately, it rather terrible for complex programs that do not naturally fit in LabVIEW's paradigm of deterministic data flow paths from user input to screen or file output. For example, storing things in persistent variables that are not visible to the end user are a horrible kludge. Reusing VIs (program modules) in other projects require you to endlessly draw wires on your screen rather than simply copy/paste something in an editor. Data processing that involves anything else than applying a standard library function (such as searching arrays for special conditions) that would have been 20 lines of straightforward C code will take you half an hour in LabVIEW, even if you use the "C formula node" that has no debugging facilities whatsoever. You will find yourself spending most of your time moving lines on your screen because there is no free space left on the flow diagram for that extra feature that you need to implement. Stay away from Labview if you can. Even Labview representatives will tell you that more experienced programmers tend to not like Labview so much.
I haven't used Igor myself, but I have watched over other people's shoulders. You can do quite a bit with it, but the syntax of the language is quite inconsistent; the central paradigm is that variables are either scalar or "wave", which is an array that implicitly represents a row of (x,y) values with fixed steps in x. Wave variables always have global scope, which makes functions that deal with wave variables unpleasant, especially if there is recursion.
Myself, I have been a happy Gnuplot (FOSS) user since 1996. Gnuplot is quite limited in terms of data analysis and dealing with higher-dimensional datasets, and also has a very inconsistent syntax due to historical reasons, but the latter doesn't bother since I have used it as it evolved. For serious (CPU-intensive) data analysis, I have always written C/C++ programs (with gcc, of course), using Gnuplot only for plotting.
I have played a bit with R (FOSS) for data analysis and visualisation, which I liked, but which I have never used apart from making a few drawings. Octave can be useful if you need to collaborate with Matlab people.
For hardware control, I also prefer C/C++. Unfortunately you will likely have to do that under Windows unless you want to write your own lowlevel drivers for Linux (I tried once and gave up when I had to read a 200 page document describing the PCI bridge controller to figure out how to do a DMA transfer).
(I'm out of the academic world now, working at a high-tech company where I have to work with Microsoft Office most of the time and god I do hate that...)
Avantslash: low-bandwidth mobile slashdot.
For simple tasks, even with big data sets, I've had good results with qtiplot http://soft.proindependent.com/qtiplot.html. For really complex stuff, there's root from CERN http://root.cern.ch/
Actually Fermi's Linux is their build of Scientific Linux, which is a distro they have in collaboration with CERN and others.
It's originally based off Redhat/Fedora IIRC. Or department uses it.
Many larger places in the world use EPICS (Experimental Physics and Industrial Control System). An experiment I am a (small) part of use EPICS for control.
It is an open source control systems frequently used for particle accelerator control and observatory telescope control. We use it slightly differently, but for what we need to do, it works very well. It is maintained primary by Advanced Photon Source at Argonne National Laboratory. You can read more at the following URL:
http://www.aps.anl.gov/epics/
In case you are wondering, no, they don't use EPICS for LHC. They use a commercial SCADA program called PVSS (for the most part anyway).
The programs he names:
labview, igor, matlab, are not simply data analysis tools but also have hardware control modules. You can run your data acquisition from Igor, and matlab and of course Lab view.
things like R or octave or scipy dont' have control and acquisition modules that I know of.
That said, it seems to me that since scipy is in python that any python DAQ system would fit the bill of a scientific software system.
So the real question is, what DAQ's are available for python.
and the answer is going to be, that daq softwares tend to get written to drive specific hardware systems. e.g. labview makes a whole suite of DAQ boards. there's oscilloscopes and so forth.
As a scientist you don't want to get bogged down building a custom daq. SO the real bottom line is what commercial DAQs are open in design.
The only system I know that might possibly be in the open is the OASIS daq's developed for flow cytometry. these are mass produced and were developed at the National Labs. But I don't know how it is lic.
of course, one can always use an oscilloscope or DAQ made to be controlled by GPIB or simmilar common language. In that case it's pretty easy to write your own as long as you can find a suitable open source GPIB output driver.
Some drink at the fountain of knowledge. Others just gargle.
Since you said experimental physics... :)
The Experimental Physics and Industrial Control System (EPICS) is a set of Open Source software tools, libraries and applications developed collaboratively and used worldwide to create distributed soft real-time control systems for scientific instruments such as a particle accelerators, telescopes and other large scientific experiments.
EPICS is often used with the free real-time operating system RTEMS to build custom control systems.
Users of EPICS+RTEMS include Stanford Linear Accelerator Center (SLAC), Argonne National Labs, Brookhaven National Labs, and Canadian Light Source.
Professional experimental physicist here.
There are two main types of software physicists have to deal with: hardware interface, and data analysis.
Hardware interface is often the the tougher one: slow controls, data aquistion, environmental monitoring - these all need to interface to hardware through various drivers. LabView is an obvious candidate for table-top experiments, since it is possible to set up working control and readout systems more or less out of the box. There is really no good open-source solution for this for the same reason that open-source drivers took a long time coming to Linux: the user base is just too small to write the code.
My own experience is that it's far better to write your own code, using whatever drivers you can scrounge - it's far more efficient at getting what YOU want done as quickly as possible once it's running. However, the time to write and debug this code is extensive. It's particularly bad since often students will write this code and then disappear, leaving you with badly-documented half-working code.
However, this is basically true of many LabView installations as well.
On the data analysis side, there are many good packages that serve as starting points. ROOT (http://root.cern.ch) is an excellent package for doing event-based data analysis in nuclear and high-energy physics, including efficient ntuple storage and histogramming. It's really a toolkit, not a program, so it allows you to do your own analysis by writing your own code.
I'm not familiar with other big packages, but I know that I frequently use raw C, C++, gnuplot, perl, and python to do little jobs.
There are other tasks as well. Blogging software can be good for logbooks. Wikis are good for in-house documentation.
It really depends on specifics. But basically it depends on where your project falls on the quick-and-dirty vs long-life vs high-performance judgements.
http://www.sagemath.org/. Sage rocks. It's python based, brings together many of the useful libraries already mentioned (numpy, scipy, matplotlib, etc.), and has a very active mailing list. Can't recommend it enough.
Does having a witty signature really indicate normality?
Python has mad fortran all the mor valuable. the combination of fortran and python is a match made in heaven.
fortran is a wonderfully good algorithmic language but it is terrible at everything else. python is great at everything else, but slow as a dog.
one of fortrans strengths when combined with python is oddly enough it's simplicity and lack of sophistication. unlike c++ It compiles so fast that you can feasibly have python write fortran on the fly optimized for the algorithm and compile it, all during run time.
Fortran is good for non-computer scientist in part because it is quite hard for a typo to be a logic error. (e.g. = instead of ==) as a result it's comparatively easy to debug compared to C.
but it can't do all the fun memory management tricks, introspection, or objects that advanced languages do.
But with f2py you can push all of that into python and then just have it call short simple fortran routines for speed.
Some drink at the fountain of knowledge. Others just gargle.