Open Source Software For Experimental Physics?
jmizrahi writes "I've recently started working in experimental physics. Quite a few programs are used in the lab for assorted purposes — Labview, Igor, Inventor, Eagle, to name just a few. They are all proprietary. This seems to be standard practice, which surprised me. Does anybody know of any open source software intended for scientific research? Does anybody work in a lab that makes an effort to use open source software?"
We discussed R which describes itself as:
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.
One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
While it's not geared specifically towards experimental physics, that's probably going to be your most fruitful endeavor.
Then there's Octave, the Matlab knock-off, which describes itself as:
GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language.
Octave has extensive tools for solving common numerical linear algebra problems, finding the roots of nonlinear equations, integrating ordinary functions, manipulating polynomials, and integrating ordinary differential and differential-algebraic equations. It is easily extensible and customizable via user-defined functions written in Octave's own language, or using dynamically loaded modules written in C++, C, Fortran, or other languages.
I'm surprised you're surprised that you only find proprietary software in the highly specialized realm of "experimental physics." I mean, you have to be like a PhD in physics with a good deal of programming knowledge to make something accurate & useful (and there's probably gotta be like 50 failed projects before you get a good successful one).
... I have no idea what Labview, Igor, Inventor, or Eagle do. Ask yourself why these programs are standards and then maybe add to Octave's wish list or contribute to it even! Unfortunately, this isn't easy--I myself started to implement proper handling of sparse matrices in Octave but gave up as I was trying to form low level requirements ... You probably already know though that this is going to have to be done in C or another very low level, very quick language.
You're probably wondering why there's not a project of Firefox or OO.o quality for experimental physics but I'll tell you why: it's too specialized and your user base is ridiculously small. You're not going to find a company that is going to benefit greatly (or at all maybe) by releasing their product into the wild for a community to grow. There's probably not a community for it to grow in.
You should tell us what specifically you are looking for something to do
If you're looking for something specific or outline some high level
My work here is dung.
.. but the specialized libraries aren't as mature. http://www.scilab.org/
There are tons of open source software tools used in scientific research. One widely used language for building and extending these tools is Python.
Try a Google search for +Python +physics to get started, but also try +Python Labview, +Python Igor, and so on. You will find that lots of people have built Python tools to extend the proprietary ones, or to do preprocessing or postprocessing of data.
LabVIEW has the advantage of being easy to learn for non-computer-savvy people who want to control and read special hardware (electrical monitoring equipment, servo motors, etc.). Unfortunately, it is rather terrible for complex programs that do not naturally fit in LabVIEW's paradigm of deterministic data flow paths from user input to screen or file output. For example, storing things in persistent variables that are not visible to the end user is a horrible kludge. Reusing VIs (program modules) in other projects requires you to endlessly draw wires on your screen rather than simply copy/paste something in an editor. Data processing that involves anything other than applying a standard library function (such as searching arrays for special conditions), which would have been 20 lines of straightforward C code, will take you half an hour in LabVIEW, even if you use the "C formula node" that has no debugging facilities whatsoever. You will find yourself spending most of your time moving lines on your screen because there is no free space left on the flow diagram for that extra feature that you need to implement. Stay away from LabVIEW if you can. Even LabVIEW representatives will tell you that more experienced programmers tend to not like LabVIEW so much.
I haven't used Igor myself, but I have watched over other people's shoulders. You can do quite a bit with it, but the syntax of the language is quite inconsistent; the central paradigm is that variables are either scalar or "wave", which is an array that implicitly represents a row of (x,y) values with fixed steps in x. Wave variables always have global scope, which makes functions that deal with wave variables unpleasant, especially if there is recursion.
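For readers who haven't seen Igor, here is a minimal Python sketch of the "wave" idea described above, assuming a wave is just a list of y values plus a start x and a fixed step. The names `Wave`, `x_at` and `points` are made up for illustration and are not Igor's API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Wave:
    """Hypothetical stand-in for an Igor-style wave: y values with implicit x scaling."""
    y: List[float] = field(default_factory=list)
    x0: float = 0.0   # x value of the first point
    dx: float = 1.0   # fixed step between consecutive x values

    def x_at(self, i: int) -> float:
        # x is never stored; it is computed from the index
        return self.x0 + i * self.dx

    def points(self) -> List[Tuple[float, float]]:
        # Expand to explicit (x, y) pairs, e.g. for plotting
        return [(self.x_at(i), yi) for i, yi in enumerate(self.y)]


# A 4-point trace sampled every 0.5 s starting at t = 2 s (invented numbers)
trace = Wave(y=[1.0, 4.0, 9.0, 16.0], x0=2.0, dx=0.5)
print(trace.points())
```

In this sketch the wave is an ordinary local object, so passing it into a recursive function does not run into the global-scope problem mentioned above.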
Myself, I have been a happy Gnuplot (FOSS) user since 1996. Gnuplot is quite limited in terms of data analysis and dealing with higher-dimensional datasets, and also has a very inconsistent syntax due to historical reasons, but the latter doesn't bother me since I have used it as it evolved. For serious (CPU-intensive) data analysis, I have always written C/C++ programs (with gcc, of course), using Gnuplot only for plotting.
I have played a bit with R (FOSS) for data analysis and visualisation, which I liked, but which I have never used apart from making a few drawings. Octave can be useful if you need to collaborate with Matlab people.
For hardware control, I also prefer C/C++. Unfortunately you will likely have to do that under Windows unless you want to write your own low-level drivers for Linux (I tried once and gave up when I had to read a 200-page document describing the PCI bridge controller to figure out how to do a DMA transfer).
(I'm out of the academic world now, working at a high-tech company where I have to work with Microsoft Office most of the time and god I do hate that...)
You might be interested in checking out Fernando Perez' presentation at the San Francisco Bay Area Python User's Group on usage of Python in scientific computing:
https://cirl.berkeley.edu/fperez/talks/0811_baypig_scipy.pdf
For simple tasks, even with big data sets, I've had good results with qtiplot http://soft.proindependent.com/qtiplot.html. For really complex stuff, there's ROOT from CERN http://root.cern.ch/
Actually Fermi's Linux is their build of Scientific Linux, which is a distro they have in collaboration with CERN and others.
It's originally based off Redhat/Fedora IIRC. Our department uses it.
Many larger places in the world use EPICS (Experimental Physics and Industrial Control System). An experiment I am a (small) part of uses EPICS for control.
It is an open source control system frequently used for particle accelerator control and observatory telescope control. We use it slightly differently, but for what we need to do, it works very well. It is maintained primarily by the Advanced Photon Source at Argonne National Laboratory. You can read more at the following URL:
http://www.aps.anl.gov/epics/
In case you are wondering, no, they don't use EPICS for LHC. They use a commercial SCADA program called PVSS (for the most part anyway).
Since you said experimental physics... :)
The Experimental Physics and Industrial Control System (EPICS) is a set of Open Source software tools, libraries and applications developed collaboratively and used worldwide to create distributed soft real-time control systems for scientific instruments such as particle accelerators, telescopes and other large scientific experiments.
EPICS is often used with the free real-time operating system RTEMS to build custom control systems.
Users of EPICS+RTEMS include Stanford Linear Accelerator Center (SLAC), Argonne National Labs, Brookhaven National Labs, and Canadian Light Source.
Here at the Relativistic Heavy Ion Collider at Brookhaven Lab, for large-scale data analysis and plotting we use the open-source C++ framework ROOT (root.cern.ch). ROOT is also used at the LHC. For hardware-related tasks the needs of experimental physicists are very specific to the task at hand, and we must hand-write most of the code we need. Only for very small-scale projects can software like LabView (yuck) be useful, and there just isn't any open-source equivalent of LabView that I'm aware of.
A lot of astronomers use IDL but NASA projects in general must put their software in the public domain, so it is not surprising to find packages like the Astronomical Image Processing System (AIPS), the C Flexible Image Transport System (FITS) I/O (CFITSIO) libraries and many other packages and interfaces in SciPy.
I've been a LabVIEW developer for 20 years now (since v2 came out in '89) and a C-coder for almost as long, and I can say this with about 99% certainty: LabVIEW is not for everything, but for what it is good at, there is no good replacement (open source or otherwise). LabVIEW is second to none for data acquisition, control, (some) analysis, (some) simulation and (some) SCADA. On the flip side, unless you've a lot of experience with LabVIEW and/or a lot of time to kill, don't try anything with recursion, distributed computing or high-end visualization. I guess I'm not really sure what the problem is here: For less than $2k, you can pick up a copy of LabVIEW and save your boss hundreds of hours of your time. For about $5k, you can get the whole dev package and compile things to .exe's for deployment all over the company.
Just my $0.02...
#include "humorous_pop_culture_reference.h"
For those of us (like me) who were confused when the DAQ acronym started being peppered in the parent post, it means "Data AcQuisition".
It's a tad less polished and maybe a bit more buggy than Eagle, but FreePCB is FOSS Windows software. It works sufficiently well for me to be very productive. I've used it to make all of the circuitry for the Stanford Solar Car Project.
It's much easier to use than Eagle and doesn't make you go through as much bigma to get a board made.
I wrote the DAQ I use for my research using the GnuRadio framework (which is written in Python and C++). It is relatively easy to write custom C++ modules for high-speed computation. There is also a graphical programming utility called GRC for simple LabView-style "programming". Unfortunately, the code still isn't very mature, but it is under very active development. It is difficult to argue with the simplicity of LabView for a quick interface to NI hardware, but once you leave the academic world where licenses and hardware cost real money, the appeal is quickly lost.
When I get time, I'll play with comedi and the comedi module for gnuradio for working with NI DAQ cards.
I use all of the tools listed by other commentators. But one of the most useful tools not yet listed is GRACE (or xmgrace). It produces publication quality figures, includes many useful features like linear and non-linear regression, statistical analysis, convolutions, interpolations, Fourier transforms and it supports complicated multi-graph overlays well.
In the interest of full disclosure, I've been an IGOR user for about a decade, and I have worked for the company as a consultant.
IGOR is a fantastic tool. Will it do everything? Just about. Its programmability makes it nearly infinitely flexible.
But, for me, the two features that make me shudder at the thought of dispensing with IGOR are the support and OS interoperability.
IGOR Pro is perhaps the best supported and best documented piece of software on the planet. (Yes, I have contributed a minor amount to said documentation.) The company stands behind the program completely, and the user community usually offers solutions to anyone's problem within minutes. And the documentation is simply phenomenal: the comprehensive and highly linked PDF version of the manual that ships with the software is also installed as a searchable, highly organized database that can be navigated many ways.
Then, of course, there is the bonus that everyone gets with their purchase of IGOR: you get Mac and Windows versions as part of your purchase. (I've been told the Windows version runs well on Linux under Crossover.)
And, oh yeah, the cost is a fraction of Matlab's or LabView's.
"...who search the reason of things
Are those who bring the most sorrow on themselves." --Euripides, The Medea
Scientific Linux OS is written by the scientists and engineers at CERN!
Its base is from Red Hat.
It is a jewel of an OS and is exactly what you need.
Distrowatch.com probably has it listed and a path to downloading it. It is free.
Good luck and do something good that helps people!
Professional experimental physicist here.
There are two main types of software physicists have to deal with: hardware interface, and data analysis.
Hardware interface is often the tougher one: slow controls, data acquisition, environmental monitoring - these all need to interface to hardware through various drivers. LabView is an obvious candidate for table-top experiments, since it is possible to set up working control and readout systems more or less out of the box. There is really no good open-source solution for this for the same reason that open-source drivers took a long time coming to Linux: the user base is just too small to write the code.
My own experience is that it's far better to write your own code, using whatever drivers you can scrounge - it's far more efficient at getting what YOU want done as quickly as possible once it's running. However, the time to write and debug this code is extensive. It's particularly bad since often students will write this code and then disappear, leaving you with badly-documented half-working code.
However, this is basically true of many LabView installations as well.
On the data analysis side, there are many good packages that serve as starting points. ROOT (http://root.cern.ch) is an excellent package for doing event-based data analysis in nuclear and high-energy physics, including efficient ntuple storage and histogramming. It's really a toolkit, not a program, so it allows you to do your own analysis by writing your own code.
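For anyone unsure what "event-based analysis with histogramming" looks like in practice, here is a rough plain-Python sketch of the idea. It is not ROOT code and uses none of ROOT's classes; the event fields (`energy`, `ntracks`), the cut and the bin settings are all invented for illustration.

```python
from typing import Dict, Iterable, List


def fill_histogram(values: Iterable[float], nbins: int, lo: float, hi: float) -> List[int]:
    """Count values into nbins equal-width bins on [lo, hi); out-of-range values are dropped."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            # min() guards against floating-point rounding at the upper edge
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    return counts


# Hypothetical events: one dict per event
events: List[Dict[str, float]] = [
    {"energy": 1.2, "ntracks": 2},
    {"energy": 3.7, "ntracks": 5},
    {"energy": 3.9, "ntracks": 1},
    {"energy": 8.1, "ntracks": 4},
    {"energy": 12.0, "ntracks": 3},  # outside the histogram range
]

# Event loop with a cut: keep only events with at least 2 tracks
selected = (e["energy"] for e in events if e["ntracks"] >= 2)
hist = fill_histogram(selected, nbins=5, lo=0.0, hi=10.0)
print(hist)
```

The point of a toolkit like ROOT is that the loop, the cuts and the storage are yours to write, while the histogram and ntuple machinery comes ready-made and scales to far larger data sets than a list of dicts.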
I'm not familiar with other big packages, but I know that I frequently use raw C, C++, gnuplot, perl, and python to do little jobs.
There are other tasks as well. Blogging software can be good for logbooks. Wikis are good for in-house documentation.
It really depends on specifics. But basically it depends on where your project falls on the quick-and-dirty vs long-life vs high-performance judgements.
http://www.sagemath.org/. Sage rocks. It's python based, brings together many of the useful libraries already mentioned (numpy, scipy, matplotlib, etc.), and has a very active mailing list. Can't recommend it enough.
Does having a witty signature really indicate normality?
There's nothing stopping you from using a DLL from an open-source program for the image processing.
Shane.
when you're trying to do something non-traditional (which should be all the time in experimental physics!).
There fixed it for you.
My experience with LabView (used it once for a little undergrad acoustics experiment) was that it didn't do things quite the way I expected from its presentation. The whole block diagram data flow model thing made me think that everything was going to run in parallel. In reality, it did not, so real-time processing just wasn't quite the same thing as building a circuit that did what it looked like LabView was going to do. Of course, this isn't a very strong complaint, and somebody who used it often would quickly learn exactly what it did and did not do and how.
My own offering: In high energy experiment, we use primarily ROOT (root.cern.ch), which is a horrible, bloated, feeping creature, but gets the job done. Replacing or rewriting it would be too much effort, so everyone just limps along and pretends there isn't an elephant in the room. I can't say I want to recommend using it, but I do recommend using it anyway. At least it is FOSS, so you can in principle fix it if you want.
SIGSEGV caught, terminating
wait... not that kind of sig.
Perl Data Language is quite nice -- it has all the blended hackish glue-code goodness of Perl (which could be a plus or a minus depending on your personal style). CPAN (the big Perl repository) has a lot of free stuff for access to serial, parallel, and USB ports so you can control your equipment and acquire data. PerlXS (built into Perl) gives you nice access to C libraries, and PP (a meta-language that comes with PDL) helps you sidestep a lot of cruft if you have to use a C module to import large volumes of data.
PDL has built-in commands for plotting 2-D stuff via PGPLOT or PLPLOT, and 3-D stuff via GL. I use it for all my publications.
Some prefer NumPy or SciPy, the Python equivalents of PDL, but (IMAO) Python isn't as expressive as Perl, and the external libraries, while extensive, cannot compete with CPAN for completeness or hugeness.
If you want an environment like MatLab or LabView, there is Scilab:
There are many such toolboxes: DATA ACQUISITION AND REAL-TIME
Agree. However, Fortran still evolves and Fortran 2003 (and 2008) is supposed to add some OO capabilities beyond encapsulation in modules.
I will point out that I've seen more Perl than Python in scientific computing. But most of my scientist coworkers tend to be older..
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
FermiLinux was what they used before switching to Scientific Linux. The original FermiLinux was about 1-2 years out of date because they took about a year to certify the corresponding version of RedHat (pre-Fedora). Because of that, we never used it on the desktop Linux cluster I designed and used to manage, although now that they use Scientific Linux things are a lot better.
My different generations of measurement and evaluation software stacks
Measurement software:
0) C
1) C/perl/java evaluation
2) C/java/jython evaluation:matlab
3) matlab/labview (connected by http+xml+xslt)
4) matlab with DAQ+instrument control toolbox
5) C/tcl evaluation:matlab; real time plotting: gnuplot
C was only used to bind DAQ functions that did not already exist. I don't write GUIs in C. For GUIs I prefer Java, and the best free combinations I used were 2) and 5). If I were free to choose I would use 2); however, in the environment where I am currently working some co-workers write in Tcl, so I found it more productive to learn Tcl. (Not because of my own productivity, but because I learn fast. However, one more conflict in the group could have killed the group. This selection was driven purely by social considerations, and I have to say: successfully.)
Otherwise the general idea would be to use Java: implement the GUI in Java and the scripting in one of the many languages that exist on top of Java. If you use a standalone scripting language for performance reasons (I tested that once, but I could afford to go without it), Python is a great choice for semi-numerical (and numerical) programs.
For evaluation I used:
1) perl
2) python
3) octave
4) matlab
1-3) each plotting with gnuplot. I know there are other solutions around, but when I tested them, gnuplot was always the rock-stable, feature-rich workhorse (think LaTeX-friendly output: epslatex, great in combination with a makefile for your figures in a thesis!), with the only disadvantage being that some things have grown historically.
4) is great, but do yourself a favor and always try to keep the core functionality working in Octave. I mean, Matlab is a great product, but you may prefer not to tie yourself to a package which costs that much money (think about what will happen after you leave your current university: do you still want to access your old data?). My current strategy is to keep the functions that load and evaluate experimental data working in Octave, and GUI functions in Matlab. (The only exception: I use the statistics toolbox moderately. I could rewrite the functions I use, but it would take a week. I am working in a research company, and here it is more reasonable to buy the toolbox.)
Oh, and by the way, from time to time I am unfortunate enough to work in LabVIEW. It is a PITA. My personal experience says: rewriting the program in another language usually saved time.
I'm sure this will be lost among the sea of comments, but I'll throw it out there anyway. As a former LabVIEW/Matlab user, I have had tremendous success with the Python(x,y) package.
All the development is done in Eclipse. You have a WYSIWYG editor built into Eclipse, with which to design Qt GUIs. The signal/slot methodology in Qt is similar to LabVIEW, but easier to work with once you've learned it, since you can make connections in text or the drag and drop interface, whichever is easier.
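To show what "making connections in text" means, here is a toy signal/slot pattern in plain Python. It only illustrates the idea and is not Qt's API; the `Signal` class and all the names below are invented for this sketch.

```python
from typing import Callable, List


class Signal:
    """Toy signal: keeps a list of slots (callables) and calls each one on emit."""

    def __init__(self) -> None:
        self._slots: List[Callable[[float], None]] = []

    def connect(self, slot: Callable[[float], None]) -> None:
        self._slots.append(slot)

    def emit(self, value: float) -> None:
        for slot in self._slots:
            slot(value)


readings: List[float] = []   # stands in for a data logger
display: List[str] = []      # stands in for a GUI label

new_reading = Signal()
# The "wiring" is done in text: one source feeding two consumers
new_reading.connect(readings.append)
new_reading.connect(lambda v: display.append(f"{v:.2f} V"))

new_reading.emit(1.5)
new_reading.emit(2.25)
print(readings, display)
```

Each `connect` call plays the role of one wire on a block diagram, except that it can be copied, searched and diffed like any other line of code.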
You can use the generic Qt widgets to display Matplotlib graphs, which use similar syntax to Matlab.
You have numpy/scipy, which are fantastic for processing data.
You have pyvisa/serial/parallel, which will control any instrument you have, or even ones you make/program yourself, such as FPGAs or MCUs.
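As a sketch of what "send a string of commands and read a DMM" can look like in Python: the function below only assumes a port object with `write()` and `readline()` methods that work on bytes, which is the shape of pyserial's `serial.Serial`. The command strings and the `FakeDMM` class are invented so the example runs without hardware; a real meter's command set will differ.

```python
from typing import Iterable, List


def read_dmm(port, setup_commands: Iterable[str], query: str, n: int) -> List[float]:
    """Send setup commands, then issue `query` n times and parse each reply as a float."""
    for cmd in setup_commands:
        port.write((cmd + "\n").encode("ascii"))
    readings = []
    for _ in range(n):
        port.write((query + "\n").encode("ascii"))
        reply = port.readline().decode("ascii").strip()
        readings.append(float(reply))
    return readings


class FakeDMM:
    """Hypothetical stand-in for a serial port with a meter attached."""

    def __init__(self, values: List[float]) -> None:
        self._values = list(values)
        self.log: List[bytes] = []   # everything written to the "port"

    def write(self, data: bytes) -> int:
        self.log.append(data)
        return len(data)

    def readline(self) -> bytes:
        # Reply with the next canned value, terminated like a typical instrument
        return f"{self._values.pop(0):.4f}\r\n".encode("ascii")


dmm = FakeDMM([1.0012, 1.0015, 0.9998])
# Command strings are illustrative only; check the meter's manual
volts = read_dmm(dmm, ["*RST", "CONF:VOLT:DC"], "READ?", n=3)
print(volts)
```

With pyserial installed, something like `serial.Serial("/dev/ttyUSB0", 9600, timeout=1)` could be passed in place of `FakeDMM`; the device path and baud rate here are placeholders.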
You can use Python's pickling to store your data in rapidly-accessible modes (I've opened hundred million datapoint sets in seconds), or CSV/Excelerator to store smaller chunks of data in more human-readable modes. Python also provides a fairly simple database system if you need one, and you can always use Zope or Django if you want a web interface (though that will be harder to learn).
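A minimal sketch of the two storage modes mentioned above, using only the standard library. The sample data and file names are invented, and the files go into a temporary directory.

```python
import csv
import os
import pickle
import tempfile

# Hypothetical data set: (time, voltage) samples
samples = [(i * 0.001, 0.5 * i) for i in range(1000)]

workdir = tempfile.mkdtemp()
pkl_path = os.path.join(workdir, "run001.pkl")
csv_path = os.path.join(workdir, "run001.csv")

# Binary pickle: quick to write and reload, but only readable from Python
with open(pkl_path, "wb") as f:
    pickle.dump(samples, f, protocol=pickle.HIGHEST_PROTOCOL)
with open(pkl_path, "rb") as f:
    restored = pickle.load(f)

# CSV: bulkier, but human-readable and usable from other tools
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time_s", "voltage_V"])
    writer.writerows(samples[:10])   # only a small chunk goes to CSV

with open(csv_path, newline="") as f:
    rows = list(csv.reader(f))

print(restored == samples, rows[0], len(rows))
```

One caveat: unpickling can execute arbitrary code, so pickle files should only be loaded from sources you trust.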
On top of all that, there are many more field-specific modules available in Python than in LabVIEW.
I've been using it for months now (I should emphasize that I had never even heard of Qt or numpy or any of those things before that), and I cannot wait to drop LabVIEW. If you think Perl is bad, try to debug a 5 year old LabVIEW program, developed by ten different people, each of whom was using LabVIEW because they don't know how to program properly. One of our old VIs, which only sends a string of commands over a serial port and then reads a DMM, weighed in at over 350 MB. No one can even figure out how it works any more. The same system, done with Python(x,y), came out to under 200 lines of code.