Doing Science With Virtual Biologists

Posted by Soulskill on Saturday October 15, 2011 @05:51AM from the please-state-the-nature-of-the-biological-emergency dept.

An anonymous reader sends word of new research into automating computational experiments. A team of scientists developed a piece of software, dubbed Eureqa, to help solve complex, computationally intensive biological problems. A new paper in the journal Physical Biology details their success (abstract). "The researchers chose this specific system, called glycolytic oscillations, to perform a virtual test of the software because it is one of the most extensively studied biological control systems. Jenkins and Vallabhajosyula used one of the process' detailed mathematical models to generate a data set corresponding to the measurements a scientist would make under various conditions. To increase the realism of the test, the researchers salted the data with a 10 percent random error. When they fed the data into Eureqa, it derived a series of equations that were nearly identical to the known equations. 'What’s really amazing is that it produced these equations a priori,' said Vallabhajosyula. 'The only thing the software knew in advance was addition, subtraction, multiplication and division.'"

Re:Pretty impressive by Daniel+Dvorkin · 2011-10-15 06:21 · Score: 4, Informative

It's neat stuff, but I'm skeptical that it will replace human biologists any time soon. As is often the case, the pop-sci writeup is a lot more dramatic than the article itself. Reading the latter, I'd say that what they've done is a clever bit of data mining combined with mathematical modeling -- they use an evolutionary algorithm to find the best set of differential equations, out of an enormous number of possible models, which describe the behavior of the data.
This is easier, and probably produces better results, than the traditional method of coming up with sets of diff. eqs. to describe the behavior of complex systems, but it's not a replacement for human judgement in coming up with the model space in the first place. (I'll also note that they performed almost the entire "experiment" on simulated data, which is always a valuable first step in the development of any modeling method, but it's not enough to show that the method "works" -- real data is always messier than the best simulations, and biological data is particularly so.) That being said, it's a very nice technique, and I'll be interested to see if the same approach can be applied to building the kinds of statistical models I work with, Bayesian networks and such.
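
For anyone who wants a feel for the approach, here is a toy sketch of that kind of evolutionary search. It is not Eureqa's code and not the paper's method. The rate law `x*y - 0.5*x`, the multiplicative uniform noise model, and every function name below are my own assumptions for illustration. The only building blocks the search gets are the four arithmetic operations, the two variables, and random constants.

```python
import math
import random

# The only operators the search may use, mirroring "+, -, *, /".
OPS = {
    '+': lambda a, b: a + b,
    '-': lambda a, b: a - b,
    '*': lambda a, b: a * b,
    '/': lambda a, b: a / b if abs(b) > 1e-9 else 1.0,  # protected division
}
VARS = ('x', 'y')

def make_data(n=50, seed=1):
    """Simulate measurements from a hypothetical 'known' rate law, plus ~10% noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x, y = rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)
        clean = x * y - 0.5 * x  # assumed stand-in for the real model
        rows.append((x, y, clean * (1 + rng.uniform(-0.1, 0.1))))
    return rows

def random_tree(rng, depth=3):
    """Build a random expression tree: nested (op, left, right) tuples with leaves."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(VARS) if rng.random() < 0.7 else round(rng.uniform(-2, 2), 2)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, env):
    """Evaluate a tree given variable values in env."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[tree] if isinstance(tree, str) else tree

def mutate(tree, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth=2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def error(tree, data):
    """Mean squared error against the observations; non-finite results count as inf."""
    try:
        total = sum((evaluate(tree, {'x': x, 'y': y}) - t) ** 2 for x, y, t in data)
    except OverflowError:
        return float('inf')
    return total / len(data) if math.isfinite(total) else float('inf')

def evolve(data, generations=30, pop_size=100, seed=0):
    """Truncation selection with elitism: keep the best quarter, refill with mutants."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        history.append(error(population[0], data))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: error(t, data))
    return best, error(best, data), history

if __name__ == "__main__":
    best, err, _ = evolve(make_data())
    print("best expression:", best)
    print("mean squared error:", err)
```

Note where the human still sits in this loop: the operator set, the variables, the tree depth, and the error function are all chosen by hand, and that is exactly the "model space" I mean above. A real system would also need crossover, a complexity penalty, and fitting of derivatives rather than a single algebraic target.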

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.