Grid Computing Saves Cancer Researchers Decades

Posted by samzenpus on Wednesday November 7, 2007 @01:41PM from the begin-the-beowulf-cluster-comments dept.

Stony Stevenson writes "Canadian researchers have promised to squeeze "decades" of cancer research into just two years by harnessing the power of a global PC grid. The scientists are the first from Canada to use IBM's World Community Grid network of PCs and laptops with the power equivalent to one of the globe's top five fastest supercomputers. The team will use the grid to analyze the results of experiments on proteins using data collected by scientists at the Hauptman-Woodward Medical Research Institute in Buffalo, New York. The researchers estimate that this analysis would take conventional computer systems 162 years to complete."

Oh great ... by trolltalk.com · 2007-11-07 13:47 · Score: 5, Funny

Wanna bet they discover that maple syrup or Canadian back bacon cures cancer?

--
Kevin Smith on Prince
I used to run Folding@... by kcbanner · 2007-11-07 13:48 · Score: 5, Insightful

...as a competition with friends. But then I realized that I didn't really need to use my computers as heaters...so I did the planet a favor and closed the client.

--
Obligatory blog plug: http://www.caseybanner.ca/
Storm Botnet by creativeHavoc · 2007-11-07 14:02 · Score: 5, Funny

If they wanted to knock that 10 years down to 5 they could just buy a chunk of the Storm Worm botnet!

--
insight through the mind
I OBJECT!! by Anonymous Coward · 2007-11-07 18:21 · Score: 5, Interesting

I know this research, and the people involved in it, very, very well, and I think this project is a very sad, very large waste of computing time.

Let me back up and explain what the project is doing. To simplify a little bit, the vast majority of "work" in the cell is done by proteins. While DNA can be thought of as something like a simple "string", proteins have complex three-dimensional shapes. Knowing those 3D shapes is of great interest to biologists. There are several reasons for that. One is that it can allow easier design of drugs targeted at a specific part of the protein. Another is that by seeing the shape, we can understand how all the mutations that occur in disease might be affecting its function.

The primary way to determine the shape of the protein is to take the protein and to grow it into an ordered crystal. You can then shine an x-ray beam through the crystal, and the diffraction pattern that emerges can be, through some very complex math, reverse-engineered into a 3D structure. Typically the most difficult part of this process is finding the specific chemical conditions that will allow a crystal to grow. These conditions differ from protein to protein.

This project is not "solving cancer", by any means. Rather, the people in Buffalo have generated a high-throughput way of screening different chemical conditions to determine which ones might allow a protein crystal to grow. They use robotics to screen about 1000 conditions, and take pictures of each condition. The question then becomes: can you automatically process the pictures to find crystals? That's the goal of this project, to help automatically identify crystals in this screen.

So why do I object so strongly to this work? There are three reasons.

First, the project has nothing to do with cancer. In fact, the proteins being analyzed are not in any way "cancer-specific proteins" -- many of them are not even human!! This "cancer" pitch is a sales job, and nothing but a sales job. As a cancer researcher, it offends me that people try to use the disease to justify research that is this unrelated.

Second, the project is ill-conceived, technically. In no way did the group in question (Igor Jurisica's lab, in Toronto) carefully select a machine-learning approach to identify good ways of analyzing images. Instead, they have just selected something like 1000 different techniques, and are running *all* of them on every image they have. It's a fishing expedition, with the hope that one of those thousand metrics they return will be a useful predictor.
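To illustrate why running a thousand untargeted metrics is risky, here is a minimal sketch in Python. It is a simulation on made-up random numbers, not the project's pipeline: the image count, the feature count, and the helper name `pearson` are all assumptions chosen for illustration. Every "feature" is pure noise, yet the best of the thousand still looks correlated with the crystal/no-crystal labels.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


random.seed(0)  # fixed seed so the run is repeatable

n_images, n_features = 40, 1000  # hypothetical sizes, not from the project

# Hypothetical labels: half the images "contain a crystal", half do not.
labels = [i % 2 for i in range(n_images)]

# Every feature is random noise with no real relationship to the labels.
features = [[random.random() for _ in range(n_images)]
            for _ in range(n_features)]

correlations = [abs(pearson(f, labels)) for f in features]
best = max(correlations)

print(f"strongest |correlation| among {n_features} noise features: {best:.2f}")
```

A metric selected this way needs to be checked on images that played no part in selecting it; otherwise the apparent predictor may be nothing more than the luckiest noise column.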

Third, the techniques selected are basically arbitrary. Most egregiously, there appear to be NO Fourier transforms included in the analysis!! Further, the images generated by the software appear to be transforms of something called "gray level co-occurrence matrices" (GLCMs), and the computation of those can be estimated in no more than five minutes. So why are they taking 5 hours per unit? It appears that they have chosen to implement an exhaustive GLCM search that is an order of magnitude slower, rather than using existing estimation procedures that are ~98.5% accurate. Is that an excuse to use more computer time? Is there any scientific merit to that? Why aren't Fourier transforms included, since they are a standard technique for image analysis?
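For readers who have not met the term, a gray level co-occurrence matrix counts how often a pixel of gray level i sits at a fixed offset from a pixel of gray level j; texture statistics such as contrast are then computed from the normalized counts. The sketch below is a minimal Python illustration on a tiny 4-level image with a single horizontal offset. It is not the project's code, and the function names `glcm`, `normalize`, and `contrast` are hypothetical.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Count pairs (image[r][c], image[r+dy][c+dx]) over the whole image."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def normalize(counts):
    """Turn raw pair counts into joint probabilities that sum to 1."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Contrast texture statistic: sum of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# A small test image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

counts = glcm(image)            # horizontal neighbor, one pixel to the right
texture = contrast(normalize(counts))

for row in counts:
    print(row)
print(f"contrast = {texture:.4f}")
```

For one offset the cost is a single pass over the pixels, so the total work scales with the number of offsets, gray levels, and images examined.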

I have a number of computers that I run various BOINC projects on, but this will NEVER be one. It's a fishing expedition, being sold as cancer research, and that is a sad way to deceive the public.