Grid Computing Saves Cancer Researchers Decades

Oh great ... by trolltalk.com · 2007-11-07 13:47 · Score: 5, Funny

Wanna bet they discover that maple syrup or Canadian back bacon cures cancer?

I used to run Folding@... by kcbanner · 2007-11-07 13:48 · Score: 5, Insightful

...as a competition with friends. But then I realized that I didn't really need to use my computers as heaters...and did a number for the planet and closed the client.

--
Obligatory blog plug: http://www.caseybanner.ca/

Re:I used to run Folding@... by Turn-X+Alphonse · 2007-11-07 13:51 · Score: 4, Informative

If you run it at a low level you can increase your usage by only about 1-2% and still help the project. There is no logical reason to run the client at 100% if it's going to cost you a bomb, whereas at 1-2% you won't win any contests, but you will be helping the project and paying at most a buck or two extra on electricity a month.

--
I like muppets.
Re:I used to run Folding@... by ChatHuant · 2007-11-07 14:13 · Score: 2, Insightful

... Al Gore, uses the word "if" too much. It's an old debating trick, to say "if X, then Y", and focus on the terrible consequence Y, and completely avoid the debate - which is over the validity/scope/level/definition of X

I don't see it as a trick, but rather as being honest. Many of the "X" items aren't certain; it would be a lie to present them as such. But we can estimate the probability of X (based on the current state of knowledge), and explore the consequences if X *does* occur. Gore's argument is that the consequences are serious enough to require action now, even if X may not happen after all. Most climate change skeptics I've seen ignore that and focus on the fact that the Xs aren't 100% surely proven.
Re:I used to run Folding@... by DigiShaman · 2007-11-07 14:21 · Score: 2, Insightful

Then just run it in the winter.

Exactly!

Rather than turning on my heater these past few days (getting chilly at night in Houston, TX), I run the GPU Folding@home client on my PC. Seriously, it's not wasted energy if you want your home to be heated. You also participate in a worthy cause to boot!

--
Life is not for the lazy.
Re:I used to run Folding@... by Belial6 · 2007-11-07 14:38 · Score: 2, Insightful

Gore is an idiot. The real "Inconvenient Truth" is that following Gore's advice will kill you within minutes. I'm not convinced that mass suicide is really the right answer.

In fact, you can even reduce your carbon emissions to zero.

Al Gore
Re:I used to run Folding@... by ZorinLynx · 2007-11-07 15:21 · Score: 3, Insightful

This only applies if you use electric heating.

In most places, electrical energy costs a HELL of a lot more per watt-hour than other sources like natural gas, oil, propane, and so on.

So unless you heat your home with electricity, which practically no one north of Florida does unless they have VERY cheap electrical power, you'll still be paying more by running computers.
Re:I used to run Folding@... by scottv67 · 2007-11-07 16:43 · Score: 2, Insightful

...outweigh the years of life extended by treating cancers.

It's easy to feel that way until someone in your family is diagnosed with cancer. Also, treating cancer does not just "extend life". There are a lot of younger people (20 to 40 years old) who get different forms of cancer. For them, it's not "will I live to 76 or will I live to 80?" but "will I live to see 30?". Don't even get me started on the kids who are afflicted with these diseases.
Re:I used to run Folding@... by porpnorber · 2007-11-07 19:39 · Score: 2, Interesting

Meanwhile, since I live in Canada and by this time of year I do need heating, I have my boinc client running at 100%, I'm doing some good, and (since the peak capacity of the machine is justified in other ways) it's not costing a penny. The heating here is electric anyway; it may as well do some computation on its way into my home!
Doing whatever@home in the winter is just good sense.
Now what's needed is a distributed computing client that is controlled by a room thermostat. No, really, I'm totally serious.
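
The thermostat idea can be sketched as a simple hysteresis switch. This is only an illustration: the class name `ThermostatGate`, the 20 °C target, and the 0.5 °C band are made up here, and both reading the room temperature and actually pausing or resuming the client are left out.

```python
class ThermostatGate:
    """Hysteresis switch: compute while the room is cold, pause when warm.

    Hypothetical helper for illustration; not part of any real client.
    """

    def __init__(self, target_c, band_c=0.5):
        self.low = target_c - band_c    # at or below this: start computing
        self.high = target_c + band_c   # at or above this: stop computing
        self.computing = False

    def update(self, room_temp_c):
        """Feed one temperature reading; return whether to compute now."""
        if room_temp_c <= self.low:
            self.computing = True
        elif room_temp_c >= self.high:
            self.computing = False
        # Inside the band the previous state is kept, which avoids
        # rapid on/off flapping around the target temperature.
        return self.computing


gate = ThermostatGate(target_c=20.0)
decisions = [gate.update(t) for t in (18.0, 19.8, 20.6, 20.2, 19.4)]
print(decisions)  # [True, True, False, False, True]
```

The band is the design choice that matters: without it the client would be toggled on every small fluctuation around the target.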
Re:I used to run Folding@... by TeknoHog · 2007-11-07 23:44 · Score: 3, Informative

Can I run it so that speedstep/cool'n'quiet works? What I mean I do not want to run anything which increases the CPU frequency. Instead it should keep the CPU at lowest freq. Can this be accomplished?

Linux's CPU frequency scaler has this option. For example the 'conservative' governor has the file /sys/devices/system/cpu/cpu0/cpufreq/conservative/ignore_nice_load. So a program running with lower than default priority will not increase CPU frequency.

I use a script to handle CPU frequency changes. When I'm at home with my laptop, I use the "ignore nice" option which in practice will turn the fan off. YMMV. When I go somewhere, I can set the CPU to full steam.
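
A minimal sketch of what such a script might look like, assuming the sysfs path quoted above. The function names are hypothetical, writing to sysfs normally needs root, and the file may be absent or live elsewhere depending on the kernel and the active governor, so the code checks for it rather than assuming it exists.

```python
from pathlib import Path

# Path as quoted in the comment; only present on some kernels while the
# 'conservative' governor is active.
IGNORE_NICE = Path(
    "/sys/devices/system/cpu/cpu0/cpufreq/conservative/ignore_nice_load"
)


def set_ignore_nice(enabled, tunable=IGNORE_NICE):
    """Write 1 or 0 to the tunable; return False if it is not available."""
    tunable = Path(tunable)
    if not tunable.exists():
        return False
    # May raise PermissionError when not run with sufficient privileges.
    tunable.write_text("1\n" if enabled else "0\n")
    return True


def get_ignore_nice(tunable=IGNORE_NICE):
    """Return the current setting as a bool, or None if unavailable."""
    tunable = Path(tunable)
    if not tunable.exists():
        return None
    return tunable.read_text().strip() == "1"
```

With the option on, a client started at low priority (for example via `nice`) should not push the CPU to a higher frequency, which is the behaviour described above.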

--
Escher was the first MC and Giger invented the HR department.

Storm Botnet by creativeHavoc · 2007-11-07 14:02 · Score: 5, Funny

If they wanted to knock that 10 years down to 5 they could just buy a chunk of the Storm worm botnet!

--
insight through the mind

How good are the programs by gringer · 2007-11-07 14:06 · Score: 4, Insightful

I hope they're using programs that've had a few computer scientists' eyes over them. One of the issues I see with supercomputing is that people tend to see it as a way to get around dumb code(1) — if the computer's fast enough, you can implement *five* infinite loops, have an exponential time algorithm, and still get the calculations done before dinner!

(1) although from their point of view, it's just slow code.

--
Ask me about repetitive DNA

162 years? by sayfawa · 2007-11-07 14:12 · Score: 4, Insightful

Okay, not that I'm knocking how cool this grid computing is, but that estimate of 162 years without grid computing couldn't possibly be taking into account the acceleration of computing power. Maybe with today's computers it would take 162 years, but after the first couple of years just get a new computer and cut the time in half.
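
A back-of-the-envelope version of that argument, as a sketch only. It assumes speed doubles every two years (the "couple of years" above) and that you can upgrade continuously; both are simplifications, and the function names are made up for illustration.

```python
import math


def years_with_upgrades(work_years, doubling_years):
    """Wall-clock years to finish `work_years` of today's-machine work
    when speed doubles every `doubling_years` and you upgrade continuously.

    Solves  integral_0^t 2**(s/d) ds = work_years  for t, which gives
    t = d * log2(1 + work_years * ln(2) / d).
    """
    d = doubling_years
    return d * math.log2(1.0 + work_years * math.log(2.0) / d)


def years_wait_then_run(work_years, doubling_years, wait_years):
    """Wait `wait_years`, then run the whole job on one machine bought then."""
    return wait_years + work_years / 2.0 ** (wait_years / doubling_years)


print(round(years_with_upgrades(162, 2), 1))  # 11.7
```

Under those assumptions the 162 years shrinks to roughly a dozen, which is still a long time to wait compared with spreading the work over many machines today.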

Which reminds me of how towards the end of my grad school career I did hours long simulations that would have taken weeks at the beginning of grad school. I was in grad school a long time :(

--
Free the Quark 3 from asymptotic confinement! Bring your charm! Don't get down! All colours and flavours welcome!

Re:162 years? by JK_the_Slacker · 2007-11-07 14:37 · Score: 3, Insightful

We're computer scientists. We can calculate these kinds of things. Protein folding calculations take a ridiculous amount of time and processing power. That's a reflection of how complex your dna is, not a reflection of how much processing power we have at our disposal. If we could borrow from the computing power of the future, then you might be right. But the fact remains, we only have what's at our disposal now. At the current state of computing technology, the calculations would take 162 years.

That's the thing, though... as computing power scales, so does the distributed computing. With one centralized server, if you start running a simulation on it, you have to continue to run that simulation on that server. On the other hand, in a distributed environment, when newer, more powerful machines come out, you can just set up a simulation client on it, and increase your calculation speed by that much. I used to run Folding @ Home on a 700 MHz computer with 256 MB of RAM. I later upgraded to a 1600 MHz computer with 512 MB of RAM. Now, I fold on a 2.2 GHz dual-core machine with 2.5 Gig of RAM. Does the newer machine do the work much faster than the two older machines? Yes, it does. Does that mean that the work I did on those older machines was needless? No. I still fold occasionally on the 1.6 GHz machine, and it takes about a week to turn over a WU, as opposed to less than 24 hours on my main machine. Should I stop folding on the old one because the new one works so much faster? No, because that's about 52 WUs I don't have to fold on my main machine per year. It's an increase in computing power, and that's always desirable in a situation like this.

It's all fine and dandy to talk about how much computing speed will increase in the future... but, in the end, reality overcomes theory. There are people dying of cancer right now, people that can be helped by letting computers do the work. True, in two years, the work will likely get done faster... but, that doesn't change the fact that we can't just sit around and wait. When those better computers come into play, then let's add them to the pool. Until then, let's get something done.

--
I'm waiting for a "-1 somepeoplejustshouldn'tgetmodprivileges" meta-moderation.

This is great and all but... by Icarus1919 · 2007-11-07 14:31 · Score: 4, Interesting

But do we see a chunk of the profit that they'll be making off the cancer drugs they make from this data that OUR computers analyzed and then is eventually sold to us for too-high-to-afford prices?

Re:This is great and all but... by FooAtWFU · 2007-11-07 14:52 · Score: 2, Insightful

So, you seem to be complaining that the (evil) biopharmaceutical companies are greedy and want money and this is wrong... unless you can have a slice of it too? I think you need some sort of levee around your moral high ground, buddy.

--
The World Wide Web is dying. Soon, we shall have only the Internet.
Re:This is great and all but... by S.O.B. · 2007-11-07 15:26 · Score: 3, Informative

But do we see a chunk of the profit that they'll be making off the cancer drugs they make from this data that OUR computers analyzed and then is eventually sold to us for too-high-to-afford prices?

The research is being done by scientists at Princess Margaret Hospital in Toronto, a government run hospital. If you knew anything about health care in Ontario you'd know that profit is the last thing on their mind.

--
Some of what I say is fact, some is conjecture, the rest I'm just blowing out my ass...you guess.

Desktops are not supercomputers by deadline · 2007-11-07 14:44 · Score: 3, Informative

Every time these "connect desktops to become the fastest computer in the world" articles come up, I have to dust off my Cluster Urban Legends article to clear up the misconceptions that abound. I also did a piece on the Linux Magazine site that debunks much of the spam-bot supercomputer legend (need to register for that one).

--
HPC for Primates. Read Cluster Monkey

Re:Desktops are not supercomputers by deadline · 2007-11-07 15:20 · Score: 3, Insightful

I'm not talking about spare cycles. I'm talking about the naive notion that gets repeated in the press: "the combined power of all these computers equals one of the fastest supercomputers in the world." For trivially parallel applications this might be true, but just once I would like to see these "supercomputers" run a simple parallel benchmark like High Performance Linpack (used for the Top500 list). My guess is the number of real FLOPS would be much less than expected -- if it even finished. Don't get me wrong, using computers like this is a great idea; it is not one of the most powerful computers in the world, however.

--
HPC for Primates. Read Cluster Monkey

PS3 Supercomputer by jhines · 2007-11-07 14:45 · Score: 3, Informative

Folding@home has reached a petaflop out of PS3 consoles. A record, supposedly, according to BBC News: http://news.bbc.co.uk/2/hi/technology/7074547.stm

I run their PC sw on my systems I keep on. They are getting results, and publishing papers based on the research.

Patents? by DoofusOfDeath · 2007-11-07 14:47 · Score: 2, Interesting

I'm very glad to help cancer research, but will this also result in the development of drug patents that (a) bankrupt some patients, and (b) prevent other researchers from improving on those drugs?

Because that would make me feel a little less charitable with my computing power. (Only a little, though.)

I don't get it... by Pedrito · 2007-11-07 15:43 · Score: 3, Informative

"The researchers estimate that this analysis would take conventional computer systems 162 years to complete."
They're always saying, "We've knocked decades off of our work by using the right tool for the job." That's like me saying I knocked decades off of the calculations to run an energy minimization on a hexane molecule by running it on my Core 2 Duo instead of my Atari 800.

I mean, let's face it. They weren't going to let the friggin' program run for 162 years. The problem became solvable when the hardware became available. Hell, within 5 years, that "conventional computer system" will be able to solve it in a fraction of that 162 years and 5 years later, a fraction of that. So what do you do? You wait until the hardware catches up with the ability to solve the problem. They haven't saved decades. They probably haven't even saved a decade. Within a decade they'd probably be able to run it in a few days on a conventional computer.

Open Source Software Cures Cancer by atwtftg · 2007-11-07 16:57 · Score: 3, Insightful

According to the World Community Grid website:

World Community Grid is making [this] technology available only to public and not-for-profit organizations to use in humanitarian research that might otherwise not be completed due to the high cost of the computer infrastructure required in the absence of a public grid. As part of our commitment to advancing human welfare, all results will be in the public domain and made public to the global research community.

WCG uses the Berkeley Open Infrastructure for Network Computing (BOINC) client, an open source software project that runs on Linux, Mac and Windows. Headline should read Open Source Software Cures Cancer ;-)

BoincStats shows you who is contributing to World Community Grid projects. Check it out...and ask yourself why you aren't contributing.

How could this be? by WK2 · 2007-11-07 17:02 · Score: 2, Funny

How could they knock decades of research off when we are less than 10 years (TM) away from a cure?

--
Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/

I can see it now by EEPROMS · 2007-11-07 17:22 · Score: 3, Interesting

We "the people" run the software and pay the millions of dollars of hardware and electricity costs. When the problem is solved the University patents everything (thank you suckers) and licenses the technology for a small fortune to some back-stabbing Megacorp (TM) drug company. So when "we the people" get sick we have the wonderful knowledge that we have paid twice for the rip-off drugs. So all things being fair, if you want my spare CPU time I want a part of the license fees to pay for the drugs that cost a house when I get sick.

I OBJECT!! by Anonymous Coward · 2007-11-07 18:21 · Score: 5, Interesting

I know this research, and the people involved in it very, very well, and I think this project is a very sad, very large waste of computing time.

Let me back up and explain what the project is doing. To simplify a little bit, the vast majority of "work" in the cell is done by proteins. While DNA can be thought of as something like a simple "string", proteins have complex three-dimensional shapes. Knowing those 3D shapes is of great interest to biologists. There are several reasons for that. One is that it can allow easier design of drugs targeted at a specific part of the protein. Another is that by seeing the shape, we can understand how all the mutations that occur in disease might be affecting its function.

The primary way to determine the shape of the protein is to take the protein and to grow it into an ordered crystal. You can then shine an x-ray beam through the crystal, and the diffraction pattern that emerges can be, through some very complex math, reverse-engineered into a 3D structure. Typically the most difficult part of this process is finding the specific chemical conditions that will allow a crystal to grow. These conditions differ from protein to protein.

This project is not "solving cancer", by any means. Rather, the people in Buffalo have generated a high-throughput way of screening different chemical conditions to determine which ones might allow a protein to grow. They use robotics to screen about 1000 conditions, and take pictures of each condition. The question then becomes: can you automatically process the pictures to find crystals. That's the goal of this project, to help automatically identify crystals in this screen.

So why do I object so strongly to this work? There are three reasons.

First, the project has nothing to do with cancer. In fact, the proteins being analyzed are not in any way "cancer-specific proteins" -- many of them are not even human!! This "cancer" pitch is a sales job, and nothing but a sales job. As a cancer researcher, it offends me that people try to use the disease to justify research that is this unrelated.

Second, the project is ill-conceived, technically. In no way did the group in question (Igor Jurisica's lab, in Toronto) carefully select a machine-learning approach to identify good ways of analyzing images. Instead, they have just selected something like 1000 different techniques, and are running *all* of them on every image they have. It's a fishing expedition, with the hope that one of those thousand metrics they return will be a useful predictor.

Third, the techniques selected are basically arbitrary. Most egregiously, there appear to be NO Fourier transforms included in the analysis!! Further, the images generated by the software appear to be transforms of something called "gray level cooccurrence matrices", and the computation of those can be estimated in no more than five minutes. So why are they taking 5 hours per unit? It appears that they have chosen to implement an exhaustive GLCM search that is an order of magnitude slower, rather than using existing estimation procedures that are ~98.5% accurate. Is that an excuse to use more computer time? Is there any scientific merit to that? Why aren't Fouriers included, since they are a standard technique for image analysis?
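
For readers who have not met the term, here is a small sketch of what a gray level co-occurrence matrix is, computed exhaustively for a single pixel offset, along with one texture feature derived from it. This is not the project's code: the function names, the 4-level toy image, and the choice of the contrast feature are assumptions made purely for illustration.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows = len(image)
    cols = len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
    return counts


def contrast(counts):
    """Contrast feature: sum over (i, j) of P(i, j) * (i - j)**2."""
    total = sum(sum(row) for row in counts)
    return sum(
        counts[i][j] / total * (i - j) ** 2
        for i in range(len(counts))
        for j in range(len(counts))
    )


toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(toy)  # horizontal neighbour, one pixel to the right
print(matrix)       # [[2, 2, 1, 0], [0, 2, 0, 0], [0, 0, 3, 1], [0, 0, 0, 1]]
```

The cost grows with image size, number of gray levels, and the number of offsets examined, which is why an exhaustive search over many offsets takes far longer than an estimate from a subset.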

I have a number of computers that I run various BOINC projects on, but this will NEVER be one. It's a fishing expedition, being sold as cancer research, and that is a sad way to deceive the public.

Re:I OBJECT!! by Tom+Womack · 2007-11-07 23:31 · Score: 2, Interesting

Given that most proteins contain tryptophan, and tryptophan fluoresces under UV, and UV lasers are not that hard to come by, wouldn't it be easier to shine a UV laser at the crystallisation plate and detect by subtraction where the glowy bit is?

Or, as a lot of molbio automation companies are offering, actually shine an X-ray beam through the putative crystal onto a detector and see if it diffracts.

Fully automated high-throughput crystal growing strikes me as a bit of a boondoggle; the sophisticated robots required for the last steps of automation are an order of magnitude more expensive than having three shifts of trained Indian or Chinese workers moving plates around and looking through microscopes.
Re:I OBJECT!! by defile39 · 2007-11-08 01:35 · Score: 2, Insightful

I'll start off by saying that I know little more about x-ray crystallography than what you explained in your post. My concern with your objection is, however, more related to your criticism. I understand your distaste for the project's underhanded tactics in trying to generate publicity. Beyond that, however, your criticisms fail to address the merits of what the group IS doing (other than what I perceive as your criticism of high-throughput screening in general). If you feel that your technical criticism has merit, have you explained your concern to the team conducting the analysis?

Besides, even though many of the proteins are not directly associated with cancer, the knowledge that will come from having thousands of additional proteins' 3-D structures will surely aid future cancer research.
Re:I OBJECT!! by C+Cumbaa · 2007-11-08 09:59 · Score: 2, Informative

In no way did the group in question (Igor Jurisica's lab, in Toronto) carefully select a machine-learning approach to identify good ways of analyzing images. Instead, they have just selected something like 1000 different techniques, and are running *all* of them on every image they have. It's a fishing expedition, with the hope that one of those thousand metrics they return will be a useful predictor.
Not quite. The machine learning bit comes second. You have to spend the CPU cycles to extract features from the images first. Only then can your favourite ML technique tell you if the features are predictive. The first ~1000 features (already computed, locally) show some promise, and that's why this project will explore the image feature space a bit more (~12000 features). Once we get Grid results back from our human-scored image set, any features that are a clear waste of time will be dropped.

Third, the techniques selected are basically arbitrary. Most egregiously, there appear to be NO Fourier transforms included in the analysis!!
Again, not really. The techniques selected are based heavily on our own research and on successful methods drawn from the literature. I can confirm that no Fourier analysis is done. Fourier analysis can tell you that there are high-frequency components in the image. So can simple edge detection. And a Radon transform will find the straight edges of a protein crystal. Publish your Fourier-based method of distinguishing amorphous precipitate from protein crystal, and I will include it in Phase II of the project. Before you do that, maybe also read up on wavelets.

So why are they taking 5 hours per unit? It appears that they have chosen to implement an exhaustive GLCM search that is an order of magnitude slower, rather than using existing estimation procedures that are ~98.5% accurate.
More time = more exploration of feature space. Show me proof of a "98.5% accurate" approximation method, and I will make sure that gets in to Phase II as well.

By the way, I noticed that the "Slashdot Users" team on the World Community Grid is ranked #4. You guys are huge contributors. Whether you contribute to this project, or Dengue Fever, or whichever, thanks.

Christian Cumbaa
Research Associate
Ontario Cancer Institute
Re:I OBJECT!! by C+Cumbaa · 2007-11-08 15:03 · Score: 2, Informative

Here is a more complete story: between changing compilers, moving from the development platform to the target platforms, and identifying some redundant computation in one corner of the algorithm, we were able to reduce the run-time from about six hours to five minutes. This allowed us to undo some rather brutal compromises (accuracy for speed) we had made in a previous stage of development, when we thought the analysis was running unacceptably long for Grid purposes.

The extra hours are not busy work.

Christian Cumbaa
Research Associate
Ontario Cancer Institute
