IBM Sponsors Humanitarian Grid Computing Project
BrianWCarver writes "Reuters reports that IBM and top scientific research organizations are joining forces in a humanitarian effort to tap the unused power of millions of computers and help solve complex social problems. Following the example of SETI@home, the project, dubbed The World Community Grid, will seek to tap the vast underutilized power of computers belonging to individuals and businesses worldwide and channel it into selected medical and environmental research programs. The first project to benefit will be Human Proteome Folding, an effort to identify the genetic structure of proteins that can cause diseases. The client is currently available for Windows XP, 2000, ME, and 98."
Proteins do.
Well, not only do they not support any clients besides Windoze, but if you're operating on any reasonably secured LAN where the firewall doesn't allow you to willy-nilly connect over SSL ports (443) using proprietary protocols (gasp, imagine that), it isn't going to work.
Not really a great way to get off on the right foot with this effort: they've made it impossible to use for the majority of those interested by precluding other OSes and folks on corporate networks without proxies.
Back to Folding@Home for me!
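If you want to check whether your own LAN is the problem, here is a minimal sketch that only tests whether an outbound TCP connection on port 443 gets through at all. The function name `can_reach` is made up for this example, and which host the client actually contacts is an assumption you would need to fill in yourself. A successful connect also doesn't prove that a proxy or inspecting firewall will pass a non-HTTPS protocol over that port.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds in time.

    This checks the TCP handshake only. It does not speak SSL and says
    nothing about what a proxy or firewall does to the traffic afterwards.
    """
    try:
        # create_connection resolves the name and connects, raising OSError
        # (including timeouts and DNS failures) when it cannot.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Usage would be something like `can_reach("www.worldcommunitygrid.org")`, assuming that is the host the client talks to; a `False` from inside the office and a `True` from home points at the firewall.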
Additionally, I think it's good that IBM too has an interest in this area, since 1) competition is always good and 2) it makes for more accurate results. With some luck we can have a petabyte-based grid by 2007.
For those of you who don't know, Stanford's project, called Folding@Home, uses computer cycles to observe and find out more about how proteins fold.
Now how is this really different from IBM's project?
From IBM's World Community Grid website:
"However, scientists still do not know the functions of a large fraction of human proteins. With an understanding of how each protein affects human health, scientists can develop new cures for human disease.
Huge amounts of data exist that can identify the role of individual proteins, but it must be analyzed to be useful. This analysis could take years to complete on super computers. World Community Grid hopes to shrink this time to months. Human Proteome Proteins are long and disordered chains folded into globs. The number of shapes that proteins can fold into is enormous. Searching through all of the possible shapes to identify the correct function of an individual protein is a tremendous challenge.
The Human Proteome Folding project will provide scientists with data that predicts the shape of a very large number of human proteins. These predictions will give scientists the clues they need to identify the biological functions of individual proteins within the human body. With an understanding of how each protein affects human health, scientists can develop new cures for human diseases such as cancer, HIV/AIDS, SARS, and malaria."
From Stanford's Folding@Home website:
"What are proteins and why do they "fold"? Proteins are biology's workhorses -- its "nanomachines." Before proteins can carry out their biochemical function, they remarkably assemble themselves, or "fold." The process of protein folding, while critical and fundamental to virtually all of biology, remains a mystery. Moreover, perhaps not surprisingly, when proteins do not fold correctly (i.e. "misfold"), there can be serious effects, including many well known diseases, such as Alzheimer's, Mad Cow (BSE), CJD, ALS, Huntington's, and Parkinson's disease."
"What does Folding@Home do? Folding@Home is a distributed computing project which studies protein folding, misfolding, aggregation, and related diseases. We use novel computational methods and large scale distributed computing, to simulate timescales thousands to millions of times longer than previously achieved. This has allowed us to simulate folding for the first time, and to now direct our approach to examine folding related disease."
They both sound like they're out to accomplish the exact same thing. I could not spot any real differences. Anyone care to enlighten us?
I'd encourage all of you guys to support BOINC, an open source and multi-platform architecture instead.
Treehugger? Treehugger... Treehugger!
The IBM World Community Grid project uses Agent software by United Devices, which was developed in part by some of the people from distributed.net.
Here's the URL for BOINC: http://setiweb.ssl.berkeley.edu/
Each of these protein folding projects will give a better understanding of how and why certain things occur in living things. The Folding project at Stanford studies protein folding in general, to find out which angles and other attributes are normal and which are abnormal. There is no particular protein structure they are looking at. These proteins could come from anything from prions to humans.
The Human Proteome Folding project is looking primarily at human proteins and how they could affect human function.
In my opinion both are important, since each can affect the other; for example, SARS, which usually starts in fowl and is then transmitted to humans.
grid.org and World Community Grid are the same project. See this discussion thread from grid.org.
Although United Devices is involved in running both the IBM World Community Grid Proteome project and the older cure Cancer project at http://www.grid.org/, they are unrelated. In fact UD's grid.org is running both at the same time.
If you are a grid.org member, then your existing client will be able to participate in the same Proteome project. (You have the option of opting out of the Proteome project if you want to continue running the Cancer project exclusively.)
If you download the World Community Grid client, you will only work on the Proteome project.
The BOINC open-source distributed computing main page: http://boinc.berkeley.edu
From there you can see the five projects currently using the BOINC platform (developed by the SETI@Home team)
As someone who works in the field of computational biophysics, I can say these are completely different projects. Folding@Home is designed to study the mechanism of protein folding, and uses molecular dynamics as the tool to do this. The goal of the studies is to understand at a basic scientific level just how it is that proteins fold.
This project is designed to predict the structure of large numbers of proteins for which we know the sequence, but not the structure. The algorithms for predicting protein structure are distinct from molecular dynamics, since the end goal is very different. I believe that the particular method they are using is Rosetta, developed at the University of Washington, with which the Institute for Systems Biology is affiliated.
Basically it boils down to the difference between protein folding (which implies studying the mechanism) and protein structure prediction. The second is solvable to reasonable accuracy with modern methods (although not perfect), but not cheap, so a grid computing approach is a nice way to tackle the problem.
The folding@home problem is MUCH more difficult, needing the distributed computing framework to study the folding of ONE small protein.
The information is available on their site: http://www.worldcommunitygrid.org/files/rfp.pdf
To quote:
World Community Grid is designed as a resource for research done with a philanthropic or humanitarian purpose and will only be available to projects conducted for public and not-for-profit purposes. It will serve as a useful tool for the completion of a certain stage of research, hastening the progress of projects into further phases of development. Results must be made available to the global research community by the sponsoring research organization and remain in the public domain. The results will also be available on World Community Grid's website for volunteers and other visitors.
Dumb question from a bio neophyte, but wouldn't you already know the structure if you knew the sequence, since you would have an example of the protein, and the sequence supposedly more or less determines the structure?
;)
Short answer: no.
Longer answer: first, protein structures are incredibly complex, and in fact it's often much easier to sequence a big protein than to determine its structure. The first can be done (these days) by any half-competent lab tech working with relatively cheap equipment; the second is one of the most demanding applications of the black art of crystallography -- if the protein is amenable to crystallization at all, which many aren't, and if crystallization doesn't change the protein's structure, which it often does. Other methods for determining protein structure exist, but most of them are really Not There Yet.
Second, the degree to which sequence determines structure is an open question. I mean, okay, in broad terms it does; there are only so many possible configurations for any given sequence. The problem is that the number of possible configurations for any protein of more than trivial size is really really big. There are many, many steps between "translation from RNA into polypeptide" and "finished protein" -- the simple fact is that in the cases of most proteins, not only do we not know their complete structures, we don't know how they get to the types of sub-structures we do know they have. It sounds to me like this IBM project is trying to puzzle out the first question, while AFAIK Folding@Home is more interested in the second.
Disclaimer: none of this is really my area of expertise; I'm a genomics guy. So it's quite possible that my answers are out-of-date or just plain wrong. If so, someone please tell me, because I'd like to know.
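To put a rough number on the "really really big" point above, here is a back-of-the-envelope sketch. The figures are assumptions chosen purely for illustration, not measurements: each residue is treated as having 3 independent conformations, and the hypothetical searcher tests 10^13 shapes per second. Real proteins don't search exhaustively, and real prediction methods don't either, which is rather the point.

```python
def conformations(residues: int, states_per_residue: int = 3) -> int:
    """Count chain shapes if every residue independently picks one state.

    states_per_residue=3 is an illustrative assumption, not a measured value.
    """
    return states_per_residue ** residues


def exhaustive_search_years(residues: int,
                            states_per_residue: int = 3,
                            samples_per_second: float = 1e13) -> float:
    """Years needed to try every shape once at an assumed sampling rate."""
    seconds = conformations(residues, states_per_residue) / samples_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    # Chain lengths picked arbitrarily to show the exponential growth.
    for n in (10, 50, 100, 300):
        print(f"{n:>4} residues: {conformations(n):.3e} shapes, "
              f"{exhaustive_search_years(n):.3e} years to enumerate")
```

Even under these generous assumptions the count grows exponentially with chain length, so brute-force enumeration is out for anything beyond a very short peptide.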
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
In the time it took me to create a Slashdot login to be able to post a message here, 4 other people have already joined the Grid 'team' for Slashdotters. Apparently they're tracking progress and awarding 'points' for tasks completed and our team is ranked 35th overall at last check.
For those interested, the team name is 'Slashdot Users' and more information can be found here
my geeklog
There's also a (very) large fixed energy cost to produce the things in the first place. If by using one of these fairly inefficient computers you save building another new one, then it's probably an energy win.
That said, these @Home projects aren't used to do real research. It's a feel-good project. The real projects to determine the function of proteins involve large-scale human medical records, which can't just be distributed to random computers world wide. These projects are also *not* particularly CPU intensive. You need wetware to figure out a lot of it.
To use computing power to directly figure out protein chemistry (folding, interactions, etc) is well beyond our current supercomputing capabilities. Small molecule chemistry is supercomputing (max of hundreds of atoms). To go into the tens of thousands of atoms in a pair of interacting proteins with water shell involves making approximations so gross the results are meaningless. Sorry to rain on everyone's parade, but these are useless PR stunts for the gullible (err, us I suppose).
What we need to do real ab initio protein chemistry is a whole new method of doing chemistry. Think Shor's algorithm, but for quantum chem. No such algorithm exists, nor does the quantum computer that could run it.
For a bunch of reasons:
1) Establishing a permanent infrastructure with big back-end servers to do this in a big way - permanently
2) This is the first project of MANY-MANY; if you read the site there is a process to bring in projects you would like to run on this Grid
3) Looks like IBM is donating money, people and time to kick-start this and turn it into a permanent infrastructure, maybe even roll it into a not-for-profit over time
4) Look @ Board of Advisors/directors (impressive list of folks) - IBM only has one on board
5) Using IBM marketing and worldwide presence to build a network of users in the 10's of millions...not the millions....that kind of power would help out a TON of different research projects. The press on this in one day blows away what seti and others have been able to do via grass roots efforts. Not to take away from those projects - they are awesome....but this exposure will help this Grid, Boinc, Seti, Protein folding as well
6) Check the stats, IBM is one of the world's biggest corporate sponsors and donors of technology and technology solutions back to the public and has a huge history of giving to good causes, this is just another example of this...
Boinc
A couple weeks ago, HP had a press release announcing the "Global Grid Exchange":
http://www.globalgridexchange.com/
It's interesting to me that IBM would feel pressured to "play catch up" against HP (Should we expect one from Sun next month?) Obviously both companies have been percolating SOME sort of "Killer App" Grid Initiative for some time now. Perhaps the Grid Wars are finally starting to heat up!
(The name "World Community Grid" DOES sound like a blatant copy of "Global Grid Exchange", IMHO. C'mon guys! Be original!)
It's my understanding that because the Global Grid Exchange is bytecode-based (Java) they will support Linux as well as Windows (and eventually OS X.) Also, researchers will be able to write their OWN applications to run on the Grid, rather than limiting themselves to Proteome Folding.
Imagine that -- a researcher on a Windows box will be able to write a program which could be run on a Linux box (or, I'll go ahead and say it, a Beowulf Cluster) all without the programmer having to know -- OR CARE!
For that reason alone, IBM's offering seems like "Too Little, Too Late".