Folding@Home Reports Success
msheppard writes "This Article describes how the folding@home distributed computing project is reporting that they used the data processed on client machines to "predict the folding rate and trajectory of the average molecule." Too bad Seti@Home hasn't had a hit yet."
Here.
MSNBC Article.
Folding@Home Home
For the real info though check out the Forums
Token link to how my team is doing.
PRIME1
http://www.kubuntu.org/
If you like F@H, check out Distributed Folding.
Look, ma! I'm a karma whore
If you are interested in distributed computing, a good page to start with is http://www.aspenleaf.com/distributed/. It has info on every distributed computing project (both upcoming and existing), lots of articles on distributed computing, and even links to books on distributed computing.
-- http://electronicintifada.net --
This (PostScript) is the original paper on the hardness of String Folding problems.
Later, when folding@home has folded, the distributed power of the toolbar may be used to make a 'Super-Google' of sorts. (is that a pun?)
Unfortunately, that's a common trait in software licenses.....
I submitted this as a story, but it was rejected. Google has incorporated distributed computing into its toolbar as an option. The first supported project is Folding@Home, but they will add more projects in the future. It's optional, and so far it has only been released to a few toolbar users. It will gradually be released to all users. Check it out at toolbar.google.com/dc/. Google is currently seventh in the team statistics...
What I can't figure out is whether or not these are the same projects now!
Of course since glibc 2.3.1 killed the folding@home client....
X(7): A program for managing terminal windows. See also screen(1).
Yes it does. Be sure to set chmod +x $FILENAME.exe also.
Where does the school board find them and why do they keep sending them to ME?
Well, I kinda agree and I kinda disagree.
First, you can't expect to go from no success to complete success overnight. People have been trying to fold proteins for some time now and have basically failed because it is freakin' hard. The theory is in principle in place, at least to a first approximation, but the calculations are so intensive that they have basically beaten every comer. As an undergraduate I remember how everyone in the field thought getting bigger and better grants and buying bigger and bigger computers was the answer. Oh to be SGI in those days. They sum up the problem pretty well in the Nature paper: essentially, a modern (desktop) computer would require a few decades to crunch through a single simulation of useful length. Then you need to do it many times to get a useful answer (say 100-1000). Even supercomputers are going to balk at that kind of calculation. Moore's law being what it is, we should then be able to get through an in silico simulation in a week on a single computer (when it's this fast crystallography really will be dead) by, oh, say 2040 at best. (Somebody want to calculate that exactly? 10000 yrs -> 0.02 yrs is how many doublings?) So yes, this hasn't gotten rid of x-ray crystallography just yet.
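Taking up the "how many doublings" question, here is a rough back-of-the-envelope sketch. The 10000 yr and 0.02 yr figures come from the post; the 18- and 24-month doubling periods are my assumption about what "Moore's law" means here, and the function names are made up for illustration.

```python
import math

def doublings_needed(start_years: float, target_years: float) -> float:
    """Number of speed doublings to shrink a run time from start to target."""
    return math.log2(start_years / target_years)

def years_until(start_years: float, target_years: float,
                months_per_doubling: float) -> float:
    """Calendar years to wait, assuming one doubling every months_per_doubling."""
    return doublings_needed(start_years, target_years) * months_per_doubling / 12.0

n = doublings_needed(10_000, 0.02)
print(f"doublings needed: {n:.1f}")

# Assumed doubling periods; neither is stated in the post.
for months in (18, 24):
    wait = years_until(10_000, 0.02, months)
    print(f"{months}-month doubling: about {wait:.0f} years of waiting")
```

That is just under 19 doublings. With a two-year doubling period the wait is a little under four decades, which is in the same ballpark as the "2040 at best" guess.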
But this is still really cool. Complaints about interface and maintenance aside, this was a great system. It relied on four pretty bright insights.
First, that distributed computing is essentially the poor man's (cough, the academic's) supercomputer. Also, it automatically adapts itself to technological improvements. People will buy new computers from time to time and, hopefully, reload your software.
Second, that there was no reason the existing methods shouldn't work, other than that no one had brute-forced the process hard enough. They use a bunch of 'cheating' techniques to make this manageable on the screen saver timescale, such as a united atom model (I think that means they ignore aliphatic hydrogens) and implicit solvent (don't treat it as individual solvent molecules, just a uniform field). It was an open question whether this approach would work at all or whether you had to go over to much more explicit methods. It appears that this has kinda worked with the cheater methods in place.
Third, choice of a test case. Yes, they chose something that was small. This isn't surprising. They wanted to be done sometime this decade; remember there is a graduate student as the primary author here. Small was necessary. However, they also chose a FAST-FOLDING protein. That was clever. Basically, even with distributed computing, it is still hard to simulate a full microsecond of time. Thus they chose something that had some chance of completing its folding on the time scale that they could look at.
Fourth, they remembered their P-Chem. It is really hard to run these things to completion... so they didn't. You don't have to run the simulation until 99% of the molecules have completely folded, just until an appreciable number have folded and you can extrapolate the behavior from that. They ran a 20ns simulation (at the longest). The thing takes 7us for ~60% to fold. As a result only once in a great long while did one of the simulations actually produce a folded protein. But by doing it ~10000 times they could figure out how that translated into the rate constant. That's clever.
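To put numbers on the P-Chem trick, here is a minimal sketch that assumes simple first-order (single-exponential) folding kinetics. That is my simplification, not necessarily the exact model in the paper. The 20 ns, 7 us, ~60% and ~10000 figures are the ones quoted above; the function names are invented for illustration.

```python
import math

def rate_constant(fraction_folded: float, time: float) -> float:
    """First-order rate constant k from 'fraction folded after time'."""
    return -math.log(1.0 - fraction_folded) / time

def fraction_folded(k: float, time: float) -> float:
    """Fraction folded after 'time' for first-order kinetics: 1 - exp(-k t)."""
    return 1.0 - math.exp(-k * time)

# Figures quoted in the post, with time in nanoseconds.
k = rate_constant(0.60, 7000.0)        # ~60% folded after 7 us
p_short = fraction_folded(k, 20.0)     # chance one 20 ns run folds
n_runs = 10_000
expected_folded = n_runs * p_short

print(f"k = {k:.3e} per ns")
print(f"chance a single 20 ns run folds: {p_short:.4f}")
print(f"expected folded runs out of {n_runs}: {expected_folded:.0f}")

# Going the other way: recover k from the count of folded short runs.
k_estimate = rate_constant(expected_folded / n_runs, 20.0)
print(f"k recovered from the short runs = {k_estimate:.3e} per ns")
```

Under these assumptions only a few dozen of the ~10000 short runs ever fold, but that handful is enough to back out the rate constant. Since 20 ns was the longest run, the real count would be lower than this estimate.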
That said, yes, there is a long way to go on this, but it's still a really clever paper. No, we haven't cured cancer yet, but it's still progress. And forget an in silico structure of the ATPase; that's largely understood already (check the RCSB PDB, there's a bunch). The real challenge will be getting a structure that size that hasn't been solved by other methods and convincing anyone else that you're right! Disclosure: I don't have a PhD in this area yet, but I'm close.
When they can predict the structure of the F1F0 ATPase, then we can throw out crystallography- but it's not going to happen.
(Ignoring for the moment that crystallography has its own issues. . . at least it can show active sites and quaternary structure)
Well, for the first, we can't throw out crystallography even then. When you're doing a computer calculation, you are in the realm of theory. (even if you have arbitrary accuracy).
You will still need to do experimental verifications now and then.
At the moment, about 2/3 of known protein structures have been mapped through X-ray crystallography. At best the resolutions are about 1.8 Å, which is pretty good. So you can see quite a bit more than quaternary structure!
The other third is done with NMR spectroscopy,
usually with some powerful computing help to figure it out.
And then there are a pitiful few,
done with computers and experimental data.
These structures also have the poorest accuracy.
Note that computers will never, ever be able to figure out a protein structure ab initio (i.e. without any info except the sequence).
Do the math, say you have 100 amino acids, and you
test say, 4 conformations for each, that's 4^100
combinations to test.. and you test 10 million a second, it'll take you 5E45 years.
Much older than the current universe.
(Disclaimer: I do not -yet- have my PhD in computational biochemistry.. but I'm working on it..)
Actually, we're within the experimental error. Keep in mind that folding times are exponentially distributed (not Gaussian). This means that the std devs will be big just by their nature. 7.5 vs 6 are statistically indistinguishable.
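A small sketch of why the spread is so big: for an exponential distribution the standard deviation equals the mean. The simulation below just checks that numerically. The mean of 6 is one of the two figures above, in whatever time units they are quoted in; the sample size and seed are arbitrary choices of mine.

```python
import random
import statistics

def simulate_folding_times(mean_time: float, n_samples: int, seed: int = 0):
    """Draw exponentially distributed folding times with the given mean."""
    rng = random.Random(seed)
    # expovariate takes the rate, which is 1 / mean.
    return [rng.expovariate(1.0 / mean_time) for _ in range(n_samples)]

times = simulate_folding_times(mean_time=6.0, n_samples=100_000)
sample_mean = statistics.fmean(times)
sample_std = statistics.pstdev(times)

print(f"mean = {sample_mean:.2f}, std dev = {sample_std:.2f}")
# The gap between 7.5 and 6 is well under one standard deviation.
print(f"(7.5 - 6) / std dev = {(7.5 - 6.0) / sample_std:.2f}")
```

So a difference of 1.5 between two estimates is small next to the intrinsic scatter, unless each estimate rests on a large number of observed folding events.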
I've been folding for 2 years and I'll tell you they have deadlines on their work units; these are necessary to proceed to the next time step of the simulation.
Only people who can meet the deadlines should fold; everyone else should do something else.
To meet the deadlines you should:
1. have a fast machine ( > 300 MHz)
2. leave your machine on 24/7
3. have a persistent connection or
dial in every 6 hours or
set up your computer to autodial.
Given that you've almost stated the Levinthal paradox I'll assume you're familiar with it, but you missed the point. Basically, it states that even the simplest description of protein conformation (say 3 possible states each for 100 amino acids) can't be searched in a reasonable period of time: the shortest feasible time in which a protein could sample a state is about 10^-13 s. This works out to ~10^27 years to check all 3^100 states (borrowing Styer's description of this). This is clearly wrong; proteins fold in milliseconds (okay ns-100s of but you get my point). The clear conclusion is that proteins don't sample every conformation available, or even any significant fraction of them. There must be some fashion by which frequent short range and random long range contacts guide the protein into a 'pathway' of folding.
The nifty thing with the folding@home study is that they were able to basically show that invoking a simple physical force field system was enough to get pathways. They don't make too big a deal about this, and maybe someone else has already done it, but I'd be surprised if anyone managed to do as many trajectories as were done here. I imagine it'll be a while before they process the trajectories to try and find actual pathways (very compute intensive), but the fact that they found a comparable rate (we're not talking global conformation here, these are kinetics) suggests that they may be sitting on top of an actual description of the folding pathway for this teeny protein. Spiffy!
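The Levinthal estimate is easy to reproduce. This sketch works in log10 so the huge numbers stay readable. The 3 states, 100 residues, and 10^-13 s per state are the figures quoted above; the function name is made up for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def log10_years_to_sample_all(states_per_residue: int, n_residues: int,
                              seconds_per_state: float) -> float:
    """log10 of the years needed to visit every conformation once."""
    log10_states = n_residues * math.log10(states_per_residue)
    log10_seconds = log10_states + math.log10(seconds_per_state)
    return log10_seconds - math.log10(SECONDS_PER_YEAR)

exponent = log10_years_to_sample_all(3, 100, 1e-13)
print(f"about 10^{exponent:.1f} years to sample every state")
```

The exponent comes out near 27, in line with the ~10^27 years quoted above. The function accepts other state counts and sampling times if you want to try them.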
A comprehensive list of distributed projects can be found here http://www.aspenleaf.com/distributed/
scott
Your concern that we won't recognize complex wave methodologies (spread spectrum is one I can wrap my brain around, so I'll stop there) may be right.
However, in addition to large elaborate schemes, we have several broadcast schemes that aren't likely to change and that are simple enough to detect: telemetry (intentionally made repetitive, since spectrum's cheap when you're talking to something a zillion miles away (V'Ger) and its antenna is an ever-shrinking dot), and WWV (the US's atomic clock broadcast). Heck, our way of talking to already-gone objects like Voyager *cannot* change.
Similarly, a never-obsolete set of radar pings, carrier waves for TV and radio and *whatever* use, etc. are all just as vaguely possible. For example, WE MAY *NEVER* MANAGE TO KILL OFF FM OR SHORTWAVE BROADCASTS. And if, in this far-off civilization, two planets are settled, the first communication methodology geared to span interplanetary distances is going to be as simple as possible.
Occam's razor: Noise cancellation's first and easiest technique is a redundant signal (carrier waves with frequency modulation being close enough to fit this category). No matter what, there'll always be easy opportunities for this easy way out. Anything more complex without a good reason would be illogical (I wanted to say anything less would be uncivilized, but nobody'd remember the old ad campaign).
I like your concern, though. It brings to mind a good question I'll be sending to SETI in a moment: Has SETI projected what we'll sound like in 50 or 100 years and seen how they'd score at considering us civilized if we're entirely spread-spectrum or worse by then?!
I agree with you 100% about the bugginess of the Folding@Home client. However, I also agree with Vijay that it is a minor issue. (I still wish they would make it more like UD.) UD may have a nice stable client, but they're a bunch of pricks, and brute force drug searches don't advance pure science as much as the fundamental discoveries made by projects like Folding@Home. Vijay means well, so please forgive him for his lack of social refinement regarding his project. 263 units and counting; I feel good about the use of my spare cycles.
Wired had a good explanation on the problems inherent in predicting folding. IBM is building a big grid supercomputer to do this.