Creating Artificial Proteins
Spy der Mann writes "By examining how proteins have evolved, UT Southwestern Medical Center researchers have been able to design genes to create artificial proteins.
The researchers have discovered a set of simple "rules" that nature appears to use to design proteins. By feeding these rules into a computer program, they were able to obtain a sequence of artificial genes. These genes were then inserted into laboratory bacteria, producing the artificial proteins as expected."
I've been creating proteins by hand since I was 12!
Norman Cook's Ode to Sl
Isn't one of the arguments creationists use when they attack evolution (actually abiogenesis) that it would take so long to generate functioning proteins through random chance that it would be statistically impossible? If there are simple "rules" for creating proteins, maybe that's how nature was able to come up with life so quickly.
Real_men_don't_need_spacebars.
Good lord, what if people had other unique markings that could be tracked... finger/palm prints, DNA, retinas... now THAT would be scary.
Oh wait.
"Hel-l-l-p me-e-e-e-e..."
Seeing bad movies only encourages them. Watch responsibly
And the answer to this could be that a lot of rules have been randomly tried out. It turns out that the rule(s) we are seeing/discovering are the ones that lasted - and if they are simple they are probably efficient in some way.
The creationist/ID policy is to avoid facing unknowns by passing the buck onto a designer. In the current example, just because something appears elegant and simple to some person, it doesn't mean that it could not have occurred naturally.
Our job, as scientists or, in the more general case, as people with a scientific temperament, is to uncover how or why this simple and elegant thing is the way it is - not to say, 'It's too tough, let's pass the buck onto the designer'!
The researchers believe they may have found a set of statistical rules for determining the tertiary ('overall') structure of proteins from the sequence.
(Although the summary reads otherwise, creating a 'new' protein with an arbitrary amino acid sequence isn't new at all.)
If this pans out, it is of course significant towards the goal of engineering 'new' proteins one day. But there is still a lot to be covered. Even if the relationship between sequence and structure were simple and known (and it isn't, yet), you still have the issue of relating structure to function.
Which isn't known. And of course, even knowing the structure and function of a single protein doesn't mean you know what it's going to do in a complicated environment such as a cell, where there are thousands of things to interact with.
It's a step forward, nonetheless. But if you think this means we're going to be tricking out living organisms with new custom-engineered proteins anytime soon, you'll be disappointed.
Now where's the bacteria that will make substances like xanax or other drugs, so it can make the entire market cheaper and more affordable to those who need it but don't have insurance, and "naturally" at that? (Naturally as in not needing a buttload of power from a processing plant for the drug and wasting energy uselessly)
Um.. news flash: Drugs have been made that way for years.
But first: This works for proteins such as insulin. Most drugs are not proteins, however.
And for those that are, nothing about this approach necessarily makes it cheaper or less power-consuming. Bacteria need food. Bacteria need to be kept warm. And most importantly, you've got to separate and purify your drug from the bacteria and growth substrate and whatnot.
Of course, for proteins you've got no choice. It's practically impossible to synthesize proteins using conventional chemistry. And it's very, very difficult (and likely uneconomical) to use bacteria to produce other organic compounds. So these things are complementary to each other, really.
Full text of article, institutional/personal subscription required.
Abstract: Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.
From this page: a WW domain is the smallest, monomeric, triple-stranded, anti-parallel beta-sheet protein domain that is stable in the absence of disulfide bonds, cofactors or ligands.
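To make the abstract's phrase "statistical energy function" a bit more concrete, here is a minimal Python sketch. It is not the energy function from the Nature paper (the abstract does not give its form). It is a toy score that combines single-column and column-pair frequencies from an alignment, so that a candidate sequence respecting the co-varying pairs gets a lower "energy". The alignment, the pseudocount, and every function name are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy alignment (not real WW domain sequences).
# Columns 0 and 2 co-vary: A pairs with D, G pairs with E.
ALIGNMENT = ["ACDA", "ACDA", "GCEA", "GCEA", "ACDA", "GCEG"]

PSEUDOCOUNT = 0.5    # assumed smoothing value, avoids log(0)
ALPHABET_SIZE = 20   # standard amino acids


def column_counts(alignment):
    """Residue counts for each alignment column."""
    length = len(alignment[0])
    return [Counter(seq[i] for seq in alignment) for i in range(length)]


def pair_counts(alignment):
    """Residue-pair counts for each pair of columns (i < j)."""
    length = len(alignment[0])
    return {
        (i, j): Counter((seq[i], seq[j]) for seq in alignment)
        for i in range(length)
        for j in range(i + 1, length)
    }


def statistical_energy(sequence, alignment):
    """Sum of -log(smoothed frequency) over single sites and site pairs.

    Lower values mean the sequence looks more like the alignment.
    """
    n = len(alignment)
    energy = 0.0
    for i, counts in enumerate(column_counts(alignment)):
        p = (counts[sequence[i]] + PSEUDOCOUNT) / (n + PSEUDOCOUNT * ALPHABET_SIZE)
        energy -= math.log(p)
    for (i, j), counts in pair_counts(alignment).items():
        seen = counts[(sequence[i], sequence[j])]
        p = (seen + PSEUDOCOUNT) / (n + PSEUDOCOUNT * ALPHABET_SIZE ** 2)
        energy -= math.log(p)
    return energy


if __name__ == "__main__":
    for candidate in ("ACDA", "ACEA"):
        print(candidate, round(statistical_energy(candidate, ALIGNMENT), 3))
```

In this toy, "ACDA" and "ACEA" look equally good column by column (D and E are equally common in column 2), but only "ACDA" keeps the A/D pairing seen in the alignment, so it scores lower. That is the rough sense in which pairwise information adds something beyond per-position conservation.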
Most proteins eventually degrade, if they are not immediately destroyed by the immune system (i.e., they are antigenic). Furthermore, for proteins that don't degrade quickly, how would you detect these proteins? Other than tagging them with radioactive isotopes (try getting on an airplane with that in today's environment!), I don't see how you would detect them other than strapping someone down and getting some blood. I suppose you could always try a gene therapy technique to continually express protein, but gene therapy is still highly experimental and presents its own problems. This sounds way more complicated than just implanting inorganic RFID chips/beacons/whatevers under the skin or in a (cough!) body cavity.
In two papers appearing in the Sept. 22 issue of the journal Nature, Dr. Rama Ranganathan, associate professor of pharmacology, and his colleagues detail a new method for creating artificial proteins...
That's the sum total of useful information in the article. Go read the full paper in Nature if you want to know more. Scientific reporting at its finest. Now and then I read an article where a "journalist" actually understands what has been written and has something profound to say about it that the scientists themselves didn't even think of (and actually agree with). Unfortunately it's increasingly rare these days. Even rags like Scientific American seem to do more puff pieces and press releases than well-researched articles.
How we know is more important than what we know.
PDFs of our papers, and Java code implementing 4 different correlated mutation algorithms including SCA, are at my web site:
http://www.afodor.net
The references are:
Anthony A. Fodor, Richard W. Aldrich. On Evolutionary Conservation of Thermodynamic Coupling in Proteins. JBC 279(18):19046-19050, 2004
John P. Dekker, Anthony Fodor, Richard Aldrich and Gary Yellen. A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics 20:1565-1572, 2004
Anthony A. Fodor and Richard W. Aldrich. Influence of Conservation on Calculations of Amino Acid Covariance in Multiple Sequence Alignments. Proteins 56(2): 211-221, 2004
The last paper contains a comparison between SCA and three other correlated mutation algorithms.
As I said, I haven't had a chance to look carefully or critically at the new papers. (It takes me a LONG time to read a paper critically :-> This Slashdot thread will likely be long archived before I finish thinking about these papers!). But this particular algorithm aside, people who are interested in bioinformatics and contact prediction may find the math behind the correlated mutation algorithms interesting.
Anthony
Email: anthony.fodor(remove this and put in an at symbol)gmail.com
http://www.afodor.net/
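For readers curious what "correlated mutation" math looks like, the sketch below computes mutual information between two alignment columns. Mutual information is one widely used covariance measure. It is not SCA, and nothing here is taken from the Java code or papers mentioned above. The toy alignment and function name are assumptions for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy alignment: column 0 and column 2 change together,
# column 1 never changes.
TOY_ALIGNMENT = ["ACDA", "ACDA", "GCEA", "GCEA", "ACDA", "GCEG"]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    print("columns 0,2:", mutual_information(TOY_ALIGNMENT, 0, 2))
    print("columns 0,1:", mutual_information(TOY_ALIGNMENT, 0, 1))
```

Columns 0 and 2 are perfectly coupled in the toy data and score high, while a perfectly conserved column (column 1) carries no covariance signal at all. Measures of this kind depend heavily on how conserved each column is, so raw scores need careful interpretation.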
Why are there symbiotic relationships? They allow for division of labor, essentially. After symbiosis, the genome of one organism no longer has to handle the tasks that the other is taking care of. Most of the cells contained in your body are not actually yours. The majority of cells in the body are bacteria living in your intestine, which produce proteins that help with digestion. If our DNA had to encode every one of the digestive and metabolic proteins that are actually used in digestion, we would be selected against compared to an organism that could make more efficient use of its DNA.
Diversity also leads to a sort of long term stability. If there are different ways to obtain resources, the ecosystem as a whole can adapt to environmental changes far more gracefully.
I'll never make that mistake again, reading the experts' opinions. - Feynman
Since protein engineering is my field of study, for the benefit of the /. crowd (and my karma) I'll fill in the gaping holes left in the New Scientist article, and give you a little more background on the Nature paper. Because the writeup on /. is a perfect example of "scientific telephone": a semi-interesting result gets written up into a paper, which once it's been through several layers of editors suddenly seems like a major breakthrough.
The Nature paper isn't a breakthrough. It's not even really a major advance. Scientists in my field have been creating artificial proteins for five to ten years now. And yes, even some of them designed completely from scratch (though they're really simple; nothing as complex as, say, ATP synthase) instead of just taking a known fold pattern, known as a "motif." The "WW domain" (domain, in protein parlance, is a small, independent structure within a much larger protein---think of it like a module within the kernel or Apache) is a common fold in hundreds of different proteins. Basically, they analyzed the sequences of all of these WW domains, and figured out which positions were meaningful. It's kinda like reading through some code in a programming language you don't know, and figuring out which lines are comments and which lines are actual compilable code. This group found that the number of interesting positions is small, that they could identify them just from the amino acid sequence instead of having to mess with the whole complicated 3D structure of the domain, and that if they put together a protein with the meaningful amino acids intact and the non-meaningful positions randomized, then in many cases they could still get a pretty decent protein (in terms of structural similarity to the "natural" protein) out of it. Most of the paper is devoted to showing via various methods that they did get a pretty decent protein.
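The "comments vs. compilable code" picture above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the paper's procedure: it flags low-entropy (highly conserved) columns as "meaningful" and randomizes the rest. Note that the abstract says coevolution between residues was needed, so conservation alone, as used here, is a weaker criterion. The sequences, the entropy threshold, and the function names are all assumptions.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy alignment: columns 0 and 2 are conserved, 1 and 3 vary.
TOY_ALIGNMENT = ["WKPE", "WRPD", "WTPQ", "WAPN"]


def column_entropy(alignment, i):
    """Shannon entropy (bits) of column i; 0 means perfectly conserved."""
    n = len(alignment)
    counts = Counter(seq[i] for seq in alignment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conserved_positions(alignment, max_entropy=0.5):
    """Indices of columns whose entropy is at or below the threshold."""
    return [
        i for i in range(len(alignment[0]))
        if column_entropy(alignment, i) <= max_entropy
    ]


def randomized_variant(alignment, rng, max_entropy=0.5):
    """Keep the consensus residue at conserved columns, randomize the rest."""
    keep = set(conserved_positions(alignment, max_entropy))
    variant = []
    for i in range(len(alignment[0])):
        if i in keep:
            consensus = Counter(seq[i] for seq in alignment).most_common(1)[0][0]
            variant.append(consensus)
        else:
            variant.append(rng.choice(AMINO_ACIDS))
    return "".join(variant)


if __name__ == "__main__":
    rng = random.Random(0)
    print("conserved columns:", conserved_positions(TOY_ALIGNMENT))
    print("variant:", randomized_variant(TOY_ALIGNMENT, rng))
```

Whether a sequence built this way would actually fold is exactly the kind of question that only the wet-lab experiments can answer.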
So what does this mean for me, assuming that this paper is absolutely correct (which I admit is a little hard for me to determine with one quick reading, given that I'm just a first-year grad student)? It means that the number of meaningful amino acids in a protein (at least in terms of overall structure) is pretty low, and that they can be identified without knowing what the full 3D structure is. This is good, because for a lot of proteins, the 3D structure is difficult to get. However, they picked an easy target: a small domain where there are over 100 unique sequences known. We'll see how well this method holds up with longer domains and fewer unique sequences. The S/N ratio won't be nearly as good.
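The S/N worry can be illustrated with a small simulation. The sketch below generates pairs of completely independent random columns and measures the apparent mutual information between them. With few sequences, unrelated columns look strongly "coupled" purely by chance, and that background shrinks as the alignment grows. This is a toy with uniformly random residues and made-up sample sizes, not a model of real protein families or of the paper's statistics.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equal-length residue columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in count_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


def mean_background_mi(num_sequences, trials, rng):
    """Average apparent MI between independent random columns (pure noise)."""
    total = 0.0
    for _ in range(trials):
        col_a = [rng.choice(AMINO_ACIDS) for _ in range(num_sequences)]
        col_b = [rng.choice(AMINO_ACIDS) for _ in range(num_sequences)]
        total += mutual_information(col_a, col_b)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (20, 200, 2000):
        print(n, "sequences -> background MI:",
              round(mean_background_mi(n, 50, rng), 3))
```

The true coupling between these columns is zero, so everything the script reports is noise. A real signal has to stand out above that floor, which is harder when only a handful of unique sequences are known.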