Creating Artificial Proteins
Spy der Mann writes "By examining how proteins have evolved, UT Southwestern Medical Center researchers have been able to design genes to create artificial proteins.
The researchers have discovered a set of simple "rules" that nature appears to use to design proteins. By feeding these rules into a computer program, they were able to obtain a sequence of artificial genes. These genes were then inserted into laboratory bacteria, producing the artificial proteins as expected."
Well, we know we've been able to modify DNA to produce insulin from bacteria.
We've got bacteria that crap out metal wires (can't remember if we discovered them or made them).
Now where are the bacteria that will make substances like Xanax or other drugs, so the entire market gets cheaper and more affordable for those who need the drugs but don't have insurance, and "naturally" at that? (Naturally as in not needing a buttload of power from a processing plant to make the drug, wasting energy uselessly.)
I think you might be using backwards logic here. TFA states that by examining 100 proteins they were able to notice some standard common things about the proteins they were looking at. When they made rules around those common things, they could make new proteins.
It's like having 100 pieces of example code to look at before trying to create your own, not generating the code from nothing.
Their "rules" were derived from observing nature, not computed or obtained by any ab initio means.
The original summary of the article is quite off base, as many of these biochemistry-related revelations are.
I would better summarize the Nature paper as saying that the researchers have found a somewhat reliable method of duplicating a three-dimensional structure by using existing sequences as a simple template. The concept of truly "designing" a protein from scratch remains the Grail of this field.
Full text of article, institutional/personal subscription required.
Abstract: Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.
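To make "coevolution between amino acid residues" a bit more concrete, here is a minimal sketch that scores how strongly two alignment columns vary together. It uses plain mutual information, which is a simpler stand-in and not the statistical energy function from the paper; the toy alignment and the function names are made up for illustration.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean the residue at i tends to predict the residue at j,
    which is one crude signal that the two positions coevolve.
    """
    n = len(alignment)
    counts_i = Counter(column(alignment, i))
    counts_j = Counter(column(alignment, j))
    counts_ij = Counter(zip(column(alignment, i), column(alignment, j)))
    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        mi += p_ab * log2(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 1 and 3 always change together.
toy_alignment = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

coupled = mutual_information(toy_alignment, 1, 3)    # covarying pair
uncoupled = mutual_information(toy_alignment, 0, 1)  # column 0 never varies
```

A real alignment would also need handling for gaps, sequence weighting and small-sample corrections, none of which is attempted here.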
From this page: a WW domain is the smallest monomeric, triple-stranded, anti-parallel beta-sheet protein domain that is stable in the absence of disulfide bonds, cofactors or ligands.
Nope. From the original movie "The Fly" with David Hedison.
The higher the technology, the sharper that two-edged sword.
The rules for how DNA encodes proteins have been known since before I was born. The evolutionary history of how the genes coding for different proteins duplicated and evolved into new structures is fairly easy to map out (give or take a brute-force algorithm that runs in double-factorial time to search through all evolutionary trees, looking for the one that minimizes the number of mutations required along the way).
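For a sense of what "double-factorial time" means here: the number of distinct rooted, bifurcating trees on n labelled sequences is (2n-3)!!, so an exhaustive search blows up very quickly. A small sketch (the function name is just for illustration):

```python
def rooted_tree_count(n_taxa: int) -> int:
    """Number of distinct rooted, bifurcating, labelled trees on n_taxa leaves.

    This is the double factorial (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3).
    """
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


for n in (3, 4, 10, 20):
    print(n, rooted_tree_count(n))
```

Already at 10 sequences there are 34,459,425 candidate trees, which is why practical tools rely on heuristics instead of checking every tree.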
So this group has calculated the most likely common ancestor of the gene that now codes for a whole family of proteins, encoded the solution in real DNA, stuck it into bacteria, and shown that it actually does produce a protein, which they have been able to isolate so that they can explore what it does/did.
(the term "artificial protein" seems very odd to read; before I think it through, it sounds as though it's hinting there is something mystical about "natural" proteins untouched by humans)
It would also, of course, be interesting if you could use this to work backwards through the genome to a set point
There actually is research that looks at predicting the last common ancestor of two species. For example, given man and ape, you can make a prediction about what the man/ape genome was before they diverged into two species (not to go into details, but a lot of species divergence is the result of some kind of large-scale chromosome rearrangement that makes it impossible to sexually reproduce). Remember, we didn't evolve from an ape, we diverged from an ape. Man and ape have had the same amount of time to evolve their genomes to become the species that we are today. Most people assume that at one time in our past we looked exactly like present-day apes, but then evolved into humans. In fact, our common ancestor probably looked something like a cross between an ape and a human, whatever that might be.
Before there is a large debate in the ethics community, they (you) ought to get the facts straight.
E. coli is not a virus. Depending on the strain, the genome size is anywhere from 4.6 to 5.2 million base pairs. Putting one of your very own E. coli genomes together would be difficult and expensive. Better yet, why not just grow some? They'll spit up their genome after a few biochemical steps at the lab bench. If you were talking about a phage, their genomes are variable, averaging 35-50 kbp (thousands of base pairs). That's still difficult and expensive. Even when you got your phage genome put together, now what? You need infectious phage particles to go through the infection/replication cycle. Phage particles aren't infectious to mammalian cells, so that's not what the terrorists are after.
Working with and engineering virus particles doesn't rely on ordering oligos and assembling the entire genome. The hallmark of being a genejockey is letting the DNA-containing object do the replication work for you. You let the bacterial culture divide. You let the mammalian cells divide. You let the phages infect the bacteria and replicate. You let the viruses infect the mammalian cells and replicate. You then harvest the DNA from the millions of [whatever], do some Ctrl+C and Ctrl+V with some enzymes, and package the DNA back up in the infectious particles. It's a non-trivial process, and the ability to order oligos on the web doesn't magically give someone the ability to hack out a super virus.
"Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
Not true. Species is the only testable (i.e. scientific, i.e. non-arbitrary) classification. All the rest are arbitrary. A species is defined as a population of organisms with reproductive isolating mechanisms that prevent the production of viable (including fertile) offspring with other populations, i.e. two organisms are members of different species when they are unable to reproduce and/or produce viable offspring.
Since protein engineering is my field of study, for the benefit of the /. crowd (and my karma) I'll fill in the gaping holes left in the New Scientist article, and give you a little more background on the Nature paper. Because the writeup on /. is a perfect example of "scientific telephone": a semi-interesting result gets written up into a paper, which once it's been through several layers of editors suddenly seems like a major breakthrough.
The Nature paper isn't a breakthrough. It's not even really a major advance. Scientists in my field have been creating artificial proteins for five to ten years now. And yes, some of them were even designed completely from scratch (though they're really simple; nothing as complex as, say, ATP synthase) instead of just taking a known fold pattern, known as a "motif." The "WW domain" (a domain, in protein parlance, is a small, independent structure within a much larger protein; think of it like a module within the kernel or Apache) is a common fold in hundreds of different proteins. Basically, they analyzed the sequences of all of these WW domains and figured out which positions were meaningful. It's kinda like reading through some code in a programming language you don't know, and figuring out which lines are comments and which lines are actual compilable code. This group found that the number of interesting positions is small, that they could identify them just from the amino acid sequence instead of having to mess with the whole complicated 3D structure of the domain, and that if they put together a protein with the meaningful amino acids intact and the non-meaningful positions randomized, then in many cases they could still get a pretty decent protein (in terms of structural similarity to the "natural" protein) out of it. Most of the paper is devoted to showing via various methods that they did get a pretty decent protein.
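To put the "comments vs. compilable code" analogy into something runnable, here's a rough sketch of the simplified version described above: find the columns of an alignment where one residue dominates, keep those, and randomize the rest. This is a toy under my own assumptions (the 0.8 threshold, the five-residue alignment and the function names are all invented), and it ignores the coupling between positions that the paper's abstract says matters.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def conserved_positions(alignment, threshold=0.8):
    """Map column index -> dominant residue for columns at or above threshold."""
    n = len(alignment)
    conserved = {}
    for i in range(len(alignment[0])):
        residue, count = Counter(seq[i] for seq in alignment).most_common(1)[0]
        if count / n >= threshold:
            conserved[i] = residue
    return conserved


def randomized_variant(alignment, threshold=0.8, rng=None):
    """Build a sequence that keeps conserved columns and randomizes the others."""
    rng = rng or random.Random()
    conserved = conserved_positions(alignment, threshold)
    length = len(alignment[0])
    return "".join(
        conserved.get(i, rng.choice(AMINO_ACIDS)) for i in range(length)
    )


# Hypothetical toy alignment: columns 0, 3 and 4 never change.
toy_alignment = ["WAKPG", "WSKPG", "WTRPG", "WAEPG", "WGKPG"]

kept = conserved_positions(toy_alignment)
variant = randomized_variant(toy_alignment, rng=random.Random(0))
print(kept, variant)
```

Whether a sequence built this way actually folds is exactly what has to be tested at the bench; the code only shows the bookkeeping.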
So what does this mean for me, assuming that this paper is absolutely correct (which I admit is a little hard for me to determine with one quick reading, given that I'm just a first-year grad student)? It means that the number of meaningful amino acids in a protein (at least in terms of overall structure) is pretty low, and that they can be identified without knowing what the full 3D structure is. This is good, because for a lot of proteins, the 3D structure is difficult to get. However, they picked an easy target: a small domain where there are over 100 unique sequences known. We'll see how well this method holds up with longer domains and fewer unique sequences. The S/N ratio won't be nearly as good.