The Emerging Science of DNA Cryptography
KentuckyFC writes "Since the mid-90s, researchers have been using DNA to carry out massively parallel calculations which threaten encryption schemes such as DES. Now one researcher says that if DNA can be used to attack encryption schemes, it can also protect data. His idea is to exploit the way information is processed inside a cell to encrypt it. The information that DNA holds is processed in two stages in a cell. In the first stage, called transcription, a DNA segment that constitutes a gene is converted into messenger RNA (mRNA) which floats out of the nucleus and into the body of the cell. Crucially, this happens only after the noncoding parts of the gene have been removed and the remaining sequences spliced back together." (More below.)
KentuckyFC continues: "In the second stage, called translation, molecular computers called ribosomes read the information that mRNA carries and use it to assemble amino acids into proteins. The key point is that this is a one-way process. Information can be transferred from the DNA to the protein but not back again because during the process various details are lost, such as the places where the noncoding sequences have been removed. The new idea behind DNA cryptography is to exploit this to encrypt a message. The message is encoded in the sequence of bases in the DNA (A for 00, C for 01, G for 10, T for 11, for example) and then processed. The resulting protein is then made public. The key, which is kept private, is the information necessary to reassemble the DNA from the protein, such as the position of the noncoding regions (abstract)."
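To make the base encoding in the summary concrete, here is a minimal Python sketch of the example mapping (A for 00, C for 01, G for 10, T for 11). The function names `text_to_dna` and `dna_to_text` are hypothetical, and encoding the message as UTF-8 bytes first is an assumption; the summary does not say how the paper turns text into bits.

```python
# Example mapping from the summary: A=00, C=01, G=10, T=11.
# Helper names and the UTF-8 step are assumptions for illustration only.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def text_to_dna(message: str) -> str:
    """Encode a text message as a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Invert text_to_dna: turn bases back into bits, then bytes, then text."""
    bits = "".join(BITS_FOR_BASE[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    encoded = text_to_dna("Hi")
    print(encoded)               # four bases per byte
    print(dna_to_text(encoded))  # round-trips to the original text
```

This step is only an encoding, not encryption: anyone who knows the mapping can reverse it. Whatever secrecy the scheme has would have to come from the later processing steps.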
You would not need to solve the protein folding problem in order to crack this form of cryptography. It is not as though data is encoded in protein conformation using this technique. In fact, this technique would be unlikely to generate well-formed proteins at all. According to the paper, the method does not actually use real nucleic acids or proteins, or even very accurately simulate their properties in biological transcription or translation. The paper is even titled "A Pseudo DNA Cryptography Method." The author is using transcription and translation as a model for the general data flow present in this scheme, but the author points out that strictly hewing to the biological splicing scheme would introduce extra vulnerabilities, since it would be possible to identify from the final protein sequence places where splicing occurred.
On the subject of vulnerabilities, this method, as admitted in the paper, is a symmetric substitution cipher. You still need a secure channel to perform key exchange (the key here contains the locations and lengths of spliced-out introns). If an eavesdropper gets hold of the protein (ciphertext), a simple lookup of codons gets the eavesdropper back to the post-spliced RNA. The unique challenge of this cipher is to determine where splicing occurred in order to get back to the pre-spliced RNA (which is a simple complement of the DNA sequence, which in turn is an easy substitution cipher away from the plaintext). While it is a clever implementation, the intron splicing in this method is really no different from the mechanisms used to confuse plaintexts in block ciphers like DES, and it is subject to the same vulnerabilities, such as differential attacks.
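To illustrate how little the last two layers protect, here is a rough Python sketch of the "simple complement" and "easy substitution" steps, assuming an attacker has already recovered the pre-spliced RNA. The function names are hypothetical, the base-to-bits table is the example mapping from the story summary (A=00, C=01, G=10, T=11), and the sequence length is assumed to cover whole bytes.

```python
# Assumptions: standard base pairing for the complement step, and the
# summary's example mapping (A=00, C=01, G=10, T=11) for the bit encoding.
DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}
RNA_TO_DNA = {rna: dna for dna, rna in DNA_TO_RNA.items()}
BITS_FOR_BASE = {"A": "00", "C": "01", "G": "10", "T": "11"}


def transcribe(dna: str) -> str:
    """Model transcription as a base-by-base complement of the DNA."""
    return "".join(DNA_TO_RNA[base] for base in dna)


def recover_plaintext(pre_spliced_rna: str) -> bytes:
    """Undo the complement, then undo the two-bits-per-base encoding."""
    dna = "".join(RNA_TO_DNA[base] for base in pre_spliced_rna)
    bits = "".join(BITS_FOR_BASE[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    rna = transcribe("CAGACGGC")   # DNA encoding of the two bytes b"Hi"
    print(rna)
    print(recover_plaintext(rna))
```

Neither function needs any secret, which is the point: in this sketch all of the work an attacker faces sits in locating the splice sites, not in these table lookups.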