Slashdot Mirror


Four New DNA Letters Double Life's Alphabet (nature.com)

Joe_NoOne (Slashdot reader #48,818) shares this update from Nature: The DNA of life on Earth naturally stores its information in just four key chemicals -- guanine, cytosine, adenine and thymine, commonly referred to as G, C, A and T, respectively. Now scientists have doubled this number of life's building blocks, creating for the first time a synthetic, eight-letter genetic language that seems to store and transcribe information just like natural DNA.

In a study published on 22 February in Science, a consortium of researchers led by Steven Benner, founder of the Foundation for Applied Molecular Evolution in Alachua, Florida, suggests that an expanded genetic alphabet could, in theory, also support life. "It's a real landmark," says Floyd Romesberg, a chemical biologist at the Scripps Research Institute in La Jolla, California. The study implies that there is nothing particularly "magic" or special about those four chemicals that evolved on Earth, says Romesberg. "That's a conceptual breakthrough," he adds... Benner says that the work shows that life could potentially be supported by DNA bases with different structures from the four that we know, which could be relevant in the search for signatures of life elsewhere in the Universe...

The researchers call the resulting eight-letter language 'hachimoji' after the Japanese words for 'eight' and 'letter'. The additional bases are each similar in shape to one of the natural four, but have variations in their bonding patterns. The researchers then conducted a series of experiments that showed that their synthetic sequences share properties with natural DNA that are essential for supporting life... Benner's group previously showed that strands of DNA that included Z and P were better at binding to cancer cells than sequences with just the standard four bases. And Benner has set up a company which commercialises synthetic DNA for use in medical diagnostics.

7 of 67 comments

  1. 6 to 8 by Artem S. Tashkinov · · Score: 3, Informative

    Not meaning to downplay the significance of this breakthrough, but early last year synthetic biologist Floyd E. Romesberg already announced the creation of two additional synthetic letters.

  2. Interesting! by oldgraybeard · · Score: 2

    Life is 8 bit
    How soon before we move to 16 -> 32 -> 64?

    Just my 2 cents ;)

    1. Re:Interesting! by Red_Forman · · Score: 5, Funny

      How soon before we move to 16 -> 32 -> 64?

      It depends if you rely on the marketing departments of NEC, SEGA or Intel.

  3. 8 POSITION, not 8 bit. by Anonymous Coward · · Score: 2, Informative

    You can only have one of the 8 bases occupying a given strand position, meaning that each position has 2^3 possible values, or 3 bits of resolution, up from 2 bits with the previous four bases. The rest of what you say is correct, however.
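The arithmetic in this comment can be checked with a short sketch. It assumes every base is equally likely at each position, which is a simplification and not something the article claims; the function name `bits_per_position` is made up for illustration.

```python
import math


def bits_per_position(alphabet_size: int) -> float:
    """Information per position, assuming all letters are equally likely."""
    return math.log2(alphabet_size)


natural = bits_per_position(4)      # G, C, A, T -> 2.0 bits
hachimoji = bits_per_position(8)    # eight letters -> 3.0 bits

# Relative gain in bits per position when going from 4 to 8 letters.
gain = hachimoji / natural - 1      # 0.5, i.e. 50% more bits per position

print(natural, hachimoji, gain)
```

Under that assumption the number of possible values per position doubles (4 to 8), while the bits per position grow from 2 to 3.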

  4. Re:the evolution revolution by Red_Forman · · Score: 2

    Isn't it easier to just drink Red Bull?

  5. Other letters were probably discarded by Solandri · · Score: 5, Insightful

    Humans have created a variety of languages, from Chinese with thousands of different characters (one for each word), to English with 26 characters which are combined to make different words, to binary with just 2 characters which are combined to make words. When researchers started playing around with compression algorithms, they got to wondering - what's the optimal number of characters in an alphabet for maximizing compression? That is, minimizing the size of the words, while also minimizing the space taken up by each character. With binary, you minimize the space needed to encode each character, but it comes at the cost of lengthening the size of each word. With Chinese you minimize the size of each word, but it comes at the cost of increasing the space needed to encode each character. How many letters in an alphabet results in the most compact language?

    The answer turns out to be e, about 2.718. An alphabet with e characters allows you to represent data the most efficiently and compactly. Obviously you can't have a non-integer number of characters, so the optimal number of characters for a compact language is 3.

    Which is probably why DNA only codes 4 different molecules. Since a double helix with conjugate pairs can't be coded with 3 letters, 4 ends up being the next step. Likely, DNA/RNA with more base pairs has developed naturally before (probably several times), but was eventually selected out after having to compete with 4-base pair DNA. So as interesting as this is, it probably isn't the first time it's happened, contrary to what TFA states.
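The "e is optimal" claim above matches what is usually called radix economy. A minimal sketch, assuming the cost of one character is proportional to the alphabet size and the word length is the logarithm of the value in that base (the function name `relative_cost` is hypothetical, and this cost model is an assumption, not something established for DNA):

```python
import math


def relative_cost(base: float) -> float:
    """Cost per unit of information: (cost per character) / (information per character).

    Assumes a character from an alphabet of size `base` costs `base` units.
    """
    return base / math.log(base)


# Compare a few alphabet sizes under this model.
costs = {b: relative_cost(b) for b in (2, 3, 4, 8)}

# Best whole-number alphabet size among 2..16.
best_integer = min(range(2, 17), key=relative_cost)

for b, c in costs.items():
    print(b, round(c, 3))
print("best integer base:", best_integer)
```

In this model 3 comes out cheapest among whole numbers (about 2.731), 2 and 4 tie (about 2.885), and 8 is noticeably more expensive. Whether the model says anything about how DNA evolved is the commenter's conjecture.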

  6. Re:hachimoji??? by Livius · · Score: 2

    With four nucleotides, each base pair encodes 2 bits. With eight pairs, it is 3 bits each. So this is only a 50% improvement in information density.

    3 bits is twice as much information as 2 bits.