Slashdot Mirror


Machine Learning Reveals Genetic Controls

An anonymous reader writes with this quote from Quanta Magazine: Most genetic research to date has focused on just 1 percent of the genome — the areas that code for proteins. But new research, published today in Science, provides an initial map for the sections of the genome that orchestrate this protein-building process. "It's one thing to have the book — the big question is how you read the book," said Brendan Frey, a computational biologist at the University of Toronto who led the new research (abstract).

For example, researchers can use the model to predict what will happen to a protein when there’s a mistake in part of the regulatory code. Mutations in splicing instructions have already been linked to diseases such as spinal muscular atrophy, a leading cause of infant death, and some forms of colorectal cancer. In the new study, researchers used the trained model to analyze genetic data from people afflicted with some of those diseases. The scientists identified some known mutations linked to these maladies, verifying that the model works. They picked out some new candidate mutations as well, most notably for autism.

One of the benefits of the model, Frey said, is that it wasn’t trained using disease data, so it should work on any disease or trait of interest. The researchers plan to make the system publicly available, which means that scientists will be able to apply it to many more diseases.

14 comments

  1. cis and mi regulation is not "bad" code by WillAffleckUW · · Score: 3, Interesting

    See, the problem is many of you don't get that what you think of as "noise" in the DNA is actually code. Shifted code. The internal mechanisms use cis regulation and miRNA, mRNA, cRNA to adapt to things going on in the environment.

    It's not noise code, or broken code.

    It's designed to do that.

    If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.

    --
    -- Tigger warning: This post may contain tiggers! --
    1. Re:cis and mi regulation is not "bad" code by rockmuelle · · Score: 2

      For small genomes, yes, but for large genomes, there is a lot of "unused" material.

      Only about 6-10% of the human genome is transcribed into RNA, either the protein-coding kind or the non-coding types used in regulation. (Small genomes are almost always entirely coding and even include overlapping coding regions; large genomes are the ones that have "junk" DNA in them.)

      Transcription is most closely related to a processor reading machine code and doing something with it. In a computer program, we know that we can safely remove dead code paths and the code will still function. This is not true for DNA. Remove a portion of someone's genome and they usually die.

      It's much more likely that the "junk"/"noise" regions of the genome are structural and help the DNA conform so the chromosomes can specialize for different functions. DNA folds differently depending on the cell type in multicellular organisms. Because the nucleus of a cell is a fairly crowded place, the way the DNA folds determines which sites on it are even accessible for transcription. Muscle cells expose one set of gene coding regions, fat cells expose another.

      Taken from this perspective, large genomes are more akin to an origami fortune teller than machine code. Depending on the series of folding/unfolding events, a specific fortune is revealed. The fortunes are encoded directly onto the paper, but the paper also forms the structure used to access the fortunes. Another actor reads the instructions and acts on them (a person in the origami case or polymerase for DNA).

    2. Re:cis and mi regulation is not "bad" code by WillAffleckUW · · Score: 1

      There's a good paper on how a lot more DNA is used than we think, in the next issue of Cell.

      --
      -- Tigger warning: This post may contain tiggers! --
    3. Re:cis and mi regulation is not "bad" code by HiThere · · Score: 1

      I thought that was what histones were for. DNA that's wrapped can't be read, so you control what is wrapped to decide what is available for expression. And epigenetic tags freeze or thaw the wrapping. This requires sections of DNA that function as labels, but those sections don't directly control the folding (more accurately, rolling into a cylinder). That's handled by the histones, and when they roll it up is decided by what tags are attached to the labels.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    4. Re:cis and mi regulation is not "bad" code by Anonymous Coward · · Score: 1

      If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.

      A better analogy would be a huge bloated computer program that evolved over many decades - where changing (or removing) one little thing in one place can break things in dozens of other unexpected places - but where if you were to rewrite the entire thing from scratch you could reduce the size of the code base by a factor of a hundred while still preserving all the functionality (and also eliminating lots of bugs).

      Very few biologists would imagine that you could go through the human genome and excise all the "junk" regions and still end up with a healthy human. But many would agree that some hyper-intelligent entity could almost certainly design a new species that looked and acted human but with a genome that was a hundred times smaller.

    5. Re:cis and mi regulation is not "bad" code by Cyberax · · Score: 1

      Uhm, we know pretty well that most of the junk is just junk. The recent _high_ estimate of the fraction of the human genome that has some function is about 10% (or about 15% with structural elements). That's a _high_ estimate based on analysis of the evolution of genomic sequences.

      And it's nothing unusual in the animal world. The difference is even more glaring in plants - a good old Arabidopsis is just 135 Mbp and Paris japonica is 150 Gbp. That's a difference of three orders of magnitude between plants that have no really special external characteristics! And even Arabidopsis has plenty of junk in its genome.
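
      As a rough check of the "three orders of magnitude" figure above, here is a minimal sketch that uses only the two genome sizes quoted in the comment (135 Mbp and 150 Gbp). The variable names are illustrative, not taken from any genomics library.

```python
import math

# Genome sizes in base pairs, as quoted in the comment above.
ARABIDOPSIS_BP = 135e6      # 135 Mbp
PARIS_JAPONICA_BP = 150e9   # 150 Gbp

# How many times larger the bigger genome is, and the same on a log10 scale.
ratio = PARIS_JAPONICA_BP / ARABIDOPSIS_BP
orders = math.log10(ratio)

print(f"ratio ~ {ratio:.0f}x, about {orders:.1f} orders of magnitude")
```

      The ratio comes out a little over a thousand, which is consistent with the three orders of magnitude claimed.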

    6. Re:cis and mi regulation is not "bad" code by mod+prime · · Score: 1

      Bioinformaticians are very much math orientated, and almost all of them code. Their focus has been driven by commercial interest; however, genetic scientists have been saying that non-coding DNA is functional and regulatory, just as epigenetic effects exist. For decades.

    7. Re:cis and mi regulation is not "bad" code by unitron · · Score: 1

      If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.

      A better analogy would be a huge bloated computer program that evolved over many decades - where changing (or removing) one little thing in one place can break things in dozens of other unexpected places - but where if you were to rewrite the entire thing from scratch you could reduce the size of the code base by a factor of a hundred while still preserving all the functionality (and also eliminating lots of bugs).

      Very few biologists would imagine that you could go through the human genome and excise all the "junk" regions and still end up with a healthy human. But many would agree that some hyper-intelligent entity could almost certainly design a new species that looked and acted human but with a genome that was a hundred times smaller.

      No doubt we were intelligently designed to appear to have been the result of thousands and thousands of years of trial and error for some mysterious reason that is beyond the comprehension abilities of us mere mortals.

      Or maybe we were intelligently designed with all that extra "code" so as to be able to evolve should it become necessary.

      I have an unshakeable, almost religious faith in the ID proponents' ability to come up with some sort of explanation of how evolution never happened because pocketwatches.

      --

      I see even classic Slashdot is now pretty much unusable on dial up anymore.

  2. WhatCouldPossiblyGoWrong by PPH · · Score: 1

    We let the machines reverse engineer the Homo sapiens genome. Next step, a vaccine to eradicate the infection from the planet.

    --
    Have gnu, will travel.
  3. Junk DNA by SupraTT+GOP · · Score: 3, Insightful

    Junk seems to be amazingly capable. I seem to be learning of its doing more and more with each passing day. Impressive stuff.

  4. Um, I'm autistic, and... by JasonGoatcher · · Score: 1

    I don't really feel like I have any sort of disease.

    I love how everybody but autistic people wants to cure autism.

    I mean we very seldom lie or use subterfuge, and yet WE'RE the ones that need to be cured? And what about the awesome abilities some of us have with math and various things? For all anyone knows, we're the next step on the evolutionary ladder and the only cost is a little special attention when we're children.

    1. Re:Um, I'm autistic, and... by morgauxo · · Score: 1

      "WE'RE the ones that need to be cured?"

      No, probably not you. You do realize that autism is a spectrum disorder, right? I have a couple of very close friends who are diagnosed with a mild form and a few who are not diagnosed but probably could be. They surely do NOT need to be "cured". I also have friends who have worked as caretakers for people who had it so bad they were unable to communicate, dress themselves, etc., and will never know a day of independence.

      "For all anyone knows, we're the next step on the evolutionary ladder"

      I hope so but not without another mutation or two to help take the edge off of the side-effects or at least something that causes future generations to only include the higher-functioning end of the spectrum.

      But what are you doing talking about evolution? I thought you were a young Earth creationist!

      "the only cost is a little special attention when we're children"

      I can assure you that for the families of the people on the lower functioning end of the spectrum the cost in terms of emotional stress, money and work is WAY more than just a "little special attention".