Slashdot Mirror


First Sequencing Of Plant Genome

cthugha writes: "The genome of Arabidopsis thaliana has just been completely sequenced, making it the first plant species to have its genome fully sequenced. The fact that we have animal and plant genomes now should give us greater insight into the common aspects of eukaryotic life. Nature has good coverage here. The ABC has a shorter and easier-to-digest report, but the emphasis is on the fact that Australian scientists could not participate due to lack of funding rather than on the technical details."

23 of 62 comments

  1. Re:source code available online by jbuhler · · Score: 2

    WARNING: according to some mail I subsequently received from the investigators at the Max Planck Genome Institute, the above sequence is incomplete and was intended only as a private communication within their research group. Please don't download it.

  2. source code available online by jbuhler · · Score: 3

    If you want to, um, compile your own version of A. thaliana, see

    ftp://warthog.mips.biochem.mpg.de/pub/cress/MAR/

  3. Genome Sizes. by Tim · · Score: 2

    Haploid Genome Sizes (collected from various sources):

    Homo sapiens (human): 3.3 x 10^9 bp, # of genes unknown

    Drosophila melanogaster (fruit fly): 1.8 x 10^8 bp, 13,601 genes (if you believe Celera has sequenced it all)

    Caenorhabditis elegans (worm): 95.5 x 10^6 bp, 19,820 genes.

    Saccharomyces cerevisiae (yeast): 12 x 10^6 bp, 5,885 genes.

    E. coli (bacterium): 4,639,221 bp, 4,377 genes.

    Hemophilus influenzae (simpler bacterium): 1,830,138 bp, 1,738 genes.

    Arabidopsis thaliana: 1.17 x 10^8 bp, ~25,000 genes.

    Wheat: 16 x 10^9 bp, ~30,000 genes.
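    One way to read the list above is as gene density rather than raw size. The following is a minimal sketch, assuming the figures quoted in this comment are taken at face value; the names GENOMES, genes_per_megabase and density_table are made up for illustration, and human is left out because its gene count is given as unknown.

```python
# Haploid genome sizes (base pairs) and gene counts as quoted in the list above.
# Human is omitted because the comment gives its gene count as unknown.
GENOMES = {
    "Drosophila melanogaster": (1.8e8, 13_601),
    "Caenorhabditis elegans": (95.5e6, 19_820),
    "Saccharomyces cerevisiae": (12e6, 5_885),
    "E. coli": (4_639_221, 4_377),
    "Hemophilus influenzae": (1_830_138, 1_738),
    "Arabidopsis thaliana": (1.17e8, 25_000),
    "Wheat": (16e9, 30_000),
}


def genes_per_megabase(size_bp, genes):
    """Return the average number of genes per million base pairs."""
    return genes / (size_bp / 1e6)


def density_table(genomes):
    """Return (name, genes per Mb) pairs, densest genome first."""
    rows = [(name, genes_per_megabase(bp, n)) for name, (bp, n) in genomes.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, density in density_table(GENOMES):
        print(f"{name:28s} {density:8.1f} genes/Mb")
```

    On these numbers the two bacteria are the most densely packed and wheat the least, which is the point several replies make: more base pairs does not mean proportionally more genes.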

    --
    Let's try not to let fact interfere with our speculation here, OK?
    1. Re:Genome Sizes. by kim_rutherford · · Score: 2
      Haploid Genome Sizes (collected from various sources):
      A more comprehensive list of genome sizes is here:
      http://www.cbs.dtu.dk/databases/DOGS/abbr_table.bysize.txt.
      These pages show how much of each organism is finished and publicly available:
      http://www.ebi.ac.uk/~sterk/genome-MOT/MOTgraph.html
      http://www3.ebi.ac.uk/Services/DBStats/
      Arabidopsis thaliana: 1.17 x 10^8 bp, ~25,000 genes.
      25000 genes is near the low end of the range for the estimates of the number of genes in the human genome:
      http://www.ensembl.org/Genesweep/
  4. Re:Genetic Sequencing by Jonathan · · Score: 2

    By mapping the genome, are we actually figuring out the underlying structure of what every gene serves to do in a given plant? (more like a decision tree) or are we just figuring out in a vague way what groups of genes do what (more like a bayesian belief net)?

    Neither, unfortunately. Basically a genome is analogous to the binary code of an executable -- you can't just look at it and follow the logic of the program (or organism, as the case may be). However, there is a field of study called bioinformatics which attempts to extract useful information from the raw genomic data, and in order to do this, many techniques from AI and machine learning are used, such as Hidden Markov Models.

  5. Re:Genetic Sequencing by Jonathan · · Score: 2

    Simply put, with finding out what whole genomes do, you get a pretty precise roadmap of what's going on. Not only that, if you can't zoom in too much on one part of the map, you can go find another map that has a similar part and zoom in on that. Got it?


    Well, that's a bit of a stretch. For example, to really understand what's going on you have to have gene expression information, and you can't get that from the genome -- you have to use microarray data ("gene chips"). And even then you can argue that what you really want to look at is the complete set of proteins and their abundances (the "proteome") and not the genome at all.

  6. Re:What happened to the human genome? by Jonathan · · Score: 2

    The human genome has yet to be fully sequenced. What you are thinking of is the announcements of several draft sequences, with many missing and erroneous areas. The complete sequence won't be available until next year at the earliest.

    Secondly, although I'm all for enthusiasm for genomics, the human genome actually will be (at least for the foreseeable future) one of the *least* useful genomes. Why? Because we can't do experiments on humans. When we have sequenced many plants and animals and gotten a good idea of how they work from experiments, then (and only then) will the human genome be of any practical use.

  7. Re:Maybe I missed it... by Jonathan · · Score: 4

    It was picked because
    1) it has a small genome -- many plants actually have genomes longer than the human genome.

    2) Arabidopsis is a small, fast-growing plant, well suited for experimentation.

    It is important that people realize that sequencing a genome is a beginning and not an end. Having a genome means that more sophisticated studies can be done -- it doesn't mean that we now know everything about the plant.

  8. Re:wheat == five humans by PD · · Score: 2

    Wheat is a hexaploid plant, meaning 6 duplicated sets of each chromosome. It is thought that all wheat is descended from three individuals about 10 years ago. By multiplying chromosome sets new species can arise extremely quickly in plants.

    So, that's why wheat has a lot of genes.

  9. Re:wheat == five humans by PD · · Score: 2

    Well, I meant 10 THOUSAND years ago, not 10 years. I remember eating bread sometime before 1990 on more than one occasion.

  10. wheat == five humans by peter303 · · Score: 3

    Quantity is confusing in the genetic world.
    Wheat has 16 billion base pairs, or about five times the human count.
    Plant genes tend to duplicate a lot, according to the first plant genome.
    With regard to animals, the fly genome has only 2/3rds the genes of the worm genome.
    The low end of the human estimates, 35,000 genes, is not much more than these plants or animals.

  11. Re:From the ABC article: by drox · · Score: 2

    Thank you for providing a concrete example of the high costs of research. However I don't see that it supports your argument. Oil exploration is expensive too, but you don't see oil companies patenting the use of oil as fuel. (Maybe they just wish they'd thought of it sooner)

    Research is damned expensive and without assurance of return, it's not entirely feasible to invest the resources.

    Assurance? What assurance? There aren't any assurances of a return on investments (for projects like the sequencing of Arabidopsis). There's the off-chance of a return, and big payoffs get more probable when patents are awarded, but it's still not much of an assurance.

    Patents are a means of holding information hostage. I understand the need for them but I don't have to like them. I particularly don't like them when what is being patented is a process that a living organism has been doing for free since time immemorial.

    If I invent a new widget, I have the right to hold the schematics hostage, releasing them only to those who pay. But who invented the gene that codes for usefulase? I don't claim to know, but I'd bet money it's not the one applying for the patent!

    IANAL.

  12. Re:Genetic Sequencing by SEWilco · · Score: 3
    This level of analysis has been compared to a map. You can see where the streets are and perhaps the buildings, but you can't see the colors of the houses and the windows and doors.

    Further analysis is needed to figure out what molecules are created by each gene and under what circumstances. For example, neurons have on part of their surface a receptor for serotonin. This "receptor" is a molecule of a certain shape which the serotonin molecule fits into, and when this happens the receptor causes a change in behavior in the cell. There's a gene sequence someplace which builds the receptor molecule and adds it to the surface of the cell -- but this level of genetic map doesn't tell us exactly where this gene sequence is or what the shape of the receptor is. Further research is needed to find the location of this genetic sequence, to analyze the exact genetic code, and to work out what molecules that code can build.

    Even that won't tell us everything about a cell -- some drugs work by fitting into a receptor near a receptor whose action they are targeted to block, and the drug works because the rest of its physical shape crowds the target receptor so what usually activates that target receptor cannot reach the receptor. It takes a lot of study to figure out the 3-D shape of the surface of a cell to understand what can be going on in the molecular soup of life.

  13. Re:From the ABC article: by TheHornedOne · · Score: 2

    Research is damned expensive and without assurance of return, it's not entirely feasible to invest the resources. Nobody has ever worked solely for the public good, you know. Either they are seeking fame or they are being funded by someone higher up who has a vested interest in their results.
    I am an active researcher in the field of plant biology, so I can speak with some authority on the costs involved in doing basic research. I have an assay that I do routinely to directly measure the rate of transcription of a single gene, called a nuclear run-on assay. Each one of these assays costs around 300 dollars to run and takes about two weeks from start to finish. To get statistically valid numbers, I need to repeat each experiment twice, effectively tripling the cost (900 dollars) and the time to over a month (and this does not count the cost of paying me). If I want to ask any meaningful set of questions, I am going to need to run a lot more of these under different conditions. Can you see how the cost adds up? It would cost even more if I didn't make a lot of my own materials from scratch.
    Other assays and techniques are equally expensive. A friend of mine is getting ready to clone a "promoter", which is the part of a gene that actually controls how it's expressed. The minimum cost for cloning and sequencing this promoter will be around 2000 dollars. Actually doing experiments on it later will cost even more.

  14. It is accurate by Wire+Tap · · Score: 2

    Like others have said, other organisms can (and many do) have more base pairs than we do - just like they have more chromosomes. For instance, a fern plant has something in the ballpark of 1200 chromosomes! Compared to us, you would think the fern is a super-being. However, there is much less information per chromosome in the fern, whereas in a human chromosome, the information is much more dense. It is nature's way of making things more efficient perhaps. Just because there is more "Stuff" there, that doesn't mean there is more information in the stuff. Remember, quality, not quantity. :)

    --

    Man is born free; and everywhere he is in chains.

  15. related story.. by gargle · · Score: 3

    Cornell researchers have used the genome sequence of the Arabidopsis to obtain information on its origins as a species. See here.

  16. An honest reply, or trying to be anyways.... by raaum · · Score: 2

    Alrighty then....

    First off (and I'm not being deliberately snotty here), we're not talking about physics here. Current biology has nowhere near the decimal point accuracy, etc. that modern physics does.

    Let's talk about bacteria (since it's a simpler problem - but most applies with minimal changes to studies of other organisms). Let's, furthermore, say I am interested in something like nutrient uptake. There are proteins on the cell surface which are involved in either passing (or not passing) external molecules to the cell interior. It is possible (let's not get into details) to get a good idea as to which surface proteins are involved with passing different classes of external molecules into the cell.

    ok then. I have a protein of interest, I have a 'behavior' of interest, what next? believe it or not, the next step is usually trial and error. I induce mutations in the bacterium (by x-raying it or adding some chemical to a culture, etc.) and look for colonies that do weird things vis-a-vis my system of interest (in this case uptake of some particular nutrient - since this is /., let's say caffeine).

    As bacteria reproduce like crazy and I have induced mutations in a population of, literally, millions of individual bacteria - there are bound to be some which do funky things as regarding caffeine uptake. I cannot attribute this necessarily to some change in my protein, but I can check the interesting mutants to see if my protein is different from the 'normal' sequence. If it is not - well then, no change in this protein is directly involved in the funky behavior. If it is changed, there is still much more to be done.... because, of course, it may be that some other mutation elsewhere produced the new, funky behaviour, and not my new, improved protein of interest.

    One then zeros in on the effect of the changed protein by inserting or otherwise point mutating 'wild-type' bacteria to attempt to determine what effect the changes in sequence have... etc..

    of course, having written all this, I realize that I haven't answered the question you asked. And the answer is that biologists, especially molecular types like me, don't predict!

    We create mutants and see what happens. You would be astounded at the number of different mouse lineages out there with specific mutations and disease susceptibilities. If something is eventually to be used in humans, you start by seeing what it does to mice, move on to monkeys, then move on to human cells in vitro (cells in a tube, basically), and finally if animals/cells are not dying, etc. move on to trials in humans.

    Prediction would be nifty, but even with whole genomes, it's just not in the cards for the near future.

  17. Re:What happened to the human genome? by the+gnat · · Score: 2

    You misunderstand some of the key concepts of genomics, and the history of genomes and patents. First off, though, I'd like to point out that "enormous effort" should not justify patents- they are only supposed to be granted for "unique and non-obvious inventions" or something like that. Genes do not meet any of these qualifications, as I'll explain later.

    A common and utterly incorrect assumption is that Celera beat the crap out of the HGP. This misses the mark completely. Celera's sequencing technology is fundamentally more risky and would never have been considered when the HGP started. It mostly relies on massive computing power to assemble overlapping sequence fragments, and on the high-throughput sequencers Dr. Venter helped create. This approach is increasingly considered to be scientifically sound and much more efficient- but only thanks to a decade of advances in computing power. Imagine trying to assemble DNA on a SPARCStation 1 instead of on a brand-new P4 or AlphaServer.
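    To make "assemble overlapping sequence fragments" concrete, here is a toy sketch of greedy overlap merging. It is not Celera's assembler or anyone's production method: it assumes error-free reads, ignores repeats and reverse-complement strands, and the function names and the min_len threshold are invented for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best match is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


if __name__ == "__main__":
    reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
    print(greedy_assemble(reads))
```

    Every round compares all pairs of fragments, so the work grows quickly with the number of reads, which gives some feel for why the approach leans so heavily on computing power.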

    Certainly Celera's progress spurred the HGP on, and I think the best result of this may be the refining of the "shotgun" technique. However, it is absurd to say that Celera should get patents for its "enormous effort" when the HGP's approach was in fact much more difficult.

    The genome itself cannot be patented; Celera is charging a hefty subscription fee for access to the mouse genome, but a public project will release their own results in the spring (though unfortunately mouse genes will have by then been snapped up by biotechs). The standard for patents on genomic data is ridiculously unclear; single-nucleotide polymorphisms appear to be patentable now. This has led a large public-private consortium mapping these polymorphisms to withhold scientific data from everyone out of fear that rival biotechs will steal the results and patent them.

    It isn't that hard to find a gene. Do you think companies will perform gene knockout experiments on humans? No, they'll use a gene-finding program, many of which exist. Hell, my lab is doing this now. On the simplest level, all one needs to do is identify suitably long open reading frames (ORFs) and check for homology to known proteins. The only real limitation is how many and how fast your computers are. It's not rocket science; a basic understanding of genetics and some good Perl code will do it. This doesn't prove anything is a gene, but the USPTO probably won't give a shit if it sounds interesting.
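    For the curious, the "simplest level" described above can be sketched in a few lines. This is an illustrative Python version rather than the Perl mentioned, and it covers only the ORF scan, not the homology check against known proteins (which would need a search tool and a database). The 100-codon cutoff and all names here are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Yield (strand, frame, start, end, orf) for each ATG...stop stretch.

    min_codons counts codons from the ATG up to, but not including, the stop.
    start and end are 0-based offsets on the strand being scanned.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3, s[start:i + 3]
                    start = None  # look for the next ATG in this frame


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    for hit in find_orfs(demo, min_codons=5):
        print(hit)
```

    As the comment says, a long ORF does not prove anything is a gene; a scan like this only produces candidates.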

    A researcher from a local biotech recently boasted of their "patent wizard"- fill in the blanks and you've got a 10-page patent application. This is why patents scare the shit out of so many people. I'm afraid that by the time I'm out of grad school there'll be so many patents that any research I do will have to dodge licensing provisions just to be completed, or that any results of mine will have to be suppressed for fear of lawsuits from biotechs.

    This sort of bullshit could destroy public scientific research and destroy America's leadership in this field, and I'm upset to see people promoting the free-market/privatization view with little or no understanding of the field.

  18. Genetic Sequencing by winter+fantom · · Score: 2
    I am curious, for those Biology buffs out there (I suck at it), how revealing is genetic code to understanding how things are made? What do I mean? Well, for example, if you looked at artificial intelligence, some methods (decision trees, bayesian belief networks) show you, as a person, how decisions are being reached, but others (neural networks, genetic algorithms) tend to be more of a black box.

    By mapping the genome, are we actually figuring out the underlying structure of what every gene serves to do in a given plant? (more like a decision tree) or are we just figuring out in a vague way what groups of genes do what (more like a bayesian belief net)?

    (Obviously, having the understanding at the "neural net" level implies no mapping at all, so it can't be like that.)

    --
    -winter fantom
  19. Re:Contradictions... by Cmdr.+Marille · · Score: 2

    The plant can't have more genetic information than us
    Well, it can. A lot of species have more base pairs than a human (if I remember my biology class correctly). It's all about redundancy, and a lot of the info isn't used at all (you could say there's a whole lot of cruft in us).

    --

    "Mommy, mommy! The garbage man is here!" "Well, tell him we don't want any!" -- Groucho Marx
  20. From the ABC article: by Moderator · · Score: 3

    "These genome projects are the way to gather intellectual property positions, for example if we identify the function of a useful gene we could patent it. Without participation in this type of pure research, we will be left behind."

    This is a shame. All that scientists are worried about these days is patenting the genome of something so they can get rich. Whatever happened to research for the benefit of mankind? Whatever happened to putting politics aside when it came to science? A damn shame.

    --
    The World is Yours.
  21. Contradictions... by ckedge · · Score: 5

    There are some strange contradictions in the ABC article.

    It first claims that "The sequencing of 118.7 billion base pairs of the nuclear genetic complement of a model plant is enormously significant". Then it says something near the bottom regarding "the 3.2 billion base pairs of the human genome". So what's going on here? The plant can't have more genetic information than us.

    The Nature article talks about giving away 5000 CDs containing the data, and mentions somewhere that the dataset is 120 Megabytes. So I presume that is compressed, down from the 3.2(*2) billion bits that ABC quotes. Are these numbers accurate? (And just how much information is there per base pair? Is my translation of four nucleotides to 4 possible states (2 bits) correct?)
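    On the last question: four possible nucleotides do fit in 2 bits each, so four bases pack into one byte. The sketch below shows that packing plus the size arithmetic. It assumes the 3.2 billion figure for human and the roughly 1.17 x 10^8 bp figure for Arabidopsis quoted elsewhere in this thread (which suggests the ABC's "billion" should read "million"); the mapping of bases to bit patterns and the function names are arbitrary choices for illustration.

```python
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte (last byte zero-padded)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases[:length])


def size_megabytes(base_pairs, bits_per_base):
    """Storage needed for a genome, in units of 10^6 bytes."""
    return base_pairs * bits_per_base / 8 / 1e6


if __name__ == "__main__":
    print(size_megabytes(3.2e9, 2))   # human at 2 bits per base
    print(size_megabytes(1.17e8, 2))  # Arabidopsis at 2 bits per base
    print(size_megabytes(1.17e8, 8))  # Arabidopsis at one byte per base
```

    At 2 bits per base the human genome comes to about 800 megabytes and Arabidopsis to about 29. At one byte per base Arabidopsis comes to about 117 megabytes, so a 120 Megabyte dataset would be consistent with plain uncompressed text rather than compression, though the format of the CD data is not stated here.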

  22. Interesting... but not so by Ipsilon · · Score: 2

    Is this interesting? Does this imply we can already understand the human genome? No.

    To compare with computers: if you had the binary executable of a program for an architecture you don't know, how would you work out what every bit of the file means? And, most importantly, how would you discover the instructions this processor can understand?

    The "solution" is to search for species with small DNA sequences and compare them to others. Finally, you could try to modify some of these to see what changes in the final individual. But we won't get anything in the near future; perhaps we won't see any real use for this in our lifetimes.

    --
    To visit or not to visit: findusclub.com

    --

    The opinions in this comment are subject to GPL, you can copy, modify and redistribute freely (as in speech).