Slashdot Mirror


Genetic Stone Soup

It's the scientific achievement of our generation; what can you say about the mapping of the human genome? But here's a story behind the story. parvati turned us on to this NYT article about James Kent, who wrote the gene assembly program GigAssembler last June. It turns out that, thanks to his code, the public Human Genome Project had actually finished its work three days before the private effort by Celera Genomics -- a feather in their cap and a boon to public science. The head of Celera was "astonished" to learn of this grad student's genius -- ten thousand lines of C in a month, and why? -- "because of his concern that the genome would be locked up by commercial patents if an assembled sequence was not made publicly available for all scientists to work on." (The debate over public vs. private science continues to rage; see this Seattle P-I article, which discusses among other things the ethics of NDA'ing scientific data produced for profit.)

Update: 02/13 02:26 PM by J : Thanks to tlunde for finding the link to GigAssembler and thus clarifying which language it was written in.

22 of 175 comments

  1. Re:What about the quality of assembly? by Lars Arvestad · · Score: 3
    There is a biased comparison available over at the Sanger Center. Summary: The public assembly is much better even though less data is used.

    They measure things such as the number of fragments (fewer=better) and their lengths (longer=better) and estimated coverage of the genome.
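
    To make those measures concrete, here is a minimal sketch in C, not the code behind either comparison. It counts fragments, finds the longest one, and also computes N50, a length summary commonly used for assemblies; N50 is an added assumption here, since the comparison above does not name it. The `asm_stats` type and the `summarize` function are hypothetical names, and coverage is left out because it needs an estimate of the genome size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Summary of a set of assembled fragments. Hypothetical type, not taken
 * from either assembly's tooling. */
struct asm_stats {
    size_t count;          /* number of fragments: fewer is better */
    unsigned long total;   /* total bases across all fragments */
    unsigned long longest; /* length of the longest fragment */
    unsigned long n50;     /* fragments at least this long hold half of total */
};

/* qsort comparator: larger lengths sort first. */
static int cmp_desc(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x < y) - (x > y);
}

/* Sorts lens in place (descending) and returns the summary. */
struct asm_stats summarize(unsigned long *lens, size_t n)
{
    struct asm_stats s = {n, 0, 0, 0};
    unsigned long running = 0;
    size_t i;

    assert(lens != NULL || n == 0);
    if (n == 0)
        return s;

    qsort(lens, n, sizeof lens[0], cmp_desc);
    for (i = 0; i < n; i++)
        s.total += lens[i];
    s.longest = lens[0];

    /* Walk from the longest fragment down until half the bases are covered. */
    for (i = 0; i < n; i++) {
        running += lens[i];
        if (2 * running >= s.total) {
            s.n50 = lens[i];
            break;
        }
    }
    return s;
}
```

    For fragment lengths 80, 70, 50, 40, 30 and 20 (290 bases in total), the two longest fragments already hold 150 bases, so the N50 is 70.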

    There is also a less biased comparison over at the Nature website. I don't know if you can read it without a paid subscription, though. Their findings are less controversial: the statistics are similar for the two assemblies, but the annotations (i.e. descriptions of what is actually there; think of a group photo with notes on people's names and their relationships) are better in the public version.


    Lars
    __

    --
    Reality or nothing.
  2. Re:Possible Slashdot interview?? by Thagg · · Score: 3
    I first met Jim Kent on the stage at the Siggraph Technical Papers in 1992; he presented his 3D morph paper right after my 2D morph paper. Later, we hired him at Pacific Data Images, where he worked with us in the R&D group. He's definitely a good guy to work with.

    I left PDI six years ago to start my own company, and I've lost touch with Jim; I had heard vaguely that he'd 'gone back to school', but I had no idea that he was up to something this big. It's great to see an old friend make such a great contribution in a new field. Way to go, Jim!

    Another old friend of mine, Carter Burwell, went the other way, from doing genetics with Watson at his Cold Spring Harbor laboratory, to working on early computer graphics at NYIT in the early 80s, to becoming one of the pioneers in digital music, and is finally now a leading composer for films such as the recent O Brother, Where Art Thou? (somehow passed over by the Academy).

    And I'm still just making pictures :) Oh well.

    thad

    --
    I love Mondays. On a Monday, anything is possible.
  3. Re:Grad Student? by Fluffy the Cat · · Score: 3

    On another, slightly more disturbing note, I am somewhat concerned about the use of academic funding to compete with commercial enterprises. Just because RMS does it doesn't make it right.

    Celera have released their sequence under a license that restricts commercial usage (something vaguely like the Sun open source license thing, whatever it's called), whereas the public effort has released their work into the public domain (pretty BSDish, really). If Celera were the only group releasing this data, academic research into the human genome would not be able to attract the same sort of investment and would proceed significantly more slowly than it otherwise would. Using academic funding in this case secures a future for academic research in a very important field.

  4. Re:Grad Student? by K8Fan · · Score: 3
    I hate to be a nitpicker, but this chap's hardly a typical twentysomething graduate student (which would have been a genuinely amazing feat) - he's a seasoned professional who's experienced in processing large datasets professionally.

    Agreed, he likely brought a huge amount of pre-existing skill in matrix math. But 10k lines of assembly language hacking to beat richly funded capitalists with super-computers in four weeks is a truly amazing hack, no matter what his skill level.

    BTW, his home page doesn't say: anyone know what graphics software he worked on before? The name seems familiar - I think he used to hack math for a package called Digital Arts, but I could be wrong.

    On another, slightly more disturbing note, I am somewhat concerned about the use of academic funding to compete with commercial enterprises. Just because RMS does it doesn't make it right.

    What's disturbing is that academic institutions are being forced to compete with commercial enterprises that, frankly, should not exist. The idea of a commercial enterprise doing something as important to the entire human race as the sequencing of the genome with the intention to control distribution of the resulting science is deeply offensive. Just because you can make money doing something doesn't make it right.

    --
    "How perfectly Goddamn delightful it all is, to be sure" Charles Crumb
  5. Human Genome Data Assembled on Linux Cluster by tlunde · · Score: 3

    From http://genome.ucsc.edu/goldenPath/algo.html:

    Assembly Process Overview
    The assembly proceeds according to the following major steps:

    1) Decontaminating and repeat masking the sequence.
    2) Alignment of mRNA, EST, BAC end, and paired plasmid reads against genomic fragments. On a cluster of one hundred 800 MHz Pentium III CPUs running Linux this takes about three days.
    3) Creating an input directory structure using Washington University map and other data. This step takes about an hour on a single computer.
    4) For each fingerprint clone contig, aligning the fragments within that contig against each other. This takes about three hours on the cluster.
    5) Using the GigAssembler program within each fingerprint clone contig to merge overlapping fragments and to order and orient the resulting sequence contigs into scaffolds. This takes about two hours on the cluster.
    6) Combining the contig assemblies into full chromosome assemblies. This takes about twenty minutes on one computer.
    The steps will be described in more detail below.

    [snip]

    The program was NOT written in "assembler". From Appendix B:

    mRNA Scoring Function
    int scoreMrnaPsl(struct psl *ali, boolean isEst)
    /* Return score for one mRNA oriented psl. */
    {
    int milliBad;
    int score;

    milliBad = calcMilliBad(ali, TRUE);
    score = 25*log(1+ali->match) + log(1+ali->repMatch) - 10*milliBad + 10;
    if (ali->match <= 10)
        score -= (10-ali->match)*25;
    if (isEst)
        score -= 25;
    else
        score += 25;
    return score;
    }

  6. Re:Things are not as easy by wowbagger · · Score: 3

    The example I like to use is that the Genome project is like the Periodic Table: it just gives you a framework to hang knowledge on.

    Just as the Periodic Table helped scientists to deduce the structure of electron orbitals (by observing how chemical similarities tracked atomic number) and find new elements ("There's a hole here, and what should fill that hole will have these properties. Now we know what we are looking for and where to look..."), the Genome project will allow us to better determine how genes are controlled, and look for new proteins.

  7. Re:could a distributed parallel system be useful? by OctaneZ · · Score: 3

    While some people are discussing Folding@Home as a response to your question about a "seti-like" processing system, there is actually a much more relevant project, also hosted at Stanford. The Genome@Home Project is attempting "to design new genes that can form working proteins in the cell" from the DNA sequence of non-human organisms. It is a new project, but gaining speed quickly. It is worth taking a look at if you have spare cycles you can give to a good cause.
    -OctaneZ

  8. Other side of the coin by swordgeek · · Score: 3

    First of all, for those who aren't in the biotech industry, it should be mentioned that the NIH has an agenda to push just as much as private for-profit industry. Never believe that 'the good of the public' is the only thing driving non-profits, especially when a government (ANY government!) is involved.

    Still, this issue isn't quite as cut and dried as many would like to believe. If it were, then everyone would gang up on one side, and the other side would wither and die. Consider some of the following points:

    1) Celera's efforts most likely DID force the HGP to speed up.
    2) Celera's "whole genome" approach appears to be a bust. Before they did it, we could only guess at how well (or poorly) it might work. In other words, we learned something valuable for future research from Celera!
    3) There is a lot of grumbling about Science imposing a restrictive agreement on access to the Celera data. I agree--this isn't how science works! However, it's like book publishing. They "borrowed" publicly available information (preliminary work from the HGP), added their own stuff, and can impose whatever restrictions they want. Don't like it? Go to the HGP. They (Celera) are entirely within their rights, but I don't think that Science should have agreed to publish with those restrictions.
    4) Here's a biggie. Science costs a LOT of money--the only two groups that can afford it are governments, and expensive biotech companies. The former can't afford to fund all science research, and the latter can't afford to not make a profit. Incidentally, biotech is an area where on the whole, the patent system works quite well.

    At any rate, it's an impasse. Either you cut research by about 60%, or you deal with companies that need to make a profit on their research. Flip a coin and make your choice.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  9. Re:"Stone Soup"? by swordgeek · · Score: 3

    Here's the story, along with (sigh!) a (pretty cool) BEOWULF CLUSTER!

    I hate to do it, but it's actually on topic. :-)

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  10. An assembly program, not assembler code... by thue · · Score: 3

    ten thousand lines of assembly code in a month, and why?

    Just for clarity: it doesn't say the language is assembler, just that what the program does is assemble genome fragments...

    from the unsung-hero dept.
    Not really...

  11. Re:Things are not as easy by YU Nicks NE Way · · Score: 3

    In fact, the Celera assembly is proof positive of the value of the HGP. When HGP started a decade ago, a dedicated scientist with years and years of training might be able to sequence a few tens of base pairs in a day, if he or she did nothing else. Five years later, after the public funded a huge improvement in the basic technology of sequencing, a barely competent technician can be expected to sequence thousands of bases a day without breaking a sweat.

    Two years after that, private industry realized that it could make money exploiting that technology. All hail to Celera for doing a good job -- but if they have seen further, it is only because they stood on the shoulders of public money.

  12. About patents, useful link by Leon Trotski · · Score: 3

    "because of his concern that the genome would be locked up by commercial patents if an assembled sequence was not made publicly available for all scientists to work on."

    So should genes be patented?

    I believe this question has been at least partially answered by the Patent Office. You can patent a gene based medicine or treatment if it is applicable to a particular illness, or disease, or gene based disability. You cannot just patent genes willy nilly because you know they exist. The Patent Office and people in gene research from the NIH and Celera, the two main players in gene research, pretty much agree that it is beneficial to the public if gene based medicines can be patented for specific treatments. A more detailed discussion on patenting is at:

    http://www.ornl.gov/hgmis/elsi/patents.html

    --

    Cui peccare licet peccat minus. -- Ovid, Amores.

  13. How is that fair? by sharkticon · · Score: 3

    They wanted to make seeds that couldn't reproduce, ostensibly to control genetically modified plants and keep them from taking over.

    Well you can't have it both ways can you? Either you want seeds that reproduce, in which case you'd be whining about cross-contamination with other crops, or you have seeds that don't reproduce, in which case you whine about "holding nations' food supplies hostage". Come on, which way do you want it?

    Quite frankly there hasn't been a single conclusive study showing that there is any risk from GM crops. It's all just scare stories and pseudo-science.

    --

  14. What about the anti-genetic backlash? by sharkticon · · Score: 3

    Now that the entire genome is sequenced and work is underway on finding the individual genes and their functions, what advances are we going to see? Well plenty really, from screening and treatments for genetic illnesses, to modified organisms that are better and can survive in more extreme conditions. There's the potential to change almost everything as we begin to work out the sequences of more and more living beings.

    But what concerns me is that the whole backlash against anything with the word "genetic" in it will slow or even stop the flow of scientific advancement. We've already seen how companies like Monsanto can have their research attacked, spoiled and subjected to the worst kind of slanderous publicity, and as we get the capacity to do more, these attacks will likely get worse, fueled by an ever more virulent group of protesters and environmentalists.

    These people are true zealots who make RMS look like an apologist. They think nothing of resorting to intimidation, violence and criminal damage, whilst at the same time engaging in a war of words which admits no logic and no compromise. In some cases, the very lives of researchers who labour to increase our knowledge are at risk, and we cannot afford to let this happen, not with the problems of population growth looming large over humanity.

    These people are dangerous, and their actions need to be curbed. No longer should they be able to get away with their lies and violent behaviour, no more than any common thug. They can claim moral superiority, but in truth it seems as though these people are as bigoted as any racist, and just as determined to further their cause.

    We can't allow research to be thwarted because of the voices of a small bunch of extremists. That's not democracy at all.

    --

    1. Re:What about the anti-genetic backlash? by rlk · · Score: 4
      What happens is they set up one or two of the richest farm owners with their patented grow-twice-as-fast corn or wheat. These farmers then have a large advantage over all of the family farms that did not get the Monsanto handout. The Monsanto farmer then buys out the smaller farmers. When the Monsanto farmers own most of the farmland, Monsanto then raises the price of their grain (which by the way cannot reproduce) and the large farmers are forced to sell out to a large agricorp.

      Why? They could always buy their grain from a different supplier. And if they've managed to get themselves locked in to a contract which allows Monsanto to raise their prices at their whim then that's a foolish move and these farmers are reaping what they have sown.

      This reminds me of Pastor Niemoller's words. If Monsanto drives out the local suppliers, then the farmers don't have a choice who to do business with. As for their "foolish moves", I don't think it's either a stretch or unfair to say that farmers in developing countries lack the business acumen and strategic foresight that Monsanto has, and it's not realistic to expect that they can do business on the same playing field as Monsanto. Remember that a "free market" assumes good knowledge on both sides of the transaction; if one side has all the knowledge and the other doesn't understand what's going on, it's not a free market any more, but rather a scam.

      The same applies to Chinese farmers buying GM food. You talk about doing research on issues such as disease resistance. Precisely how, pray tell, are they going to do such research? Assuming they even know about the genetic basis for disease resistance, and that crop diseases in the US differ from those in China, you're assuming that they have the resources to do this kind of investigation.

      It's not only governments that can oppress. Anywhere that there's an excessive concentration of power there can be oppression. It's not a matter of duly constituted laws and sovereign states (most dictatorships have neither; Saddam Hussein rules by personal whim and doesn't have too much respect for borders of neighboring states). If Monsanto's doing this kind of divide and conquer in Latin America, it's no better than if it's a government doing the same thing.

      In Germany they first came for the Communists and I didn't speak up because I wasn't a Communist.

      Then they came for the Jews, and I didn't speak up because I wasn't a Jew.

      Then they came for the trade unionists and I didn't speak up because I wasn't a trade unionist.

      Then they came for the Catholics and I didn't speak up because I was a Protestant.

      Then they came for me and by that time no one was left to speak up.

    2. Re:What about the anti-genetic backlash? by Zara2 · · Score: 4
      Man you chose a really bad company to base your argument on. Monsanto is very well known for a lot of underhanded tricks re: frankenfood. On the surface I am not against GM food. Actually I see a lot of good things coming from it. However, Monsanto will go down to a small South American country and completely bankrupt it. What happens is they set up one or two of the richest farm owners with their patented grow-twice-as-fast corn or wheat. These farmers then have a large advantage over all of the family farms that did not get the Monsanto handout. The Monsanto farmer then buys out the smaller farmers. When the Monsanto farmers own most of the farmland, Monsanto then raises the price of their grain (which by the way cannot reproduce) and the large farmers are forced to sell out to a large agricorp, much like the ones Monsanto owns a lot of stock in. Now all the local farmers are reduced to basically being wheat pickers.

      Also, on top of this, GE food has never had to pass any serious testing. Whole crops were wiped out in China because the GE food had disease immunities from the western world. China has a slightly different set of crop diseases and *poof* there goes the rice. Please do your research into how things are done and don't just hope that people are doing the things that they should be. While this is an "anti frankenfood" site it does have some good info in it. http://www.purefood.org/monlink.htm

      --

      Pithy, yet ultimately meaningless, phrase expressed with gusto!

    3. Re:What about the anti-genetic backlash? by SubtleNuance · · Score: 4
      I'm a Tree-Hugger of the highest order. Although I do know that Monsanto's GMOs are getting a 'bad rap', let me assure you: most people's concerns w/ Monsanto are not only about the idea of 'Frankenfood'. There is a whole group of ideas about why 'Monsanto' is evil:

      1) Produce GMOs that rely on Monsanto produced chemicals. Further enslaving the Farmer to ecologically unsound farming methods and Monsanto's pocket book. Read: Roundup Resistant Crops.

      2) Being a generally massive producer/marketer of Farming Chemicals. Closely related to Item #1, but worth mentioning in a broad sense.

      3) Actively participating in the 'mono-culture' momentum that will produce the next Irish Potato Famine & contribute to the general degradation of the biodiversity of the planet's food supply.

      And more generally - Tree Huggers like myself are generally in favour of broad, local, varied sources of food as opposed to the Monsanto Backed "New Age of Industrial Farming".

      When Mad Cow Disease reaches the Americas, which method will be more likely to preserve our food supply? A) 'Industrial' Farming B) Broad, Scaled, Varied, Local Farming... For many reasons B is a Good Thing(TM). Monsanto is the epitome of the industrialization/commoditization of the Planet's Food Supply. They operate for the benefit of the Bourgeoisie who own stock - not the good of the planet's people (which should be pretty relevant when talking about *FOOD*). Their growing strength represents a general 'problem' with our future (not completely described above).

      I am not a Luddite in any sense, but I do not see the value in a 'for profit' company producing GMOs for the benefit of their pocketbooks... this 'unholy' motivation will *NOT* lead to the benefits most of us understand will come from GMOs... it is a matter of motivation. They are more likely to risk our food supply based on profit returns - it's not simply a matter of people being scared of GMOs.

      I have no problems with responsible, open, IP free GMO research/production by people without compromised motivations. That is *NOT* Monsanto.

  15. Possible Slashdot interview?? by moonboy · · Score: 4



    How about trying to get an interview with this guy? Could be very interesting.


    --

    Co-founder and designer at Music Nearby: http://musicnearby.com
  16. His previous animation work by jfoust2 · · Score: 4

    Jim Kent was once known in the mid-80s for writing Zoetrope, a 2D path-based animation system for the Atari ST, not unlike today's Flash technology. Zoetrope also became Aegis Animator on the Amiga, and Autodesk's Animator Pro for the PC, which begat the .FLI/.FLC animation format. I believe Kent also worked on the first DOS generations of Autodesk's 3D Studio, too.

    --
    Curator of the Jefferson Computer Museum http://www.threedee.com/jcm
  17. Re:Grad Student? by KaRll · · Score: 4

    I am more concerned with this kind of project being run by commercial enterprises. Just because you can make money out of it doesn't make it right.

  18. Gene Patents value overhyped? by Alien54 · · Score: 4
    On the NewsHour, PBS had a story on this in the past day or so. They have a large webpage with many links dedicated to the whole issue. One interesting thing is that there are fewer genes than had first been imagined. The end result is that genes are more often like multi-purpose modules, and that much of the functionality of the system is in the protein systems, such as enzymes. As was noted:

    What's going to happen is we have to go into the protein world to really understand where the genome is taking the next level of biology. That's ten times as complex at least.

    What is also noted is that the combination of these protein interactions is staggeringly more complex. I can imagine that the system interactions may be a million times or more complex.

    So in my mind, patenting a gene might wind up being similar to patenting the management system of a nuclear power plant, and thinking that therefore you understand nuclear physics.

    --
    "It is a greater offense to steal men's labor, than their clothes"
  19. Things are not as easy by jw3 · · Score: 5
    As many of you probably know, the actual work hasn't started yet. The schedule of a genome project looks like this:

    a) sequencing, that is -- getting the actual sequence. This is almost purely technical work, and definitely not very interesting for a scientist, although you can get a lot of credit for it.

    b) annotating the sequence: finding out where the genes are, what the similarities are between them and the genes known from other organisms, and what can be suggested about their function based on those similarities. This is pure bioinformatics stuff: first finding the "open reading frames" (ORFs), that is -- anything that can be a gene at all: it has to start with an "ATG" (codon for methionine) and stop with a so-called stop codon. This is only the most basic criterion.
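
    As a rough illustration of that most basic criterion only, the sketch below scans the forward strand in the three reading frames and reports every stretch that begins with ATG and ends at the first in-frame stop codon. It assumes the standard stop codons TAA, TAG and TGA and uppercase input; `struct orf` and `find_orfs` are hypothetical names, and the reverse strand, minimum-length filters and everything else a real annotation pipeline does are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A candidate open reading frame: half-open range [start, end) in the input. */
struct orf {
    size_t start;   /* index of the 'A' in the ATG start codon */
    size_t end;     /* index one past the last base of the stop codon */
};

/* True if the three bases at c are one of the assumed stop codons. */
static int is_stop(const char *c)
{
    return strncmp(c, "TAA", 3) == 0 ||
           strncmp(c, "TAG", 3) == 0 ||
           strncmp(c, "TGA", 3) == 0;
}

/* Scan seq (uppercase A/C/G/T, NUL-terminated) in all three forward frames.
 * Stores at most max_out candidates in out and returns how many were found
 * in total, which may exceed max_out. */
size_t find_orfs(const char *seq, struct orf *out, size_t max_out)
{
    size_t len, frame, i, j, found = 0;

    assert(seq != NULL);
    len = strlen(seq);
    for (frame = 0; frame < 3; frame++) {
        i = frame;
        while (i + 3 <= len) {
            if (strncmp(seq + i, "ATG", 3) != 0) {
                i += 3;             /* not a start codon, try the next one */
                continue;
            }
            /* Walk codon by codon until a stop codon or the end of input. */
            for (j = i + 3; j + 3 <= len && !is_stop(seq + j); j += 3)
                ;
            if (j + 3 > len)
                break;              /* no stop codon left in this frame */
            if (found < max_out) {
                out[found].start = i;
                out[found].end = j + 3;
            }
            found++;
            i = j + 3;              /* resume after the stop codon */
        }
    }
    return found;
}
```

    On the sequence CCATGAAATAGGG this reports one candidate, bases 2 through 10 counting from zero (ATG AAA TAG), found in the third reading frame. A later ATG inside an already open frame is not reported separately, which is a simplification.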

    Whatever comes later is called "postgenomics", and it is probably the most exciting stuff in this whole area of research.

    1) in most of the genome projects done until now, as many as half of the proposed genes had not even a rough function assigned to them (the group I'm working in sequenced a bacterial genome back in 1996, and since then the situation hasn't changed much). Experimental work and more biocomputing is needed to find out what those genes do. The problem with biocomputing isn't the lack of CPU, but the lack of good strategies / models / theory (or, not lack of "good", but lack of "better" strategies etc.).

    2) knowing what a gene does is, contrary to the common belief, only very little information. You need to know how it is regulated, and this means a lot of tedious and complicated experimental work: two whole areas of postgenomic science deal with that -- transcriptomics (regulation on RNA level) and proteomics (on protein level). You have to understand that each gene is regulated on many levels -- transcription of the gene from DNA to RNA, turnover (that is, the speed of degradation) of the mRNA, speed of translation, amino acid composition of the protein, protein turnover. Moreover, the genes are interconnected into networks rather than pathways. Creating a functioning model of a eukaryotic cell will probably be impossible during the next twenty or so years. That is -- among other things -- why my group works with a little bacterium, which has only about 700 genes. And even though it is a couple of orders of magnitude simpler than the simplest eukaryotic cell, it is very, very, very complicated.

    Take-home lesson: don't be too enthusiastic. This is not the flight to the moon. This is only the first Sputnik.

    Best regards,

    January Weiner