Slashdot Mirror


Celera Opens Up DNA Database

greenplato writes "Thirty billion base pairs from the sequences of humans, mice, and rats that were available only by subscription to Celera's DNA database are being put into the public domain. Celera will donate this information to a 'federally run database,' presumably GenBank. Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.' Stories in BusinessWeek and The New York Times."

11 of 181 comments

  1. Shouldn't that be by Spetiam · · Score: 4, Insightful

    Shouldn't that be "data want to be free?" :)

  2. from the summary by Anonymous Coward · · Score: 5, Funny

    Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.'

    Data hates when you anthropomorphize it.

  3. Oh No! by Anonymous Coward · · Score: 5, Funny

    They've open sourced me! Does this mean I have to call myself GNU/Steve?

  4. Again? by zappepcs · · Score: 4, Insightful

    FTA: "DNA database are being put into the public domain." Again, we find information and data that SHOULD be in the public domain, yet the patent office, government, and kickbacks protect those that stand to make money? It's time that we, as a populace, stand and shout for the rights of the public to information. Sure, there are those that say that without protection, such innovation would be stifled, and I counter with this: "should such efforts be in the public sector?" Through eminent domain, they can take your property, but if you are a business, there seems to be no such thing. I hear of companies giving to this charity or that... but none are giving to the charity of mankind?

    Information is power, and in this information age, it is time for those with the information to take power from those that would use it to extort finance and power from those that do not know better. All such information should be in the public domain. Knowledge of the human genome, of anything that affects ALL of us, should be public information. For instance, any method of retrieving emergency information during an emergency should be in the public domain, not a subject of patent worthiness. The entire point of 911 service is to aid the community, not bilk them of dollars.

    The entire point of scientific discovery is to learn and advance humankind... when it becomes simply a method of making money, the advancement of humankind goes in the trash like yesterday's junk mail. At that point, what is the point of funding science? Think bigger than your new BMW. This might seem altruistic, but what is the point of discovery if your only reason to share is profit? When do you lose respect, when do you stop having authority? The ONLY method of advancing the human race is through sharing, through communal discovery. Perhaps this will advance that purpose, perhaps it won't.

  5. Free data - or unable to sell it? by Anonymous Coward · · Score: 5, Interesting

    I work for a biotech company with a database which we've been trying to sell subscriptions to for a few years. The prevailing experience with trying to sell the database is that people are very reluctant to shell out the cash to access the data.

    I think this is a symptom of trying to sell data to academic institutions. The problems with selling to academic institutions are twofold. Firstly, the universities don't have the cold hard cash to spend on the databases, so any cost over free is too expensive. Secondly, there is the free/open culture within universities that almost punishes commercial ventures for trying to build a business around adding some kind of value to the data (such as convenience or quality of data).

    Because of the lack of sales for this database, we're considering handing the data over to a large government body so that they can maintain it, because the company simply can't afford to maintain the database - it costs a lot of money to hire talented people to do database curation.

    So when Celera say that "data wants to be free", I think they mean "We'd sell you this data to try and recoup our investment, but we're resigned to the fact that you're not going to buy it".

    1. Re:Free data - or unable to sell it? by the+gnat · · Score: 4, Informative

      Secondly, there is the free/open culture within universities that almost punishes commercial ventures

      I would not have stated it that way. The real reason is that academics hate to leave anything unpublished. If they're constrained by copyright law or some NDA, they can't tell everyone about the fabulous new work they've been doing - or at the very least, it becomes much more difficult.

      I worked in bioinformatics at a university for several years, and much of what we did was take existing databases and analyze them, then publish the results online as our own database of annotations. As part of this, we reproduced much of the original database in modified form - and all we had to do was cite the original authors and describe our methods/sources. If the databases we used had not been public, none of these projects would have happened. In some cases, we had to ignore private databases that we had limited access to because we were not allowed to reproduce any of their data.

      This is only cultural to the extent that academia thrives on publications. We're not out to punish anyone for trying to make an honest buck (lots of people here collaborate with or consult for companies), but we literally can't afford, professionally, to limit ourselves in accordance with restrictions on databases. So why pay money for something we can't legally use in the manner to which we're accustomed?

  6. In case it gets slashdotted.... by MagicDude · · Score: 4, Funny

    Here's a copy of the data

    acgcggcgatgcgtacatagctagcgctgcatagatcgactatgacgatt atgactgatcggtagcatatattatgctatagctagcgtgtagctagtat cacatcagctactatgtagctacgatcgagcacactgactacgtagctag tagcggatcgatagctgatctgactgactatatatagcgcgcgatatata gcgcgtagatcgtagccgcgcgatgatatataaggagactgactagc...

    1. Re:In case it gets slashdotted.... by jcomand · · Score: 4, Informative

      Good guess, but only part of that sequence is actually in the human genome, in chromosome 20 (with one error):
      Query: 103   catcagctactatgtagctacgatc 127
      Sbjct: 84163 catcagctactttgtagctacgatc 84187
      The quality of match is rated at E=0.65, which means that you would expect about 0.65 matches this good by chance alone in a search of this size, or roughly a 48% chance of seeing at least one. (E value will change slightly if you search different databases.)
      Try searching for the sequence yourself here under Nucleotide-nucleotide BLAST (blastn)
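
      For anyone who wants to check the numbers by hand, here is a rough Python sketch. It counts the identical positions in the two aligned 25-mers above and converts the reported E-value into the chance of at least one random hit. Strictly, an E-value is an expected count of chance hits, and the usual Poisson relationship is P = 1 - exp(-E). The function names are made up for illustration and are not part of BLAST or any library.

```python
import math


def alignment_identity(query: str, subject: str) -> tuple[int, int, list[int]]:
    """Return (matches, length, mismatch_offsets) for two equal-length,
    ungapped aligned sequences. Offsets are 0-based within the alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    mismatches = [i for i, (q, s) in enumerate(zip(query, subject)) if q != s]
    return len(query) - len(mismatches), len(query), mismatches


def evalue_to_pvalue(e_value: float) -> float:
    """Probability of at least one chance hit, assuming chance hits are
    Poisson-distributed with mean e_value: P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


# The two aligned segments quoted in the comment above.
QUERY = "catcagctactatgtagctacgatc"
SBJCT = "catcagctactttgtagctacgatc"
QUERY_START = 103  # 1-based start position reported on the Query line

matches, length, mismatch_offsets = alignment_identity(QUERY, SBJCT)
p_value = evalue_to_pvalue(0.65)

if __name__ == "__main__":
    print(f"identity: {matches}/{length}")
    for offset in mismatch_offsets:
        print(f"mismatch at query position {QUERY_START + offset}: "
              f"{QUERY[offset]} vs {SBJCT[offset]}")
    print(f"P(at least one chance hit) = {p_value:.3f}")
```

      The single mismatch falls at query position 114, and E=0.65 works out to a bit under a 48% chance of at least one hit this good turning up at random.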

      If you want to see the real thing, you can browse one version of the "real" human genome here. If you click on the blue chromosome 1, and then "Download/View Sequence/Evidence", then "display", you can see the repeating "telomere" sequence at the beginning of chromosome 1.

  7. Well this is a bit embarrassing by glwtta · · Score: 4, Insightful

    I supposedly do this crap for a living, and I find out about this from Slashdot.

    Anyway, Celera seems to epitomize the way large projects like this become free: they sink billions upon billions of dollars into a project which is soon supplanted by a better free (though, of course, government funded) alternative, and after years of unsuccessfully trying to sell it, release it for free for a bit of good PR.

    But then again, they've made a huge contribution to the field overall; Craig Venter may be an arrogant prick, but he gets shit done, while Francis Collins mostly waxes poetic about the bright future of genomics.

    Well, that seems like enough venting about the sad state of research.

    --
    sic transit gloria mundi

  8. Re:What about patents? by the+gnat · · Score: 5, Insightful

    You can't generally patent "found" sequences.

    I wish that were not the case. However, there are many gene patents in existence. The trick is that now you have to show a function for that gene - although bioinformatics is sophisticated (or rather, automated) enough that you can come up with a plausible-sounding function without ever doing benchwork.

    What's really being patented is the medical application of these sequences. For instance, Company X discovers that gene Y is overexpressed in cancer Z. They take out a patent on gene Y based on this discovery. That means that no one else can pursue gene Y as a therapeutic target. Moreover, in one case testing for a specific mutation to detect cancer was covered by a patent. This is a very simple piece of labwork being covered, which any competent cancer researcher could have figured out.

    The end result is that patents are being awarded for hard work, not for novelty and invention. Throw enough money at a subject, and you'll get data but not necessarily results. Since companies (or academics) can now patent just the data, if someone else gets "lucky" and comes up with an actual result the patent holders can sue the tar out of them if they try to make money off it. (Or even if they don't, as in the case of the breast cancer gene; the company wanted people to pay three times as much for its own testing kit.)

    You may soon be able to patent single-nucleotide polymorphisms (SNPs), which may be involved in differential drug responses. Back when I was in college we had a guest lecturer who was a biotech patent attorney, and he said he thought SNPs should definitely be patentable. In any case, there is a world of difference between patenting a cancer drug, and patenting a gene (or a FUCKING POINT MUTATION) that may, in the future, be a drug target.

    Since most of the human genome is noncoding, I suspect it will be harder to patent pieces of it. I also suspect that some asshole will try anyway.

  9. It's already free by jezmund · · Score: 4, Informative

    Genomes are available at http://www.ensembl.org/ . I know I've said this before, but I feel it can't be overemphasized. Ensembl is so incredibly cool. I imagine Celera is releasing their data because no one wants to pay for it when Ensembl has it for free. Additionally, Ensembl has tools that provide so much more than just genome sequence-scanning. And they use open source projects like BioPerl and use a wiki for documentation! I think this is just a PR stunt for Celera.

    --
    "fist in the air in the land of hypocrisy"