Celera Opens Up DNA Database
greenplato writes "Thirty billion base pairs from the sequences of humans, mice, and rats that were available only by subscription to Celera's DNA database are being put into the public domain. Celera will donate this information to a 'federally run database,' presumably GenBank. Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.' Stories in BusinessWeek and The New York Times."
Shouldn't that be "data want to be free?" :)
Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.'
Data hates when you anthropomorphize it.
That is so wrong on numerous levels. Hi Evil Corporation, here's ten thousand dollars so I can get a peek at genetic code that I inherently share with every human being in the first place.
Let's see, the one company that pioneered genome research with reliable and extremely efficient shotgun sequencing is now an evil corporation because it wanted to use its investments in research for developing novel therapeutics. Which in the end benefits humankind. Please...
I am defenseless. Use your button. Mod me down with all of your hatred.
Considering the millions of dollars that Celera invested in gene sequencing, it should at least have the opportunity to make back that money. Heaven forbid, they might even deserve to make a PROFIT. Profit is a leading motivation of many corporations, you know...
A guy walks into a bar... well, I forgot the joke, but the punchline is that he's an alcoholic.
They've open sourced me! Does this mean I have to call myself GNU/Steve?
Hasn't much of the human genome been patented by greedy companies?
Powered by caffeine and sugar; BSD
FTA "DNA database are being put into the public domain" Again, we find information and data that SHOULD be in the public domain, yet the patent office, government, and kickbacks protect those that stand to make money? It's time that we, as a populace, stand and shout for the rights of the public to information. Sure, there are those that say that without protection, such innovation would be stifled, and I counter with this... "should such efforts be in the public sector?" Through eminent domain, they can take your property, but if you are a business, there seems to be no such thing. I hear of companies giving to this charity or that... but none are giving to the charity of mankind? Information is power, and in this information age, it is time for those with the information to take power from those that would use it to extort finance and power from those that do not know better. All such information should be in the public domain. Knowledge of the human genome, of anything that affects ALL of us, should be public information. For instance, any method of retrieving emergency information during an emergency should be in the public domain, not a subject of patent worthiness. The entire point of 911 service is to aid the community, not bilk them of dollars. The entire point of scientific discovery is to learn and advance humankind... when it becomes simply a method of making money, the advancement of humankind goes in the trash like yesterday's junk mail. At that point, what is the point of funding science? Think bigger than your new BMW. This might seem altruistic, but what is the point of discovery if your only reason to share is profit? When do you lose respect, when do you stop having authority? The ONLY method of advancing the human race is through sharing, through communal discovery. Perhaps this will advance that purpose, perhaps it won't.
Support NYCountryLawyer RIAA vs People
I wonder why something like this isn't inherently unprotectable, like the contents of the phone book. A DNA sequence is, after all, simply a record of an existing state of things, NOT an original work (barring genetic engineering, which this isn't). If I take your phone-number/base-pair book and reproduce it... have I broken any laws (apparently the answers are no and yes, in that order)? The precedent for this has existed for decades.
I work for a biotech company with a database which we've been trying to sell subscriptions to for a few years. The prevailing experience with trying to sell the database is that people are very reluctant to shell out the cash to access the data.
I think this is a symptom of trying to sell data to academic institutions. The problems with selling to academic institutions are twofold. Firstly, the universities don't have the cold hard cash to spend on the databases, so any cost over free is too expensive. Secondly, there is the free/open culture within universities that almost punishes commercial ventures for trying to build a business around adding some kind of value to the data (such as convenience or quality of data).
Because of the lack of sales for this database, we're considering handing the data over to a large government body so that they can maintain it, because the company simply can't afford to maintain the database - it costs a lot of money to hire talented people to do database curation.
So when Celera say that "data wants to be free", I think they mean "We'd sell you this data to try and recoup our investment, but we're resigned to the fact that you're not going to buy it".
Celera is pretty evil as an employer. At one time the company had an insane stock valuation. They realised that the genome database profits would end soon and the "synergies" with its own drug research would not happen. So they fired the genome people and used the stock proceeds to buy up biologic instrument companies and some small biotech companies. Making instruments and biology tools is what produces any income for them.
I worked for a small biotech company that became a part of Celera. They are doing good research, but the upper management is rotten. I was not there before Celera took over, but my understanding is that the new management made all the changes for the worse. Now the bullshit there is deeper than ice in Antarctica.
I doubt that we will ever figure out - and I suspect that even if we did figure out we couldn't do much about it
Now what do I do with it?
I hear what you're saying about academic institutions. They're incredibly whiny and expect everything to be free. We make very little money off of them, and they consume a large share of tech support, but we go out of our way to be nice to them because many of the same people later pop up in pharmaceutical companies in control of large quantities of cash.
Celera saw the writing on the wall. Everyone is using the public reference assembly because it's free, and in terms of contents the two are merging toward a complete consensus as they approach total coverage. You can only make money selling this kind of information while vast portions of the genome remain unknown or unavailable, and that's not true anymore.
Plus using a different assembly than other researchers cuts you off. When we import data from dbSNP, for example, we regularly drop references to positions specified in reference to Celera contigs. (Not much of a problem, since they're in the vast minority.) The Celera assembly has not been freely downloadable and redistributable, and we haven't been including a copy of it in our software (we always include a current public assembly build). Now that this has happened, I think the next build of the public assembly is going to be really good.
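For anyone wondering what "dropping references to positions on Celera contigs" looks like in practice, here is a minimal Python sketch. It is not the poster's software and not the real dbSNP format: the `SnpPosition` record, its field names, and the assembly labels are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SnpPosition:
    """One mapped position for a SNP (hypothetical record layout)."""
    rs_id: str      # SNP identifier, e.g. "rs123"
    assembly: str   # label of the assembly the position refers to (assumed)
    contig: str     # contig name within that assembly
    position: int   # offset on the contig


def keep_public(records, public_assemblies=frozenset({"reference"})):
    """Keep positions mapped to a public assembly; count the rest as dropped."""
    kept = [r for r in records if r.assembly in public_assemblies]
    dropped = len(records) - len(kept)
    return kept, dropped


# Example data is invented; the labels "reference" and "celera" are assumptions.
records = [
    SnpPosition("rs1", "reference", "contig_A", 1042),
    SnpPosition("rs1", "celera", "contig_X", 977),
    SnpPosition("rs2", "reference", "contig_B", 58),
]
kept, dropped = keep_public(records)
```

With the invented records above, the one position given against the non-public assembly is discarded and the two public ones survive, which is the "not much of a problem" case the comment describes.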
Excellent PBS video on race between government and Celera to crack the human genome:
http://www.pbs.org/wgbh/nova/genome/program.html
Mirrors please..
Celera's "extremely efficient" method only worked because the NIH's genome data was freely available. Without it Celera's "shotgun" fragments would have been just that - fragments. It took a base of comparison to complete the map.
Celera relied on the "free research" of the NIH. They extended that research with their own technique, and then patented the result of the joint data.
That what was all this school was for... to teach us how to solve our own problems. -- janeowit
Here's a copy of the data
t atgactgatcggtagcatatattatgctatagctagcgtgtagctagtat cacatcagctactatgtagctacgatcgagcacactgactacgtagctag tagcggatcgatagctgatctgactgactatatatagcgcgcgatatata gcgcgtagatcgtagccgcgcgatgatatataaggagactgactagc...
acgcggcgatgcgtacatagctagcgctgcatagatcgactatgacgat
Does anyone remember the story of the hacker that actually wrote the code that cracked the genome sequencing problem? He is the unsung hero of this whole private vs. public debacle. He wrote a 10,000 line C program to do the sequencing in "rafts" and "contigs" in the space of a few days -- and had to ice his wrists from all the work... it was because of his brilliant work that the race went from being a 20-year thing to a 3-year thing, and of course nobody knows his name. (And I've forgotten it.)
Anyway, Celera seems to epitomize the way large projects like this become free: they sink billions upon billions of dollars into a project which is soon supplanted by a better free (though, of course, government funded) alternative, and after years of unsuccessfully trying to sell it, release it for free for a bit of good PR.
But then again, they've made a huge contribution to the field overall; Craig Venter may be an arrogant prick, but he gets shit done, while Francis Collins mostly waxes poetic about the bright future of genomics.
Well, that seems like enough venting about the sad state of research.
sic transit gloria mundi
So, lemme get this straight: they fired the people in an unprofitable part of their business and expanded into profitable endeavours. God, that sounds absolutely evil. Err... maybe that's just basic sound business practice?
Upper management may or may not be rotten, but you don't really explain what was "evil" about their actions.
Someone has probably already pointed out that human DNA contains 3 billion base pairs and not 30 billion. It is a sad shame that a company as renowned as Celera is overshadowed by blatant misinformation, even from former CEO Craig Venter, who is known for calling archaea a type of bacteria in the December 2004 issue of Science magazine. Mishaps like this further alienate the real intellectuals who would normally be capable of over-running the Internet towards an information rapture in the scientific community.
-Bio major/Nerd
Genomes are available at http://www.ensembl.org/ . I know I've said this before, but I feel it can't be overemphasized. Ensembl is so incredibly cool. I imagine Celera is releasing their data because no one wants to pay for it when Ensembl has it for free. Additionally, Ensembl has tools that provide so much more than just genome sequence-scanning. And they use open source projects like BioPerl and use Wiki for documentation! I think this is just a PR stunt for Celera.
"fist in the air in the land of hypocrisy"
Both sides had a difficult time assembling the sequence. Celera's data was of higher quality because their method provided for better coverage AND they could use the public data to clear up any ambiguities.
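To give a feel for why assembly is hard and why coverage matters, here is a toy Python sketch of greedy overlap assembly: repeatedly merge the two fragments with the longest suffix/prefix overlap until nothing overlaps any more. This is only an illustration, not the assembler Celera or the public project actually used; the function names and the `min_len` threshold are made up, and it ignores sequencing errors, repeats, reverse complements, and paired reads, which are exactly the things that made the real problem difficult.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments into contigs by always joining the best-overlapping pair."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads collapse into a single contig.
reads = ["atgactga", "ctgatcgg", "tcggtagc"]
contigs = greedy_assemble(reads)
```

If a stretch of sequence is not covered by any read, the fragments on either side never overlap and stay as separate contigs, which is where better coverage or an outside reference comes in.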