Celera Opens Up DNA Database
greenplato writes "Thirty billion base pairs from the sequences of humans, mice, and rats that were available only by subscription to Celera's DNA database are being put into the public domain. Celera will donate this information to a 'federally run database,' presumably GenBank. Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.' Stories in BusinessWeek and The New York Times."
slanted articles with plenty of bias. slashdot, as fair and balanced as fox!
Shouldn't that be "data want to be free?" :)
score!
"data wants to be public!"
the new anti-RIAA/MPAA slogan
Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.'
Data hates when you anthropomorphize it.
Will this mean more clones, or that more genetic modification treatments will become available, now that high school students can get hold of this and work with it for their next science fair project?
Saskboy's blog is good. 9 out of 10 dentists agree.
Considering the millions of dollars that Celera invested in gene sequencing, it should at least have the opportunity to make back that money. Heaven forbid, they might even deserve to make a PROFIT. Profit is a leading motivation of many corporations, you know...
A guy walks into a bar... well, I forgot the joke, but the punchline is that he's an alcoholic.
They've open sourced me! Does this mean I have to call myself GNU/Steve?
Hasn't much of the human genome been patented by greedy companies?
Powered by caffeine and sugar; BSD
FTA "DNA database are being put into the public domain" Again, we find information and data that SHOULD be in the public domain, yet the patent office, government, and kickbacks protect those that stand to make money? It's time that we, as a populace, stand and shout for the rights of the public to information. Sure, there are those that say that without protection, such innovation would be stifled, and I counter with this... "should such efforts be in the public sector?" Through eminent domain, they can take your property, but if you are a business, there seems to be no such thing. I hear of companies giving to this charity or that... but none are giving to the charity of mankind? Information is power, and in this information age, it is time for those with the information to take power from those that would use it to extort finance and power from those that do not know better. All such information should be in the public domain. Knowledge of the human genome, of anything that affects ALL of us, should be public information. For instance, any method of retrieving emergency information during an emergency should be in the public domain, not a subject of patent worthiness. The entire point of 911 service is to aid the community, not bilk them of dollars. The entire point of scientific discovery is to learn and advance humankind... when it becomes simply a method of making money, the advancement of humankind goes in the trash like yesterday's junk mail. At that point, what is the point of funding science? Think bigger than your new BMW. This might seem altruistic, but what is the point of discovery if your only reason to share is profit? When do you lose respect, when do you stop having authority? The ONLY method of advancing the human race is through sharing, through communal discovery. Perhaps this will advance that purpose, perhaps it won't.
Support NYCountryLawyer RIAA vs People
Who holds the patent for "viewing alpha sequences comprised of the letters G, A, T, and C, superimposed on a dual helix-shaped structure...on the internet"?
I wonder why something like this isn't inherently unprotectable, like the contents of the phone book. A DNA sequence is, after all, simply a record of an existing state of things, NOT an original work (barring genetic engineering, which this isn't). If I take your phone number/base pair book and reproduce it... have I broken any laws (apparently the answers are no and yes, in that order)? The precedent for this has existed for decades.
I work for a biotech company with a database which we've been trying to sell subscriptions to for a few years. The prevailing experience with trying to sell the database is that people are very reluctant to shell out the cash to access the data.
I think this is a symptom of trying to sell data to academic institutions. The problems with selling to academic institutions are twofold. Firstly, the universities don't have the cold hard cash to spend on the databases, so any cost over free is too expensive. Secondly, there is the free/open culture within universities that almost punishes commercial ventures for trying to build a business around adding some kind of value to the data (such as convenience or quality of data).
Because of the lack of sales for this database, we're considering handing the data over to a large government body so that they can maintain it, because the company simply can't afford to maintain the database. It costs a lot of money to hire talented people to do database curation.
So when Celera say that "data wants to be free", I think they mean "We'd sell you this data to try and recoup our investment, but we're resigned to the fact that you're not going to buy it".
Sure the public can view the DNA but did Celera surrender the patents too??
...they had opened it up in time to save HHGG!
Instructions to moderators: This is funny. Maybe not very funny, but if you don't get the joke, just take a pass on moderating this one. Thanks.
Does this make Genbank "Internet Explorer" and Celera "Netscape"?
Now what do I do with it?
I wonder if SCO has threatened to sue them?
Personally, I think the real reason is that the companies can't make a profit by simply having the "standard definition," and it's effectively useless to them.
To 99.99999% of the population, these base pair sequences could be random bits, and we wouldn't know a chromosome if it came up and bit us on the ass.
They are holding a single sample of data, when in reality what's needed is the variation patterns based upon this starting point. We could start to see just how different we are from apes, and why behavioral patterns emerge.
liqbase
I hear what you're saying about academic institutions. They're incredibly whiny and expect everything to be free. We make very little money off of them, and they consume a large share of tech support, but we go out of our way to be nice to them because many of the same people later pop up in pharmaceutical companies in control of large quantities of cash.
Celera saw the writing on the wall. Everyone is using the public reference assembly because it's free, and in terms of contents the two are merging toward a complete consensus as they approach total coverage. You can only make money selling this kind of information while vast portions of the genome remain unknown or unavailable, and that's not true anymore.
Plus using a different assembly than other researchers cuts you off. When we import data from dbSNP, for example, we regularly drop references to positions specified in reference to Celera contigs. (Not much of a problem, since they're in the vast minority.) The Celera assembly has not been freely downloadable and redistributable, and we haven't been including a copy of it in our software (we always include a current public assembly build). Now that this has happened, I think the next build of the public assembly is going to be really good.
No more security through obscurity... and if they do have security patches for me, I would rather not have to recompile.
I have freaks! I did something right...
Whose DNA is it?
I'll tell you exactly what it wants. Human genome data wants to be anthropomorphised.
Are you sure you don't want to add "make love not war" to your rant?
The data generated would not EXIST had not investors (read: people) put millions of dollars into the company to hire the researchers, buy the equipment, and develop and analyze the data. Odd that, at some point, they'd hoped to get their money back.
Some people, unlike most here it seems, understand that INFORMATION is not free, that it costs time and money and often sweat and tears to create. As such, in many cases it simply cannot be given away.
However, if you believe otherwise, there's nothing stopping you from creating your own information and placing that value into the public domain.
Assuming you're capable of doing so, of course.
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
Excellent PBS video on race between government and Celera to crack the human genome:
http://www.pbs.org/wgbh/nova/genome/program.html
Mirrors please..
..with the typical /. groupthink. Everyone around here would like to think that the genome sequence should be free to the public, and likens this to open source software. I don't disagree with this. However, we must remember that one can sell a service. An annotated database of the genome sequence is a service. Although it doesn't contain unique "created" data, annotation and organization is a huge undertaking in itself. Yes, it's horrible that a company invested money and resources towards capitalizing on something that everyone should own. But it's a fact of life, and we have the publicly funded Human Genome Project, which is already open to researchers and was obtained in a different manner than the private one.
It's already good. Releases are coming farther apart and there are fewer and fewer changes. The next build should be a true gold standard; you're right about that.
Be wary of dbSNP mappings. Many placements are ambiguous (98% alignment to one spot, 96% to another, which one is right?). Some of the data is probably bogus, too. Still, it's pretty good stuff.
I spill^H^H^H^H^H^H open up my DNA database every day!
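For anyone wondering what handling those ambiguous placements might look like in practice, here is a rough sketch. It assumes each SNP comes with a list of candidate placements and a percent-identity score, and keeps the best hit only when it beats the runner-up by some margin. The `Placement` record, the field names, and the 5-point default margin are all made up for illustration; this is not how dbSNP or any particular tool actually does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One candidate mapping of a SNP onto an assembly (hypothetical record)."""
    snp_id: str
    contig: str
    position: int
    identity: float  # percent identity of the alignment, 0-100


def unambiguous_placements(placements, min_margin=5.0):
    """Return {snp_id: best placement} for SNPs whose best hit is clearly best.

    A SNP is kept if it has a single placement, or if its top placement
    beats the second-best by at least min_margin percentage points.
    """
    by_snp = {}
    for p in placements:
        by_snp.setdefault(p.snp_id, []).append(p)

    kept = {}
    for snp_id, hits in by_snp.items():
        ranked = sorted(hits, key=lambda p: p.identity, reverse=True)
        if len(ranked) == 1 or ranked[0].identity - ranked[1].identity >= min_margin:
            kept[snp_id] = ranked[0]
    return kept


example = [
    Placement("snpA", "contig1", 1200, 98.0),   # 98% here...
    Placement("snpA", "contig7", 88000, 96.0),  # ...and 96% there: too close to call
    Placement("snpB", "contig2", 450, 99.5),    # only one placement
]
result = unambiguous_placements(example)
print(sorted(result))
```

With the 98%/96% case from the comment above, the first SNP is dropped as ambiguous and only the single-placement one survives. Loosening `min_margin` trades more mapped SNPs for more risk of a wrong placement.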
All your Sybase are belong to us.
or he'll write a bill preventing the data from being released.
Oh wait, there's no corporation for him to whore himself out to. Maybe this will actually see daylight.
You are in a maze of twisty little passages, all alike.
to start complaining about how another Hitchhiker's Guide story got posted.
Damned acronyms.
Here's a copy of the data
t atgactgatcggtagcatatattatgctatagctagcgtgtagctagtat cacatcagctactatgtagctacgatcgagcacactgactacgtagctag tagcggatcgatagctgatctgactgactatatatagcgcgcgatatata gcgcgtagatcgtagccgcgcgatgatatataaggagactgactagc...
acgcggcgatgcgtacatagctagcgctgcatagatcgactatgacgat
Does anyone remember the story of the hacker that actually wrote the code that cracked the genome sequencing problem? He is the unsung hero of this whole private vs. public debacle. He wrote a 10,000 line C program to do the sequencing in "rafts" and "contigs" in the space of a few days -- and had to ice his wrists from all the work... it was because of his brilliant work that the race went from being a 20-year thing to a 3-year thing, and of course nobody knows his name. (And I've forgotten it.)
Anyway, Celera seems to epitomize the way large projects like this become free: they sink billions upon billions of dollars into a project which is soon supplanted by a better free (though, of course, government funded) alternative, and after years of unsuccessfully trying to sell it, release it for free for a bit of good PR.
But then again, they've made a huge contribution to the field overall; Craig Venter may be an arrogant prick, but he gets shit done, while Francis Collins mostly waxes poetic about the bright future of genomics.
Well, that seems like enough venting about the sad state of research.
sic transit gloria mundi
Craig Venter better hope his health/life insurance company doesn't take a closer look at the sequence and drop him for "pre-existing" conditions.
In all seriousness however, Celera's sequences essentially suck anyway. The public projects have handily beat them and their sequencing methods have been deemed inferior (see last October's issue of Nature). They are not adding any scientific value by releasing their versions of these three genomes.
I thought it said Caldera there at first. I thought that if I looked too much like my dad, I'd get sued.
Someone has probably already pointed out that human DNA contains 3 billion base pairs and not 30 billion. It is a sad shame that a company as renowned as Celera is overshadowed by blatant misinformation, even from former CEO Craig Venter, who is known for calling archaea a type of bacteria in the December 2004 issue of Science magazine. Mishaps like this further alienate the real intellectuals who would normally be capable of over-running the Internet towards an information rapture in the scientific community.
-Bio major/Nerd
The book is very readable, and from my own experiences rings of the truth.
The information was most likely taken from a press release by Celera. Press releases tend to lean toward hyperbole so long as they remain technically truthful. Either there were a heck of a lot of mouse and rat genomes, which alongside the human totaled 30 Gbp, or much of the data is redundant.
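A quick back-of-the-envelope check of that 30 Gbp figure, as a sketch only. The genome sizes below are rounded approximations I am assuming for illustration (roughly 3 Gbp each for human, mouse, and rat); they are not numbers from the article or the press release.

```python
# Approximate haploid genome sizes in base pairs (rounded, assumed figures).
GENOME_SIZES_BP = {"human": 3.1e9, "mouse": 2.7e9, "rat": 2.8e9}

# Total reported in the story summary: thirty billion base pairs.
RELEASED_BP = 30e9


def implied_redundancy(released_bp, genome_sizes):
    """How many times over the released data would cover one copy of each genome."""
    return released_bp / sum(genome_sizes.values())


redundancy = implied_redundancy(RELEASED_BP, GENOME_SIZES_BP)
print(f"{redundancy:.1f}x")
```

Under those assumptions the total works out to somewhere between three and four times one copy of each genome, which fits either explanation: several genomes' worth of sequence, or a lot of overlapping data.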
Genomes are available at http://www.ensembl.org/ . I know I've said this before, but I feel it can't be overemphasized. Ensembl is so incredibly cool. I imagine Celera is releasing their data because no one wants to pay for it when Ensembl has it for free. Additionally, Ensembl has tools that provide so much more than just genome sequence-scanning. And they use open source projects like BioPerl and use Wiki for documentation! I think this is just a PR stunt for Celera.
"fist in the air in the land of hypocrisy"
Okay, since this data, too, "wants" to be free, how about posting links to the CVS / rsync / snapshot.bz2 / BitTorrent / ftp site for downloading the database? "I'm okay to go..."
They did swear under oath that they would release the data without restrictions. They also told Congress (under oath) that their strategy would end speculative patenting of the human genome, whereas in fact they've applied for thousands and thousands of speculative patents. Shame on them.
Celera have long been seen as the Microsoft of the science world, snaffing up patents 'like a powered up pacman'. So I'd say you got the two mixed up there. But Craig Venter (Celera) 'opensourcing' is like Bill Gates stealing your cereal, and never replacing your milk, then one day giving you a cow. They are both pricks, and this gesture doesn't change anything. They are both maligned in different fields.
(I am pretty certain this data has been freely available, but making drugs based on research using it etc. might have been the restricting factor.
Maybe it just wasn't freely available to academia.)
'Caldera Opens NDA Database!'
OK, heart rate is lowering now...
... computer hackers have known this for quite a while now.
- "They misunderestimated me."
This brings a whole new meaning to the phrase Identity Theft.
Anyone got a torrent?
I am one of many. My idea is not unique, nor do I expect my voice alone to sway you. I speak in a chorus of opinion.
Can anyone tell me if this is that big of a deal? I'm no biologist, just a college kid in a bioinformatics class, but from what I've experienced the major free databases out there like GenBank, EMBL, and DDBJ seem to be pretty comprehensive.
made me cum all over the ceiling
Open Source has nothing to do with GNU.
Profit motivates conservative power-grabbers. It hardly ever motivates creativity or interesting research. That is why the lean, mean, modern corporations so desperately suck at basic research, and almost all cutting-edge work is still coming from universities and national labs.
These genomes are in the latter category: sitting on this information and trying to wring profit out of it will never earn back the investment Celera expended. Publishing it on the other hand will allow it to be used for intangible and tangible benefits to society, some of which will come back to the company.
Someone has probably already pointed out that human DNA contains 3 billion base pairs and not 30 billion. It is a sad shame that a company as renowned as Celera is overshadowed by blatant misinformation...
...
According to my 2004 Bioinformatics in the Post-Genomic Era textbook, there are approximately 100 billion bases - so that's more than 30 billion base pairs (60 billion)
But, hey, so you're working on outdated texts, why should the facts bother you?
-- Tigger warning: This post may contain tiggers! --
Whose DNA is it?
It's mine. Prior art.
All your patents are belong to humanity.
-- Tigger warning: This post may contain tiggers! --