Geneticists Push For Databases Over Journals As Main Source of Information (theatlantic.com)
neoritter writes: The issue of reproducibility in journals continues to present problems. This time it is in the world of clinical genetics, where a misleading or incorrect journal article on the effect of a gene variant can affect the decisions made by doctors and patients alike, from heart monitoring implants to abortions. Poor sampling and low thresholds for evidence have led some clinical geneticists to work towards an open database of genetic information. Scientists and doctors would go to a "one-stop shop for disease genes" to check and share information with each other under the strictest of standards.
It's not the doctors that I worry about having access to this database.
You are welcome on my lawn.
This is a wonderful idea. I mean, why not push for more studies to actually provide their raw data along with their conclusions? Extend the peer review process of the scientific method to include all of the data they generate, as advances in technology allow for the storage and communication of that information now. What is wrong with that, as a general idea? There is always the worry of security or safety of the data, but that was the same with publishing some things in journals already.
The main person quoted in the article, Heidi Rehm, is 100% right about the need for a central open database of known genetic disease variants. And, just to get the Slashdot crowd interested, she also has a bit of the sexy librarian look going on.
But, as far as I can tell, she really hasn't been able to get much funding allocated for such databases (e.g. ClinVar and ClinGen). A couple years back, she got a grant for a few million dollars. But in a world where the USA thinks a long-running war in Iraq is so wonderful that it's worth spending trillions on it, a few million is absolute peanuts. And Obama has made some worthless speeches about a "Precision Medicine" initiative but hasn't actually ponied up any real cash.
Personal/clinical genomics today is like personal computers in the 1980s. Personal computers didn't give us self-aware AI and personal/clinical genome sequencing isn't going to make us live forever (i.e. cure aging). But personal/clinical genome sequencing is one of the biggest revolutions in the history of medicine - right up there with aseptic surgery and antibiotics. Back in the 1980s there were networked computers and limited forms of email that were available in very limited and specialized contexts. But now everyone has a (networked) computer and all kinds of electronic communication that goes far beyond email. In the last decade, a relatively small number of people have had their genomes sequenced - and obtained useful clinical information. But that's going to explode. In a decade or two pretty much everyone in the developed world will have their genome sequenced.
I know that there's a lot of anger and cynicism about medical care in countries like the USA. There are some obvious market failures in the form of monopolies that limit the availability and dramatically increase the cost of access to medical doctors and medicines. And the USA has responded by layering on additional bureaucracy in the form of mandatory health insurance.
But there's also hope. A lot of lives are going to be saved and a lot of disability and suffering is going to be prevented by widespread personal/clinical genome sequencing. Let me give just one example. There are certain drugs that are known to be either ineffective for or toxic to people with certain rare genetic variants. As it is, everyone is given the drugs and the doctors hope that they can detect the problem before the patient ends up dead (sometimes they do detect it in time and sometimes they don't and the patient ends up dead). With personal genome sequencing, people will know ahead of time which drugs to avoid - and won't end up dead from being given the wrong drug (i.e. wrong for their particular genetics).
Any research or study of merit should be put into a database. This helps not only verification and result replication, but also makes searching and cross referencing far more effective. The verbosity required for journal publication is unnecessary, and the formats unusable without re-entering the data for proper formatting and processing.
Other areas that desperately need database coverage are things like copyright / patent / trademark registrations. In fact, copyright should go back to a registration-based system (instead of the default copyright system that we have now), and each work should have to be added to a fully searchable database with all appropriate key fields and variants (e.g. lyrics + score + references + recording for music, etc). Trademarks and patents are currently searchable only because of entities like Google, and not because the government offices in question make them properly accessible, including all the pertinent raw data, references, and patent examiner notes that go into the applications.
But if everyone populated and used curated public databases, then there would be no need for the army of PhD and Masters students employed by IPA/IVA to read papers and feed their proprietary knowledge base. What would people do with all the money that's currently spent on propping up the Qiagen army?
Ask me about repetitive DNA
...in the form of Helix recently spun out of Illumina and the database being built by 23andMe...
I don't want to disparage either the Helix database or the 23andMe database. They're very cool - even the top academic medical geneticists are intrigued by their potential.
Of course, the top academic medical geneticists have also been working in the field long enough to know just how hard this problem really is. Case in point: the article mentioned in the summary talks about a couple who terminated a pregnancy based on an interpretation of a genetic variant that later turned out to be wrong.
But large-scale databases of raw genetic data like Helix and 23andMe are fundamentally different from open curated databases like ClinGen that are the topic of this Slashdot article. The point of an open curated database like ClinGen is to have an "official" medically actionable interpretation for as many variants as possible. One of their main efforts has been to develop a sophisticated scoring scheme to combine all the possible relevant information into a simple consensus interpretation of each variant in their database.
Since we're talking about changing the reporting language of an entire discipline from human natural languages to a special-purpose database encoding, we have to assume that accountability (Whose variant is this?) and security will be part of the encoding, just as there are peer review controls to reduce the incidence of fake papers in natural language journals.
The other question, whether a database is open-source or not, is the same question we have for today's journals. Universities have to get used to according open-source journals that are well edited the same prestige they give the Shkreli-priced journals, and treat databases in the same way.
the usual checks on falsehood and self delusion simply break down
Everyone in the industry of genetic analysis and gene prediction has a hand out for a dollar; the temptations are huge and the self-checking is a joke
Time for a common database? Sure
Also time for civil penalties for those who sell bogus data