Geneticists Push For Databases Over Journals As Main Source of Information (theatlantic.com)


Posted by samzenpus on Wednesday December 16, 2015 @01:22PM from the changing-it-up dept.

neoritter writes: The issue of reproducibility in journals continues to present problems, this time in the world of clinical genetics, where a misleading or incorrect journal article on the effect of a gene variant can affect the decisions made by doctors and patients alike, from heart monitoring implants to abortions. Poor sampling and low thresholds for evidence have led some clinical geneticists to work towards an open database of genetic information. Scientists and doctors would go to a "one-stop shop for disease genes" to check and share information with each other under the strictest of standards.

Hmm... by PopeRatzo · 2015-12-16 13:41 · Score: 1

“All we need to do to get there is convince researchers around the world to share their data, build the world's largest repository of genetic and clinical information, and develop functional tests for every gene in the human genome,” he adds. “Easy.”
It's not the doctors that I worry about having access to this database.

--
You are welcome on my lawn.
Great by ldgeorge85 · 2015-12-16 13:52 · Score: 3, Interesting

This is a wonderful idea. I mean, why not push for more studies to actually provide their raw data along with their conclusions? Extend the peer review process of the scientific method to include all of the data they generate, as advances in technology allow for the storage and communication of that information now. What is wrong with that, as a general idea? There is always the worry of security or safety of the data, but that was the same with publishing some things in journals already.
1. Re:Great by Anonymous Coward · 2015-12-17 03:02 · Score: 1
  
  This is already pretty standard. Most top-tier journals (the ones people would be basing their medical care off of) require you to put your raw data online in a database, e.g. NCBI's GEO for all microarray data. Any federally-funded project (i.e. nearly all projects in the US) is also required to make all of its raw data available.
  This is not -policed- so carefully, so people can certainly publish data and get federal funding without making the raw data available, but it's becoming more and more rare to find a publication without the raw data made available--at least for large-scale omics studies. Smaller-scale molecular studies still rarely make their raw data available, mostly because it's hard to find databases to upload like two plates' worth of qPCR data into. Many journals allow this to be put in supplemental, but it's not always easy.
Show me the money by Anonymous Coward · 2015-12-16 15:12 · Score: 4, Informative

The main person quoted in the article, Heidi Rehm, is 100% right about the need for a central open database of known genetic disease variants. And, just to get the Slashdot crowd interested, she also has a bit of the sexy librarian look going on.
But, as far as I can tell, she really hasn't been able to get much funding to be allocated for such databases (e.g. ClinVar and ClinGen). A couple years back, she got a grant for a few million dollars. But in a world where the USA thinks a long running war in Iraq is so wonderful that it's worth spending trillions on it, a few million is absolute peanuts. And Obama has made some worthless speeches about a "Precision Medicine" initiative but hasn't actually ponied up any real cash.
Personal/clinical genomics today is like personal computers in the 1980s. Personal computers didn't give us self-aware AI and personal/clinical genome sequencing isn't going to make us live forever (i.e. cure aging). But personal/clinical genome sequencing is one of the biggest revolutions in the history of medicine - right up there with aseptic surgery and antibiotics. Back in the 1980s there were networked computers and limited forms of email that were available in very limited and specialized contexts. But now everyone has a (networked) computer and all kinds of electronic communication that goes far beyond email. In the last decade, a relatively small number of people have had their genomes sequenced - and obtained useful clinical information. But that's going to explode. In a decade or two pretty much everyone in the developed world will have their genome sequenced.
I know that there's a lot of anger and cynicism about medical care in countries like the USA. There are some obvious market failures in the form of monopolies that limit the availability and dramatically increase the cost of access to medical doctors and medicines. And the USA has responded by layering on additional bureaucracy in the form of mandatory health insurance.
But there's also hope. A lot of lives are going to be saved and a lot of disability and suffering is going to be prevented by widespread personal/clinical genome sequencing. Let me give just one example. There are certain drugs that are known to be either ineffective for or toxic to people with certain rare genetic variants. As it is, everyone is given the drugs and the doctors hope that they can detect the problem before the patient ends up dead (sometimes they do detect it in time and sometimes they don't and the patient ends up dead). With personal genome sequencing, people will know ahead of time which drugs to avoid - and won't end up dead from being given the wrong drug (i.e. wrong for their particular genetics).
1. Re:Show me the money by beastofburdon · 2015-12-17 06:13 · Score: 1
  
  she also has a bit of the sexy librarian look going on
  Clicks link to her bio
  It checks out. She could dazzle me with her looks or with science! Preferably both.
Extend this concept to other areas too by pipedwho · 2015-12-16 15:26 · Score: 2

Any research or study of merit should be put into a database. This helps not only verification and result replication, but also makes searching and cross referencing far more effective. The verbosity required for journal publication is unnecessary, and the formats unusable without re-entering the data for proper formatting and processing.
Other areas that desperately need database coverage are things like copyright / patent / trademark registrations. In fact, copyright should go back to a registration-based system (instead of the default copyright system that we have now), and each work must be added to a fully searchable database with all appropriate key fields and variants (e.g. lyrics + score + references + recording for music, etc.). Trademarks and patents are currently searchable only because of entities like Google, and not because the government offices in question make them properly accessible, including all pertinent raw data, references, and patent examiner notes that go into the applications.
1. Re:Extend this concept to other areas too by hankwang · 2015-12-16 17:31 · Score: 1
  
  With medical data it's a pain to provide raw data and still guarantee anonymity of the test subjects.
  In genetics, the data is fairly standardized: it's a list of base pairs and the method to get the list from a DNA sample is standardized. However, in fields where data representation and handling are not standardized, it will be a major effort to document the data and metadata representation. For experimental physics, the raw data will only make sense in combination with the hand-written lab notes and hardware details that could be scattered over years of lab notes or that exist only in the head of the Ph.D. student. I have exchanged raw data on some occasions, both on the giving and receiving side. It usually requires extensive oral explanation, and sometimes, the details can't be remembered.
  And what purpose does it serve? In 99% of the papers, not even the referee will be willing to sift through the raw data, which may have taken months of analysis, whereas the referee can only spend an hour or two.
  
  --
  Avantslash: low-bandwidth mobile slashdot.
2. Re:Extend this concept to other areas too by Applehu Akbar · 2015-12-17 01:46 · Score: 1
  
  Because in your discipline, the database language is mathematics.
What about IPA/IVA? by gringer · 2015-12-16 16:45 · Score: 1

But if everyone populated and used curated public databases, then there would be no need for the army of PhD and Masters students employed by IPA/IVA to read papers and feed their proprietary knowledge base. What would people do with all the money that's currently spent on propping up the Qiagen army?

--
Ask me about repetitive DNA
Re:Helix by Anonymous Coward · 2015-12-16 23:03 · Score: 1

...in the form of Helix recently spun out of Illumina and the database being built by 23andMe...
I don't want to disparage either the Helix database or the 23andMe database. They're very cool - even the top academic medical geneticists are intrigued by their potential.
Of course, the top academic medical geneticists have also been working in the field long enough to know just how hard this problem really is. Case in point, the article mentioned in the summary talks about a couple who terminated a pregnancy based on an interpretation of a genetic variant that later turned out to be wrong.
But large scale databases of raw genetic data like Helix and 23andMe are fundamentally different from open curated databases like ClinGen that are the topic of this Slashdot article. The point of an open curated database like ClinGen is to have an "official" medically actionable interpretation for as many variants as possible. One of their main efforts has been to develop a sophisticated scoring scheme to combine all the possible relevant information into a simple consensus interpretation of each variant in their database.
Re:No databases... by Applehu Akbar · 2015-12-17 01:43 · Score: 1

Since we're talking about changing the reporting language of an entire discipline from human natural languages to a special-purpose database encoding, we have to assume that accountability (Whose variant is this?) and security will be part of the encoding, just as there are peer review controls to reduce the incidence of fake papers in natural language journals.
The other question, whether a database is open-source or not, is the same question we have for today's journals. Universities have to get used to according open-source journals that are well edited the same prestige they give the Shkreli-priced journals, and treat databases in the same way.
Whenever profits are based on false information by AutodidactLabrat · 2015-12-17 08:06 · Score: 1

the usual checks on falsehood and self delusion simply break down
Everyone in the industry of genetic analysis and gene prediction has a hand out for a dollar, the temptations are huge and the self checking is a joke
Time for a common database? Sure
Also time for some civil penalties for those who sell bogus data