Slashdot Mirror


Geneticists Push For Databases Over Journals As Main Source of Information (theatlantic.com)

neoritter writes: The issue of reproducibility in journals continues to present problems, this time in the world of clinical genetics, where a misleading or incorrect journal article on the effect of a gene variant can affect the decisions made by doctors and patients alike, from heart monitoring implants to abortions. Poor sampling and low thresholds for evidence have led some clinical geneticists to work towards an open database of genetic information. Scientists and doctors would go to a "one-stop shop for disease genes" to check and share information with each other under the strictest of standards.

31 comments

  1. open season on open databases by turkeydance · · Score: 0

    it's duck season!

    1. Re:open season on open databases by Anonymous Coward · · Score: 0

      it's duck season!

      Wabbit season!!!

    2. Re:open season on open databases by Anonymous Coward · · Score: 0

      it's duck season!

      Wabbit season!!!

      Duck season!

  2. Hmm... by PopeRatzo · · Score: 1

    “All we need to do to get there is convince researchers around the world to share their data, build the world's largest repository of genetic and clinical information, and develop functional tests for every gene in the human genome,” he adds. “Easy.”

    It's not the doctors that I worry about having access to this database.

    --
    You are welcome on my lawn.
    1. Re:Hmm... by Anonymous Coward · · Score: 0

      On one hand, the databases that are being advocated in the article would be for mutations/variants rather than for genome sequences of individuals, per se. For example, that a mutation of "A" to "T" at position 20,435,103 causes Syndrome X (as a totally made up example). And, generally speaking, the supporting data in the database wouldn't include personal details such as pedigrees of families carrying the variants.

      On the other hand, some variants are sufficiently rare that they occur in only a few families in the world. Make of that what you will.

      In the bigger picture, though, there are other things that are more worrying than this database. For example, some of the rich little oil countries in the Middle East are talking about sequencing the genomes of their entire population (the citizens, anyway - probably not the foreign maids and construction workers). But genome sequencing is the ultimate paternity test and some of these little oil countries have laws that punish adultery by stoning to death. Could be interesting.
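
      To make the variant-versus-individual distinction above concrete, here is a minimal sketch of what a variant-level record might look like. The field names, the chromosome, and the key format are assumptions for illustration, not the schema of any real database; the position and "Syndrome X" are the made-up example from the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantRecord:
    """One entry in a hypothetical variant-level database.

    Note what is absent: no patient name, pedigree, or full genome sequence.
    """
    chromosome: str          # assumed; the comment gives only a position
    position: int            # coordinate on the reference genome
    ref: str                 # reference base
    alt: str                 # observed (mutated) base
    condition: str           # reported clinical association
    citations: tuple = ()    # supporting papers, if any

    def key(self) -> str:
        # Hypothetical compact identifier for lookups
        return f"{self.chromosome}:{self.position}{self.ref}>{self.alt}"


example = VariantRecord("chr1", 20_435_103, "A", "T", "Syndrome X")
print(example.key())  # chr1:20435103A>T
```

      A record like this describes the variant itself, which is why the privacy concern only bites when a variant is rare enough to point to a handful of families.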

    2. Re:Hmm... by PopeRatzo · · Score: 0

      On the other hand, some variants are sufficiently rare that they occur in only a few families in the world. Make of that what you will.

      Leave the Royal Family out of this. They're descended from Jesus.

      --
      You are welcome on my lawn.
    3. Re:Hmm... by Anonymous Coward · · Score: 0

      On the other hand, some variants are sufficiently rare that they occur in only a few families in the world. Make of that what you will.

      Leave the Royal Family out of this.

      Yes, the very rare variants do tend to get noticed in extended families with a tradition of marrying their cousins. But, in some parts of the world, it's not just the rich and powerful extended families that marry their cousins - even the poor extended families do it.

      They're descended from Jesus.

      There's a good chance that all life on the planet is descended from a single little cell that was floating around in the ocean roughly four billion years ago. In a certain sense, we're all part of the same big organism spanning the entire planet. But then some parts of the organism eat other parts of the organism - even though the other parts would prefer not to be. :)

  3. Great by ldgeorge85 · · Score: 3, Interesting

    This is a wonderful idea. I mean, why not push for more studies to actually provide their raw data along with their conclusions? Extend the peer review process of the scientific method to include all of the data they generate, as advances in technology allow for the storage and communication of that information now. What is wrong with that, as a general idea? There is always the worry of security or safety of the data, but that was the same with publishing some things in journals already.

    1. Re:Great by Anonymous Coward · · Score: 0

      This is a wonderful idea. I mean, why not push for more studies to actually provide their raw data along with their conclusions?

      I'm not opposed to making raw data available. And I've actually had good luck asking authors for the raw data underlying their papers for cases that I was particularly interested in. But, in a certain sense, the database being advocated in the article, would be exactly the opposite.

      Yes, the curators would read the original articles carefully and perhaps even contact the authors for supporting data. But the final results would be a curated consensus of all the original articles for each variant. In particular, one of the key things that the ClinGen initiative is already doing is developing a detailed scoring system to weight all the different bits of information in arriving at the final consensus conclusion for each genetic variant.
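
      As a rough illustration of the idea of weighting different bits of evidence into one consensus call, here is a toy sketch. It is not ClinGen's actual scoring system: the weights, the threshold, and the category labels are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str                # e.g. a paper or assay (hypothetical labels)
    weight: float              # assumed: how much curators trust this evidence
    supports_pathogenic: bool  # True if it points toward "causes disease"


def consensus_score(items):
    """Return a score in [-1, 1]; positive leans pathogenic, negative benign."""
    total = sum(e.weight for e in items)
    if total == 0:
        return 0.0
    signed = sum(e.weight if e.supports_pathogenic else -e.weight for e in items)
    return signed / total


def classify(score, threshold=0.5):
    """Map a score to a label; the threshold is an arbitrary assumption."""
    if score >= threshold:
        return "likely pathogenic"
    if score <= -threshold:
        return "likely benign"
    return "uncertain significance"


items = [
    Evidence("family study", 2.0, True),
    Evidence("functional assay", 3.0, True),
    Evidence("population frequency", 1.0, False),
]
print(classify(consensus_score(items)))
```

      The point of the sketch is only that a single weak or contradicted paper should not decide the call on its own, which is the failure mode the article describes.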

    2. Re:Great by Anonymous Coward · · Score: 1

      This is already pretty standard. Most top-tier journals (the ones people would be basing their medical care off of) require you to put your raw data online in a database, e.g. NCBI's GEO for all microarray data. Any federally-funded project (i.e. nearly all projects in the US) is also required to make all of its raw data available.

      This is not -policed- so carefully, so people can certainly publish data and get federal funding without making the raw data available, but it's becoming more and more rare to find a publication without the raw data made available--at least for large-scale omics studies. Smaller-scale molecular studies still rarely make their raw data available, mostly because it's hard to find databases to upload like two plates' worth of qPCR data into. Many journals allow this to be put in supplemental, but it's not always easy.

  4. Show me the money by Anonymous Coward · · Score: 4, Informative

    The main person quoted in the article, Heidi Rehm, is 100% right about the need for a central open database of known genetic disease variants. And, just to get the Slashdot crowd interested, she also has a bit of the sexy librarian look going on.

    But, as far as I can tell, she really hasn't been able to get much funding to be allocated for such databases (e.g. ClinVar and ClinGen). A couple years back, she got a grant for a few million dollars. But in a world where the USA thinks a long running war in Iraq is so wonderful that it's worth spending trillions on it, a few million is absolute peanuts. And Obama has made some worthless speeches about a "Precision Medicine" initiative but hasn't actually ponied up any real cash.

    Personal/clinical genomics today is like personal computers in the 1980s. Personal computers didn't give us self-aware AI and personal/clinical genome sequencing isn't going to make us live forever (i.e. cure aging). But personal/clinical genome sequencing is one of the biggest revolutions in the history of medicine - right up there with aseptic surgery and antibiotics. Back in the 1980s there were networked computers and limited forms of email that were available in very limited and specialized contexts. But now everyone has a (networked) computer and all kinds of electronic communication that goes far beyond email. In the last decade, a relatively small number of people have had their genomes sequenced - and obtained useful clinical information. But that's going to explode. In a decade or two pretty much everyone in the developed world will have their genome sequenced.

    I know that there's a lot of anger and cynicism about medical care in countries like the USA. There are some obvious market failures in the form of monopolies that limit the availability and dramatically increase the cost of access to medical doctors and medicines. And the USA has responded by layering on additional bureaucracy in the form of mandatory health insurance.

    But there's also hope. A lot of lives are going to be saved and a lot of disability and suffering is going to be prevented by widespread personal/clinical genome sequencing. Let me give just one example. There are certain drugs that are known to either be ineffective or toxic to people with certain rare genetic variants. As it is, everyone is given the drugs and the doctors hope that they can detect the problem before the patient ends up dead (sometimes they do detect it in time and sometimes they don't and the patient ends up dead). With personal genome sequencing, people will know ahead of time which drugs to avoid - and won't end up dead from being given the wrong drug (i.e. wrong for their particular genetics).
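
    The "know ahead of time which drugs to avoid" idea above amounts to a lookup of a patient's variants against a table of known drug-variant problems. Here is a minimal sketch of that lookup; the table, the variant keys, and the drug names are entirely made up and do not come from any real pharmacogenomic resource.

```python
# Hypothetical table: variant key -> drugs reported as ineffective or toxic
# for carriers. Every entry here is invented for illustration.
DRUGS_TO_AVOID = {
    "chr1:1000A>T": {"drug_a"},
    "chr2:2000G>C": {"drug_b", "drug_c"},
}


def drugs_to_avoid(patient_variants):
    """Collect every flagged drug for the variants a patient carries."""
    flagged = set()
    for variant in patient_variants:
        flagged |= DRUGS_TO_AVOID.get(variant, set())
    return flagged


def is_flagged(drug, patient_variants):
    """True if the drug is flagged for at least one of the patient's variants."""
    return drug in drugs_to_avoid(patient_variants)


patient = {"chr2:2000G>C", "chr9:5G>A"}  # second variant is not in the table
print(sorted(drugs_to_avoid(patient)))
```

    A lookup like this is only as good as the table behind it, which is the argument for a carefully curated open database in the first place.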

    1. Re:Show me the money by Anonymous Coward · · Score: 0

      TROLL ALERT: She looks like a squinty-eyed pig.

    2. Re:Show me the money by Anonymous Coward · · Score: 0

      Each to their own. You prefer the rail-thin super-model heroin chic yourself, I take it?

      It's weird that we're at the beginning of one of the biggest revolutions in the history of medicine - with the ability to sequence most of a person's genome for somewhere around $1,000. And no one really seems to care.

      For me, I get about a minute and a half into this NHGRI video, and I get goose bumps. It's history in the making. It's like watching Abraham Lincoln give the Gettysburg Address (one of the greatest speeches in human history) - if Abraham Lincoln had been smoking hot.

      Not that appearance really matters. But, wow, if that video doesn't inspire you based on content alone then you must be truly dead inside.

    3. Re:Show me the money by Anonymous Coward · · Score: 0

      TROLL ALERT: So people are more interested in what people know and do than how they look.

    4. Re:Show me the money by Anonymous Coward · · Score: 0

      Sexy? You need to get out more often.

    5. Re:Show me the money by beastofburdon · · Score: 1

      she also has a bit of the sexy librarian look going on

      Clicks link to her bio
      It checks out. She could dazzle me with her looks or with science! Preferably both.

  5. Extend this concept to other areas too by pipedwho · · Score: 2

    Any research or study of merit should be put into a database. This helps not only verification and result replication, but also makes searching and cross referencing far more effective. The verbosity required for journal publication is unnecessary, and the formats unusable without re-entering the data for proper formatting and processing.

    Other areas that desperately need database coverage are things like copyright / patent / trademark registrations. In fact, copyright should go back to being a registered right (instead of the default copyright system that we have now) and the work must be added to the fully searchable database with all appropriate key fields and variants (e.g. lyrics + score + references + recording for music, etc). Trademarks and patents are currently searchable only because of entities like Google, and not because they are made properly accessible (by the government offices in question) including all pertinent raw data, references, and patent examiner notes that go into the applications.

    1. Re:Extend this concept to other areas too by hankwang · · Score: 1

      With medical data it's a pain to provide raw data and still guarantee anonymity of the test subjects.

      In genetics, the data is fairly standardized: it's a list of base pairs and the method to get the list from a DNA sample is standardized. However, in fields where data representation and handling are not standardized, it will be a major effort to document the data and metadata representation. For experimental physics, the raw data will only make sense in combination with the hand-written lab notes and hardware details, which could be scattered over years of lab notes or exist only in the head of the Ph.D. student. I have exchanged raw data on some occasions, both on the giving and receiving side. It usually requires extensive oral explanation, and sometimes, the details can't be remembered.

      And what purpose does it serve? In 99% of the papers, not even the referee will be willing to sift through the raw data, which may have taken months of analysis, whereas the referee can only spend an hour or two.

    2. Re:Extend this concept to other areas too by Anonymous Coward · · Score: 0

      The verbosity required for journal publication is unnecessary

      This really depends on the area of research. I work in condensed matter physics, and I often feel that it is rather the opposite: many papers are only 4-10 pages long, which results in an insufficient level of detail about the methods being published, and I might have to spend up to a week reproducing the missing parts if I wish to understand what's really going on. At least in my field, we don't need less verbose papers, we need more verbose papers.

    3. Re:Extend this concept to other areas too by Applehu+Akbar · · Score: 1

      Because in your discipline, the database language is mathematics.

  6. No databases... by Anonymous Coward · · Score: 0

    No databases. Journals are still a must. Why?

    First, with databases, it is easy for a company to control access to who has the data, charging astronomical rates. With a journal, I save a copy or have a hard copy, and now, or maybe ten years from now, I can access it for a reference. With a database, the data could still be there, it could be expired, and who knows how much I would have to pay for access.

    Second, the databases can be hacked or modified. Nobody would ever know.

    Third, the data can disappear at the DB owner's whim.

    All in all... just no.

    1. Re:No databases... by Anonymous Coward · · Score: 0

      In this case, though, the databases in question (ClinGen and ClinVar) are open and freely available to download in their entirety. They're maintained by the NCBI so, as works created by the government, they're fully in the public domain.

    2. Re:No databases... by Applehu+Akbar · · Score: 1

      Since we're talking about changing the reporting language of an entire discipline from human natural languages to a special-purpose database encoding, we have to assume that accountability (Whose variant is this?) and security will be part of the encoding, just as there are peer review controls to reduce the incidence of fake papers in natural language journals.

      The other question, whether a database is open-source or not, is the same question we have for today's journals. Universities have to get used to according open-source journals that are well edited the same prestige they give the Shkreli-priced journals, and treat databases in the same way.

  7. What about IPA/IVA? by gringer · · Score: 1

    But if everyone populated and used curated public databases, then there would be no need for the army of PhD and Masters students employed by IPA/IVA to read papers and feed their proprietary knowledge base. What would people do with all the money that's currently spent on propping up the Qiagen army?

    --
    Ask me about repetitive DNA
  8. Helix by Anonymous Coward · · Score: 0

    This will fail for academic geneticists. The reason is there are already vast commercial databases that are far bigger than anything academia could put together or fund in the form of Helix recently spun out of Illumina and the database being built by 23andMe (btw if you get sequenced by 23andMe, they keep your sequence data for others to reference in the future; that's a call out to you privacy guys). An open database sounds good but it's only as good as the data that you put in and Helix is already WAY ahead. So its usability will be doomed to fail when you can just access a Helix reference for a few hundred bucks.

    1. Re:Helix by Anonymous Coward · · Score: 1

      ...in the form of Helix recently spun out of Illumina and the database being built by 23andMe...

      I don't want to disparage either the Helix database or the 23andMe database. They're very cool - even the top academic medical geneticists are intrigued by their potential.

      Of course, the top academic medical geneticists have also been working in the field long enough to know just how hard this problem really is. Case in point, the article mentioned in the summary talks about a couple who terminated a pregnancy based on an interpretation of a genetic variant that later turned out to be wrong.

      But large scale databases of raw genetic data like Helix and 23andMe are fundamentally different from open curated databases like ClinGen that are the topic of this Slashdot article. The point of an open curated database like ClinGen is to have an "official" medically actionable interpretation for as many variants as possible. One of their main efforts has been to develop a sophisticated scoring scheme to combine all the possible relevant information into a simple consensus interpretation of each variant in their database.

    2. Re:Helix by Anonymous Coward · · Score: 0

      I understand the theory, but theory doesn't account for reality. Open access database that is medically actionable? It doesn't matter if it's a non-profit, the FDA has jurisdiction and they will absolutely refuse an open access database that is the basis for any medical decision. Helix and 23andMe will still always be better from a medical perspective strictly because of the regulatory situation; they are closed access and controlled by the company, which means they control the standards of data and more importantly there's one belly button to push when something goes wrong. Who's responsible if someone makes an improper medical decision on an open access journal if the data was based on the input from multiple sources? Open access like this reduces accountability, and the FDA will never allow it.

  9. We need a comprehensive open genomic database by Anonymous Coward · · Score: 0

    Once we have full genome sequences for enough species (even different members of the same species with some known differences tagged per specimen) we can create an algorithm to figure out how to write DNA. Synthetic biology will go from copying-and-pasting pseudorandom fragments of code that likely do what we want to things like a seed that will grow into a house or an amoeba that will turn into a leviathan-style spacecraft if you give it an asteroid to eat. Our ability to exploit the universe will multiply exponentially once we have a way to code novel organisms from scratch like we might a computer.

  10. Excellent idea. by Anonymous Coward · · Score: 0

    It was only a matter of time before the SJW-ification in higher education and research departments spread from the liberal arts 'research' over toward scientific research.

    When we start getting scientific journals about how certain chemical compounds are misogynistic published and then pass peer review, we will definitely need access to the data in order to reach our own (more sane) conclusions.

  11. Whenever profits are based on false information by AutodidactLabrat · · Score: 1

    the usual checks on falsehood and self delusion simply break down
    Everyone in the industry of genetic analysis and gene prediction has a hand out for a dollar, the temptations are huge and the self checking is a joke
    Time for a common database? Sure
    Also time for some civil penalties for those who sell bogus data

    1. Re:Whenever profits are based on false information by Anonymous Coward · · Score: 0

      Most studies start with an agenda, in spite of any philosophical tenets claimed, such as say - the scientific method.

      So, the dearth of databases? Just explained.