Slashdot Mirror


Human Genome Contaminated With Mycoplasma DNA

KentuckyFC writes "The published human genome is contaminated with DNA sequences from mycoplasma bacteria, according to bioinformatics researchers who blame an epidemic of mycoplasma contamination in molecular biology labs around the world. The researchers say they've also found mycoplasma DNA in two commercially available human DNA chips made by biotech companies for measuring levels of human gene expression. So anybody using these chips to measure human gene expression is also unknowingly measuring mycoplasma gene expression too. The mycoplasma genes are clearly successful in reproducing themselves in silico raising the possibility that we're seeing the beginnings of an entirely new kind of landscape of infection. One option to combat this kind of virtual infection is to protect databases with the genomic version of antivirus software, a kind of virtual immune system. But this in itself could make things worse by triggering an evolutionary arms race that selects genes most capable of beating the safeguards."


  1. Intelligent design... by Anonymous Coward · · Score: 2, Funny

    ...clearly uses DRM.

    1. Re:Intelligent design... by NoNonAlphaCharsHere · · Score: 1

      There's NOTHING intelligent about DRM.

    2. Re:Intelligent design... by Anonymous Coward · · Score: 1

      S'all right, there's nothing intelligent about intelligent design, either.

  2. in silico by taktoa · · Score: 1

    Haven't heard that one before...

    1. Re:in silico by stillnotelf · · Score: 4, Informative

      Experiments done in a cell are in vivo (life); experiments done in a test tube are in vitro (glass); experiments done in a computer are in silico (silicon computer chips). In silico is used to describe computational modeling experiments (think FoldIt or Rosetta@Home), or manipulation and searching of large DNA/RNA/protein sequence databases. You'd expect it to apply to stuff like weather modeling or nuclear physics, but there the analogy to vivo/vitro is lost so I believe those fields don't use the term.

    2. Re:in silico by lennier · · Score: 1

      Haven't heard that one before...

      Not an Australian then, eh mate?

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    3. Re:in silico by Mindcontrolled · · Score: 1

      The biophysics prof overseeing my PhD work was adamant though, that one should call it in silicio. "In silico" is bad latin, and if something pissed him off, it was bad latin...

      --
      Ubi solitudinem faciunt, pacem appellant.
    4. Re:in silico by Mattcelt · · Score: 1

      All right, so what is the correct Latin term?

    5. Re:in silico by stillnotelf · · Score: 1

      In silico is the correct term used for communicating science in English. Latin being dead, there's nobody to rule on "correctness", but Wikipedia explains the fight over terms: http://en.wikipedia.org/wiki/In_silico#In_silico_versus_in_silicio

  3. And if they could clone humans using this DNA.. by cyberchondriac · · Score: 1

    would they wind up with Swamp Thing?

    --

    Look back up at my post, now look back down, you're on the Internet. Now look back up. I'm a signature.
    1. Re:And if they could clone humans using this DNA.. by lennier · · Score: 2

      Swamp Thing
      You make my heart sing
      You make everything
      Squelchy

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
  4. That's not good by YodasEvilTwin · · Score: 2

    I wonder exactly how far medicine has been set back by this. Researchers trying to investigate genetic or genetically-influenced diseases cannot be happy. They've unwittingly been missing information, and treating false information as valid, for a long time.

    1. Re:That's not good by jd · · Score: 2

      That's a good point, as most companies rely on multiple studies to verify if a mutation applies or not. There is no test to see if the studies are using the same hardware, as far as I know, which means that identical results can be a result of identical database errors.

      It also creates problems for things like the 1000 Genomes Project. How many of the thousands (they're already well over the 1000 mark) will have to be retested in order to be able to reliably subtract out the contamination?

      It's not limited to genetic disease, however. Archaeological DNA results will be of questionable value. How do you know what is actually Neandertal or Denisovan DNA? And with the volume of material being extremely limited (we have one fingerbone for Denisovan DNA and a handful of teeth as the source for Neandertal data), retesting isn't really a serious option. From a scientific perspective, this is the more serious problem as finding people with Parkinson's or with Alzheimer's is easier than finding new Neandertal remains. It'll also be important to subtract out the duplicated errors from the DNA that appears to be in common between species, so all claims of a genetic link between humans and Neandertals (for example) have to be put on hold until the scientists can confirm how much of the contamination is being falsely read as duplication.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    2. Re:That's not good by RDW · · Score: 1

      'I wonder exactly how far medicine has been set back by this.'

      No measurable distance. The summary is rather misleading and the arXiv article isn't exactly clear, either. The previous letter they reference (and co-author) is clearer:

      http://www.biotechniques.com/BiotechniquesJournal/2009/December/Letter-to-the-editor-Unexpected-presence-of-mycoplasma-probes-on-human-microarrays/biotechniques-181035.html?service=print

      Basically a couple of tiny fragments have been found in the public DNA sequence databases that were misclassified as human, presumably because they were derived from cDNA 'libraries' constructed from cells contaminated with mycoplasma. These sequences are NOT part of the reference human genome sequence. They exist only as small independent files supposedly representing fragments of genes expressed in human cells:

      http://www.ncbi.nlm.nih.gov/nuccore/af241217
      http://www.ncbi.nlm.nih.gov/nucest/DA466599

      but really mapping to mycoplasma sequences. Unfortunately a commercial provider used one of these files in the (semi-automated) design of a microarray. A probe for a sequence like this (one of many thousands on the array) would generally be harmless, since it does not detect a human sequence. Rather embarrassingly, it did appear to give a positive result in some publicly available data sets from researchers who used the array. This suggests that the cells used by these researchers were also contaminated with mycoplasma. So the problem isn't so much an insidious 'in silico' contamination of databases by bug sequences, but rather the actual contamination of cell cultures by mycoplasma, which suggests sloppy lab technique and a lack of routine testing.

    3. Re:That's not good by YodasEvilTwin · · Score: 1

      Thanks for the clarification!

  5. Data vs executable by Dan+East · · Score: 5, Insightful

    But this in itself could make things worse by triggering an evolutionary arms race that selects genes most capable of beating the safeguards.

    Why is the word "evolutionary" used here? We're talking about static data that is not "executed" - it does not reproduce, it is only copied verbatim. Invalid data that bypasses filters ("antivirus software") is simply that - corrupt, invalid data that does not belong, but at least there will be less of it after filtering. That doesn't make the data somehow more powerful or adaptive - the filter merely missed it. The key fact is the data does not get to modify itself in an iterative fashion in order to survive or improve.

    --
    Better known as 318230.
    1. Re:Data vs executable by x6060 · · Score: 1

      It's not that easy though. Bacteria, viruses, and your DNA will accumulate mutations. It can happen on a cell by cell level. That's part of the cause of some cancers.

    2. Re:Data vs executable by Richard+Dick+Head · · Score: 3, Informative

      The topic is not about "vestigial" DNA.

      TFA talks about bacteria being mixed in with human samples accidentally, then sequenced. The bacterial DNA shows up with the human DNA, and the bacterial DNA is being documented as human.

    3. Re:Data vs executable by Ruke · · Score: 1

      Exactly; the "fitter," undetected DNA has no opportunity to reproduce and pass on its traits; we're simply culling members from a static population as they present themselves. You could argue that the population isn't exactly static: new genes are being sequenced and inserted into the database; therefore "fitter" DNA will squirm its way into the databases more frequently. However, we definitely won't be seeing any "evolutionary" arms race - the database entries have no effect on the biological populations of bacteria living in labs, so there is no pressure to be any more or less undetectable.

    4. Re:Data vs executable by Krackbaby · · Score: 1

      Is it really static? You assume researchers aren't going to try to use this raw data to generate any actual end product. I wouldn't make that assumption.

      See Craig Venter's latest attempts at synthetic life, "Mycoplasma laboratorium".

  6. Antivirus *and* Antibacterial? by teebob21 · · Score: 4, Funny

    At first I was relieved that this was a bacteria infecting silicon. Now I'm concerned: When will Avast release an Antibacterial beta? I'm still running Windows, folks! I know I'm vulnerable to this!!!

    --
    khasim (12/9/06): In a blind taste test, more people preferred Coke over the Pepsi that I had previously pissed in.
    1. Re:Antivirus *and* Antibacterial? by lennier · · Score: 2

      When will Avast release an Antibacterial beta?

      Well, since a computer virus just injects code into an already-existing hardware processor, I guess a computer bacteria would have to carry around its own little itsy-bitsy mini-PC on little ambulatory robot legs, eat power from sockets where they can find it, and reproduce by splitting down the middle into two extra widdle bran-new baby mini-PCs.

      Truly an insidious force. They'd infect the entire world through their sheer power of cuteness.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
  7. No by kilraid · · Score: 1

    No resources are freed for any future generations of database contaminants to breed on by filtering. And, the notion of evolution would also require changes to the contaminants, which don't really happen. So by all means, filter. It will leave harder-to-detect contaminants there, but they won't become more numerous.
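
    A toy sketch of that argument, under loudly stated assumptions: the fragment names and "detectability" scores below are made up, and each pass simply uses a stricter cutoff than the last. Nothing in the loop copies or mutates an entry, so the surviving set can only shrink.

```python
def screen(entries, threshold):
    """Keep only the entries the filter fails to flag (detectability below threshold)."""
    return {name: score for name, score in entries.items() if score < threshold}

# Hypothetical contaminant fragments with invented detectability scores.
contaminants = {"frag_a": 0.9, "frag_b": 0.6, "frag_c": 0.3, "frag_d": 0.1}

history = [contaminants]
for threshold in (0.8, 0.5, 0.2):  # each pass uses a stricter filter
    history.append(screen(history[-1], threshold))

for generation in history:
    print(len(generation), sorted(generation))
```

    What is left after each pass is harder to detect on average, but it is always a subset of what was there before. There is no step where a survivor gets to reproduce.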

  8. Proof title please by Anonymous Coward · · Score: 1

    Is the human genome contaminated or is it the published sequence that is contaminated?

    If only the latter, fix the title.

    1. Re:Proof title please by stillnotelf · · Score: 4, Interesting

      The human genome is surely highly contaminated, just not with mycoplasma. Endogenous retroviruses, retrotransposons, repetitive elements galore, on the other hand...

  9. Mod Parent Up by Anonymous Coward · · Score: 1

    This was my first thought when reading the summary. There is no source of feedback to make the data filtering mechanisms loop into the evolutionary design of the bacteria involved unless (and this is incredibly out there) everyone starts applying methods to kill the bacteria based only on the amount of corrupt data they scan in. Then *maybe* it would be possible for those bacteria with lesser differences to make their way into the system while the rest die out, but that is incredibly unlikely given their otherwise blind nature to the whole process and the fact that it would impose evolutionary constraints in direct contradiction to what makes them able to survive. It's not absolutely impossible, but detracting from the actual story here with such nonsensical hype would be like suggesting the same evolutionary arms race in the same situation because Darth Vader were to rise out of the earth's molten core to proclaim dominance over the moon, leaving some new bacteria behind on his way. It doesn't make any fucking sense.

    1. Re:Mod Parent Up by vegiVamp · · Score: 1

      In other words, another typical Slashdot summary.

      --
      What a depressingly stupid machine.
  10. Re:Wow that's a relief! by ColdWetDog · · Score: 1

    This is really unfortunate if you intend to reproduce, as all Eukaryots pass through flagellum stage at a point of the life cycle.

    Just exactly where did you go to school? Have you considered trying it again?

    --
    Faster! Faster! Faster would be better!
  11. Is the submitter brain fryed ? by JonySuede · · Score: 4, Insightful

    that part is nonsensical:

    The mycoplasma genes are clearly successful in reproducing themselves in silico raising the possibility that we're seeing the beginnings of an entirely new kind of landscape of infection. One option to combat this kind of virtual infection is to protect databases with the genomic version of antivirus software, a kind of virtual immune system. But this in itself could make things worse by triggering an evolutionary arms race that selects genes most capable of beating the safeguards.

    static data don't evolve

    --
    Jehovah be praised, Oracle was not selected
    1. Re:Is the submitter brain fryed ? by pz · · Score: 1

      that part is nonsensical:

      The mycoplasma genes are clearly successful in reproducing themselves in silico raising the possibility that we're seeing the beginnings of an entirely new kind of landscape of infection. One option to combat this kind of virtual infection is to protect databases with the genomic version of antivirus software, a kind of virtual immune system. But this in itself could make things worse by triggering an evolutionary arms race that selects genes most capable of beating the safeguards.

      static data don't evolve

      The original poster was engaging in self-indulgent free association.

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
    2. Re:Is the submitter brain fryed ? by Ruke · · Score: 1

      So what you're saying is that, once we start cleaning garbage data out of the database, we will skip over data that is harder to detect more often than we will skip over data that is easier to detect? I'm not quite sure that this counts as "evolution"; evolution implies adaptation to the environment, and there is no indication that any change or adaptation is going to occur here at all.

      It's kind of like saying that buttons are evolving to populate photographs of my lawn; I'm more likely to bend over and pick up a red button than a green one because it's easier to see, and there is a kind of duplication in the distribution of photographs, but the buttons themselves are completely static.

    3. Re:Is the submitter brain fryed ? by lennier · · Score: 1

      static data don't evolve

      Nothing in the real world is truly static over time. You think your /etc config files are static data? Ever done a series of in-place system upgrades?

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    4. Re:Is the submitter brain fryed ? by snowgirl · · Score: 1

      static data don't evolve

      Nothing in the real world is truly static over time. You think your /etc config files are static data? Ever done a series of in-place system upgrades?

      I installed my router, and then applied the system immutable flag to all of my /etc directory. So, my /etc data has been static for 10 years! ...

      and has been hacked 42 times...

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    5. Re:Is the submitter brain fryed ? by JonySuede · · Score: 1

      well the pictures of Lenna taken years ago sure did not change. Some things are to be considered static at the human scale.

      --
      Jehovah be praised, Oracle was not selected
    6. Re:Is the submitter brain fryed ? by vegiVamp · · Score: 1

      Is that the newest euphemism for being stupid? This *is* a Slashdot editor we're talking about.

      --
      What a depressingly stupid machine.
  12. Why not repeat the genome sequencing? by quax · · Score: 2

    My understanding is that nowadays new high-speed sequencing machines can get an entire human genome processed in a couple of months.

    So I would think that after a couple of independent runs one should be able to flush out the non-human DNA, assuming the same bacterial contamination is not ever present?

    Obviously this is not a cheap endeavor but given that there is quite a bit of commercial interest in using correct human genome data this seems to me to be a worthwhile investment.

    I find it puzzling that the abstract of the article does not allude to this.

    1. Re:Why not repeat the genome sequencing? by juggledean · · Score: 2

      From the abstract

      "We ... suggest there is a need to clean up genomic databases but fear current tools will be inadequate to catch genes which have jumped the silicon barrier. "

      http://arxiv.org/abs/1106.4192

    2. Re:Why not repeat the genome sequencing? by quax · · Score: 1

      I read this as cleaning up an already corrupted database. Hence my question why you don't go back to the source? Preferably repeatedly and independently to have a better statistic in separating "noise" i.e. Mycoplasma from "signal" i.e. human genome.
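
      A minimal sketch of the "repeat independently and keep the signal" idea, assuming each run can be reduced to a set of reported sequences. The sequences and the `consensus` helper are invented for illustration; real assemblies are nothing like this tidy.

```python
from collections import Counter

def consensus(runs, min_support):
    """Return the sequences reported by at least min_support independent runs."""
    counts = Counter(seq for run in runs for seq in set(run))
    return {seq for seq, n in counts.items() if n >= min_support}

# Made-up data: three runs share the same "signal", two carry different "noise".
runs = [
    {"ACGTAC", "TTGACA", "GGCCTA", "AAATTT"},  # AAATTT: stand-in contaminant
    {"ACGTAC", "TTGACA", "GGCCTA"},
    {"ACGTAC", "TTGACA", "GGCCTA", "CCCGGG"},  # CCCGGG: a different contaminant
]

kept = consensus(runs, min_support=3)
print(sorted(kept))
```

      This only helps when the contamination differs from run to run. A contaminant present in every run gets the same support as the real signal and survives the vote.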

    3. Re:Why not repeat the genome sequencing? by quax · · Score: 1

      if the techniques used don't account for mycoplasma contamination then all the samples will be affected.

      If indeed the contamination is that common it'll amount to what would be considered a systematic error in my field (Disclaimer: my academic background is physics).

      Usually this means going back to the drawing board and figuring out how to collect clean data unless you're lucky and can exactly isolate how the data was skewed and it doesn't add too much noise. In that case you can still filter out the information you're after.

      Since I am not a biologist I can only guess that this implies that you need to know the entire possible parameter space of the mycoplasma genome (I assume there is more than one variant). Even then as you point out I don't see how it's possible to control for not erroneously removing human genes if we share some proteins with this critter.

    4. Re:Why not repeat the genome sequencing? by juggledean · · Score: 1

      Well, my first response is, feel free to try it.

      But remember the source material is one individual's genetic material. I believe in the original study they repeated the chemistry many times to be sure the findings were consistent. Assuming you can get this individual to give you some DNA, why do you think it won't be contaminated as well? Remember that there are a large number of genes that have not been associated with some function. Personally I think it is more important to figure out what the proteins are doing and how they work together than worry about a 1% error in the bookkeeping.

    5. Re:Why not repeat the genome sequencing? by quax · · Score: 1

      I am in no position to judge the biological relevance of this 1% error.

      But I am also puzzled why the focus is on one individual's DNA. Wouldn't it make more sense to work with samples of several individuals in order to throw out the - presumably minute - individual variances? I would expect the latter not to be very helpful for medical research.

    6. Re:Why not repeat the genome sequencing? by juggledean · · Score: 1

      The variances are what makes us different, one from another. Medical research is very interested in why some people get diseases and others do not. The 1000 Genomes Project was announced in 2008 and finished its pilot study last year http://en.wikipedia.org/wiki/1000_Genomes_Project

    7. Re:Why not repeat the genome sequencing? by quax · · Score: 1

      Fair enough, but wouldn't this be an even better argument for sampling from various individuals? Otherwise how are you going to determine where the deltas are?

  13. No feedback mechanism by Vornzog · · Score: 3, Insightful

    How in the world will setting filters on a database put a bacterium in a lab halfway around the world at an evolutionary disadvantage? The bacteria will still grow, contaminate the sample, and get sequenced, but the sequence will be rejected. There is no feedback mechanism here, no selective pressure.

    Genome sequence assembly is pretty far removed from the milieu in which a bacterium must make its way. And inadvertently including bacterial sequences on a gene expression chip is sloppy science, but hardly news.

    Traditional computer viruses are the only things that truly 'reproduce' in silico. Memes are your next best option, but the 'net is just a carrier - they have to infect a human host to reproduce. Stay away from 4chan if you want to avoid infection...

    But bacteria? In silico? Where are we going with this strained analogy, anyway?

    --

    -V-

    Who can decide a priori? Nobody.
    -Sartre

    1. Re:No feedback mechanism by wikdwarlock · · Score: 1

      The "will be rejected" part, I think, is where the issue comes in. Rejected based on what? Comparison to a known good database is demonstrably suspect and is in fact the main point of TFA. If I can find 90% of the non-human DNA corruption in the database and delete it, that now cleaned database becomes the standard. The other 10% of non-human DNA that wasn't caught in the database is now even more vetted, more certified, and less easily detected and deleted by the same database scanning algorithm. Thus, a new and better (i.e. evolved) algorithm is created and some new percentage of the non-human data is cleaned out, but the stuff that's left this time is even more well hidden (i.e. evolved, more similar to human DNA) and suspected more to be authentically human.

      If there's ever use of the data, transcribing it back into actual viable DNA molecules, these newly manufactured DNA could presumably be checked in molecular biology labs for purity/accuracy and picked up by their already happily-infecting-the-lab cousins and the loop is closed. Now granted, the chances are that such a mutation for digital resilience is unlikely to be beneficial in the wider universe where the bacteria lives, but it's a numbers game and it could be helpful or neutral. Or, if the digital scanning algorithms are based on techniques inspired by nature (perhaps the bacteria is attacked by a virus and using that virus' DNA as a search pattern could improve algorithm performance), the bacterial DNA can beat the digital implementation and have a successful mutation that need only get transcribed into actual viable DNA and infect a lab somewhere and the loop is closed again.

      --

      "I must not fear. Fear is the mind killer." -Bene Gesserit Litany Against Fear
    2. Re:No feedback mechanism by drooling-dog · · Score: 1

      The "will be rejected" part, I think, is where the issue comes in. Rejected based on what?

      Ummm, maybe on known viral/bacterial/mycoplasmal sequences? It's pretty much routine when you're assembling a genome, and it's not hard to screen a database retrospectively as new contaminating genomes are discovered.

      As for sequence data mutating and evolving in silico in genome databases (if that's what people are saying here; I can't be sure), well... That might be a good plot for a SciFi novel, but not one that would seem credible to any biologist.
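
      For the curious, screening against known contaminant sequences can be sketched roughly like this. It is a toy under stated assumptions: the sequences are invented, `k=8` and `cutoff=0.5` are arbitrary, and real pipelines use proper alignment tools rather than exact k-mer matches.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def contaminant_fraction(query, contaminant_refs, k=8):
    """Fraction of the query's k-mers that also occur in any contaminant reference."""
    ref_kmers = set()
    for ref in contaminant_refs:
        ref_kmers |= kmers(ref, k)
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & ref_kmers) / len(query_kmers)

def screen_database(entries, contaminant_refs, k=8, cutoff=0.5):
    """Split entries into (clean, flagged) by shared k-mer fraction."""
    clean, flagged = {}, {}
    for name, seq in entries.items():
        hit = contaminant_fraction(seq, contaminant_refs, k) >= cutoff
        (flagged if hit else clean)[name] = seq
    return clean, flagged

# Invented sequences: one stand-in contaminant genome and two database entries.
mycoplasma_like = "ATGAAAGTTTTAGCAGGTATTAACCCAGAT"
entries = {
    "entry_1": "GGCTCCAGGAGCTGTACCGAGCCT",  # shares no 8-mer with the reference
    "entry_2": mycoplasma_like[4:28],        # lifted straight from the reference
}

clean, flagged = screen_database(entries, [mycoplasma_like])
print("clean:", sorted(clean), "flagged:", sorted(flagged))
```

      Re-screening retrospectively just means calling `screen_database` again with a longer list of references whenever a new contaminating genome turns up.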

  14. Re:In silico? by NoNonAlphaCharsHere · · Score: 4, Informative

    No, they're not saying that the mold itself is appearing in the chips, just that the mold's DNA is. Therefore, the presence (or absence) of the mold in a sample would skew the results when using these chips. And yes, saying that the DNA appears "in silico" is perfectly valid here - whether you care for the term or not.

  15. Best to wait by Anonymous Coward · · Score: 1

    For genome 3.1.

  16. A sperm has a flagellum by tepples · · Score: 1

    All humans save three have passed through flagellum stage. Their names were Adam, Eve, and Jesus.

  17. Re:Wow that's a relief! by bcmm · · Score: 1

    This is Slashdot. You almost certainly have flagella.

    --
    # cat /dev/mem | strings | grep -i llama
    Damn, my RAM is full of llamas.
  18. Misidentification by Caerdwyn · · Score: 1

    So to what extent does this "epidemic of mycoplasma contamination" increase the potential for false-positives on DNA matching tests, such as used in criminal investigation or paternity cases? Does a given lab or lab-equipment manufacturer have a common strain of contamination which increases the number of "always match" markers above the threshold defined for claiming a match?

    --
    Everybody gets what the majority deserves.
    1. Re:Misidentification by stillnotelf · · Score: 1

      This has little to no relevance for DNA matching tests. Those tests do not match specific sequences, they usually match lengths of repeats in repetitive elements - elements that are unlikely to have been drawn from mycoplasma (because they don't have them!) http://en.wikipedia.org/wiki/DNA_profiling
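
      A rough sketch of what "matching lengths of repeats" means, assuming each locus can be treated as a plain string with one repeated motif. The locus names, motifs and samples are made up, and real profiling measures fragment lengths in the lab and deals with two alleles per locus, which this ignores.

```python
import re

def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

def str_profile(sample, motifs):
    """Map each locus name to the repeat count observed in the sample."""
    return {locus: longest_repeat_run(sample[locus], motif)
            for locus, motif in motifs.items()}

# Hypothetical loci and samples, for illustration only.
motifs = {"locus_1": "GATA", "locus_2": "TCTA"}
sample_a = {"locus_1": "CC" + "GATA" * 7 + "TT", "locus_2": "AA" + "TCTA" * 11 + "GG"}
sample_b = {"locus_1": "CC" + "GATA" * 7 + "TT", "locus_2": "AA" + "TCTA" * 9 + "GG"}

profile_a = str_profile(sample_a, motifs)
profile_b = str_profile(sample_b, motifs)
print(profile_a, profile_b, profile_a == profile_b)
```

      The comparison is on repeat counts at each locus, not on the bases themselves, so a stray sequence with no such repeats simply scores zero rather than producing a match.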

    2. Re:Misidentification by Caerdwyn · · Score: 1

      Ok, thanks. Here in the San Francisco Bay Area lately there has been scandal after scandal concerning sloppy forensics lab operation, theft of evidence, and police departments conspiring to hide histories of police officer misconduct from defense teams. This would have been just one more nail in the coffin.

      --
      Everybody gets what the majority deserves.
  19. Nature is teh ninja haxoarz by theCat · · Score: 1

    Who knew?

    --
    =^..^= all your rodent are belong to us
  20. The Immortal HeLa cell by Latinhypercube · · Score: 2, Interesting

    This is EXACTLY what happened in 70s-80s with Henrietta Lacks IMMORTAL 'HeLa' cell. http://en.wikipedia.org/wiki/HeLa
    Her cells were the first Human cells to grow outside the human body.
    In fact they were so successful, that unbeknown to scientists ALL OVER THE WORLD, her cells had TAKEN OVER all of the cells in their laboratories GLOBALLY.
    There is an amazing BBC documentary on this by Adam Curtis called "Modern Times: The Way of All Flesh"
    wiki quote " Contamination: Because of their adaptation to growth in tissue culture plates, HeLa cells are sometimes difficult to control. They have proven to be a persistent laboratory "weed" that contaminates other cell cultures in the same laboratory, interfering with biological research and forcing researchers to declare many results invalid. The degree of HeLa cell contamination among other cell types is unknown because few researchers test the identity or purity of already-established cell lines. It has been demonstrated that a substantial fraction of in vitro cell lines — approximately 10%, maybe 20% — are contaminated with HeLa cells"
    Almost created a COLD WAR incident:
    wiki quote:-"The USSR and the USA had begun to cooperate in the war on cancer launched by President Richard Nixon only to find that the exchanged cells were contaminated by HeLa"

  21. ...and the results are in... by erroneus · · Score: 1

    In the case of little Jeffery, Mycoplasma, you ARE the father!

    1. Re:...and the results are in... by lennier · · Score: 2

      In the case of little Jeffery, Mycoplasma, you ARE the father!

      Join me, and together we can rule the upper right nasal cavity!

      Noooo! I'll never exchange plasmids with you! E Coli, why didn't you tell me?

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
  22. Evolution by Hylandr · · Score: 1

    So it would seem Evolution is favoring hackers, that breed well...

    - Dan.

    --
    ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
  23. Summary is contaminated with random science jargon by Anonymous Coward · · Score: 5, Insightful

    As a career microbiologist and bioinformatics geek, the complete and utter scientific inaccuracy of this summary made me want to cry.

    The mycoplasma genes are clearly successful in reproducing themselves in silico raising the possibility that we're seeing the beginnings of an entirely new kind of landscape of infection. One option to combat this kind of virtual infection is to protect databases with the genomic version of antivirus software, a kind of virtual immune system. But this in itself could make things worse by triggering an evolutionary arms race that selects genes most capable of beating the safeguards

    Mycoplasma is a common contaminant of many human cell culture lines. It is often present in low counts, and is a relatively slow growing organism. This is a problem, because many of the immortal cell lines are passed serially, meaning that the mycoplasma propagates right along with it. Most labs that perform cell culture now do routine PCR testing for mycoplasma markers as a quality control measure.

    When it comes to sequencing, and in particular, high-throughput next generation sequencing (Illumina/454/SOLiD/PacBio/whatever), you are shotgun sequencing all of the DNA in a given sample extract. This means that if you had a bunch of human cells that happened to be contaminated with low counts of mycoplasma, those mycoplasma sequences would be present to some extent in your final sequencing project. Whether this would factor into the final assembly or just get thrown out depends on the quality control, the experience of the bioinformatics team, and the assembly software pipeline. I am willing to bet that most issues with mycoplasma contamination were during the "formative" years of high-throughput sequencing, but may have lingered in databases. These databases might in turn be used by commercial companies that build microarrays or other high-density tools, so it's feasible that some mycoplasma sequence carried over.
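
    To make the "present to some extent" point concrete, here is a toy simulation. Everything in it is an assumption for illustration: the genomes are random strings, the sizes are tiny, and reads are drawn uniformly from the pooled DNA, which real library prep only approximates.

```python
import random

def shotgun_reads(genomes, n_reads, read_len, seed=0):
    """Draw reads uniformly from the pooled DNA; return a list of (source, read)."""
    rng = random.Random(seed)
    # Every start position in the pooled extract is equally likely, so each
    # source contributes reads in proportion to how much DNA it has.
    starts = [(name, i)
              for name, seq in genomes.items()
              for i in range(len(seq) - read_len + 1)]
    picks = [rng.choice(starts) for _ in range(n_reads)]
    return [(name, genomes[name][i:i + read_len]) for name, i in picks]

# Random stand-in genomes: lots of "human" DNA, a little "mycoplasma" DNA.
rng = random.Random(1)
genomes = {
    "human": "".join(rng.choice("ACGT") for _ in range(10_000)),
    "mycoplasma": "".join(rng.choice("ACGT") for _ in range(200)),
}

reads = shotgun_reads(genomes, n_reads=5_000, read_len=20)
contaminant_reads = sum(1 for source, _ in reads if source == "mycoplasma")
print(f"{contaminant_reads} of {len(reads)} reads came from the contaminant")
```

    The sequencer has no idea which organism a fragment came from. A small share of the extract becomes a small share of the reads, and it is up to the downstream pipeline to notice.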

    Is this relevant? Probably not. On a microarray, it would most likely be wasted space (eg: always negative during gene expression studies... unless the patient had a mycoplasma infection or something). Furthermore, a simple analysis of the sequence would help to rule out sequences that were clearly prokaryotic.

    "In silico" does not mean what you think it means. In fact, this whole bit about in-silico replication and arms races is complete and utter nonsense. In-silico biology usually refers to biocomputing. Eg: analyzing, manipulating and simulating gene/protein sequences, expression, signalling cascades, and the like on a computer system. It does not apply to mycoplasma sequences running around all namby-pamby causing infections that would require some sort of anti-virus software. What they might be alluding to is the fact that a lot of shotgun sequencing libraries are run, as needed, through a vector screen, which is designed to pull out irrelevant sequences that may have been necessarily introduced during cloning or sequencing. Plasmids, cosmids, whatever. These algorithms may need better tuning to do a better job of ruling out mycoplasma in human sequences, but there's no danger of these mycoplasma sequences replicating and taking over the world.

    Unless you happen to be William Gibson.

  24. horrible language by Taibhsear · · Score: 5, Informative

    This article was horribly written. They go between using terms with their literal meaning and using terms in metaphorical creative language but do not differentiate between the two using context at all. It's an incredibly confusing read. Actual ancestral human DNA is not contaminated with actual mycoplasma DNA sequences.

    Here's what I gather is going on:
    Researchers took a sample of human DNA and sequenced it, but the sample was contaminated with DNA from mycoplasma (possibly from bacteria in the lab or on the researchers themselves). The resulting data was assumed to be a representation of pure human DNA (which would be incorrect). Other researchers then use this data set as a reference to compare against other human DNA samples they sequenced themselves. They use this to test gene expression and so forth. So if their DNA samples show gene expression for mycoplasma they would incorrectly think it was normal human gene expression. What they did is use software to strip the mycoplasma DNA data from the original data set (that had both human and mycoplasma DNA sequences) to only use the actual human DNA data as a reference. The biological contamination was first in the original sample that was tested, and then the contamination referred to elsewhere is computational data "contamination." This is the software they are referring to as antivirus software and virtual immune system (which isn't antivirus software or similar to a biological immune system, it's DNA data filtering software).
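
    To make "DNA data filtering software" concrete, here is a minimal sketch of one way such a filter could work. It is not the method used in the paper or by any real pipeline: the function names, the k-mer size and the threshold are all assumptions chosen for illustration, and it ignores reverse complements and sequencing errors.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def contaminant_fraction(read, contaminant_kmers, k):
    """Fraction of the read's distinct k-mers that also occur in the contaminant set."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & contaminant_kmers) / len(read_kmers)


def screen_reads(reads, contaminant_refs, k=8, threshold=0.5):
    """Split reads into (kept, flagged) by k-mer overlap with known contaminant sequences.

    k and threshold are hypothetical defaults, not values from any real tool.
    """
    contaminant_kmers = set()
    for ref in contaminant_refs:
        contaminant_kmers |= kmers(ref, k)

    kept, flagged = [], []
    for read in reads:
        if contaminant_fraction(read, contaminant_kmers, k) >= threshold:
            flagged.append(read)
        else:
            kept.append(read)
    return kept, flagged
```

    In practice you would run something like this against known mycoplasma genomes before the reads ever reach the reference database. Nothing about it resembles an immune system; it is just set intersection.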

    These people really need to think about what they're trying to say before puking up jargon salad on the readers' brains.

    1. Re:horrible language by drooling-dog · · Score: 1

      It seems as if the metaphors (e.g., "virus") that computational science has borrowed from biology have come around full-circle, with the result that concepts from different fields are getting conflated with one another in bizarre ways. The reasoning seems to be: If data (in the form of a computer program) can replicate and spread to other machines, then perhaps DNA sequence data in genomic databases can perform similar biological feats like mutation, evolution, and transmission. This seems inane enough that it's likely me who is missing something, but it's a "something" that I should have been able to get from the article.

  25. Mindless drivel by Iron+(III)+Chloride · · Score: 4, Interesting

    I don't want to be excessively harsh but the summary was seriously a bunch of drivel. In silico either means it's data on the computer, or that you are simulating a biological process computationally. But as other posters have mentioned, unless you are purposely simulating evolution, mycoplasma sequences in your human databases aren't going to cause any "arms race." Yes, it seriously screws with validity, but that's a completely different issue.

    This is a generalization, and no offense to fellow Slashdotters, but in my experience most of the computer scientists that I've met have a really crappy understanding of even basic biology. CS concepts don't directly translate to biology ones.

    --
    Cogito, ergo sum, fosho!
    1. Re:Mindless drivel by poopdeville · · Score: 1

      I don't want to be excessively harsh but the summary was seriously a bunch of drivel. In silico either means it's data on the computer, or that you are simulating a biological process computationally. But as other posters have mentioned, unless you are purposely simulating evolution, mycoplasma sequences in your human databases aren't going to cause any "arms race." Yes, it seriously screws with validity, but that's a completely different issue.

      You're still missing the point.

      Methods to screen out junk contamination will all miss something. The data representation of a genome is reproduced, as a cost (and time) saving measure. In other words, the contamination that survives the screening process will "survive" as a silicon representation.

      This is a problem in the long term, since we will presumably be using the genomic data to eradicate diseases. So our use of contaminated data will select for diseases which cannot be screened.

      --
      After all, I am strangely colored.
    2. Re:Mindless drivel by Iron+(III)+Chloride · · Score: 1

      First off, the fact that we are continuing to resequence individual human genomes through projects like the 1000 Genomes Project (and attempting to do de novo assemblies, so we're not just relying on the HGP reference genome), together with articles calling out incidents like this one, makes it unlikely in my view that significant contamination will persist as research continues.

      Putting that aside, I fail to see how the usage of invalid DNA sequences in biomedical research, leading to problems with disease treatment as you've mentioned, will lead to selection of those genes in vivo in Mycoplasma. Sure, those efforts to research those diseases will be set back, but how will that confer selective advantage to Mycoplasma carrying genes containing those sequences unless Mycoplasma is directly involved in (or affected by) treating the disease in humans (which will definitely not be true in general)? That's the missing biological link. There has to be a mechanism by which direct benefit is conferred onto Mycoplasma as a downstream consequence of those contaminants being in our databases, but unless the invalid sequences are in genes that are involved in some Mycoplasma infection, Mycoplasma doesn't stand to gain anything at all.

      IAABIT [I am a biologist in training] and I personally work in genomics, so from my experience my guess of what likely got these sequences into the databases in the first place is a combination of their similarity to actual human sequence (despite the absence of homology), favorable sequence bias during the library generating process (the effects will be determined by what protocols you use), and possibly prevalence in the standard lab environment. None of these features are likely to confer evolutionary benefit to Mycoplasma "in the wild" under normal circumstances (i.e. the invalid sequences don't belong to genes involved in Mycoplasma infection), so I honestly don't see where this notion of the "arms race" is coming from.

      --
      Cogito, ergo sum, fosho!
  26. Re:Wow that's a relief! by jweller13 · · Score: 1

    Try self flagellation

  27. Fungi? by jweller13 · · Score: 1

    Myco means fungi right?

    1. Re:Fungi? by treeves · · Score: 1

      Yes. But this is about bacteria that happen to have the genus name Mycoplasma, not fungi.

      --
      ...the future crusty old bastards are already drinking the Kool-Aid.
  28. trigger evolutionary arms race? by hierophanta · · Score: 1

    i believe the evolutionary arms race was triggered a long time ago (possibly in a galaxy far far away). /. is so full of tripe today it's making me question my patronage

  29. Funny, guys. by imric · · Score: 1

    Were these the patented sequences?

    --
    Paranoia is a Survival Trait!
  30. Re:The Immortal HeLa cell / Jurassic Park by retroworks · · Score: 2

    And it is EXACTLY what happened in JURASSIC PARK! The frog cells allowed some of the female dinosaurs to mutate into MALE Dinosaurs! We COULD be looking at RAPTORS who live in SILICA, and WE will ALL be typing in ALL CAPS out of sheer FEAR!

    --
    Gently reply
  31. Re:In silico? by treeves · · Score: 1

    Bacterium, not mold. Mycoplasma is the genus name of this particular bacterium.

    --
    ...the future crusty old bastards are already drinking the Kool-Aid.
  32. This makes very little sense by rnaiguy · · Score: 1

    When you assemble a genome, you assemble the sequences into chromosomes based on overlap with other sequences. This contamination should not match up properly, or it should be assembled into its own "chromosome".
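
    A toy illustration of that overlap idea, as a rough sketch only: real assemblers tolerate mismatches and use far more sophisticated graph methods, and the function names and the minimum overlap length here are made up for the example.

```python
def overlap(a, b, min_len=4):
    """Length of the longest suffix of a that exactly matches a prefix of b.

    Returns 0 if no exact overlap of at least min_len exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def has_overlap_partner(read, others, min_len=4):
    """True if read overlaps (in either direction) with any sequence in others."""
    return any(
        overlap(read, other, min_len) or overlap(other, read, min_len)
        for other in others
    )
```

    A contaminating read with no overlap partner among the human reads has nothing to join onto, so it either drops out or ends up in a separate contig with other contaminant reads.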

    The whole "evolution" thing is the biggest sensationalist bullshit I've ever heard. Ignore it.

    As was mentioned in another comment, it seems like the summary is misleading on the "contamination" actually being in the genome sequence.

  33. sequence != expression by neurogeneticist · · Score: 1

    The terms for sequence data and expression data are not interchangeable. The U133 microarray is for RNA, yes RNA, expression data. RNA microarrays quantify the fold change difference in expression between different subjects. DNA microarrays identify polymorphisms or repeats or the like. While arrays like the U133 rely on sequence level data to create the array, this is not the same as saying that sequence-level data is contaminated. Bottom line, the fact that this is not the cover article for Nature|Genetics this month tells you a lot of the story, unless you are some sort of conspiracy theorist or want to get swept up in the usual slashdot "sky is falling" imperative.

  34. Re:Don't see how Natural Selection applies here by AlamedaStone · · Score: 1

    Well *I* laughed.

    --
    "All these years believing you're the signified monkey, only to find out you're just a big hunk of nobody cares."
  35. Bad summary of bad blog citing bad arXiv paper by morty_vikka · · Score: 1

    So the arXiv article (http://arxiv.org/abs/1106.4192v1) cited in TFA is a little more accurate than the already-much-panned summary. But the authors of the arXiv article still use the term 'virtual infection' which is very misleading at best.

    Basically there are a few (actually only two described in the paper) entries in one particular human genome database maintained at EMBL that appear to be mycoplasma-derived. Two out of 45,000 features on the Affymetrix Human U133 +2 oligonucleotide array, used to quantify mRNA levels in a given sample, appear to correspond to the mycoplasma sequences. So, at best, two of the genes you look at (out of 45,000) might be mycoplasma genes in that particular type of experiment.

    It's not a big deal, so that's why the work isn't published in a peer-reviewed journal.

  36. US Army Patent 5,242,820? by E.I.A · · Score: 1

    Seems there are other possibly related forms of this beast: http://www.nap.edu/openbook.php?record_id=11765&page=181

    --
    Laws are like sausages. It's better not to see them being made. - Otto von Bismarck
  37. Re:Summary is contaminated with random science jar by Mindcontrolled · · Score: 1

    True. In the end, mycoplasma is just another contributing factor to signal/noise in your dataset. It's completely illusory to assume that you get noiseless measurements given the amount of data involved.

    --
    Ubi solitudinem faciunt, pacem appellant.
  38. These Scientists by Antianging · · Score: 1

    These guys have a lot of crap to talk. Maybe they are very idle. http://www.laserqueen.com.au/injectables.html