To Respond To a Disease Outbreak, Bring In the Portable Genome Sequencers (ieee.org)
the_newsbeagle writes: Epidemiologists working on Zika virus could benefit from portable genome sequencers, like those used during the Ebola outbreak. In spring 2015, researchers conducted the first experiment in real-time genetic surveillance during an infectious disease epidemic. The researchers packed all their equipment in a couple of suitcases and set up a mobile lab in Guinea, where they used palm-sized sequencing devices to analyze viral RNA from 142 patients. Genomic data can illuminate the chains of transmission in an outbreak, and can help scientists develop diagnostics and vaccines.
Yeah, maybe we'll buy some of those sequencers soon, but right now we are working on mosquito control.
I'm an outsider, so I've just gotta be misunderstanding something. The Oxford Nanopore website seems to be claiming that you can sequence an analyte in real time, with a $1000 startup fee and $900 or less for a consumable... It uses a nanoscopic hole with an enzyme around it that ratchets a DNA strand through one nucleotide pair at a time, the whole time spitting out the results to your computer... I can't process this. How can it be this portable, simple, and cheap? How did we get so good at this stuff?
So? It's not like you're having to work with only a single strand of DNA. Unless the error is systematic you can sequence several dozen strands and use standard error-correcting algorithms to recreate the original sequence with fairly high confidence.
Or maybe they're already doing that and accuracy plateaus at 96%. Still, does it really matter? They're not trying to do genome-research class sequencing, they just need to identify the DNA strands of interest (which are probably way more than 4% different from any other ambient viruses) and identify the presence of mutations to trace the source of an infection, which probably have a 96% chance of being in the accurately-sequenced sections.
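A minimal sketch of the "sequence several strands and error-correct" idea, assuming the reads are already aligned to each other, are all the same length, and have errors that are random rather than systematic. Those are simplifying assumptions; the function name `consensus` and the toy reads are made up for illustration, and real pipelines do a lot more than a plain vote.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over reads that are pre-aligned and equal length."""
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")
    called = []
    # zip(*reads) walks the alignment one column (one base position) at a time.
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        called.append(base)
    return "".join(called)


# Toy data: five noisy reads of the same 10-base stretch (hypothetical).
reads = [
    "ACGTACGTAC",
    "ACGTTCGTAC",  # miscall at position 4
    "ACGAACGTAC",  # miscall at position 3
    "ACGTACGTAC",
    "CCGTACGTAC",  # miscall at position 0
]
print(consensus(reads))
```

Each miscall here sits at a different position, so it gets outvoted. If most reads made the same mistake at the same position, the vote would confirm the mistake instead.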
--- Most topics have many sides worth arguing, allow me to take one opposite you.
Yep, pretty impressive.
I wonder how long it will be until technology evolves to the point that it will be standard practice at the doctor's office for them to draw a little blood or biopsy as you walk in and have your entire microbiome gene-sequenced to identify every pathogen you're currently carrying before you've even gotten out of the waiting room. There will no doubt still be room for human judgment, but no more trying to guess at the problem based on symptoms and likelihoods and trying different treatments until something works. Just a printout listing identified pathogens and confidence levels, plus any unknown DNA that might be a new pathogen.
My guess is it won't be much longer at all.
So do it 3 times and merge the results. Boom: 4-Nines accuracy. Right?
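For what it's worth, the arithmetic behind "do it 3 times" can be sketched like this. It assumes each read is wrong at a given base independently with probability 0.04, and (pessimistically) that wrong reads always agree with each other. `majority_error` and `reads_for_target` are hypothetical helper names, not anything from a sequencing toolkit.

```python
from math import comb


def majority_error(n_reads, p_error):
    """P(majority vote is wrong) = P(more than half of the reads are wrong)."""
    need = n_reads // 2 + 1
    return sum(
        comb(n_reads, k) * p_error**k * (1 - p_error) ** (n_reads - k)
        for k in range(need, n_reads + 1)
    )


def reads_for_target(p_error, target_error, max_reads=99):
    """Smallest odd number of reads whose majority vote meets the target error."""
    for n in range(1, max_reads + 1, 2):
        if majority_error(n, p_error) <= target_error:
            return n
    return None


print(f"3 reads: per-base accuracy {1 - majority_error(3, 0.04):.4%}")
print("odd reads needed for 99.99%:", reads_for_target(0.04, 1e-4))
```

Under these assumptions three reads get you to roughly 99.5% per base, and it takes seven to get past four nines. None of it holds if the errors are correlated between reads.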
Maybe it's total tree-hugging tinfoilism. Maybe not.
http://www.theecologist.org/News/news_analysis/2987024/pandoras_box_how_gm_mosquitos_could_have_caused_brazils_microcephaly_disaster.html
I was on the design team for the MiSeq DNA sequencer at Illumina that can sequence 1 billion bases in one day, doing embedded systems/FPGA/control loop work. I no longer work there, but I think they've managed to increase throughput. This particular unit fits on a tabletop, and costs about $100K.
A story was related to me while working there about an outbreak in the intensive care unit in Cambridge, England, where 7 preemie infants got sick. With this instrument, they could see how the virus mutated on a room-by-room basis, and a day-by-day basis. It was apparently unprecedented. They had one of our instruments on an early trial basis to give feedback on its usage. The pathology department was pretty excited. This seems like a very useful kind of instrument when tracking the spread of diseases. I'd be curious about the adoption rates for such instruments in pathology labs, the CDC, etc. I understand that Illumina has made a push to have their instruments certified as a medical device, but I don't know the status of it. I'd like our labs to have all the tools they need to rapidly converge on the infectious agent, etc.
One important consideration for portable DNA sequencers is the read error rate of the DNA fragments (akin to the bit error rate in a length of magnetic tape). The higher the error rate, the more reads you have to take to reduce the probability of error to an acceptably small level. Even though some instruments on the market may be cheaper to run, you have to read a lot more samples to bring down the error statistics (the Q scores). Any portable instrument must have a low error rate, so that a small sample size is still meaningful. Also, the longer the read length of an individual strand, the better.
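For anyone who hasn't met Q scores: the usual Phred convention is Q = -10 * log10(p), where p is the probability that a base call is wrong. A small sketch of the conversion in both directions follows; the helper names are made up, and the 96% figure is the one quoted elsewhere in this thread.

```python
import math


def phred_from_accuracy(accuracy):
    """Phred quality score for a given per-base accuracy (0 <= accuracy < 1)."""
    p_error = 1.0 - accuracy
    if not 0.0 < p_error <= 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(p_error)


def accuracy_from_phred(q):
    """Per-base accuracy implied by a Phred quality score."""
    return 1.0 - 10.0 ** (-q / 10.0)


for acc in (0.96, 0.99, 0.999):
    print(f"accuracy {acc:.1%} -> Q{phred_from_accuracy(acc):.1f}")
print(f"Q30 -> accuracy {accuracy_from_phred(30):.2%}")
```

So 96% accuracy is about Q14, while Q30 means one error per thousand bases.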
DNA sequencing is sort of like taking a photograph, cutting it up into thousands of pieces, and reassembling it. The bigger the chunks, the more distinctive they are and the easier they are to fit into the larger puzzle; pieces that are too small, like bits of sky, aren't distinctive enough to see how they fit into the larger picture. I still don't think we've been able to completely sequence a human being, because the "sequencing-by-synthesis" method used by Illumina only uses relatively short strands of 100 base pairs (more if you do "paired-end" sequencing, which pushes it to 250+, though my knowledge is a few years old). There is some small percentage that they can't fit because it's not distinctive enough, and the DNA itself does not break apart uniformly: some areas are overrepresented and others are underrepresented.
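The puzzle-piece point can be shown on a toy example: count what fraction of length-k pieces of a sequence occur exactly once in it. The "genome" below is invented, with one 8-base repeat that appears twice, and `unique_fraction` is a hypothetical helper rather than a real assembler function.

```python
from collections import Counter


def unique_fraction(genome, k):
    """Fraction of length-k windows whose sequence occurs exactly once in genome."""
    windows = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(windows)
    return sum(1 for w in windows if counts[w] == 1) / len(windows)


# Toy sequence (made up): three distinct stretches separated by the same repeat.
REPEAT = "CCCCCCCC"
genome = "GATTACA" + REPEAT + "TGCATGA" + REPEAT + "AGTCTAG"

for k in (1, 3, 6, 12):
    print(f"k={k:2d}: {unique_fraction(genome, k):.2f} of pieces are unique")
```

Once a piece is longer than the repeat plus a little of its surroundings, every piece can be placed unambiguously. That is the argument for long reads.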
It's very cool in its portability and real-time operation. A traditional Illumina has higher throughput. They processed 1450 samples in 6 months (their peak rate was much higher). An Illumina can do many more samples in a single run, in batch. But you might not want to take it into the field, and your latency would be higher since you would accumulate samples until you had enough to justify one run. The cost of that run per sample would be less, but the cost of the batch run is more, which is why you wait. Another way this thing is superior is in read length (50 kbases), but they were only doing 2 kb read lengths, so they were not exploiting its killer advantage over the Illumina.
Some drink at the fountain of knowledge. Others just gargle.
> Base calling accuracy: up to 96%
Um, bullshit. See, this has been the problem with Oxford Nanopore since the beginning. They distract and confuse through a lot of misleading statements and media hype, which is why I can't trust any of their claims. The typical accuracy of single-pass 1D reads on real data is about 70%, about 80% on 2D reads. The 96% accuracy they are quoting on their site is after they error-correct the reads.
> Or maybe they're already doing that and accuracy plateaus at 96%
Yes, this is correct.
> They're not trying to do genome-research class sequencing, they just need to identify the DNA strands of interest
Well, it does depend on what kind of downstream analysis they plan to do, but 96% is not great. That is 1 error per 25 bases. Good enough for alignment procedures to work, but definitely bad if you are looking for SNPs.
As one commenter on ONP has been stating for a while: what's the point of a portable sequencer if you have to haul around a full-size Illumina sequencer along with it to get the accuracy you need?
The nanopore's advantage in this example is the virus genome, which is relatively small and has a well-defined reference sequence. In its present state, the nanopore is mostly useless for larger, previously unsequenced genomes on a cost/bp basis.
Only if the error is random, which it is not in this case.
Good to know, thank you. I can see how a 4% error rate would leave much to be desired when building a reference sequence, though if necessary you could presumably do many additional passes to bring the error rate down further. I assume that 96% is just the point where they decided that diminishing returns weren't worth the incremental cost, and that will presumably improve with time.
I agree that 96% is not great, but it's more than sufficient to recognize a virus. And once you have a database of related DNA, it shouldn't be difficult to look for differences and similarities. You may not know for certain whether any given deviation from the "norm" is noise or genuine mutation, but so long as you're taking many samples from a community you can probably make a pretty high-confidence conclusion about even SNPs - if it's present in more than a few percent of samples it's probably a real mutation characteristic of the local virus strain. Similarities between viruses infecting different communities then give you a pretty good indication of a common origin. Not perfect, but a huge improvement over simply guessing at the path of infection.
> though if necessary you could presumably do many additional passes to bring the error rate down further.
In its present state, not really. The biggest problem with nanopore data right now is systematic errors in homopolymer regions. These can't be easily corrected out with higher coverage. Incidentally, some of the most significant mutation events are in homopolymer regions, so this is bad.
> but it's more than sufficient to recognize a virus.
Correct. But you need to know more. In particular, which strain of virus? Strain variations can easily be much less than 4%.
> but so long as you're taking many samples from a community you can probably make a pretty high-confidence conclusion about even SNPs
If the errors were mostly random, you are correct. That is the problem here, the errors are systematic, not random, which is why they can't be corrected out with higher coverage. The good news, though, is that if you are looking at other types of mutations, like inversions or repeat expansions, that are easier to identify than SNPs, the error rate is probably good enough.
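A toy simulation of why coverage fixes random errors but not systematic ones. Each "read" here only reports the length of a single homopolymer run whose true length is 6, and the consensus is the most common reported length. The error rates are invented for illustration (they are not measured nanopore figures), and all the names are hypothetical.

```python
import random
from collections import Counter

TRUE_RUN_LENGTH = 6  # e.g. "AAAAAA" in the real sequence (hypothetical)


def call_random(rng):
    """Random-error model: 10% undercount, 10% overcount, 80% correct."""
    r = rng.random()
    if r < 0.10:
        return TRUE_RUN_LENGTH - 1
    if r < 0.20:
        return TRUE_RUN_LENGTH + 1
    return TRUE_RUN_LENGTH


def call_systematic(rng):
    """Systematic-error model: 70% of reads drop one base from the run."""
    if rng.random() < 0.70:
        return TRUE_RUN_LENGTH - 1
    return TRUE_RUN_LENGTH


def consensus_length(caller, coverage, seed=0):
    """Most common run length reported across `coverage` simulated reads."""
    rng = random.Random(seed)
    calls = [caller(rng) for _ in range(coverage)]
    return Counter(calls).most_common(1)[0][0]


for coverage in (11, 101, 1001):
    print(
        f"coverage {coverage:4d}: random-model consensus "
        f"{consensus_length(call_random, coverage)}, systematic-model consensus "
        f"{consensus_length(call_systematic, coverage)}"
    )
```

In the random model, more reads make the right answer win more decisively. In the systematic model, more reads only make the vote more confident about the wrong length.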
> Not perfect, but a huge improvement over simply guessing at the path of infection.
You don't have to guess. You just have to use a different sequencing technology. Almost every vendor is trying to provide a rapid sequencing service for this exact reason. Illumina has MiSeq (12-24 hrs. run time), and PacBio is always fast (run time ~3 hrs), as is Ion Torrent (run time ~2 hrs). The biggest advantage that ONP has is portability, but if you need a lab (and an Internet connection) anyway to process samples, I'm not sure that this will really play out in their favor in the long run. ONP gets a lot of awe and excitement, which leads to a lot of hype, but not a lot of practical advantages.
Seems to me the big practical advantage is actually having a sequencer available in relative backwaters. Satellite internet is available everywhere, while physically shipping non-degraded samples to labs that may be many days away seems like it could be a challenge.
Very informative answer @volvox_voxel!
IMHO, one of the big issues with Illumina sequencing is that (apparently by design) it does not facilitate "real time" sequencing (streaming) as the MinION/PromethION do, i.e.:
https://www.biostars.org/p/156...
If those .bcl files being generated could be fed ASAP into a socket or similar, that would bring Illumina closer to the new generation (4th now?) of sequencing.
Can you please contact me (OP of the BioStars post above)? I'm really interested in discussing this topic: trying to squeeze the timeline of the Illumina instruments to go from "batch" processing into something a bit more generative/streamlined.
Yes, absolutely. However, the nanopore sequencer has to have more than one limited-applicability advantage for it to be commercially successful against competitors. Just consider seriously for a minute what has actually been described (not hyped about) in this paper.
1) A mobile lab in a suitcase including sequencer - yes, that's awesome
2) Deployed to a region experiencing an outbreak
- ok, can be useful, but how many outbreaks occur every year that actually benefit from on-site sequencing?
- in the case of Ebola, which spreads and mutates quickly, the advantage may be very real, but Zika? the flu? not so sure
- is the advantage enough to offset the tremendous cost compared to alternatives?
3) They did sequence a segment of the viral genome (not the whole genome) and successfully call base mismatches
- but they didn't call indels
- they ignored homopolymer regions and the ends of their amplicons
- they did get some useful information, but there were samples that they couldn't successfully analyze after sequencing
So in the end, it is a sequencer that can be deployed to remote villages, provided you have a very limited set of analyses you intend to do, and you don't care about the cost. But is that enough to be commercially viable and displace competitors? I don't think so.
I'm not trying to rag on Oxford Nanopore, don't get me wrong. If they really could reliably sequence whole genomes fast and with minimal preparation from a USB stick, I would definitely jump on the bandwagon. I'm just tired of all the hype. They've been promising these breakthroughs for more than a decade now, but they have yet to deliver. Meanwhile other companies, namely PacBio, have appeared and been very successful at providing long reads at an affordable cost, so I'm not holding my breath for ONP.
All right, I think I understand your objection. The details are always far more significant from "in the trenches". On the other hand, this is my first exposure to the technology outside of, I think, hearing of it as a proof of concept years ago, and it seems like it has great potential. Watching from a distance, the speed of evolution of gene-sequencing technology in general is quite breathtaking. The mere existence of these tools today leads me to expect much more sophisticated implementations to be commonplace within a few decades, though not necessarily based on the same technology.