Slashdot Mirror


Graphs Show Costs of DNA Sequencing Falling Fast

kkleiner writes "You may know that the cost to sequence a human genome is dropping, but you probably have no idea how fast that price is coming down. The National Human Genome Research Institute, part of the US National Institutes of Health, has compiled extensive data on the costs of sequencing DNA over the past decade and used that information to create two truly jaw-dropping graphs. NHGRI's research shows that not only are sequencing costs plummeting, they are outstripping the exponential curves of Moore's Law. By a big margin."

23 of 126 comments

  1. Great! by the_humeister · · Score: 2

    How about the cost of analysis of said genomes?

    1. Re:Great! by hedwards · · Score: 3, Insightful

      Sequencing has been where the focus on cost has been going. It doesn't make much sense to try and reduce the cost of analysis when it takes a very long time and a huge amount of money to accomplish. The graph was hard to read, but at this point with the cost well over $10k there's a lot more that has to be done before analysis is worth spending a lot of time economizing.

      But as it gets cheaper, more and more of the focus will be on the analysis side, and the cost of analysis will come down. Given that insurance isn't going to cover the sequencing at this point, analysis is moot in most cases. As more research analyzes sequenced DNA I'm sure tricks and such will be discovered to bring the cost down. But right now you're dealing with low volumes, and as such the cost is higher than it will be with higher volumes.

    2. Re:Great! by RDW · · Score: 2

      "How about the cost of analysis of said genomes?"

      It's computationally expensive and pretty much subject to Moore's Law (though improved algorithms like Burrows-Wheeler alignment have helped to speed things up in the last couple of years). So it's getting cheaper, but not fast enough to keep up with the expected deluge of data. If you're just interested in sequencing a fixed number of genomes you benefit from both cheaper/faster sequencing and cheaper/faster processing power. But if you're a major genome centre, be prepared for some serious IT investment, or the bottleneck will increasingly be the speed with which you can crunch the data.

    3. Re:Great! by varcher · · Score: 4, Informative

      to sequence 1 million SNPs per person

      Actually, they're not sequencing.

      They're checking.

      The way 23andme and most personal genome companies work is that they have those genochips (Illumina) with one million DNA sequences on them, and they check whether or not your DNA has one of those sequences.

      If you have a SNP not on the chip (well, you have lots of SNPs not on the chip), it won't list anything. If, at a given chromosome locale, they have "all" of the "known" SNPs, but you happen to have a mutant variant not in their library, then it's not detected.
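The difference between checking for known SNPs and reading every position can be sketched with toy data. Everything here is hypothetical: the sequences, the probe table, and the function names are made up for illustration and do not reflect how any real chip or pipeline works.

```python
# Toy reference and sample (hypothetical data, not real genomic coordinates).
REFERENCE = "ACGTACGTAC"
SAMPLE = "ACGAACGTTC"  # differs from the reference at positions 3 and 8

# A "chip" only carries probes for variants it already knows about:
# position -> alternative base it can detect.
CHIP_PROBES = {3: "A", 5: "T"}


def genotype_with_chip(sample, probes):
    """Report only the known variants that the chip has a probe for."""
    return {pos: alt for pos, alt in probes.items() if sample[pos] == alt}


def sequence_and_diff(sample, reference):
    """Report every position where the sample differs from the reference."""
    return {i: s for i, (s, r) in enumerate(zip(sample, reference)) if s != r}


print(genotype_with_chip(SAMPLE, CHIP_PROBES))  # only the variant at position 3
print(sequence_and_diff(SAMPLE, REFERENCE))     # positions 3 and 8
```

The variant at position 8 has no probe, so the chip-style check never reports it, while the full comparison does.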

      "Sequencing" involves taking your DNA, and getting every sequence, no matter what. And that's still long and very expensive. We're in the era of the "thousand genomes", meaning we expect in a couple year to complete a thousand full sequences. Of course, 10 years later, we'll sequence everyone, but, so far, it's still a way out.

    4. Re:Great! by RDW · · Score: 2

      Progress in SNP chips, though they were a big breakthrough when introduced and remain very important in research, has been pretty static compared to the dramatic speed with which 'next generation' sequencing technologies have brought down the cost and increased the amount of data we have to cope with. Whole genome sequencing is on an entirely different scale - 3 billion bases rather than a million. Even an 'exome' (the sequence of all the actual genes in your genome) runs to about 40 million bases.

    5. Re:Great! by RDW · · Score: 2

      Not for annotation, but for the initial alignment of NGS reads to the reference genome. BWA, Bowtie, and SOAP2 all use the Burrows-Wheeler transform, and are in common use. For variant calling and functional annotation we'd use other tools, e.g.:

      http://www.broadinstitute.org/gsa/wiki/index.php/Best_Practice_Variant_Detection_with_the_GATK_v2

      http://www.openbioinformatics.org/annovar/
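For anyone curious what the Burrows-Wheeler transform actually does, here is a naive sketch. It is not how BWA, Bowtie, or SOAP2 implement it (real aligners build compressed indexes instead of sorting every rotation); the function names and the `$` sentinel are choices made for this example only.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text by repeatedly prepending the column and sorting."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))                 # annb$aa
print(inverse_bwt(bwt("GATTACA")))   # GATTACA
```

The transform tends to group identical characters together and is reversible, which is what makes it useful as the basis of a compact, searchable index of a reference genome.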

  2. Moore's law is too slow by MoobY · · Score: 3, Interesting

    We've been observing this decrease over the last few years at our sequencing lab too. Some people might find it fascinating, but I, as a bioinformatician, find it frightening.

    We're still keeping up with maintaining and analysing our sequenced reads and genomes at work, but the amount of incoming sequencing data (currently a few terabytes of data per month) is increasing four-to-five-fold per year (compared to doubling every 18-24 months under Moore's law). Our lab got its first human genomes at the end of 2009, almost 9 years after the world's first human genome; now we're getting a few genomes per month. We're not too far from the point where we can no longer install enough processing power (which follows Moore's law) to process all of this data.

    So yes, the more-than-exponential decrease in sequencing costs is cool and offers a lot of possibilities in getting to know your own genome, advances in personalized medicine, and possibilities for population-wide genome sequencing research, but there's no way we'll be able to process all of this interesting data because Moore's law is simply way too slow as compared to advances in biochemical technologies.
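To put rough numbers on that gap, here is a small sketch. The four-to-five-fold yearly data growth and the 18-24 month doubling time are the figures quoted above; the function names and the choice of a 3-year horizon are assumptions for illustration.

```python
def growth_factor(fold_per_period: float, period_years: float, years: float) -> float:
    """Total growth after `years` when growing `fold_per_period` every `period_years`."""
    return fold_per_period ** (years / period_years)


def data_vs_compute(years: float, data_fold_per_year: float = 4.0,
                    doubling_months: float = 18.0) -> float:
    """How many times faster the data grew than Moore's-law compute over `years`."""
    data = growth_factor(data_fold_per_year, 1.0, years)
    compute = growth_factor(2.0, doubling_months / 12.0, years)
    return data / compute


# Most optimistic case for compute: data 4x/year, doubling every 18 months.
print(data_vs_compute(3))  # 16.0
# Least optimistic case: data 5x/year, doubling every 24 months.
print(data_vs_compute(3, data_fold_per_year=5.0, doubling_months=24.0))
```

Even in the friendlier case, three years of growth leaves the data a factor of 16 ahead of the hardware.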

    --
    --- Sigmentation Fault - Comments Dumped
    1. Re:Moore's law is too slow by Kjella · · Score: 2

      I assume you're talking about incoming data, not the final DNA sequence. As I understand it the final result is 2 bits/base pair and about 3 billion base pairs, so about a CD's worth of data per human. And if you were talking about a genetic database I guess 99%+ is common, so you could just store a "reference human" and diffs against that. So at 750 MB for the first person and 7.5 MB for each additional person I guess you could store 200,000-300,000 full genetic profiles on a 2 TB disk. The whole human race (roughly 7 billion people) would come to around 50 PB at that rate.
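The back-of-envelope numbers above can be checked with a short sketch. The 2 bits per base, 3 billion base pairs, 1% diff, and 2 TB disk come from the comment; the 7 billion population figure and the function names are assumptions added for illustration.

```python
BASE_PAIRS = 3_000_000_000
BITS_PER_BASE = 2

GENOME_BYTES = BASE_PAIRS * BITS_PER_BASE // 8   # 750,000,000 bytes = 750 MB
DIFF_BYTES = GENOME_BYTES // 100                 # 1% diff = 7.5 MB per extra person


def profiles_on_disk(disk_bytes: int) -> int:
    """One full reference genome plus as many diffs as fit in the remaining space."""
    return 1 + (disk_bytes - GENOME_BYTES) // DIFF_BYTES


def bytes_for_population(people: int) -> int:
    """Reference genome plus one diff for everyone else."""
    return GENOME_BYTES + (people - 1) * DIFF_BYTES


print(profiles_on_disk(2 * 10**12))                   # profiles on a 2 TB disk
print(bytes_for_population(7_000_000_000) / 10**15)   # total in petabytes
```

Under these assumptions a 2 TB disk holds roughly 266,000 profiles, and the whole-population total lands in the tens of petabytes rather than terabytes.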

      --
      Live today, because you never know what tomorrow brings
    2. Re:Moore's law is too slow by RDW · · Score: 3, Informative

      Yes, the incoming (and intermediate) data sets are huge. You don't just sequence each base once, but 30-50 times over on average (required to call variants accurately). And you don't want to throw this data away, since analysis algorithms are improving all the time. But it's true that the final 'diff' to the reference sequence is very small, and has been compressed to as little as 4 MB in one publication:

      http://www.ncbi.nlm.nih.gov/pubmed/18996942
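To give a feel for why the raw data dwarfs the final result, here is a rough sketch. The 30-50x coverage and the 3 billion bases come from this thread; the 2 bytes per sequenced base (one for the base call, one for its quality score, uncompressed) is an assumption, as is the function name.

```python
def raw_bytes(genome_bases: int = 3_000_000_000, coverage: int = 30,
              bytes_per_base: int = 2) -> int:
    """Uncompressed size of base calls plus quality scores at a given coverage."""
    return genome_bases * coverage * bytes_per_base


for cov in (30, 50):
    print(cov, raw_bytes(coverage=cov) / 10**9, "GB")
```

That works out to 180 GB at 30x and 300 GB at 50x per genome under these assumptions, before any intermediate files are counted.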

    3. Re:Moore's law is too slow by RDW · · Score: 2

      'The incoming data is image-based, so yes, it will be huge.'

      The image data is routinely discarded by at least some major centres; the raw sequence and quality data alone is huge enough to be a major issue! See:

      http://www.bio-itworld.com/news/09/16/10/Broad-approach-genome-sequencing-partI.html

      'It's been a couple of years since we saved the primary [raw image] data. It is cheaper to redo the sequence and pull it out of the freezer. There are 5,000 tubes in a freezer. Storing a tube isn't very expensive. Storing 1 Terabyte of data that comes out of that tube costs half as much as the freezer! People [like Ewan Birney at EBI] are working on very elaborate algorithms for storing data, because you can't compress bases any more than nature already has. The new paradigm is, the bases are here, only indicate the places where the bases are different . . . In 2-3 years, you'll wonder about even storing the bases. And forget about quality scores.'

  3. What happened in... by diewlasing · · Score: 2

    ...Oct 2007?

  4. Re:DIY? by MoobY · · Score: 2

    Even in research, most of the sequencing at whole genome level is outsourced to big companies (like, for example, Complete Genomics), since the investment in capabilities, machinery and computer power needed to sequence whole genomes is simply too big for sequencing one or a few individual genomes (you currently need to invest a few million to get started with the sequencing of whole genomes). You can DIY sequencing of small fragments (for example, to determine whether a known genetic cause of a hereditary disease that is looming in your family is also affecting you), but it still requires quite a few skills in molecular biology and a few thousand euros/dollars of investment to get to this level.

    --
    --- Sigmentation Fault - Comments Dumped
  5. Re:Still north of $12,000 by TooMuchToDo · · Score: 2

    Compared to the $100 million it cost 10 years ago? Yeah, $12K is cheap. Not to *you*, but for research it's dirt cheap.

  6. Re:More unecessary tests.... by durrr · · Score: 2

    If someone has a genetic disease it's not always apparent. You can have a genetic susceptibility to tuberculosis; unless you get infected you'd never know it. Or you can have some more general problem such as Marfan syndrome, which traditionally is diagnosed based on aortic root dilatation or some similar criteria. Turns out that around half or so of the people with Marfan syndrome have no traditional manifestations of the disease; that is, in the absence of genetic sequencing you'd never know they have Marfan syndrome.

  7. Re:Still north of $12,000 by RDW · · Score: 4, Informative

    'Also, my understanding is that most uses don't require sequencing the entire genome, but rather just a small subset of it.'

    Very small subsets (e.g. individual genes) are still done the 'traditional' way (1990s technology!). Intermediate subsets (like the 'exome') are now done using a pre-selection 'capture' process ('target enrichment') followed by analysis on the same 'next generation' instruments that are used for whole genomes. Right now, this makes sense economically, since it requires less capacity (fewer consumables and less run time) on the expensive sequencers. But as sequencing prices continue to drop, we'll probably reach a point where it's cheaper to do the whole genome than any significant subset (since the 'capture' process is also fairly expensive). Cheaper to do the wet lab stuff, anyway - whole genomes also require much more processing power than useful subsets like exomes.

  8. Moore's Law? by fahrbot-bot · · Score: 2

    NHGRI's research shows that not only are sequencing costs plummeting, they are outstripping the exponential curves of Moore's Law. By a big margin.

    Moore's Law is about the number of transistors on an integrated circuit and other directly-related hardware density issues, not about cost - and certainly not the cost of gene sequencing.

    --
    It must have been something you assimilated. . . .
  9. Re:Still north of $12,000. (No, that's $30,000) by Smurf · · Score: 2

    The graphs seem to indicate that the cost is still north of $12,000 which isn't exactly cheap.

    Dude, you are reading the graph incorrectly. Look carefully at the (logarithmic) scale: the cost is actually around $30,000 !! (No, those are not factorial signs, I am just expressing my shock by the 30 thousand figure.)

    Yes, I know, that actually supports your point even more strongly: while the cost was reduced dramatically from $100 million, it seems to be leveling at a cost that is still way too high for many practical applications in the clinical field outside of research.

  10. 23andme does not sequence your DNA by drerwk · · Score: 2
    As some ACs have pointed out in response to a few of your posts on this thread, 23andme does not sequence your DNA.
    https://www.23andme.com/you/faqwin/sequencing/
    my emphasis:

    What is the difference between genotyping and sequencing?

    Though you may hear both terms in reference to obtaining information about DNA, genotyping and sequencing refer to slightly different things.

    Genotyping is the process of determining which genetic variants an individual possesses. Genotyping can be performed through a variety of different methods, depending on the variants of interest and resources available. At 23andMe, we look at SNPs, and a good way of looking at many SNPs in a single individual is a recently developed technology called a “DNA chip.”

    Sequencing is a method used to determine the exact sequence of a certain length of DNA. Depending on the location, a given stretch may include some DNA that varies between individuals, like SNPs, in addition to regions that are constant. So sequencing is one way to genotype someone, but not the only way.

    You might wonder, then, why we don't just sequence everyone's entire genome, and find every single genetic variant they possess. Unfortunately, sequencing technology has not yet progressed to the point where it is feasible to sequence an entire genome quickly and cheaply. It took the Human Genome Project over 10 years' work by multiple labs to sequence the three billion base pair genomes of just a few individuals. For now, genotyping technologies such as those used by 23andMe provide an efficient and cost-effective way of obtaining more than enough genetic information for scientists—and you—to study. Copyright © 2007-2011 23andMe, Inc. All rights reserved.

    To be sure, you have gained interesting information for your $200, but you have neither your sequence nor a complete list of differences from a reference human sequence (which, if you had it, would of course give you your sequence).
    23andme only gives you a list of many SNPs.

  11. Re:market at work by ObsessiveMathsFreak · · Score: 2

    Tulips.

    Your arguments are invalid.

    --
    May the Maths Be with you!
  12. Re:Deja vu all over again... by RDW · · Score: 3, Informative

    'Of course, that may just be the plateau before it falls off the next cliff.'

    The next cliff is already emerging through the mist, e.g.:

    http://www.genomeweb.com/sequencing/life-tech-outlines-single-molecule-sequencing-long-pieces-dna

    http://www.wired.com/wiredscience/2011/01/guest-post-introduction-to-nanopore-sequencing/

    It's not clear which 'single-molecule' technology will eventually win out, but it will almost certainly have the word 'nano' in it somewhere.

  13. Re:More unecessary tests.... by khallow · · Score: 2

    That knowledge doesn't help the patient.

    If there was no difference in treatment or health outcome for people who have a known inclination for disease or disorder, then you'd be right. You aren't.

  14. Re:Economic of Scale vs Moore's Law? by mlush · · Score: 2

    Moore's Law is exactly what we should be measuring this against. CPU speed is proportional to the amount of data that can be processed, and it looks like we're headed for an era where there is more data than we can process!

  15. Harry Potter explanation by tempest69 · · Score: 2

    Imagine taking 8 copies of a Harry Potter book (Goblet of Fire), all from different publishers, and putting them into the shredder. Luckily you placed them in such a way that the lines of the books are intact, and because you have different publishers, the lines don't always match up. And only 60% of the lines from a given book are usable; the rest are mangled by the shredder.

    You need to find ends from one strip that match the beginning of another strip.
    This means that with some patience you could flip through all the scraps and rebuild one copy of Goblet of Fire.
    But because the 40% of lines mangled by the shredder were hit at random, some of the passages would not be salvageable. Roughly 1/1600th of the book would be missing entirely (half a page). Feeding another book into the shredder is kind of expensive just to get a quarter page of information, and another 2 to get 3/16ths of a page.
    Different methods are used to get those last few bits of information; it's technical and boring to grad students.
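The "roughly 1/1600" figure follows from the numbers in the analogy, and a short sketch makes the diminishing returns visible. It assumes each line is mangled independently in each copy with the 40% probability given above; the function name is made up for this example.

```python
def missing_fraction(copies: int, usable: float = 0.6) -> float:
    """Chance that a given line is mangled in every one of `copies` shredded books."""
    return (1.0 - usable) ** copies


for n in (8, 9, 11):
    frac = missing_fraction(n)
    print(f"{n} copies: {frac:.6f} of the book missing (about 1 in {round(1 / frac)})")
```

With 8 copies about 1 line in 1,500 is lost from every copy, and each extra copy only recovers 60% of whatever is still missing, which is why chasing the last gaps by shredding more books gets expensive quickly.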

    It's useful for a few things. Clinically, you can sequence a carcinoma (cancer) to determine the best course of chemotherapy. If the oxidative damage genes are damaged, you can introduce chemicals that induce oxidative damage. If DNA break repair is damaged, you can go with radiation as a treatment.

    For an infection, you can determine if you're dealing with resistant bacteria. It's a bit expensive right now, but the price will fall.

    Tracing back evolution

    Bringing back the Mammoth

    Building a malaria resistant mosquito.

    Building synthetic Spider Silk (with transgenic goat milk (still ironing out kinks))

    Finding the Genes that conferred HIV resistance. (some are done)