Nanopore DNA Sequencing
mindpixel writes: "Harvard scientists have conceived a revolutionary technology for probing, and eventually sequencing, individual DNA molecules using single-channel recording techniques. The technique essentially pulls a single strand of DNA through a nanopore, reading off the individual bases electrically. The technique could allow for decoding of a person's genome in hours instead of years." While the sequencing in hours instead of years is something that's pretty darn cool, our holdup in using this data is actually now what the genes are, and how they interact. That will still take years for us to figure out.
There are several companies rushing to commercialize this general way of sequencing and haplotyping. The most interesting one I've seen so far is US Genomics.
The problem is still at the stage of "translating our pretty drawings and ideas into working hardware". There are tremendous engineering challenges both with the hardware (detection devices and materials handling) and software (algorithms etc.).
The science is solid and the ideas are clever. It's just going to take a while to build a system that can reliably resolve single molecules as they speed down a tiny channel.
just my $.02
I don't think that this is possible because of the structures involved. DNA is a long double helix (ignoring any integral structural proteins), so pulling it through a hole is possible. However, most proteins fold and form large structures, and unfolding these would probably be extremely difficult, never mind examining the many component amino acids. AFAIK, this is the reason the way a protein folds is so important, and why x-ray crystallography is used to determine the structure of known proteins.
This will be a great boon for storage device makers! Each genome will be terabytes of data.
To answer an earlier question, sequencing is reading the DNA and recording the base pairs in some fashion. Often this information ends up as an ASCII file. Because of the mechanical limits of the current sequencing technologies, chemicals called restriction enzymes are used to snip long strands of DNA at particular patterns of sequence. A sample of DNA is amplified using a technique called polymerase chain reaction (PCR), which gives many, many copies of precisely the same sample. It is then split into multiple samples, each "cut" with a different restriction enzyme, each with a different "trigger sequence", so the different samples end up being cut in different places, which means that the cut pieces of one sample overlap the cuts in other samples. Then all the samples are separately sequenced into files. The files can then be compared for overlaps and "assembled" into longer sequences until the sequence of the original sample (which was too big for the machine) is known. After the sequence is known, it can be converted from nucleic acid (G A T C) code to amino acid code (Google can probably find you the genetic code), which uses 3 nucleic bases per amino acid. Because we really do not know where a given amino acid starts within the nucleic file, we often do a sliding frame translation, decoding the entire sequence for each possible starting point of amino acid translation. The proper translation is often indicated by having a minimal number of null translation frames. The decoded sequence can be compared to other previously decoded sequences which have been analyzed for function - these blocks end up being called genes. A gene from a mouse which makes a particular protein is much the same as that gene from a human, so by starting with simple organisms, many genes for specific proteins have been found which help decode the more complex organisms....
hmmmmm sorry for the drift folks...
Z
enough is too much
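For anyone who wants to play with the sliding frame translation described above, here is a minimal Python sketch. It assumes the standard genetic code, reads only the three forward frames, and treats "null" codons as stop codons (marked `*`), which is my reading of the comment rather than something it states. The function names are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, offset=0):
    """Translate seq starting at offset, ignoring a trailing partial codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def sliding_frames(seq):
    """Return the translation for each of the three forward reading frames."""
    return {offset: translate(seq, offset) for offset in range(3)}


def frame_with_fewest_stops(seq):
    """Pick the frame whose translation contains the fewest stop codons."""
    frames = sliding_frames(seq)
    return min(frames, key=lambda offset: frames[offset].count("*"))


if __name__ == "__main__":
    for offset, protein in sliding_frames("ATGGCCATTGTAATGGGCCGC").items():
        print(offset, protein)
```

On short sequences the "fewest stops" rule is a weak signal; it only starts to mean something on stretches hundreds of bases long, and a real gene finder would also look at the reverse strand.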
Yay! One step closer to Gattaca. Wheee.
Let's see your C.V.
Anyway, I see two issues. One, what size DNA strand can be sequenced using a nanopore without breaking? Two, how many nanopores can be made to operate simultaneously? With 100 nanopores operating in parallel on 10-kilobase DNA fragments at 10 seconds per nanopore per fragment, you're getting 100 kilobases per second, or a billion bases every three hours. At these speeds, the real limiting factor is probably going to be something other than raw sequencing speed. I think it's a very exciting technology.
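The back-of-envelope numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the pore count, fragment size and read time are the hypothetical figures from the comment, not measured values.

```python
def bases_per_second(pores, fragment_bases, seconds_per_fragment):
    """Aggregate throughput of identical pores running in parallel."""
    return pores * fragment_bases / seconds_per_fragment


# Hypothetical figures from the comment: 100 pores, 10 kb fragments, 10 s each.
rate = bases_per_second(pores=100, fragment_bases=10_000, seconds_per_fragment=10)
hours_per_billion = 1e9 / rate / 3600

if __name__ == "__main__":
    print(f"{rate:.0f} bases/sec, {hours_per_billion:.2f} hours per billion bases")
```

That works out to 100 kilobases per second in aggregate, a little under three hours per billion bases.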
Why not use ATP-driven translocation of polynucleotides, single-base pairing receptors changing their fluorescence, pores able to regulate each other so that only one is sequencing at a time, enzymes stretching out hairpins? Things like this make me wish I could come back to my old toys one day...
But it's very important empirically to have more than one dataset. Whose genome did we sequence anyway? And comparing the sequence of many different instances of genes from different people is not time consuming. You're talking about figuring out how the human genome works. That's a completely different discipline.
Being able to decode strands of nucleic acids (and the technique might be applicable to polypeptides (proteins) as well) in a matter of minutes without the mess of gel electrophoresis would be HUGE.
Because the pore in the channel is large enough to admit only a single strand of DNA, the time it takes for the DNA to be drawn through the channel (enlarged view) effectively measures the length of the DNA molecule (here, 1,300 ms corresponding to a 1,060 nt polymer).
So if there are 3 billion bases in the human genome,
Of course if the apparatus is easy to use I suppose you could have several running at the same time provided you could prepare the material fast enough.
Sequencing: finding out how the A, T, C, and G units are arranged in you.
Decoding: finding out what the sequence leads to as far as expression of traits goes. Like why you have an aversion to bio-tech speak (grin).
What's the difference between decoding and sequencing?
I guess I'm a little naive on the genome thing... but the way quantum computing is going, I better brush up on it, eh?
Don't think that a small group of dedicated individuals can't change the world. It's the only thing that ever has.
Such a system could be used to trace all kinds of "things" and "substances"
This is very true. I can't wait to break off a little bit of a Linnean holotype for a bit of DNA ;). This of course will provide much more data for diagnosing new species and species relationships, ultimately fueling the debate on "Linnean" versus "phylogenetic" nomenclature. Now if the zoological community would get off their a$ses and create/support a universal warehouse for all nomenclature.
only infrmatn esentil to understandn mst b tranmitd
I do believe I submitted this story over a month ago, after having seen it long before that. Glad to see things are being kept up to date around here.
no equivalent to the verb "to have"
So?
"I have no child."
"There's no child unto me."
You can understand that, although we wouldn't say it that way. So, too, for other meanings of the verb "to have":
"You have my stuff."
"My stuff is with you."
"I have to go."
"I need to go."
"I've never gone."
"I never went."
There is no particularly compelling reason that a language should use "to have". In fact, if English /does/ use the word "to have", then objectively speaking it's probably less sensible to say "That's my book" than "I have that book", since you're stressing the fact that you're the one whose it is (ie "who has it") rather than that that is a book. It doesn't really matter though -- constructs are whatever people make of them over many generations.
~
I like to think that genomics is the ultimate in the Law of Natural Selection. Every living thing wants to breed with only the healthiest and strongest of its species. And just like Gattaca we will soon have that power. In a few hundred years this application of natural selection may breed out most genetic diseases naturally instead of having to rely on drugs.
here we come? I just love it when sci-fi becomes fact.
P.S. if you're reading this Dan, do you need an electrophysiologist? G. (LOL)
Tonight the sky is empty. But that is nothing new
DNA passwords!
Get your Unix fortune now!
Somewhere in a lab at Harvard University in Boston, mid 2001:
"Professor, the computer is decoding the first base pairs right now!"
"Record them for posterity, this is history in the making"
"Ok... reading now... G... A... T... T... A... C... A..."
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Not only will this allow sequencing on minute amounts of DNA but it appears to be nondestructive to the degree that one could get multiple reads, i.e. reverse polarity and reread the sequence(s). Gaining the holy Fellgett's advantage at the cost of a few minutes and a few microvolts. I can't help but also think of the potential this holds for data storage. An array of micropores on a chip could read massive amounts of data off a drop of liquid in minutes.
the next step is a DIY kit that you could do from home! That would be some funky shyte.
Thank you for the info! I did however find some examples (below) of how DNA folding and tangling can affect the transcription process. I vaguely remember reading something a long time ago about single proteins originating from different segments of DNA, but I may have been imagining it. It may not be a strong enough effect to be a barrier to our understanding like you say, but I don't get the idea that DNA tangling is all that well understood. Is this true?
Here are the links along with key excerpts:
"...It has been put forward that many of these chromosomal changes are caused by the ability of certain specific types of DNA sequences to fold into unusual structures which interfere with the faithful copying of chromosomes."
www.cpa.ed.ac.uk/news/research/07/item3.html
"...The knots and kinks in the DNA provide crucial topological stop-and-go signals for the enzymes."
www.khouse.org/articles/technical/19971201143.h
"...our holdup in using this data is actually now what the genes are, and how they interact." One example of this is the protein folding problem. A sequence of DNA, which corresponds to a sequence of amino acids in a protein, still leaves the problem of how the string of amino acid molecules folds up to make a 3D structure. This problem is still really, really hard. What makes things harder is that DNA itself folds up and tangles upon itself, meaning that different sections of DNA on completely different ends of the long molecule can contribute to the same protein. It's going to be very difficult to make full use of knowing the sequence of a strand of DNA until problems like these are solved. I'm not an expert; does anyone out there know of any interesting advances in the protein folding problem?
Ok.. Count to 10 and take it easy... These forums are not for geneticists, they are for everyone. Of course some of the people who have posted messages on this particular forum are ignorant, arrogant and idiotic, but since it's a democratic system they can express themselves and say what they want, even if it's stupid. Modesty and patience are two important virtues for a scientist, work on them. BTW: "I've been conducting genomics and proteomics-related research since before those terms and the genome sequencing projects existed". wow! 20-30 years? That's a lot of time...
Being able to sequence DNA quickly is only one small technological benefit from this nanopore discovery. In the '70s the McKenna brothers in their book The Invisible Landscape postulated a similar mechanism in the post synaptic membrane of neurons. As virtually any technology that humans invented was invented previously by nature, I believe that this mechanism is there in neurons and that this discovery will propel research into this precise area. What we have with this nanopore discovery is conversion of DIGITAL DNA sequences into ANALOG electrical signals. Moreover this is a way for thought to be stored in DNA! As thought in our brains is electrical signals, we now have a mechanism demonstrating how instinctive behavior is passed on through DNA. I call it Meme Storage in DNA.
This would make real DNA Fingerprinting a reality.
Get arrested, give a blood sample. It'll only take a few hours to verify who you are. None of this "probably" stuff, they'll have YOUR sequence on file, and there won't be any doubt (unless you have an identical twin).
I think gels are used for very small samples of DNA. About 500 base pairs is the limit for a gel. With 72 channels on the gel that is nowhere near the number of base pairs in a chromosome.
The faster, better technique is electrostatically driven capillary tubes. They suck the sample through a microcapillary and shine a laser on it as it passes by. Generally the same 72 channels in parallel as the gel method - something about legacy analysis software as well as the plates that hold the samples and the robots which manipulate them. Kind of like the gauge of the railroad tracks used to carry space shuttle boosters being determined by the wheel spacing of a Roman chariot, which was determined by the space required by two side-by-side horses.
We have the most advanced transportation system in the world having a major design parameter determined by the width of a Roman horse's ass.
Z
enough is too much
The article mentions 3 billion bases in less than 2 hours. That comes to
(3/2) x 10^9 / (3.6 x 10^3) bases/sec
= 416667 bases/sec.
So you would need a sustained writing speed of about 400 kilobytes/sec, or if you compress it into 4 bases per byte, say 100 kilobytes/sec, to write about 3 GB (or about 750 MB compressed) of disk space.
You could fit it onto an IBM Microdrive attached to your Palm!
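Here is a rough Python sketch of the "4 bases per byte" idea and the numbers that go with it. It assumes a plain 2-bit code (A=0, C=1, G=2, T=3) with no ambiguity codes or quality scores, which real sequence files usually carry; the function names are made up for illustration.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, 2 bits per base, 4 bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data, n_bases):
    """Recover the first n_bases bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(LETTERS[(byte >> shift) & 3])
    return "".join(bases[:n_bases])


GENOME_BASES = 3_000_000_000
bases_per_sec = GENOME_BASES / (2 * 3600)   # 3 billion bases in 2 hours
packed_bytes = GENOME_BASES // 4            # 4 bases per byte

if __name__ == "__main__":
    print(f"{bases_per_sec:.0f} bases/sec, {packed_bytes / 1e6:.0f} MB packed")
    print(pack("GATTACA").hex(), unpack(pack("GATTACA"), 7))
```

The packed size comes out at 750 million bytes, so the write rate is roughly 104 kilobytes/sec if the whole genome is streamed over the two hours.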
What will it be used for?
Immortality, my friend. Immortality.
The Honolulu (sp?) technique used in cloning (done with mice - an incredibly difficult subject for cloning because the eggs cycle so quickly), unlike the faulty technique used with Dolly (which involved starving the nucleus-donor cell, and then using an electric shock to cause it to merge with the nucleus-free egg), has fertility rates that are rapidly approaching those of normal mice, and little to no genetic damage per breed. Yet, at around 5 generations of clones (cloning a clone of a clone of a clone of a clone), we start to see premature aging. Why? Because, as DNA lives, it slowly mutates. As most mutations are bad, this steadily poisons the mice's genetics.
The key is in a DNA backup.
Digital data doesn't corrode. It can be verified, backed up, copied, you name it. With the recent production of a completely DNA-synthesized fruit fly, the possibility approaches that we can completely re-create an individual's DNA strand. Then, when cloning organs for the individual, we use the backup DNA, not DNA from the person themselves, as that DNA has become slightly corrupted over time.
Digitizing DNA strands is a key to immortality.
That is why this is important.
-= rei =-
P.S. - I vote for genetically altered population. I certainly hope we see that day soon when people can make choices on whether or not they want their children to have to suffer.
"This may be presumptuous..." "That's my favorite kind of 'This'."
I was thinking that this method could also be used to sequence proteins - a process which is now done using an automated process which can only produce (correct me if I'm wrong, and if I am I'll eat a bug) maybe 30 "letters" of sequence. Compare this to many hundreds at a run from DNA sequencing. If proteins could be sequenced hundreds of amino acids at a time, you could sequence a whole protein in one run. This would be better than the current method, where fragments are sequenced, and then the overlaps are compared to piece together the whole sequence.
Freedom: "I won't!"
It looks to me that there is a big gap between the idealized graph above, which shows clearly the different nucleotides, and the actual data they have gotten which shows blocks of 30 purines and 70 pyrimidines. Can they really distinguish between adjacent nucleotides or are they so close physically they will just crowd through and blur together?
And will they be able to tell A from G (both purines) and C from T (both pyrimidines)? I don't have the charges handy but I think C and T are pretty close.
Also, DNA breaks very easily. No way are you going to be able to pull a whole chromosome through at once. If they get just 100 bases at a time, will that be useful?
It's great that we might be able to do sequencing in a matter of hours rather than years, but the real question is, what does that get us?
Every drug company has a Genomics division these days, to analyze the existing data from the Human Genome Project. Now that new data can be gathered at such incredible speeds, are we any closer to improving the quality of life based on this work? Probably not, and the cause is a double-edged sword.
The problem is the restrictions, through international treaties and government regulation, on genetic engineering of humans. Don't get me wrong, I'm not in favor of such modification of the human genome; however, this leaves only one recourse. They can create medications that the sufferer of a genetic defect can take every day for their entire life to prevent the occurrence of an illness that they are genetically predisposed to. This is a boon for drug companies. If they can generate long-term revenue streams by creating medications which reduce the chances of developing illnesses to which certain people are genetically predisposed, and claim that they are doing this, instead of developing ways to repair a gene at birth, not because it's more profitable to do it this way but because this is the only avenue they're allowed to pursue due to federal and international regulations against messing with the human genome, then who are the regulations truly serving? The population, or the drug companies?
Along the same lines, there will always be countries which are not signatories to the aforementioned international regulations, in which drug companies can develop the gene therapies which could truly benefit sufferers of genetic diseases and defects. That said, there will always be a black market for these therapies, once developed.
The question becomes which is a better world to live in: one where we have a drug-dependent population, or one where we have a genetically altered population?
At this point I'll conclude my analysis because any further speculation will lead to the realm of Gattaca-style science fiction. There is, however, a great deal to consider...
--CTH
--
--Got Lists? | Top 95 Star Wars Line
That was my first take as well, but then when I looked through the references, I found that many feasibility questions seem to be resolved already. For instance, I read the main page and thought, "Sure, but how do you transport the strand through the nanopore?" Then I checked the first reference listed, and what do you know: "We show that an electric field can drive single-stranded RNA and DNA molecules through a 2.6-nm diameter ion channel in a lipid bilayer membrane."
The final system may still be largely conceptual, but it's by no means blue sky. I tend to be a techno-skeptic but this work impresses me.
The page sounds to me like a breathless plea for lots of venture capital funding.
This is grossly unfair. The language and style are well within the normal bounds for scientific papers. The word "revolutionary" is appropriate for a technology that would do years of work in hours. And in case you didn't notice, it's not private research -- it's being done at The Department of Molecular and Cellular Biology, The Biological Laboratories, Harvard University. What interest would a university laboratory have in "venture capital"? If they later spin it off into private industry for product development, then they might go for venture funding, but it simply makes no sense to do so now. There's a big difference between research sponsorship and venture funding.
Tim
It must be a bit like climbing mount Everest the hard way and while you are sitting at the top eating your Kendal Mint Cake, someone rides up the access road on a bicycle.
So what was the point of spending several hundred million doing the job the hard way? Oh, they filed a gazillion patents on the sequences they read out. And there I thought you had to invent something to get a patent.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
Just to correct a few inaccuracies. Firstly it hasn't been completed yet. Celera Genomics decided to claim they'd finished it, which spurred the public project to also claim the same. In reality both had only 90 - 95% of the consensus sequence and only then at "draft" quality.
Secondly, I feel it's wrong to claim that Celera were the people to "complete" it. Celera used their own data in conjunction with the public data, and yet they have (more or less) comparable results in terms of coverage, number of contigs, quality and so on. Personally I feel that this is like admitting that their own work doesn't add anything new to the public effort - ie they failed.
The bottleneck with sequencing at the moment is in the "finishing" process - tidying up the results to produce highly accurate answers. This is largely due to the randomness of the shotgun sequencing approach. As it's effectively solving a jigsaw puzzle from lots of randomly cut pieces of DNA, in some places you'll get lots stacking up and in others you'll find none. It is unrealistic (not to mention expensive) to keep increasing the coverage so that everywhere gets covered by the random shotgun process.
Instead a fixed depth (only 3 or 4 fold in the draft sequences) is used followed by directed sequencing where the user, or an automatic program, analyses the data set and chooses experiments to perform (primer walking typically). The graphs I've seen of draft vs finished data show quite well how the finishing is lagging behind.
This new strategy, not being random, should greatly reduce the amount of finishing needed. Note, however, that at present they are using it for probes rather than full-scale sequencing. It has great potential, but looks to be years away from replacing the current work.
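To put a rough number on why random shotgun coverage leaves holes, here is a small Python sketch. It assumes the idealized Poisson model (reads land uniformly at random, so a base covered at average depth c is missed with probability e^-c); real genomes with repeats and cloning bias do worse than this.

```python
import math


def fraction_covered(depth):
    """Expected fraction of bases hit at least once at a given average depth,
    under an idealized Poisson model of random read placement."""
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    for depth in (1, 3, 4, 8, 10):
        print(f"{depth:>2}-fold: {fraction_covered(depth):.4%} covered")
```

Under that assumption, 3-fold coverage leaves about 5% of the sequence untouched and 4-fold about 2%, and each extra fold of depth buys less than the one before it, which is why directed finishing takes over from brute-force shotgun.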
A similar technique is DNA sequencing using scanning-tunneling microscopes and atomic force microscopes. Here is a Google search, and here is an article from 1992.
Harvard scientists have conceived a revolutionary technology
This sounds exactly like late-night infomercials that invariably say things like "our scientists [actors in white lab coats conspicuously walking around behind the one being interviewed] have devised a revolutionary new formula that will make you lose weight without dieting or exercising!"
That is, if the people selling something describe it as "revolutionary" themselves, it isn't. If it really is revolutionary, we'll hear about it in other places. The HP-35 from the slide rule article -- that's revolutionary.
So while this may be a significant improvement, I'd change the prose if I was them.
It will not replace conventional sequencing technology, unless it can beat the now pretty cheap cost. Conventional sequencing is based on labelling the individual DNA bases with a different fluorescent dye, and running the DNA through a gel which separates the DNA according to size: as each base runs through the gel, it goes past a laser/detector which can detect the specific DNA base (A, T, C or G) at that position. Due to gradual improvements to this technique over the last 20 or so years (originally it employed radiation, rather than fluorescence), the speed and sensitivity have improved dramatically and the cost has dropped. For example, the human genome project started in earnest about 10 yrs ago. Celera Genomics, using modern technology (and a lot of financial backing, and the fact they are a subsidiary of the people who make sequencing machines), completed the genome in a matter of months. The increase in DNA sequencing capacity puts Moore's law to shame.
For example, our lab could process around 100kb (that's KiloBases guys!) of data a day, but we never even touch this with our machine. No need, and the same stands for many small-medium research labs. A lot of people like us will probably stick with conventional sequencing technology for a long time (it works well, is high enough throughput, cheap & easy).
However, there are some exciting applications with single strand sequencing. For example forensics. Also, it allows the opportunity of sequencing RNA (this is the "messenger" which passes the "important" part of the DNA message to the ribosomes, which then "compile" a protein - the stuff which actually does things, like an enzyme or structural component). Sequencing RNA is exciting, as currently you have to convert the RNA back to DNA (which can cause problems) and then sequence that.
Another obvious application for this would be very high throughput sequencing which would be employed by the major sequencing centres. Yes, I know we already have the Human Genome, but a fashionable idea at the moment is comparative genomics. This is very much taking biology back to its roots (i.e. like Darwin and Wallace comparing the morphological characteristics of certain species and inferring adaptation), but at a molecular level. This will yield amazing insights with discoveries having important implications from medicine to evolution. In fact I think the general public & media will soon be bored of this. Each week it will be a new genome being announced; mouse, chicken, rat, pufferfish, rice, corn, dog, cat, cow, chimp......
But it is a place to start.
Side note:
while looking up the Finnish Language pages for this comment, I came across this tidbit: That Finnish has "no equivalent of the verb to have". This has interesting philosophic implications in the history of open source, etc.
Check out the Vinny the Vampire comic strip
"It is a greater offense to steal men's labor, than their clothes"
I am a zoological systematist, working in entomology.
The human genome was sequenced by taking lots of DNA, cutting it up randomly, and sequencing the random pieces of cut-up DNA.
In my field, we work with much smaller amounts of DNA. Sometimes I only have a single specimen of a tiny insect, or unique material (from rare or extinct species) to try and get some DNA out of. In older material, DNA is usually degraded and many times we end up with nothing but a destroyed or damaged specimen.
With small amounts of DNA to begin with, we have to amplify (PCR) single genes or regions by using general primers, which means that they don't only fit on the insect DNA, but fungi and human DNA too, making contamination of your material a very real risk.
If this technology turns out to work on a larger scale, it's amazing news for me and my colleagues.
The nanopore technology sequences single molecules, which means the PCR step becomes unnecessary! This means that we can get sequences from specimens with severely degraded DNA, and we don't have to be as afraid of grinding up rare material in hope of getting sequences.