PE Celera, which just split at 300, is the one with the genome map.
Celera aren't mapping as such. They're trying to sequence most of the genome, by picking random fragments. They haven't done any extra mapping work, and each fragment could come from anywhere in the genome. They're partly going to relate this to the wider structure by finding pre-existing markers in their data, and partly by using the publicly available data from the human genome project, which has benefitted from a lot of detailed mapping work. Interestingly, I believe that Celera are now only planning to do the genome 4 reads deep, rather than the 8-10 they originally announced. I wonder whether this resulted from a decision to scale down the sequencing work, or whether they had problems sequencing on that scale? I doubt they'll tell us soon, but I'd love to know.
FWIW, the public consortium have done about twice as much sequencing on the human genome as Celera, and are still sequencing (AFAICT) at almost twice the rate.
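To give a feel for what 4 reads deep versus 8-10 means, here's a rough sketch in Python. It assumes the standard idealised model for random shotgun reads (reads land uniformly at random, so the chance that a given base is missed at average depth c is about e^-c). That model and the function name are my assumptions for illustration, not anything Celera have published, and real genomes with repeats and cloning bias behave worse than this.

```python
import math


def fraction_covered(depth):
    """Expected fraction of bases hit by at least one read.

    Idealised assumption: reads fall uniformly at random, so the
    probability that a base is missed at average depth `depth`
    is exp(-depth).
    """
    return 1.0 - math.exp(-depth)


# Compare the depth now reported with the range originally announced.
for depth in (4, 8, 10):
    missed = math.exp(-depth)
    print(f"{depth}x: {fraction_covered(depth):.4%} covered, "
          f"{missed:.4%} expected to be missed")
```

On that simple model, 4 reads deep leaves a little under 2% of bases unsampled, which is a lot of gaps on a genome of about 3 billion bases, and it says nothing about how hard the pieces are to assemble.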
According to the CNN story, Bill and Tony are going to applaud the researchers on the Human Genome Project (just so you know where I stand, that includes me) and call upon other researchers to make their results public with the same lack of attached strings. This sounds like a reference to Celera Genomics (and possibly Incyte and a number of other biotech companies). What it doesn't say is that they're going to make it in any way more difficult for said companies to obtain intellectual property rights over sections of the human genome. They may do, but it doesn't say so in the story.
Re:Does not produce knowledge
Sorry, the genome project produces DATA, not knowledge.
The sequence analysts I work with will be very interested to hear that.
But remember that without those corporations we would never have had enough money to start research on the genome project in the first place.
Without companies the economy wouldn't be where it is to support the work, indeed, but these companies are asking for IP protection far beyond anything previously on offer. The companies that develop drugs currently don't have patent protection on the enzyme systems the drugs act on, regardless of any research they've done on these systems to help their drug development. The argument that further pharmaceutical innovation is dependent on this has been repeatedly asserted, but never supported with evidence (or if it has been it's slipped past me entirely).
He had freewill and determinism worked out in the fifth century AD.
I'm not convinced, I'm afraid. I thought it was contradictory, Panglossian and entirely dependent on the existence of a God. Not, in fact, relevant to this discussion at all.
In my view, pre-industrial philosophy generally doesn't share enough of a conceptual basis with us to be directly applicable to empirical matters.
I think it's probably misleading to contrast inherited behavioural tendencies with free will like that. Obviously the choices we make will tend to be the choices humans make rather than those that (say) chimps or herring would make, but that doesn't affect the fact that the choices are freely made. We inherit the sorts of creatures (and to an extent the sorts of people) we are, but we're no more a slave to that than we are to our past experiences. I haven't noticed many people worrying about our ability to learn robbing us of our freedom. Having said all which, does being "free" depend on not understanding why you make your decisions?
So does this book overlap at all with Roger Penrose's "The Emperor's New Mind"?
It sounds like it does. Penrose has written three books on the subject that I'm aware of, the other two being "Shadows of the Mind" and "The Large, the Small, and the Human Mind".
If I remember correctly, that book was trying to state that something on the order of quantum gravity (i.e. some phenomenon we don't understand yet) was responsible for consciousness.
You remember correctly. Penrose's thesis is that a correct quantum theory of gravity will be non-computable, and will describe interactions within and between brain proteins (alpha-tubulin, a structural protein of cells) that produce consciousness. "Controversial" is definitely the word : to me, it just seems like kite-flying. He's explaining something almost imponderable in terms of something absolutely imponderable.
The need for this comes from a conviction that human minds aren't subject to the restrictions described by Gödel's incompleteness theorem. I don't know what line this new book takes, though.
I'm no great fan of Dr. Venter, but it's not true to say that he wants to do the work first and consider the ethics later. There was originally an announcement in January about this, stating that it was time for the situation to be considered fully before any action was taken. The more recent statement - a paper in Science - could be considered a second RFD. They've called for discussion in public forums, which seems to me to be very responsible.
The timescale for the HGP, incidentally, is rather shorter than the article supposes. Both it and the private sector efforts expect to have substantially complete sequence coverage of the human genome during the first half of next year. Things are moving very quickly. The HGP proper is due to have fully finished sequence, accurate to better than 99.99%, by the end of 2003. The events shown in Gattaca - which I would agree is both a good and a perceptive film - will be plausible well within current lifetimes.
Actually, all of us in the human genome project are critics of current apparent practice. Immediate free availability of the sequence data without intellectual property restrictions is a cornerstone of our policy.
why not pursue causes that really can make more people happy and live a dignified life, instead of spending unimaginable sums and energy for projects that endanger our freedom ?
Healthcare doesn't make people happy and dignified? The information will help refine existing treatments and develop new ones. Some of these will initially be very expensive (unfortunately) but in time everyone will benefit.
Important point : nothing the human genome project produces is patented.
in this case 3% of an obviously vital chromosome will lie undiscovered all because they decided it was time to call it quits and go on to the other ones.
Work isn't entirely finished on 22. The analysis and summary has to be published at some point, and that point is when all the routine work is finished, and all the usual techniques have been tried. A few remaining problems are still being worked on, and I imagine that new techniques will be tried for remaining areas as they become available : this has happened before to bring previously-covered areas to a higher standard. The goal is still to cover all euchromatic parts of the chromosome if reasonably possible.
I dunno for sure about Sanger, but WashU and Whitehead in the US are working from the same clone library.
At the Sanger we've historically been using the RPCI-1 library from Pieter de Jong's lab. I have been told several times that the individual can be, and has been, identified. I was told a name as well. More recently we've been moving onto other libraries based on anonymous panels, such as the RPCI-11.2, from the same lab.
From reading the article, what I gleaned was that Venter is trying to deprivatize the Human Genome, so drug companies cannot monopolize the industry.
Absolutely not. The other genomic sequencing organisations - the members of the human genome consortium - are placing all data on publicly accessible servers within 24 hours, with no attempt to retain proprietary rights.
Celera, on the other hand, is a commercial organisation committed to retaining IP rights over - and charging for access to and use of - human genomic sequence.
> A company spending millions of dollars on research has every right to protect its intellectual property by any means necessary.
Well, surely that depends on what the result is? Having spent money on something doesn't automatically give you rights to something you don't own in the first place.
The usual ground for granting patent rights is that you've invented something. There's nothing self-evident about this : it was introduced on purely pragmatic grounds. Discoveries have not historically been protected like this, and to the extent that they might become so, it's happening by creeping extension rather than as a result of considered decision-making. The argument that the amount of work going into it means that these sequences should be treated as inventions is rejected by a large majority of those working in the field - and let's not forget that these others have done a lot more work than Celera. The claim that a "gene" is an invented intellectual construct entirely separate from the sequence is generally regarded as equally specious. It's thought quite possible that these patents will be struck down by the courts on these grounds, but there's no certainty in that.
Another point worth remembering is that Celera haven't actually done anything as dramatic as they imply : their press release states that they've sequenced 1.2 billion bases, in 40 days. That's 30 million a day, which at roughly 500 useful bases per sample, comes to 60,000 lanes a day. Using 96-tube 3700s, that's about 600 runs a day. In theory they could manage over 2000 : clearly their production scale-up is nowhere near complete. At the moment, the various members of the Human Genome Consortium are still probably outsequencing them quite comfortably.
IIRC they aim to reach at least 30 billion bases sequenced - 10-times coverage of the genome : it's going to take them a while yet to get there. And then they have a large assembly job to do, which is in itself not a trivial task.
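For anyone who wants to check the back-of-envelope sums above, here's a minimal sketch in Python. The input figures are the ones quoted above (the press release numbers plus my rough 500 useful bases per sample); the variable and function names are just made up for illustration, and the "days to target" line assumes the current rate stays flat, which it presumably won't once their scale-up is further along.

```python
# Figures quoted above; the per-lane read length is a rough estimate.
BASES_SEQUENCED = 1.2e9      # bases Celera say they have sequenced
DAYS = 40                    # over this many days
USEFUL_BASES_PER_LANE = 500  # rough useful bases per sample
LANES_PER_RUN = 96           # one run on a 96-tube 3700
TARGET_BASES = 30e9          # stated aim, about 10-times coverage


def throughput(bases, days, bases_per_lane, lanes_per_run):
    """Return (bases/day, lanes/day, runs/day) for the given figures."""
    bases_per_day = bases / days
    lanes_per_day = bases_per_day / bases_per_lane
    runs_per_day = lanes_per_day / lanes_per_run
    return bases_per_day, lanes_per_day, runs_per_day


bases_per_day, lanes_per_day, runs_per_day = throughput(
    BASES_SEQUENCED, DAYS, USEFUL_BASES_PER_LANE, LANES_PER_RUN
)

# Naive projection: remaining bases divided by the current daily rate.
days_to_target = (TARGET_BASES - BASES_SEQUENCED) / bases_per_day

print(f"{bases_per_day:,.0f} bases/day")
print(f"{lanes_per_day:,.0f} lanes/day")
print(f"{runs_per_day:.0f} runs/day")
print(f"{days_to_target:.0f} more days to the target at this rate")
```

The runs figure comes out at 625, which I rounded to "about 600" above. The flat-rate projection is only there to show why the scale-up matters: at the current pace the 30 billion target is years away rather than months.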
Sure this helps with some of the information
It doesn't affect us in the HGP at all : all our data are released freely to the public within a day or two anyway.
it doesn't really address the underlying problem of private companies patenting gene sequences.
Unless there's more to the statement than CNN are aware of, anyway.
Junk DNA is one of the worst misnomers possible
Overly pejorative, yes.
This is a bad misnomer because the junk DNA is required for the proper expression of all of our genes. [...] The complexity [...] is in how these genes are regulated
The 42% figure applies to known repeat sequences ("tandem and interspersed repeat sequences" in the original paper). Only 3% (the exact figure you cited) is coding sequence. The remainder is regulatory, junk or has unknown function. Probably most of it wouldn't be missed greatly, but that's not a settled issue currently.
If these samples are from many individuals, will the sequence from one gene make sense in the context of differing adjacent genes from other individuals even if they are all decent people?
Yes, they will.
There's not much sequence variation between humans. If two individuals had differences such that that might be a problem, then you'd notice interfertility problems. You'd probably class them as belonging to separate species.
What exactly does "mapped" mean?
In general it means that the location has been established relative to known markers. In this case, though, the chromosome has been sequenced : the areas have had their composition established base-by-base.
Does that mean they know what all the bases are in the average human?
Roughly, yes. The sequence is a mosaic derived from several people.
Does this imply any knowledge of the pattern of such variations?
Not in itself, no, although other work is continuing to establish this.
Does it imply any knowledge of the function of the encoded proteins?
Again, not in itself. Many of the identified genes have been studied already. Others have similarities to genes already known, either from humans or other creatures. Some have been inferred from features of the sequence itself and are of totally unknown function.
A biology class I took said that human DNA was 96% junk (not protein encoding).
Was this biology class wrong?
No. The vast majority doesn't code for protein, and most of this has no known function. Closely related species have widely differing amounts of this, so (together with other reasons) the current hypothesis is that it doesn't do much that's useful for the organism. Some of it is composed of "selfish" elements such as transposons : it might be the case that in a looser sense a lot of it is.
Several of the HGP institutions are involved in the SNP project, which uses a similar chromosome-specific-shotgun strategy to that used by Celera, with the data to be placed in the public domain. This is largely funded by pharmaceutical companies, with the aim of finding single-base differences between individuals that might be relevant to disease and its treatment.
I was actually at a seminar about it at lunchtime. It seems to be going nicely.
There are various libraries of samples being used, some derived from one person and others from panels of several to many individuals, suitably anonymous and from a wide range of descents.
Actually, it probably won't make much odds as there's little difference between people at that level. The point is, though, to do a Human Genome Project rather than a White European Male Genome Project.
The Sanger Centre's hoping to be able to announce the completion of Chromosome 20 sometime next year, and I understand that the GSC in St. Louis is hoping to do the same for Chromosome 7. Both have stats pages up, if you're interested. The Sanger's is at http://www.sanger.ac.uk/HGP/stats.shtml
The timing is actually entirely coincidental. We've been working on this for several years, and I've been involved in problem-solving in the closing stages. There's no way we'd have been able to time the finishing to match, and there's no way we (or Nature) would have been willing to delay it to that purpose.
I'd agree that it's happy timing, though.
Introns are certainly one category of 'Junk', but there's much more. In general the term 'Low Information Density' would be preferred.
That's roughly the claim : the sequence was there, but it isn't a gene until people study it. I'm told that certain lawyers think this makes sense.