Distributed Computing and the Human Genome Project

Posted by Cliff on Sunday November 28, 1999 @09:49PM from the breaking-the-genetic-code-instead-of-rc5 dept.

I'm sure most of you have heard about the Human Genome Project by now and how it is working to map our DNA. Apparently there is now a race going on, with corporations performing similar experiments, except with the intent of patenting the results. Now troc is wondering if another distributed computing effort might be in order. What do you all think? Click below for troc's actual question.

troc asks: "I was watching a TV programme on UK TV last night about the Human Genome Project and how there was a race to sequence and publish the whole thing before the private companies do it and patent the sequences. Basically lasers are used to break up the strands, these are then read and fed into a computer that tries to match the bits up with other bits like a giant jigsaw puzzle. This requires a lot of computing time.

Is this an opportunity for the open source movement to help decode the sequences and publish the whole thing before it's patented?

<soapbox>

I, for one, don't like the idea of a private company owning my gene sequences. They will be able to limit the use of these so only really rich pharmaceutical companies will be able to develop drugs etc. and then sell them at huge profits, which isn't really for the benefit of mankind blah blah blah.

</soapbox>"

I agree. I don't see how information like this can be patented. There is nothing truly proprietary about it, and it would do more good in the public where the benefit can truly be felt.

43 of 146 comments

Distributed computer projects... OT by famfurnell · 1999-11-28 16:59 · Score: 2

Are they really worth the effort? RC5 and SETI are both successful, but both of them require a permanent connection to the net (in essence) in order to get the best updates etc. (if ya get what i mean). As this was a UK-based thing, why not send the whole lot around on a CD?
1. Re:Distributed computer projects... OT by Kingpin · 1999-11-28 17:30 · Score: 3
  
  All this could be done much more easily. Use applets: people do not have to understand anything at all in order to help out on a project like this. No need to install obscure clients and whatnot. I think the only good use of applets is for easy distributed computing.
  
  --
  Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
  Geocrawler error message.
perhaps this will be a wake-up call by SEAL · 1999-11-28 17:00 · Score: 5

Patents, in general, have really taken a nose dive since the personal computer achieved widespread use. The original intent of a patent was to allow an inventor to come up with an idea and protect it for a period of time. Whether he profits from it or sits on it is then up to that inventor.
However, with the computer age, the speed of (dare I say) innovation has been astounding. This has produced two detrimental effects. First, the patent examiners simply don't have the niche expertise to scrutinize patents. I'm sure most of us have seen some of the idiotic patents out there. Second, the time span of a patent has become too cumbersome. By the time the patent expires, the invention is often useless.
I sincerely hope that this particular project will be placed under a HUGE spotlight when the patent requests inevitably filter in. I have a feeling it won't hold up, and at the very least, not in some countries.
However, keep in mind that this is scientific information about a human being, not software / computer advances. In that regard, a patent will be cumbersome, but not quashing. The patent (if granted) WILL expire someday. And I'm fairly certain that the information will still be very important and valuable when that day arrives.
Of course I'm all for beating the would-be patenters to the punch, if possible.
Best regards,
SEAL
1. Re:perhaps this will be a wake-up call by PG13 · 1999-11-28 18:10 · Score: 2
  
  >The patent (if granted) WILL expire someday
  
  Technically yes, and the same thing could be said about copyright. Except the industry which holds copyrights has gotten extremely powerful. An interesting trend is that whenever the original Disney copyright for Mickey Mouse etc. is about to expire, the copyright term is extended (yes, for both new copyrights and old copyrights).
  
  This extension of copyright clearly serves no public benefit (these works have already been created, so retroactively extending the copyright doesn't encourage the production of new works) and yet it is enacted! If the biotech industry became large enough such a scenario is possible (though less likely because of competition within the industry).
  
  For further information about the Copyright Term Extension Act and efforts to fight it, visit copyright commons
  
  --
  Marriage is the "pseudo-ethics" that cloaks the messy truth of sexuality in the raiment of propriety -- it's "Don't Ask,
Hold on. The seq's can't be patented. by reve · 1999-11-28 17:05 · Score: 3

Okay, before everyone hops on this really popular anti-patent train, let's make sure we note that the sequences can't be patented. Yes, independent companies are gonna beat out the Human Genome Project and have been filing patents. But the patents aren't on the sequences themselves, they're on applications. Whether these applications have to do with more efficient methods of genome-unraveling or whether they have to do with specific uses of the patterns they've found, it's NOT the actual sequences.
In a number of countries it's already quite specifically illegal to attempt to put intellectual property restraints on anything involving human genes. The US is considering some laws as well, but let's just get all the facts straight before panicking, okay?

--
-- r . m o s q u i t o --
1. Re:Hold on. The seq's can't be patented. by _Marvin_ · 1999-11-28 17:27 · Score: 4
  
  Of course the seq's themselves can't be patented. Otherwise anyone holding such a patent would be (AFAIK) entitled to control the reproduction of the sequences; that is, since we are constantly reproducing them in our bodies, he could charge us for letting us live... Now, this would make patent law a satire just too obviously.
  Still (again, AFAIK, correct me if I'm wrong), patents on gene sequences (that is, their applications) have a new quality: they do not cover only the applications that the patent holder has thought of, they cover all applications that become possible only if you know that gene sequence.
  If I remember correctly, there are already cases where companies hold patents on certain proteins in our bodies (again, not the proteins themselves but any of their applications) and you are not allowed to TEST for these substances without paying them license fees, even if you're using a completely new testing method you developed on your own.
  
  --
  "We won't use guns, we won't use bombs, we'll use the one thing we've got more of and that's our minds" - Pulp
This can't be open source! by ghoti · 1999-11-28 17:08 · Score: 5
Well I don't think anybody will say "No, let's not do it, let the big bad corps patent our genes!!".
The only problem I see here is that developing a distributed client for this takes a lot of time and effort --- and it is one which definitely cannot be open-source!
Two reasons:
- False results. If the data format etc. are known, it's possible to feed the servers bogus results, which could lead to inconsistencies in the data base. This might even destroy results that are already there (okay, this problem also exists with closed source stuff like SETI@Home, I know).
- Data Theft. An open source program could be modified by Big Bad Corporation Inc. to simply harvest raw data and feed it into their own computers, thereby gaining information they would otherwise have to find themselves. Granted, they won't have as much computing power, but when they have their own and the stolen data, they're still saving time. And I am not sure if enough data is produced to keep hundreds of thousands of computers occupied (see the problems SETI@Home had in the beginning).
So, sorry, folks, but I believe this is one of the few things that open source clearly is not suited for. But it would be kinda cool to have a proggy running on my machine that messed with genes ... ;-)
--
EagerEyes.org: Visualization and Visual Communication
1. Re:This can't be open source! by flux · 1999-11-28 17:25 · Score: 2
  
  False results can be handled easily: just submit the packet to two different places, or to 1.5 places on average, and if they disagree, the system checks the packet by itself (or hands it over to a third machine). Yes, it'll slow things down, but I can't see any other viable alternative.
  
  Data theft.. Isn't the idea that the data is already there, but it needs to be processed? No idea in data theft then. Also the system could look after domains or ip-address spaces that keep eating and eating the data space faster than anyone else and blackhole them.. Or sue them :).
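
The double-submission idea above can be sketched in a few lines of Python. This is only an illustration, not anything the project actually runs: the function name `verify_work_unit`, the client ids, and the quorum of two are all assumptions made for the example.

```python
from collections import Counter


def verify_work_unit(results, quorum=2):
    """Return the agreed result if at least `quorum` clients reported it, else None.

    `results` maps a (hypothetical) client id to the result that client returned.
    """
    if not results:
        return None
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    return value if votes >= quorum else None


# Two clients agree; a third submits a bogus result and is outvoted.
reports = {"client-a": "ACGT", "client-b": "ACGT", "client-c": "TTTT"}
accepted = verify_work_unit(reports)

# Only two clients, and they disagree: no result is accepted yet.
disputed = verify_work_unit({"client-a": "ACGT", "client-b": "TTTT"})
```

When `verify_work_unit` returns `None`, the server would hand the same packet to another machine, which is the slowdown mentioned above.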
2. Re:This can't be open source! by Lars+Arvestad · 1999-11-28 17:31 · Score: 4
  
  > Data Theft. An open source program could be modified by Big Bad Corporation Inc. to simply harvest raw data and feed it into their own computers, thereby gaining information they would otherwise have to find themselves. Granted, they won't have as much computing power, but when they have their own and the stolen data, they're still saving time. And I am not sure if enough data is produced to keep hundreds of thousands of computers occupied (see the problems SETI@Home had in the beginning).
  
  The Human Genome Project is extremely open. They try to make all data public as soon as possible, making patents impossible. So data theft is not an issue here.
  
  False results might be a problem, but I would expect it to be relatively cheap (computationally seen) to check a solution to see if it is valid.
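
As a rough illustration of why checking can be cheaper than solving, here is a minimal Python sketch. It assumes, purely for the example, that a "valid" solution is one in which every sequenced fragment appears verbatim in the submitted assembly; real reads contain errors, so a real check would need approximate matching. The name `unexplained_reads` and the sequences are made up.

```python
def unexplained_reads(assembly, reads):
    """Return the reads that do NOT occur verbatim in the proposed assembly."""
    return [read for read in reads if read not in assembly]


reads = ["ACGTAC", "TACGGA", "GGATTC"]
claimed = "ACGTACGGATTC"   # a submitted result that is consistent with the reads
bogus = "ACGTTTTTTTTC"     # a submitted result that is not

missing_from_claimed = unexplained_reads(claimed, reads)
missing_from_bogus = unexplained_reads(bogus, reads)
```

The check is one substring search per read, whereas finding the assembly means comparing reads against each other.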
  
  A distributed (open source) effort will probably not happen because a computation like this is more difficult to distribute than trying crypto-keys etc.
  
  Lars
  
  --
  Reality or nothing.
3. Re:This can't be open source! by ianezz · 1999-11-28 17:36 · Score: 2
  
  > False results. If the data format etc. are known, it's possible to feed the servers bogus results, which could lead to inconsistencies in the data base
  
  Send the same data to multiple receivers (randomly chosen), and see if they produce the same results (or, at least, coherent ones). If not, one (or possibly more) of them is lying. Anyway, a closed-source client does not prevent someone from seeing what it does and sending bogus data anyway. It only makes things harder for the ones that actually want to send correct data.
  
  > Data Theft. An open source program could be modified by Big Bad Corporation Inc. to simply harvest raw data and feed it into their own computers
  
  This is a more realistic issue, but Big Bad Corporation is probably rich enough to reverse engineer the protocol by itself, and access random lumps of raw data anyway. A closed-source client doesn't make much sense here.
  
  The real point is that modified versions (e.g. to improve performance) could quickly spread so that just a few people use the original clients.
  
  If suddenly it turns out that a widespread modified version produces erroneous data from time to time, then probably a large amount of computation has to be thrown away. Of course, you could check for that using the same method you use to check for "bad guys", but it's a serious problem if you have only a few people running the original.
  
  My 0.02 Euro as usual.
4. Re:This can't be open source! by ghazban · 1999-11-28 18:07 · Score: 2
  
  Plus there is the double dilemma of having the company send bogus results, sabotaging the project, and also using the program to their advantage to add to their databases.
5. Re:This can't be open source! by John+Allsup · 1999-11-28 18:56 · Score: 2
  
  I get the feeling that the patterns are significantly harder to find than to verify.
  
  This would make false data less of a problem (since it would merely act like any other flooding DoS attack).
  John
  
  --
  John_Chalisque
6. Re:This can't be open source! by fpepin · 1999-11-28 20:19 · Score: 2
  
  There is also a slight problem of the practicality of having a distributed client. The problem here isn't really a matter of brute force.
  
  You need to sequence the gene first. This is the long and costly part if I remember well.
  
  The computing power is used mainly to see the similarities with other genes already discovered (in humans and in other species). Here you need more of a huge database holding all the information as you simply search for matches and near matches in the sequences.
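
To give a feel for what "search for matches and near matches" means, here is a toy Python sketch that indexes short substrings (k-mers) of known sequences and ranks database entries by how many k-mers they share with a query. It is a deliberately crude stand-in, not how BLAST or any real tool works; the function names, the gene names, and the sequences are all invented for the example.

```python
from collections import defaultdict


def build_kmer_index(database, k=4):
    """Map every k-letter substring to the (name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in database.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((name, pos))
    return index


def rank_hits(query, index, k=4):
    """Count shared k-mers per database entry; more shared k-mers means a closer match."""
    scores = defaultdict(int)
    for pos in range(len(query) - k + 1):
        for name, _ in index.get(query[pos:pos + k], []):
            scores[name] += 1
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sequences and names, purely for illustration.
known_genes = {
    "gene_a": "ACGTACGTTAGC",
    "gene_b": "TTTTGGGGCCCC",
}
index = build_kmer_index(known_genes)
hits = rank_hits("ACGTACGATAGC", index)  # gene_a with one letter changed
```

Note that the index has to hold the whole database, which is the point made above: every machine doing the search needs the data, not just spare cycles.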
  
  I'm not sure it would be very useful to have a distributed client for this. And for myself, I'd rather wait a few more years and be sure that I can trust those results.
Prior Art? by JohnG · 1999-11-28 17:11 · Score: 2

Hmmm, does anyone else think God (or Allah or Odin, or the Great Bannanarama, or whoever your supreme being is) will have a problem with these big companies patenting His invention?
1. Re:Prior Art? by dylan_- · 1999-11-28 18:01 · Score: 2
  
  > Hmmm, does anyone else think God (or Allah or Odin, or the Great Bannanarama, or whoever your supreme being is) will have a problem with these big companies patenting His invention?
  Yes, he does. Unfortunately the Other Guy has all the lawyers.... :-)
  dylan_-
  
  --
  Igor Presnyakov stole my hat
2. Re:Prior Art? by wocky · 1999-11-28 18:34 · Score: 2
  
  You mean there are no lawyers in heaven?
  
  --
  David
Patents - just a few ideas by CormacJ · 1999-11-28 17:15 · Score: 2

Even if we did have a distributed effort and made advances, someone would still have to patent the discovery.

As we have seen with Y2K fixes and other things, making a discovery does not stop someone else patenting the idea.

An open source body would have to be set up to patent the discoveries just so that nobody else could patent them.

This body can declare its patent open for use.

There are a lot of legal issues here: if you open your patent too much, could you lose it?

Patent law is also a case of boilerplating your patent: you have to ensure that every option is covered and also included in the patent.

This sort of thing is costly, and this is why a lot of companies patent their ideas. Once they have the patent they recoup their investment, and then some.

If an open source patent body is set up, a lot of time will have to be spent considering patent administration and the costs involved.
DeCode Genetics by lawn_ornament · 1999-11-28 17:20 · Score: 2

I live in Iceland, and here there is the company DeCode genetics. They are building a huge database with the medical history of every Icelander in it, to be able to trace "bad genes".

The funny thing is, they're a privately owned company and still they are entitled to go through all your medical records at will and put them in a database.

Sure, they say it'll be secure, but what if they start selling info on you to insurance companies?
Imagine this:
you: Hi, I'm (some name) and I'd like life insurance.
insurance rep.: Well... I'm sorry... it's gonna cost you (insert obscene amount here) because your family has a record of heart failures.

These are just my thoughts... check it out for yourself. I think this has made it to most news media in Europe and America. Also check out www.ie.is

---
Killroy Woz Here
HGP almost completed; also, NIH computers? by The_Messenger · 1999-11-28 17:20 · Score: 2

I was privileged enough to actually speak with one of the NIH (National Institutes of Health) scientists working on the project earlier this year. He came to speak at our school's Medical Society. Being the geek that I am, I made sure to inquire as to the Y2K compliance of the computers used for analysis and data storage; alas, he wasn't involved in that aspect. ;-) He said he "thought they were", though.

If I remember correctly, and there have been no delays, it's supposed to be finished before 2002.

I tried to tape the whole question and answer session with my microcassette recorder, to put on my webpage (in RealAudio format), but he was against it. Oh well. (I would have tried to sneak it anyway from the back of the room, but my recorder has a crappy mic, so I wouldn't have gotten much by doing so.)

The whole concept is very cool... imagine being able to prevent disease on a genetic level...

Does anyone have any information on the computing systems being used? Come on, there have to be a few NIHers reading /.! ;-)

This is slightly off-topic, but has anyone else heard about this "Soul Catcher" project, which I think is based mainly in the UK? (Based on the concept of recording an entire human consciousness to a traditional physical medium, if I remember correctly.)

--
I like to watch.
1. Re:HGP almost completed; also, NIH computers? by imac.usr · 1999-11-28 21:17 · Score: 2
  
  >Does anyone have any information on the computing systems being used?
  > Come on, there have to be a few NIHers reading /.! ;-)
  
  I work as a Macintosh support tech over at NHLBI (the National Heart, Lung, and Blood Institute) and interviewed recently for a position over at NHGRI (I didn't get it mainly due to non-competition agreements between the federal contractors who supply NIH). Like any good geek, I asked about the machines in use on the project. Apparently, while some processing is done here in Bethesda, a lot of it is done at other sites (universities and such) on Unix boxen, although my interviewer wasn't sure of the specific platform. At the institute itself there's a fairly large number of Macs used for graphic analysis of the data and both Macs and Wintel PCs for basic stuff like writing papers and reports.
  
  I can tell you NHGRI is pretty well funded within NIH, right up there with the cancer institute and the infectious disease institute (which deals with things like AIDS and whatnot). They certainly have more translucent Macs than any other institute. :-]
  
  And yes, they do use Linux there, although from what I gather, it's mostly being used by individuals experimenting with the system, and not for any actual rendering/mapping of gene data. Coincidentally, I took my first Linux support call a couple of weeks ago from somebody here who installed Caldera 2.2 and needed help setting up networking. Got him set up in only minutes, and soon he was enjoying NIH's 300kbps-and-up network connection. Makes watching MacWorld keynotes a lot more viable.
  
  If you check the Netcraft records for NHLBI, NIDDK (National Institute of Diabetes and Digestive and Kidney Diseases), and NHGRI, you'll see that NIH is far from your typical NT government shop. Plus, the NHGRI main website has lots of info on the project and why it's a Good Thing.
  
  BTW, slightly off-topic: there are 12 people in my support group, and of those, I'm the only full-time Mac tech, while two others are mostly PC techs with some Mac skills. Oddly enough, the PC people are always busier than me despite having roughly the same number of machines to support.....
  
  --
  I use Macs for work, Linux for education, and Windows for cardplaying.
Sex = piracy? by vaxer · 1999-11-28 17:27 · Score: 2

You heard it here first -- intellectual-property idealists will revive a grand tradition by copying their genes without a patent license. Someone will print "Information wants to be free love" on a black T-shirt, and all around the world, geeks will go out into the streets and protest WIPO and the genome barons by having sex. With themselves, mostly, but hey -- it's the thought that counts.
Software Patents in EU by Anonymous Coward · 1999-11-28 17:28 · Score: 2

Since when did they start allowing software patents here ?

If they are indeed allowing them then they are restricting my freedom of expression. My programs are my art (I don't and won't write them for money). Patenting software is like patenting the golden ratio in paintings.

I don't know if you've ever really considered this, but not all software is about money; in my case it's mostly about creativity and art (all non-profit, well, except intellectual profit). In case you want to know, I mostly write sound synthesis and processing software and the field is very heavily patented. What artist makes paintings and doesn't share them with other people ? Imagine if you couldn't show a painting to other people if you painted it with a certain brush unless you pay license fees. This is what software patents are to me and probably many others. They need to be stopped now (or at least make non-profit use legal) !
AC
Open Source Genome Projects by ewanb · 1999-11-28 17:34 · Score: 5
There are some good open source genome projects for doing this efficiently, and we do welcome help of any kind. Here are some open source projects which I know about or work on:
- ensembl is an open source genome project designed to get as much data and software into the public domain as possible
- EMBOSS
- bioperl
All these are well backed, strong open source projects with different strengths. Every time genome stuff comes up on slashdot I try to point these things out to people, but everything gets lost in the noise about people $%!"'ing on about patents (generally without a lot of knowledge!).
Anyway - check out these projects for more information about real open source efforts in biology.
1. Re:Open Source Genome Projects by bluets · 1999-11-29 22:39 · Score: 2
  
  In evolutionary biology, where we are focusing on reconstructing the tree of life, there are actually very few programs that are licensed under the GPL or the LGPL. There is *one* program (Paup, being distributed with manual by Sinauer) upon which most evolutionary biologists depend that has been in beta testing for 6+ years, with a 30-day expiration built into the binaries (of course, source code is not distributed). The author refuses to license the code under the GPL or the LGPL or any other type of open source licensing scheme. Where I work, we have a cluster of Linux systems for this tree of life reconstruction; they are sitting mostly idle because the most recent beta of this program expired last January. The next beta is not even likely to have PVM or MPI support. Anybody want to do some programming for me? :)
TIGR, HUGEP and genomics by jw3 · 1999-11-28 17:53 · Score: 5

Hello, my name is January and the group in which I am doing my Ph.D. thesis sequenced in 1996 a bacterial genome (Mycoplasma pneumoniae). Since we are into genomics, transcriptomics and all other -mics I know at least a little about the way it works - although on a much smaller scale.
First issue: could distributed computing help? My answer is a brief "no". First, the bottleneck is on the experimental side - getting the sequences, and not putting them all together. Second, although you need quite a lot of computing power to do so, much of the job must be revised and checked by humans, i.e. there is a lot of skilled manual work to do - you have to have "an eye" for the sequences. But the first point is more important.
Now, TIGR, the commercial alternative to the Human Genome Project, has sequenced more organisms than any other scientific group in the world. J. Craig Venter seems to be a very efficient and hard-working guy. Even if you don't like the idea of making money with patents in this area, the scientific community owes him a lot: he was the one to sequence the first organism, to sequence Helicobacter pylori and many, many others. On the other side... you know, when the M. pneumoniae sequence was about to be published, it was supposed to be the first Mycoplasma sequence. But Venter was faster with Mycoplasma genitalium, and he kept it quiet, so no one involved in sequencing those organisms actually knew there was a race. Now Venter claims to be able to complete the human genome with much less effort and much less $$, and considerably faster than the HuGeP. I'm not sure whether he is able to do so or not, because it depends chiefly on the "hardware" side: the new Perkin Elmer automated sequencers they are supposed to use.
Anyway, the question is whether it is good or bad if Venter sequences the human genome. In my opinion, it's OK. The HuGeP is somewhat different in its purely scientific interest, and I'm convinced that they will produce data of much higher quality. On the other hand, the human genome has considerable variation, so two genomes are better than one. I would not be very concerned about the patent issue, because it will come anyway (because of **!'*%$! American and international patent law): even if TIGR did not sequence the genome, someone would take the output of the HuGeP project and patent the same sequences Venter would. Venter just wants to gain a little time for evaluating the sequence before releasing it to the public.
And of course, it is not the _sequences_ that are patented; what is patented is the use or modification of a certain sequence for medical purposes, or a certain enzyme as a target in medical treatment.
Regards,
January
warm and fuzzy by counsell · 1999-11-28 18:05 · Score: 5

It's good that hackers are well-informed and principled enough to think it matters. This happens to be my area of interest; I'm responsible for Bioinformatics at the Institute of Cancer Research in the UK. A couple of weeks back I went to an excellent talk by a clever guy called Ewan Birney from the Sanger Centre near Cambridge, UK. He is writing code to catalogue and annotate the assembled sequences in real time as they come off the mammoth robot sequencing "production line". On one of those rare occasions where the British are leading a "big science" project, the Centre has been responsible for the largest fraction of the Human Genome sequenced at any single institute. The code does stuff like figure out which bits of the sequence are real genes and which bits are that 90%+ of so-called "junk DNA" you might have heard of and also attempts to assign provisional functions to the genes by various computational means. Eventually people in white coats will have to confirm such assignments properly, but it's important to beat the drug companies to making good guesses.

Ewan's code and all the data are entirely Open Source. If you've got a good reason and a reasonable Pentium with lots of memory and a 30Gb hard disk you could mirror the human genome and get it updated every night. (I feel strange just typing that sentence and I've been following this story for years). The Wellcome Trust and others (including US and European government agencies) funding the project are keeping everything Open because that's the way science is done and because this will subvert commercial attempts to stake a claim on our species' genetic heritage. (Er, go Wellcome!)

Biochemists often talk about the "rate limiting step" in a reaction---the single point which sets the speed of the whole process---like a bottleneck. As far as I understood Ewan's talk (if you're reading this Ewan, please put me right), the rate-limiting step with the Genome Project isn't the assembly of the sequenced stretches of DNA (or "contigs") as the original poster suggests, but the collection of the data in the first place. At the Sanger they have clusters of PCs and Alphas crunching the contigs---distributing the effort would give us all a warm fuzzy feeling, but wouldn't be essential. Again, I may be wrong about this.

One thing that definitely is a priority is making some sense out of all of this information. What would be great would be if members of the global community of hackers started taking molecular biology and biochemistry classes so they could write code to help people like me make sense of the embarrassment of riches that the project is creating. I'm off to Cambridge in two weeks to the Bioinformatics Open Software Development meeting to listen to some project leaders talk and discuss the existing efforts. Personally, I would love to give crash courses in biology to programmers with time on their hands in an effort to harness their collective genius rather than sponsor an effort to write a contig-crunching client to harness their collective spare cycles, but I have no idea how such a thing could be organised. Any ideas?
1. Re:warm and fuzzy by ewanb · 1999-11-28 18:13 · Score: 4
  
  Counsell -
  Great that you were following the talk. I thought I put everyone to sleep.
  The rate limiting step at the moment is effectively the mapping in fact, then sequencing. The interesting thing about the analysis is that the amount of CPU is unbounded. If we have more CPU we just use more accurate algorithms. We can do something within the CPU bounds on the Hinxton campus, but if anyone wants to give me a supercomputer, then we could get more accurate analysis.
  I can always use more juice!
2. Re:warm and fuzzy by ewanb · 1999-11-28 19:01 · Score: 2
  
  Hardware at the moment is generally clusters of Alpha boxes or Intel boxes (running Tru64 or Linux respectively).
  The two big drains on CPU for analysis are gene prediction (genscan) and database searching (blast). Database searching can't be distributed easily as you have to worry about the database ;)
  However, there are programs like sim4, genewise and est2genome that could greatly help us and could be distributed.
  Genewise (I wrote it) you can download at Wise2; est2genome is somewhere around as well.
  For the more general overview of the problem - check out ensembl for an idea of the project.
Difficult to distribute by Lars+Arvestad · 1999-11-28 18:06 · Score: 4

Common successful distributed projects in cryptography rely on the fact that all you need on a client is the algorithm and a few keys to try. Therefore, clients are really cheap (resourcewise) to distribute and use.

In the case of the Human Genome Project, the situation is somewhat different. A well known analogy is the following: Take a few copies of a newspaper. Feed it through a shredder. Remove a handful or two of paper. Insert errors. Now, piece together one copy of the original newspaper.
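
The shredded-newspaper analogy can be made concrete with a toy Python sketch of greedy overlap merging. This is an assumption-laden illustration, not the method used by any genome centre: it ignores the inserted errors and missing pieces from the analogy, does not handle a fragment contained entirely inside another, and the names `overlap` and `greedy_assemble` are made up for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if shorter than min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = frags[i] + frags[j][size:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


# Overlapping scraps of one "newspaper" sentence.
reads = ["the quick br", "ick brown fox", "own fox jumps", "x jumps over"]
contigs = greedy_assemble(reads)
```

Even this toy compares every fragment against every other fragment on each pass, so a client needs all the fragments in memory at once, which is the data problem described below.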

In order to make a useful contribution, a client is going to need a lot of data. This means that it will be difficult to distribute (long downloading times for instance) and that few people will appreciate having the client on the machine because the client will be using a lot of memory and the machine might be a bit unresponsive (your HGP screensaver might flush all your apps to disk for instance).

Lars

--
Reality or nothing.
1. Re:Difficult to distribute by ewanb · 1999-11-28 18:15 · Score: 3
  
  Lars
  This is only for the assembly and not for the analysis. With analysis you have a better data/cycles ratio. Assembly is done at the genome centres anyway...
2. Re:Difficult to distribute by Lars+Arvestad · 1999-11-28 18:35 · Score: 2
  
  Ewan is a very informed and knowledgeable guy at one of the key centers in HGP, so he needs more moderation. Hey Ewan, go get more karma!
  
  > This is only for the assembly and not for the analysis. With analysis you have a better data/cycles ratio. Assembly is done at the genome centres anyway...
  
  Then I don't get it. The original submission was about trying "to match the bits up with other bits like a giant jigsaw puzzle". Clearly this is about the assembly problem, no?
  
  What kind of analysis would this be?
  
  Lars
  
  --
  Reality or nothing.
3. Re:Difficult to distribute by ewanb · 1999-11-28 18:50 · Score: 2
  
  I assume that the original poster did not understand what was going on ;). Like a lot of slashdot in this case: concerned but not knowledgeable.
  Celera always talk about the assembly problem as they have Gene Myers solving it (he has) and think it is pretty cool. It is not trivial, but from my view (an annotation-centric view) not the most important thing.
Who makes drugs now? by lovebyte · 1999-11-28 18:38 · Score: 3
I, for one, don't like the idea of a private company owning my gene sequences. They will be able to limit the use of these so only really rich pharmaceutical companies will be able to develop drugs etc and then sell them at huge profits, which isn't really for the benefit of mankind blah blah blah.
This is an interesting statement. How do you think drugs are made now? Well, they are made by big pharma companies which make (often) a good profit. Drugs are not made for the benefit of mankind. They are made to make money.
When it comes to patenting the use of some genes, we should consider that:
1. Patents are short-lived.
2. A company has no interest in not using its patent. So for some money, other companies will be able to buy patents.
3. Patents don't stop anyone from working on whatever is patented. Lawyers always find ways to circumvent patents.
On the subject of open source distributed computing for genome data, I am afraid I agree with other people here. There is simply too much data to download. It's a pity, but it won't work. Maybe in a few years' time, when the problems in genomics have changed, other problems might be more suitable for this type of computation.
--
I'll do it for cheesy poofs.
Patents are anti-competitive by Morgaine · 1999-11-28 18:59 · Score: 2

I'm surprised that the US in particular hasn't done anything to reduce the most glaring anti-competitive aspects of patenting. Doesn't the free market lobby have anything to say on the topic?

Patents have always been intended to reduce competition for a limited period, so that inventors have an opportunity to bring their research to market during a sort of protected honeymoon period, but in practice that no longer works very well in the modern world. It's all to do with timescales: in the computer age and with instant global communications, timescales for everything are shrinking, and in some areas an advantage period for the patent holder of more than say just a couple of years is starting to become inappropriate, a restraint on progress, development and trade. Although it's impossible to tell what might have been, who knows which entire market sectors might have developed if their pivotal idea hadn't been tied down by patents.

Be that as it may, it's rare for a week to pass without totally ridiculous patents being highlighted here, and the analogy with icebergs definitely applies -- there's vastly more out there that we don't see on Slashdot. The whole area is clearly in utter shambles and needs urgent review.

A "fix" doesn't have to be complicated. As far as I can see, just three things are needed: a ban on patenting algorithms (as enforced elsewhere); a short, strict and non-extensible time limit (possibly related to the field, eg. default 2-3 years but longer in the nuclear power arena, for instance); and an informal "public review" system not unlike Slashdot, run by the patent office and used both to supply niche information and also to weed out the type of nonsense that translates into "how to breathe air".

But of course, something that simple could never come about, because otherwise patent lawyers would be out of a job. Oh well.

--
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
d.net coders wanted for DNA analysis by ewanb · 1999-11-28 19:17 · Score: 3

It is clear from these postings that people would
like the client to run. If there are people with
experience in writing these sorts of d.net systems
then please drop me a note. We have the problem
for you to work on - it is just a question of
figuring out how to do it.

Drop me a mail (birney@sanger.ac.uk).
Molecular Biology and BioChem for hackers by Morgaine · 1999-11-28 19:30 · Score: 2

Well, if you do decide to hold such classes then be sure to let us know. If it's anywhere near Cambridge then that means a 2-hour commute for me, but it would be well worth it -- this is an extremely important area.

I sure hope that what you have in mind is evening classes though, as otherwise you'll get just the unemployed to attend, which would be limiting.

Sounds like an excellent project!

--
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
Decode the sequences? by heroine · 1999-11-28 19:52 · Score: 2

Never knew there was a race to decode gene sequences using computers. There is a race for low paid women to load the sequencers but the "decoding" of the sequence is not the limiting factor. You've got to be damn good to get into those labs. Harvard PhD quality.
Distrib client worries: you're looking at it wrong by Morgaine · 1999-11-28 20:05 · Score: 2

The whole area of concern about clients being compromised to return incorrect results stems from the meme-setting effect of dedicated clients like rc5des, seti@home and (it seems) all others currently in existence. Their susceptibility to being cracked and reworked is entirely due to the dedicated nature of their task, as it gives nasty-minded people a visible target.

The problem would not arise if distributed clients were generic, ie. if they would do arbitrary computations on arbitrary data received from arbitrary sources. In other words, if a global distributed computing system accepted numerous different computational tasks from the public and distributed interleaved fragments of them arbitrarily to an undifferentiated pool of clients, it would no longer be possible for clients to be compromised meaningfully. (Clients would really just be maths engines, and you'd be detected pretty quick if your client made 2+2=5.)
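
One way to read the "2+2=5" remark is as spot-checking: the server mixes in tasks whose answers it already knows and stops trusting any client that gets one wrong. Below is a minimal Python sketch under that assumption. `Task`, `audit_client` and the two sample clients are hypothetical names for illustration, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Task:
    a: int
    b: int
    known_answer: Optional[int] = None  # set only for probe tasks the server can check


def audit_client(client: Callable[[int, int], int], tasks: list[Task]) -> bool:
    """Return True if the client answered every probe task correctly."""
    for task in tasks:
        result = client(task.a, task.b)
        if task.known_answer is not None and result != task.known_answer:
            return False  # caught returning a wrong result on a probe
    return True


def honest_client(a: int, b: int) -> int:
    return a + b


def cheating_client(a: int, b: int) -> int:
    return a + b + 1  # makes 2 + 2 = 5


# Real work (no known answer) interleaved with probes; the client only
# ever sees the operands, so it cannot tell the two kinds apart.
workload = [
    Task(7, 8),
    Task(2, 2, known_answer=4),
    Task(10, 5),
    Task(1, 1, known_answer=2),
]

print(audit_client(honest_client, workload))
print(audit_client(cheating_client, workload))
```

Spot checks only catch a cheat with some probability, so a real system would presumably combine them with sending the same fragment to more than one client.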

Would there be interest in creating such a global computing system as a free software / open source project?

[Note that pretty single-task stats displays would still be available from the task sponsors' site, but that's a completely separate issue to the one of data distribution and computation.]

--
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
somebody patented Brit's natl dish by Anonymous Coward · 1999-11-28 20:09 · Score: 2

Someone in Japan has applied to patent curry, of all things. If successful, the guy gets a royalty every time the Brits dig in.

And with the WTO, all other countries will have to recognize and comply with the patent.

Can you believe it? Curry, of all things. Wonder what the folks in India think of this. We've lost our minds.

It's in today's London Times.
Re:Why exactly would it help to patent this info? by Lars+Arvestad · 1999-11-28 20:15 · Score: 2

However, if standard medical law prevails, there's no way they can deny a person access to the information necessary to save that person's life or to prevent his/her disease if that person cannot afford to pay for the information. Basically, just like an emergency room can't turn away people who can't pay, how could a company that patented a human genome withhold that information from people who can't afford to pay?

Hmm, sounds like a good point, but lawyers have probably worked this out already. After all, you can patent compounds that are used for various treatments, and this has been going on since before the discovery of DNA.

I read in a text on patenting that you cannot, for example, patent a surgical method, but you can patent a device that is basically necessary for the same surgical technique.

Lars

--
Reality or nothing.
hmm by SKicker · 1999-11-28 20:42 · Score: 2

I worked for a bit as a CO-OP student in this area last summer, which is not to say I know anything about this, but.. :]
While distributed computing would probably benefit the HGP, there are a couple of points to take into consideration.
1) How secure is distributed computing? SETI and RC5 aren't really all that concerned with the integrity of the data they are getting back. They can just re-check a data block if it is a sure sign of ET or whatever. Here there will need to be a guarantee that data has not been tampered with.
2) It seemed to me that some of the tools used could do with some open source style improvement by the hacking (coding) community before throwing lots of computing power at them.
As for the patent stuff... bah! Let the lawyers mess around with that, everyone else can concentrate on the advancement of the human race.. or something like that.
links:
Genome database
The Sanger Centre
The NCBI
False results (irrelevant) and feasibility (???) by jabbo · 1999-11-28 22:49 · Score: 2

Unless someone has the time and money to distribute microarrays and bench time at a local hospital with a good clinical lab, the clients would be worthless. Venter's efforts are succeeding because of Celera's partnership with Perkin-Elmer.

However, Celera appears to be less than picky about the quality of data they are producing. So the same approach as theirs (multiple shotgun sequencing runs for each block of base pairs) with parity checking and/or some means of verifying data would be fine.
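
The "multiple runs plus some means of verifying data" idea can be sketched as a per-position majority vote. This Python sketch assumes the reads are already aligned and of equal length, which real shotgun reads are not. `consensus` and the `min_agreement` threshold are illustrative names and values, not anything Celera is known to use.

```python
from collections import Counter


def consensus(reads, min_agreement=0.6):
    """Majority-vote each column of equal-length, pre-aligned reads.

    Positions where the most common base falls below min_agreement
    are reported as 'N' (unknown) instead of being guessed.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be pre-aligned to the same length")
    called = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(reads) >= min_agreement else "N")
    return "".join(called)


print(consensus(["GATTACA", "GATTACA", "GACTACA"]))
```

With three runs, a single bad base call is outvoted; with only two runs that disagree, the position is flagged rather than resolved.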

Celera's operation is effectively a distributed effort already, it just happens to be in one building. The government will most likely step in and appropriate the sequences for a reasonable fee if it turns out that Venter et al. have reneged on their promises to distribute the sequences freely.

--
Remember that what's inside of you doesn't matter because nobody can see it.
What may end up happening... by otis+wildflower · 1999-11-28 23:22 · Score: 2
... is something along the lines of Lexis/Nexis and the law. Nobody can copyright the law, but the indices and commentaries can be copyrighted, and (whether we like it or not) currently the search algorithms to manage those indices can be patented.

So, to carry it over to genetics, the underlying genes (law) cannot be copyrighted (and this is ambiguous still: is it code (copyright) or algorithm (patent)?), but the indices and commentaries on the sequences can be copyrighted, and the search/combination techniques and/or machinery can be patented.

So, we need to make clear and loud the mandate that:
1. the human genome itself constitutes information that is in the public domain (or, at worst, the property of the person(s) who contributed the gene sample(s))
2. that while indices and commentaries on the genetic code may be proprietary in a society that protects proprietary intellectual property, equally protected is the privilege of the people to compile a separate set of indices and commentaries, at public expense, of the same public-domain information, or to license (or acquire by legal means) . Assuming, of course, that any research funded by public money is released into the public domain.
We need to define the problem fairly and completely, then fight strenuously to make sure that bad precedent is not set.
Your Working Boy,