Slashdot Mirror


Bioinformatics

tadghin pointed out this Newsweek article on bioinformatics, and also notes: "At O'Reilly, we just published our first bioinformatics book last week, Learning Bioinformatics Computer Skills, by Cynthia Gibas and Per Jambeck, and it immediately rocketed to the top of the Amazon Computer bestseller list. This definitely appears to be a new area for the computer industry that's just starting to hit people's radar big time. I've also made the point to VCs looking at distributed computation startups that what I see on sites like slashdot is a lot of movement by hackers towards new and interesting problems. And science looks a lot more interesting than some of the business computing that's been front and center the past couple of years. And the Biological Open Source Computing Conference I spoke at last year was definitely popping with ideas and excitement. Unfortunately, this year's conference is in Copenhagen, right before the O'Reilly open source convention, but I definitely urge slashdotters to check out this area. Demand for perl expertise is especially high."

13 of 105 comments

  1. What you see on Slashdot ... by Anonymous Coward · · Score: 3
    what I see on sites like slashdot is a lot of movement by hackers towards new and interesting problems

    No, what you see on sites like Slashdot is a lot of talking by bored sys admins about new and interesting problems they wish they could work on.

  2. Re:Book looks like fluff by Jonathan · · Score: 3

    I haven't read the book myself, although I did know one of the authors (Per Jambeck) in grad school (in fact I still have his copy of Knuth's "The METAFONTbook" if he's looking for it). I doubt the book is fluff, just not for CS folk. Like all new sciences, bioinformatics is done by people coming from other areas. If you are looking for a book about bioinformatics for CS folks who are non-biologists, look at Dan Gusfield's "Algorithms on Strings, Trees, and Sequences" (1997), although it is beginning to be a bit dated.

  3. Re:similar book for CS people? by Jonathan · · Score: 3

    As I mentioned in another posting, Dan Gusfield's "Algorithms on Strings, Trees, and Sequences" is good, although getting a bit dated now. Another excellent book is Durbin et al.'s "Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids".

  4. Twenty Points To Whomever Finds DeCSS in DNA by VValdo · · Score: 3
    This seems to be a fun application of bioinformatics.

    Take some code, say the tiniest known CSS descrambler in C. Maybe compress it into a nice tight .zip/.gz binary. Now convert it to a DNA sequence. (It seems you could actually make a couple of possible sequences by switching around the letters.) I wonder what the odds are of finding one of these sequences in the billions of combinations currently being sequenced?
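
A minimal sketch of the conversion described above, assuming the simplest possible scheme: each byte becomes four nucleotides, two bits per nucleotide, high bits first. The function names and the default A/C/G/T ordering are illustrative assumptions, not an established encoding. "Switching around the letters" then corresponds to the 24 possible orderings of the four letters.

```python
from itertools import permutations


def bytes_to_dna(data: bytes, alphabet: str = "ACGT") -> str:
    """Encode each byte as four nucleotides, two bits per nucleotide."""
    if sorted(alphabet) != sorted("ACGT"):
        raise ValueError("alphabet must be an ordering of A, C, G and T")
    letters = []
    for byte in data:
        # Walk the byte from its two highest bits down to its two lowest.
        for shift in (6, 4, 2, 0):
            letters.append(alphabet[(byte >> shift) & 0b11])
    return "".join(letters)


def all_encodings(data: bytes) -> set:
    """Return the encodings produced by every ordering of the four letters."""
    return {bytes_to_dna(data, "".join(order)) for order in permutations("ACGT")}
```

For example, `bytes_to_dna(b"\x1b")` gives `"ACGT"`, because 0x1b is the bit pattern 00 01 10 11. A search would have to look for every sequence in `all_encodings(...)`, which multiplies the chance of a hit by at most 24.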
    -------------------
    This is my SIG. There are many like it, but this one is mine.
  5. Actually, yes, I do by FallLine · · Score: 3

    I happen to be involved in the biotechnology industry and I live in Philadelphia, so I know a thing or two about the subject. You, on the other hand, do not. I also went to business school, as in finance, economics, and all that jazz, so you're way off base there as well.

    You ignore many fundamental issues in this business:

    There is strong competition. This means that it is very rare for any one company to totally dominate a market, especially for a prolonged period of time. From an offensive point of view, this means that a company with its hands on a cure would not be giving up ownership of a whole market, only a sliver of one, and even holding that sliver carries the risk of falling behind as better alternatives come out. With a "cure", a company would:

    1) be free to charge a lot for it. HMOs and insurers would prefer to pay for a cure like this, especially when you consider that so many of the costs that they pay go not to any one drug company, but (mostly) to the thousands of other ailments ASSOCIATED with that disease. (e.g., hiring doctors, nurses, medical equipment, etc).

    2) have relatively low risk. This, in financial terms, is equivalent to money.

    3) have quick turnover. When you compare that to the average 10+ year time to market for the drug companies, that's like a dream come true. Put simply, 7b dollars today is worth a hell of a lot more to any one of these companies than 10b dollars over 5 years. This, again, translates to money. Hint: Those dollars could have been invested in less risky ventures and returned more.

    4) be able to take the entire market, rather than just a sliver. Meaning more money...

    5) save ongoing R&D dollars

    6) establish a solid reputation...
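
The time-value point in item 3 can be sketched as a present-value calculation. This is only an illustration under stated assumptions: the 10b arrives as five equal year-end payments of 2b, and the discount rates are hypothetical values chosen to show how the comparison moves. None of these assumptions come from the post.

```python
def present_value(cash_flows, rate):
    """Discount year-end cash flows (for years 1, 2, ...) back to today."""
    return sum(cf / (1 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


# Assumed payment schedule: 10b spread as 2b at the end of each of 5 years.
installments = [2.0] * 5

# Hypothetical discount rates; riskier businesses justify higher ones.
for rate in (0.05, 0.10, 0.15, 0.20):
    pv = present_value(installments, rate)
    print(f"rate {rate:.0%}: 10b over 5 years is worth {pv:.2f}b today")
```

Under these assumptions the five-year stream is worth less than 7b today once the discount rate passes a point between 13% and 14%, so the comparison leans harder toward the lump sum as the risk, and with it the rate investors demand, goes up.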

    In addition, sitting on a cure also can easily become a defensive problem, when and if competitors find it for themselves. All those minority players in a given market would have plenty of motivation to release a cure if they had it. Meanwhile, the company that sits on it risks losing all their previous sales.

    I could go on, but you just don't get it. Now this is not to say that it's so cut and dried that a company would never fail to invest in the discovery of a cure. There are certain times when the alignment of certain circumstances, say, risk, market size, or peculiarities of the disease, may prevent a company from investing large sums of money in a cure, but if you think companies sit on their hands in large and lucrative markets where such an opportunity is clearly exploitable, you're only kidding yourself.

    1. Re:Actually, yes, I do by FallLine · · Score: 3
      So it's more lucrative to charge a person once rather than weekly for the rest of their lives? I can't see how that's possible.
      Why not? Who says that a series of pills must be sold for more than a single one (not that a cure is necessarily a single pill, in fact that's very unlikely)? Who says that the profits on those sales must be more? If you think it's impossible, you have little to no understanding of business, never mind the drug business.

      OK, put it this way: imagine you're Eli Lilly, you're in a drug market, and you sell 2b dollars a year with 30% of a given market. However, that 2b dollars a year product took 15 years to bring to market. (Hint: This depreciates the value of that return hugely.) You've only been on the market 2 or 3 years and your patent will soon expire, meaning that your prices will get cut by 3x at least by the generics. Plus you've got other competitors banging at your door with alternatives today, chipping away at your sales. Furthermore, you should understand that the mere invention of that one drug was by no means assured; it was risky (investors demand a lot more return for taking on that kind of risk). You could very easily find yourself 3 or 4 years down the road without a single hit drug. In fact, to even have a hope of staying on top, you need to spend very substantial sums on R&D and marketing. Only 3 out of 10 drugs on the market meet or exceed their R&D costs. Of those, only a small fraction will really generate your profits. Realistically, you're looking at a profit margin of about 9-15% (9% when you figure in depreciation), when all is said and done (remember, only a very small fraction actually make it to market, let alone succeed), on a 2b dollar a year product. The picture I am painting is fairly close to reality.

      Now, imagine you're that same company, and you have a cure at hand (since you imply that they can do either just as easily). You can either continue down that same path (to the extent that you can control it) or you can bring the cure to market. The cure, if it's a given, is a no-brainer. That's about ~7b dollars in revenues in the first year alone if you could sell the "cure" for the cost of one year's worth of drugs, a very reasonable and low number. In fact, the HMOs and insurance companies would be willing to pay much more than this, considering how much they save on other medical bills; the cost of the complications alone far outweighs the cost of the drugs. What's more, that money comes relatively risk free. As a percentage of sales you would spend far less on R&D, meaning higher return for the shareholders. Marketing would also be significantly reduced, given that it is a "cure", which would quickly become common knowledge in the medical community. So quick and dirty, ~6b in profit (minimum) for the cure versus 180m a year (figure 9% of 2b) for however many years. It really is a no-brainer.
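
The back-of-the-envelope figures above can be checked with a few lines of arithmetic. This sketch only restates the numbers given in the comment (2b in sales at a 30% share, a 9% margin, roughly 6b in profit from a cure); the function name and the break-even calculation are added for illustration, and discounting is ignored.

```python
def cure_vs_treatment(annual_sales_b=2.0, market_share=0.30,
                      margin=0.09, cure_profit_b=6.0):
    """Return (whole market, yearly profit, years for treatment to match cure)."""
    # If 2b is 30% of the market, the whole market is 2b / 0.30.
    market_b = annual_sales_b / market_share
    # Ongoing profit from the existing drug at the stated margin.
    annual_profit_b = annual_sales_b * margin
    # Undiscounted years of treatment profit needed to equal the cure's profit.
    breakeven_years = cure_profit_b / annual_profit_b
    return market_b, annual_profit_b, breakeven_years


market, yearly, years = cure_vs_treatment()
print(f"whole market: {market:.2f}b a year")
print(f"treatment profit: {yearly:.2f}b a year")
print(f"years of treatment profit needed to match the cure: {years:.1f}")
```

The whole market works out to about 6.7b a year, which is where the ~7b revenue figure comes from, and at 180m a year the existing drug would need more than three decades to match the cure, far longer than any patent lasts.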
  6. Open Source Bioinformatics by Bizzaro · · Score: 3
    Some people in the field are now releasing their software under Free/Open Source licenses. It may seem odd to non-scientists that the license is an issue. Isn't all scientific work free and open? Far from it, especially in bioinformatics, where, as you may have read, there is a lot of money involved.

    A couple organizations have taken it upon themselves to promote freedom and openness in bioinformatics. One, Bioinformatics.org, has a modified version of SourceForge so that the community can perform project management and collaborations on a community-run website. Bioinformatics.org has other services, such as website hosting, news forums, a software registry and repository, and more to come. The organization currently hosts 27 projects and has over 600 members. (Disclaimer: I am the Director of the organization.)

    Another organization, The Open Bioinformatics Foundation, supports the development of several language libraries for bioinformatics, such as the famous BioPerl. They also host the BOSC conference mentioned in the post.

    --
    This sort of thing has cropped up before. And it has always been due to human error.
    HAL9000

  7. AI and Bioinformatics by acomj · · Score: 3

    It's interesting to see bioinformatics in the spotlight. I used to work at a place trying to do "meaning based search" in the medical field. They were working on, among other things, ontology-based search and a search for protein-gene relationships for quicker drug discovery.

    We also had a doctor on board before the money started to run out. It helps, because the biology terms are very foreign to computer types (assay, gene clips, etc.).

    There was a paper in the office by some professor who trained a Brill learning algorithm on existing genes and then had it try to guess what a random gene did. It did very well in the test despite the "primitive" AI.

    3rdmill and spotfire /labbook and a host of others are working on this stuff to sell to pharma companies to do better search and allow quicker, more accurate drug creation. The thinking is that if you can make a pharma company discover drugs faster than the rest, you can charge a boatload of money for the software. Discovering new drugs while keeping the side effects minimal is non-trivial.

    There is a lot of computing power in the life sciences field, and a lot of data created with gene-clips and assays. People can't sort it all out by hand anymore; computer analysis makes everything faster. Look at the human genome. Computers made it happen.

    "Sit back and enjoy the chaos" -Unknown

  8. Culture clash: biologists and programmers by bwt · · Score: 5

    There are two factors that I think are driving the emergence of bioinformatics: culture and data explosion.

    When I was in college, the computer science majors "hung out" with the math majors, the physics majors, and the electrical engineering majors. Biologists hung out with the less analytical crowd. Obviously these are generalizations, but I believe a lot of "the problem" is that culturally biologists just don't have very good computer skills. Suddenly it is the case that biology as a science absolutely requires these skills. If you were one of the few (and some do exist) that broke the stereotype, you need to be starting a company about now. Otherwise the race is on for the biologists to learn programming and the CS-math-physics types to learn biology.

    Second is the fact that biologists are drowning in data. Projects like the human genome project are producing lots of data, but that's just the tip of the iceberg. There is already an exploding market in high throughput assays and measurement computation. The result is that the field as a whole simply isn't managing its data well. Often groups store their data in extremely crappy formats: custom text formats, ASN.1, etc. I'm an Oracle programmer, so I expect the kind of solutions that banks and .com's use: big iron data warehouses running heavy duty RDBMS's like Oracle or DB2. Nope. I have yet to come across a single bioinformatics project that has a clue about data modelling. It's actually much above average to use a database at all, let alone well. If I was head of the NIH, you can bet that freshman biologists would take a class in SQL starting immediately.

    When you combine the two factors, culture and data inundation, very strange things start to happen. The data infrastructure just isn't there, and worse, a lot of people just don't realize it. Biology is presenting problems that require massive data warehousing solutions to a field whose main data background is calculating p-values to show the effect of a drug is significant.

  9. Re:Drugs by psin+psycle · · Score: 3

    hehe.. I thought they only created vaccines for things that would kill you. That way, by creating the vaccine they will actually make MORE money off you because you will live longer and spend more money on cough syrup and headache pills ;)

    --
    Need a website host? Try out http://WebQualityHost.net
  10. Re:Drugs by hillct · · Score: 3

    There is something to be said for this position (that drug companies can't make money on curing diseases but rather by selling drugs that treat symptoms), however it is a somewhat alarmist position, at least the way it has been expressed here. I don't know why it would be surprising to see a company invest in technology that will generate future profits.

    What bothers me about this issue is the attempts the federal government has made to regulate biological research with respect to use of the Genome Project data in such morally ambiguous areas as human cloning. The attempts to regulate this field of research are futile as they are being handled now, since the industry's profit potential is so high that virtually unlimited funds will be spent to house research facilities beyond the borders of countries that choose to regulate it.

    While on the subject, I'd like to applaud the genome project researchers for embracing the concept of 'Open Source' science. There were a number of firms that actively tried to gather together and copyright genome project data.

    Well done, gentlemen! You have allowed the creation of an entirely new field of science. The openness of the research data will reduce the perceived moral ambiguity of the derivative works based on that data.

    --CTH

    --Got Lists? | Top 95 Star Wars Line
  11. Drugs by swagr · · Score: 3

    Eventually, the proponents of bioinformatics claim, the new field will change health care by allowing pharmaceutical companies to shave years off the drug-discovery process, and letting doctors tailor medicines to an individual's genetic makeup.
    Pharmaceutical companies are around to make money. That's why they create drugs that treat symptoms and not drugs that are cures. Now they're investing in ways to make more money from us. Great.

    --

    -... --- .-. . -.. ..--..
  12. How many points for telling you the odds? by Flying+Headless+Goku · · Score: 4

    Stripped of header and gzipped, I get 366 bytes; at four nucleotides per byte (two bits each), that is 1464 nucleotides.

    The probability of any two random sequences of the same length being equal is the inverse of the number of expressible sequences of that length. In this case, that number is 4^length.

    When you are looking for a random sequence (of length N) within a longer sequence (length M), the probability of finding it is the above probability multiplied by M-N (the same chance over and over again for every sub-sequence of length N, assuming you don't count wrapping substrings).

    So N=1464, and M equals roughly 3 billion. So the probability is:
    (3*10^9-1464)/4^1464

    Which is roughly 1 in 10^872, far longer odds than even one in a squared googol (10^200).

    Of course, that assumes random data, but I figure it's a good enough approximation.
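
A minimal sketch of that calculation, under the same random-data assumption. It works in base-10 logarithms because 4^1464 is far too large to fit in a floating-point number, and it counts M-N+1 starting positions, which differs only negligibly from the M-N used above. The function name is illustrative.

```python
import math


def log10_match_probability(n: int, m: int, alphabet_size: int = 4) -> float:
    """Approximate log10 of the chance that one specific length-n sequence
    shows up somewhere in a random length-m sequence."""
    windows = m - n + 1  # number of starting positions for a length-n match
    # log10(windows / alphabet_size**n), computed without overflow or underflow.
    return math.log10(windows) - n * math.log10(alphabet_size)


exponent = log10_match_probability(n=1464, m=3_000_000_000)
print(f"probability is about 10^{exponent:.1f}")
```

Multiplying the single-position probability by the number of positions slightly overstates the true chance, but at these magnitudes the difference does not matter. Trying all 24 letter orderings adds only about 1.4 to the exponent.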

    Don't knock yourself out looking for it. It's not there.