Slashdot Mirror


Bioinformatics

tadghin pointed out this Newsweek article on bioinformatics, and also notes: "At O'Reilly, we just published our first bioinformatics book last week, Developing Bioinformatics Computer Skills, by Cynthia Gibas and Per Jambeck, and it immediately rocketed to the top of the Amazon Computer bestseller list. This definitely appears to be a new area for the computer industry that's just starting to hit people's radar big time. I've also made the point to VCs looking at distributed computation startups that what I see on sites like Slashdot is a lot of movement by hackers towards new and interesting problems. And science looks a lot more interesting than some of the business computing that's been front and center the past couple of years. And the Biological Open Source Computing Conference I spoke at last year was definitely popping with ideas and excitement. Unfortunately, this year's conference is in Copenhagen, right before the O'Reilly open source convention, but I definitely urge Slashdotters to check out this area. Demand for Perl expertise is especially high."

1 of 105 comments

  1. Culture clash: biologists and programmers by bwt · Score: 5

    There are two factors that I think are driving the emergence of bioinformatics: culture and data explosion.

    When I was in college, the computer science majors "hung out" with the math majors, the physics majors, and the electrical engineering majors. Biologists hung out with the less analytical crowd. Obviously these are generalizations, but I believe a lot of "the problem" is that, culturally, biologists just don't have very good computer skills. Suddenly biology as a science absolutely requires these skills. If you are one of the few (and some do exist) who broke the stereotype, you need to be starting a company about now. Otherwise the race is on for the biologists to learn programming and the CS-math-physics types to learn biology.

    Second is the fact that biologists are drowning in data. Projects like the Human Genome Project are producing lots of data, but that's just the tip of the iceberg. There is already an exploding market in high-throughput assays and measurement computation. The result is that the field as a whole simply isn't managing its data well. Often groups store their data in extremely crappy formats: custom text formats, ASN.1, etc. I'm an Oracle programmer, so I expect the kind of solutions that banks and dot-coms use: big-iron data warehouses running heavy-duty RDBMSs like Oracle and DB2. Nope. I have yet to come across a single bioinformatics project that has a clue about data modelling. It's actually well above average to use a database at all, let alone use one well. If I were head of the NIH, you can bet that freshman biologists would take a class in SQL starting immediately.
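
    To make the contrast concrete, here is a minimal sketch of what moving from an ad hoc text format to a modelled relational schema can look like. Everything in it is an assumption for illustration: the pipe-delimited "sample|gene|expression" format, the table and column names, the made-up values, and the use of SQLite in place of a heavyweight RDBMS like Oracle or DB2.

```python
import sqlite3

# Hypothetical records in an ad hoc "sample|gene|expression" text format.
# The values are invented for illustration only.
RAW_LINES = [
    "s1|BRCA1|2.5",
    "s1|TP53|1.0",
    "s2|BRCA1|3.5",
    "s2|TP53|0.5",
]


def load(conn, lines):
    """Create a small normalized schema and load the flat-file records into it."""
    conn.executescript("""
        CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT UNIQUE NOT NULL);
        CREATE TABLE measurement (
            sample_id INTEGER NOT NULL REFERENCES sample(id),
            gene_id INTEGER NOT NULL REFERENCES gene(id),
            expression REAL NOT NULL,
            PRIMARY KEY (sample_id, gene_id)
        );
    """)
    for line in lines:
        sample, gene, value = line.strip().split("|")
        # Each sample and gene is stored once; measurements refer to them by key.
        conn.execute("INSERT OR IGNORE INTO sample (name) VALUES (?)", (sample,))
        conn.execute("INSERT OR IGNORE INTO gene (symbol) VALUES (?)", (gene,))
        conn.execute(
            """INSERT INTO measurement (sample_id, gene_id, expression)
               SELECT s.id, g.id, ? FROM sample s, gene g
               WHERE s.name = ? AND g.symbol = ?""",
            (float(value), sample, gene),
        )
    conn.commit()


def mean_expression(conn):
    """Return (gene symbol, mean expression) pairs, ordered by symbol."""
    return conn.execute(
        """SELECT g.symbol, AVG(m.expression)
           FROM measurement m JOIN gene g ON g.id = m.gene_id
           GROUP BY g.symbol ORDER BY g.symbol"""
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, RAW_LINES)
print(mean_expression(conn))
```

    Once the data is in tables with keys and constraints, a question like "mean expression per gene" is a single query rather than another custom parser, which is roughly the argument for teaching SQL early.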

    When you combine the two factors, culture and data inundation, very strange things start to happen. The data infrastructure just isn't there and, worse, a lot of people just don't realize it. Biology is presenting problems that require massive data warehousing solutions to a field whose main data background is calculating p-values to show that the effect of a drug is significant.