Biohackathon
wjv writes: "Open source Bioinformatics hackers from around the world are meeting in the
first ever Biohackathon to hack, eat, hack, sleep, hack... The South African
Business Day has the scoop, or see our weblog. The
event is co-sponsored by my employer
and O'Reilly. I'm typing this from the
hackathon, and you wouldn't believe the buzz... or the scenic venue!"
Interesting venue for a biologically minded event. Many Capetonians will not go to the Oudekraal hotel. When the hotel was developed about 3 years ago, there were large protests against building on that part of the mountain because of its ecological sensitivity, the fact that it is one of the last undeveloped stretches of the coastline, and its proximity to a kramat (the burial place of a Muslim holy man). They also demolished a historic homestead to build the thing...
I am the director of a core molecular biology laboratory with a focus on agricultural genotyping at a major midwestern university. I am happy to see that there is an interest in providing better downstream tools for data analysis.
My main area of concern, however, is the lack of good tools to take the raw data from sequencing machines and produce genotypes. Most available software is vendor-specific, closed source, not very robust and extremely expensive. The closed source, vendor-specific solutions lock up the data in proprietary databases, making it difficult to migrate to equipment from other vendors in the future and to organize the data across many projects. All the software I have evaluated (including the few open source projects that exist) has the same basic flaw: it starts from the premise that the lab will use it to screen against an existing database of genotypes (for disease or pedigree). These are fine for the medical applications for which they were developed, but they do not address the needs of the basic research laboratory, which is working to discover the genotypes to begin with.
I would like to build an open source application that gives the user the freedom to choose the data collection platform, the freedom to move the data from one application to another and the freedom to improve and expand the application itself. I face two challenges: 1) An administration that says "open source, why would we want to use shareware?" I'm addressing this one by building the information infrastructure on Linux. 2) Finding qualified programmers who would like to work on the project. (I'm a pretty good admin, but I am not a programmer.)
The need for this work is great. In talking with other people in my field, I've found that the key thing they want to know is what software you are using to do the raw analysis. No one is satisfied with the current situation, but most of them are old school and don't know anything about open source software. I've shown them that we can use existing open source software to run the lab. I'd like to show them that we can develop our own software to do some of the basic work. Any volunteers?
http://www.extremeprogramming.org
You wouldn't believe the lack of anarchy among these people. They sound young, but there is a lot of personal discipline in that room.
The best product is the one that is tested and evolves with that experience - and this is working code, used in anger by the Human Genome Project.
Hey, check out http://www.ensembl.org and see what you think.