Domain: bioperl.org
Stories and comments across the archive that link to bioperl.org.
Comments · 21
-
Re:So what?
And comparing Perl to biochemical engineering, I can see you have no idea.
Sorry, I forgot the humor tags.
Your response was such that I decided to go back and look at your other posts. You don't normally seem to snap at people, but you did that time. I think what you missed is the common complaint that Perl is a language riddled with side effects. I code in Perl as well, and I use them myself. Add a chemical to a body with a specific target receptor (or target effect) in mind, and we normally find that it hits many receptors and/or has many effects. That is why so many promising drugs fail: they have too many side effects. From a simple person's perspective, caffeine makes you feel more awake and more energized, but starts a ruthless cycle of actually making you less energetic (similar to sugars). Same thing for meth. How about NSAID painkillers? Their task is to block pain, but they have some very nasty side effects. Not necessarily common, but present, and enough of a problem to be wary of. That is what I mean by side effects. Perl is a great example from the programming side (especially on /.), as most people here tend to understand a few things about Perl: the ease of writing unreadable code (which I believe is universal to all languages) and the gotchas Perl presents through side effects (DWIM-isms). Yeah, Mr. Obvious could tell there is no real comparison between Perl as a language and biochemical activity, but the side effects caused by DWIM and the mess of obfuscated Perl seem to paint the right picture, from what I have touched.
There are no programming languages that can be compared to anything in biochemical engineering that I am aware of.
Yep, engineers are the ones that make the machines, but most of them still can not use them themselves to solve medical problems. Kind of like the guy who makes hammers but can't get the nails in. I want my hammer from him, but he is not building my house. I want my MRI from the best engineers, but I still want a good doc for what ails me. I have used Perl in gene analysis in the past. Better tools exist today, but Perl is still great for some things. We have used (and still do use) BioPerl to get things done amongst other tools as well.
InnerWeb
-
Re:Messy Speghetti Help
We are predominantly Perl programmers, at least on the European side - http://www.ensembl.org/ is the European-based genome browser (probably a million lines of Perl)... plus most of the data manipulation and presentation at the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/) is in Perl...
See also http://www.bioperl.org/
-
Re:So which programs do you use?
I am not sure why simply because it is about one of many available tools, the post is out of place on Slashdot. I am not a member of a huge biochem or medical lab, but I am trying to learn and use biochemistry, so I can use every bit of help.
It's out of place because the announcement is somewhat akin to posting a front page article when some guy releases version 0.1 of a new text editor onto Sourceforge. It's been done a million times before, and it doesn't cover any new ground. It isn't even interesting to people who don't use text editors.
That said, if you're really trying to get a handle on biochem and molecular biology (and the bioinformatics that goes along with it), almost all up to date textbooks on the subject include a section (or more) on bioinformatics. In 2006, knowing how to perform basic analysis on your DNA or protein sequence is just about as important as understanding the concept of a gene, or how the complementary nature of DNA works. If the textbooks you currently have are a little out of date, take a look around the library and grab something more recent. There are also plenty of bioinformatics and sequence analysis textbooks on the shelves now.
If you're looking for some places to get started (and I think someone has already mentioned these), try ExPASy. Although it's more protein oriented, it has an extensive list of links to a very broad cross-section of bioinformatics and sequence analysis tools (along with some tutorials). Also take a look at NCBI, which not only has a range of important tools (like BLAST), but also PubMed. In a similar vein, also explore the EBI site, which has another extensive set of tools and databases.
Since you ask, some of the stuff that I commonly use for bog-standard molecular biology tasks (in addition to the links above) includes PlasMapper (finds restriction sites and generates tasteful plasmid maps) and the New England Biolabs site which has some similar tools (NEBcutter, for example), but also handy information on all the restriction enzymes themselves.
If you're into writing bioinformatics applications yourself, start by looking at something like BioPerl. Just using Perl as an example (since it's very popular in biology), there are pre-existing libraries, all fully open sourced and Free(tm), which do things like reverse translation and interfacing with analysis tools like BLAST already.
That's just the tip of the iceberg. Anyone getting started in molecular biology will discover these kinds of sites very quickly. They're mentioned in the textbooks, they're easily found with Google, and they'll be revealed after a 2 minute conversation with anyone working in the field. That's what makes this story so pointless. There's nothing new here. It's all been done before, and done 500 times before at that. Even outsiders from other sciences will discover this kind of stuff within a day or two if they're actually serious.
-
Oriented Towards Servers?
I don't quite know what you mean when you claim that Bioperl, Biopython, EMBOSS and BioConductor are more oriented towards servers than stand-alone applications. First of all, servers and stand-alone applications don't divide up the application world into mutually exclusive parts. Applications can be stand-alone and run on a server, for example. I've built applications using Bioperl that have a GU interface (take that grammar nazis!), and people are extremely happy with them. So, if you have a Perl guy nearby, I highly recommend talking to them about your problems.
Secondly, translations? Database searches? Sounds like you're doing some very basic Bioinformatics work. Not to say that your research isn't meaningful, just that the problems you're approaching are easily solved by a computational biologist. For example, here's a snippet of Bioperl code that will read in a set of GenBank sequences, translate them and print the results to a new file:
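use strict;
use warnings;
use Bio::SeqIO;

# Read GenBank records from myseq.gbk; write translations to translated.gbk
my $seqin  = Bio::SeqIO->new( -file => 'myseq.gbk',       -format => 'genbank' );
my $seqout = Bio::SeqIO->new( -file => '>translated.gbk', -format => 'genbank' );
while ( my $seq = $seqin->next_seq ) {
    my $translated_seq = $seq->translate;    # protein translation of this record
    $seqout->write_seq( $translated_seq );
}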
Seems pretty simple, right? There are similar, simple wrappers around BLAST, FASTA and some other common algorithms in computational biology. Check out the Beginners HOWTO on the Bioperl website; it explains Bioperl without requiring previous CS experience. I think it's a good intro, but I also wrote it, so I'm slightly biased.
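For anyone curious what a translation step boils down to, here is a minimal sketch in core Perl with no Bioperl dependency. It is not Bioperl's implementation: translate_dna is a hypothetical helper, and it assumes the standard genetic code, reading frame 1, 'X' for any codon it does not recognise, and that a trailing partial codon is simply dropped.

```perl
use strict;
use warnings;

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
my @bases  = qw(T C A G);
my $aminos = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %codon_table;
my $i = 0;
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            $codon_table{"$first$second$third"} = substr($aminos, $i++, 1);
        }
    }
}

# translate_dna is a hypothetical helper, not part of Bioperl.
# It reads frame 1, maps unknown codons to 'X', and ignores a
# trailing partial codon.
sub translate_dna {
    my ($dna) = @_;
    $dna = uc $dna;
    $dna =~ tr/U/T/;    # accept RNA input as well
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
        my $codon = substr($dna, $pos, 3);
        $protein .= exists $codon_table{$codon} ? $codon_table{$codon} : 'X';
    }
    return $protein;
}

print translate_dna('ATGGCCAAGTAA'), "\n";    # prints MAK*
```

A real library also has to deal with alternative codon tables, reading frames and ambiguity codes, which is a good reason to use the Bioperl call rather than a hand-rolled table like this one.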
If programming is not your style, check out JEMBOSS. It's a Java-based GUI wrapper for EMBOSS.
Cheers and good luck.
-
BioJava BioPerl?
Isn't this already done in the BioPerl (http://www.bioperl.org/) and BioJava initiatives?
Of course this is focused towards gene annotation and stuff, I'll RTFA.
-
True
The similarity between open source and the academic process with their 'you share, I share' principles is shown by the human genome project.
Very true. "If you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas." -- George Bernard Shaw (1856 - 1950)
This is probably even more insightful when applied to biotechnology than to software at large.
Speaking about biotechnology and free software, check out the bioperl project:
"Officially organized in 1995 and existing informally for several years prior, The Bioperl Project is an international association of developers of open source Perl tools for bioinformatics, genomics and life science research.
"Facilitated by the Open Bioinformatics Foundation we work closely with our friends and colleagues across many projects including biojava.org, biopython.org, DAS, bioruby.org, biocorba.org, EnsEMBL and EMBOSS.
"The Bioperl server provides an online resource for modules, scripts, and web links for developers of Perl-based software for life science research. We can also provide web, FTP and CVS space for individuals and organizations wishing to distribute or otherwise make freely available standalone scripts & code."
Very interesting.
-
Re:because perl is a pig that runs out of memory
like on small amounts of data or when performance doesn't count.
So I guess genome processing doesn't count as a counter example to your FUD?
-
Re:because perl is a pig that runs out of memory
perl thrashes and crashes trying to crunch seriously large amounts of data
Yep, that's why it did this.
-
"Scientific Applications on Linux" page...
It's not 'hard numbers', but then, a lot of people have already pointed out that hard numbers may not REALLY be what you want. (After all, since when is "Everybody's doin' it" a persuasive argument for a good scientist?)
On the other hand, I see there are still lots of applications listed at the Scientific Applications on Linux site, and the NCBI Toolbox of bioinformatics code compiles and runs just fine on my Linux box. BioPerl, BioJava, and BioPython all run just fine on Linux, too (there are even a couple of fledgling BioPHP projects just getting started out there, which will obviously also work).
Disclaimer - both of the semi-active "BioPHP" type projects that I know of - Here and here - were started independently by individual amateurs...and one of them is me. Both projects are still in the early stages (Genephp has more code available at the moment) and have different development approaches, but are slowly working on combining development towards a 'formal' set of "BioPHP" modules. (Blatant plug - if you are interested in helping with friendly advice or actual development or testing, please join the mailing list which both projects use.)
-
Re:age-old answer: it depends
Just like all software, this should be "the best tool for the job." It is often determined by arbitrary personal preferences and resource availability (using Matlab over Mathematica just because there are licenses lying around).
I've done some genetics work using combinations of Matlab, Perl and C. I used Matlab for the heavy algorithmic implementation, with the really intense bits in C, and Perl to do the pre/post-processing. I've also used the Simulink package in Matlab to do some chemical signal modelling.
I've used the above software on combinations of Intel/Windows and SPARC/Solaris platforms (my employer didn't mind if I burned CPUs over the weekend). In that case, taking advantage of all that hardware (a few million $s worth) influenced my choice of language/toolkit. I know that a lot of quantitative analysts seem partial to C++. Many professors, especially the older crowd, use a lot of FORTRAN. One guy I know, doing high-energy work at CERN, uses Java. Some students, when I was an undergrad (late 90s), used something called IDL (maybe it's "Interactive Data Language") to process some astronomical data.
As another poster mentioned, physicists (which is mainly the category I fall into) tend not to trust any off-the-shelf stuff (hardware or software) and will typically look to build things themselves (or at least understand what they've bought in intense detail). Although, I've never gotten involved in serious computational physics (other than typical post-processing and visualization on data I've collected).
One statistical mechanics grad student knew all the details of the least-squares and regression implementations in Excel. Actually, you'd be surprised how many people will use Excel for simple calculations and visualization. It's quick and easy, once you know what you're doing, so why mess around doing it with GNUPlot or something else (which is equally/more efficient, but is another tool to learn)?
In the end, whatever requires the least amount of the researcher's time is the best solution. A little extra computing time (in many, but not all, cases) is negligible when compared to the time spent coding. The UI doesn't need to be slick, the performance (probably) doesn't need to be great (except for detector software that's generally real-time). All that matters is that the results are correct/accurate (and hopefully repeatable).
-
Re:Bioinformatics runs on Open Source
Or go straight to bioperl.org, one of the absolute key projects for bioinformatics.
In case you are wondering, there is a biopython, biojava, biocorba, bioxml, bioruby, etc., but perl is really where 90% of bioinformatics is done (simply because, in the end, all it is is text processing).
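To illustrate the "text processing" point, here is a minimal sketch in core Perl that reads FASTA-formatted text and reports the length and GC fraction of each record. The helper names parse_fasta and gc_content and the sample data are made up for the example; this is not how Bioperl does it, just the kind of line-by-line string handling the comment is talking about.

```perl
use strict;
use warnings;

# parse_fasta is a hypothetical helper, not a Bioperl function.
# It takes FASTA-formatted text and returns a list of [id, sequence] pairs.
sub parse_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;             # strip trailing whitespace, including \r
        next if $line eq '';
        if ($line =~ /^>(\S*)/) {      # header line: the ID is the first word
            push @records, [ $1, '' ];
        }
        elsif (@records) {             # sequence line: append to current record
            $records[-1][1] .= $line;
        }
    }
    return @records;
}

# gc_content returns the fraction of G and C bases in a sequence
# (0 for an empty sequence).
sub gc_content {
    my ($seq) = @_;
    return 0 unless length($seq);
    my $gc = ($seq =~ tr/GCgc//);      # tr with an empty replacement only counts
    return $gc / length($seq);
}

# Made-up sample data: two short records.
my $fasta = ">seq1 example record\nATGC\nGGCC\n>seq2\nATAT\n";
for my $record (parse_fasta($fasta)) {
    my ($id, $seq) = @$record;
    printf "%s\t%d\t%.2f\n", $id, length($seq), gc_content($seq);
}
```

Everything here is split, regex match and tr, which is why Perl caught on so early in this field. Real files bring edge cases (very large sequences, odd headers, other formats) that the bio* libraries already handle.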
-
Computers, Perl, and Bioinformatics
Go into a field that mixes computers and science. Like say bioinformatics or molecular modeling. I'm fairly ignorant on these subjects, but they seem much more interesting to do on a day to day basis. So that's where I'm trying to head.
A year ago I left the programming and management world to go back to get my Master's. The university I'm attending just started offering an option in computational biology. Once I started the computational biology option, it's been tremendously exciting. I've been approached by biologists who want me to roll my thesis work into their efforts--data mining biology-related data, etc. I've also been told by the department that biotechnology companies are just throwing grant money towards bioinformatics like crazy. If I decide to get my Ph.D., I'm assured it will be paid for.
And the best part of all? Check out BioPerl and bioinformatics.org. Open Source is quite popular in this field. It's incredibly refreshing to be hacking away at problems that don't involve the same old corporate data warehouse.
-
Open source bioinformatics tools
Get open source bioinformatics tools from:
bioinformatics.org
bioperl.org
biojava.org
and even www.cvbig.org for a talk on bioinformatics with PHP/Ming
-
from file_format_control-2-network_access_control
Yep, stay away from turning your data over to a third party. Heck, with CodeWeavers you can run all your MS Office crud on Linux (read the MSFT EULA, though; they might go after you if you run their code on a Linux box). Better yet, just ditch proprietary code altogether and turn your box into a dna_sequencer.
Heck, with all the $$ to be lost by "allowing users" to do such things, the biotech industry might just jump in along with the music and movie conglomerates.
--If you truly care about poverty, reform your IP laws.
-
More about the story
* Project page on freshmeat (sources, cvs, mailing lists, etc.)
* Bioperl Documentation
* Bioperl's Gene Object in UML (very nice diagram)
* Beowulf & Bioperl discussion
And related stuff that may interest you...
* BioPython
* BioCORBA
* BioXML
* BioJava
* BioRuby
* BioExchange Software tools (other tools for working with bio*, interesting.)
-
Re:perl expertise. sure.
Check this out:
Bioperl
Biopython
Biojava
BioXML
BioCORBA
I couldn't find anything for ruby (either linked from bioperl, as those were, or on their own app list) but you can bet it's coming. I'd personally love to see it. But there's plenty of options for bioinformatics other than perl, although perl's excellent text handling makes it a very suitable choice.
"I may not have morals, but I have standards." -
Re:Why just Perl?
While Perl is great for cranking out some web sites with high mutation rates anyway, IMHO Perl is a maintenance nightmare.
Anyone tried to make non-trivial changes to their old Perl programs?
My Perl programs were the ones that were hardest to understand after I had stopped working with them for a few weeks. It is usually easier to rewrite them.
Nonetheless I valued the good performance of Perl programs and was thus sceptical of the other kids on the scripting language block, like Python.
Months later, I must say that the much saner syntax of Python, the formidable documentation and the large library have changed my scripting preference from Perl to Python. Like Perl, Python has been ported to a lot of platforms.
Ruby is a language I have not looked into yet. Its strong Japanese supporter base has led to a lot of FreeBSD ports. So I might have a look soon.
BTW, there are bioperl, biopython, bioruby and biojava efforts - anyone spotted a bioc or bioc++ one? And some dork registered www.biofortran.org.
-
Re:Applying Open Source philosophy to Bioinformati
In addition, the skill requirements usually include advanced degrees in biology or statistics, things few average programmers can offer.
There are a lot of open source bioinformatics projects. These are typically spawned by university or other public research projects. You mention Python and Perl, so try bioperl or biopython for a start.
The one thing I didn't like about the biotech industry was how their research and information distribution was tied closely to their purse strings.
You will have a lot of open source (where the majority of development money will come from public research funds) and a lot of commercial applications.
It is unlikely that a large bunch of hackers will revolutionize this field. This is because you need a lot of domain specific knowledge and because a lot of work that needs to be done is too tedious or uncool to attract open source people from outside the bioinformatics field.
Something like the Gimp could be done, because nearly everyone needs such a tool - but who needs for example a multiple sequence alignment editor besides biologists?
Have you seen any open source satellite control software or hydrodynamic simulation come from outside those engineering communities?
-
find good hackers in science open source projects
My particular bias is life science, and it will show in the following URLs: bioperl.org, biojava.org, biopython.org. Find projects like these in whatever discipline you are interested in and you will find lots of science-capable programmers either actively contributing or participating in discussion. Don't spam the mailing lists with job listings, of course - be a bit more subtle and find out things like which school programs are producing these folks, what conferences & events they attend, etc.
-
full text of Celera/Science 'call to arms' letter
The open letter from Ewan Birney and Sean Eddy that the genomeweb article talks about can be read in full at:
http://bioperl.org/pipermail/bioperl-l/2000-December/001826.html
If you plan to write to Donald Kennedy, check the listserv email thread to read about a correction to his email address.
-chris
-
Re:The Genome project uses Perl
LizardKing said:
Most of the code behind the unravelling of DNA is written in Perl. And they even make it freely available ...
Sure, that's the bioperl effort, as well as some related projects. You can even see some of my contributions at bioperl.org. However, that's bioinformatics and not chemistry.
I've yet to see Perl used intensively for building chemical compounds, or doing molecular dynamics simulations, or doing substructure searches, or visualizing 3D structures. I have seen Python and Tcl code for those tasks.
To repeat myself, Perl is used a lot in bioinformatics, but rarely in chemistry.