Slashdot Mirror


User: ewanb

ewanb's activity in the archive.

Stories: 0
Comments: 32

Comments · 32

  1. Perl in bioinformatics on Why Corporates Hate Perl · · Score: 2, Interesting

    As someone who has both written and read _a lot_ of Perl in bioinformatics, in particular in Bioperl and Ensembl, I have a rather love/hate relationship with Perl.

    I love: the low learning curve for people coming from biology, with a lot of forgiving behaviour (in particular, I think, the auto-creation of data structures as you use notation to fill in complex anonymous - think pointer-based - structures). This is probably the critical one: it means we can hire a much broader group of people with a much better understanding of biology, and they become productive far earlier.

    I love: the large and robust libraries accessing nearly every sort of database, web-app and other things you need

    I love: the consistency of behaviour between systems (don't get me started on Java or porting C++ code between compilers/library systems. Ugh! Unbelievable pain as one starts using those languages and moves between high-end systems. It's C for the fast stuff and Perl for anything else for portability in my book).

    I love/hate: The (huge) amount of robust existing Perl code that we have in Ensembl and that works day in, day out on multiple outings

    I hate: The lack of clean objects. Why, oh why, oh why?

    I hate: The inability to switch on strong typing and bigger checking optionally in libraries - I know you can do more these days, but it is still clunky.

    I hate: switching the word "continue" (in C) to "next" (it gets me every time)

    I hate: having to always brace if statements

    I hate: operators designed for one-liners that get in the way of good readable code - grep and map in complex lines are a pet hate of mine.

    I hate: the tortuous cross-language capabilities - compare Python's Jython and other C-level compilers. Soooo much better.

    Interestingly I coded in Python for about 6 months in the late 90s - very early on in Python - and lots about Python appeals to me. But then Perl came along, and lots of bioinformaticians were using it, and systems people were installing it by default on systems...

    Roll on Parrot. I want Parrot to be able to run
    Perl5 syntax code, Perl6 and Python/Java syntax
    all together, with easy ways to load in C level or compiled down libraries. That's what Perl needs to save it.
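
    A minimal sketch of the "auto-creation of datastructures" point above, for readers who have not met it. In Perl, an assignment such as `$genes{geneA}{transcript_1}{exons} = [[100, 250]];` creates the intermediate hashes on the fly. The Python below is only an analogue of that behaviour, not Bioperl or Ensembl code; the `autoviv` helper and the gene and transcript names are made up for illustration.

```python
from collections import defaultdict


def autoviv():
    """Return a dict that creates nested dicts on first access."""
    return defaultdict(autoviv)


# Hypothetical annotation store: gene -> transcript -> exon coordinates.
genes = autoviv()

# No intermediate level is declared first; each one appears when it is used.
genes["geneA"]["transcript_1"]["exons"] = [(100, 250), (400, 520)]
genes["geneA"]["transcript_2"]["exons"] = [(100, 250)]

for transcript, fields in sorted(genes["geneA"].items()):
    print(transcript, fields["exons"])
```

    The convenience cuts both ways: a mistyped key silently creates a new branch rather than raising an error, which is close to the lack of optional strict checking complained about in the same comment.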

  2. Re:1 in 2000 people on The 1000 Genomes Project · · Score: 1

    Also, don't forget that each person has two haplotypes, one from each parent, so
    when one sequences a person, one captures the variation on two human genomes at once.

    Of course, this all relies on the coverage you sequence at, and one option for
    the 1,000 genomes project is doing this at low (2x?) coverage, using pretty sophisticated
    methods to combine statistical power between sample datasets.

    The "1,000" though is more a round number that is in the right range. It might well be
    1346 people or something like that (often some multiple of 96, as 96, or 4*96, 384
    is the standard size of a molecular biology "tray" put into a robotic system).

    We're going to have a lot of fun at http://www.ensembl.org/ with this...
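
    The plate arithmetic above can be sketched as follows. This assumes whole 96-well plates and an illustrative range of 900 to 1400 samples; the function name and the range are hypothetical, not project figures.

```python
PLATE = 96  # wells per standard plate, as stated in the comment above


def plate_multiples(low, high, plate=PLATE):
    """Return (n_plates, n_samples) pairs with low <= n_samples <= high."""
    first = -(-low // plate)  # ceiling division: fewest plates reaching `low`
    last = high // plate      # most plates that still fit under `high`
    return [(n, n * plate) for n in range(first, last + 1)]


print(plate_multiples(900, 1400))
# -> [(10, 960), (11, 1056), (12, 1152), (13, 1248), (14, 1344)]

# Each person carries two haplotypes, so 1,000 people sample 2,000 genomes.
print(1000 * 2)
```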

  3. Re:As a UK local government councillor ... on UK Gov't Considers Expanding Open Source Use · · Score: 1

    This is probably why you need a consultancy
    firm (dare I say it... IBM or someone) to show
    you what is going on. If I had time... I'd be
    happy to show you what is going on.

    Raw OpenSource generally only appeals to people
    who are confident about what they want and understand the IT problem correctly. Then you can
    get this stuff for free, off the net and set up
    things for just the cost of the time of the guys
    who install it. And generally it is far more stable
    than any "commercial" solutions.

    But, in the absence of someone like that in your
    department, ring up IBM or RedHat (or hopefully
    they will see your post here, and some salesman
    will give you a call). You'll have to spend money
    at some point, but your total cost will
    be waaaay lower than a heavily marketed (presumably M$oft) "solution".

    Don't dismiss open source straight out because
    the raw software doesn't come with a fancy brochure.... that's a sign of strength...

    (if you would like some more pointers, I can
    help you out. But... looking at your web page,
    you seem to have a high comfort level with MS
    stuff, so I think it would be slightly pointless
    unless you really want to learn stuff.

    At some point you will be using open source
    directly - you already do indirectly via web
    sites and email - so, you might as well build
    your skill set up sooner rather than later)

  4. Made me smile on Phone Plus Sensory Deprivation Equals... · · Score: 3, Funny

    The idea that people would actively get into
    a swimming pool and put on a helmet to answer
    a work phone call. The mental image... is
    quite worrying in some cases.

    Though I find the best thing about working from
    home is that people don't have my phone number
    here, so ... no one calls me. And I go to no
    meetings. Magical.

  5. ESTs are hard work on Researchers Revamp Human Gene Count Estimates · · Score: 1

    There is a comment somewhere down here which is really that no one knows how to convert a whole bunch of ESTs hitting the genome into genes. The EST data is *very* messy. We've looked at this recently inside Ensembl and don't see a big win from confidently placed ESTs. Our opinion is that the Ohio State thang is just somewhat enthusiastic researchers getting good PR for their work.

    Check out http://www.ensembl.org/ for the more sober-headed view of this.

  6. Slashdot PR again! on Fastest Commercial Supercomputer To Be Built · · Score: 1
    This annoys me. Slashdot are really happy to pander to the PR that these sorts of companies have but consistently turn down interesting stories about how we are trying to make the human genome open and accessible for all, in projects like Ensembl. What are these guys really going to do with this? Probably nothing. They don't look like they know what they are doing. And yet they get posted to slashdot.

    I wish Slashdot was more interested in the real science of the genome and less PR orientated. Slashdot ain't what it used to be...

  7. Open source for genome data on Medicine And Open Source? · · Score: 2

    I like the article. The more of these sorts of
    articles that are around the easier it is for
    people like me to make an impact.

    BTW - on topic here somewhat - if you want to see
    an open source genome management system, take
    a trip over to

    http://www.ensembl.org/

    for your open source project ...

  8. Wow - CmdrTaco pissed off on Tech Stocks Tumble · · Score: 5

    That is an impressive show of being pissed off
    by CmdrTaco. I guess it got to him.

    Stock Market is a non-story to me as well.
    I don't think it should be commented on by slashdot either!

  9. Re:open source genome analysis & annotation tools on Celera Completes Human Genome. Sorta. · · Score: 1

    Yo Chris - thanks for the tag. I always feel that
    the signal to noise discussions on slashdot
    are pretty skewed. Who knows how this is all going to
    pan out.

    I have to admit I think we have done pretty
    well with the latest bioperl. Kudos to you
    as well, Chris...

  10. Some web sites...open source as well... on Learning About Genetic Engineering On The Net · · Score: 5



    It always amuses me how clueless slashdot as a group generally is about these things....
    Despite best efforts otherwise. It comes up as
    an "Ask Slashdot" related question regularly;
    slashdot posts pseudo-science stories or op-ed
    about cloning etc, and yet... slashdot hasn't
    attempted to *contact the actual scientists*
    involved to get their opinion.

    Yes - I have suggested this as an interview topic
    a number of times. Slashdot editorials are more
    interested in "wow-science" stories than real
    science. It annoys me. (but I still read slashdot).

    Here are some pointers:

    The largest public sequencing center in the world

    http://www.sanger.ac.uk/

    The US biological information portal

    http://www.ncbi.nlm.nih.gov/

    The European biological information portal

    http://www.ebi.ac.uk/

    Some open source projects in this area:

    (The bio* group.)

    http://bio.perl.org/

    http://www.biojava.org/

    http://www.biopython.org/

    http://www.bioxml.org/

    Open source genome annotation project

    http://www.ensembl.org/

  11. The answer is .... kinkos on Net Access on an American Road Trip? · · Score: 1


    I have been a long UK -> US road traveller,
    and bizarrely the best thing to do is track
    down a kinko's - kinko's offer reasonable
    (still pretty steep) cybercafe type access
    but they are everywhere (even in Knoxville,
    Tennessee, for example)

    I never tried to get dhcp into an ethernet
    port. I don't think they offered it then (this
    summer). But you never know - if enough of us
    ask ;)

    ewanb

  12. Work with geek girls - and it all works out. on Want More Geek Chicks? · · Score: 1

    I have worked (closely) with two female
    programmers. One was an ex-physicist with strong
    java/perl/c skills and the other was a c programmer who used to code asynchronous signalling stuff.

    Both were/are great. And we get on well. And the
    work is good.

    It confuses me why there are not so many girls
    in the industry but I guess the best way to
    solve it (like most things) is just to live to
    your ideals. So - I don't worry about the
    sex/age/culture/race of the people I work with
    and that seems good enough for me...

    I think talking about it helps air some issues
    but doesn't really change much.

  13. Re:Molecular Biology and BioChem for hackers on Distributed Computing and the Human Genome Project · · Score: 1

    join in with ensembl and help us out. You
    would learn *a lot* of biology v. quickly ;)

  14. Re:This was my idea. on Distributed Computing and the Human Genome Project · · Score: 1

    Thanks troc - just got around to reading this
    comment.

    I have sort of appealed at the top to people
    to come along. People seem more interested
    in writing about patents than getting down to
    nuts and bolts of course....;)

    If there is anyone out there who would like to
    do this coding, please do: I sure as hell don't know how
    to do it ;). But I know what to run...

  15. d.net coders wanted for DNA analysis on Distributed Computing and the Human Genome Project · · Score: 3


    It is clear from these postings that people would
    like the client to run. If there are people with
    experience in writing these sorts of d.net systems
    then please drop me a note. We have the problem
    for you to work on - it is just a question of
    figuring out how to do it.


    Drop me a mail (birney@sanger.ac.uk).

  16. Re:I think it's technically unfeasible on Distributed Computing and the Human Genome Project · · Score: 1

    There are aspects of the work which have
    a good data/cycles ratio. (surprisingly).

    I would read about the subject before you pronounce... ;)

  17. Re:warm and fuzzy on Distributed Computing and the Human Genome Project · · Score: 1

    Absolutely - see my reply to the post above yours.

  18. Re:warm and fuzzy on Distributed Computing and the Human Genome Project · · Score: 2
    Hardware at the moment is generally clusters of alpha boxes or intel boxes (running tru64 or linux respectively).

    The two big drainers on CPU for analysis are gene prediction (genscan) and database searching (blast). Database searching can't be distributed easily as you have to worry about the database ;)

    However, there are programs like sim4, genewise and est2genome that could greatly help us and could be distributed.

    Genewise you can download (I wrote it) at Wise2. est2genome is somewhere around as well.

    For the more general overview of the problem - check out ensembl for an idea of the project.

  19. Re:Difficult to distribute on Distributed Computing and the Human Genome Project · · Score: 2
    I assume that the original poster did not understand what was going on ;). Like a lot of slashdot in this case - concerned but not knowledgeable.

    Celera always talk about the assembly problem as they have Gene Myers solving it (he has) and think it is pretty cool. It is not trivial, but from my view (an annotation-centric view) not the most important thing.

  20. Re:Difficult to distribute on Distributed Computing and the Human Genome Project · · Score: 3
    Lars

    This is only for the assembly and not for the analysis. With analysis you have a better data/cycles ratio. Assembly is done at the genome centres anyway...

  21. Re:warm and fuzzy on Distributed Computing and the Human Genome Project · · Score: 4
    Consell -

    Great that you were following the talk. I thought I put everyone to sleep.

    The rate-limiting step at the moment is in fact effectively the mapping, then sequencing. The interesting thing about the analysis is that the amount of CPU is unbounded. If we have more CPU we just use more accurate algorithms. We can do something within the CPU bounds on the Hinxton campus, but if anyone wants to give me a supercomputer, then we could get more accurate analysis.

    I can always use more juice!

  22. Re: cycles/data on Distributed Computing and the Human Genome Project · · Score: 1
    Bioinformatics generally has a very good cycles to data ratio - ie - we have algorithms that take a lot of cycles for very little data. So it is feasible...

    Does anyone want to write it? If so - I have a lot of CPU hungry algorithms to run.
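
    A rough sketch of what "a lot of cycles for very little data" can look like as independent jobs. This is not genewise, sim4 or est2genome: it swaps in a plain Smith-Waterman local alignment score as a stand-in, and the function names, scoring parameters and sequences are all hypothetical. Each job ships two short strings and does work proportional to the product of their lengths.

```python
from functools import partial


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def score_targets(query, targets, map_fn=map):
    """Score one query against every target; each pair is an independent job.

    map_fn is the distribution hook: the builtin map runs the jobs locally,
    and anything with the same call shape can fan them out instead.
    """
    targets = list(targets)
    job = partial(local_alignment_score, query)
    return dict(zip(targets, map_fn(job, targets)))


# Made-up sequences, only to exercise the code.
print(score_targets("ACGT", ["ACGT", "TTTT", "GGACGG"]))
```

    Because the jobs share nothing, `map_fn` could in principle be swapped for `concurrent.futures.ProcessPoolExecutor().map` on one machine, or for a d.net-style client/server layer; that layer is the part nobody has written here.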

  23. Open Source Genome Projects on Distributed Computing and the Human Genome Project · · Score: 5
    There are some good open source genome projects for doing this efficiently - and we do welcome help of any kind. Here are some open source projects which I know about/work on:

    • ensembl is an open source genome project designed to get as much data and software into the public domain as possible
    • EMBOSS
    • bioperl
    All these are well backed, strong open source projects with different strengths. Every time genome stuff comes up on slashdot I try to point these things out to people, but everything gets lost in the noise about people $%!"'ing on about patents (generally without a lot of knowledge!).

    Anyway - check out these projects for more information about real open source efforts in biology.

  24. Not so impressed on SourceForge Goes Public Beta · · Score: 1
    I could not submit a bug in the source forge bug report area (doh. Can't even submit a bug that the bug submission does not work!)

    Had a dodgy certificate that explorer didn't like...

    And the projects that are there seem to be focused on mirroring other projects

    Finally - could you/would you trust someone else to keep a server up 24-7 for your source code? My experience of projects is that they need more than cvs/mailinglist. They need a coordinated web site and people close by to make it all work

    So. I am not moving from my work machine yet. But I guess this is the way things are going to go

    ewan

  25. Re:Perk/TK front end to readseq on New Genetic Information Web Portal · · Score: 1
    Check out bioperl. In particular the new 0.6 series (just available via anonymous cvs). Bioperl is more up to date than readseq, and it is in your favourite language.

    Bioperl at bio.perl.org