Bioinformatics in The Economist
Erich Schwarz writes "Bioinformatics has gone from being an esoteric sub-field to being a business. The Economist gives a useful overview, while warning 'Bioinformatics is not for the faint of heart...'"
If I had to do it again, I'd definitely choose biology or bioengineering or something related.
It seems as if almost everything in computing has "been done", and biology/chemistry/biochemical engineering seem to be where all the fun & excitement are these days.
Anyone else agree? Just curious.
there's a ferrari driving around downtown calgary with a plate "bioinfo"...
;)
reminds me of the "I love linux" plate someone had on a lamborghini in the late 90's, which used to be shown off at linux shows and stuff...
some people will always make money off the stuff we give for free (like genes)
Is it really what we want/need as humans? I'm not sure. But I for one won't wager a guess until there's more research done in the area, so I say let's explore it more before we defame it conclusively or support it as a technological breakthrough.
Some other recent news items:
Nabda, Unesco Collaborate in Bioinformatics Training
AllAfrica.com, Africa - 05 Dec 2002
... Development Agency (NABDA) and the United Nations Education Scientific and Cultural Organisation (UNESCO), penultimate Tuesday held a two-day Bioinformatics ...

Bioinformatics ahead for Danville
Danville Register and Bee, VA - 30 Nov 2002
... Developing these plants will involve both horticulture and bioinformatics and will be one major focus of Danville's Institute for Advanced Learning and Research ...

The race to computerise biology ...
Economist (subscription), UK - 12 Dec 2002
Welcome to the world of bioinformatics--a branch of computing concerned with the acquisition, storage and analysis of biological data.

Observing Proteins And Cells In The Wild: Quantum Dots May ...
Science Daily - 13 Dec 2002
... Today it is internationally renowned for research and graduate education in the biomedical sciences, chemistry, bioinformatics and physics.
Biological processing units!
Imagine being able to create a creature which is basically a living supercomputer! It will break the limitations of current CPUs.
We already created a polio virus from scratch, and we are trying to create an organism from scratch.
(bad joke ahead)
Imagine living beowulf clusters: we could create a cell with CPU-like properties, they would reproduce by splitting, and your biocomputer's computing capacity would double every few minutes!
Now if each cell could perform 1 megaflop then a petaflop computer would need 1,000,000,000 cells. 2^30 is approx. 1,000,000,000. So if it took 10 minutes for cells to split, then a petaflop biocomputer could be grown in about 5 hours!
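The back-of-the-envelope figure above can be checked in a few lines of Python. This is only a sketch of the joke's arithmetic: the 1 megaflop per cell and the 10-minute division time are the assumptions stated in the post, and the function name is made up for illustration.

```python
import math

PETAFLOP = 1e15            # target capacity, in floating-point operations per second
MEGAFLOP = 1e6             # assumed capacity of a single cell (from the post)
MINUTES_PER_DIVISION = 10  # assumed time for every cell to split once (from the post)

def doublings_needed(target_flops: float, flops_per_cell: float) -> int:
    """Divisions needed, starting from a single cell, to reach target_flops."""
    cells_needed = target_flops / flops_per_cell
    return math.ceil(math.log2(cells_needed))

n = doublings_needed(PETAFLOP, MEGAFLOP)
hours = n * MINUTES_PER_DIVISION / 60
print(f"{n} doublings, {hours} hours")
```

Under those assumptions the growth time depends only on the logarithm of the target, so even a thousandfold larger machine would need just ten more doublings.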
Proteomics will be THE next medical frontier. Maybe we will finally understand how proteins work and consequently, how living things are built. That will eventually lead to real genetic engineering and maybe an organism could be constructed from scratch.
The biggest trick the devil pulled was letting lawyers become politicians so they can write the laws.
What, so you can represent each possible codon as a single bit? Last I checked, you can represent 64 values with a measly six bits. (2^6 = 64).
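The six-bit point can be made concrete with a short Python sketch. It packs each base into two bits, so a three-base codon fits in six. The base ordering and the function names are arbitrary choices made for illustration, not anything taken from the thread.

```python
BASES = "ACGT"  # two bits per base: A=0, C=1, G=2, T=3 (an arbitrary ordering)

def codon_to_index(codon: str) -> int:
    """Pack a three-letter codon into a 6-bit integer in the range 0..63."""
    index = 0
    for base in codon.upper():
        index = (index << 2) | BASES.index(base)
    return index

def index_to_codon(index: int) -> str:
    """Inverse of codon_to_index: unpack a 6-bit integer into three bases."""
    return "".join(BASES[(index >> shift) & 0b11] for shift in (4, 2, 0))

print(codon_to_index("AAA"), codon_to_index("TTT"))
print(index_to_codon(codon_to_index("GAT")))
```

Ten such codons fit in one 64-bit word with four bits to spare, which is the dense alternative to the one-bit-per-codon scheme being argued about.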
...
Run, don't walk to bioinformatics.org and contribute!
The first O'Reilly bioinformatics conference rocked. Shame I won't make the next one in San Diego - I get to go to Adelaide for the ISMB in June instead
I don't read your sig, why do you read mine?
I'm not sure from whence came all your 'hences', but the sixty-four combinations of ACTG only translate into 20 (some argue a few more) amino acids and start/stop signals. The system is highly redundant to lessen the impact of single-base polymorphisms (i.e. if a codon is CCA and the final A is accidentally copied as a G, the same amino acid is still produced in the resulting protein chain).
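The redundancy described above can be checked against the standard genetic code. The following is a minimal Python sketch: it builds the codon table from the usual 64-letter summary string (standard code, codons listed in TCAG order, '*' for stop) and measures how often a single-base change leaves the product unchanged. The helper name is hypothetical. Under the standard table most of the redundancy sits in the third base: CCA and CCG both give proline, whereas a change in the second base, such as CCA to CAA, gives glutamine instead.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def synonymous_fraction(position: int) -> float:
    """Fraction of single-base substitutions at `position` (0, 1 or 2)
    that leave the encoded amino acid (or stop signal) unchanged."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue  # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total

print(CODON_TABLE["CCA"], CODON_TABLE["CCG"], CODON_TABLE["CAA"])
for pos in range(3):
    print(pos, round(synonymous_fraction(pos), 3))
```

The table has 21 distinct outputs (20 amino acids plus stop), which is roughly where the "~23" figure in the thread comes from once start signals and the rarer amino acids are counted.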
... it's not 64bit computation. It's ~23bit computation ... but all those other buzzwords are fun too. :)
So hence
-j
Yes, it's an important field. It's been an important field for decades. And it's going to continue to make steady progress, not because of, but in spite of the attention and hype, and the stupid patents and opportunism that come along with it.
I'm not dissing the article completely yet (as I haven't finished reading it, and don't know if I'm completely interested), but I find it wonderful how ignorance among press still prevails.
There is no science (apart from Math itself, which I consider more of an Art) that has mathematical exactness in it. The word science comes from the Latin root scientia, and means knowledge. Sciences are disciplines where as much knowledge about the existing (thus empirical) world is gathered as possible, and models are generated based on this data.
Mathematics on the other hand, derives from Axioms, and Logic. Both of which aren't derived from the empirical world. And I say it's much more akin to Art because it is a skill that you develop to be a mathematician: you forge out of simpleness new more complex theorems. You are 'creating' them... (in science, you are looking for them).
To make a long story short, there is no such thing as a mathematically exact science.
Hmmm. Seems like it would be easier to say that everything which has been done in computer science has 'been done', whereas everything that hasn't been done in computer science, 'hasn't'.
Seriously, though, you may be mis-categorizing your subjects. Look at computers as computational entities, rather than disk drives, monitors, and so forth. In that case, an optical computer or a biocomputer operates on many of the same systems principles as a 'digital computer', and there is therefore much to be done in the field of computer science.
Absolutely. Optical computing is getting some great advances in Holographic Video at the MIT Spatial Imaging Group. And chemical computing is advancing nicely in Carbohydrate Chips at the University of Chicago.
For my money, I'd bet on optical video cubes, 3D television, and biochips in the future... which are all applications of computer technology. Remember, 'computer' used to refer to the job title of a person.
For my money, I think that the future has got SnowCrash, Cryptonomicon, Neuromancer, Count Zero, Mona Lisa Overdrive, and Johnny Mnemonic written all over it (and maybe a bit of Jurassic Park).
What you can do, however, is apply computer science and engineering skills to biological problems: work as a developer or engineer for a biotech company or lab.
"player 4 hit player 1 with 0 stroms"
Last I checked, you can represent 64 values with a measly six bits. (2^6 = 64).
But if you're searching for some pattern, representing each possible alternate codon as a separate bit has its applications.
Will I retire or break 10K?
What, so you can represent each possible codon as a single bit? Last I checked, you can represent 64 values with a measly six bits. (2^6 = 64).
0 000000000000000 000000000000000 000000000000000 000000000000000 00000000000000
That's fine, but the point is that one can make a sort of 'virtual gene' or 'genetic codon probability template', which happens to be optimized for a 64 bit computer, utilizing genetic algorithms. I agree with you that you can represent 64 values with six bits, which isn't exactly what I was referring to. The idea is to map each possible codon onto a bit of the packet which is going to be analyzed by the processor. Probably, one will want to create a big-Indian or little-Indian (endian, whatever) ordering scheme.
For example:
AAA = 1000000000000000000000000000000000000000000000000
AAC = 0100000000000000000000000000000000000000000000000
AAG = 0010000000000000000000000000000000000000000000000
AAT = 0001000000000000000000000000000000000000000000000
etc.
and so forth. Therefore,
1111000000000000000000000000000000000000000000000
would equal ( Protein | AAA, AAC, AAG, AAT).
Great for optimizing a mainframe for crunching through a combinatorial space of codons, searching for proteins.
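The one-bit-per-codon scheme sketched above can be written out in Python as follows. This is an illustrative reading of the post, not the poster's code: the alphabetical codon ordering (AAA, AAC, AAG, AAT, ...) follows the example, while the names `CODON_BIT`, `codon_mask`, `matches` and `show` are made up here.

```python
from itertools import product

BASES = "ACGT"
# Give each of the 64 codons its own bit: AAA is bit 0, AAC is bit 1, and so on.
CODON_BIT = {"".join(c): 1 << i for i, c in enumerate(product(BASES, repeat=3))}

def codon_mask(codons) -> int:
    """OR together the bits of several codons into one 64-bit template."""
    mask = 0
    for codon in codons:
        mask |= CODON_BIT[codon]
    return mask

def matches(codon: str, mask: int) -> bool:
    """Test whether a codon belongs to the template with a single AND."""
    return bool(CODON_BIT[codon] & mask)

def show(mask: int) -> str:
    """Render a mask as 64 characters with bit 0 on the left, as in the post."""
    return format(mask, "064b")[::-1]

template = codon_mask(["AAA", "AAC", "AAG", "AAT"])
print(show(template))
print(matches("AAG", template), matches("TTT", template))
```

The attraction of this layout is that membership in any set of codons is one bitwise AND, and unions or intersections of sets are one OR or AND on a machine word. The cost is that it describes a set of codons rather than a sequence, so it complements the dense six-bit encoding instead of replacing it.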
I graduated from Rutgers U. (decent NJ state school) in May 2000 with a bachelor's in biology. Back when I decided to major in bio, I really enjoyed studying the field, and (high school) teachers were telling me that molecular biology was the fastest growing job sector. So by the time senior year came around I began looking for a "real" job. I checked all the papers, company websites, monster, etc. and realized that there really are NO good jobs out there for biologists. There are a few bioinformatics jobs, but guess what, you need a CS degree for them, not biology. I ended up getting a lousy job as a lab technician paying around $14 an hour (which took several months to find, btw, and this was when the economy was booming), and I'm extremely unhappy. I've completely lost any love I had for this field. Say what you want about money not being the most important thing, but when you can't afford to do anything you want, your life gets miserable very fast, I don't care how great your job is. Biology jobs typically have zero mobility and are extremely underpaid. You think there's a flood of people graduating from CS? Biology is far worse. My CS classes have about 20-30 people in them. My bio classes had hundreds.
Just recently I decided to go back to Rutgers for a second bachelor's in computer science. Not only are the job prospects better and higher-paying (even considering the dot-com collapse), I've always enjoyed working with computers and my grades are actually far better (although I was never a bad student). I'm actually having fun in my CS classes; I never had fun in bio. The classes were more like a chore. Another horrible thing about biology is that you really don't learn anything practical in college, you just memorize facts. At least with computer science you learn many useful tools to make you a competent programmer, which is actually a marketable skill. Nothing about biology is marketable. I highly discourage anyone from majoring in Bio unless they seriously, seriously love it and intend to get a PhD and devote their lives to research without regard to trivial things like money and job prospects.
Karma: Excellent (In Soviet Russia, karma pimps YOU)
More commonly known as "gene chips", microarrays are to the genetic revolution of today what microprocessors were to the computer revolution a quarter of a century ago. They turn the once arduous task of screening genetic information into an automatic routine that exploits the tendency for the molecule that carries the template for making the protein, messenger-ribonucleic acid (m-RNA), to bind to the DNA that produces it. Gene chips contain thousands of probes, each imbued with a different nucleic acid from known (and unknown) genes to bind with m-RNA. The resulting bonds fluoresce under different colours of laser light, showing which genes are present. Microarrays measure the incidence of genes (leading to the gene "sequence") and their abundance (the "expression").
The analogy that comes to mind is a coin sorter. Is this an accurate analogy? It also appears that there is not necessarily an existing "slot" for many genes, like what happens if you get a coin from a country that your design did not include. You don't know where it will end up.
Table-ized A.I.
Thank you. =)
I agree that the problem can be implemented with a 23bit model. I believe that model is valid in many cases. However, I see the 64 bit model as being more accurate and representative of the underlying mathematics of genetics. Moreover, one can do the calculations with a 64bit model, and utilize a 23bit model as a check... if the 64bit model runs correctly, the results should be reducible to a 23bit explanation, and can be checked via another algorithm.
Also, I believe that the 64bit model can model a bunch of stuff that the 23bit model leaves out, including mutations, loops, reverses, and so forth. (i.e. aaa.aaa.CCC.ggg.ggg.ttt.ttt becomes aaa.aaC.CCg.ggg.ggt.ttt.ttX) I forget what that's called, but you get the idea. Deletion! Ah. Deletion, mutation, etc.
Anyhow, the 64bit model is really slick mathematically, and one can do really crazy cool masking and information analysis on the matrices which are formed.
I kind of hope I don't get modded down for this as I am totally serious: This is one of those posts where I didn't know whether to mod it "Insightful" or "Funny." Perhaps we need the new mod: "Lost"
Sunny
Be my Friend
Perhaps some of these bioinformation engineers should spend a little time on security. I tried to go to the website of one of the companies referenced in the Economist article and got a defaced website:
'Bioinformatics is not for the faint of heart...'"
... They're not kidding! Even the name is hard to spell!
Since when has this country used intellectual elite as a pejorative term?
Bioinformatics is a fun interesting field. I worry however, that it may be a little overhyped. People who are interested in bioinformatics need to realize it is a very (albeit cool) specialized field. There aren't going to be a million more bioinformatics researchers in the world. The demand for these researchers just isn't there (and won't be there in the near future). For example, a search on hotjobs reveals only 51 listings for the keyword bioinformatics and nearly 900 listings for programmer.
That said, bioinformatics is exciting. If a computer-savvy person is interested in getting into it, they should intern or work for a researcher/professor on a research project. You might be surprised, however, when you find that working as a programmer elsewhere pays 2-4 times more.
-Sean (sdm@stanford.edu)
An interesting overview about CI can be found at Nature.
Still, you need dedication for this job: A Ph.D. in chemistry plus solid computer science knowledge is still the norm. But those few who qualify are really sought after.
Disclosure: I am the Director of Chemoinformatics at start-up ChemCodes (www.chemcodes.com), so I know what I am talking about.
Same here. I just got my mod points message, and started browsing this article to see if there was anything I should mod. However, instead of "Lost", what I think we really need is "+1 Completely confusing to the mod but the poster sounds like he knows what he's talking about"
live(free) || die;
Nah. I'm not exactly 'lost'. I did live in the hometown of Gregor Mendel for a year during high school, where I studied Mendelian genetics (received a scholarship, via the 1994 congressional information act, and went with the Youth For Understanding program). I also studied genetics at the University of Chicago's Department of Ecology and Evolution. Anyhow, I happen to know a lot about genetics, actually... to the point that I'm making comments based on 'insider information', much like many people make 'inside jokes' which don't make sense to other people.
I would suggest the new mod: "Inside Info"
Bioinformatics is a fun interesting field. I worry however, that it may be a little overhyped.
Okay, being burned by past bubbles, how can I *this time* around make money from the poppage of future bubbles? (No, "stock puts" are too expensive for me.)
Table-ized A.I.
goto www.lanl.gov and click on the 'jobs' tab then the postdoc link.
here is one example:
Summary: Postdoctoral Positions in Protein Bioinformatics and Structural Genomics: The Bioscience Division (B-2 Group) is seeking 2-3 highly motivated researchers for immediate openings to work with our interdisciplinary team of Bioinformatics and Structural Biology. Research activities will focus on the development and application of methods in Functional and Structural Genomics, including: 1) inference of function in proteins based upon structural and sequence information; 2) prediction of protein structure, protein binding, ligands, and active sites using both ab initio approaches and experimental information; 3) identification of signatures of pathogenesis; 4) annotation and analysis of selected genomes; and 5) creation and curation of annotated protein databases.
Required Skills: Experience in at least 2 of the following areas is required (more than two areas of experience is highly desirable):
- Protein structure modeling or protein-ligand analysis or other related modeling
- Background in molecular biology, or microbial pathogenesis, or related fields
- Experience with the common sequence analysis tools for Blast search, sequence alignment, phylogenetic analysis, etc.
- Drug design, or protein design or protein structure predictions or docking
- Functional annotation of putative genes based on literature analysis
- Curation of biological databases and web programming
Desired Skills: Knowledge of one computer programming language (e.g., Perl, Python, FORTRAN, C++). Use of common molecular graphics tools such as Pymol, Xtal. Research in genomic sequence analysis or protein structure. Familiarity with SQL databases, unix, and XML is useful.
Education: A Ph.D completed within the last 5 years or soon to be completed is required.
Notes to Applicants: Starting salaries range from $59,300 to $67,300. For further technical information about the position and the project, contact Charlie Strauss at cems@lanl.gov (505-665-5838), or Murray Wolinsky at murray@lanl.gov (505-665-0952). Candidates may be considered for a Director's Fellowship and outstanding candidates may be considered for the prestigious J. Robert Oppenheimer, Richard P. Feynman or Frederick Reines Fellowships. Please see Special Postdoctoral Fellowships for further details.
For general information refer to the Postdoctoral Program page.
Some drink at the fountain of knowledge. Others just gargle.
http://www.ncbi.nlm.nih.gov/
and
http://genome.ucsc.edu/
Ah, just what I need: another sarcastic anonymous coward.
Sorry for sharing knowledge. I'll keep my mouth shut and write closed source code in the future.
Ummm. That's exactly my reasoning. 26bit computing is perfect for analyzing language, but only as it is written in the Roman Alphabet. Other alphabets, such as Cyrillic, would need other platforms.
And exactly my point that they are both related to molecular biology. But seriously... James Watson and crew, who theorized the double helix structure of nucleic acids back in the 50s, were part of a growing trend of molecular biologists who were investigating stereoscopic visualization of chiral molecules. And you are correct that visualization techniques and databases don't necessarily have anything to do with each other. Genetics, however, encodes what is fundamentally a chiral molecule and is therefore fundamentally encoding stereoscopic data. So, there is a connection, exactly at the molecular biology level.
I'm not certain I follow the reasoning as to why 64-bit computing is ideal for genomics. I mean, it's generally going to be faster and more efficient than 32-bit computing, but that really has little to do with codons. I don't know why you would need to assign a 64-bit number to an element in a set of 64 elements- as other posters have pointed out, you'd only need a 6-bit set of numbers to label 64 things. Besides, saying there are 64 codons is a tad naive, since it doesn't account for things like post-translational modifications to amino acids and nonstandard tRNA anticodons. I hope no one has tried to study the translation of something like collagen (full of modified amino acid hydroxyproline) thinking that the codons were the last word in the formation of the protein. I don't see why 64 bits is optimal as far as the crunching is concerned- are you saying that a 128-bit computer program would for some reason not be as suitable to the task?
"FDA staff reviewers expressed concern about the number of patients who were left out of the study because they died."
Oh, I think that 128bit would be a good platform, as well as any other multiple of 64, especially 256bit computing. Although, due to chunking, I would not attempt to do serious genetics work on a 32bit computer.
And please don't think that I'm trying to give the last word on this subject. I'm merely trying to point out an optimal method for number crunching. There are certainly many optimal algorithms. I suspect that each algorithm will ultimately produce different results, so it's important to consider the methods to be used before going out and spending a lot of time and money coding a project.
And, as you said, 128bit could do the task. 64bit chips are more available on the open market at the present time. Now, a 256bit processor, with an 8bit coprocessor, could do some amazing work in a number of applications which need to compute matrices.
I'm not trying to say that what I've suggested is the last word on this subject. Far from it. A great book on the subject is:
Genetic Algorithms + Data Structures = Evolution Programs. Zbigniew Michalewicz. 1996
I don't know why you would need to assign a 64-bit number to an element in a set of 64 elements- as other posters have pointed out, you'd only need a 6-bit set of numbers to label 64 things.
Representing each codon as a 64-bit word with a single nonzero bit would speed some operations up. Using this representation, permutation group operations can be programmed very efficiently using only loops and bitwise operations. Trying all possible permutations of a list of codons could be done really fast by blazing through them in Gray order. In this case, 64 bits is the minimum wordlength that you'd want, so that you can fit a whole codon into the CPU. As usual, the larger the wordlength, the better, since you could deal with more codons at a time.
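One way to read "Gray order" here is the reflected binary Gray code, in which successive bit patterns differ in exactly one bit. The Python sketch below walks through every combination of a handful of one-hot codon bits that way, using only a loop and bitwise operations. It is an interpretation of the post rather than the poster's method: it enumerates subsets, not permutations, and the function name is hypothetical.

```python
def gray_code_masks(bits):
    """Yield every combination of the given single-bit masks,
    changing exactly one bit between consecutive results."""
    mask = 0
    yield mask
    for i in range(1, 1 << len(bits)):
        # Between Gray codes i-1 and i, the bit that flips is the lowest set bit of i.
        flip = (i & -i).bit_length() - 1
        mask ^= bits[flip]
        yield mask

# Three codon bits picked for illustration (bits 0, 1 and 5 of a 64-bit word).
codon_bits = [1 << 0, 1 << 1, 1 << 5]
for m in gray_code_masks(codon_bits):
    print(format(m, "064b")[::-1][:8])  # first eight bit positions, bit 0 on the left
```

Because each step toggles a single codon, any score that is kept per template can be updated incrementally instead of being recomputed from scratch, which is where the speed claim would come from.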
except if you happened to apply a genetic algorithm to a bioinformatics problem
Well, yes. That is actually exactly what I was implying.
I happen to think that the book is quite good, and I have read it, and I do know what it's about. In fact, I took a class in which it was one of my primary reading sources. The class was entitled 'Cultural Evolution and the Dimensions of Globalization'.
Now then, there are many different ways to skin a cat. There are also many different ways to write an algorithm. I am merely saying that this book is a good source for learning more about optimisation problems, and how to code evolution programs, utilizing genetic algorithms. When you finish the book, you will understand that 64bit computing is an ideal (although not necessarily perfect) platform for bioinformatics and genetics work. Yes, I agree that it is not the only platform, but it is an ideal platform because a 'genetic data chromosome' can easily be written for all of the codons utilizing a 64bit vector.
Hey, I have a BS in Bio, and I am about to complete my law degree. I will soon be practicing patent law, which has always been a secure field. I know how bad the general /. community feels about patents, but they are a fact of life, and I for one think they are very important for technological advancement. (Are scientists worse in the EU? No? Then why is the US consistently the leader in innovation? A liberal patent system.)
Why slave away at a lab bench for $30k when you can make $100k starting as a patent lawyer?
"In order to make an apple pie from scratch, you must first create the universe." -- Carl Sagan, Cosmos
If you're really serious about doing any type of biology or chemistry, a PhD is a requirement. Physics people have more in the way of engineering jobs at the MS level, but also need a PhD to do serious research.
great opportunity for all of us MSCSE's.
Yes, actually. The point of the reading was to gain a greater understanding of how to optimize traveling salesman problems for distribution of global resources. We weren't just talking about cultural evolution and the dimensions of globalization to make us feel better... Rather, we were going through the mathematics of how to solve the traveling salesman problem and calculate memetic distribution amongst society. Topics included:
evolutionary genetics (genetics, memetics, bioinformatics, change management)
epidemiology (vector theory, networks, viruses, propagation, transmission)
demography (demographics, statistics, data mining, forecasting)
economics (markets, networks, advertising, buy/sell functions)
communications (telcom, network programming, routers, collaboration, push/pull)
science and technology (mainframes, personal computers, networks, design)
history (memory structures, databases, file systems)
political science (US Code, social programming)
Anyhow, those were most of the topics covered. It was a graduate level sequence, and I worked in a network programming laboratory at the National Opinion Research Center while I was taking the course. The other reading I used for that class included Knuth's The Art of Computer Programming, Kuhn's The Structure of Scientific Revolutions, Plato's Republic, and all of Wimsatt's personal publications.
Oh, and the course was taught by four people: a memetic evolutionist, a linguist, a mathematician, and a computer programmer. We would use an algorithmic template (the genetic algorithm) and create an instance and map that algorithm onto each of the above mentioned problems and discuss the pros/cons regarding implementation. There wasn't much purpose of taking the class if one didn't know how to optimize an algorithm.
I have recently finished my master's degree in biomedical sciences, my thesis and traineeships being all about bioinformatics.
Now I am wondering what country and especially which institutes offer the best atmosphere to do a bioinformatics PhD well. Does anybody have advice?
The only problem is biology is something everyone wants to do. It's what all the highest rated TV shows are about. The people who do it are celebrities on not just geek websites but real news. You have to spend a long time in school and a lot of money to get an entry level position anywhere in it because all the fellowships are taken by celebrities. By the way, biology is not a good degree to go into bioinformatics. Chemistry is where you should be.