A Genome Mark-up Language
There's an interesting story running about the need for, and development of, a genetic mark-up language. It's called GEML (Gene Expression Mark-up Language) and is basically a DTD. Obviously, when working with things like genes, GEML is useful, and it's a good example of why DTDs are muy bien.
The bioxml project has been trying to do this very thing for quite a while now. Previous to that, there was the biomolecular sequence markup language (BSML), and I don't think it ever came close to becoming a standard. The problem that these efforts always run into is the sheer diversity of opinion on how biological data should be represented. Molecular biologists and computational biologists can't even agree on the basic things, like how to represent sequence regions, let alone more complex issues, like annotation syntax.
Why Nature chose GEML as a standard is unclear--the article doesn't present a compelling argument for it over the alternatives, and the choice seems a little arbitrary. It'll be interesting to see what impact this has on the other projects, and how open the standard will be to extension and modification.
Let's try not to let fact interfere with our speculation here, OK?
From the GEML terms of use:
...
The GEML Format is a free, public-domain, open standard created and licensed by Rosetta Inpharmatics, Inc. ("Rosetta") in order to define a single, distinct format for handling gene expression data and avoid proliferation of incompatible variations.
You may not modify, lease, loan, sell, charge for, or create derivative works of the GEML Format or documentation without written permission from Rosetta.
So nobody can fork the standard without first consulting with Rosetta Inpharmatics. Wonderful. I just love their definition of "open standard."
This looks like another corporate-buddy move by a major scientific journal, much like the Science/Celera deal a few weeks back...
Go see bioxml for a truly open alternative.
Sorry, I'm too lazy to annotate this myself :-):
Link to NCBI
FASTA looks remarkably like the example given in the article.
Quick description of FASTA (just one of many schemes, but one of the oldest and most popular).
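For anyone who hasn't seen it, here's a rough sketch of how little it takes to read FASTA: a header line starting with ">" followed by one or more lines of sequence. The function name and the sample records are made up for illustration, and this ignores the corner cases real tools have to deal with.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines; join them.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical sample data, not taken from any real database.
example = """>seq1 hypothetical example record
ACGTACGT
TTGA
>seq2 another made-up record
GGCC
"""

records = parse_fasta(example)
for header, sequence in records:
    print(header, len(sequence))
```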
Perhaps rather than writing a trendy article trying to get buzzwords like genomics and bioinformatics together with geek speak, he should have done a tad more research.
That's not to say there can't be huge improvements, such as trying to show the interplay (temporal AND physical) between genes. But don't do a half-assed job by ignoring what has already been used for decades.
Contrast this with a relatively more recent model genetic organism, the roundworm Caenorhabditis elegans. Standards were set early: all gene names were standardized on the basis of phenotype (eat-4 is a worm with a mutant feeding behavior, unc-6 describes a worm with uncoordinated movement, lin-41 describes a mutant with abnormal cell lineage development, etc.), and the names are ASCII-friendly. As a result, C. elegans people have enjoyed standardized, searchable computerized gene databases for much longer than geneticists in other fields.
I hope that a standard gets set and rapidly adopted; lab chiefs (to us grad student peons anyway) can often seem like PHBs in IT when it comes to adopting new methods and paradigms.
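To show why that kind of naming is so easy on computers, here's a minimal sketch that splits names like eat-4 into a phenotype class and a number, then groups them. The pattern (three or four lowercase letters, a hyphen, a number) is my assumption about the convention based on the examples above, and the function names are invented for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern: 3-4 lowercase letters, a hyphen, then a number (e.g. unc-6).
GENE_NAME = re.compile(r"^([a-z]{3,4})-(\d+)$")


def split_gene_name(name):
    """Return (phenotype_class, number) for a well-formed name, else None."""
    match = GENE_NAME.match(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def group_by_class(names):
    """Group gene numbers by phenotype class, skipping names that don't fit."""
    groups = defaultdict(list)
    for name in names:
        parsed = split_gene_name(name)
        if parsed is not None:
            prefix, number = parsed
            groups[prefix].append(number)
    return dict(groups)


genes = ["eat-4", "unc-6", "lin-41", "unc-22"]
print(group_by_class(genes))
```

Because every name fits one plain-ASCII pattern, searching or sorting a gene database by class comes down to a string match.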
NO CARRIER
Insurance provider: Well Mr. Johnson, I'm afraid you have the tag.
Mr. Johnson: No!
Insurance provider: Yup. It's right between the <bald ugly-looking guy> tag and the <most likely to drink beer after finding out his wife gets fatter with age> tag.
Mr. Johnson: Oh God.
Insurance provider: I'm sorry.
Mr. Johnson: Is this hereditary? What can be done about my kids?
Insurance provider: Well, we can comment out the little buggers if we try. Some GScript may work to prevent them from passing the traits onto their children. Hell, we may even be able to use some Gava to touch up their faces so they won't be as ugly as you.
Mr. Johnson: And as for me?
Insurance provider: Your body is 2.0, Mr. Johnson. As far as we're concerned, no one supports you anymore.
- I don't care if they globalize against free speech. All my best free thoughts are done in my head.