Slashdot Mirror

Dumb Things With Bioinformatics

Posted by timothy on Thursday February 7, 2002 @07:26AM from the do-we-have-to-spell-out-g-a-g dept.

PrvtBurrito writes: "About 3% of the human genome is 'coded' as genes. The proteins those genes encode can be represented as long sequences of amino acids, a twenty-letter alphabet. In an attempt to perhaps prove that nothing is sacred, someone has cataloged all of the English words found in known annotated protein sequences from many organisms. It looks like, after cataloging over 37,000,000 characters, the longest word is 'chapstick' and the most common word is 'kilter'."
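For anyone curious how such a catalog might be built, here is a minimal sketch, not the method the original project used. It assumes the sequences are plain strings of one-letter amino acid codes and that a word list is already on hand. The function name `find_words`, the `min_len` cutoff, and the toy sequence are all hypothetical and only there for illustration.

```python
def find_words(sequence, words, min_len=4):
    """Return sorted (position, word) pairs for every word found in sequence.

    Overlapping and repeated occurrences are all reported.
    """
    seq = sequence.lower()
    hits = []
    for word in words:
        if len(word) < min_len:
            continue  # very short words match almost everywhere
        start = seq.find(word)
        while start != -1:
            hits.append((start, word))
            start = seq.find(word, start + 1)
    return sorted(hits)


# Made-up toy data, not a real protein sequence.
toy_sequence = "MKILTERAAGCHAPSTICKVV"
toy_words = ["kilter", "chapstick", "post", "tick"]

hits = find_words(toy_sequence, toy_words)
longest = max(hits, key=lambda hit: len(hit[1]))[1]

print(hits)     # [(1, 'kilter'), (10, 'chapstick'), (15, 'tick')]
print(longest)  # chapstick
```

A plain substring scan like this gets slow with a full dictionary and tens of millions of characters; a real run would probably want something smarter, such as a trie or an Aho-Corasick matcher.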

Amino Acids by oregon · 2002-02-07 07:35 · Score: 4, Informative

The 20 letters are

a Alanine
r Arginine
n Asparagine
d Aspartic acid
c Cysteine
q Glutamine
e Glutamic acid
g Glycine
h Histidine
i Isoleucine
l Leucine
k Lysine
m Methionine
f Phenylalanine
p Proline
s Serine
t Threonine
w Tryptophan
y Tyrosine
v Valine
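A quick sketch of what that alphabet implies for spelling English words: only words made entirely of these 20 letters can ever turn up. The names `AMINO_ACID_LETTERS` and `is_spellable` are made up for this example.

```python
import string

# The 20 one-letter amino acid codes listed above.
AMINO_ACID_LETTERS = set("arndcqeghilkmfpstwyv")


def is_spellable(word):
    """True if every character of word is one of the 20 amino acid letters."""
    return all(ch in AMINO_ACID_LETTERS for ch in word.lower())


# Letters of the English alphabet with no amino acid assigned in this list.
missing = sorted(set(string.ascii_lowercase) - AMINO_ACID_LETTERS)

print(missing)                    # ['b', 'j', 'o', 'u', 'x', 'z']
print(is_spellable("chapstick"))  # True
print(is_spellable("kilter"))     # True
print(is_spellable("post"))       # False, no 'o' in the alphabet
```

So any word containing b, j, o, u, x, or z is out before the search even starts.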

--

---
Oregon
Let's copyright them. by clarkie.mg · 2002-02-07 07:44 · Score: 4, Funny

Like the Aussies who copyrighted ringing tones, someone should copyright those sequences.

Patents on DNA material already exist, so let's go one step further with proteins.

--
Men are born ignorant, not stupid; they are made stupid by education. Bertrand Russell
Right there by heikkile · 2002-02-07 09:29 · Score: 5, Funny

near the beginning of chromosome 1, in plain view for anyone to read: Frst Post

--
In Murphy We Turst
Do it yourself by meiocyte · 2002-02-07 09:41 · Score: 4, Informative

Here's a link to check whatever protein sequence you want against the human genome. Make sure to select "blastp" (for protein sequences) in the pulldown menu. Use the alphabet provided above. It will find near matches too. Enjoy.

--
The thing in the box has no place in the language-game at all; not even as a something; for the box might even be empty.