Slashdot Mirror


Dumb Things With Bioinformatics

PrvtBurrito writes: "About 3% of the human genome is "coded" as genes. The proteins those genes encode can be represented as long sequences of amino acids, a twenty-letter alphabet. In an attempt to perhaps prove that nothing is sacred, someone has cataloged all of the English words found in known annotated protein sequences from many organisms. It looks like after cataloging over 37,000,000 characters, the longest word is chapstick and the most common word is kilter."
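
A minimal sketch of how such a catalog might be built, assuming a plain list of protein sequences and a small word list. The names `find_words`, `sequences`, and `words` are hypothetical, and the toy sequences are made up for illustration; the method actually used for the catalog is not described in the summary.

```python
from collections import Counter

def find_words(sequences, words, min_len=4):
    """Count how often each word appears as a substring of any sequence."""
    counts = Counter()
    for seq in sequences:
        seq = seq.lower()
        for word in words:
            if len(word) < min_len:
                continue
            # Count overlapping occurrences by restarting one past each hit.
            start = seq.find(word)
            while start != -1:
                counts[word] += 1
                start = seq.find(word, start + 1)
    return counts

# Toy data for illustration only, not real annotated proteins.
sequences = ["MKILTERAAKILTERG", "GGCHAPSTICKLL"]
words = ["kilter", "chapstick", "elvis"]
counts = find_words(sequences, words)

longest = max(counts, key=len)           # longest word that was found
most_common = counts.most_common(1)[0]   # (word, count) pair
```

A naive substring scan like this is quadratic in practice for a large dictionary; for tens of millions of characters, an approach such as a trie or Aho-Corasick automaton would likely be a better fit.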

30 comments

  1. Option by Anonymous Coward · · Score: 1, Funny


    No "CowboyNeal"?

    (Yes I know there's no amino acid with the abbreviation 'B')

  2. Amino Acids by oregon · · Score: 4, Informative

    The 20 letters are

    a Alanine
    r Arginine
    n Asparagine
    d Aspartic acid
    c Cysteine
    q Glutamine
    e Glutamic acid
    g Glycine
    h Histidine
    i Isoleucine
    l Leucine
    k Lysine
    m Methionine
    f Phenylalanine
    p Proline
    s Serine
    t Threonine
    w Tryptophan
    y Tyrosine
    v Valine
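
A small sketch, assuming only the twenty one-letter codes listed above, of how one might check whether a word can be written in amino acids at all. The function name `can_spell` is hypothetical.

```python
# One-letter codes from the list above.
AMINO_LETTERS = set("arndcqeghilkmfpstwyv")

def can_spell(word):
    """Return True if every letter of `word` is a one-letter amino acid code."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    return bool(letters) and all(ch in AMINO_LETTERS for ch in letters)

spellable = [w for w in ["chapstick", "kilter", "CowboyNeal"] if can_spell(w)]
```

Being spellable only means a word could occur; whether it actually does depends on the sequences searched.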

    --

    ---
    Oregon
    1. Re:Amino Acids by Atrahasis · · Score: 1

      All this tells us is that the words created are entirely arbitrary - for example d for aspartic acid.
      Maybe that's the point.
      Excuse me while I cut and paste something to see if I can get modded up to 3,Informative.

    2. Re:Amino Acids by Anonymous Coward · · Score: 0

      Diddums! Are you depressed by the endless list of (Score:1)?

      --
      SlashDot is great.

    3. Re:Amino Acids by Atrahasis · · Score: 1

      I couldn't care less what I'm modded at - I know that the /. mod system is flawed. My post wasn't an attack on the parent post, more a confirmation that searching for words in the genome is completely pointless, and gracing it with a place as science news was, well, stupid.

    4. Re:Amino Acids by Anonymous Coward · · Score: 0

      well, you're still a moron.

    5. Re:Amino Acids by Atrahasis · · Score: 1

      And you're an Anonymous Coward.
      Do you actually know what a moron is?
      I await your cutnpaste reply.

    6. Re:Amino Acids by meridoc · · Score: 1

      Actually, the abbreviations do follow some sort of logic. "A" is for alanine, "C" is for cysteine, etc. Arginine couldn't use "A", so it has "R".

      When you get to asparagine (which was first found in asparagus), A, S, P, R, G, and I were already taken, so it got "N".

      Aspartic acid and glutamic acid have the same structures as asparagine and glutamine (except for the carboxylic acid groups on the ends); they got stuck with "D" and "E", respectively.

      By the way, if you're counting, the six letters that aren't used are B, J, O, U, X, and Z.
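
The count can be checked with a set difference. This is a quick sketch that assumes only the twenty standard one-letter codes from the list earlier in the thread.

```python
import string

# The twenty standard one-letter amino acid codes.
USED = set("ARNDCQEGHILKMFPSTWYV")

# Letters of the alphabet that no standard amino acid uses.
unused = sorted(set(string.ascii_uppercase) - USED)
print(unused)  # ['B', 'J', 'O', 'U', 'X', 'Z']
```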

      --
      "Two things are infinite: the universe and human stupidity, and I'm not sure about the former." -- Albert Einstein
    7. Re:Amino Acids by Atrahasis · · Score: 1

      But the words are arbitrary.

  3. Let's copyright them. by clarkie.mg · · Score: 4, Funny

    Like the Aussies who copyrighted ringing tones, someone should copyright those sequences.

    Patents on DNA material already exist, so let's go one step further with proteins.

    --
    Men are born ignorant, not stupid; they are made stupid by education. Bertrand Russell
  4. Re:Funny that... by oregon · · Score: 1

    He's looking at proteins - made up of 20 different amino acids, not DNA - made of 4 different bases (ACTG).

    --

    ---
    Oregon
  5. If I find... by Lars+T. · · Score: 2, Redundant

    my name written in one sequence, can I patent the gene it's in? It has, after all, my name on it ;-)

    --

    Lars T.

    To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

    1. Re:If I find... by breon.halling · · Score: 1

      Sorry, Lars, but there's no "S" in the sequence. ;)

      --
      "Yeah, well, Dracula called and he's coming over tonight for you and I said okay."
    2. Re:If I find... by Anonymous Coward · · Score: 0

      Of course there is 'S'.
      The longest word is chapStick

  6. Re:Funny that... by Anonymous Coward · · Score: 0

    I know. It was a wry joke that apparently is overrated at zero. :-/

  7. When will it by SLot · · Score: 1

    be the basis of a new encryption system?

    SecureGene 1.0! Now available in fine stores everywhere!

    Maybe it should be the basis of a new audio codec - it's pretty small in the first place. :)

    I want my MPGene!

  8. War and peace by "Zow" · · Score: 2

    This gives a whole new meaning to the saying that an infinite number of monkeys will eventually produce War and Peace. No need for the typewriters anymore.

    -"Zow"

  9. Kilter!!! by Unknown+Poltroon · · Score: 1

    'Cause if it's not Scottish, it's crrrrap!!!

    --
    All Troll + "offtopic" mods are meta moderated as "Unfair", because you abused the system.
    1. Re:Kilter!!! by Atrahasis · · Score: 1

      While funny, kilter is not a Scottish word.

  10. Right there by heikkile · · Score: 5, Funny

    near the beginning of chromosome 1, in plain view for anyone to read: Frst Post

    --

    In Murphy We Turst

  11. Do it yourself by meiocyte · · Score: 4, Informative

    Here's a link to check whatever protein sequence you want against the human genome. Make sure to select "blastp" (for protein sequences) in the pulldown menu. Use the alphabet provided above; it will find near matches too. Enjoy.

    --
    The thing in the box has no place in the language-game at all; not even as a something; for the box might even be empty.
  12. Sports by arunkv · · Score: 3, Funny

    The only sport I found in the word list was "CRICKET". Looks like God intended His subjects to play only cricket!

    1. Re:Sports by jokrswild · · Score: 1

      Cricket WAS named after the Cricket Wars, you know. Douglas Adams must have known something.....

  13. Mind the P's and Q's by meridoc · · Score: 3, Interesting

    I took biochem in undergrad. When trying to remember the amino acids, we'd spell our names. I was one of two people who could write their *entire* name in amino acids.

    I also took to writing sentences. "Chemistry and art. Well, that's an interesting idea. Is it new?" became a printmaking project. It probably doesn't exist though... too many mixed hydrophilic and hydrophobic residues.

    --
    "Two things are infinite: the universe and human stupidity, and I'm not sure about the former." -- Albert Einstein
    1. Re:Mind the P's and Q's by meridoc · · Score: 1

      By the way, because E, I, L, S, and V are all hydrophobic residues, they're often found near each other. It isn't uncommon to find "ELVIS LIVES" in your protein sequence :)

      --
      "Two things are infinite: the universe and human stupidity, and I'm not sure about the former." -- Albert Einstein
  14. Surprise by leastsquares · · Score: 1

    I'm shocked that there were only 20 Alleles found ;)

  15. Re:Funny that... by Anonymous Coward · · Score: 0

    You ought to join the "ban the overrated and underrated moderation categories society".

    Moderators ought to take responsibility for their opinions.

  16. And the $64,000 question is..... by -douggy · · Score: 3, Funny

    Where does ALL YOU BASE occur?