Slashdot Mirror


Genome Methods Applied to Reverse-Engineering

L1TH10N writes "Wired news has an article on a truely innovative approach to network protocol reverse-engineering. Marshall Beddoe, a security analyst, is using algorithms from bioinformatics to analyse closed-source and secret network protocols, an approach he calls "Protocol Informatics". According to Beddoe, network conversations are full of "junk" -- usually the actual data being sent -- which interferes with the analysis of the occasional command sequence that controls what to do with that junk. This parallels bioinformatics, which has to deal with the similar problem of finding known DNA sequences separated by long gaps of unknown data. Biologists have devised complex algorithms to discover whether DNA sequences are descended from the same ancestors by comparing the genetic differences with the known mutation rates of certain DNA components. Beddoe applied the same principles to the mutating network conversations of evolving network protocols."
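
To make the alignment idea in the summary concrete, here is a minimal sketch of Needleman-Wunsch global alignment, a classic bioinformatics algorithm, applied to two captured messages. The summary does not say which algorithm Beddoe uses, so treat the choice of algorithm, the scoring values (`match`, `mismatch`, `gap`), and the sample messages as assumptions made for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b.

    Returns (aligned_a, aligned_b, score); None marks a gap.
    The scoring values are illustrative assumptions, not tuned parameters.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Walk back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    out_a.reverse()
    out_b.reverse()
    return out_a, out_b, score[n][m]


def render(aligned):
    """Show an aligned string sequence with '-' for gaps."""
    return "".join("-" if c is None else c for c in aligned)


# Two hypothetical captured messages that share a command prefix
al_a, al_b, total = needleman_wunsch("GET /a", "GET /abc")
print(render(al_a))
print(render(al_b))
print(total)
```

Columns where both messages agree hint at fixed command fields, while gap-heavy regions are candidates for the variable "junk" payload.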

94 comments

  1. After today's Nobel prize in physics... by mirko · · Score: 4, Funny

    I guess we are on our way to finding global laws for everything :)

    --
    Trolling using another account since 2005.
    1. Re:After today's Nobel prize in physics... by robslimo · · Score: 2, Interesting

      I'm not sure I see anything to do with 'laws' in this. It does look like a novel approach and I applaud the kind of lateral thinking that caused someone to apply to this task an algorithmic method that was developed for something in such a (seemingly) different field.

      I firmly believe that bioinformatics is going to be the next IT. Programmers will use compilers that create genetic sequences for bio-machines and bio-computers (the debugging process is the main scary part). The odd contrast to present IT is that the underlying 'hardware' isn't something we will have invented, but something we are just learning to use.

    2. Re:After today's Nobel prize in physics... by Tyndmyr · · Score: 1
      Actually, there are probably a lot more global laws out there than we currently know...

      I'm sure you have all read about swarm-style AI taken from nature. Google boids if you haven't. It's a relatively good model. It isn't too hard to imagine emergent behaviour used in future technological issues... Certainly copying nature is a valid method of problem solving. No point trying to reinvent what already works.

      --
      Support more choices in government-Vote 3rd party.
    3. Re:After today's Nobel prize in physics... by yerfatma · · Score: 1

      the debugging process is the main scary part

      You mean when you forget to iterate in a loop and wake up outside a cocoon asking Geena Davis to shoot you?

    4. Re:After today's Nobel prize in physics... by Sgt+York · · Score: 1
      It's not so much a law as it is a similar system. Basically, whoever "wrote" the genetic code ain't talking, so it's closed source. We have a great way to ID information sent through an unknown or unfamiliar process, and it's providing answers. It makes sense that you could use the same process in both realms.

      I agree about bioinformatics being the next big thing, but it's really just another kind of information technology. Same idea, different system. Damn, I gotta learn how to code.

      --

      There is a reason for everything. Sometimes that reason just sucks.

    5. Re:After today's Nobel prize in physics... by mefus · · Score: 1

      It does look like a novel approach

      Novel? This stuff is old hat.

      --
      mefus
      In Open Society, GPL Software frees YOU!
    6. Re:After today's Nobel prize in physics... by robslimo · · Score: 1

      Yes, novel. Novel that the 'old hat' bio-info stuff was applied to analysing data on digital networks.

    7. Re:After today's Nobel prize in physics... by magefile · · Score: 1

      I agree about bioinformatics being the next big thing, but it's really just another kind of information technology ... Damn, I gotta learn how to code.

      You can code. It's just that, since you're a slashdotter, you don't have a compiler.

    8. Re:After today's Nobel prize in physics... by mefus · · Score: 1

      You have strange notions of novel. That's like tacking "... on the Internet" onto the end of something that begins with "A method for..." and calling it novel.

      --
      mefus
      In Open Society, GPL Software frees YOU!
  2. Now it would be truly interesting... by Tuxedo+Jack · · Score: 5, Interesting

    If we could find a way to apply said algorithms to spam at the gateway level.

    If that could be implemented somehow (an attached appliance or something), it could drastically cut the amount of spam that goes through.

    --

    Striking fear in the authors of godawful fanfiction, I am here, appearing in darkness, Tuxedo Jack!
    1. Re:Now it would be truly interesting... by Windbeutel · · Score: 1

      ... if only we could apply it to AOL users at gateway level...

    2. Re:Now it would be truly interesting... by Anonymous Coward · · Score: 0

      Wouldn't that just teach us how SMTP works, since this algorithm *ignores* the data (e.g. spam content)?

    3. Re:Now it would be truly interesting... by cdc179 · · Score: 1

      I've eliminated over 98% of spam from reaching my box by turning on greylisting for postfix (postgrey). This kicks ass. Look into greylisting concepts if you aren't familiar with them:

      http://isg.ee.ethz.ch/tools/postgrey/
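
For readers who have not met greylisting, the core idea can be sketched in a few lines: the first delivery attempt from an unknown (client IP, sender, recipient) triplet is temporarily refused, and a retry after a waiting period is accepted, on the theory that legitimate mail servers retry while much spam software does not. This is a toy model, not postgrey's actual code; the class name, the `clock` parameter, and the 300-second delay are assumptions chosen for illustration.

```python
import time


class Greylist:
    """Toy greylisting policy keyed on (client_ip, sender, recipient)."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay = delay_seconds   # assumed waiting period before acceptance
        self.clock = clock           # injectable so the policy can be tested
        self.first_seen = {}         # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return 'accept' or 'defer' for one delivery attempt."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "accept"
        # A real mail server would answer with a temporary (4xx) SMTP error here
        return "defer"


# Simulated clock so the example runs instantly
fake_now = [0]
policy = Greylist(delay_seconds=300, clock=lambda: fake_now[0])
triplet = ("192.0.2.1", "alice@example.com", "bob@example.org")
print(policy.check(*triplet))   # first attempt
fake_now[0] = 301
print(policy.check(*triplet))   # retry after the delay
```

A production setup would also expire old entries and whitelist known-good senders, which this sketch leaves out.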

    4. Re:Now it would be truly interesting... by magefile · · Score: 1
      if (email.address.endsWith("@aol.com")) {
        email.send("/dev/null");
      } else {
        email.deliver();
      }
      Done.
  3. shouldn't it be... by Anonymous Coward · · Score: 2, Funny

    reverse-engineering methods applied to genome

    1. Re:shouldn't it be... by magefile · · Score: 1

      In Soviet Russia, perhaps.

  4. Will It Read .doc Files? by tilleyrw · · Score: 4, Funny

    Perhaps these techniques can be applied to the never-ending task of creating an accurate converter for MS Word .doc-uments?

    Yes, simple document conversion is possible but until 100% accuracy is possible the race is not won.

    --
    This post encoded with ROT26. If you can read it, you've violated the DMCA. Handcuffs please, sergeant.
    1. Re:Will It Read .doc Files? by Windbeutel · · Score: 1

      err.. what am I supposed to do with a file containing an end of file mark, thinking that's all that will be left once all the meaningless junk is filtered out?

    2. Re:Will It Read .doc Files? by kanweg · · Score: 2, Interesting

      Well, the only good news is that Microsoft isn't able to reach 100% accuracy themselves, whether it involves exchange of Word documents between PCs, or between Macs and PCs.

      Bert
      Who started his own company and now understands first hand what his former secretary had to endure when battling with that productivity killer. We need competition to get rid of it. Any measure against Microsoft should involve opening the standard.

    3. Re:Will It Read .doc Files? by pjt33 · · Score: 1

      I tend to use the person who sent me the file as a converter, but for reading files in cases where that's not possible I use fastpdf.com, which I think scripts Word to do the conversion.

  5. Modeling by KingKire64 · · Score: 3, Insightful

    The Human Brain... the most complex and amazing computer ever built. The more we learn about it and how it works the more we can apply to computers. Imagine the computational power of the mind put to something specific.

    I don't know what I'm talking about... but it's cool anyway.

    --
    "All I can tell the "lesser of two evils" folks is that if they keep voting for evil, they'll keep getting evil."-Lp.org
    1. Re:Modeling by Anonymous Coward · · Score: 0

      You are talking nonsense. Most of that computing power is needed to solve the problem of managing the body from the most arcane set of input.

    2. Re:Modeling by Anonymous Coward · · Score: 0

      It's true. The Human Brain is capable of some truly amazing things. Such as this conversation I've had several times....

      <Them> What's this?
      <Me> ... uh, a 50 cent piece?
      <Them> Oh.
      <Them> How much is it worth?

    3. Re:Modeling by Anonymous Coward · · Score: 0

      Depends on what level you're studying the brain at. It's quite possible that it's performing some kind of symbolic computation, just as it's possible for an electronic circuit to be running a Python program at a higher level of analysis. The structure of the mind isn't necessarily transparently related to the structure of the brain.

  6. So... by Anonymous Coward · · Score: 2, Funny

    Microsoft will finally be able to figure out what is happening in their own network protocols!

  7. Illegal in the US.. by kyhwana · · Score: 2, Funny

    Of course, this is illegal in the US. No reverse engineering allowed

    --
    My email addy? should be easy enough.
    1. Re:Illegal in the US.. by ZuperDee · · Score: 3, Informative

      Not quite true--it is still allowed for the purpose of ensuring compatibility, IIRC.

    2. Re:Illegal in the US.. by Anonymous Coward · · Score: 0
      Of course, this is illegal in the US. No reverse engineering allowed

      Uh, no. Idiot.

    3. Re:Illegal in the US.. by Anonymous Coward · · Score: 0

      Unless you own the rights to the software being reverse engineered or are the Government (e.g. DoD).

      RE is typically used in industry to reverse engineer legacy code for which the source code has been lost.

    4. Re:Illegal in the US.. by kyhwana · · Score: 1

      Not if the EULA says it's not, which they all say, btw. (See blizzard vs bnetd)

      --
      My email addy? should be easy enough.
  8. Computer forensic has other clues... by museumpeace · · Score: 4, Interesting
    A Sciencedaily.com article recaps a news release about U of Toronto researchers, David Lie and Ashvin Goel, who are at work [as in they do not have a finished tool or product to announce] on software that not only detects intrusions but backtracks to the sources and cleans up the damage. The article hints
    These naive hackers also leave clues. Although they use IP (Internet protocol) addresses to bounce from machine to machine, hackers pick up languages used on interfaces along the way, leaving a trail of breadcrumbs that trace back to the point of origin.
    that the native human language of the locale of each node in the chain used for an attack creeps into the evidence/clues. I wonder what they are talking about?
    --
    SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
    1. Re:Computer forensic has other clues... by Anonymous Coward · · Score: 0
      These naive hackers also leave clues. Although they use IP (Internet protocol) addresses to bounce from machine to machine, hackers pick up languages used on interfaces along the way, leaving a trail of breadcrumbs that trace back to the point of origin.
      That's bogus. It is too bad the originating article cannot be given a troll mod.
    2. Re:Computer forensic has other clues... by Not_Wiggins · · Score: 1

      that the native human language of the locale of each node in the chain used for an attack creeps into the evidence/clues. I wonder what they are talking about?

      You mean like when someone defaces a webpage with "Roight! USA eats chunder! AUSSIES RUL3!!1!one!1!" they can figure out that the perp is (obviously) Canadian?

      --
      Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
  9. Contrasts: Datastreams to DNA by w.p.richardson · · Score: 4, Insightful
    "Junk" in the datastream is useful (since we have made it, we use the control codes to reassemble).

    "Junk" in DNA (e.g., "latent" DNA) is probably not junk, we just don't know the function (yet). No scientist worth their salt would admit that (at least not in earshot of a grant proposal review committee!)

    --

    Curb CO2 emissions: Kill yourself today!

    1. Re:Contrasts: Datastreams to DNA by haluness · · Score: 2, Informative

      > Junk" in DNA (e.g., "latent" DNA) is probably not
      > junk

      Actually there's an article in this month's SciAm that talks exactly about this. Very interesting

      http://sciam.com/article.cfm?chanID=sa006&colID=1&articleID=00045BB6-5D49-1150-902F83414B7F4945

    2. Re:Contrasts: Datastreams to DNA by espressojim · · Score: 1

      It's amusing that right now I'm investigating intronic DNA, and looking for signals of selection. A few percent of the genome is conserved in non-gene regions between humans and mice (for example.) Why would the DNA be conserved (against a background mutation rate), unless it was important?

      I can't think of many scientists who think about "junk" DNA anymore...but if I ever get my research finished and published, then I'll add one more nail to the coffin.

    3. Re:Contrasts: Datastreams to DNA by pfafrich · · Score: 2, Informative
      "Junk" in DNA (e.g., "latent" DNA) is probably not junk, we just don't know the function (yet). No scientist worth their salt would admit that (at least not in earshot of a grant proposal review committee!)

      From what I've read there is a case that there is real junk in the DNA: various sequences which at some point in the past served a purpose but whose original function (like that of the human appendix) is no longer relevant. I've also read somewhere that some of the DNA is actually a sort of virus which eons ago colonised the DNA sequence.

      From Junk DNA

      There are many theories about the factors that shaped junk DNA and why it persists in the genome. Speculations are that:
      • These chromosomal regions are trash heaps of defunct genes, sometimes known as pseudogenes, which have been cast aside and fragmented during evolution. Evidence for a related hypothesis suggests that the junk represents the accumulated DNA of failed viruses.
      • Junk DNA acts as a protective buffer against genetic damage and harmful mutations. An overwhelming percentage of DNA is irrelevant to the metabolic and developmental processes, so it is unlikely any single, random insult to the nucleotide sequence will affect the organism.
      • Junk DNA provides a reservoir of sequences from which potentially advantageous new genes can emerge.
      • Junk DNA serves the role as "meta-DNA", being involved in the development of an organism from embryo to adult. Recent results indicate that so-called ultraconserved elements of junk DNA are common to all vertebrates, and this could mean that this part of the genome is essential to our survival.
      It may be that a combination of these are true, or partly true.

      The first of these seems to indicate a possibility of real junk.

      --
      There are four sorts of people in the world: fools, lunatics, idiots and morons. - Umberto Eco, Foucault's Pendulum.
    4. Re:Contrasts: Datastreams to DNA by mefus · · Score: 1

      Actually there's an article in this month's SciAm that talks exactly about this.

      Exactly? The article you've linked to (what I can see of it; I'm not a subscriber) appears to be about RNA's role in the regulation of genes.

      There's nothing about "Junk DNA", although I know introns play a role in the regulation of a gene's translation. Nobody calls the DNA in those regions "junk" DNA, though.

      Having not been able to read the full article, however, I may have missed some important link into the "junk" DNA to which you refer.

      --
      mefus
      In Open Society, GPL Software frees YOU!
    5. Re:Contrasts: Datastreams to DNA by thogard · · Score: 1

      Junk DNA is just part of the data segment.

      If you're disassembling the code of a program, the data is just junk that gets in the way until you figure out what the code is doing. Of course the ASCII comments in data may be useful and from what I can tell, DNA doesn't seem to have any text strings in it, so for now it's just junk.

      I haven't looked into the pattern matching stuff the bio guys are using, but it's very handy to be able to take a bit of a program and find out where the common library functions are hiding. Since they have references that get fixed up by the linker, though, every version will be different. I still haven't found a good simple algorithm that will go through a binary and try to find a match to an existing function. The ideal situation would be to have a small table of function fingerprints and then have another small bit of code be able to search a binary for those fingerprints and be able to say "printf is at 0x4564 and vsprintf is at 0x498c" and have it work with any CPU.
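
One simple way to approach the fingerprint table described above is to store each function as a byte pattern in which the linker-patched bytes are wildcards. The sketch below assumes that representation (`None` means "any byte"); the function names and the sample fingerprint are made up for illustration, and this naive scan does nothing to solve the "any CPU" part of the wish list.

```python
def find_fingerprint(binary, pattern):
    """Return every offset where pattern matches binary.

    pattern is a list of byte values (0-255); None matches any byte, which
    is how addresses fixed up by the linker are skipped.
    """
    hits = []
    for start in range(len(binary) - len(pattern) + 1):
        if all(p is None or binary[start + k] == p
               for k, p in enumerate(pattern)):
            hits.append(start)
    return hits


def scan(binary, fingerprints):
    """Map each fingerprint name to the list of offsets where it matches."""
    return {name: find_fingerprint(binary, pattern)
            for name, pattern in fingerprints.items()}


# Hypothetical fingerprint: a call opcode, four relocated bytes, then a return
fingerprints = {"hypothetical_helper": [0xE8, None, None, None, None, 0xC3]}
image = bytes([0x90, 0xE8, 0x11, 0x22, 0x33, 0x44, 0xC3,
               0xE8, 0xAA, 0xBB, 0xCC, 0xDD, 0xC3])
for name, offsets in scan(image, fingerprints).items():
    print(name, [hex(offset) for offset in offsets])
```

Short patterns like this one will produce false positives on real binaries, so a practical table would need longer fingerprints or a second verification step.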

  10. Network Protocols vs. Building Blocks of Life by Sheepdot · · Score: 5, Insightful

    That'll come as a relief to Beddoe, who until now assumed that biologists wouldn't pay much heed to his project.

    "They're working on uncovering the mysteries of life itself; we're just hacking network protocols," he said. "Which sounds more important to you?"


    I don't think Beddoe should cheapen the reverse engineering aspects of networking compared to biology. We may still be years away from finding a cure to cancer, AIDS, etc. and there's a good chance that biology work in this area might not be as fruitful. After all, (without getting into a religious debate, here) man was not created by man, whereas network protocols are. Because of this, it is relatively easier for us to reverse-engineer something that was created by another human, because we know how they think. Evolution or creation, we don't know much about our own building blocks, because we don't know how either God thinks, or the universe fully works.

    While his software is great for "hacking network protocols", the biologists paying attention to his work might not find what they are looking for. The inputs very well may be just too vast for his ideas to provide any help.

    On the other hand, the Samba team and the SpamAssassin author will most likely enjoy this.

    1. Re:Network Protocols vs. Building Blocks of Life by DLWormwood · · Score: 0, Flamebait
      After all, (without getting into a religious debate, here) man was not created by man

      Funny, I was taught that every person now alive was created by man, or more exactly, was created by man and woman.

      Don't make me explain why... it's kind of gross, and outside the domain of most /.'ers anyway.

      --
      Those who complain about affect & effect on /. should be disemvoweled
    2. Re:Network Protocols vs. Building Blocks of Life by Anonymous Coward · · Score: 0

      I don't think Beddoe should cheapen the reverse engineering aspects of networking compared to biology. We may still be years away from finding a cure to cancer, AIDS, etc. and there's a good chance that biology work in this area might not be as fruitful.

      Are you retarded? These are general algorithms that are used every day in basic research in biology and bioinformatics. They're not specifically related to any particular search for a cure. They are already extremely useful, and will only get more so with each day.

      After all, (without getting into a religious debate, here) man was not created by man, whereas network protocols are. Because of this, it is relatively easier for us to reverse-engineer something that was created by another human, because we know how they think. Evolution or creation, we don't know much about our own building blocks, because we don't know how either God thinks, or the universe fully works.

      This is just a pointless way of looking at things. Cryptography was invented by humans too; that doesn't mean it's easier to figure out what a text encrypted with 2048 bit RSA says than what a piece of DNA does. In fact, we figure out the latter every day, whereas no one has yet figured out the former.

      When I read the article, I knew someone would bite on that last comment. Why can't you just accept that he is right? What biologists do is more important than hacking network protocols. Period.

      And, sure, it will be fun for the Samba team. But that's a little irrelevant to your point, don't you think?

    3. Re:Network Protocols vs. Building Blocks of Life by magefile · · Score: 1

      Flamebait? WTF? This should be modded Funny. I wish I had mod points ('course, I've posted already several times in this discussion, but ...).

  11. Not an apt analogy by galt2112 · · Score: 2, Insightful

    I think that network protocols are not similar to unmapped genome sequences in that network traffic is metadata and data.

    Genome sequences are much more consistent. It's all data, processed by RNA computers.

    1. Re:Not an apt analogy by the+morgawr · · Score: 3, Insightful

      DNA doesn't have meta-data? (i.e. Data about Data) You know this how?

      --
      The policy of the United States is worse than bad---it is insane. -- Ludwig von Mises, Economic Policy(1959)
    2. Re:Not an apt analogy by Anonymous Coward · · Score: 0

      Don't base your knowledge of DNA and its inner workings on the few news articles you've read and movies you've seen. Despite the complexity of the information it carries, DNA is an incredibly simple but elegant method of information storage and transfer. The vast majority of the DNA sequence actually doesn't encode genes, or data in your analogy. Most of the sequence is involved in regulation, development early in life, and quiescent information either through downregulation or evolutionary holdovers.

      And RNA computers? RNA has little to do with processing DNA with a few minor exceptions. RNA is a copy of DNA that can serve either as a messenger, an enzyme, or as recognition to translate a messenger RNA into a protein.

      It would be better to refer to proteins like RNA polymerase, which transcribes DNA to RNA, or a ribosome, which translates RNA into a protein, as computers.

    3. Re:Not an apt analogy by Daedala · · Score: 1

      DNA does have metadata: there are the rules about whether or not particular genes are expressed, and when.

      --
      What I say does not represent the views of my employers, my friends, my cats, or myself.
    4. Re:Not an apt analogy by seanellis · · Score: 1

      Start and stop codons?
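
Start and stop codons do behave a bit like framing bytes in a protocol: they mark where a protein-coding "payload" begins and ends. As a rough illustration, this sketch scans one DNA strand for ATG...stop spans in each of the three reading frames. It uses the standard start codon (ATG) and stop codons (TAA, TAG, TGA); the function name and sample sequence are made up, and it ignores the reverse strand, introns, and everything else real gene finders handle.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna):
    """Return (start, end) index pairs for ATG...stop spans on one strand.

    end is exclusive and includes the stop codon, so dna[start:end] is the
    whole open reading frame.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i                       # frame opens here
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))     # frame closes at the stop codon
                start = None
    return orfs


sample = "CCATGAAATAGGG"                        # made-up sequence
for begin, end in find_orfs(sample):
    print(begin, end, sample[begin:end])
```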

  12. Reinventing the wheel. by Anonymous Coward · · Score: 1, Funny

    I'd just grep the stream and be done with it.

  13. true+ly = ? by kamagurka · · Score: 2, Informative

    it's "truly", damn it! TRULY!

    1. Re:true+ly = ? by Anonymous Coward · · Score: 0

      Welcome to Slashdot where no one seems to "truely" understand the difference between:

      then - than
      choose - chose
      loose - lose
      to - too - two
      there - their
      the - teh
      fantasy - reality

  14. Gary Larson's prior art by Chukcha · · Score: 2, Funny

    Gary Larson has previously documented this phenomenon: http://home.earthlink.net/~grleone/funny/farside/ginger.gif

  15. tech-transfer... coming to IT near you by jnull · · Score: 4, Interesting

    I always enjoy such articles.... Technology transfer has been the cornerstone of innovation for how long? Companies study other industries in order to bring innovation to tired processes and technologies. It is responsible for many of today's disruptive technological achievements. Was it Southwest Airlines who did formal research on pit crews at Daytona (or something like that)? Regardless, keep up the good work... who knows, the next great step in reverse engineering might come from examining how Vegas tears down their casinos, or is that just what I'm thinking for Windows. "It is a miracle that curiosity survives formal education." --Albert Einstein --j

  16. Universal principles of information communication by medication · · Score: 4, Insightful
    quote:
    "The problem of decoding the language of networks and the problem of finding signals in DNA are really two related instances of machine learning problems. We're almost bound to discover universal principles of information communication by investigating both," - Terry Gaasterland
    This seems like a pretty obvious conclusion after reading the article but I'm curious why there aren't any references to pure informatics studies. Is there such a thing? After initial googling I'm only seeing bio-informatics results. Anyone have any insights as to what I should be looking for to find research/papers/studies on pure informatics or "universal principles of information communication"?
    --
    "If you're flammable and have legs, you are never blocking a fire exit." - Mitch Hedberg
  17. I was thinking along similar lines by Roadkills-R-Us · · Score: 3, Insightful

    Both at the gateway and the SMTP server, it seems like sifting through junk to find what matters, and determining common ancestry would be useful anti-spam measures.

    At least until the spammers figured out how to make spam look so much like certain types of legit email that we started losing good email...
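
A crude version of the "common ancestry" idea can be tried with nothing more than Python's standard `difflib`: messages whose text is mostly the same are grouped as probable variants of one template. This is only a sketch under stated assumptions; the 0.7 threshold, the function names, and the sample messages are invented for illustration, and real alignment-based methods would be considerably more robust to deliberate mutation.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two message bodies."""
    return SequenceMatcher(None, a, b).ratio()


def group_by_ancestry(messages, threshold=0.7):
    """Greedily group messages that resemble the first member of a group.

    threshold is an assumed cut-off, not a tuned value.
    """
    groups = []
    for msg in messages:
        for group in groups:
            if similarity(group[0], msg) >= threshold:
                group.append(msg)       # close enough: same presumed ancestor
                break
        else:
            groups.append([msg])        # nothing similar yet: start a new group
    return groups


inbox = [
    "Buy cheap pills now!!!",
    "Buy cheap pi11s now!!",
    "Lunch at noon on Friday?",
]
for group in group_by_ancestry(inbox):
    print(len(group), group)
```

As the comment above warns, the approach degrades once spam is written to look like legitimate mail, because similarity to known spam is the only signal used here.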

  18. Unification and Backtracking by Maljin+Jolt · · Score: 1

    Prolog configured with huge stacks does the job with very little code actually written. If you are sufficiently patient.

    --
    There you are, staring at me again.
    1. Re:Unification and Backtracking by Anonymous Coward · · Score: 0
      But this software has to do statistical analysis on data - there may be bad data, right? Prolog might be useful if the data was perfect, but it seems not so if there is bad data.

      Also, how long might it take to get a result?

  19. Looks like a nail by coreolyn · · Score: 2, Funny

    Didn't realize the human Genome could be used as a hammer...

  20. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  21. DMCA by Anonymous Coward · · Score: 0, Offtopic

    Sweet...time to make some money! I'm gonna sue Beddoe for violating the DMCA because he is re-engineering my genes without my authorization!

  22. Talk about Race conditions by EvilTwinSkippy · · Score: 1
    I've heard of race conditions in computer science, but this goes way too far.

    Seriously, how much would a Big Red Button have cost?

    --
    "Learning is not compulsory... neither is survival."
    --Dr.W.Edwards Deming
  23. DNA vs. DMCA by Doc+Ruby · · Score: 1, Flamebait

    You can't reverse engineer the genome: some of the genes are patented! Never mind the prior art in your mom's nuclei, they literally own your ass - you've just got a limited license to use it. When they release the retrovirus with the broadcast flag flipped on, finally every Slashdotter's dream of "baby licenses" will be possible.

    --

    --
    make install -not war

    1. Re:DNA vs. DMCA by Anonymous Coward · · Score: 0

      "Baby licenses" is a great idea since most of our problems are because of too many stupid people breeding.

    2. Re:DNA vs. DMCA by Doc+Ruby · · Score: 1

      I dunno if baby licenses are the answer: humans will just evolve to suit the license constraint on our environment. Since we're not good enough to model the rest of the environment in any license test, we'll no longer be fit enough to survive the real test.

      The real problem is a ratio: N:C. N = the number of people; C = the people's ability to communicate with one another. N:C is too large. And the units for C are unknown (and, reflexively, a factor in C). We're too busy driving N to collapse. All hope lies in maximizing C. To return towards the topic: the DMCA minimizes C.

      --

      --
      make install -not war

  24. Bioinformatics links by mattr · · Score: 4, Informative
    Yesterday wrapped up over a week of intense Bioinformatics seminars, poster sessions, exhibitions, and brainbusting studying at Bio Japan in Tokyo and related links. I just saw a presentation on the H-Invitational database which though in Japan also combines the content of foreign databases. It is extremely impressive, and they combine lots of online calculators and results visualizers that are really impressive.

    Also figuring out biology seems to be a lot harder than figuring out networking, at least there are all kinds of nefarious things but also serendipitous things found. Like one presentation I just heard had a U.S. scientist who announced that they had discovered an entire signalling network in human cells that was like the one found in yeast cells. And apparently more proteins can be encoded than the number of genes, because of alternate orderings (counting from different displacements in the gene, I think, ask a real bioinformatics expert). One talk I heard a year ago that stuck with me was a scientist who had devised a way to find signalling pathways in cells quickly; by forcing the cell to die if certain requirements were not met, he created a parallel computer that allowed him to discover a whole swath at once. There is also a lot of math and statistics, as well as a lot of biological knowledge behind it, it is not strange to see various statistical tests, references to different computer programs they used for analysis, or a mention of simulated annealing (well maybe that one not so often, came up yesterday though).

    One interesting thing is that they (the H-Invitational people / Japan Bioinformatics Consortium) have, I believe, twice held what they call annotation jamborees, much like a hackfest! In 2002 they had 120 scientists gather (mostly from Japan but also from all over the world) in a big room with a computer per person. They locked them in for 10 days and annotated, IIRC, over 20,000 genes, basically doing what I figure is some man-years of work in about a week, inputting data so it can be searched, analyzed, and cross-referenced.

    They do have a comparison between mouse and human genome there. I wonder if something similar could be done in open source in terms of annotating and indexing a library of open source code in different languages; really, all in one pseudo language would be more useful perhaps. Anyway, biologists are learning from computer scientists learning from mathematicians, and someone famous has said that in the future, all science will be computer science.

    Bioinformatics people are doing text mining and data mining, but also there are many flavors and types of analysis programs designed to penetrate and match up information as encoded by tiny molecules, folded proteins, genes, and so on. Here are some links to get started. Also note the perl for bioinformatics books, and there was a big oreilly bioinformatics conference archived from 2003 and other links too (see bio.oreilly.org link below).

    I cannot speak for everyone, but I can convey what I have heard, that there have long been communication gaps that have held back some of this, actually cultural differences. For example physicists like pure math and biologists deal in dirty, wet things.. when people successfully combine different perspectives in this area [more] discoveries start getting made. In Japan at least they are trying to figure out how to grow more bioinformaticists, since students tend to go only towards either biology or towards computer science (why study twice as hard). But there seems to be a lot of interesting stuff in there for both sides.

    PLoS Bio article
    some clusty
    faq

    1. Re:Bioinformatics links by Anonymous Coward · · Score: 5, Informative
      And apparently more proteins can be encoded than the number of genes, because of alternate orderings (counting from different displacements in the gene, I think, ask a real bioinformatics expert).
      Actually, the increase in number of genes compared to actual encoded genes as you move up the "eukaryotic evolutionary chain" is due to the organisms finding new and novel ways to combine the same proteins, not in different displacements of the same gene. See Nature paper on draft human genome analysis: Nature. 2001 Feb 15;409(6822):860-921 Also the draft Mouse genome analysis: Nature. 2002 Dec 5;420(6915):520-62
    2. Re:Bioinformatics links by jabbo · · Score: 1
      --
      Remember that what's inside of you doesn't matter because nobody can see it.
  25. Re:Universal principles of information communicati by pjt33 · · Score: 2, Informative

    "Information theory". If you get too many random pages with that, throw "Shannon" in as well.

  26. Re:Universal principles of information communicati by medication · · Score: 1

    Cheers - I appreciate the suggestion

    --
    "If you're flammable and have legs, you are never blocking a fire exit." - Mitch Hedberg
  27. Biologists are aware of this by jaxon6 · · Score: 4, Interesting

    I work right in the middle of all that is biology at MIT (Center for Cancer Research, Biology, BioInformatics, Chemistry, Biological Engineering, Brain and Cog, Mathematics, Physics, Computer Science, etc.), and the geeks in each department are aware of the advancements made in other departments and how they can help themselves. In fact, MIT created something called CSBi, the Computational and Systems Biology Initiative (csbi.mit.edu), which has professors and students from all the departments listed above, and more. They collaborate, share students and projects, and organize retreats and conferences. There's even a degree program in systems biology.

    The majority of study is computer research applied towards biological methods and models, but I'm sure some of the CS geeks will read this article and grab the work done by the bio geeks.

    And in the end, we will all have the best mouse trap ever.

    --
    Do you see the sig? Do you have it in your sights? Why yes, Miss Moneypenny...
  28. Re:Universal principles of information communicati by cougartoo · · Score: 2, Informative

    Shannon's seminal paper created the field of information theory; it's a surprisingly easy read for such an influential paper.

  29. Another good source by PetoskeyGuy · · Score: 1

    For those "evolving" protocols...
    http://www.ietf.org/rfc.html

  30. crossover of underlying math by bodrell · · Score: 1
    I think this is pretty damn cool, but not any more interesting than some of the other crossover techniques that have come out recently. One idea was to mimic the way ants find food and communicate to the colony where it is. Simulated ants with simulated pheromones were used to find a decent solution to the traveling salesman problem, where the salesman wants to visit each of a list of cities by the shortest possible route, without backtracking.

    There's a PDF here on the subject, or you could read the Google HTML version here.
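
    For anyone who wants to poke at the idea, here is a rough Python sketch of the ant/pheromone approach to the traveling salesman problem. It is not the method from the linked paper; the function names, the parameters (alpha, beta, evaporation) and their default values are all assumptions picked for illustration, and it assumes no two cities share the same coordinates.

```python
import math
import random


def tour_length(tour, dist):
    """Total length of a closed tour (it returns to the starting city)."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_colony_tsp(cities, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                   evaporation=0.5, seed=0):
    """Illustrative ant colony search; all parameter values are assumptions."""
    rng = random.Random(seed)
    n = len(cities)
    dist = [[math.dist(a, b) for b in cities] for a in cities]
    # Every edge starts with the same amount of simulated pheromone.
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, float("inf")

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)
            tour = [start]
            unvisited = set(range(n)) - {start}
            while unvisited:
                here = tour[-1]
                candidates = sorted(unvisited)
                # Prefer edges with more pheromone and shorter distance.
                weights = [
                    pheromone[here][c] ** alpha * (1.0 / dist[here][c]) ** beta
                    for c in candidates
                ]
                nxt = rng.choices(candidates, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            length = tour_length(tour, dist)
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length

        # Pheromone evaporates everywhere...
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - evaporation
        # ...and each ant lays more on the edges it used (shorter tour, more).
        for tour, length in tours:
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length

    return best_tour, best_len


if __name__ == "__main__":
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]
    print(ant_colony_tsp(square))
```

    Like the original idea, this only gives a decent route, not a guaranteed shortest one; on anything bigger than a toy example the result depends on the parameters and the random seed.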

    --
    If life beats me down, I'm going to bear it. If life beats me down, I'm going to wake it up.
  31. Ahem... by coulls · · Score: 1

    So... I did this with intrusion detection (masquerade detective actually) about a year and a half ago. Just FYI ...

    http://www.acsac.org/2003/beststud.html

    1. Re:Ahem... by coulls · · Score: 1

      Er... masquerade detection, not detective. Stupid tablet PC.

  32. Could someone explain.. by gnalle · · Score: 1
    So he can use a binary file to create a tree of related bits. But suppose that he has access to the compiler: how does he get from this tree to a description of which source code leads to which binary code?

    I guess he should write a script to create a huge number of very similar programs, and compile them all to create binary trees. Are there standard methods for analyzing such a data set? Is it just simple multivariate statistics?

  33. Re:Universal principles of information communicati by Anonymous Coward · · Score: 0

    Information theory and statistics/probability theory.

  34. Will It Read .doc Files?-Nvidia. by Anonymous Coward · · Score: 0

    "Perhaps these techniques can be applied to the never-ending task of creating an accurate converter for MS Word .doc-uments?"

    Or reverse-engineer the Nvidia driver.

  35. Protection from genetic damage by div_B · · Score: 2, Interesting

    Junk DNA acts as a protective buffer against genetic damage and harmful mutations. An overwhelming percentage of DNA is irrelevant to the metabolic and developmental processes, so it is unlikely any single, random insult to the nucleotide sequence will affect the organism.

    I read something about this in New Scientist a while ago. Blocks of a certain base (guanine?) on either side of important regions of DNA, which are more susceptible to damage (by free radicals?), serve to protect the important code by being damaged first. Anyway, I thought it was really cool because it's basically analogous to bolting blocks of more easily oxidizable metal onto the hull of a ship to prevent the hull from corroding. (What is this process called, anyone?)

  36. DNA and Meta Data by Anonymous Coward · · Score: 0

    You should read Gödel, Escher, Bach by Douglas Hofstadter. The ability of DNA to hold both data and metadata about itself is the single most amazing thing about it. It is a code that contains the instructions to build itself and the source of that which lets us (to some extent at least) understand ourselves: our brain.

    DNA demonstrates a programming system that defies Gödel's second incompleteness theorem: no consistent formal system powerful enough to express arithmetic can prove its own consistency.

    Definitely worth looking at from a programming perspective, I'd say.

  37. At which point by babybird · · Score: 1

    At which point we apply the spammers' techniques to genome research! :D

    Wouldn't that be ironic, that spam actually DID provide a cure for cancer or some other disease? And you wouldn't even have to read it or buy anything!

    --
    Keith D.
  38. sounds exciting by anuraggoyal · · Score: 1

    Sounds exciting - applying one science to another - I think this is the basic foundation on which science builds itself up, isn't it?

  39. bioinformatics is reverse engineering too but by Bob+Bitchen · · Score: 1

    is our genome protected under the DMCA or is that around the corner? Hope I didn't give them any ideas....

    --
    http://tinyurl.com/3t236