Slashdot Mirror


Toward a 3D Search Engine

Plasma Droid writes "NewScientistTech has a story about a 3D molecular search engine that is over 1,500 times faster than anything previously developed. The researchers, from Oxford University, developed a lightning-fast way to quickly match 3D shapes mathematically. This could not only speed up searches for new drugs, but lead to 3D search engines, for finding objects uploaded to platforms such as Google Earth, they say." The problem will be in jump-starting the supply of 3D data about molecules and everything else.

27 of 83 comments

  1. Enter Search Term: by LiquidCoooled · · Score: 5, Funny

    Boobies, extra large please.

    --
    liqbase :: faster than paper
  2. WOO HOO! by Lumpy · · Score: 2, Funny

    Finally I can search for Dodecahedron porn!

    --
    Do not look at laser with remaining good eye.
    1. Re:WOO HOO! by Kenja · · Score: 5, Funny

      Hot molecule on molecule action! See uncensored carbon bonding!

      --

      "Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
  3. Shape versus negative space by goombah99 · · Score: 5, Informative

    It's pretty easy to geometrically hash or construct reduced feature vectors for matching. People (like me) have been doing this for years. It's much harder to know if a molecule will fit into a crevice or negative space. The latter is probably more important to drug design. The reduced feature vectors let you know quickly if two molecules are similar in shape, which is the title given to the article. But then this is discussed in the context of drug targets, a harder problem. What may be new or clever here is that they found a very useful set of feature vectors.
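
    To make "reduced feature vector" concrete, here is a minimal sketch. It is not the method from the article; the choice of features (mean, spread, and maximum of atom-to-centroid distances) and the names shape_vector and similarity are assumptions made up for illustration. The point is only that such a vector does not change when the molecule is rotated or moved, so two shapes can be compared without aligning them first.

```python
import math

def centroid(points):
    """Average position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def shape_vector(points):
    """Reduce atom positions to a short, alignment-free feature vector.

    Illustrative choice (an assumption, not the article's descriptor):
    mean, standard deviation, and maximum of atom-to-centroid distances.
    """
    c = centroid(points)
    d = [math.dist(p, c) for p in points]
    mean = sum(d) / len(d)
    std = math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))
    return (mean, std, max(d))

def similarity(u, v):
    """Score in (0, 1]; 1.0 means the two feature vectors are identical."""
    diff = sum(abs(a - b) for a, b in zip(u, v)) / len(u)
    return 1.0 / (1.0 + diff)

# A toy "molecule" given as atom coordinates.
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 1.5)]
# The same shape rotated 90 degrees about z, then translated.
moved = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in mol]
# A genuinely different shape: stretched along x.
stretched = [(2.0 * x, y, z) for x, y, z in mol]

score_same = similarity(shape_vector(mol), shape_vector(moved))
score_diff = similarity(shape_vector(mol), shape_vector(stretched))
```

    The rotated copy scores essentially 1.0 and the stretched one scores lower. Note that this says nothing about whether either molecule fits a given crevice, which is the harder problem raised above.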

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:Shape versus negative space by Anonymous Coward · · Score: 4, Funny

      It's pretty easy to geometrically hash or construct reduced feature vectors for matching. People (like me) have been doing this for years

      I bet you have to beat the chicks away with a stick.

  4. Impact on Pharma (esp. patents) by Mateo_LeFou · · Score: 4, Interesting

    I've always been of two minds about whether the drug industry was a good example of patents being cost-effective, because I suspect that very good technology will soon emerge that makes pharma R&D less expensive, by making it primarily a data-processing (esp. simulation) issue. Seems like this tech might be the first piece of that puzzle?

    --
    My turnips listen for the soft cry of your love
    1. Re:Impact on Pharma (esp. patents) by ThosLives · · Score: 2, Insightful

      The problem isn't that it takes a while to find new stuff. The problem is the barriers to entry are so high that sufficient competition can't take place, hence there is no pressure to work quickly. Basically the medical industry is *not* a free market.

      Now, I don't think the barriers need to be removed, because most of the high barrier is to ensure that treatments are effective without nasty side effects. About the only part of the barrier I can see being removed is somehow changing the liability laws, but I don't know what would be acceptable.

      --
      "There are a dozen opinions on a matter until you know the truth. Then there is only one." - CS Lewis (paraphrase)
    2. Re:Impact on Pharma (esp. patents) by Red+Flayer · · Score: 2, Insightful

      The problem is the barriers to entry are so high that sufficient competition can't take place, hence there is no pressure to work quickly.

      Except the barriers to entry are mostly not regulatory in nature. As with most advanced R&D-based industries, the barriers are brainpower and equipment. There's plenty of capital out there to handle the hit-and-miss nature of drug design, and the regulatory restrictions on drug production and marketing are not barriers to entry for research.

      IMO, what is truly limiting the pharma industry is profit incentive. Big pharma researches the things that will make them the most money -- which, BTW, are not cures for diseases, but rather treatments for conditions.

      The 'competition' you speak of has nothing to do with R&D of new drugs. Barriers to entry prevent new entrants from producing and selling a commodity good, and new drugs are by no means commodities (patents have a lot to do with that). If you're talking about R&D as a commodity, that's a whole different discussion -- but again, it's brainpower and equipment that are the limiting factors causing the barriers to entry.

      As for there being no pressure to work quickly, that is not the case. There is definitely an incentive to work quickly, as there is competition from all the big companies -- look at the COX2 inhibitors that were all the rage as low-side-effect NSAIDs a couple of years ago until certain really bad interactions manifested. Merck, Schering-Plough, everybody was in the game when the new sub-class was discovered. It was literally a rush to market, which is why the adverse effects weren't recognized until post-phase 4 trials.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    3. Re:Impact on Pharma (esp. patents) by ponos · · Score: 2, Interesting

      IMO, what is truly limiting the pharma industry is profit incentive. Big pharma researches the things that will make them the most money -- which, BTW, are not cures for diseases, but rather treatments for conditions.

      This is not entirely accurate. From a business standpoint, if you sell a cure and your competitor sells a "treatment", you'll erase them from the map. So they would definitely like to "cure" things. However, most of the rich, western people do not suffer from diseases per se, but from "risk factors" like hypertension, diabetes, hypercholesterolemia etc etc. The treatments for these conditions are extremely effective but a cure is almost impossible (unless you manage to install a new pair of kidneys or a new pancreas etc).

      Except the barriers to entry are mostly not regulatory in nature. As with most advanced R&D-based industries, the barriers are brainpower and equipment. There's plenty of capital out there to handle the hit-and-miss nature of drug design, and the regulatory restrictions on drug production and marketing are not barriers to entry for research.

      FDA approval is a regulatory barrier and demands very lengthy, very expensive pre-clinical and clinical testing. You can't just stab someone with a syringe full of X just because the computer said it works. You need to go through all proper procedures, including testing in mice, primates, healthy volunteers, otherwise healthy patients (i.e. patients that don't have anything other than the disease you want to treat) and the general patient population. You also have to determine lethal doses, drug interactions with a billion other things (foods? additives? common drugs?), allergic reactions etc etc.

      My point is that the "hit and miss" process is not just a wasted stack of paper or some CPU cycles but a process involving real patients, possible deaths, legal battles. After that you'll need a host of research publications to persuade the medical community, marketing exposure etc. A "miss" is a very, very costly thing. Take Merck and Vioxx for example.

      P.

  5. I'll bring the Hot Grids by Mateo_LeFou · · Score: 2, Funny

    couldn't resist

    --
    My turnips listen for the soft cry of your love
  6. Good, but just one tiny bit of the problem by filthWisard · · Score: 5, Interesting

    This is a really cool advance when working with molecules you already know the shape of, but it still doesn't get around the problem of what shape a molecule is in the first place. A protein molecule will naturally collapse into the shape with the lowest energy. If there are 100 atoms in the main chain, that's 99 different angles between them, so 99 degrees of freedom. I hear that genetic algorithms are pretty good at finding the most likely shape though, so this may not be as big a problem as it used to be.
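
    A back-of-the-envelope sketch of why 99 degrees of freedom is a problem for brute force. The 99 angles come from the comment above; the three allowed values per angle and the rate of 1e12 energy evaluations per second are assumptions picked only to make the arithmetic concrete.

```python
def conformation_count(n_angles, states_per_angle):
    """Number of distinct shapes if every angle takes one of a few discrete values."""
    return states_per_angle ** n_angles

def exhaustive_search_years(n_angles, states_per_angle, evals_per_second):
    """Time to evaluate every conformation once, in years."""
    seconds = conformation_count(n_angles, states_per_angle) / evals_per_second
    return seconds / (365.25 * 24 * 3600)

# 99 angles as in the comment; 3 states per angle is an assumed coarse discretization.
count = conformation_count(99, 3)

# Assumed (generous) rate of one trillion energy evaluations per second.
years = exhaustive_search_years(99, 3, 1e12)
```

    Even with only three values per angle the count is a 48-digit number, and trying them all would take on the order of 10^27 years at the assumed rate. That is why heuristic searches such as genetic algorithms get used instead of enumeration.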

    1. Re:Good, but just one tiny bit of the problem by picob · · Score: 2, Interesting

      Usually the amino acid sequence is known, and you can find structures of similar amino acid sequences in databases using BLAST (a sequence search algorithm). If that doesn't turn up a sequence whose structure has been determined (preferably from a crystal, otherwise NMR), you can try to predict the protein structure: proteins have domains, small subsequences whose shape is known. Many domains with a particular shape are known. Once you have identified a few of these, it becomes much easier to determine the rest of the protein.

  7. Comment removed by account_deleted · · Score: 3, Insightful

    Comment removed based on user account deletion

  8. Re:Problem...? by LordPhantom · · Score: 2, Insightful

    No, that will be a problem. Once you have the database, what exactly am I supposed to input for searching? Will I need to learn how to create a 3D model in order to search for similar objects?

    The rest of your comments are pretty valid, however in this case that would seem to be beside the point. Searching objects in this fashion would be as simple as metadata that is appropriate for 3D model searches. Rather than provide a base model, you could search the metadata supplied with/for/generated for shapes, and once you have a few from the library, use THOSE as searches for -similar- or combined models. It's actually quite possible, if of questionable use - not to mention your criticism could be thrown back at you by simply saying "What!??! A search engine for sound? That will never work, I'd have to learn how to whistle".

  9. Speed versus Thoroughness by wsherman · · Score: 3, Insightful

    NewScientistTech has a story about a 3D molecular search engine that is over 1,500 times faster than anything previously developed.

    The implication both from the summary and from the article itself is that this new search is just as thorough as other search methods but much faster. To prove thoroughness they would have had to show that anything found by other search methods will also be found by their new, much faster, search method. I doubt very much that they were able to prove this rigorously.

    That's not to say that the problem of matching 3D molecular shapes is not important or that their research is not valuable. I would say, though, that it is misleading to claim that they have solved the 3D search problem with a much faster algorithm. There are many different measures of 3D similarity and, for many measures of similarity, the only way to guarantee an optimum match is by exhaustive search.

    Note that, in general, every search will be exhaustive in the sense that the query must be compared to every entry in the database. The problem is that many measures of similarity have additional parameters that must be optimized by exhaustive enumeration for each comparison. The classic example is a measure of 3D similarity that pairs each atom in the query with an atom from the structure in the database. In the general case, all possible pairings must be tried through an exhaustive enumeration.
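
    The classic example above can be sketched in a few lines. This is a toy under stated assumptions: both structures have the same number of atoms, they already sit in the same coordinate frame (no rotation or translation is optimized), and the cost is a plain sum of squared distances. The names pairing_cost and best_pairing are made up for the illustration.

```python
import math
from itertools import permutations

def pairing_cost(query, target, order):
    """Sum of squared distances when query atom i is paired with target atom order[i]."""
    return sum(math.dist(q, target[j]) ** 2 for q, j in zip(query, order))

def best_pairing(query, target):
    """Try every one-to-one pairing (n! of them) and return (cost, pairing)."""
    n = len(query)
    return min((pairing_cost(query, target, order), order)
               for order in permutations(range(n)))

query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
# The same atoms as the query, listed in a different order.
target = [(0.0, 2.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

cost, order = best_pairing(query, target)

# How many pairings a single comparison needs for a 20-atom molecule.
pairings_for_20_atoms = math.factorial(20)
```

    Three atoms need only 6 pairings, but the count grows as n factorial, so one comparison of two 20-atom structures already means about 2.4 * 10^18 candidates. Any method that is thousands of times faster has to be skipping most of that enumeration, which is where the question of thoroughness comes in.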

  10. they got it backwards by oohshiny · · Score: 3, Interesting

    Currently, the most common way to find the 3D shape of a particular molecule within a database is to superimpose a candidate over the query molecule and see how much of it overlaps. But this is time consuming, partly because it requires both molecules to be precisely aligned.

    Yes, that's currently "the most common way" because at least you can tell what you're getting: when you get a match, you can actually say how close the different shapes are to one another.

    The new technique uses a different approach. It analyses the position of the different atoms within a molecule to understand its shape. These relative positions can be mapped and stored in a molecular database.

    That's actually not a "new technique", it's an old technique. It's what people used to do before they tried to overlay 3D shapes accurately. They used to do that because computers used to be too slow to do the accurate comparison.

    As the article points out, there is only limited 3D shape information available at all. Few people need to do 3D queries right now, and there is little data to do them on, so optimizing speed is the wrong thing to do; we need to optimize accuracy and scientific relevance.

  11. Not enough structures? by ajax142 · · Score: 4, Insightful

    The author lists an apparent problem of this 3D search as a lack of molecular structures and calls for a "jump start" in the supply of 3D data. I call BS on this claim. A quick look at the Cambridge Structural Database shows 400,977 structures of 363,931 different molecules. There are another 89,064 structures of inorganic molecules in the Inorganic Crystal Structure Database. On the biological side there are 3,425 structures of nucleic acids in the NDB as well as 42,082 structures of proteins and polypeptides in the PDB. If that still isn't enough for the authors, fire up any number of ab initio quantum chemistry programs and in a short time you can create a library of good guesses for the structure of small molecules.

    I tend to think the authors of the article are referring to the problems of a "usable form" for the structures and easy access to many of these databases. The first problem is merely a matter of converting between the various structural file formats out there, something a good programmer (or grad student) can solve in a few weeks or less. The second is a bureaucratic issue and not a scientific one.

  12. Re:Problem...? by GMO · · Score: 2, Interesting

    Hmmm. Maybe it depends on whether you can convert from internal coordinates to a 3D structure. What you seem to be suggesting is moving through structure space, matching as you go along.

    So at any point, you have to generate images of the 'neighbours' of the current structure. It could work. Maybe.

  13. Quite interesting by excelsior_gr · · Score: 3, Interesting

    This is quite an interesting achievement. The tools that I am familiar with can only search for 2D structures like functional groups (alcohol groups, aromatic rings, etc). At their best, they might give the ability to search for R- and S- stereoisomers, but that is it. This is quite enough for tasks like solvent design that are frequent in the chemical process industry, but pharmaceutical R&D needs more powerful tools.

    I will give a simple example of an enzyme: These nice molecules catalyze reactions of vital importance in the modern pharmaceutical industry by providing a chemical "lock" where the "keys" (i.e. the reacting molecules) will dock. This enables them to react and form a new molecule that will then undock from the enzyme, leaving the "lock" free for the next pair.

    These "locks" are actually 3D structures of appropriately aligned molecules. This is where this search ability comes in: The chemist suspects what the appropriate lock would look like for catalyzing his reaction (3D alignment of functional groups), much like someone suspects what the right keywords for a Google search are. Then he feeds the data to the machine and gets the molecules that are likely to be of assistance in his work. After that, he can run experiments testing these enzymes to see if they actually work.

    This should speed things up very much in biochemical research. It means less literature research and fewer failed experiments.

  14. Ehm... it's how much faster? by lagfest · · Score: 2, Interesting

    So the summary says it's 1500 times faster. OK then, if I double the number of items in the database and compare again, is it still 1500 times faster? What if we do a million times the number of items?

  15. related problem by smellsofbikes · · Score: 2, Interesting

    It's nice to know what shape a molecule is. It would be even nicer to be able to make a molecule in a particular shape. If you map an enzyme's active site -- its topology, charge distribution over the surface, possibility for organometallic or hydrogen bonding -- you have a much better chance of finding some interesting analog to the enzyme's substrate that'll make the system do something new. Even better, you could take an existing molecule that you *want*, and form an enzyme surface so that two cheap molecules, exposed to your new enzyme surface, will find it thermodynamically favorable to become the molecule you want, and suddenly you're in a very profitable business: you can breed chemical engineering factories rather than having to build them.

    This poses a problem, similar to the (unstated) problem posed by the molecular printers in Neal Stephenson's Diamond Age: what happens when this sort of stuff starts to become widely available and people start engineering enzymes or instructing their printers to produce, say, heroin, or TNT? With molecular printers, presumably the first versions would only be able to produce structural stuff: printing bicycles, not martinis. But if we get to the point where we can design enzymes for a desired substrate -> product reaction, we have a real problem because it's all wet chemistry and there isn't an obvious hardware/firmware way to block people making anything their inventive, twisted little minds can come up with.

    Mind you, I think that's great. I miss the days where I could order almost any chemical I wanted without having to wade through masses of paperwork, tracking, and laws intended to ban any drug analog that might have pharma activity. But it is going to have some very exciting side-effects.

    --
    Nostalgia's not what it used to be.
  16. Possible application? by MercBoy · · Score: 2, Interesting

    This makes me wonder if this could evolve to more general purpose 3-D searches, such as facial recognition, searching for a specific shape of car, suspect identification in a crowd based upon a combination of body shape, face, etc.

  17. Re:so? by Bat+Country · · Score: 2, Funny

    Great, I can finally search for the chemical formula for C14L11S, which honestly has been puzzling me for some time. Apparently it affects the molecule P3N1S.

    --
    The land shall stone them with the bread of his son.
  18. Great... by The+Orange+Mage · · Score: 2, Funny

    Just what we need...another dimension to lose things in.

  19. really? by GMO · · Score: 2, Insightful

    Although the crystal structure is not the same as the structure in solution, it can't be that far off.

    Crystals are pretty watery, much like the cell. Unless packing contacts are altering the active site, they are unlikely to be much different.

    Also, the bulk of the structure is there to keep the active site residues in a particular orientation.

    Perhaps management vitriol was partially justified? :) Only joking, you may be right. I don't work on drug design, only backbone structure.

  20. existing 3D molecule search engine by dr_blurb · · Score: 2, Interesting

    Go to: http://shape.cs.princeton.edu/search.html/ and select "Protein Database" from the drop down list, and enter "random" as the keyword. Next, the "find similar shape" links do full 3D feature vector matching against a database of 16900 protein molecule models, in a fraction of a second. But apparently this new method is "1,500 times faster than anything previously developed"? Maybe the authors never checked the current 3D shape matching literature?

  21. Re:so? by iago-vL · · Score: 2, Funny

    Am I the only one who had to stop and think, "Ok 14 atoms of carbon combined with..... what the hell element is 'L'?"