Slashdot Mirror


Hitachi Develops New Visual Search

Tech.Luver writes to tell us that Hitachi has developed a new visual search engine that can supposedly find similar images from within millions of video and picture data entries in around 1 second. "The technology assesses the similarity of images based on image characteristics presented as high-dimensional numeric information. The information is acquired by automatically detecting information regarding the images, such as color distribution and shapes."

12 of 166 comments

  1. Hmmm.... robotics? by fyngyrz · · Score: 5, Interesting

    This is interesting to me - if it performs well - because this is one of the key missing elements for robotics; robots have a lot of trouble trying to match the environment around them to stored records of objects unless the environment is severely constrained. I'm not speaking of AI here (or at least, not yet) but just robots that would be able to clean your floor, carry your groceries, navigate in a burning building, walk your dog, tend your lawn. If they can classify images against stored images well, we're that much closer to generally useful and at least semi-autonomous robot devices.

    Training might be a little annoying the first few times, but once you had a good database, you could replicate - or share via RF, that'd be freaky... neighbor's robot learns what a ferret looks like, now yours knows too - so that newer models were more and more informed right out of the box. Crate. Coffin. Whatever.

    Add an associative database so that images normally found near other images which have just been found are searched first, and perhaps you could get the general search time down from the quoted 1 second, I'm thinking. One second is kind of pokey for a lot of robotic applications. But if the thing is in a kitchen, why would it need to be looking to recognize images that are found in a shipyard?
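    A minimal sketch of that associative idea, assuming objects are already reduced to text labels. The class name `ContextIndex` and its methods are hypothetical, invented for illustration, and have nothing to do with Hitachi's system. It only decides the order in which stored records get compared, so things usually seen near what was just recognized are tried first:

```python
from collections import defaultdict


class ContextIndex:
    """Hypothetical associative index: remembers which labels were seen together."""

    def __init__(self):
        # co_seen[a][b] = number of scenes in which a and b appeared together
        self.co_seen = defaultdict(lambda: defaultdict(int))

    def observe(self, labels):
        """Record that these labels appeared in the same scene."""
        for a in labels:
            for b in labels:
                if a != b:
                    self.co_seen[a][b] += 1

    def search_order(self, recent, all_labels):
        """Return all_labels, with those most often seen near `recent` first."""
        def score(label):
            return sum(self.co_seen[r][label] for r in recent)
        # sorted() is stable, so labels with equal scores keep their given order
        return sorted(all_labels, key=score, reverse=True)


index = ContextIndex()
index.observe(["stove", "kettle", "sink"])
index.observe(["stove", "sink"])
index.observe(["crane", "anchor"])

# The robot has just recognized a stove; kitchen items move to the front.
order = index.search_order(["stove"], ["anchor", "crane", "kettle", "sink"])
```

    The shipyard records are still in the list, just at the back, so a crane in the kitchen would eventually be found.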

    And I, for one, would welcome our semi-autonomous, environment recognizing, floor cleaning robot underlings.

    --
    I've fallen off your lawn, and I can't get up.
    1. Re:Hmmm.... robotics? by dotpavan · · Score: 2, Interesting

      what I am curious about is the computing resources required to process.. other than the algorithm, is this one of the reasons which is delaying the emergence of search in the field of images/music/video on a commercial level? riya made some strides, but is still "learning"

    2. Re:Hmmm.... robotics? by lawpoop · · Score: 4, Interesting

      I'm not speaking of AI here (or at least, not yet) but just robots that would be able to clean your floor, carry your groceries...

      Well, you are talking about AI here. It turns out that it's relatively easy to make a computer that can beat humans in chess or do complex math equations, but something as simple as walking with 6, 4, or two legs, which a lot of really stupid organisms do, is really difficult. Something like distinguishing 'indoors' from 'outdoors' or a cloud bank from the bushes, seems way in the future.

      My pet theory is that we don't have the right kind of device yet. A mind, the 'function' of an organic nervous system, is not a Turing machine. I don't really understand the math behind it, but Goedel's incompleteness theorem seems to show that a human mathematician can understand certain mathematical proofs that a Turing machine can never prove. Since all computers are essentially Turing machines, no matter how fast or parallelized they are, or how much memory they have, they will never be able to do what a human mind can do. So, maybe someday we will have artificial intelligence, or, a floor-washing robot, but we currently don't have the right kind of device that can do it.
      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    3. Re:Hmmm.... robotics? by fyngyrz · · Score: 2, Interesting
      Well, you are talking about AI here.

      No. I'm not. There are robotic vacuums and lawnmowers right now. I'm just talking about giving them some eyes so they know not to mow your puppy or your child or your roses, or vacuum up your engagement ring. Teaching a robot firefighter not to step into a hole in the floor, and to rescue people before pets, and pets but not stuffed animals.

      It turns out that it's relatively easy to make a computer that can beat humans in chess or do complex math equations, but something as simple as walking with 6, 4, or two legs, which a lot of really stupid organisms do, is really difficult.

      Not so difficult that it hasn't been solved multiple times, multiple ways, including such variations as stair-climbing and running. Nothing to do with AI, either; just a progression from over-complicated attempts to solve using complex equations over the whole assembly to simpler approaches like fuzzy-logic based feedback systems that work right at the joints.

      Something like distinguishing 'indoors' from 'outdoors' or a cloud bank from the bushes, seems way in the future.

      Not if there's a good image-matching mechanism, it isn't. The concept of "looks like" is a very powerful one. That's what Hitachi says they've done here; we'll see if it lives up to the report.

      My pet theory is that we don't have the right kind of device yet.

      Keep that theory warm. Reality has a way of bringing the cold, fast and harsh. My pet theory is that a serial computer architecture can emulate anything, anywhere, given the proper code, enough storage and enough time to jump through all the hoops; to which I add, once you get it working, you can optimize the code and the hardware to do the job better until it is in the realm of the practical, if the investment is worth it. And for AI, IMHO, any investment is worth it. That's been the history of every solved problem so far, and I see no evidence that any solvable problem will be any different. And intelligence is solvable; after all, nature solved it many ways.

      --
      I've fallen off your lawn, and I can't get up.
    4. Re:Hmmm.... robotics? by kebes · · Score: 3, Interesting

      Your implication is that the human mind cannot be reduced to a Turing machine. I am in the other camp--who believe that the mind is subject to rigorous physical law, and that physical law can be expressed arithmetically (in principle), and so the human mind is a Turing machine.

      Godel's theorem says that a consistent arithmetic system will contain unprovable truths. Put otherwise, such a system cannot be both consistent and complete. Thus the Godel counterargument to Strong AI (that human minds and computers are not fundamentally different) is that humans (e.g. mathematicians) can prove things like Godel's theorem, so we are able to "rise above" the arithmetic and exist in states of full proof and full consistency.

      But I think there is a flaw in that logic (note: I am not a mathematician). The theorem doesn't preclude that a given arithmetic system (e.g. human mind) will be able to prove a truth that a weaker system ignored. Thus our ability to see certain truths doesn't mean that there are not other truths that are unprovable to us.

      More fundamentally, no one has actually shown that the human mind is either consistent or complete (proving both would be required to show that we are not subject to Godel's theorem). The human mind is a computational device evolved to solve real-world problems, like escaping predators, rather than contrived ones, like mathematical proofs. It is thus in fact likely to be an inconsistent (internally contradictory) computational system. The human mind may be incomplete and inconsistent.

      I agree that "true AI" will require vastly more computer power, and much more sophisticated algorithms than we have today. But the emerging evidence, from what I've seen, is that "true AI" can be achieved, at least in principle, by a Turing machine.

    5. Re:Hmmm.... robotics? by fyngyrz · · Score: 2, Interesting

      The whole idea of things being impossible based on hierarchies of understanding and/or proof is specious in the extreme. It is a dead-end philosophical backwater. Problems can be, and often are, solved without full understanding. Nature does this all the time; evolutionary algorithms can do it too. So it is irrelevant as to if we can understand AI, or not. The only relevant question is whether we can arrive at it in any way possible, and that question will only remain open until, and if, someone gets it done.

      It is also useful to recognize that it is often true that there are multiple ways to solve the same problem. For instance, if you want to perform division, there's long division, which is clever and solves the problem relatively quickly, but you can also just subtract the divisor from the dividend and count how many times that can be done until the result goes negative. Both completely solve the problem and get you the same result. With regard to AI, it may be that we find a solution that is not the solution nature found for us, and it may be that it is trivially easy to understand. Or not. My point is that the legions of nay-sayers start with a lot of presumptions that have not been established as fact and go on to make these assertions on very shaky ground indeed.
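      The division example can be sketched in a few lines. Both functions below are illustrative only, handle just a non-negative dividend and a positive divisor, and use names made up for this sketch:

```python
def divide_by_subtraction(dividend, divisor):
    """Count how many times divisor can be subtracted before going negative."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("sketch handles dividend >= 0 and divisor > 0 only")
    count = 0
    while dividend - divisor >= 0:
        dividend -= divisor
        count += 1
    return count, dividend  # (quotient, remainder)


def long_division(dividend, divisor):
    """Schoolbook long division, bringing down one decimal digit at a time."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("sketch handles dividend >= 0 and divisor > 0 only")
    quotient = 0
    remainder = 0
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)  # bring down the next digit
        q_digit = 0
        while remainder >= divisor:  # at most 9 subtractions per digit
            remainder -= divisor
            q_digit += 1
        quotient = quotient * 10 + q_digit
    return quotient, remainder


slow = divide_by_subtraction(1234, 7)
fast = long_division(1234, 7)
```

      Same answer either way; the first loop runs once per unit of the quotient, while the second does a bounded amount of work per digit of the dividend.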

      I'm perfectly ready to say that we don't understand ourselves, and agree that we are intelligent. But that in no way leads to the presumption that we can't create intelligence some other way, or that we can't understand how it is done. Or that it might not be a more effective intelligence than that which we sport.

      If nature can solve the problem - and it obviously has - then there are ways to solve the problem. If nature can do it with locally accessible materials - and it obviously has - then it can be done with locally accessible materials. What is lacking at such a point is merely technology. I fully expect full-on AI to be developed, and I see no known correlation that implies we'll have a good understanding of ourselves at that point in time.

      I agree that "true AI" will require vastly more computer power, and much more sophisticated algorithms than we have today.

      I think I can show you that this isn't so. If you agree, as you seem to, that AI can be embodied in an algorithm running in a von Neumann architecture, then a slow computer should be able to solve the precise problems a fast computer can, it will simply hand you the result(s) later than the faster machine. Would you not agree that if the problem requires intelligence to solve, that the speed at which exactly the same, and entirely correct, answer is delivered is not a valid metric one could use to say intelligent or not? After all, one could (speaking generally) simply speed up the system (more memory, faster clock) and still get the same answer, perhaps now in the same amount of time; it's still not any smarter, just more convenient. And convenience in the sense of speed is a natural progression of technology.

      From here, we can observe that there is no limitation in today's technology that says we can't put X amount of memory on a custom machine made with readily available tech, both RAM and HD; additionally, any CPU can emulate any other CPU. So I say that there is no technological limit we face today that would stop AI from functioning. Might be slow; but we can provide the hardware resources without question. And if it can be done slowly, hand it to the hardware folks and they'll optimize the hardware when they see what it spends most of its time doing, and it'll get faster. And faster, and faster... :-)

      Conversely, I would expect that once the algorithmic issues are addressed, that we'll see intelligence - real intelligence - coming back to what many thought was incapable hardware. You might get your answer in eighty hours instead of a second, but if you get your answer... there you have it.

      --
      I've fallen off your lawn, and I can't get up.
    6. Re:Hmmm.... robotics? by mikael · · Score: 4, Interesting

      To implement a visual search engine you need to be able to perform the following:

      texture segmentation - splitting up a picture into segments of distinct objects. In a panoramic scene, you want to split the picture up into objects such as sky, ocean, waves, beach, boats, pier, wall, people, animals. As a psychological experiment, you can show someone a picture, point to a particular spot and ask them for the first word they associate with that spot. Then you will see how every scene becomes segmented by our own vision systems.

      Basic image segmentation is implemented using edge detection by Fourier Transforms (FFT, IFFT, DFT). This is a very computation-intensive stage that is typically implemented using DSPs, GPUs or even dedicated ASICs. Data used by the FFT can be in any dimension: 1D (audio/radar), 2D (images) or 3D (volume visualisation). But to match the resolution of a human eye, you would need a 100 Megapixel floating point framebuffer.
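      One way to read the Fourier step, as a toy: transform the signal, zero the low frequencies, transform back, and look for where the leftover is strongest. The sketch below does this in 1D with a naive O(n^2) DFT rather than a real FFT, and the function names and the 16-sample test signal are assumptions made for illustration. The DFT treats the signal as periodic, so the wrap-around from the last sample to the first also counts as an edge.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (an FFT computes the same thing faster)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Inverse transform; keeps the real part, since the input signal was real."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def high_pass(samples, cutoff):
    """Zero every frequency bin at or below `cutoff`, plus its mirror bin."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        if min(k, n - k) <= cutoff:
            spectrum[k] = 0
    return inverse_dft(spectrum)


# A flat dark region followed by a flat bright region: one step edge at 7/8.
signal = [0.0] * 8 + [1.0] * 8
response = [abs(v) for v in high_pass(signal, cutoff=1)]
strongest = max(range(len(response)), key=response.__getitem__)
```

      The response peaks at the samples next to the step (and next to the wrap-around) and is much weaker in the middle of each flat region. A 2D image needs the same idea along both axes, which is where the computational cost comes from.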

      texture classification - having identified the silhouette of an object, now attempt to match the contents to a particular object. Simple ways include colour histograms and silhouette matching. More advanced methods attempt to simulate the first few layers of the human retina using Gabor filters, Ring filters and Wedge filters.
      But just to model a single type of retinal cell requires one or more FFT operations for an entire image. And there are at least twelve different types of such cells. For efficiency, precalculated results of sample images are generated (these are referred to as feature vectors) and then compared against the results of any new image.
      For a really technical explanation of how human vision works have a look at The organisation of the retina and visual system
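      A rough sketch of the simplest method mentioned above, colour histograms used as precalculated feature vectors. It assumes an image is just a list of (r, g, b) tuples with channels in 0-255; the function names, the 2-bins-per-channel choice and the L1 distance are illustrative assumptions, not anything taken from QBIC or Hitachi.

```python
def colour_histogram(pixels, bins=2):
    """Turn (r, g, b) pixels into a normalised feature vector of bins**3 entries."""
    if not pixels:
        raise ValueError("need at least one pixel")
    step = 256 / bins
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Quantise each channel into a bin index, clamped to the last bin
        rb, gb, bb = (min(int(c / step), bins - 1) for c in (r, g, b))
        counts[(rb * bins + gb) * bins + bb] += 1
    return [c / len(pixels) for c in counts]


def histogram_distance(a, b):
    """L1 distance between histograms; 0 means identical colour distribution."""
    return sum(abs(x - y) for x, y in zip(a, b))


def most_similar(query_pixels, stored):
    """`stored` maps an image name to its precalculated feature vector."""
    q = colour_histogram(query_pixels)
    return min(stored, key=lambda name: histogram_distance(q, stored[name]))


# Tiny made-up "images": mostly blue with some white, and all green.
sky = [(40, 90, 220)] * 6 + [(250, 250, 250)] * 2
grass = [(30, 160, 40)] * 8
stored = {"sky": colour_histogram(sky), "grass": colour_histogram(grass)}

query = [(50, 100, 230)] * 7 + [(240, 240, 240)]
match = most_similar(query, stored)
```

      The stored vectors are computed once; only the query image is processed at search time, which is the efficiency point made above.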

      texture retrieval - the actual design of the search engine to retrieve images through content rather than just keyword:

      QBIC - Query By Image Content. IBM's image retrieval database system

      All of this has to be performed for a single image. An entire movie requires the processing of hundreds of thousands of images.

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
  2. Will Google copy or buy this technology? by bepolite · · Score: 3, Interesting

    I would think this would be a big and useful upgrade for http://images.google.com/

    --
    Always be polite.
  3. Document images by zymurgyboy · · Score: 2, Interesting
    Pictures are fun, but I wonder if it would be accurate enough to locate similar images of documents (and to what degree). It would be really cool (for me anyway) if it could look at, say, a million pdfs or tiffs that don't have embedded text and come back with everything similar/identical.

    I frequently have to create large collections of images from all sorts of file types -- some text-based, some graphics -- that get housed in a collection of images for easy, standardized review. If there were something that could avoid the step of extracting text from them, or later OCRing them and still end up with a searchable image collection, well, that would be exceedingly cool. It would cut the initial time outlay I have to devote to virtually any given project I have to deal with by 25 to 50%.

    --
    If you never make mistakes, it's probably because you're not doing anything.
  4. Re:Not as useful as it sounds... by Jrabbit05 · · Score: 2, Interesting

    But it could be used to create algorithms to find quality pictures, good photographs without viewing all of them.

  5. Re:Not as useful as it sounds... by FleaPlus · · Score: 2, Interesting

    For example: I want to find more cat images. I feed it a picture of a white cat. I am more likely to be returned results of white dogs than, say, tabby or black cats.

    It seems it would be straightforward to implement something analogous to Google Sets, where you could supply a few photographs of what you're interested in (say, several cat pictures of various colors, or several white-colored pets). It could then learn which of the features were relevant, and add weight to those in its search.
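    A minimal sketch of how that weighting could work, assuming each image is already reduced to a short feature vector. The two features here (whiteness, cat-like shape) and all the numbers are made up for illustration, and inverse-variance weighting is just one simple choice: features that stay constant across the examples are taken to be the ones that matter.

```python
def learn_weights(examples, floor=1e-6):
    """Weight each feature by the inverse of its variance across the examples."""
    n = len(examples)
    weights = []
    for d in range(len(examples[0])):
        values = [e[d] for e in examples]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        weights.append(1.0 / (variance + floor))  # floor avoids division by zero
    total = sum(weights)
    return [w / total for w in weights]


def weighted_distance(a, b, weights):
    """Squared distance in which each feature counts according to its weight."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def rank(examples, candidates):
    """Order candidate names by weighted distance to the centre of the examples."""
    weights = learn_weights(examples)
    dims = range(len(examples[0]))
    centre = [sum(e[d] for e in examples) / len(examples) for d in dims]
    return sorted(candidates,
                  key=lambda name: weighted_distance(centre, candidates[name], weights))


# Hypothetical features: [whiteness, cat-like shape]
cats = [[0.9, 0.80], [0.1, 0.82], [0.5, 0.78]]  # white, black and grey cats
candidates = {"white dog": [0.9, 0.2], "black cat": [0.05, 0.8]}
ranked = rank(cats, candidates)
```

    With only the white cat as a query and equal weights, the white dog is the closer match; with three cats of different colors, whiteness gets almost no weight and the black cat ranks first.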

  6. Saw it done 10 years ago by mattr · · Score: 2, Interesting

    Some time between 1992 and 1994 IIRC when I was working at the photo/press agency Pacific Press Service in Tokyo, I saw a demo of a system created IIRC by NEC which searched 90,000 photos in under one second, based on a color freehand drawing you would draw on the screen of the EWS unix workstation on which it ran. Basically if you drew a horizontal blue mass at the bottom of the screen you would get a lake, etc. In other words you could search by rough photographic composition. I am less impressed that after over 10 years Hitachi was able to do something along the same lines.