Recognizing Scenes Like the Brain Does

Posted by kdawson on Sunday February 11, 2007 @09:26AM from the software-meets-wetware dept.

Roland Piquepaille writes "Researchers at the MIT McGovern Institute for Brain Research have used a biological model to train a computer model to recognize objects, such as cars or people, in busy street scenes. Their innovative approach, which combines neuroscience and artificial intelligence with computer science, mimics how the brain functions to recognize objects in the real world. This versatile model could one day be used for automobile driver's assistance, visual search engines, biomedical imaging analysis, or robots with realistic vision. Here is the researchers' paper in PDF format."

Re:Interesting, but what comes next? by zappepcs · 2007-02-11 10:09 · Score: 5, Interesting

It is interesting to consider the problem AI researchers face: how to create intelligence when it is not really understood. In the time between now and when we do understand it, we'll have to develop systems using logic and software that approximate how we understand it. A simple example is to ask yourself how many times you had to learn that fire is hot. An AI system may have to learn this every time you turn it on.

There are software systems that can approximate the size of and distance between objects in a picture with reasonable accuracy, and if the scope of scenery presented to the system is limited, then that ability combined with sensing the motion of objects is enough to determine a large percentage of what is desired. This is not the hard part. The hard part is determining object classification and purpose in those cases where it is not simple.

Each of us can almost always look at a scene and determine the difference between a jogger and a purse thief on the run or a businessman late for an appointment. For computers to do so takes a great deal more work. It is only a subtle difference and one where both objects maintain similar base characteristics.

The point? Even mimicking human skills is not easy, and fails at many points without the overwhelming store of knowledge that humans have inside their heads. This would point to the theory that if more memory were available, AI would be easier, but this is not true either. Humans can recognize a particular model of car, no matter what color it is and usually despite the fact that it might have been in an accident. The thinking that comes into play when using the abstract to extract reality from a scene is not going to happen for computers for quite some time.

The danger is when such ill-prepared systems are put in charge of important things. This is always something to be wary of, especially when they are used to define or monitor criminal acts and identify those who are guilty, whether through cameras at intersections, security systems, or government surveillance systems.

--
Support NYCountryLawyer RIAA vs People

Re:Interesting, but what comes next? by cyphercell · 2007-02-11 10:23 · Score: 2, Interesting

we are able to give these systems our own abilities as a starting point and then watch it somehow create something more intelligent than we are... then we really have something.
This technology is a prerequisite to providing an AI system with a starting point. It offers, for instance, the powers of perception as input for a learning system. A baby, for example, opens their eyes and simply sees; this is only part of the baby's starting point. Other aspects of your "starting point" include predetermined goals such as eating, and also include points of failure like starving. Many avenues of input are required for effective learning at different capacities. Helen Keller, for instance, learned very early the value of eating; formal communication, however, was a remarkable accomplishment to say the least.

I agree with you that I would love to see a true A.I. system fully capable of learning, but discounting research that provides an AI system with the ability to see seems rather counter-productive.

If our intelligent systems are always evolution-limited by the progress of our own biological systems then I can't see how A.I. smarter than a human will ever be achieved.
This will be achieved by more input streams, a more sophisticated "starting point", well thought out points of success and failure, and finally the fact that we can make cooperation mandatory between artificial "minds". This is of course the point at which humans become lost and try to pull the plug, and Skynet launches the nukes in retaliation.

--
Under the influence of Post-Cyberpunk Gonzo Journalism

Re:not like the brain does. by dfedfe · 2007-02-11 10:49 · Score: 2, Interesting

I admit I only gave the paper a quick read, so I can't say for sure. But my impression was that spatial information was only discarded in passing information to the next layer in the model. That strikes me as reasonable. For one, they're simulating the ventral stream, which, in my understanding, is basically attended-object specific, so it seems proper to discard the relationship between the attended object and the rest of the scene. As for discarding spatial relationships between two features of the same object, that also strikes me as roughly reasonable. In real brains there isn't a strict tree-like hierarchy: projections from one region go to the next higher region but also skip past it and go to yet higher regions. Thus if we have projections A->B->C, B can discard the spatial relationship of two units in A, as long as A also projects to C, which would then still get the spatial information from A as well as the combined information from B (hope that makes sense). It's true that they didn't include such connections in this model, though. I still think it's fair, at least as a starting point for more complex models.
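A toy sketch of the A->B->C argument may help. It is not the paper's model: the layer functions, the use of a max over units as the pooling step, and the response values are all assumptions made up for illustration. It only shows that a pooling layer B can throw away position, and that a hypothetical direct A->C projection would hand that position back to C.

```python
def layer_b(a_units):
    """Hypothetical layer B: pool over A with a max.

    The result says how strong the best response was, not where it was.
    """
    return max(a_units)


def layer_c(a_units, b_out, use_skip):
    """Hypothetical layer C: always sees B's pooled value.

    With use_skip=True it also receives A's raw responses directly,
    standing in for a projection that skips past B.
    """
    if use_skip:
        return (b_out, tuple(a_units))
    return (b_out,)


# Same feature strengths, mirrored positions (values invented for the example).
left = [0.9, 0.1, 0.0, 0.0]
right = [0.0, 0.0, 0.1, 0.9]

# Strict A->B->C chain: C cannot tell the two layouts apart.
same_without_skip = layer_c(left, layer_b(left), False) == layer_c(right, layer_b(right), False)

# With the extra A->C projection, the layouts differ again at C.
same_with_skip = layer_c(left, layer_b(left), True) == layer_c(right, layer_b(right), True)
```

Under these assumptions `same_without_skip` is true and `same_with_skip` is false, which is the point being made: B is free to discard the spatial relationship as long as something else still carries it upward.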
They do discuss the lack of feedback projections, but I also think it's fair to ignore those for the present purposes, because feedback makes things a lot more complicated, modeling-wise.
Finally, I don't have time to go back and check this, but it seemed like the SVM was used to classify the output of the network. That is, it struck me as a test to see how well the highest layer in the network ended up representing the input (after all, you need *some* way to see how well it's doing, and that's a straightforward way). Could be wrong, though.
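To make the "classifier as a test of the top layer" idea concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the hinge loss. Everything in it is an assumption for illustration: the two-dimensional "top-layer" feature vectors and labels are invented, the hyperparameters are arbitrary, and nothing here is taken from the paper's actual setup.

```python
def score(w, b, x):
    """Signed distance-like score of feature vector x under (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised hinge loss.

    labels must be +1 or -1. Returns the weight vector and the bias.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * score(w, b, x)
            # The regularisation term shrinks w a little on every step.
            w = [wi * (1 - lr * reg) for wi in w]
            # The hinge term only contributes when the margin is violated.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1."""
    return 1 if score(w, b, x) >= 0 else -1


# Invented stand-ins for the highest layer's output on four images.
top_layer = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
labels = [1, 1, -1, -1]

w, b = train_linear_svm(top_layer, labels)
accuracy = sum(predict(w, b, x) == y for x, y in zip(top_layer, labels)) / len(labels)
```

If a simple linear readout like this separates the classes well, that suggests the top layer's representation already carries the distinction; if it does not, the fault may lie in the representation rather than in the classifier.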
My own two cents by MillionthMonkey · 2007-02-11 11:02 · Score: 5, Interesting

I've written here before about epileptic seizures I have that start somewhere in the right occipital lobe possibly near V1, based on the nature of the aura and a recent video EEG last month. These things started for no reason when I was a teenager and now involve these interesting post-ictal fugue states where only chunks of my brain seem to be working but I'm still able to run around and get in trouble. I've developed a talent over the years for coping with brain trauma and sort of bullshitting my way through it.

Usually I'm not forming long term memories during fugue states, but when I do, I remember some pretty interesting stuff. One thing that is typically impaired is object recognition, since this mostly seems to be handled by the right occipital lobe. I can see things but can't immediately recognize what they are, unless I use these left-brain techniques. The left occipital lobe can recognize objects too, but the approach it takes is different and more of a pain in the ass to have to rely on. It's more of a thunky symbolic recognition, as opposed to an ability to examine subtle forms, shapes, and colors. I have to basically establish a set of criteria that define what I'm looking for and then examine things in the visual field to see if they match those criteria. I'll look for a bed by trying to find things that appear flat and soft; I'll look for a door by looking for things with attributes of a doorknob such as being round and turnable; I'll find water to drink by looking closely at wet things. My wife says I make some interesting mistakes, like once confusing her desk chair for a toilet (forgetting for a moment that part of a toilet has to be wet, but at that point memory formation and retrieval are disrupted to the point where I could imagine forgetting that it's not enough to just be able to be sat on; toilets have to have water in them too). I have trouble recognizing faces, and she says I'm sometimes obviously pretending to recognize her. Recognizing a face using cold logic can be tricky even when you're not impaired. Recognizing familiar scenes and places becomes difficult. I drove home in a fugue state once, back in my twenties, and while I didn't crash into anybody or have any sort of accident, I did get lost on the way home from work. I ended up driving miles past where I lived. Even as a pedestrian, getting lost in familiar areas is still a problem.
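The criteria-matching strategy described above can be sketched in a few lines. This is only an illustration of the idea as described in the comment, not a model of anything neurological: the attribute names, the scene, and both helper functions are invented for the example.

```python
def matches(attrs, criteria):
    """True if an object has every required attribute value."""
    return all(attrs.get(name) == wanted for name, wanted in criteria.items())


def find(scene, criteria):
    """Names of all objects in the scene that satisfy the criteria."""
    return [name for name, attrs in scene.items() if matches(attrs, criteria)]


# A made-up scene described by a few coarse attributes.
scene = {
    "desk chair": {"sittable": True, "wet": False},
    "toilet": {"sittable": True, "wet": True},
    "bed": {"flat": True, "soft": True},
}

# The full checklist for a toilet, and the checklist with one item forgotten.
full_criteria = {"sittable": True, "wet": True}
forgotten_criteria = {"sittable": True}

found_with_full = find(scene, full_criteria)
found_with_forgotten = find(scene, forgotten_criteria)
```

With the full checklist only the toilet matches. Drop the "wet" requirement and the desk chair matches as well, which is the kind of mistake this explicit, rule-by-rule approach invites when one rule goes missing.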

People have been trying to come up with image processing algorithms that mimic cortical signal analysis for decades. I remember reading papers like this ten years ago. It's amazing to see they're still mistaking road signs for pedestrians. I don't think even I could make an error like that. The state of the art was totally miserable back then, too. Neuroscience has got to be one of the sciences most poorly understood by humans.