MIT & Harvard On Brain-Inspired A.I. Vision
An anonymous reader writes with this excerpt from TGDaily:
"Researchers from Harvard and MIT have demonstrated a way to build better artificial visual systems with the help of low-cost, high-performance gaming hardware. [A video describing their research is available.] 'Reverse engineering a biological visual system — a system with hundreds of millions of processing units — and building an artificial system that works the same way is a daunting task,' says David Cox, Principal Investigator of the Visual Neuroscience Group at the Rowland Institute at Harvard. 'It is not enough to simply assemble together a huge amount of computing power. We have to figure out how to put all the parts together so that they can do what our brains can do.' The team drew inspiration from screening techniques in molecular biology, where a multitude of candidate organisms or compounds are screened in parallel to find those that have a particular property of interest. Rather than building a single model and seeing how well it could recognize visual objects, the team constructed thousands of candidate models, and screened for those that performed best on an object recognition task. The resulting models outperformed a crop of state-of-the-art computer vision systems across a range of test sets, more accurately identifying a range of objects on random natural backgrounds with variation in position, scale, and rotation. Using ordinary CPUs, the effort would have required either years or millions of dollars of computing hardware. Instead, by harnessing modern graphics hardware, the analysis was done in just one week, and at a small fraction of the cost."
Instead, by harnessing modern graphics hardware, the analysis was done in just one week, and at a small fraction of the cost.
How inconsiderate. Think about all the potential engineers, administrators, janitors, etc., that would have been needed to do all that work the slow way, thus creating jobs for many for years to come. In one swoop all that potential future effort was made redundant. Once again "researchers" have proven that they are unable to see the big picture!
The Long Now Foundation
One of these guys will read about GA, realize this has been done before in other problem spaces, and already has a name.
Try not to get stuck on a local maximum!
Using a GPU to do heavy processing isn't really revolutionary anymore; it's been done before, in many different applications. It bothers me how every time new research comes out that uses this technique, journalists sensationalize the GPU aspect of it, often taking away from the actual breakthrough.
Is this the result of researchers themselves going on about how CUDA made their lives easier, or the journalists saying 'woah woah, you did this with VIDEO GAME stuff? Tell me more!'
How much power does that pattern recognition require? With standard approaches, probably a lot. But the approach they seem to use here (comparing how well the input fits thousands of candidate models) could require less, and do far better on hardware that is better suited to the task.
The team drew inspiration from screening techniques in molecular biology, where a multitude of candidate organisms or compounds are screened in parallel to find those that have a particular property of interest. Rather than building a single model and seeing how well it could recognize visual objects, the team constructed thousands of candidate models, and screened for those that performed best on an object recognition task.
Without reading the article, because that would be silly, this sounds a lot like using genetic algorithms. Not actually a new technique.
They didn't really use a GA. They had a genome that described the structure of the neural net they wanted to test, but they didn't "evolve" the population through any process of mutation or crossover. They just kept generating new random individuals until they had a good one.
It's like only doing the first step of a GA, but you keep generating random starting points until you find one whose fitness is fairly high (although they did a uniform sampling over all parameter values for their starting point, not quite completely random).
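To make the distinction concrete, here is a minimal Python sketch of that kind of screen. It is not the paper's code: the parameter names and ranges in SEARCH_SPACE and the toy_fitness score are made up for illustration, and a real screen would build, train, and test a model for each candidate instead.

```python
import random

# Hypothetical search space: names and ranges are invented for illustration.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "filter_size": [3, 5, 7, 9],
    "num_filters": [16, 32, 64],
    "pool_size": [2, 3, 4],
}


def sample_candidate(rng):
    """Draw each parameter independently and uniformly from its allowed values."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_fitness(candidate):
    """Stand-in score (higher is better); an arbitrary target, not a real benchmark."""
    return -(
        abs(candidate["num_layers"] - 3)
        + abs(candidate["filter_size"] - 5) / 2
        + abs(candidate["num_filters"] - 64) / 16
        + abs(candidate["pool_size"] - 2)
    )


def screen(num_candidates, keep, seed=0):
    """Generate random candidates, score them all, and keep the best few.

    There is no mutation or crossover: every candidate is an independent draw.
    """
    rng = random.Random(seed)
    population = [sample_candidate(rng) for _ in range(num_candidates)]
    population.sort(key=toy_fitness, reverse=True)
    return population[:keep]


if __name__ == "__main__":
    for candidate in screen(num_candidates=500, keep=5):
        print(toy_fitness(candidate), candidate)
```

Each candidate is scored independently of the others, which is why this kind of search spreads so easily across many machines or GPUs.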
From the paper, it took 23 PlayStation 3s one week to generate, train, and test a population of 7500 individuals.
Even so, they still beat the best hand-designed solutions. Imagine if they had implemented a true GA and just let that system keep on running.
Oh, and they used Python! I'm encouraged. I'm about half-way through translating the C# implementation of the HyperNEAT algorithm into Python. Next, I'll have to get some PS3s and implement a distributed PyCuda HyperNEAT system.
I void warranties.
As I posted in a reply above, they didn't really use a GA. There were no mutations or crossover. They just kept generating new random individuals until they had a good one.
I void warranties.
It seems to me that they are just using random functions to see what works best. But what they neglect to say is how well their best functions do: what percentage of identifications is correct? The assumption is that the brain uses some mathematical function for its processing, which may not be the case.
I personally have been doing this, as well as the same thing with mutations, for 6 years in an artificial life/neural net simulation. And I'm just a hobbyist (many researchers have done, and are doing, all kinds of this sort of thing). It's definitely a powerful technique and fun to read about their success, but hardly new.
Combinatorial chemistry techniques seem to be the inspiration here. Also known as trial and error (albeit in a rapid, well-organized fashion). Not exactly a new idea, but this is an interesting new implementation.
They did not use a genetic algorithm; instead they took the Monte Carlo approach. Not a bad approach if you don't mind making the results better by hand, but not as good as (or the same as) a GA. Also not a new approach.
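For comparison, a rough Python sketch of what a GA adds on top of plain random sampling: selection, crossover, and mutation. Everything here is hypothetical (the SEARCH_SPACE parameters, the toy_fitness score, the mutation rate, keeping the top half as parents); it only illustrates the general technique and has nothing to do with the researchers' actual code.

```python
import random

# Hypothetical genome layout: names and ranges are invented for illustration.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "filter_size": [3, 5, 7, 9],
    "num_filters": [16, 32, 64],
    "pool_size": [2, 3, 4],
}


def random_genome(rng):
    """The Monte Carlo step: an independent uniform draw for every gene."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_fitness(genome):
    """Stand-in score (higher is better); a real run would train and test a model."""
    return -(
        abs(genome["num_layers"] - 3)
        + abs(genome["filter_size"] - 5) / 2
        + abs(genome["num_filters"] - 64) / 16
        + abs(genome["pool_size"] - 2)
    )


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one parent or the other."""
    return {name: rng.choice([a[name], b[name]]) for name in SEARCH_SPACE}


def mutate(genome, rng, rate=0.1):
    """With probability `rate`, replace a gene with a fresh random value."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in genome.items()
    }


def evolve(generations, pop_size, seed=0):
    """Keep the better half as parents and refill the rest with mutated children."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=toy_fitness)


if __name__ == "__main__":
    best = evolve(generations=10, pop_size=20)
    print(toy_fitness(best), best)
```

With zero generations this reduces to the pure random draw; the loop is the part a GA adds, reusing what earlier candidates revealed instead of starting from scratch each time.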
Echolocation is the first thing that comes to mind when I think of ways to measure distance. Visually, my educated guess is that it would take much more processing to recognize shapes, since the system would have to deal with varying lighting conditions and geometries. Sending a ping signal like a submarine can measure distance equally well, if not better than I'd expect a visual system to. In fact, our eyes work in rather the same way, collecting light that bounces off objects.
Who knows if Nvidia OR AMD (think: Fusion) has been funding this research?
That part about magic cost reduction sounds a little funny.
Also, I think it should be said that a decent single-core graphics card might cost $150, vs. $300 for one of those power-wise fully featured PS3s with the Cell processor (not to mention a stream-processor-carrying Nvidia/ATI video card).