Image Recognition Neural Networks, Open Sourced
sevarac writes "The latest release of Java Neural Network Framework Neuroph v2.3 comes with ready-to-use image recognition support. It provides GUI tool for training multi layer perceptrons for image recognition, and an easy-to-use API to deploy these neural networks in end-user applications. This image recognition approach has been successfully used by DotA AutoScript tool, and now it is released as open source. With this Java library basic image recognition can be performed in just few lines of code. The developers have published a howto article and an online demo which can be used to train image recognition neural networks online."
Neural nets aren't a good solution for all problems (nothing is, of course).
Furthermore you need plenty of training data partitioned into training and test sets so you can evaluate the generalisation error. If you don't do that you can have absolutely no confidence about the performance of your system.
Working with neural nets has a mostly experimental feel to them (I am a final year AI student). I'm not sure application developers have the discipled methodology to use them correctly (though maybe this framework helps with that) as these behave quite differently from standard computer vision recognition techniques.
Nonetheless, good to see another free toolkit out there.
Maybe im wrong but to me "open sourced" sounds like the project was closed then it was "open sourced" and now its open source. This project has AFAICT always been open:
Neuroph started as a graduate thesis project, after that a part of master theses, and on September 2008. it became open source project on SourceForge.
Also why not give the license in the summary (LGPL3).
IranAir Flight 655 never forget!
This is hardly news-worthy. Having looked at their web-site and having done object recognition myself I can tell you, that this wont work for several reasons. From their FAQ it seems that they are just taking the image and create very long vector from it and then feed that into an ANN.
The following problems exist with this approach (which has been known and addressed in at least the last 20 years):
1. What if you want to classify pictures that have different sizes (not too uncommon)? Wont work because you first have to set a fixed number of neurons in your first layer.
2. What about different locations of the same object? Even if you move the whole image just one image to the right this will dramatically change your flattened vector and can yield completely different classification. This is called location invariance and is solved in the computer vision community by extracting a lot of features on "interesting" regions in the image. With that you can also have different sized images. One famous extraction method is the scale invariant feature transform (SIFT).
I could go on but you get the point. As for using ANNs instead of other classification methods, it does not really matter that much what kind of classificator you use as the pre-processing is much more important. Support-Vector-Machines has been hyped and work pretty well, but it has it's own caveats. However, using a SVM on the raw-image data produces the same horrible results that I would expect here.
So please don't take the (dumb) slashdot blurb as the exhaustive feature list of the library.
Supported :
-> Adaline
-> Perceptron
-> Multi layer perceptron
-> Hopfield
-> Kohonen
-> Hebbian
-> Maxnet
-> Competitive network
-> RBF network
So it does support several quite modern approaches. It also has a training utility which supports image training. This should be very useful to students imho.
I spent a lot of time on this project, writing a lot of neural net simulations, supervised and unsupervised learning, back-prop, Hopfield nets, reproducing a lot of Terry Sejnowski's and Grossman's work, taking trips over to Caltech to see what David Van Essen and his crew were doing with their analysis of the visual cortex of Monkey brains, trying to understand how "wetware" neural nets can so quickly identify features in a visual field, resolve 3D information from left-right pairs, and the like. For the most part, all of the neural net models were really programmable dynamical systems, and the big trick was to find ways (steepest descent, simulated annealing, lagrangian analysis) of computing a set of synaptic parameters whose response minimizes an error function. That, and figuring out the "right" error function and training sets (assuming you are doing supervised learning).
Bottom line was, not much came of all this, beyond a few research grants and published papers. The one thing that we do know now is, real biological neural networks do not learn by backward-error propagation. If they did, the learning would be so slow that we would all still be no smarter than trilobites if that. Most learning does appear to be "connectionist" and is stored in the synaptic connections between nodes, and that those connections are strengthened when the nodes that they connect often fire simultaneously. There is some evidence now of "grandmother cells" which are hypothetical cells that fire when, e.g. your grandmother comes into the room. But other than that, most of the real magic of biological vision appears to be in the pre-processing stages of the retinal signals, which are hardcoded layer upon layer of edge-detectors, color mapping, and some really amazing slicing, dicing and discrete FFT transforms of the orginal data into small enough and simple enough pieces that the cognitive part of the brain can make sense of the information.
It's pretty easy to train a small back-prop net to "spot" a silhouette of a cat and distinguish it from a triangle and a doughnut. It is not so easy to write a network simulation that can pick out a small cat in a busy urban scene, curled up beneath a dumpster, batting at a mouse....
To do that, you need a dog.