New Algorithms Improve Image Search
bc90021 writes "Electrical engineers from UC San Diego are making progress on an image search engine that analyzes the images themselves. At the core of this Supervised Multiclass Labeling system is a set of simple yet powerful algorithms developed at UCSD. Once you train the system (the 'supervised' part), you can set it loose on a database of unlabeled images. The system calculates the probability that various objects it has been trained to recognize are present, and labels the images accordingly. After labeling, images can be retrieved via keyword searches. Accuracy of the UCSD system has outpaced that of other content-based image labeling and retrieval systems in the literature. One of the co-authors works at Google, where the researchers have access to image collections at the largest of scales."
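The label-then-search flow described in the summary can be sketched roughly as follows. This is only an illustration, not the UCSD method: the per-image class probabilities are hard-coded, hypothetical numbers standing in for whatever a trained model would output, and the names `label_images`, `build_index`, `search`, `top_k` and `min_prob` are made up for this example.

```python
from collections import defaultdict

# Hypothetical output of a trained model: P(object present) per image.
# These numbers are invented for illustration only.
CLASS_PROBS = {
    "img1.jpg": {"rabbit": 0.70, "grass": 0.60, "tank": 0.01},
    "img2.jpg": {"rabbit": 0.10, "grass": 0.50, "tank": 0.80},
    "img3.jpg": {"rabbit": 0.55, "grass": 0.15, "tank": 0.05},
}


def label_images(class_probs, top_k=2, min_prob=0.2):
    """Keep the top_k most probable labels per image that reach min_prob."""
    labels = {}
    for image_id, probs in class_probs.items():
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        labels[image_id] = [name for name, p in ranked[:top_k] if p >= min_prob]
    return labels


def build_index(labels):
    """Invert image -> labels into keyword -> set of images."""
    index = defaultdict(set)
    for image_id, names in labels.items():
        for name in names:
            index[name].add(image_id)
    return index


def search(index, query):
    """Return the images whose labels include every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


labels = label_images(CLASS_PROBS)
index = build_index(labels)
print(sorted(search(index, "rabbit")))
print(sorted(search(index, "rabbit grass")))
```

Once the labels exist, retrieval is ordinary keyword lookup; all the hard work is in producing good probabilities in the first place.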
Snarkiness aside, this is pretty cool stuff. I hope to see usable OSS code in a few years. Imagine how cool it would be to query "show me all pics with my daughter and her rabbits" and have it weed through the thousands of digital family photos.
Method of processing duck feet
The probability is either zero or one, because whether or not the feature being sought is present is a state of nature. It would be more helpful to call this number the confidence that the feature is present.
... was similarly trained to recognise tanks in landscapes. It was doing really well - getting a great score on the fresh images it was presented with.
Then they introduced it to a new batch of images and it fell apart.
Turns out that the initial set of images had all the tanks shot on a sunny day and all the tankless images shot on a cloudy day (or vice versa). It had learned to tell a sunny day from a cloudy day.
Ha ha.
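The tank story above is easy to reproduce with a toy. The sketch below is not the system from the anecdote: it swaps in a one-feature threshold classifier (a decision stump), and every number in it is invented. Each image is reduced to two hypothetical features, (brightness, tank_shape_score). In the training set brightness separates the classes perfectly while the shape score is a bit noisy, so the stump latches onto the weather.

```python
def predict(stump, features):
    """Predict 1 (tank) or 0 (no tank) from a single thresholded feature."""
    feature_index, threshold, polarity = stump
    return int(polarity * features[feature_index] >= polarity * threshold)


def accuracy(samples, stump):
    """Fraction of samples the stump classifies correctly."""
    hits = sum(predict(stump, x) == y for x, y in samples)
    return hits / len(samples)


def fit_stump(samples):
    """Pick the (feature, threshold, polarity) with the best training accuracy."""
    best = None  # (accuracy, feature_index, threshold, polarity)
    n_features = len(samples[0][0])
    for f in range(n_features):
        for features, _ in samples:
            for polarity in (1, -1):
                candidate = (f, features[f], polarity)
                acc = accuracy(samples, candidate)
                if best is None or acc > best[0]:
                    best = (acc,) + candidate
    return best[1:]


# (brightness, tank_shape_score), label. All values are made up.
# Training set: every tank photo is sunny, every tankless photo is cloudy.
TRAIN = [
    ((0.90, 0.70), 1), ((0.80, 0.40), 1), ((0.85, 0.60), 1), ((0.95, 0.80), 1),
    ((0.20, 0.30), 0), ((0.30, 0.50), 0), ((0.25, 0.20), 0), ((0.10, 0.45), 0),
]
# New batch: the weather is flipped, so brightness no longer tracks the label.
NEW_BATCH = [
    ((0.20, 0.75), 1), ((0.15, 0.65), 1),
    ((0.90, 0.25), 0), ((0.85, 0.30), 0),
]

stump = fit_stump(TRAIN)
print("feature chosen:", ["brightness", "tank_shape_score"][stump[0]])
print("training accuracy:", accuracy(TRAIN, stump))
print("new batch accuracy:", accuracy(NEW_BATCH, stump))
```

The classifier did exactly what it was asked to do: it found the cheapest signal that separated the training images. Nothing in the training data told it that the signal was the sky rather than the tank.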
That's one of the famous uses of image analysis - finding the presence of human skin in digital pictures.
Skin detection.....5.5 million hits on Google.
Once you can do this accurately, companies like McAfee and Norton can scan the internet and database pr0n sites for the whole web. Keep in mind that there's a subscription service that uses a Norton database to filter websites for subscribers.
Parents...
Since a huge % (perhaps most) of image searches are for porn, it is probably a worthwhile thing for a search server to quickly classify likely porn as a way to reduce search server loading.
Engineering is the art of compromise.