An Advance In Image Recognition Software
Roland Piquepaille alerts us to work by US and Israeli researchers who have developed software that can identify the subject of an image characterized using only 256 to 1024 bits of data. The researchers said this "could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see like humans do." As an example, they've picked up about 13 million images from the Web and stored them in a searchable database of just 600 MB, making it possible to search for similar pictures through millions of images in less than a second on a typical PC. The lead researcher, MIT's Antonio Torralba, will be presenting the research next month at a conference on Computer Vision and Pattern Recognition.
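A quick sanity check on the numbers in the summary: dividing the database size by the image count should land inside the quoted 256 to 1024 bits per image. This is only back-of-the-envelope arithmetic, assuming "600 MB" means 600 * 10**6 bytes and that the whole database is image codes with no index overhead (neither assumption is stated in the summary).

```python
IMAGES = 13_000_000            # "about 13 million images" from the summary
DATABASE_BYTES = 600 * 10**6   # "600 MB", assuming 1 MB = 10**6 bytes

bytes_per_image = DATABASE_BYTES / IMAGES
bits_per_image = bytes_per_image * 8

print(f"{bytes_per_image:.1f} bytes = {bits_per_image:.0f} bits per image")
```

That works out to about 46 bytes, or roughly 370 bits, per image, which is consistent with the 256 to 1024 bit range.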
This will be used to break CAPTCHA-type schemes even more thoroughly than they are already broken.
...I'll believe it when I see it.
Until then, it's snake oil, as far as I'm concerned.
Perl - $Just @when->$you ${thought} s/yn/tax/ &couldn\'t %get $worse;
If I read you correctly - and I think I do... You mean to say that snake oil is somehow... invisible?
No wonder those snakes are not only so quiet, but I never even see 'em coming!
Geez. We don't stand a chance.
"Flyin' in just a sweet place,
Never been known to fail..."
I hate reading press releases instead of reading papers with real explanations of what's going on.
I just finished reading "Small Codes and Large Image Databases for Recognition" written by the guy. All he did was implement Geoff Hinton's idea of databasing images with binarized coefficients produced by Restricted Boltzmann Machines.
Hinton himself gave a talk on it for Google here:
http://www.youtube.com/watch?v=AyzOUbkUf3M
Actually I'm wondering, is he plagiarizing Hinton?
-- Making computers see, hear, and think... http://www.componica.com/
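For anyone wondering why short binary codes make the search so fast: comparing two codes is just an XOR followed by a bit count (the Hamming distance), so scanning millions of codes is cheap. Here is a minimal sketch, assuming the codes have already been computed somehow (the Restricted Boltzmann Machine that would produce them is not shown). The names `pack_codes` and `hamming_search` are made up for illustration and are not from the paper.

```python
import numpy as np

def pack_codes(bits):
    """Pack an (n, n_bits) array of 0/1 values into (n, n_bits // 8) bytes."""
    return np.packbits(np.asarray(bits, dtype=np.uint8), axis=1)

# Lookup table: number of set bits in each possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_search(database, query, k=5):
    """Return indices and Hamming distances of the k codes closest to query."""
    xor = np.bitwise_xor(database, query)   # differing bits, byte by byte
    dists = POPCOUNT[xor].sum(axis=1)       # Hamming distance for each row
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]
```

This is a brute-force linear scan, not whatever indexing the researchers actually use. Usage would look like `idx, dists = hamming_search(db, db[0])`, where `db` comes from `pack_codes`.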
Oh my, the soon to be most searched "name" on the web is... Jenna Jameson! Wait a minute, I think I misunderstood "facial" recognition...
Sig Registration Form 34c_766(a) submitted to Ministry of Signature Management. Approval pending.
"They're going to distinguish an individual based on images with 256 to 1024 bits of data?"
No one said they were going to identify individual people with this. The main gist of this research seems to be efficiency (in both space and time, if I read it correctly). For instance, if one wanted to identify every face in a picture of a crowd, they could apply this algorithm to a low-res version of the image to quickly find the locations of every "face," and then use a more advanced face recognition algorithm to actually figure out who it is they're looking at.
Or rather... I'll believe it when it sees me!
The actual paper is at http://people.csail.mit.edu/torralba/publications/nipsRecognitionBySceneAlignment.pdf
From what I can tell, it's basically, "blur the image down to only a few hundred pixels and then you have less data to comb through!"
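To make the "blur it down" idea concrete, here is a rough sketch of shrinking a grayscale image to a few hundred pixels by averaging blocks. It assumes the image is a 2-D NumPy array, and the function name `shrink` and the 16 x 16 target size are my own choices for illustration, not the resolution or method used in the paper.

```python
import numpy as np

def shrink(image, size=16):
    """Downsample a 2-D grayscale array to size x size by block averaging."""
    h, w = image.shape
    bh, bw = h // size, w // size            # block height and width
    # Crop so the image divides evenly into size x size blocks.
    cropped = image[:bh * size, :bw * size].astype(float)
    # Split into blocks, then average within each block.
    return cropped.reshape(size, bh, size, bw).mean(axis=(1, 3))
```

A 16 x 16 result is 256 pixels, so comparing two images this way touches a few hundred numbers instead of hundreds of thousands.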
Yep, I second that. The article is really short on details. Not surprising, since they're presenting it at a conference next month. We don't even know what kind of features they are extracting from the images. Are they using wavelets? Texture descriptors? Color information? Shape recognition? It sounds like a combination of true content-based image recognition with keyword input association, if I read the article correctly.
If they are claiming to have a general image recognition algorithm, that'd be something. As it is, a lot of research goes into recognizing specific kinds of things, such as faces, license plates, etc. I'm very curious to see what they've come up with.
http://www.rootstrikers.org/
What you're asking for is ill-defined, but much sought after.
A reasonable descriptor which produces distances that seem somewhat correlated with human perception would indeed be Antonio Torralba and Aude Oliva's gist descriptor.
http://people.csail.mit.edu/torralba/code/spatialenvelope/
It's become quite popular in computer vision and computer graphics for scene matching.
Read the papers, then:
http://people.csail.mit.edu/torralba/tinyimages/
Any decent object recognition algorithm supports at least affine transformations, which include rotation.
Some of those scientists are actually pretty smrt.
There are all kinds of ways, but two simple ones come to mind. If you resample the image into polar coordinates, a rotation becomes a shift along the angle axis, so the power spectrum is conveniently orientation independent. You can use the same trick with a shift: the power spectrum of an image in ordinary Cartesian coordinates is shift independent.
Another way is to somehow identify the orientation. A simple way to do that is to find the axis along which there's maximum variation and rotate until those axes match in both images.
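The shift case is easy to check numerically. This is a small sketch using NumPy's FFT on a random test image. The invariance is exact only for circular shifts (content that leaves one edge wraps around to the other), which is what `np.roll` does; `power_spectrum` is just a helper name I picked.

```python
import numpy as np

def power_spectrum(image):
    """Squared magnitude of the 2-D discrete Fourier transform."""
    return np.abs(np.fft.fft2(image)) ** 2

rng = np.random.default_rng(0)
image = rng.random((32, 32))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))  # circular shift

# The pixels differ, but the power spectra match.
same = np.allclose(power_spectrum(image), power_spectrum(shifted))
```

The rotation case works the same way once the image is on a (radius, angle) grid, because a rotation is then a circular shift along the angle axis. The resampling step is left out here.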
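The axis of maximum variation can be found from the covariance of the point coordinates. Below is a minimal sketch, assuming each shape is given as an (n, 2) array of x, y points (for example, the coordinates of its foreground pixels). The function names are hypothetical. Note that an axis has no direction, so the result is only defined up to a 180 degree flip.

```python
import numpy as np

def principal_angle(points):
    """Angle in radians, in [0, pi), of the axis of maximum variance."""
    centered = points - points.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    major = eigvecs[:, -1]                   # direction of largest variance
    return np.arctan2(major[1], major[0]) % np.pi

def align_rotation(points_a, points_b):
    """Rotation angle that brings the principal axis of b onto that of a."""
    return principal_angle(points_a) - principal_angle(points_b)
```

This breaks down for shapes that are nearly round, where the two eigenvalues are close and the major axis is poorly defined.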
Pixel by pixel co-registration basically does look at a similarity measure for a lot of variations on the affine transform. You generally don't have to look at them all though: you use an iterative algorithm with a clever optimization strategy so your transform gets better and better instead of searching through the parameter space randomly.
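As a toy version of that iterative idea, here is a greedy local search that only handles integer translations (a small subset of the affine transforms), with sum of squared differences as the similarity measure. It is a sketch under those assumptions, not a real registration package: `ssd` and `register_shift` are names I made up, and shifts wrap around via `np.roll`.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return float(np.sum((a - b) ** 2))

def register_shift(fixed, moving, max_iters=100):
    """Greedy search for the integer (dy, dx) that best aligns moving to fixed."""
    shift = (0, 0)
    best = ssd(fixed, moving)
    for _ in range(max_iters):
        improved = False
        # Try one-pixel steps and keep any step that lowers the error.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cand = (shift[0] + dy, shift[1] + dx)
            score = ssd(fixed, np.roll(moving, cand, axis=(0, 1)))
            if score < best:
                shift, best, improved = cand, score, True
        if not improved:
            break                            # local minimum reached
    return shift, best
```

Each pass only evaluates four neighbors instead of every possible shift. The catch is that a greedy search like this can get stuck in a local minimum on busy images, which is why real systems tend to work coarse to fine or use a smarter optimizer.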