An Advance In Image Recognition Software

Posted by kdawson on Saturday May 24, 2008 @12:45PM from the needle-in-a-haystack dept.

Roland Piquepaille alerts us to work by US and Israeli researchers who have developed software that can identify the subject of an image characterized using only 256 to 1024 bits of data. The researchers said this "could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see like humans do." As an example, they've picked up about 13 million images from the Web and stored them in a searchable database of just 600 MB, making it possible to search for similar pictures through millions of images in less than a second on a typical PC. The lead researcher, MIT's Antonio Torralba, will be presenting the research next month at a conference on Computer Vision and Pattern Recognition.

There goes the neighborhood by HeavensBlade23 · 2008-05-24 12:48 · Score: 3, Funny

This will be used to break CAPTCHA-type schemes even worse than they already are.
1. Re:There goes the neighborhood by Jeremiah Cornelius · 2008-05-24 12:57 · Score: 4, Insightful
  
  This will be used to identify YOU, citizen.
  
  --
  "Flyin' in just a sweet place,
  Never been known to fail..."
2. Re:There goes the neighborhood by elnico · 2008-05-24 13:22 · Score: 5, Informative
  
  Incorrect. First of all, in a CAPTCHA, you're trying to very rigorously inspect a single image. This advance seems to be more about taking quick glances at lots of images. Furthermore, in the article, they talk about recognizing flowers and cars. The fact is, computers already have no problem recognizing letters and numbers in images. We got that down a long time ago. The difficult things about reading a CAPTCHA image are removing distortion and splitting the whole image into the component characters. If you read the article, you'd see that this research has nothing to do with that.
3. Re:There goes the neighborhood by dbIII · 2008-05-24 13:41 · Score: 4, Funny
  
  Looks like I'll have to stop attaching my low bit facial image as a signature.
Like every other "advance" in image recognition... by ZxCv · 2008-05-24 12:55 · Score: 2, Insightful

...I'll believe it when I see it.

Until then, it's snake oil, as far as I'm concerned.

--

Perl - $Just @when->$you ${thought} s/yn/tax/ &couldn\'t %get $worse;
Re:Like every other "advance" in image recognition by Jeremiah Cornelius · 2008-05-24 13:03 · Score: 4, Funny

If I read you correctly - and I think I do... You mean to say that snake oil is somehow... invisible?

No wonder those snakes are not only so quiet, but I never even see 'em coming!

Geez. We don't stand a chance.

--
"Flyin' in just a sweet place,
Never been known to fail..."
Of course it helps if you read the papers... by Steve Mitchell · 2008-05-24 13:08 · Score: 3, Informative

I hate reading press releases instead of reading papers with real explanations of what's going on.

I just finished reading "Small Codes and Large Image Databases for Recognition" written by the guy. All he did was implement Geoff Hinton's idea of databasing images with binarized coefficients produced by Restricted Boltzmann Machines.

Hinton himself gave a talk on it for Google here:
http://www.youtube.com/watch?v=AyzOUbkUf3M

Actually I'm wondering, is he plagiarizing Hinton?
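
For readers who want a feel for what "databasing images with binarized coefficients" buys you, here is a minimal sketch of nearest-neighbor lookup over short binary codes using Hamming distance. It is an illustration only, not the method from the paper: the 8-bit codes and the names `hamming`, `nearest`, and `database` are made up for the example, and the step that actually produces the codes (the Restricted Boltzmann Machine) is left out.

```python
# Illustrative only: the codes below are invented 8-bit values,
# not codes produced by any real model.

def hamming(a: int, b: int) -> int:
    """Number of bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")

def nearest(query: int, database: dict, k: int = 2) -> list:
    """Return the k database keys whose codes are closest to the query."""
    return sorted(database, key=lambda name: hamming(query, database[name]))[:k]

# Hypothetical database: image name -> binary code stored as an integer.
database = {
    "flower_1": 0b10110010,
    "flower_2": 0b10110110,
    "car_1": 0b01001101,
}

print(nearest(0b10110011, database))
```

Comparing two codes is one XOR plus a bit count, which is why scanning millions of short codes can be fast on an ordinary PC.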

--
-- Making computers see, hear, and think... http://www.componica.com/
1. Re:Of course it helps if you read the papers... by samkass · 2008-05-24 13:19 · Score: 2, Insightful
  
  I think plagiarizing is a strong word to throw around. And particular implementations of general approaches can often be very interesting when one considers what tradeoffs are made transferring pure theory to practical applications. If this sort of thing were attempted in the 90's, they'd probably arbitrarily pick a few hundred features by hand and KL-transform them down to the most significant dimensions and hash those into one of these codes. Since I've been out of "the biz" for a while, it's pretty interesting to me to read about these new approaches and how far both the theory and implementations have come.
  
  --
  E pluribus unum
2. Re:Of course it helps if you read the papers... by Hays · 2008-05-24 15:17 · Score: 3, Insightful
  
  Geoff Hinton worked with them, you really think they're plagiarizing him? That claim doesn't even make sense, this is a novel research domain. A big part of science is taking people's ideas, reproducing them, and applying them to novel domains. That's how it's SUPPOSED to work.
  
  This research involves the use of one of the largest image databases seen in computer vision. It shows that you can do extremely rapid scene matching for databases of this scale. No, that's not obvious no matter what you think. This image data is fairly high dimensional.
  
  This research says something about the space of likely scenes and it might be a key enabling technology to a lot of the heavily data driven computer vision and computer graphics approaches popping up lately.
Search Jenna Jameson? by Bayoudegradeable · 2008-05-24 13:23 · Score: 4, Funny

Oh my, the soon to be most searched "name" on the web is... Jenna Jameson! Wait a minute, I think I misunderstood "facial" recognition...

--
Sig Registration Form 34c_766(a) submitted to Ministry of Signature Management. Approval pending.
Re:Oh really? by elnico · 2008-05-24 13:31 · Score: 3, Insightful

"They're going to distinguish an individual based on images with 256 to 1024 bits of data?"

No one said they were going to identify individual people with this. The main gist of this research seems to be efficiency (in both space and time, if I read it correctly). For instance, if one wanted to identify every face in a picture of a crowd, they could apply this algorithm to a low-res version of the image to quickly find the locations of every "face," and then use a more advanced face recognition algorithm to actually figure out who it is they're looking at.
Re:Like every other "advance" in image recognition by Yvan256 · 2008-05-24 13:41 · Score: 5, Funny

Or rather... I'll believe it when it sees me!
Re:Like every other "advance" in image recognition by Spikeman56 · 2008-05-24 14:55 · Score: 3, Informative

The actual paper is at http://people.csail.mit.edu/torralba/publications/nipsRecognitionBySceneAlignment.pdf

From what I can tell, it's basically, "blur the image down to only a few hundred pixels and then you have less data to comb through!"
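
A rough sketch of that "blur it down" step, assuming plain block averaging on a grayscale image stored as a list of rows. The function name `downsample`, the 64x64 test image, and the factor of 4 are all assumptions for the example; the paper's actual preprocessing may differ.

```python
def downsample(image, factor):
    """Shrink a 2D grayscale image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    small = []
    # Ignore any leftover rows/columns that do not fill a whole block.
    for top in range(0, height - height % factor, factor):
        row = []
        for left in range(0, width - width % factor, factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

# Synthetic 64x64 gradient image, reduced to 16x16 (256 pixels).
image = [[(x + y) % 256 for x in range(64)] for y in range(64)]
tiny = downsample(image, 4)
print(len(tiny), len(tiny[0]))
```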
Re:Like every other "advance" in image recognition by Concerned Onlooker · 2008-05-24 15:01 · Score: 2, Interesting

Yep, I second that. The article is really short on details. Not surprising since they're presenting it at a conference next month. We don't even know what kind of features they are extracting from the images. Are they using wavelets? Texture descriptors? Color information? Shape recognition? It sounds like a combination of true content-based image recognition with keyword input association, if I read the article correctly.

If they are claiming to have a general image recognition algorithm, that'd be something. As it is, a lot of research goes into recognizing specific kinds of things, such as faces, license plates, etc. I'm very curious to see what they've come up with.

--
http://www.rootstrikers.org/
Re:Very cool stuff... by Hays · 2008-05-24 15:26 · Score: 3, Informative

What you're asking for is ill-defined, but much sought after.

A reasonable descriptor which produces distances that seem somewhat correlated with human perception would indeed be Antonio Torralba and Aude Oliva's gist descriptor.

http://people.csail.mit.edu/torralba/code/spatialenvelope/

It's become quite popular in computer vision and computer graphics for scene matching.
Re:Like every other "advance" in image recognition by Hays · 2008-05-24 15:27 · Score: 4, Informative

Read the papers then
http://people.csail.mit.edu/torralba/tinyimages/
Re:I forsee nefarious law enforcement uses by ceoyoyo · 2008-05-24 19:15 · Score: 2, Insightful

Any decent object recognition algorithm supports at least affine transformations, which include rotation.

Some of those scientists are actually pretty smrt.
Re:I forsee nefarious law enforcement uses by ceoyoyo · 2008-05-25 07:58 · Score: 4, Informative

There are all kinds of ways, but two simple ones come to mind. If you resample the image into a polar coordinate system, a rotation turns into a shift along the angle axis, and the power spectrum doesn't care about shifts, so it is conveniently orientation independent. You can use the same trick with an ordinary shift: the power spectrum of the image in a Cartesian coordinate system is shift independent.
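
The shift-independence claim is easy to check numerically. The sketch below uses a 1D signal and a hand-written DFT to keep it dependency-free; the sample values and the name `power_spectrum` are arbitrary choices for the example. The same property holds in 2D, and for rotations once the image has been resampled so that angle runs along one axis.

```python
import cmath

def power_spectrum(signal):
    """Squared magnitude of the discrete Fourier transform of a 1D signal."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

signal = [0, 1, 4, 2, 0, 0, 3, 1]
shifted = signal[3:] + signal[:3]   # circular shift by three samples

original_ps = power_spectrum(signal)
shifted_ps = power_spectrum(shifted)

# A shift only changes the phase of each coefficient, not its magnitude.
print(all(abs(a - b) < 1e-9 for a, b in zip(original_ps, shifted_ps)))
```

Note that this holds exactly only for circular shifts; with real images, content moving in and out of the frame makes the match approximate.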

Another way is to somehow identify the orientation. A simple way to do that is to find the axis along which there's maximum variation and rotate until those axes match in both images.
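
One way to sketch the "axis of maximum variation" idea is with the covariance of a set of 2D points (say, the coordinates of the bright pixels). The closed-form angle below is the standard principal-axis formula for a 2x2 covariance matrix; the point set and the names `principal_angle` and `rotate` are made up for the example.

```python
import math

def principal_angle(points):
    """Angle (radians) of the axis of maximum variance of a set of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    return 0.5 * math.atan2(2 * cxy, cxx - cyy)

def rotate(points, angle):
    """Rotate points counter-clockwise about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Hypothetical elongated shape, and a copy turned by 30 degrees.
shape = [(-3, 0), (-1, 0.5), (0, 0), (1, -0.5), (3, 0)]
turned = rotate(shape, math.radians(30))

# The difference between the two principal angles estimates the rotation.
recovered = principal_angle(turned) - principal_angle(shape)
print(round(math.degrees(recovered), 6))
```

An axis has no direction, so this only pins the orientation down modulo 180 degrees, and it gets unreliable for shapes that are nearly as wide as they are tall.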

Pixel by pixel co-registration basically does look at a similarity measure for a lot of variations on the affine transform. You generally don't have to look at them all though: you use an iterative algorithm with a clever optimization strategy so your transform gets better and better instead of searching through the parameter space randomly.