Google Announces Image Recognition Advance

Posted by timothy on Thursday November 20, 2014 @11:18AM from the what-does-a-grue-look-like? dept.

Rambo Tribble writes: Using machine learning techniques, Google claims to have produced software that can better produce natural-language descriptions of images. This has ramifications for uses such as better image search and for better describing the images for the blind. As the Google people put it, "A picture may be worth a thousand words, but sometimes it's the words that are the most useful ..."

29 comments

what would be useful by ihtoit · 2014-11-20 11:22 · Score: 0

...is an offline app that compares two images and, if they scale-match, keeps the higher-resolution one and ditches the smaller one. And runs the comparison over several thousand files (or even hundreds of thousands, or millions) - time is not a factor.
(a scaling deduplicator, if you will).
Is there already such a beast? Anyone?

--
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
1. Re:what would be useful by Sowelu · 2014-11-20 12:21 · Score: 1
  
  If you had a program that could generate a scale invariant hash of an image, and given a command line tool that could tell you the resolution of an image (which exist, just don't know the names), I'm pretty sure you could do that in a single line in bash.
  Wouldn't be surprised if there was a program that generated an image hash. Even better if it generates a value where stronger features are higher bits and smaller features (that could be lost in scaling) are lower bits, so you could truncate and compare? That might be too fiddly.
2. Re:what would be useful by safetyinnumbers · 2014-11-20 12:24 · Score: 1
  
  findimagedupes builds a database of fingerprints (basically a scaled-down monochrome image) and can call an external program with the matching duplicates. You could read the resolution in the external script using jhead or exiftool.
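
The fingerprint idea discussed in this subthread (a scaled-down monochrome image that comes out the same whatever the original size) can be sketched in a few lines. This is a minimal illustration in plain Python, not how findimagedupes or VisiPics work internally. The names `shrink`, `fingerprint`, and `hamming` are hypothetical, and the images are assumed to be already decoded into rows of 0-255 grayscale values.

```python
def shrink(pixels, size=8):
    """Box-average a grayscale image down to size x size.

    `pixels` is a list of rows of 0-255 values; the image is assumed
    to be at least `size` pixels wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def fingerprint(pixels, size=8):
    """Return a size*size-bit integer: one bit per cell, set if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two files whose fingerprints differ in only a few bits are likely scaled copies of one another, and the one with the larger width times height is the one to keep. How many differing bits to tolerate is a tuning choice, not something the thread specifies.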
3. Re:what would be useful by Morose · 2014-11-20 12:36 · Score: 4, Informative
  
  Not only is there a program for this, but it's free. It's called VisiPics (http://www.visipics.info/index.php?title=Main_Page). I use it to organize my photo collection. Not only can it do scaling, but it can check other similarity factors as well. Not sure if it scales to millions, but I've used it with 20,000 images before.
  Enjoy.
4. Re: what would be useful by Anonymous Coward · 2014-11-20 12:53 · Score: 0
  
  I wrote a program that does exactly this last year.
5. Re:what would be useful by ArcadeMan · 2014-11-20 13:37 · Score: 2
  
  However, because of lossy compression, you might want to keep an image that is slightly lower resolution but still has a better overall image quality.
  
  --
  Get free satoshi (Bitcoin) and Dogecoins
6. Re:what would be useful by mister_playboy · 2014-11-20 13:50 · Score: 2
  
  And runs the comparison over several thousand files (or even hundreds of thousands, or millions)
  Ah yes... the joys of Internet Art collecting!
  
  --
  Do what thou wilt shall be the whole of the Law ::: Love is the law, love under will
7. Re:what would be useful by Anonymous Coward · 2014-11-21 04:24 · Score: 0
  
  I like it. Very nice for incremental duplicate finding if you keep around a hashtable. The "Internet Art" collecting never stops...
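
The "keep around a hashtable" approach to incremental duplicate finding might look like the sketch below. It assumes each file already has a scale-invariant fingerprint computed by some other tool, and it only catches exact fingerprint matches; near matches would need a distance search instead of a dictionary lookup. `DedupIndex` and its method names are hypothetical.

```python
class DedupIndex:
    """Maps a fingerprint to the highest-resolution file seen so far."""

    def __init__(self):
        self.best = {}  # fingerprint -> (pixel_count, path)

    def add(self, fp, path, width, height):
        """Record a file and return the path to discard, or None if it is new."""
        new = (width * height, path)
        old = self.best.get(fp)
        if old is None:
            self.best[fp] = new
            return None
        if new[0] > old[0]:
            # The newcomer has more pixels: it replaces the stored file.
            self.best[fp] = new
            return old[1]
        # The stored file is at least as large: discard the newcomer.
        return path
```

Because the index only stores one entry per fingerprint, new files can be checked one at a time as they arrive without rescanning the whole collection.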
8. Re:what would be useful by ihtoit · 2014-11-21 20:34 · Score: 1
  
  ooh, this looks like it might be just the ticket! Thanky! :D
  
  --
  Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
But can it generate an image from words ... by xmas2003 · 2014-11-20 11:25 · Score: 1

Pretty nifty ... wondering if I describe a scene such as from the TFA:
"Two pizzas sitting on top of a stove top oven (with a glass of wine)"
if it can generate an image algorithmically ... rather than just display an image from the library that meets those criteria ...

--
Hulk SMASH Celiac Disease
1. Re:But can it generate an image from words ... by cosm · 2014-11-20 12:29 · Score: 1
  
  Tell it two girls, one cup. Wonder if it would become sentient just to subsequently kill itself.
  
  --
  'We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.' RPF
2. Re:But can it generate an image from words ... by phantomfive · 2014-11-20 12:51 · Score: 1
  
  I would think it would be easier to generate the image, if you have a large enough library of objects.......
  
  --
  "First they came for the slanderers and i said nothing."
3. Re:But can it generate an image from words ... by Tablizer · 2014-11-20 13:27 · Score: 2
  
  Yes, it's called "Googling for images". (You didn't say "original".)
  
  --
  Table-ized A.I.
Anything to help us find cat videos on YouTube by TomR+teh+Pirate · 2014-11-20 11:31 · Score: 1

Actually, pass.
1. Re: Anything to help us find cat videos on YouTube by Anonymous Coward · 2014-11-20 12:05 · Score: 0
  
  This is about AVOIDING cat videos. Some fuckwit sends you a link. User agent checks it out. Pointer hovers over link, tooltip says: yet another fucking cat video. Then you get angry that it showed you the email at all, when it should have autoresponded, "LOL"
Siri, describe the world around you. by Anonymous Coward · 2014-11-20 11:50 · Score: 0

You are on a beach with a marvelous and intriguing view of the ocean. You see a hut to the north. You see a shell.
1. Re:Siri, describe the world around you. by CreatureComfort · 2014-11-20 12:51 · Score: 2
  
  Your torch goes out.
  
  You are eaten by a Grue.
  
  --
  "Unheard of means only it's undreamed of yet,
  Impossible means not yet done." ~~ Julia Ecklar
Maybe help the blind? by puzzled_decoy · 2014-11-20 11:59 · Score: 1

I wonder if you could make a Google Glass assistant for the blind using this technology? Like a little earbud that describes stuff in front of you, and distances, and whatnot.

"Describe my surroundings."
"There is a lamp post directly ten feet in front of you. A lovely pizza parlor is off to your right (four out of five stars). There is moderate foot traffic, seven people in the immediate vicinity. There is a man walking towards you smiling. It looks like your friend Greg. There is heavy traffic to your left."
1. Re: Maybe help the blind? by stigmato · 2014-11-20 12:17 · Score: 2
  
  You are standing in an open field west of a white house, with a boarded front door. There is a small mailbox here.
2. Re: Maybe help the blind? by Anonymous Coward · 2014-11-20 12:53 · Score: 0
  
  You are standing in an open field west of a white house, with a boarded front door. There is a small mailbox here.
  Zork FTW!
  http://en.wikipedia.org/wiki/Zork
New stupid mandatory XKCD by Anonymous Coward · 2014-11-20 12:19 · Score: 1

http://xkcd.com/1444/
1. Re:New stupid mandatory XKCD by ArsonSmith · 2014-11-20 17:53 · Score: 1
  
  Also:
  http://xkcd.com/1425/
  
  --
  Paying taxes to buy civilization is like paying a hooker to buy love.
"A picture may be worth a thousand words..." by Arkh89 · 2014-11-20 12:49 · Score: 1

Especially considering a 1-megapixel image in 8-bit grayscale. That's 1 MB worth of information. Assuming 8 letters per word on average (including the various punctuation characters) and 250 words per page in whatever 16-bit character encoding, the image weighs the same as a book of about 250 pages.
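
A quick check of that arithmetic, taking the stated assumptions at face value (1 MB meaning one million bytes, 8 characters per word, 250 words per page, 2 bytes per character):

```python
# Image: 1 megapixel at 8 bits (1 byte) per pixel.
image_bytes = 1_000_000 * 1

# Text: 8 characters per word, 250 words per page, 16-bit (2-byte) characters.
chars_per_word = 8
words_per_page = 250
bytes_per_char = 2
bytes_per_page = chars_per_word * words_per_page * bytes_per_char

# Number of pages of text that occupy the same number of bytes as the image.
pages = image_bytes / bytes_per_page
print(f"{bytes_per_page} bytes per page, {pages:.0f} pages")
```

Under these assumptions a page takes 4000 bytes, so the image is equivalent to roughly 250 pages. This compares raw storage only and ignores compression of either the image or the text.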
actually automatic picture caption generator by slew · 2014-11-20 12:59 · Score: 3, Interesting

Not as "advanced" in image recognition as advertised.
Basically they took the output of a common object classifier and, instead of just picking the most likely object (which is what a typical object classifier does), left it in a form where multiple objects are detected in various parts of the scene. Then they trained a neural network to create captions (by giving it training pictures with associated captions).
According to the paper, it sometimes apparently generates a reasonable description. Other times it reads in a picture of a street sign covered with stickers and emits a caption like "refrigerator filled with lots of food and drink".
Actually the most interesting thing about it is the LSTM-based Sentence Generator that is used to generate the caption from the objects. LSTMs are notoriously hard to train, and they apparently borrow some results from language translation techniques to attempt to form intelligible sentences.
This is all very googly-researchy in that they want to see what the limits of pure data-driven machine learning are (w/o human tuning). This is, however, not so much an advance in image recognition as it is an advance in generating the language for captions.
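
For readers unfamiliar with LSTMs, the sketch below shows one step of a standard LSTM cell in plain Python. It is only the textbook recurrence, not the model or the weights from Google's paper. The function names and the `params` layout are hypothetical, and in a caption generator the input `x` would be something like an image feature vector or the embedding of the previous word.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W and list vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h, c, params):
    """One LSTM step.

    x: input vector, h: previous hidden state, c: previous cell state.
    params maps each of "i", "f", "o", "g" to a (W, b) pair, where W has
    len(h) rows and len(x) + len(h) columns.
    """
    v = x + h  # list concatenation: the gates see both input and hidden state
    i = [sigmoid(z) for z in affine(*params["i"], v)]    # input gate
    f = [sigmoid(z) for z in affine(*params["f"], v)]    # forget gate
    o = [sigmoid(z) for z in affine(*params["o"], v)]    # output gate
    g = [math.tanh(z) for z in affine(*params["g"], v)]  # candidate values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new
```

A sentence generator would call a step like this once per word, feeding the hidden state into a layer that scores the next word. The cell state `c` is what lets information persist across many steps.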
1. Re:actually automatic picture caption generator by AmiMoJo · 2014-11-21 02:34 · Score: 1
  
  Other search and map engines must be worried by this kind of thing though. One of the reasons Google Maps is so good is that they do image recognition with Street View photos. Google's search engine is better than Bing's because it understands the web more like a human does.
  
  --
  const int one = 65536; (Silvermoon, Texture.cs)
  SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
The really cool thing about this by Iamthecheese · 2014-11-20 22:36 · Score: 1

Right now the technology is well behind human cognition. It may say, "two men playing tennis" where a human would notice a few things, do some research, and say "Roger Federer practicing with his trainer for the upcoming Davis Cup."

But the cool thing is that the machine will eventually reach, then surpass the human. The computer of tomorrow will say "Federer practicing for the Davis Cup, but his injury will prevent a win. He also needs to start using a nitrogen fertilizer on his lawn."

--
If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
Oblig. first picture description by Anonymous Coward · 2014-11-21 01:39 · Score: 0

The picture shows a naked man stretching his anus with both hands, to approximately the width of his fist. The inside of his rectum is also clearly visible. Below his gaping anus, his dangling penis and scrotum are visible, as well as a golden ring on the ring finger of his left hand.
you know, for the cats. on the internet. by Thud457 · 2014-11-21 03:26 · Score: 2

Next they need an AI that can describe the smells in an image.
Then you can finally tell if someone on the internet is a dog.

--
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
Words and Images by Anonymous Coward · 2014-11-21 03:41 · Score: 0

One word equals a thousand images, for readers each imagine their own version of the subject.