Google Announces Image Recognition Advance

Posted by timothy on Thursday November 20, 2014 @11:18AM from the what-does-a-grue-look-like? dept.

Rambo Tribble writes: Using machine learning techniques, Google claims to have produced software that can better produce natural-language descriptions of images. This has ramifications for uses such as better image search and for better describing the images for the blind. As the Google people put it, "A picture may be worth a thousand words, but sometimes it's the words that are the most useful ..."

Re: Maybe help the blind? by stigmato · 2014-11-20 12:17 · Score: 2

You are standing in an open field west of a white house, with a boarded front door. There is a small mailbox here.
Re:what would be useful by Morose · 2014-11-20 12:36 · Score: 4, Informative

Not only is there a program for this, but it's free. It's called VisiPics (http://www.visipics.info/index.php?title=Main_Page). I use it to organize my photo collection. Not only can it do scaling, but it can check other similarity factors as well. Not sure if it scales to millions, but I've used it with 20,000 images before.
Enjoy.
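
For anyone curious how a tool can spot a rescaled copy of the same picture, here is a minimal sketch of one common approach, an "average hash". This is an assumption for illustration only; it is not claimed to be what VisiPics does internally. The function names (`shrink`, `average_hash`, `hamming`) are hypothetical, and images are plain lists of grayscale rows so the sketch needs no image library.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale image (list of rows) to size x size by block averaging.

    Assumes the image is at least `size` pixels in each dimension.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Return a size*size-bit integer: one bit per cell, set if the cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes; small means 'probably the same picture'."""
    return bin(a ^ b).count("1")


# Toy data: a 32x32 horizontal gradient, a 16x16 downscaled copy of it,
# and an unrelated 32x32 vertical gradient.
big = [[x * 8 for x in range(32)] for _ in range(32)]
small = shrink(big, 16)
other = [[y * 8 for _ in range(32)] for y in range(32)]

print(hamming(average_hash(big), average_hash(small)))  # scaled copy
print(hamming(average_hash(big), average_hash(other)))  # different image
```

The scaled copy hashes to the same bits as the original, while the unrelated image differs in many bits. Comparing short hashes rather than full images is what makes it plausible to run this over tens of thousands of files.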
Re:Siri, describe the world around you. by CreatureComfort · 2014-11-20 12:51 · Score: 2

Your torch goes out.

You are eaten by a Grue.

--
"Unheard of means only it's undreamed of yet,
Impossible means not yet done." ~~ Julia Ecklar
actually automatic picture caption generator by slew · 2014-11-20 12:59 · Score: 3, Interesting

Not as "advanced" in image recognition as advertised.
Basically they took the output of a common object classifier and, instead of just picking the most likely object (which is what a typical object classifier does), left it in a form where multiple objects are detected in various parts of the scene. Then they train a neural network to create captions (by giving it training pictures with associated captions).
According to the paper, it sometimes generates a reasonable description. Other times it reads in a picture of a street sign covered with stickers and emits a caption like "refrigerator filled with lots of food and drink".
Actually the most interesting thing about it is the LSTM-based Sentence Generator that is used to generate the caption from the objects. LSTMs are notoriously hard to train, and they apparently borrow some results from language translation techniques to attempt to form intelligible sentences.
This is all very googly-researchy in that they want to see what the limits of pure data-driven machine learning are (w/o human tuning). This is not, however, as much of an advance in image recognition as it is an advance in generating the language for captions.
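
To make the "image features in, LSTM sentence out" idea concrete, here is a rough sketch of greedy caption decoding with a tiny LSTM cell. Everything in it is an assumption for illustration: the vocabulary, the sizes, the `ToyCaptioner` class, and the choice to feed the image vector once as the first input are hypothetical, and the weights are random rather than trained, so the words it emits are meaningless. It only shows the shape of the loop, not Google's actual model.

```python
import math
import random

VOCAB = ["<start>", "<end>", "a", "dog", "cat", "on", "grass", "sofa"]  # toy vocabulary (hypothetical)
HIDDEN = 6  # size of the LSTM state, the word embeddings, and the image feature vector


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyCaptioner:
    def __init__(self, seed=0):
        rng = random.Random(seed)  # random, UNTRAINED weights
        self.embed = rand_matrix(rng, len(VOCAB), HIDDEN)  # one embedding per word
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.gates = {g: rand_matrix(rng, HIDDEN, 2 * HIDDEN) for g in "ifog"}
        self.out = rand_matrix(rng, len(VOCAB), HIDDEN)  # hidden state -> word scores

    def step(self, x, h, c):
        """One LSTM step: input, forget, and output gates plus a candidate update."""
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
        f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
        o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
        g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
        c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c

    def caption(self, image_features, max_len=10):
        """Greedy decoding: always pick the highest-scoring next word."""
        h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
        h, c = self.step(image_features, h, c)  # image vector primes the state
        word = VOCAB.index("<start>")
        words = []
        for _ in range(max_len):
            h, c = self.step(self.embed[word], h, c)
            scores = matvec(self.out, h)
            word = max(range(len(VOCAB)), key=scores.__getitem__)
            if VOCAB[word] == "<end>":
                break
            words.append(VOCAB[word])
        return words


# Stand-in for the image classifier's output (made-up numbers).
features = [0.9, 0.1, 0.0, 0.4, 0.0, 0.7]
print(ToyCaptioner().caption(features))
```

Training would adjust all of those matrices so that the word scores match the captions in the training set, which is the hard part the comment above alludes to.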
Re:But can it generate an image from words ... by Tablizer · 2014-11-20 13:27 · Score: 2

Yes, it's called "Googling for images". (You didn't say "original".)

--
Table-ized A.I.
Re:what would be useful by ArcadeMan · 2014-11-20 13:37 · Score: 2

However, because of lossy compression, you might want to keep an image that is slightly lower resolution but still has a better overall image quality.

--
Get free satoshi (Bitcoin) and Dogecoins
Re:what would be useful by mister_playboy · 2014-11-20 13:50 · Score: 2

And runs the comparison over several thousand files (or even hundreds of thousands, or millions)
Ah yes... the joys of Internet Art collecting!

--
Do what thou wilt shall be the whole of the Law ::: Love is the law, love under will
you know, for the cats. on the internet. by Thud457 · 2014-11-21 03:26 · Score: 2

Next they need an AI that can describe the smells in an image.
Then you can finally tell if someone on the internet is a dog.

--
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff