Automatic Image Tagging

Posted by ryuzaki0 on Thursday November 2, 2006 @01:07PM from the on-the-horizon dept.

bignickel writes "Researchers at Penn State have applied for a patent on software that automatically recognizes objects in photos and tags them accordingly. The 'Automatic Linguistic Indexing of Pictures Real-Time' software (catchy name) trained a database using tens of thousands of images, and new images have 15 tags suggested based on comparisons with objects or concepts in the database. Not sure how you identify a 'concept,' and they're only talking about having one correct tag in the top 15, but still cool."

Not shockingly... by goldmeer · 2006-11-02 13:11 · Score: 4, Funny

The vast majority of the images on the internets, including The Google, have "Pornography" among their top 15 suggested tags. The accuracy rate is surprisingly high.
That Sucks by JerkyBoy · 2006-11-02 13:15 · Score: 2, Funny

Researchers at a publicly funded institution are using their research results for personal (financial) gain. Pennsylvania's tax dollars at work? How is this legal?

--

Always do right. This will gratify some people and astonish the rest. -- Mark Twain
1. Re:That Sucks by Nybarius · 2006-11-02 13:24 · Score: 2, Informative
  
  Contrary to what you might believe, there is nothing unethical about making money. The government even gives out grants to entrepreneurs and lets them keep all the profits; it's good for the economy overall. The profit motive is a much more powerful incentive to positive social change than the goodness that lies in the hearts of men.
  
  -Nyb
The other 50% is the problem by Heir+Of+The+Mess · 2006-11-02 13:21 · Score: 2, Informative

I've seen lots of systems like this. The problem is in the 50% of the images that don't work, so basically you have to manually tag 50% of your images.

I saw an interesting one about 10 years ago. It took an X-ray image, did an edge detection, converted all the edges to a slope vs. distance 2D plot, converted edge curves to a radius and distance plot, then used a kind of statistical correlation algorithm to pick which part of the body the image was from. I could imagine that you could apply something similar to the luminance of an image to pick out objects, and then maybe do some color transforms and stuff to improve results. The article says they do it in 1.4 seconds per image though, which is impressive.
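A rough sketch of that kind of pipeline, for anyone who wants to play with it. This is not the X-ray system described above: it swaps in a much simpler signature (a histogram of edge orientations computed from luminance gradients) and picks the reference with the highest Pearson correlation. The function names, the 8-bin histogram, the threshold, and the striped test images are all assumptions made up for illustration.

```python
import math

def edge_signature(img, bins=8, threshold=1.0):
    """Histogram of edge orientations, weighted by edge strength.

    img is a list of rows of grayscale values. This stands in for the
    slope/radius plots described above; it is not that system's method.
    """
    hist = [0.0] * bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            # Central differences approximate the luminance gradient.
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag < threshold:
                continue  # flat region, no edge here
            # Fold the angle into [0, pi): opposite gradients share a bin.
            angle = math.atan2(gy, gx) % math.pi
            hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0

def classify(img, references):
    """Return the label whose reference signature correlates best."""
    sig = edge_signature(img)
    return max(references, key=lambda label: correlation(sig, references[label]))

def stripes(size, vertical):
    """Toy test image: alternating 2-pixel dark and bright bands."""
    return [[255.0 if ((x if vertical else y) // 2) % 2 else 0.0
             for x in range(size)] for y in range(size)]

# Hypothetical reference library with one signature per label.
references = {
    "vertical bars": edge_signature(stripes(12, vertical=True)),
    "horizontal bars": edge_signature(stripes(12, vertical=False)),
}
```

Calling `classify(stripes(16, vertical=True), references)` matches a new image against the library. Real photos would need far richer signatures than this.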

--
Australian running a company that does C# / C++ / Java / SQL / Python / Mathematica
1. Re:The other 50% is the problem by cloudmaster · 2006-11-02 16:02 · Score: 2, Insightful
  
  Since you don't know *which* 50% it'll get right, though, you end up having to look at 100% to determine if the system got it right or not. At that point, it's only saving you a few seconds of typing / picking from a drop-down list. :)
A video on the subject by damgx · 2006-11-02 13:29 · Score: 2, Informative

Luis von Ahn did something almost the same, though his idea is to use humans as well.

View the video on Human Computation

--
I only read slash. for the articles...
w00t!!! by rts008 · 2006-11-02 13:38 · Score: 2, Funny

Now almost 7% of my pr0n will get tagged correctly!
That's cool, the rest of it will be like opening xmas presents!

*file: 123456.jpeg>open>Aghh! Goatse!*

Hmmm... This may be neat when it gets a LITTLE more accurate, but a cool start nonetheless.
Kudos to the gang for getting a grip on a hard problem...erm..never mind.

--
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
1. Re:w00t!!! by CCFreak2K · 2006-11-02 13:56 · Score: 2, Funny
  
  You named your penis "problem?"
  
  --
  "Beware of he who would deny you access to information, for in his heart he dreams himself your master."
Reportedly by stunt_penguin · 2006-11-02 13:59 · Score: 2, Funny

Reportedly the researchers showed the system a picture of a Death Star, and it correctly tagged the image with 'thatsnomoon'.

The system has clearly been let crawl the web for far too long.

--
When the posters fear their moderators, there is tyranny; when the moderators fear the posters, there is liberty.
Re:LIPS by Original+Replica · 2006-11-02 14:31 · Score: 2, Funny

"cunning linguists
That's disgusting."

Actually, if done well, it's quite pleasant for all involved.

--
We are all just people.
Re:1 out of 15 ? impressive by wayward_bruce · 2006-11-02 14:54 · Score: 2, Insightful

How do they get less than a 50% average that you'd get by just guessing?
How do you get that 50% is average on guessing? Their tag pool contains 332 "concepts", which means that randomly picking 15 would give you about a 1/22 chance of getting a correct tag for a picture that is tagged with one word. For a two-tag image, you get about 1/11. To get up to 50% you'd have to work with images tagged with about fifteen words. Did I miss something here?

Besides, the claim is that it "in 98 per cent of tests suggests at least one correct tag in the top 15", the keywords here being "98%" and "at least". We don't know how the number of correctly identified tags is distributed, so we can't say much about that anyway.

This reminds me of Pres Eckert and John Mauchly inviting a group of female "computers" to watch their first two blocks of tubes perform a computation of 5*1000. One of these ladies later commented that they had a whole lot of equipment for such a simple computation.
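The guessing odds above can be checked with a short calculation. This is a sketch that assumes the 15 suggestions are distinct tags drawn uniformly at random from the 332 concepts, and that an image has a known number of correct tags. The function names are made up for illustration.

```python
from math import comb

CONCEPTS = 332   # size of the tag pool quoted above
SUGGESTED = 15   # tags suggested per image

def p_random_hit(correct_tags, pool=CONCEPTS, picks=SUGGESTED):
    """Chance that `picks` distinct random tags include at least one correct tag."""
    # Count the ways to pick only from the incorrect tags, then take the complement.
    misses = comb(pool - correct_tags, picks)
    return 1 - misses / comb(pool, picks)

def min_tags_for(target, pool=CONCEPTS, picks=SUGGESTED):
    """Smallest number of correct tags at which random guessing reaches `target`."""
    k = 1
    while p_random_hit(k, pool, picks) < target:
        k += 1
    return k

for k in (1, 2):
    print(f"{k} correct tag(s): about 1 in {1 / p_random_hit(k):.1f}")
```

`min_tags_for(0.5)` reports how many correct tags an image would need before pure guessing reaches even odds under the same assumptions.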
XRay much easier though by SuperKendall · 2006-11-02 17:07 · Score: 2, Informative

The human body is pretty much the same between people, and XRays are generally shot from similar directions person to person - so the kind of check you are describing seems like it would yield high matches for pretty much any part of the body.

In the real world we have an object you might take a picture of from any angle, using a myriad of focal lengths, with variable levels of distortion depending on the lens and camera used. Really nasty for generic object recognition. I think the best we can hope for in terms of accuracy is perhaps some kind of facial recognition automatically recognizing and tagging people in images.

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Re:Bullshit Patents by OzPhIsH · 2006-11-02 17:58 · Score: 2, Insightful

The application is obvious, although, I'll admit, their EXACT method isn't. But at its core, it is basic supervised learning. Feed your classifier a training set of images that are already tagged. Extract the features of each image and use those features to predict the tags. When the predicted classifications don't match the actual tags, adjust the model, rinse and repeat. Just pick up a data mining book. Like I said, lots of people are working on image classification, and this is an obvious application, at least to those in data mining/machine learning related fields. That doesn't make it an EASY thing to do successfully. If it were easy, there wouldn't be so much research going on. In that sense this group gets my respect for doing a pretty successful job.

My concern is the patent. People already look at images and classify them based on content. That's what tagging IS. When computer software is written to automatically do something that every normal person does anyway, should that be patentable? How is this different from people giving Amazon tons of shit for their patents on their product recommender system?
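For anyone who hasn't picked up that data mining book, the predict / compare / adjust loop looks roughly like this. It is a minimal sketch using a perceptron-style linear model with one weight vector per tag, which is an assumption chosen for brevity and certainly not the Penn State method. The feature values (blueness, greenness, edge density, plus a constant bias term) are invented toy data.

```python
def predict(weights, features):
    """Return every tag whose linear score for these features is positive."""
    return {tag for tag, w in weights.items()
            if sum(wi * xi for wi, xi in zip(w, features)) > 0}

def train(examples, tags, epochs=1000):
    """Perceptron-style training: one weight vector per tag."""
    n = len(examples[0][0])
    weights = {tag: [0.0] * n for tag in tags}
    for _ in range(epochs):
        mistakes = 0
        for features, actual in examples:
            predicted = predict(weights, features)
            for tag in tags:
                if (tag in predicted) == (tag in actual):
                    continue  # prediction matches the human tag
                # Mismatch: nudge the weights toward the correct answer.
                sign = 1.0 if tag in actual else -1.0
                weights[tag] = [w + sign * x
                                for w, x in zip(weights[tag], features)]
                mistakes += 1
        if mistakes == 0:
            break  # every training image is now tagged correctly
    return weights

# Hypothetical hand-made features: (blueness, greenness, edge density, bias).
examples = [
    ((0.9, 0.1, 0.1, 1.0), {"sky"}),
    ((0.8, 0.2, 0.2, 1.0), {"sky"}),
    ((0.1, 0.9, 0.3, 1.0), {"grass"}),
    ((0.5, 0.5, 0.2, 1.0), {"sky", "grass"}),
    ((0.2, 0.2, 0.9, 1.0), {"building"}),
    ((0.6, 0.1, 0.8, 1.0), {"sky", "building"}),
]
weights = train(examples, tags=["sky", "grass", "building"])
```

After training, `predict(weights, features)` returns the suggested tag set for a new feature vector. The hard part in practice is the feature extraction, which this toy skips entirely.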

--
"To lead the people, you must walk behind them"
Re:LIPS by shashark · 2006-11-02 19:08 · Score: 2, Funny

They might be cunning linguists - but you sure are a master debator.
Neural Nets by gekoscan · 2006-11-02 19:25 · Score: 2, Insightful

How can you take a neural network and train it, then patent that?
That's like patenting training a dog to fetch a stick. It's completely ridiculous.

You take software implementing a neural network and train it by feeding it pictures, each associated with certain tags. It then builds a generalized model from what you fed it, so that when you give it new input it can output the tags most similar to those it was trained on.

So yes, this software can recognize boxes, shapes, other objects, maybe scenes etc. and associate them with tags... but ask them how the algorithm works under the hood =) They have no idea... a neural network is like a black box after it has been trained. You feed it input and it gives you output based on its initial training. The inner workings are a chaotic spaghetti of weight values on each neuron and can't be deciphered.

How can you patent software that is a black box inside?

"Yes hello, patent office? I have this box that manufactures microprocessors. I feed it all the materials and it outputs a shiny new processor. I am not sure of the manufacturing process internally but the output works great. I would like to patent this manufacturing process."

"Okay, your patent number is 247286-"BLACK BOX"-9."

The whole point of a neural network is that it generalizes from what you train it on and can then predict an output for any future input.

It's like having invented the first mirror and, every time someone put something different in front of it, calling up the art gallery because you had a new painting you wanted in your name (because depending on what is in front of it you get a different reflection).