Slashdot Mirror


Teaching Computers to See with Games

An anonymous reader writes "The Pittsburgh Post-Gazette has a story on Peekaboom, a two-player on-line game in which one player tries to get the other player to guess a word associated with an image, by revealing parts of the image one click at a time. From the article, "The process of revealing objects, or highlighting images within the larger context of the photo, is the sort of thing that researchers in computer vision must do to teach computers to see.""

57 comments

  1. Frist? by MaestroSartori · · Score: 1

    How apt, first five or six attempts to view this got:

    Nothing for you to see here. Please move along. :)

    1. Re:Frist? by smchris · · Score: 1


      Well, I got registered, but something on some sites crashes my Mozilla, and this site has it.

      It seems like worthwhile research. We might learn to recognize things by a process as mundane as actually seeing them from different angles but I suspect there are processes of object abstraction at work as well.

    2. Re:Frist? by Ersatz+Chickenweed · · Score: 2, Informative

      The site works like a champ on my Mozilla (bangbang023). The game is a Java app and requires Sun's JRE 1.4 or later, so perhaps that's what's causing the problem for you.

  2. Aka? by redmo · · Score: 1

    Also known as Pictionary..

    --
    If you're tired, sleep! Wenn Sie muede sind, schlafen!
  3. not quite pictionary... by Anonymous Coward · · Score: 0

    unless you get to reveal the picture using crayola crayons...

    potato and paint is my favorite method

  4. /.'ed by spot35 · · Score: 0

    Could they use /. to teach a webserver how to stay up...?

  5. Oh please don't... by Phidoux · · Score: 1

    ... tell MS about this! Can you imagine?

  6. Ummmm by fr0z3nph03n1x · · Score: 0

    So what?

  7. Interesting by frelis · · Score: 0

    It will be possible to use that in biometrics

    Luis Freire
    http://webdicas.com/
    http://www.numberbit.com/

  8. Think of the possibilities! by fmwap · · Score: 3, Funny

    I always wanted a computer that could identify my predetermined pr0n fetishes and automatically download accordingly...then I could cut the browsing time in half and get right down to business.

    1. Re:Think of the possibilities! by master_p · · Score: 0, Offtopic

      mod the parent up! for Christ's sake!

    2. Re:Think of the possibilities! by Anonymous Coward · · Score: 0

      Just don't reveal that pr0n one part at a time...

    3. Re:Think of the possibilities! by hal2814 · · Score: 2, Funny

      Awesome. That would drastically reduce the likelihood of moms everywhere going down to the basement at inopportune times.

    4. Re:Think of the possibilities! by danila · · Score: 1

      Get Nici. This is an automatic porn downloader that (in the automatic mode) takes into account the ratings you gave to earlier porn.

      --
      Future Wiki -- If you don't think about the future, you cannot have one.
  9. Great name. by CosmeticLobotamy · · Score: 0

    Keep going with the science, but I'm revoking your right to name anything. "Peekaboom." Good God.

    1. Re:Great name. by Anonymous Coward · · Score: 0

      what would you call it then???

      Peekaboom's a good name.

      Fuck i hate people like you. I hope you're unhappy your whole miserable life.

    2. Re:Great name. by CosmeticLobotamy · · Score: 1

      I'm actually doing okay, you shit. Maybe if you relaxed for five seconds, you could get in on some of that not-unhappiness yourself.

      And to answer your question, I would have called it "Game." I'm bad at naming things. But if I was forced to name it, even just Peekaboo is better than Peekaboom.

  10. What about Bayesian analysis? by iamatlas · · Score: 1

    A few weeks ago there was an article about teaching a computer to play chess using a Bayesian spam filter. While it was kludge-y, it was a pretty good idea, and had some interesting results.

    Why not try that with vision? Ditch the spam filter and use high-end Bayesian analysis: feed the Bayesian learning program all of the data about different objects in a video game whose edges are already defined, usually by colors and texture borders, and see what you get.

    I'm no expert on this-- can anyone offer ways it could or couldn't work?

    1. Re:What about Bayesian analysis? by Itchy+Rich · · Score: 3, Informative

      I'm no expert on this-- can anyone offer ways it could or couldn't work?

      The human eye works in a similar way. The first layer of the optic nerve after the retina recognises dots. The next layers recognise contrast and patterns in the previous layers, i.e. lines, edges, etc. By the time the signal gets to the brain it's already broken down into basic shapes, at which point nerves that have been taught to look for certain combinations of shape and colour are triggered, causing the sensation of recognition.

      I assumed some pattern recognition would already work like this. Could be wrong though.

    2. Re:What about Bayesian analysis? by Tom7 · · Score: 1

      Bayesian learning is precisely the kind of thing that AI/ML people do in order to "learn." (There are many others, too.) It isn't even new, but has gotten some Slashdot popularity because of the new spam filters.

    3. Re:What about Bayesian analysis? by Anonymous Coward · · Score: 1, Interesting

      You seem to be reasoning logically from a set of news stories and extrapolating ideas. Unfortunately, the conclusions you are implying are, in actuality, a little backwards. Let me explain. Bayesian filtering wasn't developed to fight spam. It was around, in theory, long before spam filtering, and spam filtering is just one application where it has reached prominence (especially in the Slashdot community). Saying that "a spam filter was used to learn to play chess" is really a misnomer. Bayesian filtering for chess has been tried hundreds of times, long before spam was ever even an application of the filtering. It has been found to be insufficient, alone, in dealing with many classification and prediction problems, including chess and general vision algorithms.

      There are fundamental flaws in the assumptions made in Bayesian filtering that are a bit technical, but I'll try to distill them here for you. First, Bayes' rule is essentially a way of turning the expression (A given B) into (B given A). The idea is that if I give the filter a huge sample of preclassified data, it can build up probabilities of each event. In English, you can take 10,000 emails and find the quantity "the probability of this word occurring, given the message is spam". Bayes' rule gives you a mathematical way of turning that sentence around, and finding "the probability this message is spam, given this word has occurred."
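
      A minimal sketch of that turn-around in Python, using made-up counts (the numbers and the function name `p_spam_given_word` are assumptions for illustration, not data from any real filter):

```python
def p_spam_given_word(spam_with_word, spam_total, ham_with_word, ham_total):
    """Bayes' rule: P(spam | word) from P(word | spam) and P(word | ham)."""
    p_spam = spam_total / (spam_total + ham_total)      # prior P(spam)
    p_ham = 1.0 - p_spam                                # prior P(ham)
    p_word_given_spam = spam_with_word / spam_total     # likelihood in spam
    p_word_given_ham = ham_with_word / ham_total        # likelihood in ham
    # Total probability of seeing the word in any message.
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / p_word


# Hypothetical corpus: 10,000 emails, 4,000 spam and 6,000 legitimate.
# The word appears in 1,200 spam messages and in 60 legitimate ones.
posterior = p_spam_given_word(1200, 4000, 60, 6000)
print(f"P(spam | word) = {posterior:.3f}")
```

      With these invented counts the word is far more common in spam than in legitimate mail, so seeing it pushes the estimate well above the 40% prior.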

      In a single case, this is highly accurate. The problem comes when you begin to generalize this to an entire email. Bayesian filters (in their simplest form) assume that each word is statistically independent of every other word. This is utterly absurd, of course, but in filtering email, it appears that this assumption doesn't cause too much damage.
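
      Under that independence assumption, the per-word likelihoods simply multiply. A rough sketch (the word probabilities below are invented, and `naive_bayes_spam_probability` is a hypothetical helper, not any particular filter's code):

```python
import math


def naive_bayes_spam_probability(words, p_word_spam, p_word_ham, p_spam=0.5):
    """Combine per-word likelihoods as if words were independent."""
    # Work in log space so many small probabilities do not underflow.
    log_spam = math.log(p_spam)
    log_ham = math.log(1.0 - p_spam)
    for word in words:
        # Words the tables have never seen are simply skipped.
        if word in p_word_spam and word in p_word_ham:
            log_spam += math.log(p_word_spam[word])
            log_ham += math.log(p_word_ham[word])
    # Normalise: P(spam | words) = e^ls / (e^ls + e^lh).
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


# Invented per-word likelihoods P(word | spam) and P(word | ham).
p_word_spam = {"free": 0.30, "viagra": 0.20, "meeting": 0.01}
p_word_ham = {"free": 0.05, "viagra": 0.001, "meeting": 0.10}

spammy = naive_bayes_spam_probability(["free", "viagra"], p_word_spam, p_word_ham)
hammy = naive_bayes_spam_probability(["meeting"], p_word_spam, p_word_ham)
print(f"{spammy:.4f} {hammy:.4f}")
```

      Note that "free" and "viagra" appearing together are counted as two separate pieces of evidence, even though in real mail they are anything but independent.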

      However, when you try to translate Bayesian filtering to computer vision, gigantic problems arise. First of all, what is your actual data? Color? Texture (Fourier or wavelet analysis)? Edge positions? Corner positions? How do you go about finding the more complicated ones? How do you deal with scale and rotation? None of these problems exist in email. The input is very simple, and already extremely easy for a computer to understand. An equivalent problem would be to classify a handwritten message as spam... now you have a whole new set of problems.

      And, to further exacerbate the situation, if we take color as an example... the interdependencies between pixels are MUCH larger in an image than the interdependencies of words in a message. Images are very coherent and predictable, i.e., if I see a left eye, I can say with pretty large certainty there will be a right eye... or I'll know it's a profile, etc. There are huge and very macroscopic (difficult to quantify) dependencies that must be dealt with to form an effective classifier.

      So, to answer your questions, Bayesian filtering is based upon a model of the data that doesn't adequately capture what is going on. Far too much information is neglected in a basic naive Bayes filter. Hope that helps.

  11. Pictionary: Boring game!! by jurt1235 · · Score: 1

    Yep, most boring game in the world.

    --

    My wife's sketchblog Blob[p]: Gastrono-me
  12. Tips by M3wThr33 · · Score: 1, Interesting

    Always pick a hint.
    That adds 25 points to your score.

    During the bonus round you get points for clicking the same spot as your teammate. Once numbers start appearing, keep clicking right there for maximum points.

    Pass if the word looks difficult. Don't hesitate.

    Pass if your partner passes, too. He probably has a good reason to.

  13. hmmmm by mrselfdestrukt · · Score: 0, Troll

    Nowwww. Let's see. Boob!!!

    --
    "I used to have that really cool,funny sig ,but it got stolen."
  14. The real story today... by sootman · · Score: 2, Interesting

    Teaching, computers, games, yeah, fascinating... so, what's the deal with the moderation here? Why are there so few comments with scores over +3? My default is +5 and the whole front page right now shows *zero* comments at that level. Did they get real stingy with the mod points all of a sudden?

    --
    Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
    1. Re:The real story today... by nighthawk127127 · · Score: 0

      maybe it's because it's early on a wednesday morning and the /. horde hasn't started reading, posting, and modding comments yet.

      --
      10100111001
    2. Re:The real story today... by Anonymous Coward · · Score: 0

      I've noticed that too. Coincidentally, the mod system started to malfunction after this post appeared. My guess is that everybody is spending their points in there...

  15. Can we program some sense into the robot? by Anonymous Coward · · Score: 0

    Let's say that we can get the robot to recognize something, perhaps an egg. How can we get the robot to understand that an egg can also be cracked and that the insides are edible? At some point the robot may drop the egg on the floor and it will break. How can the robot know that eggs are not meant to be broken on the floor? Will it understand that the floor is not necessary when breaking an egg, neither is extra claw/hand pressure. We need to be able to trust that the robot can learn how to crack an egg so that it doesn't make a mess when cooking an omelette.

    The whole problem is much bigger than simply recognizing one thing or another. Any object, whether a face, an egg, or anything, has a set of things that can be done to it. Just recognizing an object for what it is does not lead to any great understanding of what that object is and what can be done with it.

    If we were to have to provide a database of actions for each and every object that a robot could ever encounter, I would hardly call that "teaching". In humans, we teach children certain heuristics so that they have a general understanding of the world around them. But hard-wiring in our brains also allows us to think outside of those heuristics and in some cases to think past the boundaries of current thought (geniuses and savants).

    How can we get a robot to 1) understand simple heuristics, and 2) to think beyond its initial programming? Can we provide a robot with such programming that it can grow in "intellect" on its own?

    1. Re:Can we program some sense into the robot? by Patrik_AKA_RedX · · Score: 1

      Sure, but what if said robot attempts to apply its newly gathered knowledge of egg cracking to someone's head? Heads can be cracked and the insides can be edible. Robots don't have taste, so it can assume that since the content of an egg is biological and the content of someone's head is too, it is edible.
      Things aren't that simple, of course. Most likely we'll have to train robots to handle the world in the same way we educate children. But once we have a working brain, we can copy it. But that'll need some sort of artificial brain, orders of magnitude greater than what we've built so far. And such a brain is still a long way off, as there is still so much we don't know about what thinking and learning really are.

  16. end result... by rd4tech · · Score: 1, Interesting

    Bypassing captchas?

    1. Re:end result... by chialea · · Score: 1

      Actually, if you look, it's the same researcher who headed up both pieces of work. The whole idea behind CAPTCHA is that if someone breaks it, we get some cool piece of technology out of the deal. This certainly counts as cool, but it's explicit labeling -- you still have to run it by actual people to get the labels.

    2. Re:end result... by Anonymous Coward · · Score: 2, Interesting

      Actually, this work is more related to his prior work on the ESP Game, which collected labels for images. The problem after that is that you know an image contains a boy and a dog, but you don't know what is the boy or what is the dog.

      With Peekaboom, you give them the job of guessing "dog", and the parts of the image that are revealed are likely the parts of the image that contain the dog.
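
      Roughly, you could pool the reveal clicks from many games into a heat map and take the hot region as the object's location. A toy sketch (the click format, the reveal radius, and the vote threshold are my assumptions, not how Peekaboom actually stores or processes its data):

```python
def reveal_heatmap(width, height, clicks, radius):
    """Count, per pixel, how many reveal clicks uncovered it."""
    heat = [[0] * width for _ in range(height)]
    for cx, cy in clicks:
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                # Treat each reveal as a filled circle around the click.
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    heat[y][x] += 1
    return heat


def bounding_box(heat, min_votes):
    """Smallest box around pixels revealed at least min_votes times."""
    hot = [(x, y) for y, row in enumerate(heat)
           for x, votes in enumerate(row) if votes >= min_votes]
    if not hot:
        return None
    xs, ys = zip(*hot)
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical clicks from several games for the word "dog" on a 20x20 image:
# most players click near (12, 8), one stray click lands in a corner.
clicks = [(12, 8), (13, 8), (12, 9), (11, 8), (2, 17)]
heat = reveal_heatmap(20, 20, clicks, radius=2)
box = bounding_box(heat, min_votes=3)
print(box)
```

      Requiring several votes per pixel means a single stray or malicious click does not stretch the box.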

      That said, the relationship to CAPTCHAs is still there. Simple image distortion CAPTCHAs don't really hold up, and the more difficult ones are based in the semantic understanding of an image.

  17. artificial intelligence by chrisranjana.com · · Score: 0

    Each day research takes us one step closer to A.I.

    --
    Chris ,
    Php Programmers.
  18. Isn't actually being used by ArbiterOne · · Score: 2, Insightful

    The article only says that this technology has the *potential* to help computers to see objects, not that it *is*.
    Quothal:
    The process of revealing objects, or highlighting images within the larger context of the photo, is the sort of thing that researchers in computer vision must do to teach computers to see.

    While the ESP Game was designed to generate descriptive labels for photographs and other images, Peekaboom is intended to help teach computers to see.

    1. Re:Isn't actually being used by Tom7 · · Score: 1

      The summary of the article isn't really very accurate. It's better to see this as a way to get a huge amount of labeled image data quickly--the players label it as a by-product of playing the game. The idea is then that the data set could be used as input to a learning algorithm.

    2. Re:Isn't actually being used by Anonymous Coward · · Score: 0

      It seems to me that training a computer to identify elements out of a static photo would be HARDER than teaching a child to identify things, because people don't see in 2D, most people see in 3D.

      Wouldn't it be easier to train the computer to take two photos a certain distance apart, and then use parallax with pattern mapping to help isolate objects? Isn't that how people see?
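
      For what it's worth, the parallax idea is standard stereo vision: for two identical, side-by-side (rectified) cameras, depth is focal length times baseline divided by disparity. A small sketch, with made-up camera numbers and a hypothetical function name:

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified pinhole stereo pair."""
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Assumed rig: 700-pixel focal length, cameras 6.5 cm apart (roughly eye spacing).
FOCAL_PX = 700.0
BASELINE_M = 0.065

# A nearby object shifts a lot between the two views, a distant one barely moves.
near = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity_px=91.0)
far = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity_px=4.55)
print(f"near: {near:.2f} m, far: {far:.2f} m")
```

      The hard part in practice is the "pattern mapping" step, i.e. finding which pixel in the left image matches which pixel in the right one; the formula only applies once that match is known.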

      After all, if we're training computers to see, we don't want them to be confused by the old "hang a picture in front of the camera" trick!

  19. But do they learn what we think they learn? by G4from128k · · Score: 2, Interesting

    There's an old story from the early neural net image recognition days that seems germane to this. A group of researchers were trying to train an artificial neural net to recognize military tanks that were partially hidden in forested scenes (these were the bad old Cold War days, and spotting Soviet tanks in West German forests was the problem du jour). Pictures of natural forested scenes with and without tanks were used to train and test the system. It seemed to work very well on all the training and test data.

    But when they tried the system on more images, it failed miserably. Further investigation revealed that, by accident, all of the "tank" pictures had been taken on cloudy days and all of the non-tank pictures had been taken on sunny days. The system had learned, and learned beautifully, how to recognize cloudy vs. sunny days.
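
    Here's a toy reconstruction of that failure mode. All numbers are invented, and a one-feature threshold "stump" stands in for the neural net, which is a deliberate simplification and not what the researchers in the story used:

```python
def train_stump(samples):
    """Pick the single feature/threshold with the best training accuracy."""
    best = None
    for feature in range(len(samples[0][0])):
        for threshold in sorted({x[feature] for x, _ in samples}):
            for sign in (1, -1):
                correct = sum(
                    ((sign * x[feature] >= sign * threshold) == label)
                    for x, label in samples
                )
                if best is None or correct > best[0]:
                    best = (correct, feature, threshold, sign)
    return best[1:]


def predict(stump, x):
    """Classify one sample with the learned (feature, threshold, sign)."""
    feature, threshold, sign = stump
    return sign * x[feature] >= sign * threshold


def accuracy(stump, samples):
    return sum(predict(stump, x) == label for x, label in samples) / len(samples)


# Each sample: ((brightness, tank_shape_score), has_tank).
# Training set: every tank photo is cloudy (dark), every empty photo is sunny.
train = [
    ((0.20, 0.6), True), ((0.25, 0.7), True),
    ((0.30, 0.4), True), ((0.35, 0.8), True),
    ((0.80, 0.3), False), ((0.85, 0.5), False),
    ((0.90, 0.2), False), ((0.95, 0.45), False),
]
# New photos break the accident: sunny tanks, cloudy empty forests.
test = [
    ((0.85, 0.7), True), ((0.90, 0.65), True),
    ((0.25, 0.3), False), ((0.30, 0.2), False),
]

stump = train_stump(train)
print("feature chosen:", ["brightness", "tank_shape_score"][stump[0]])
print("train accuracy:", accuracy(stump, train))
print("test accuracy:", accuracy(stump, test))
```

    Brightness separates the training photos perfectly while the noisier shape score does not, so the learner latches onto the weather and falls apart on the new photos.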

    The point is that the software was good enough to learn to recognize the difference between the two populations of images, but that difference wasn't the one intended by the people working on the system. In the same vein, I'm sure that Peekaboom will learn to distinguish between objects in images, but whether it learns the actual object or just some incidental characteristic of that picture of the object will depend on a very, very good diversity of training pictures, to avoid accidental, non-meaningful patterns in the image data.

    I do wish them luck. Perhaps Peekaboom could create a distributed version of the training process in which others can both submit and help train on new objects/images. Letting others submit images and train the system would help diversify the training & testing data sets. Because some people will, no doubt, submit porn, I'm sure the system might become quite adept at recognizing the nether regions of the human body.

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:But do they learn what we think they learn? by xygorn · · Score: 1

      That is exactly the problem that they are solving. By using a game to collect a data sample instead of a couple of grad students, they can get millions of images labeled instead of hundreds. The more images, the less likely you are to have correlated image features (tank presence and cloudiness were heavily correlated in the neural network system you are recalling). Of course they also run the risk of very poor data, such as someone spelling out the word using cleared space as ink.

      --
      I am a sig. I wish I were a more creative sig, but I am not. I guess everyone has something to strive for.
  20. [OT] I say it without fearing to be modded down. by ceeam · · Score: 0, Offtopic

    Fix the damn moderation system!
    No posts at 4+ for a long long time.
    I am not gonna read /. without mod points at low thresholds.
    If you want to vote for it - reply below.

  21. No moderation leads to lack of comments by Anonymous Coward · · Score: 1, Interesting

    In some sense, Slashdot is the ultimate MMORPG. The comments system provides almost immediate feedback in the form of replies and moderation. Most posters can be categorized into some sort of stereotype.

    Some posters like giving lots of information and opinion and getting lots of replies in return. They typically have a little background in what they are discussing, or have a very strong opinion on the subject. When they post and get modded up and have lots of replies, they have achieved a personal victory.

    Other posters enjoy causing mayhem. They will typically post a comment taking a very odd stance towards a topic that many people feel strongly about, or they may post blatantly incorrect information on a topic that everyone is well-versed in. Their goal is not direct replies specifically, but rather that a heated debate follows from that first troll post. The best troll posters are those who can get both a slew of replies and start a flamewar. Moderation is a peripheral concern to these players, but they obviously prefer to be modded upwards rather than downwards.

    Another player is the newb. This player simply doesn't get that the forum is a game populated by players much better than he. He posts replies in earnest to troll posts and karma whore posts, and may try to make on-topic jokes. This type of user is frequently seen making Star Wars references, posts about "42", and other stupid things that garner him neither karma nor respect from his peers. He is also frequently seen repeating Benjamin Franklin's worn out "those who would blah blah blah" catchphrase.

    Finally there are the vermin of the forum. These will typically post off-topic comments about all sorts of strange fetish behavior. Whether it be the innocuous first posters or the ASCII art purveyors, these posters are not welcomed by most of the community. That they are able to stick around despite constant down-modding is a testament to their cockroach-like existence.

    However without moderation, no one is interested in posting. The last few stories have only a handful of comments and most of them are posted by the vermin. The karma whores don't stick around because there is no payoff, the newbs are all gone because they follow the whores like flies on dog crap, and the trolls have no one to troll with the newbs and whores gone.

    The healthy Slashdot ecosystem is significantly disturbed by this sudden lack of moderation.

  22. Other people... by darken9999 · · Score: 0, Offtopic

    This would be great if people on the internet weren't retarded. I bet when I click over to Fark, this is on there. Maybe Bill Gates bringing computers to the masses wasn't such a bright idea.

  23. link to the powerpoint presentation by Anonymous Coward · · Score: 1, Informative

    http://www.aladdin.cs.cmu.edu/workshops/lamps05/Slides/Peekaboom.ppt

    the one in google's index now seems to be broken

  24. More Tips by Yjerkle · · Score: 1

    If you get a word that's possibly inappropriate for children (boobs is the most common one, but also tits, gay, sex, ass, etc), pass immediately. There is a filter that will prevent your partner from guessing these words, but it will still give you pictures labeled with them.

    Seconding the use of labels. They're worth 300 points over the course of a full game. If none of them apply, because, for example, the word is an adjective, select "text". It's the least likely to be misleading to your partner.

    Common labels:
    man, men, woman, women, people, building, buildings, sky, water, grass, tree, trees, airplane, leopard. If you see something that could be one of these, guess it quickly.

    Try various levels of specificity. Many of the pictures have very generic labels, others are very specific. If you see a duck, try "bird", "animal", and "mallard" as well.

    Some pictures have labels that describe the picture itself, rather than its content, such as: picture, photo, drawing, page, rectangle(!), etc. If your partner seems to be trying to uncover the whole image, try some of these kinds of words.

    Finally, this was mentioned in the parent, but cannot get enough emphasis. DO NOT BE AFRAID TO PASS. If you get a clue that you don't think you can show, pass. If your partner passes, pass. If you can't think of any more synonyms for the words your partner has marked as hot, pass. Passing costs you nothing more than the few seconds it takes to start a new image. Do not waste time on an image you're not going to get.

  25. Somewhat relevant... by YodaToo · · Score: 1

    My research involved developing a software system to "learn" a protolanguage of nouns/verbs based on visual perception. Part of the vision system involved having the computer detect "significant" objects & relationships in video frames and tracking similar object/relationships across both different frames & different videos. Here's a short paper.

  26. A little insight by Dougthebug · · Score: 1

    As a computer vision researcher, I thought I'd share a little insight as to why this is helpful for the computer vision community.

    Whenever one wants to train an algorithm to detect or recognize an object in a data set, one needs both the data set and the ground truth. The data set is usually a large set of images and the ground truth is some semantic information associated with each image, such as the locations of people and cars, or perhaps a representative word or category. The data set is usually easy to obtain, however the ground truth usually involves manual input. Considering that data sets regularly have more than 10,000 images, this can be quite a challenge since it can't be automated (if it could, your research would be pointless eh?).

    This is where the Peekaboom application comes into play. Now, the task of annotating the images with semantic information is distributed among thousands of Slashdot readers and other assorted nerdy individuals. Not only does this program provide a single ground truth for researchers to analyze, it also provides a statistical description of the ground truth, one that is validated by another user guessing the semantic information.

    More information about this project can be found at: http://www.cs.cmu.edu/~biglou/research.html

    As an aside, a friend of mine is working on a project to turn planetary transit model fitting into a web-based game. Keep a lookout for it in the next few months.

  27. See with games? by Anonymous Coward · · Score: 0

    With the current crop of games (like Doom 3 and GTA) I would have expected the article to be titled:
    Teaching Computers to Kill with Games

  28. I was there. by Xx+Shinwa+xX · · Score: 1

    I am a student at one of CMU's summer programs called Andrew's Leap and they gave a presentation on this program to us.

    The premise was that no algorithm existed for computers to be able to find where within a picture a certain object existed. But who is good at doing these things? People. Normally, a group of people would be paid to sort through hundreds of thousands of different images and find where a certain object was. But this was slow, and consumed unnecessary resources (like money). However, the ingenious people at CMU developed a clever way to make it fun, so much so that people would actually WANT to do it. By making a game.

    The creator, a person whose name is Roy, spent a year working on the game. Please don't diss it too much. How would you like it if someone dissed your program that you spent your life on, making it as good as possible?

    On a side note, the ESP Game was sold to Google for 1 million dollars. Not kidding.

  29. Guess the nude celebrity's identity? by skeptictank · · Score: 1

    Maybe 'face' recognition systems could use the same strategy for free research.