Slashdot Mirror


The Status Quo Of Computer Vision

prostoalex writes "The Industrial Physicist sums up the recent advances and developments in the world of computer vision. They mention an application for human-computer interfacing using a Webcam, Philips Research Lab Seeing with Sound product, which augments vision for visually impaired, as well as various frontal face detection applications."

69 comments

  1. Computers' Vision? by Metallic+Matty · · Score: 4, Funny

    I'd have to say computers generally have very good vision - I am yet to see one wearing a pair of glasses.

    1. Re:Computers' Vision? by Simon+Field · · Score: 1


      My computer must need glasses.
      It cannot tell which window I am looking at when I type. I have to tell it which window to use by clicking with the mouse.

      With better computer vision, the computer would know which window I was looking at when I started typing. With highly acute vision, it could even know which button I was looking at when I hit the return key.

      I will know that computer vision is a reality when I no longer need to use a mouse. Likewise, I will know that speech recognition is a reality when I no longer need to use a keyboard (although I am not certain I want to trade carpal tunnel for a sore throat, or let the rest of the office hear everything I would otherwise be typing).

      A Nouse might be good enough to indicate which window I wish to use (if the windows are large enough). But I think I'll wait for the software that can actually track my eyes.

    2. Re:Computers' Vision? by Hognoxious · · Score: 1

      Of course not, they haven't got ears.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    3. Re:Computers' Vision? by MooseGuy529 · · Score: 1

      That's actually a pretty good idea for using computer vision. The one problem I've always had with a GUI is the overhead in switching windows and such.

      I think having your current window in front, with the others as small pictures at the top, getting smaller the longer they go unused (with a limit, of course), would be great--you could just look at a window and it would pop up.

      --

      Tired of free iPod sigs? Subscribe to my blacklist

  2. Don't forget the DARPA Contest by Blaine+Hilton · · Score: 3, Informative

    Another big advance, I think, will come with the prodding of DARPA's $1 million contest. There has been a lot of discussion on their message boards about computer vision systems.

    1. Re:Don't forget the DARPA Contest by Animats · · Score: 4, Insightful

      That's not doing much for computer vision. Most of the action in computer vision right now involves "homeland security" applications, real or imagined. The killer app for computer vision seems to be Big Brother.

    2. Re:Don't forget the DARPA Contest by Anonymous Coward · · Score: 0, Informative

      Wrong. Far and away, most CV research money is in medical imaging. Haven't you read PAMI or CVPR lately? Nice try with the oh-so-timely propaganda though.

  3. Choose your words carefully... by Kozz · · Score: 4, Funny

    At first, the phrase "frontal face detection applications" sounded rather cumbersome. But then a shorter phrase of "facial detection applications" might have been grossly misunderstood. ;)

    --
    I only post comments when someone on the internet is wrong.
    1. Re:Choose your words carefully... by Metallic+Matty · · Score: 0, Funny

      grossly misunderstood

      No pun intended. =)

  4. Microsoft webcam assistant by Powercntrl · · Score: 5, Funny

    It looks like you're trying to masturbate! Would you like me to load:

    * Your porn collection
    * An AIM conversation with a guy pretending to be female
    * Recommended self pleasuring techniques database
    * Featured lubricant merchants

    --

    ---
    DRM is like antifreeze, to the MPAA/RIAA it's sweet, to the consumers it's poison.
    1. Re:Microsoft webcam assistant by Anonymous Coward · · Score: 0
      *DOUBLECLICK* -> * Recommended self pleasuring techniques database

      Wow! I never knew that a new technique could make burping my worm feel 10 times better!

      I was spanking my monkey all wrong before! Thank you, Clippy!

    2. Re:Microsoft webcam assistant by Anonymous Coward · · Score: 0

      > * Recommended self pleasuring techniques database

      Any links? Come-on man, share!

  5. Algorithmic Progresses by Neuronerd · · Score: 5, Informative

    While it is clearly true that only the recent advances in computer speed made possible the computer vision systems we are seeing now, there are other important influences as well.

    In particular, the algorithms really are better than they were a number of years ago. Many, if not most, successful computer vision systems use statistical methods. In the case of faces, for example, they often build a probabilistic model of what a face is. Such models know that a face usually, but not always, has eyes, that some people have beards, and so on. These models train themselves from a database of stimuli, for example real faces.

    A number of recent advances make such probabilistic models fast enough to work well on real-world data. In a sense, the problem of computer vision is very similar to the problem of understanding a voice or extracting the highest possible bitrate from a stream of data transmitted over a telephone line. And indeed, the resulting algorithms are often surprisingly similar.

    --
    Googlefight "Slashdot Troll" against "BSD is dying" 303:229. BSD thus cant die.
  6. Philips Research... by jeroenb · · Score: 0

    Daredevil actually premiered in The Netherlands this week, so I was kind of expecting them to suddenly come up with some kind of "Seeing with Sound"-tool. These guys are so predictable.

  7. Face detection for Windows by gmuslera · · Score: 3, Funny

    It has worked well for years. Computers running Windows often show me their blue face when I show them mine, even if their owners say that Windows is very stable and that they have never seen a blue screen before. Surely Windows can recognize people and does this specifically to me.

  8. Artificial intelligence in under 20 years by CrazyJim0 · · Score: 5, Interesting

    All you need is it to understand english, and imagine in a 3d space.

    Type a sentence like Zork, and it makes the scene for you.

    Give it a book, and it could turn it into a movie for you.

    Vision recognition has a great many uses already, but when it matures, you'll be able to take a scene and reduce it into a 3d reality space. You take the 3d reality space and give the computer some goals, and it's trying to accomplish something in the world.

    Thing is, it won't stop at plain vision; you'll get infrared, sonar, ultraviolet, radar, all that crap to get the best 3d image possible.

    So since vision is progressing, the gap towards AI is shrinking. Also as video games become more realistic, the AI gap is shrinking. I could be bold and say 15 years from now we should have basic AI.

    1. Re:Artificial intelligence in under 20 years by tpearson · · Score: 2, Insightful

      We already have basic AI - we have UAVs that can plan their own route and complete their mission completely autonomously. Most commercial robots that are being released have enough AI to determine where they are, what their goal is, and how to perform that goal. In my opinion, that is definitely "basic" AI.

    2. Re:Artificial intelligence in under 20 years by burns210 · · Score: 2, Interesting
      15 years? HA! Try 50, at best. And that is with MAJOR corporations trying like hell to bring the product to market. Have you ever talked to an ALICE bot? That is supposed to be a fairly advanced bot/AI program, and you can stump it like mad without even trying! Now try to feed it the text of a book and have it understand what the hell is going on?! To read a classic novel, where inferences have to be made and many MAJOR actions are implied but not directly described, the computer would need an AI text parser several generations more advanced than the (still very impressive) likes of ALICE or Zork.

      Not to mention the realtime 3d mapping... the processor load (think of Pixar's server farm) would be crazy... even with Moore's Law, it will be at least 10 years before anything remotely like this can be done at decent quality on a normal desktop.

      I think the idea is awesome, and the level of AI amazing. But keep things realistic, your realtime book->movie program won't be around in 20 years, I just don't see it happening.

    3. Re:Artificial intelligence in under 20 years by oblom · · Score: 1
      All you need is it to understand english, and imagine in a 3d space. Type a sentence like Zork, and it makes the scene for you.

      The mapping is probably not direct. It's likely that there is no straight conversion of words to pictures even in our heads. We like to visualize, as it simplifies understanding and memorization. However, there are a lot of words (mostly concepts) that don't evoke any pictures when pronounced. We never encountered them in the physical world to give us a visual representation. Yet we manage to juggle them somehow.

      It's more likely that we have an internal language that we express our thoughts in, which gets converted into English. In the same way, we may have internal structures that represent objects which are associated with visuals.

    4. Re:Artificial intelligence in under 20 years by deblau · · Score: 2, Insightful
      Sorry, I have to disagree. I think you have some very good ideas about what could possibly be automated, but the devil is in the details. One of the biggest pitfalls is assuming that a computer would somehow be 'smarter' than a human, just because it can perform fast calculations. For instance, you claim that an AI could turn a book into a movie. Great idea, but I know I can't do that myself, and I like to think I'm pretty sharp. I'm also pretty sure that most of the people developing AI can't either, or they would have been screenwriters for a living. Who exactly will train this AI to write movies, and what kind of skills will they require to do so? These questions have pretty vague answers, and computers don't tolerate ambiguity very well.

      I also think that your claim of progress toward hard AI because of vision advancements is a little misleading. A true AI will gather perceptual input from a variety of sources in order to get the most accurate representation of concepts. For instance, a rose isn't really a rose if you've never smelled one; instead, it has the same emotional impact as a strawberry, a pile of vomit, and a face -- it's just another hard, lifeless image. Of course, if you can build me a machine that generates a good, emotional screenplay from some words on a page, I'll buy it from you for $100,000,000 and consider it a good deal. Heck, I'll turn out blockbuster movie remakes of classic literature by the thousands and make that figure 100 times over.

      Game advancements don't really help out AI either. What they do help is expert systems, which is a related field, and one which many people confuse for hard AI. The basic difference is that the input model for an expert system is generally limited to a single topic, whereas a general AI trains on any input it can perceive. There are many "AI" projects out there which train to recognize faces, and similar tasks. These projects are really expert systems, since they'll never be good for anything beyond face recognition, or for whatever limited task they train. You wouldn't ask a face rec program why good-looking people succeed in politics, now would you? But you would ask your Marketing major buddy.

      To wrap up, I think expert systems is a thriving field, and that for many problems an expert system will be good enough. I wouldn't hold my breath waiting for a real AI, though.

      --
      This post expresses my opinion, not that of my employer. And yes, IAAL.
    5. Re:Artificial intelligence in under 20 years by Hognoxious · · Score: 1
      For instance, you claim that an AI could turn a book into a movie.
      I think it could produce some pretty good comedies, as long as it wasn't trying to.
      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  9. Related article by Toasty16 · · Score: 4, Informative

    Wired had an article late last year entitled Vision Quest about a similar topic. The doctor couldn't perform most of his techniques in the U.S. due to ethical laws, giving the article a real "Frankenstein" flair. Good read.

  10. Computer Vision! by blitzoid · · Score: 2

    I live for the day that we can have a computer installed in our bodies and have a HUD in our eyes. How cool would it be to browse the net or play games while doing other, more boring things in the outside world?

    We could be talking about a revolution in isolationism here! I can't wait!

    --
    I am a filthy pirate.
  11. It's a pity though.. by scratchor · · Score: 0, Troll

    ...that in times like this, people always have to resort to this kind of submissions and waste otherwise better served research money...

    --
    -- debian linux - vim powered
  12. could a machine eye read mozilla's itallics? by Miguel+de+Icaza · · Score: 2, Funny

    All this fantastic technology - and yet here I am using mozilla with linux all fully apt-get upgraded to testing, everything uber-optimised and configured, all is good, smooth and aliased...

    BUT my itallic fonts when on slashdot still look fucking shit by default!

    And don't try and tell me how to set my desktop up properly - check me out:
    I AM THE 'KIN DESKTOP (all your desktops are belong to me now) :^)

    --
    Before adopting WHATWG, read the moonlight.NET EULA [http://www.microsoft.com/interop/msnovellcollab/moonlight.mspx]
    1. Re:could a machine eye read mozilla's itallics? by Anonymous Coward · · Score: 1, Funny

      Watt proplem wit italics?

  13. Linux, too! by spanky1 · · Score: 2, Funny

    When I panic, the Linux kernel detects that and also panics. Wow, computers have had facial recognition for a long time!

  14. Nouse-ing by cybermace5 · · Score: 4, Informative

    I downloaded the Nouse, and the Bubble Frenzy demo. My webcam was already on top of my monitor, so all I had to do was run the program.

    All you do is calibrate it by centering your nose in the image and clicking. The program draws a green box around your nose and follows it...it's pretty hilarious. Good oblique lighting seems to work best, too dark or too light and the box will want to follow your chin or ear. Overall, pretty reliable and lots of fun.
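
    For the curious, the "click once, then follow" behaviour can be sketched with plain template matching. This is NOT Nouse's actual algorithm (which isn't described here); it is a simple sum-of-squared-differences search swapped in for illustration, run on synthetic frames. The function name and all parameters are hypothetical.

```python
import numpy as np

def track_template(frame, template, prev_xy, search_radius=8):
    """Return the (x, y) top-left corner near prev_xy where template best matches frame."""
    th, tw = template.shape
    px, py = prev_xy
    # Clamp the search window so the patch always stays inside the frame.
    y_lo, y_hi = max(0, py - search_radius), min(frame.shape[0] - th, py + search_radius)
    x_lo, x_hi = max(0, px - search_radius), min(frame.shape[1] - tw, px + search_radius)
    best_xy, best_score = prev_xy, np.inf
    for y in range(y_lo, y_hi + 1):
        for x in range(x_lo, x_hi + 1):
            patch = frame[y:y + th, x:x + tw]
            score = np.sum((patch - template) ** 2)  # sum of squared differences
            if score < best_score:
                best_xy, best_score = (x, y), score
    return best_xy

# Synthetic stand-in for two webcam frames (an assumption, not real video):
# random texture, then the same texture shifted 3 pixels right and 2 pixels down.
rng = np.random.default_rng(1)
frame1 = rng.random((48, 64))
frame2 = np.roll(frame1, shift=(2, 3), axis=(0, 1))

# "Calibration click": grab a 10x10 patch at x=20, y=15 in the first frame.
x0, y0 = 20, 15
template = frame1[y0:y0 + 10, x0:x0 + 10].copy()

new_xy = track_template(frame2, template, (x0, y0))
print(new_xy)  # the box follows the shift to (23, 17)
```

    A raw pixel comparison like this is sensitive to brightness changes, which would be consistent with the lighting problems described above.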

    I loaded up the Bubble Frenzy game, which at first looks like a DOS-era Frozen Bubble. The Nouse worked fine...added a bit of challenge, levels I'd laugh at in Frozen Bubble were suddenly difficult. It's hard to keep track of the pointer when your head is moving. It was pretty fun, someone walked in and saw me playing, apparently just hitting the space bar while tilting my head from side to side.

    I had a neck injury a while back in a car accident though, and all this motion started to bring on a little soreness. I had to quit after about 20 minutes of Nouse-ing, about the same effect as an hour of driving.

    --
    ...
    1. Re:Nouse-ing by gad_zuki! · · Score: 2, Funny

      Wow, Nouse is the coolest thing I've seen on slashdot in ages. Mod parent up please.

      Nothing like playing games with your nose. Now I'm tempted to borrow a USB2 card for nose to nose pong!

  15. Some of the posts on here are getting a bit vapor by sielwolf · · Score: 4, Insightful

    Maybe it's some sort of technophilia, but some of the posts on here are just pure vapor. Sure, there have been some great advances in computer vision and pattern recognition... but have some of these posters ever done any research in the area? Hell, most face recognition goes back to Fisher's 1936 iris data set and principal component analysis... not quite Wintermute stuff.

    Too often vision projects find speedups by sacrificing one component or another. For instance, you can get some great face recognition with PCA... as long as the person's face is immobile. Tilt your head slightly or rotate too much and the system has no clue.

    I'll admit, there is some killer work out there. But not of the full-blown "20 years and we will all have robotic man servants" thing. Keep the hype to a minimum.

    --
    What is music when you despise all sound?
  16. Nice Demo by Anonymous Coward · · Score: 4, Informative

    People might want to check out these cool pictures and videos from Cambridge University

  17. Re:Some of the posts on here are getting a bit vap by t · · Score: 2, Insightful
    No kidding. I'm personally quite disappointed with state-of-the-art speech to text, computer vision, etc. Much of it has gone largely unchanged for years; optimizations here and there are about it.

    I think at some point we went down a path which will never lead to the solutions we expected to have by this time. And the reason we can't get off the current path is because of the way the tech culture is, you always have to publish an extension to previous work with copious references.

    And it's not even the big stuff; look at spell checkers and grammar checkers. Are there any that can catch correctly spelled but misused words? Affect/effect? There/their? How about something easy like made and maid?

  18. Researcher's Perspective On "Big Brother" by chameleon1z · · Score: 4, Interesting

    As someone who has been doing research in computer vision, specifically identification, and who is a member of a computer vision research laboratory, I thought I would make a few comments here.

    Some areas of computer vision related to Big Brother have been around for a while and already work quite well. These include, but are not limited to, fingerprint, iris, and hand recognition. They are already in commercial use around the country for everything from secure entry into the country at immigration stations, to secure entry into rooms and labs, to confirming identity for logins to computers and other systems. They work well (there is always some room for improvement), but they require a completely willing subject and carry a certain 'stigma' of Big Brother and criminals that makes them less viable.

    The goal mentioned here, which researchers want to work toward, is having a standard camera (like a security camera) identify people. However, despite some claims (the most recent interesting one out of Israel), no one has yet proved to have ANYTHING that would be viable in a real-world application. The best systems so far have never even been tested with a database of over 500 people, most with significantly fewer than that, and they tend not to work well over time. Usually they work fairly well the same day and then decrease exponentially in effectiveness until, at around 6 months, you may as well be guessing randomly, because you'd do about as well as most algorithms.

    Overall, I don't think you have anything to fear from Big Brother here anytime soon.

    1. Re:Researcher's Perspective On "Big Brother" by rtl · · Score: 1

      While your broad claim that face recognition is really not ready for a large-scale real-world application is basically correct, your specific claims are anything but. The FERET evaluation of 2000, for instance, used close to 4000 images, not 500, and the 2002 follow-up evaluation made use of far more images than that. Also, the falloff in performance due to the passage of time is not nearly as extreme as you imply. Comparing the 2000 results to the newer study should make it clear that substantial progress has been made in 2 years, and there's no reason to believe that rapid progress won't continue. See the FRVT page for more information.

    2. Re:Researcher's Perspective On "Big Brother" by chameleon1z · · Score: 1
      Sorry I wasn't specific in what I was saying. I was trying to dumb it down a little. My claim of 500 referred to the number of individuals involved in the studies, not the number of images. I believe the number of individuals is MUCH more important than the number of images in determining effectiveness in a real-world, border-crossing type of situation.

      While you're correct that the FERET database had more than that (1200 individuals in 2000; I even double-checked it with the FERET website), their accuracy was around 95%, far lower than most of the modern claims today (yes, there have been considerable advancements), with a false alarm rate so high there would be tens of thousands of false alarms in airports like O'Hare or LAX. So I didn't include the FERET study in what I mentioned. Sorry, I was trying to dumb it down. It was my fault.
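
      To put rough numbers on the false-alarm point: the arithmetic is just people screened times the per-person false-alarm rate. The daily passenger count and the 5% rate below are illustrative assumptions, not figures from FERET or from any airport, and the function name is made up for this sketch.

```python
def expected_false_alarms(people_screened: int, false_alarm_rate: float) -> float:
    """Expected number of people wrongly flagged, treating every check as independent."""
    if not 0.0 <= false_alarm_rate <= 1.0:
        raise ValueError("false_alarm_rate must be between 0 and 1")
    return people_screened * false_alarm_rate

# Hypothetical inputs: 200,000 passengers a day, 5% of them wrongly flagged.
daily = expected_false_alarms(200_000, 0.05)
print(f"{daily:.0f} false alarms per day")
print(f"{daily * 7:.0f} false alarms per week")
```

      Under those assumed inputs that is about ten thousand wrongly flagged travelers every day, each of whom someone would have to check by hand.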

      Personally, I think the problem with 2D face recognition is simply that there isn't enough data in a standard 2D image to differentiate among millions of people. Further, in all of the FERET studies, and in any of the studies involving large groups, the subjects are willing. They aren't attempting to disguise their identities with beards or glasses, two things that completely kill most modern techniques. While some smaller studies have been done on this issue, they have shown not very promising results, at least with current software and methods. Therefore, I still believe that, despite some promising stuff going on right now, we're quite far away from the security camera having any chance of being able to identify who you are.

  19. Re:Some of the posts on here are getting a bit vap by Anonymous Coward · · Score: 0

    Yes, PCA is old and doesn't work very well. That is an easy straw man to set up. You claim to work in the field, yet conveniently forget ICA techniques and the tensorfaces paper in ECCV last year. I have to suspect that you either A) don't work in the field, B) don't read the current journals, or C) don't understand the state-of-the-art in this field. I'm sure you impressed someone though. Keep up the good work you trolling jackass.

  20. Medical Imaging is where it's at. by Steve+Mitchell · · Score: 2, Informative

    Most of my research involves adapting `what's in' in facial recognition and applying it to disease diagnosis and segmentation of heart MRIs. These days people are developing statistical shape and appearance models of faces via PCA for matching and segmentation. It gets a bit scary what they can do in facial recognition if you start reading up, but it's also slightly disconcerting how much money is being tossed into medical imaging.

    --
    -- Making computers see, hear, and think... http://www.componica.com/
  21. Re:Some of the posts on here are getting a bit vap by DietHacker · · Score: 1

    Yes, perhaps. But it is an exciting field. Vision is THE sense I wouldn't want to lose, and it is sad seeing older people disengage from the world as it gets lost. Given my (generation's) connection to computers and visual stimuli just to get by (buy stuff, learn, communicate, et cetera), this technology is sure to be a boon to humanity.

  22. Re:Some of the posts on here are getting a bit vap by Steve+Mitchell · · Score: 3, Interesting

    What totally drives me nuts is that most people in that field are totally hooked on the whole fisher-face, eigen-face, ICA thing. Basically they naively project a two-dimensional, affine- and brightness-normalized face onto a set of basis functions and then do a nearest-neighbor search on the coefficients to determine identity, using some magical distance metric like Mahalanobis or Euclidean. They totally fail when the intensity or pose changes, and then blame it on the distance function or basis function.
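
    The pipeline described above (project onto a learned basis, then nearest neighbor on the coefficients) can be sketched in a few lines. This is a minimal eigenface-style illustration under stated assumptions: the "faces" are random toy vectors standing in for normalized images, the function names are hypothetical, and the Mahalanobis variant uses the diagonal covariance of the PCA coefficients.

```python
import numpy as np

def fit_pca(faces, n_components):
    """Learn a PCA basis from rows of flattened, normalized face images."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data; rows of vt are the principal directions ("eigenfaces").
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Variance of the training data along each kept direction.
    variances = (singular_values[:n_components] ** 2) / (len(faces) - 1)
    return mean, basis, variances

def project(face, mean, basis):
    """Coefficients of one face in the PCA basis."""
    return basis @ (face - mean)

def identify(probe, gallery_coeffs, labels, mean, basis, variances=None):
    """Nearest neighbor on PCA coefficients.

    With variances=None the metric is Euclidean; otherwise each coefficient is
    divided by its standard deviation, which gives the Mahalanobis distance in
    PCA space (where the covariance is diagonal).
    """
    diffs = gallery_coeffs - project(probe, mean, basis)
    if variances is not None:
        diffs = diffs / np.sqrt(variances)
    distances = np.linalg.norm(diffs, axis=1)
    return labels[int(np.argmin(distances))]

# Toy stand-in data (an assumption, not real faces): 3 "people",
# 5 noisy 64-pixel "images" each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
faces = np.vstack([p + 0.1 * rng.normal(size=(5, 64)) for p in prototypes])
labels = np.repeat(np.arange(3), 5)

# Two components are enough to separate three well-spaced toy identities.
mean, basis, variances = fit_pca(faces, n_components=2)
gallery = (faces - mean) @ basis.T

probe = prototypes[1] + 0.1 * rng.normal(size=64)
print(identify(probe, gallery, labels, mean, basis))             # Euclidean
print(identify(probe, gallery, labels, mean, basis, variances))  # Mahalanobis
```

    Nothing in this sketch models pose or lighting: the probe is compared pixel for pixel after a linear projection, which is exactly why a tilted head or a changed lamp throws it off.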

    Shape models and combined models take this into account and are really popular in medical imaging, yet the facial people seem to shoot them down. (Well, it's anecdotal on my part.)

    Sorry, I guess I'm geeking out, but I love this stuff.

    --
    -- Making computers see, hear, and think... http://www.componica.com/
  23. Don't forget the movies by Boss+Sauce · · Score: 4, Informative
    Gollum was brought to you by vision technology. It takes a lot of specialized cameras like these to track a lot of dots in 3D. Also, cameras are tracked after the fact by analyzing photography with tools like this and this (search for MARS).

    To lump all computer vision together and say "it's not there yet" is phooey! There are lots of problems in vision, and they do get solved, but those problems are all specific-- you can't use a red-light-runner system to do facial tracking...

  24. high speed high res cameras by hyperventilate · · Score: 3, Insightful

    I was stunned by how OCR went from "impossible" to "trivial", and all that changed was Moore's law making high-res scans fit in memory on a typical PC. Expect many vision problems to fall by the wayside with new 240-frames-per-second, 3-megapixel cameras. (Don't save THOSE movies uncompressed!) See the Sensor Spec.
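
    A back-of-the-envelope check on why those movies shouldn't be saved uncompressed. It assumes "3 megapixel" means exactly 3,000,000 pixels and guesses at the bytes per pixel, since the sensor's actual bit depth isn't given here.

```python
def raw_data_rate(pixels_per_frame: int, frames_per_second: int, bytes_per_pixel: int) -> int:
    """Uncompressed video data rate in bytes per second."""
    return pixels_per_frame * frames_per_second * bytes_per_pixel

PIXELS = 3_000_000  # "3 megapixel", taken as exactly 3 million pixels
FPS = 240

# The bytes-per-pixel values are assumptions: 1 for 8-bit raw sensor data,
# 3 for 24-bit RGB.
for bpp in (1, 3):
    rate = raw_data_rate(PIXELS, FPS, bpp)
    print(f"{bpp} byte(s)/pixel: {rate / 1e6:.0f} MB/s, {rate * 60 / 1e9:.1f} GB per minute")
```

    Even at one byte per pixel that works out to 720 MB every second, or about 43 GB per minute of footage.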

  25. Eye Tracking by nycsubway · · Score: 4, Interesting

    This is very similar to a project I worked on in college. We were working on getting a webcam to track eye-gaze and to allow a user to control the mouse with their eye. I have always wanted to continue development of the gaze tracker, but never had the time after graduating. The website is here: http://www.gbook.org/projects/index.html

  26. "Face blindness" in autistics is the key by macraig · · Score: 1

    If researchers really want to understand the mechanism of human face recognition, they should be looking at the cases where it *doesn't* work: autistic people with face blindness.

    1. Re:"Face blindness" in autistics is the key by C21 · · Score: 1

      As someone who knows quite a few autistic people quite personally, I would have to chime in and say that a test that tried to pinpoint something as specific as this in a severely autistic person would be hard at best and impossible at worst. The abstractness of this idea is what, in my opinion, would elude an autistic person.

      --
      this is not a sig.
  27. Other people helping vision with computers... by Randolpho · · Score: 1

    N2 Reading

    They use computer techniques to help people with Intermittent Central Suppression read. They're fighting the good fight too!

    --
    "Times have not become more violent. They have just become more televised."
    -Marilyn Manson
  28. A similar microsoft research project by Sabalon · · Score: 1

    This project at MS research not only does the face detection, but recognition.

    I can't get the videos to play right now, but when I saw them before, as people walked on and off camera, it would find their face, put a square around it and label their name on it.

    Pretty neat.

  29. Re:Nice Demo (and the Evil Empire...) by roman_maroni · · Score: 1

    When looking around the Cambridge Machine Vision URL noted above, I came upon this little site which looks like an interesting project in Hidden Markov Models; so, take a look at current 'owner' of site and related software license.

    Now idn't that funny?

  30. AI already here by TheLink · · Score: 1

    Just look at the various slashdot posters around :).

    --