The Status Quo Of Computer Vision
prostoalex writes "The Industrial Physicist sums up the recent advances and developments in the world of computer vision. They mention an application for human-computer interfacing using a webcam, the Philips Research Lab's Seeing with Sound product, which augments vision for the visually impaired, as well as various frontal face detection applications."
All you need is for it to understand English and imagine in a 3D space.
Type a sentence, like in Zork, and it makes the scene for you.
Give it a book, and it could turn it into a movie for you.
Vision recognition has a great many uses already, but when it matures, you'll be able to take a scene and reduce it into a 3D reality space. You take that 3D reality space, give the computer some goals, and it's trying to accomplish something in the world.
Thing is, it won't stop at plain vision. You'll get infrared, sonar, ultraviolet, radar, all that crap to get the best 3D image possible.
So since vision is progressing, the gap towards AI is shrinking. Also as video games become more realistic, the AI gap is shrinking. I could be bold and say 15 years from now we should have basic AI.
God spoke to me
As someone who has been doing research in computer vision, specifically identification, and who is a member of a Computer Vision Research Laboratory, I just thought I would make a few comments here. Some areas of computer vision related to Big Brother have been around for a while and actually work quite well already. These include, but are not limited to, fingerprint, iris, and hand recognition. They are already in commercial applications around the country, used for everything from secure entry into the country at immigration stations, to secure entry into rooms/labs/whatever, to confirming identity for logins to computers and other systems. They work well (there is always some room for improvement), but they require a completely willing subject and carry a certain 'stigma' of Big Brother and criminals that makes them less viable. The goal mentioned here that researchers want to work towards is having a standard camera (like a security camera) able to identify people. However, despite some claims (the most recent interesting one out of Israel), so far no one has proved to have ANYTHING that would be viable in a real-world application. The best systems thus far have never even been tested with a database of over 500 people, most with significantly fewer than that, and they tend not to work well over time. Usually they work fairly well the same day and then decrease exponentially in effectiveness until, at around 6 months, you may as well be guessing randomly, because you'd do about as well as most algorithms. Overall, I don't think you have anything to fear from Big Brother here anytime soon.
What totally drives me nuts is that most people in that field are totally hooked on the whole Fisherface, eigenface, ICA thing. Basically they naively project a two-dimensional, affine- and brightness-normalized face onto a set of basis functions and then do a nearest-neighbor search on the coefficients to determine identity, using some magical distance metric like Mahalanobis or Euclidean. These methods totally fail when the intensity or pose changes, and then people blame it on the distance function or basis function.
Shape models and combined models take this into account and are really popular in medical imaging, yet the face recognition people seem to shoot them down. (Well, that's anecdotal on my part.)
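For anyone who hasn't seen the eigenface-style pipeline described above, here's a rough sketch of it in Python with NumPy. It assumes the faces are already aligned and brightness-normalized, uses random vectors as made-up stand-ins for real images, and picks plain Euclidean distance; the function names (fit_eigenfaces, project, identify) are just illustrative, not from any particular library.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened, pre-aligned images."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are the principal axes ("eigenfaces") of the training set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(faces, mean, basis):
    """Return the basis coefficients for each face (one row per face)."""
    return (faces - mean) @ basis.T

def identify(probe, gallery_coeffs, gallery_labels, mean, basis):
    """Nearest neighbor on the coefficients, using Euclidean distance."""
    coeffs = project(probe[None, :], mean, basis)
    dists = np.linalg.norm(gallery_coeffs - coeffs, axis=1)
    return gallery_labels[int(np.argmin(dists))]

rng = np.random.default_rng(0)
n_people, n_pixels = 5, 64

# Hypothetical data: one random "face" vector per enrolled person.
gallery = rng.normal(size=(n_people, n_pixels))
labels = np.arange(n_people)

mean, basis = fit_eigenfaces(gallery, n_components=4)
gallery_coeffs = project(gallery, mean, basis)

# A probe that is person 2 plus a little pixel noise.
probe = gallery[2] + 0.05 * rng.normal(size=n_pixels)
predicted = identify(probe, gallery_coeffs, labels, mean, basis)
print("predicted identity:", predicted)
```

Note that the match here depends only on raw pixel values after projection, so nothing in the pipeline models lighting or pose explicitly, which is the weakness being complained about.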
Sorry, I guess I'm geeking out, but I love this stuff.
-- Making computers see, hear, and think... http://www.componica.com/
This is very similar to a project I worked on in college. We were working on getting a webcam to track eye-gaze and to allow a user to control the mouse with their eye. I have always wanted to continue development of the gaze tracker, but never had the time after graduating. The website is here: http://www.gbook.org/projects/index.html
http://github.com/gbook/nidb