The Status Quo Of Computer Vision
prostoalex writes "The Industrial Physicist sums up the recent advances and developments in the world of computer vision. They mention an application for human-computer interfacing using a webcam, Philips Research Lab's Seeing with Sound product, which augments vision for the visually impaired, as well as various frontal face detection applications."
I'd have to say computers generally have very good vision - I am yet to see one wearing a pair of glasses.
At first, the phrase "frontal face detection applications" sounded rather cumbersome. But then a shorter phrase of "facial detection applications" might have been grossly misunderstood. ;)
I only post comments when someone on the internet is wrong.
It looks like you're trying to masturbate! Would you like me to load:
* Your porn collection
* An AIM conversation with a guy pretending to be female
* Recommended self pleasuring techniques database
* Featured lubricant merchants
---
DRM is like antifreeze, to the MPAA/RIAA it's sweet, to the consumers it's poison.
That's not doing much for computer vision. Most of the action in computer vision right now involves "homeland security" applications, real or imagined. The killer app for computer vision seems to be Big Brother.
While it is clearly true that only the recent advances in computer speed made possible the computer vision systems we are seeing now, there are also other important influences.
In particular, the algorithms really are better than they were a number of years ago. Many if not most successful computer vision systems use statistical methods. In the case of faces, for example, they often build a probabilistic model of what a face is. Such models know that a face usually has eyes, but not always, that some people have beards, and so on. And these models train themselves from a database of stimuli, for example real faces.
A number of recent advances make such probabilistic models fast enough to work well on real-world data. In a sense, the problem of computer vision is very similar to the problem of understanding a voice or extracting the highest possible bitrate from a stream of data transmitted over a telephone line. And indeed the resulting algorithms are often surprisingly similar.
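To make the "train a probabilistic model from examples" idea concrete, here is a toy sketch. It is not any particular system's method: it fits independent per-feature Gaussians (a naive Bayes style model) to a handful of made-up feature vectors. The three features (eye, mouth and chin darkness) and all the numbers are hypothetical. The point is only that the model learns from the data that chin darkness varies a lot among faces (beards), so it does not penalise it much.

```python
import math

def fit_gaussian(samples):
    """Estimate a per-feature mean and variance from training vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    # A small floor keeps a variance from collapsing to zero.
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(x, model):
    """Log-probability of x under independent per-feature Gaussians."""
    means, variances = model
    total = 0.0
    for xi, m, v in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
    return total

def looks_like_face(x, face_model, other_model):
    """Pick whichever model explains the observation better."""
    return log_likelihood(x, face_model) > log_likelihood(x, other_model)

# Hypothetical features: (eye darkness, mouth darkness, chin darkness).
# Two of the training faces are "bearded" (dark chin), two are not.
FACES = [(0.80, 0.60, 0.20), (0.70, 0.50, 0.90),
         (0.90, 0.70, 0.30), (0.75, 0.55, 0.80)]
NON_FACES = [(0.20, 0.30, 0.25), (0.30, 0.20, 0.40),
             (0.10, 0.40, 0.30), (0.25, 0.35, 0.20)]

face_model = fit_gaussian(FACES)
other_model = fit_gaussian(NON_FACES)

print(looks_like_face((0.85, 0.60, 0.85), face_model, other_model))  # bearded probe
print(looks_like_face((0.20, 0.30, 0.30), face_model, other_model))  # background patch
```

Real detectors use far richer features and models, but the shape is the same: estimate parameters from labelled examples, then compare likelihoods.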
Googlefight "Slashdot Troll" against "BSD is dying" 303:229. BSD thus cant die.
All you need is for it to understand English and imagine in a 3D space.
Type a sentence, like in Zork, and it makes the scene for you.
Give it a book, and it could turn it into a movie for you.
Vision recognition has a great many uses already, but when vision recognition matures, you'll be able to take a scene and reduce it into a 3D reality space. You take that 3D reality space, give the computer some goals, and it's trying to accomplish something in the world.
Thing is, it won't stop at plain vision; you'll get infrared, sonar, ultraviolet, radar, all that crap to get the best 3D image possible.
So since vision is progressing, the gap towards AI is shrinking. Also as video games become more realistic, the AI gap is shrinking. I could be bold and say 15 years from now we should have basic AI.
God spoke to me
Wired had an article late last year entitled Vision Quest about a similar topic. The doctor couldn't perform most of his techniques in the U.S. due to ethical laws, giving the article a real "Frankenstein" flair. Good read.
I downloaded the Nouse, and the Bubble Frenzy demo. My webcam was already on top of my monitor, so all I had to do was run the program.
All you do is calibrate it by centering your nose in the image and clicking. The program draws a green box around your nose and follows it...it's pretty hilarious. Good oblique lighting seems to work best; too dark or too light and the box will want to follow your chin or ear. Overall, pretty reliable and lots of fun.
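I don't know how the Nouse tracks internally, so treat this as a guess at the general flavour rather than its actual algorithm. One simple way to do "click to calibrate, then follow" is template matching: save the patch under the click, then in each new frame search a small window around the last position for the patch with the lowest sum of squared differences. The tiny 8x8 frames and the 2x2 "nose" below are made up for illustration.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and a frame patch."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            total += (frame[top + r][left + c] - value) ** 2
    return total

def track(frame, template, last_pos, radius=2):
    """Return the (top, left) within `radius` of last_pos that fits best."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best_pos, best_score = last_pos, None
    top_lo, top_hi = max(0, last_pos[0] - radius), min(fh - th, last_pos[0] + radius)
    left_lo, left_hi = max(0, last_pos[1] - radius), min(fw - tw, last_pos[1] + radius)
    for top in range(top_lo, top_hi + 1):
        for left in range(left_lo, left_hi + 1):
            score = ssd(frame, template, top, left)
            if best_score is None or score < best_score:
                best_pos, best_score = (top, left), score
    return best_pos

def make_frame(top, left, size=8):
    """Hypothetical grayscale frame: dark background, bright 2x2 'nose'."""
    frame = [[10] * size for _ in range(size)]
    for r in range(2):
        for c in range(2):
            frame[top + r][left + c] = 200
    return frame

NOSE = [[200, 200], [200, 200]]   # patch saved at the calibration click
pos = (3, 3)                      # where the user clicked
pos = track(make_frame(4, 5), NOSE, pos)   # head moved down 1, right 2
print(pos)  # (4, 5)
```

Matching on raw brightness like this is sensitive to lighting changes, which would at least be consistent with the box wandering off when the image is too dark or too bright, though that is speculation about this particular program.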
I loaded up the Bubble Frenzy game, which at first looks like a DOS-era Frozen Bubble. The Nouse worked fine...added a bit of challenge, levels I'd laugh at in Frozen Bubble were suddenly difficult. It's hard to keep track of the pointer when your head is moving. It was pretty fun, someone walked in and saw me playing, apparently just hitting the space bar while tilting my head from side to side.
I had a neck injury a while back in a car accident though, and all this motion started to bring on a little soreness. I had to quit after about 20 minutes of Nouse-ing, about the same effect as an hour of driving.
...
Maybe it's some sort of technophilia but some of the posts on here are just pure vapor. Sure, there have been some great advances in computer vision and pattern recognition... but have some of these posters on here ever done any research in the area? Hell, most face recognition goes back to Fisher's 1936 iris data set and principal component analysis... not quite Wintermute stuff.
Too often vision projects find speedups by sacrificing one component or another. For instance, you can get some great face recognition with PCA... as long as the person's face is immobile. Tilt your head slightly or rotate too much and the system has no clue.
I'll admit, there is some killer work out there. But nothing of the full-blown "20 years and we will all have robotic manservants" kind. Keep the hype to a minimum.
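A toy sketch of why pixel-based PCA matching is so brittle, under heavy assumptions: the "faces" are made-up four-pixel vectors, the principal components are found with plain power iteration plus deflation, and a one-pixel shift stands in for a head tilt. The names and numbers are all hypothetical.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def scatter_matrix(centered):
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) for j in range(d)]
            for i in range(d)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply by the matrix and renormalise."""
    v = [float(i + 1) for i in range(len(matrix))]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def principal_components(centered, k):
    matrix = scatter_matrix(centered)
    d = len(matrix)
    components = []
    for _ in range(k):
        v = top_eigenvector(matrix)
        mv = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        eigenvalue = sum(a * b for a, b in zip(v, mv))
        components.append(v)
        # Deflation: subtract the direction just found, then look again.
        matrix = [[matrix[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
                  for i in range(d)]
    return components

def project(x, mean, components):
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(c * v for c, v in zip(centered, comp)) for comp in components]

def identify(probe, gallery, mean, components):
    """Nearest gallery entry in the PCA subspace."""
    p = project(probe, mean, components)
    def dist(name):
        g = project(gallery[name], mean, components)
        return sum((a - b) ** 2 for a, b in zip(p, g))
    return min(gallery, key=dist)

GALLERY = {"alice": [9, 1, 0, 0], "bob": [1, 8, 2, 0], "carol": [0, 0, 1, 9]}
MEAN = mean_vector(list(GALLERY.values()))
CENTERED = [[x - m for x, m in zip(row, MEAN)] for row in GALLERY.values()]
COMPONENTS = principal_components(CENTERED, k=2)

print(identify([9, 1, 0, 0], GALLERY, MEAN, COMPONENTS))  # alice, held still
print(identify([0, 9, 1, 0], GALLERY, MEAN, COMPONENTS))  # alice shifted one pixel
```

With the image held still the match is alice; shift the same pattern one pixel to the right and the nearest neighbour becomes bob, because the projection only sees which pixels are bright, not that it is the same pattern moved.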
What is music when you despise all sound?
People might want to check out these cool pictures and videos from Cambridge University
As someone who has been doing research in areas of computer vision, specifically identification, and who is a member of a computer vision research laboratory, I just thought I would make a few comments here. Some areas of computer vision, in relation to Big Brother, have been around for a while and actually work quite well already. These include, but are not limited to, fingerprint, iris, and hand recognition. They are already in commercial applications around the country, used for everything from secure entry into the country at immigration stations, to secure entry into rooms/labs/whatever, to confirming identification for logins to computers and other systems. They work well (there is always some room for improvement), but they require a completely willing subject and carry a certain 'stigma' of Big Brother and criminals that makes them less viable. What researchers want to work towards, the view mentioned here, is having a standard camera (like a security camera) able to identify people. However, despite some claims (the most recent interesting one out of Israel), so far no one has proved to have ANYTHING that would be viable in a real-world application. The best systems thus far have never even been tested with a database of over 500 people, most with significantly fewer than that, and they tend not to work well over time. Usually, they work fairly well the same day and then decrease exponentially in effectiveness until, at around 6 months, you may as well be guessing randomly, because you'd do about as well as most algorithms. Overall, I don't think you have anything to fear from Big Brother here anytime soon.
To lump all computer vision together and say "it's not there yet" is phooey! There are lots of problems in vision, and they do get solved, but those problems are all specific-- you can't use a red-light-runner system to do facial tracking...
This is very similar to a project I worked on in college. We were working on getting a webcam to track eye-gaze and to allow a user to control the mouse with their eye. I have always wanted to continue development of the gaze tracker, but never had the time after graduating. The website is here: http://www.gbook.org/projects/index.html
http://github.com/gbook/nidb