The Status Quo Of Computer Vision
prostoalex writes "The Industrial Physicist sums up the recent advances and developments in the world of computer vision. They mention an application for human-computer interfacing using a Webcam, Philips Research Lab's Seeing with Sound product, which augments vision for the visually impaired, as well as various frontal face detection applications."
I'd have to say computers generally have very good vision - I am yet to see one wearing a pair of glasses.
Another big advance I think will come with the prodding of DARPA's $1 million contest. A lot of discussion has been going on on their message boards about computer vision systems.
At first, the phrase "frontal face detection applications" sounded rather cumbersome. But then a shorter phrase of "facial detection applications" might have been grossly misunderstood. ;)
I only post comments when someone on the internet is wrong.
It looks like you're trying to masturbate! Would you like me to load:
* Your porn collection
* An AIM conversation with a guy pretending to be female
* Recommended self pleasuring techniques database
* Featured lubricant merchants
---
DRM is like antifreeze, to the MPAA/RIAA it's sweet, to the consumers it's poison.
While it is clearly true that only the recent advances in computer speed allowed the computer vision systems we are seeing now, there are also other important influences.
In particular, there are also better algorithms than a number of years ago. Many if not most successful computer vision systems use statistical methods. In the case of faces, for example, they often build a probabilistic model of what a face is. Such models know that a face should usually, but not always, have eyes, that some people have beards, etc. And these models train themselves up from a database of stimuli, for example real faces.
A number of recent advances make such probabilistic models fast enough to work well on real-world data. In a sense, the problem of computer vision is very similar to the problem of understanding a voice or extracting the highest possible bitrate from a stream of data transmitted via a telephone line. And indeed the resulting algorithms are often surprisingly similar.
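To make the "probabilistic model trained from a database" idea concrete, here is a minimal sketch, not how any particular system does it. It assumes each image patch has already been boiled down to two made-up features (the feature names and the training numbers below are hypothetical), fits one independent Gaussian per feature for "face" and for "not face", and picks whichever model gives the higher likelihood.

```python
import math

def fit_gaussian(samples):
    """Estimate a per-feature mean and variance from example feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        sum((s[d] - means[d]) ** 2 for s in samples) / n + 1e-6  # small floor avoids zero variance
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(x, model):
    """Log-probability of x under independent per-feature Gaussians."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )

def classify(x, face_model, other_model):
    """Pick whichever model explains the features better."""
    if log_likelihood(x, face_model) > log_likelihood(x, other_model):
        return "face"
    return "not face"

# Hypothetical training database: [eye-region darkness, skin-tone fraction]
faces = [[0.8, 0.7], [0.7, 0.8], [0.9, 0.6], [0.6, 0.7]]
others = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [0.2, 0.2]]

face_model = fit_gaussian(faces)
other_model = fit_gaussian(others)

print(classify([0.75, 0.7], face_model, other_model))
print(classify([0.2, 0.15], face_model, other_model))
```

Because the decision is a comparison of likelihoods rather than a hard rule, a patch with a weak "eye" feature can still come out as a face if the other evidence is strong enough, which is the "usually, but not always" behaviour described above.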
Googlefight "Slashdot Troll" against "BSD is dying" 303:229. BSD thus cant die.
Daredevil actually premiered in The Netherlands this week, so I was kind of expecting them to suddenly come up with some kind of "Seeing with Sound"-tool. These guys are so predictable.
It has worked well for years. Computers running Windows often show me their blue face when I show them mine, even if their owners say that Windows is very stable and they have never seen a blue screen before. Surely Windows can recognize people and does this specifically to me.
All you need is for it to understand English and imagine in a 3d space.
Type a sentence like Zork, and it makes the scene for you.
Give it a book, and it could turn it into a movie for you.
Vision recognition has a great many uses already, but when vision recognition matures, you'll be able to take a scene and reduce it into 3d reality space. You take the 3d reality space, and give the computer some goals, and it's trying to accomplish something in the world.
Thing is, it won't stop at plain vision. You'll get infrared, sonar, ultraviolet, radar, all that crap to get the best 3d image possible.
So since vision is progressing, the gap towards AI is shrinking. Also as video games become more realistic, the AI gap is shrinking. I could be bold and say 15 years from now we should have basic AI.
God spoke to me
Wired had an article late last year entitled Vision Quest about a similar topic. The doctor couldn't perform most of his techniques in the U.S. due to ethical laws, giving the article a real "Frankenstein" flair. Good read.
I live for the day that we can have a computer installed in our bodies and have a HUD in our eyes. How cool would it be to browse the net or play games while doing other, more boring things in the outside world?
We could be talking about a revolution in isolationism here! I can't wait!
I am a filthy pirate.
...that in times like this, people always have to resort to this kind of submissions and waste otherwise better served research money...
-- debian linux - vim powered
All this fantastic technology - and yet here I am using mozilla with linux all fully apt-get upgraded to testing, everything uber-optimised and configured, all is good, smooth and aliased...
:^)
BUT my italic fonts when on slashdot still look fucking shit by default!
And don't try and tell me how to set my desktop up properly - check me out:
I AM THE 'KIN DESKTOP (all your desktops are belong to me now)
Before adopting WHATWG, read the moonlight.NET EULA [http://www.microsoft.com/interop/msnovellcollab/moonlight.mspx]
When I panic, the Linux kernel detects that and also panics. Wow, computers have had facial recognition for a long time!
I downloaded the Nouse, and the Bubble Frenzy demo. My webcam was already on top of my monitor, so all I had to do was run the program.
All you do is calibrate it by centering your nose in the image and clicking. The program draws a green box around your nose and follows it...it's pretty hilarious. Good oblique lighting seems to work best, too dark or too light and the box will want to follow your chin or ear. Overall, pretty reliable and lots of fun.
I loaded up the Bubble Frenzy game, which at first looks like a DOS-era Frozen Bubble. The Nouse worked fine...added a bit of challenge, levels I'd laugh at in Frozen Bubble were suddenly difficult. It's hard to keep track of the pointer when your head is moving. It was pretty fun, someone walked in and saw me playing, apparently just hitting the space bar while tilting my head from side to side.
I had a neck injury a while back in a car accident though, and all this motion started to bring on a little soreness. I had to quit after about 20 minutes of Nouse-ing, about the same effect as an hour of driving.
...
Maybe it's some sort of technophilia but some of the posts on here are just pure vapor. Sure, there have been some great advances in computer vision and pattern recognition... but have some of these posters on here ever done any research in the area? Hell, most face recognition goes back to Fisher's 1936 iris data set and principal component analysis... not quite Wintermute stuff.
Too often vision projects find speedups by sacrificing one component or another. For instance, you can get some great face recognition with PCA... as long as the person's face is immobile. Tilt your head slightly or rotate too much and the system has no clue.
I'll admit, there is some killer work out there. But none of it is the full-blown "20 years and we will all have robotic man servants" thing. Keep the hype to a minimum.
What is music when you despise all sound?
People might want to check out these cool pictures and videos from Cambridge University
I think at some point we went down a path which will never lead to the solutions we expected to have by this time. And the reason we can't get off the current path is because of the way the tech culture is, you always have to publish an extension to previous work with copious references.
And it's not even the big stuff, look at spell checkers and grammar checkers. Are there any that can tell correctly spelled but misused words? Affect/effect? There/their? How about something easy like made and maid?
As someone who has been doing research in computer vision, specifically identification, and who is a member of a Computer Vision Research Laboratory, I just thought I would make a few comments here. Some areas of computer vision, in relation to big brother, have been around for a while and actually work quite well already. These include, but are not limited to, fingerprint, iris, and hand recognition. These are already in commercial applications around the country, used for everything from secure entry into the country at immigration stations, to secure entry into rooms/labs/whatever, to confirming identification for logins to computers and other systems. They work well (there is always some room for improvement), but require a completely willing subject and carry a certain 'stigma' of big brother and criminals that makes them less viable. The goal mentioned here that researchers want to work towards is having a standard camera (like a security camera) able to identify people. However, despite some claims (the most recent interesting one out of Israel), so far no one has shown ANYTHING that would be viable in a real-world application. The best systems thus far have never even been tested with a database of over 500 people, most with significantly fewer than that, and they tend not to work well over time. Usually, they work fairly well the same day and then decrease exponentially in effectiveness until around 6 months, when you may as well be guessing randomly because you'd do about as well as most algorithms. Overall, I don't think you have anything to fear from big brother here anytime soon.
Yes, PCA is old and doesn't work very well. That is an easy strawman to set up. You claim to work in the field, yet conveniently forget ICA techniques and the tensorfaces paper in ECCV last year. I have to suspect that you either A) don't work in the field, B) don't read the current journals, or C) don't understand the state-of-the-art in this field. I'm sure you impressed someone though. Keep up the good work you trolling jackass.
Most of my research involves adapting the `what's in' in facial recognition and applying it to disease diagnosis and segmentation of heart MRIs. These days people are developing statistical shape and appearance models of faces via PCA for matching and segmentation. It gets a bit scary what they can do in facial recognition if you start reading up, but it's also slightly disconcerting how much money is being tossed into medical imaging.
-- Making computers see, hear, and think... http://www.componica.com/
Yes perhaps. But it is an exciting field. Vision is THE sense I wouldn't want to lose and it is sad seeing older people disengage from the world as it gets lost. Given my (generation's) connection to computers and visual stimuli just to get by (buy stuff, learn, communicate, et cetera), this technology is sure to be boon to humanity.
What totally drives me nuts is that most people in that field are totally hooked on the whole fisher-face, eigen-face, ICA thing. Basically they naively project a two-dimensional, affine- and brightness-normalized face onto a set of basis functions and then do a nearest neighbor search on the coefficients to determine identity, using some magical distance metric like Mahalanobis or Euclidean. They totally fail when the intensity or pose changes, and then blame it on the distance function or basis function.
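For anyone who hasn't seen that pipeline spelled out, here is a rough sketch of the "project onto a basis, then nearest neighbor on the coefficients" recipe. Everything in it is a toy assumption: the "images" are four pixels wide, the names and pixel values are invented, the basis comes from plain PCA computed by power iteration, and the metric is Euclidean. Real systems use thousands of pixels and proper linear algebra libraries.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def principal_components(centered, k, iterations=100):
    """Top-k principal directions via power iteration with deflation."""
    data = [list(v) for v in centered]
    dims = len(data[0])
    basis = []
    for _ in range(k):
        w = [1.0 / (i + 1) for i in range(dims)]  # arbitrary non-zero start vector
        for _ in range(iterations):
            # Apply X^T X to w without forming the covariance matrix.
            scores = [dot(v, w) for v in data]
            w = [sum(s * v[i] for s, v in zip(scores, data)) for i in range(dims)]
            norm = math.sqrt(dot(w, w))
            w = [x / norm for x in w]
        basis.append(w)
        # Deflate: strip the found direction from every sample.
        deflated = []
        for v in data:
            s = dot(v, w)
            deflated.append([x - s * wi for x, wi in zip(v, w)])
        data = deflated
    return basis

def project(image, mean, basis):
    """Coefficients of a mean-centered image in the PCA basis."""
    centered = [p - m for p, m in zip(image, mean)]
    return [dot(centered, w) for w in basis]

def identify(probe, gallery, mean, basis):
    """Nearest neighbor in coefficient space, Euclidean distance."""
    coeffs = project(probe, mean, basis)
    return min(
        gallery,
        key=lambda name: math.dist(coeffs, project(gallery[name], mean, basis)),
    )

# Hypothetical gallery of tiny four-pixel "faces".
gallery = {
    "alice": [0.9, 0.1, 0.9, 0.1],
    "bob": [0.1, 0.9, 0.1, 0.9],
    "carol": [0.9, 0.9, 0.1, 0.1],
}
images = list(gallery.values())
mean = [sum(img[i] for img in images) / len(images) for i in range(4)]
basis = principal_components([[p - m for p, m in zip(img, mean)] for img in images], k=2)

noisy_alice = [0.85, 0.15, 0.8, 0.2]
shifted_alice = gallery["alice"][-1:] + gallery["alice"][:-1]  # crude one-pixel "pose change"

print(identify(noisy_alice, gallery, mean, basis))
print(identify(shifted_alice, gallery, mean, basis))
```

The gallery is deliberately rigged so that a one-pixel shift of Alice's pattern lands exactly on Bob's, so the shifted probe comes back as "bob" while the slightly noisy one still comes back as "alice". It is a contrived case, but it shows why misalignment hurts: the coefficients encode pixel positions, not the face itself.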
Shape models and combined models take this into account and are really popular in medical imaging, yet the facial people seem to shoot them down. (Well, it's anecdotal on my part.)
Sorry, I guess I'm geeking out, but I love this stuff.
-- Making computers see, hear, and think... http://www.componica.com/
To lump all computer vision together and say "it's not there yet" is phooey! There are lots of problems in vision, and they do get solved, but those problems are all specific-- you can't use a red-light-runner system to do facial tracking...
I was stunned by how OCR went from "impossible" to "trivial" when all that changed was Moore's law making high-res scans fit in memory on a typical PC. Expect many vision problems to fall by the wayside with new 240 frames per second, 3 megapixel cameras. (Don't save THOSE movies uncompressed!) See the Sensor Spec.
This is very similar to a project I worked on in college. We were working on getting a webcam to track eye-gaze and to allow a user to control the mouse with their eye. I have always wanted to continue development of the gaze tracker, but never had the time after graduating. The website is here: http://www.gbook.org/projects/index.html
http://github.com/gbook/nidb
If researchers really want to understand the mechanism of human face recognition, they should be looking at the cases where it *doesn't* work: autistic people with face blindness.
N2 Reading
They use computer techniques to help people with Intermittent Central Suppression read. They're fighting the good fight too!
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
This project at MS research not only does the face detection, but recognition.
I can't get the videos to play right now, but when I saw them before, as people walked on and off camera, it would find their face, put a square around it and label their name on it.
Pretty neat.
When looking around the Cambridge Machine Vision URL noted above, I came upon this little site which looks like an interesting project in Hidden Markov Models; so, take a look at current 'owner' of site and related software license.
Now idn't that funny?
Just look at the various slashdot posters around :).