Predator Outdoes Kinect At Object Recognition
mikejuk writes "A real breakthrough in AI allows a simple video camera and almost any machine to track objects in its view. All you have to do is draw a box around the object you want to track and the software learns what it looks like at different angles and under different lighting conditions as it tracks it. This means no training phase -- you show it the object and it tracks it. And it seems to work really well! The really good news is that the software has been released as open source so we can all try it out! This is how AI should work."
I for one welcome our new Open Source overlords.
Never email donotemail@WeAreSpammers.com
The true test: can it track objects without a red dot or yellow square?
Kinect doesn't just track the person; it tracks the wireframe so it can figure out what the object's legs and arms are doing. That's why it has the laser rangefinder in addition to the stereo cameras.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
1) Integrate this with a physical tracking system to move the camera to follow the target. 2) A simple program to actuate a solenoid when on target. 3) Add gun 4) train with photo 5) leave somewhere days before target arrives. 6) Profit
I heard you like Predators, so we put a Predator on your Predator so you can spy while you spy.
Watched TFV on TFA, very interesting. Something to play with soon I think.
I'm assuming the robot plane can track objects pretty well before it disconfigures them.
This method uses fairly standard techniques (tracking and on-line learning), and puts them together nicely. It is very nice work, but hardly a breakthrough in AI.
Also, this has nothing to do with Kinect. This tracker uses a 2D camera to track 2D image patches, while learning their shape. The Kinect is a 3D sensor which is used for tracking *articulated* models, such as people.
Yeah, that's how it starts off. First you're like "ooh ahh look at the cute little robot isn't he pretty walking around by himself" then later there's running and screaming and then it's all like "Newsflash: Bombing in Midtown, USA - Cyborg liberation front demands equal rights for robots".
------
beware he who would deny you access to information, for in his mind he dreams himself your master
what could possibl
sky net is growing
Very nice.
There are other systems which do this, though. This looks like an improvement on the LK tracker in OpenCV.
This could be used to handle focus follow in video cameras. Many newer video cameras recognize faces as focus targets, but don't stay locked onto the same face. A better lock-on mechanism would help.
That was a very nice demonstration and well done to Zdenek Kalal. That said, there's a bunch of trackers out there and what I find is that none of them do well in a noisy environment where there's a bunch of similar items. Security cameras have to work in the rain, snow, fog, low light conditions. So Zdenek, if you are listening, how real-world can you go with this?
Shouldn't we be developing AI to use two? I mean, we have two eyes (most of us, condolences to those who do not, no disrespect intended) and we recognize objects, depth of field and rates of change within three dimensions, using them.
A feeling of having made the same mistake before: Deja Foobar
Oh... I know. "Predator." That's not a loaded, terrifying term at all.
This isn't a breakthrough. Much of the technology for tracking objects in this way has been out for about a decade. See this Wikipedia article for one technique for doing this:
http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
This runs inside of, or is somehow dependent on, Matlab. I know little about Matlab, but I find it somewhat odd that it cannot be implemented as a standalone application. What magic of Matlab cannot be easily reproduced to make this a standalone app?
Ok guys, seriously. This isn't good news. This is just one more step towards the inevitable and eternal oppression of the majority of humanity through automation.
@1:40 "Usefull for disabeled people"...
But apparently not useful for PhD students who can't spell
How usefully open-source can it be with a commercial library requirement?
-- stream of did I lock the front door consciousness
You point it at a hall of mirrors?
This video was uploaded in January, and it's on slashdot NOW?
Bloody stupid name if you ask me ...
I mean they even spelled it properly and everything.
Do not meddle in the affairs of geeks for they are subtle and quick to anger
Kinect is how you feed data to an image recognition/tracking algorithm, Predator is that algorithm. The software side of Kinect has support for efficiently tracking items, but that is so you have the most CPU left for a game. That was the trade-off.
Kinect hardware can do something very useful that Predator can't -- it can tell how far away something is (and thus, judge position or size more accurately).
The Predator algorithm (and other ones no doubt under development) using the two sets of data from a Kinect camera will still be superior to an algorithm using just one set of data.
How is showing it the object at different angles and different lighting conditions not a training phase?
As a person who does research on object tracking on a daily basis, and who has seen the implementations and performance of many trackers (including this one) on real-world problems (including gaming), I can say this is nowhere near a new approach, nor one that outperforms many others published at recent computer vision conferences.
From TFA:
"It is true that it isn't a complete body tracker, but extending it do this shouldn't be difficult."
Going from this to body tracking is a HUGE step; it's not an easy thing to do. I don't know why there is such strange hype around this one. It's coming up on many websites, yet as I said it is not a great tracker.
czechs, being generally weak and defenseless, have a jealousy hard-on for the american military system
Predator outdoes everything and everyone except for Arnold Schwarzenegger.
"Moley Mole" was the first name that came to me when watching the video.
Beam me up, Scotty.
Now why did you have to go and say that? Don't you know they hate it when you tell them what they're supposed to do?
Wouldn't be surprised if the robot uprising took place tonight. At least, I know who pushed them over the edge.
WARNING: Smartphones have side effects--most of them undocumented.
but I'm just going to take it to Fark.
Giving an initial starting point and/or tracing the object amounts to pure cheating. This is going backwards in time in terms of evolution of A.I. The object to track (a scale/rot/trans-invariant shapes database) and its initial starting point on the picture is a big problem in Computer Vision. This software ain't a breakthrough if the computer can't track a known object by itself.
No - wrong on all counts.
- Kinect doesn't have stereo cameras (it has an IR camera for depth perception and a visible light camera for other usage)
- Kinect doesn't use the visible light camera for body recognition. Recognition is based on the depth map provided by the IR grid projector and IR camera.
- Kinect doesn't operate like a laser rangefinder (it operates via structured light displacements, not via light pulse reflection times)
- Kinect doesn't track a wireframe (it tracks independent body parts)
How you got modded as "4 - informative" is beyond me. The blind leading the blind.
The way Kinect works is by projecting a dense evenly-spaced grid of IR dots (i.e. structured light) on the scene, then using its IR camera (horizontally offset from the grid projector) to pick up the reflected dot pattern.
Due to depth differences in the scene, and the offset of the IR camera from the IR projector, the reflected dot pattern is not evenly spaced - the dots are horizontally displaced based on depth. To understand this, consider shining two parallel beams of light at a) a flat surface, and b) a surface angled at 45 degrees away from the light source. If you took a step sideways away from the light beams and looked at their reflections off the two surfaces, the dot (beam) separation on the flat surface would be the same as the true beam separation, but the dot separation on the angled surface would be increased by an amount you could calculate using simple trig.
In order to operate in real-time with low cost, a dedicated chip processes the IR camera image and converts the dot displacements into the corresponding depth map.
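The "simple trig" above can be sketched roughly as follows. This is a minimal illustration, assuming an idealized pinhole camera, a projector offset horizontally by a known baseline, and a dot shift measured relative to where the dot would land for a surface at infinity. The function name and the numbers (580 px focal length, 7.5 cm baseline) are made up for the example; they are not Kinect's calibration, and this is not what its dedicated chip actually runs.

```python
def depth_from_displacement(focal_px, baseline_m, displacement_px):
    """Triangulated depth in metres for one projected dot.

    Assumes an ideal pinhole camera and a projector offset horizontally
    by baseline_m (hypothetical values, not Kinect's real calibration).
    """
    if displacement_px <= 0:
        raise ValueError("displacement must be positive")
    # Similar triangles: depth is inversely proportional to the dot shift.
    return focal_px * baseline_m / displacement_px


# A larger horizontal shift means the surface is closer to the sensor.
near = depth_from_displacement(580.0, 0.075, 40.0)
far = depth_from_displacement(580.0, 0.075, 10.0)
```

Repeating this for every dot in the grid is what would give you a depth map under these assumptions.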
The clever, and somewhat counter-intuitive, part is how Kinect then turns this depth map into a body part map. The basic idea is that it probabilistically maps local clusters of depths to body parts (via having been trained on a huge manually body-part-labelled image set), then converts these local probabilities into larger scale body part labels (i.e. if 60% of the local clusters in a region say "hand", then the region is labelled as a hand). This way it doesn't track overall body position or a wireframe, but rather independently tracks body parts (which is why it has no trouble correctly tracking multiple partially occluded people in frame).
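The "60% of the local clusters say hand" step can be pictured as a simple vote. The sketch below is only a toy version of that idea, assuming each local cluster has already been given a single best-guess label. The function name and the 0.6 threshold are illustrative assumptions, not Kinect's actual pipeline.

```python
from collections import Counter


def label_region(cluster_labels, threshold=0.6):
    """Return the majority label if its share meets the threshold, else None.

    cluster_labels holds one best-guess body-part label per local cluster
    (a hypothetical input format chosen for this example).
    """
    if not cluster_labels:
        return None
    label, count = Counter(cluster_labels).most_common(1)[0]
    return label if count / len(cluster_labels) >= threshold else None


# Six of ten clusters vote "hand", so the region is labelled a hand.
region = label_region(["hand"] * 6 + ["forearm"] * 4)
```

Because each region is voted on separately, nothing here depends on seeing the whole body, which fits the point about partially occluded people.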
...the software learns you!
I8-D
Warning: Goatse ahead.
Er, but did you actually look at the image? It's not goatse itself, it's an advert for Audi cars that's appeared on public billboards throughout the UK(!)
Spotted the similarity myself instantly, but having checked online it appears that lots of other people in the UK spotted and uploaded the same poster as well. Not surprised, more surprising that someone at the ad agency didn't spot it!
Ambulatory across a variety of terrain - CHECK
Wireless encrypted connectivity to a central hub - CHECK
Interchangeable weapons - CHECK
Perception of humanity as threat - STILL WAITING ON AI (but won't be long after)
Advanced tracking - CHECK
Get your popcorn ready, should be a helluva show!
Wow! You are of course correct, I have been trolled! I feel pretty stupid now.. the post was probably crafted with that URL to catch people out!
Sorry folks... post redacted!
This tagline was transcoded to result in at least one smirk. If you experience failure to smirk, please consult your Gen
"All you have to do is draw a box around the object you want to track and the software learns what it looks like at different angles and under different lighting conditions as it tracks it. This means no training phase — you show it the object and it tracks it."
Um, what?
Manually labeling something and giving the software the opportunity to learn the pattern in the box... that right there is training. The software is trained on one set of pixels so it can continue to detect similar groupings of pixels in future image frames.
Someone needs to go study their old Pattern Recognition 101 notes...
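To make the "the box is the training data" point concrete, here is a minimal sketch, assuming greyscale frames stored as lists of rows. It memorises the pixels inside a user-drawn box and then looks for the closest-matching patch in a later frame using the sum of squared differences. This is plain template matching, swapped in for illustration; it is not the Predator algorithm, which also updates its model as it tracks. All names are hypothetical.

```python
def crop(frame, top, left, height, width):
    """Cut a height x width patch out of a frame (list of pixel rows)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def ssd(a, b):
    """Sum of squared pixel differences between two equal-sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def find_patch(frame, template):
    """Return (top, left) of the patch in frame closest to the template."""
    h, w = len(template), len(template[0])
    best = None
    for top in range(len(frame) - h + 1):
        for left in range(len(frame[0]) - w + 1):
            score = ssd(crop(frame, top, left, h, w), template)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]


frame1 = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
# "Training": the user draws a 2x2 box at row 1, column 1 of the first frame.
template = crop(frame1, 1, 1, 2, 2)

frame2 = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
]
# "Tracking": the same pixel grouping is found one step down and right.
position = find_patch(frame2, template)
```

The labelled box plays exactly the role of a one-example training set here, which is the objection being made.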
These "neural networks" and other algorithms to teach recognition to computers aren't AI. Worse: it isn't bringing us any closer to AI than what we had 20 years ago... Or the definition of "AI" is a very, very, sad one.
I work at a karaoke bar. http://www.justin.tv/7bamboo I'd really like to use this to track singers as they move around the room, or have spotlights follow.
IAMA Computer vision and robotics researcher and I find the Predator video very overhyped. This is nothing new.
As mentioned by previous posts, tracking with ongoing model updates isn't new. It has been around pretty much since the information filter (Kalman, condensation etc) days. As a fellow researcher who has made my own tracking videos, I noted the following:
1) The system does not seem to deal with rotation well (lost track of 3-fingered window when it was rotated). Generally, the system seems to be scale and translation robust, but not rotation.
2) It is a single hypothesis tracker, which means temporal occlusions and multiple similar targets will cause the system to diverge.
3) No failure cases were shown, which usually indicates a lack of testing. Everything has weaknesses (including our own vision system); where are the failure modes for the system?
Otherwise, I like the fact that it works practically and source code is available. However, claiming that this is better than the Kinect is ludicrous as it provides no 3D information (initial object scale not known with a single camera!) and can't tell the difference between the photo of a face and an actual 3D face (whereas the Kinect at least has a shot at this).
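For anyone who hasn't met the Kalman filter mentioned above, a minimal one-dimensional sketch follows, assuming a constant-position model with made-up noise values. It keeps a single estimate and a single variance, which is what "single hypothesis" means in point 2: there is nowhere to store a second candidate when two similar targets appear. This illustrates the older filtering idea only; it is not code from Predator.

```python
def kalman_step(estimate, variance, measurement,
                process_var=1e-2, measurement_var=1.0):
    """One predict/update cycle of a 1D Kalman filter.

    process_var and measurement_var are illustrative noise assumptions.
    """
    # Predict: the target is assumed to stay put, but uncertainty grows.
    variance = variance + process_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = variance / (variance + measurement_var)
    estimate = estimate + gain * (measurement - estimate)
    variance = (1.0 - gain) * variance
    return estimate, variance


# Start at 0 with high uncertainty, then feed noisy readings near 1.0.
estimate, variance = 0.0, 1.0
for z in [1.2, 0.9, 1.1, 1.0]:
    estimate, variance = kalman_step(estimate, variance, z)
```

Condensation-style particle filters get around the single-hypothesis limit by carrying many weighted guesses at once, at a higher compute cost.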
Right on!
http://slashdot.org/comments.pl?sid=2082940&cid=35823386 , lmao. Serves you right for trolling others first for no reason Skidborg.
Being the troll you are, I can see how you ask how humans do anything, first of all.
It's clearly, beyond you, & even on forums, understanding that there is another human being on the other end, first. No, you just TROLL OTHERS, as you did myself, here AND ADMIT TO IT ALSO:
http://slashdot.org/comments.pl?sid=2082940&cid=35823386
Says it all, see subject-line, or for the uninitiated? Because for that?? Well...
LOL, "U GOT 'P L A Y E D'" (U played yourself)
APK
P.S.=> See - Skidborg, if you wouldn't troll others here 1st for no good reason at all? You might not have gotten "PLaYeD" as badly, or rather, not at all, instead, so... Try to "drink that in & digest it" troll... apk
LOL "U GOT 'P L A Y E D'" (U played yourself) -> http://slashdot.org/comments.pl?sid=2082940&cid=35823386 where skidborg admits to trolling others 1st & getting his jollies from it, like a sick troll does. Grow up, get help freak - You need it, because LOL "U GOT 'P L A Y E D'" (U played yourself).
The goatse that has goatse in the link is not the true goatse.
/zen
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
the goatse is a lie
I've been watching progress of Yann LeCun on this topic for years now. Cool to finally see an application.
http://www.cs.nyu.edu/~yann/research/objreco/index.html
This article is written like an advertisement. Please help rewrite this article from a neutral point of view. For blatant advertising that would require a fundamental rewrite to become encyclopedic, use {{db-spam}} to mark for speedy deletion.
I hope he can mod this to track eyeball movements and be able to use our eyes as a pointing device like a mouse. This would probably do away with our long dependency on the mouse and probably help a lot of disabled people.
Also if it can track our hand and finger movement we can maybe use it to manipulate screens from a distance. Imagine clicking an icon in your pc screen from a distance. It can be something like the operators in the movie Avatar where they do gestures to manipulate a 3D projection/hologram.
What's with submitters adding their own little opinions before anyone gets a chance to read the articles?
Who are YOU?
This should be modded up... so far up... right up... not just because it's a great comment and this is the third or fourth time I've been back to read it, but because it gave me an excuse to say the above. Thank you, Poster.