Research Highlights How AI Sees and How It Knows What It's Looking At
anguyen8 writes Deep neural networks (DNNs) trained with Deep Learning have recently produced mind-blowing results in a variety of pattern-recognition tasks, most notably speech recognition, language translation, and recognizing objects in images, where they now perform at near-human levels. But do they see the same way we do? Nope. Researchers recently found that it is easy to produce images that are completely unrecognizable to humans, but that DNNs classify with near-certainty as everyday objects. For example, DNNs look at TV static and declare with 99.99% confidence it is a school bus. An evolutionary algorithm produced the synthetic images by generating pictures and selecting for those that a DNN believed to be an object (i.e. "survival of the school-bus-iest"). The resulting computer-generated images look like modern, abstract art. The pictures also help reveal what DNNs learn to care about when recognizing objects (e.g. a school bus is alternating yellow and black lines, but does not need to have a windshield or wheels), shedding light into the inner workings of these DNN black boxes.
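To make the "generate pictures and select for the ones the network likes" loop concrete, here is a minimal sketch. It is not the researchers' actual algorithm or network: the "classifier" is a hypothetical random linear layer with a softmax on top, and the search is a plain elitist hill climber. All names and parameters (make_toy_classifier, mutate, the 64-pixel image, the mutation rate) are assumptions made for illustration.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_toy_classifier(n_classes, n_pixels, rng):
    # Hypothetical stand-in for a trained DNN: one random linear layer.
    return [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
            for _ in range(n_classes)]

def confidence(weights, image, target):
    # Softmax probability the toy classifier assigns to the target class.
    logits = [sum(w * x for w, x in zip(row, image)) for row in weights]
    return softmax(logits)[target]

def mutate(image, rng, rate=0.1):
    # Resample each pixel with probability `rate`; values stay in [0, 1).
    return [rng.random() if rng.random() < rate else x for x in image]

def evolve(weights, target, n_pixels, rng, generations=200, offspring=20):
    best = [rng.random() for _ in range(n_pixels)]  # start from "static"
    start_conf = best_conf = confidence(weights, best, target)
    for _ in range(generations):
        for _ in range(offspring):
            child = mutate(best, rng)
            c = confidence(weights, child, target)
            if c > best_conf:  # keep a child only if the classifier likes it more
                best, best_conf = child, c
    return best, start_conf, best_conf

rng = random.Random(0)
weights = make_toy_classifier(n_classes=3, n_pixels=64, rng=rng)
image, start_conf, final_conf = evolve(weights, target=0, n_pixels=64, rng=rng)
print(f"confidence before: {start_conf:.4f}, after: {final_conf:.4f}")
```

Nothing in the loop asks whether the image looks like anything to a person. The only selection pressure is the classifier's own confidence, which is why the winners can end up unrecognizable to us.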
Unfortunately they are wrapped around a tree; just around the corner. Mistook a bee 3 inches from the camera for a school bus.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
makes it seem like the computers are morons. Anything that is black and yellow is a school bus...mmmmm nope.
Reminds me of the reverse OCR tumblr. It generates patterns of squiggles a human could never read but the OCR recognizes as a word.
http://reverseocr.tumblr.com/
Friends, I ask you, "what is a school bus?" Can anyone truly say what a school bus is?
idk, these results seem more similar to how humans see than different. When people don't know exactly what they are looking at, the brain just puts in its best guess. People certainly see faces and other familiar objects in TV static. They see Bigfoot in a collection of shadows or a strange angle on a bear. I even feel like I did sort of see a peacock in the one random image labeled peacock. It's sort of like the computer vision version of a Rorschach test.
Unless it's static of an image of a school bus, these things sound utterly useless.
According to TFS, Charlie Brown is a schoolbus.
It's OK, if AI is this stupid, we need not worry about it taking over any time soon.
Lost at C:>. Found at C.
My composter helped me wreck a nice beach.
Here is an article from Vice of all places about this research, from June http://motherboard.vice.com/re...
Research paper here: http://cs.nyu.edu/~zaremba/doc...
Also, a funny video demonstrating the rudimental nature of nintendo ds brain training pattern recognition: https://www.youtube.com/watch?...
What we need is also a statistical analysis tool separate from the machine vision neural net that says to the AI:
"Dumbass this is static"
Interesting way of communicating an actual failure as a success. While these algorithms reportedly detect a few things correctly (speech, objects in images...), they also "see things" that do not exist, much like certain psychiatric patients do. In this sense they are a giant leap forward, but an important objective remains to be achieved: distinguishing between reality and illusion.
Then this is also a school bus.
I have been assured many, many times by the experts of Slashdot that computers are nowhere near achieving artificial intelligence.
"War makes me sad." - Me
Research Highlights How a Deep Neural Network Trained With Deep Learning Sees and How It Knows What It's Looking At
There, fixed that for you.
Why is using the term "AI" wrong in this headline?
#001: Because industry experts don't agree on what AI is
#010: Because most of the definitions of AI are much broader than what the article is talking about
#011: Because at least one definition of AI says something like "if it exists today, it's not AI" - including "beyond the capability of current computers" or something similar as a defining condition of the term "AI"
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
If the network was trained to always return a "best match" then it's working correctly. To return "no image", it would need to be trained to be able to return that, just like humans are given feedback when there is no image.
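A small sketch of that "forced best match" behavior, assuming the classifier ends in a softmax layer (the label names and logit values below are made up for illustration). Softmax probabilities always sum to 1, so some label wins even when the evidence for every class is weak. Abstaining only becomes possible if there is an explicit extra output that the network was trained to use.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels):
    # Always returns the highest-probability label: a forced choice.
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

LABELS = ["school bus", "peacock", "backpack"]  # hypothetical label set

# Weak, nearly equal evidence for every class: a label is still returned.
forced_label, forced_p = classify([0.2, 0.1, 0.0], LABELS)

# Same evidence, plus a hypothetical "no object" output the network would
# have to be trained to activate on junk input.
open_label, open_p = classify([0.2, 0.1, 0.0, 3.0], LABELS + ["no object"])

print(forced_label, round(forced_p, 3))
print(open_label, round(open_p, 3))
```

The extra output only helps if training actually teaches the network when to fire it. Adding the label by itself changes nothing.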
I know how they created the images, so I know it's not really an image of a backpack so much as static that has been messed with by someone in Photoshop. However, if you showed me that, backpack would be high on my list of guesses.
That one really does look to me like someone washed out an image of a backpack with static.
"I opened my eyes, and everything went dark again"
So it needs to learn that these exact images are tricks being played on it, so it can safely ignore them. This is exactly what machine learning is. What's the story?
These are computer programs, not artificial intelligences as some have come to think of them. They are simply some charges flipping around in some chips. There is no seeing or recognizing in human terms. We apply all that consciousness crap.
In this case, the neural networks are randomly formed nets that match up a few pixels here and there then spit out a result. There is no seeing. Increase the complexity a thousand times over and there will still be no seeing, but there might, might, might be less shitting processing with fewer bizarre results.
As John Searle said, brains make minds.
Everything else is just speculating.
Computer learns to pick out salient features to identify images. Then we are shocked that when trained with no supervision the salient features aren’t what we would have chosen.
I see this as a great ah-ha moment. Humans also have visual systems that can be tricked by optical illusions. The patterns presented while seemingly incomprehensible to us make sense to computers for the same reason our optical illusions do to us -- taking short cuts in visual processing that would fire on patterns not often or ever seen in the real world. Which BTW means even as is, this type of visual identification is still useful, since the random images generating false hits aren’t just any random images, but ones that have visual features similar to the targets identified, even if we humans can’t see the similarities or even if they look like white noise.
Now that we know what computers are picking out as salient features, we can modify the algorithms to add constraints on which additional salient features must or must not be present in an identified object, so that the results correspond more closely to how humans would classify objects. Baseballs must have curvature, for instance, not just zig-zag red lines on white.
Letter To Iran
I mean an AI that looks at static and says it's a school bus 99.99% of the time seems to be about as broken as could be. The researchers have to be the most optimistic folks in the world if they still think there's a pony in there. I'd be seriously thinking about scrapping the software (or, at least, looking for a bad coding error) and/or looking for an entirely new algorithm after achieving results that bad.
CUR ALLOC 20195.....5804M
There are SIMULATIONS of intelligence, but there is not, and never will be, such a thing as "artificial intelligence".
An electronic switch knows nothing. A massive pile of electronic switches cannot know something. Replacing those switches with vacuum tubes or transistors does not change the equation, it only makes the mess more efficient.
There is a HUGE difference between STORING bits (and sorting and manipulating them) and UNDERSTANDING what those bits ARE and what they REPRESENT. You can teach a system to associate images of balls with the dictionary entry "ball", but the computer will KNOW nothing about balls, not understand balls, not KNOW what balls can be used for, etc., and at best will just have links to other dictionary entries for things like "toy", "rubber", "roll", "bearings", etc., all of which the computer will also not understand. Genetic systems and AI systems are perfectly useful for many tasks, but they have NOTHING in common with actual intelligence.
This is, in fact, what makes the field of "Artificial Intelligence" so very dangerous in the long term: people build these systems and can assign them tasks and easily forget that these systems are not actually intelligent at all, lacking entirely ANY sense of understanding and therefore also lacking anything like "common sense" and morals. You might THINK your AI system is flying your plane because it understands flight and weather, etc., but it might be judging what to do with the ailerons based on a pile of statistical alignments of sky color tones, accelerometer readings influenced by the movement of passengers and flight attendants, and some collection of data about the cargo manifest, all of which might be just right on the first 1000 flights before being just a little wrong on flight 1001...
I guess it's time for me to throw out my dictionary; all words seem to have changed their meaning.
I think this distillation by a neural network could also prove useful for making new icons and symbols though. Could prove useful in a reverse application by using them to break down stuff, have a human review it, and modify it back again into something recognizable by us on a more fundamental level.
In the pictures from the last link, I clearly see the gorilla and the backpack.
Those images remind me of what you get with some edge-detection filters commonly used to enhance image features.
Think of the global implications to surrealism!
Official Pi Ambassador -- inquire for details!
I've done some image processing work. It seems to me that you can take the output of this neural network and correlate it with some other image processing routines, like feature detection, feature metrology, etc., in a decision chain based on conditional probabilities.
I work at a start-up that makes 3D laser-radar (LIDAR) vision sensors for robotics and for collision avoidance in autonomous vehicles. The other day, I learned that such sensors allow robots to augment their camera vision systems to get a better understanding of their environment. It turns out that it's still an unsolved problem for a computer vision system to unambiguously recognize that it's looking at a bird or a cat; it can only give you probabilities. A LIDAR sensor instantly gives you a depth measurement out to several hundred meters that you can correlate your images to. The computer can combine the color information with the depth information to get a much better idea of what it's looking at. A collision-avoidance system has to be certain what it's looking at, and cameras alone aren't good enough. I find it pretty exciting to be working on something that is useful for AI (artificial intelligence) research. One guy I work with got his Ph.D. using Microsoft's Kinect sensor, which gives robots depth perception for close-up environments.
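One way to picture combining camera probabilities with a depth reading is a simple Bayesian update. The sketch below is an illustration only, not how any particular sensor stack works: the class list, typical object sizes, focal length, and Gaussian size model are all invented for the example. It uses the pinhole-camera relation (physical size = pixel extent × depth / focal length in pixels) to turn a depth measurement into a size estimate, then reweights the camera's class probabilities by how plausible that size is for each class.

```python
import math

def physical_size(pixel_extent, depth_m, focal_length_px):
    # Pinhole camera model: size = pixel extent * depth / focal length.
    return pixel_extent * depth_m / focal_length_px

def gaussian_likelihood(x, mean, std):
    # Unnormalized Gaussian; fine here because every class shares one std.
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

def fuse(camera_probs, size_m, typical_sizes, std=0.1):
    # Bayes' rule: posterior is proportional to prior (camera) * likelihood (size).
    unnorm = {
        label: camera_probs[label]
        * gaussian_likelihood(size_m, typical_sizes[label], std)
        for label in camera_probs
    }
    total = sum(unnorm.values())
    return {label: v / total for label, v in unnorm.items()}

# Hypothetical numbers: the camera alone cannot decide between the two.
camera_probs = {"bird": 0.5, "cat": 0.5}
typical_sizes = {"bird": 0.2, "cat": 0.5}  # metres, assumed for the example

size = physical_size(pixel_extent=100, depth_m=5.0, focal_length_px=1000.0)
posterior = fuse(camera_probs, size, typical_sizes)
print(size, posterior)
```

With these made-up numbers the object works out to about half a metre across, so the depth reading breaks the camera's tie in favour of "cat".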
“In the 60s, Marvin Minsky (a well known AI researcher from MIT, whom Isaac Asimov considered one of the smartest people he ever met) assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.”
http://imgs.xkcd.com/comics/ta...
Well, this isn't going to work as a steganographic method. Everyone out there can implement the same DNN. Hmmm, and you mean you can get a paper/PhD/MSc with this?
It's a part of intelligence and it is artificial. It is not an "AI" as an entity, but it definitely is Artificial Intelligence. Every "AI" in the mythological sense you seem to think of will have to use such or similar mechanisms for seeing, just as we do.
And I see no reason why you shouldn't be able to engineer more parts of an "AI" by these or similar technologies. At some point you'll find it to be harder and harder to say if there's something that REALLY sees and thinks or not. The trouble with AI is not only that we have only poor tools to engineer one, but also that we only have a very rough idea of what "intelligence" actually is or should mean. Trying to recreate it or parts of it is definitely a great way to learn more about it. Also, it's perfectly possible that we can take approaches that biology just can't take for some reason. Like the wheel that nature didn't come up with and still it allows us to move faster than any animal could.
"Brains make minds", yes. They will, sooner or later. There are just too many tempting applications for that, we're constantly making tools to do things easier for us and thinking is really hard for many people! It also scales only badly, if at all. Very unreliable too.
And /.: Your captchas are pathetic. I've read books that were harder to read.
Machines got a lot of imagination, don't they? Next thing you know you'll be looking at the clouds with your robot buddy and it'll say "99.99% chance of that cloud looking like a puppy. BEEP". Oooorrrrr maybe a school bus, but you get what I mean.
Oh right, I forgot this is Slashdot. MACHINES WILL DOMINATE US HELP. Peasants. Not like this display of reality will stop the rampant paranoia of people who work with computers and machines all day long...
ironic.
The DNN examples were apparently trained to discriminate between members of a labeled set. This only works when you have already cleaned up the input stream (a priori) and guarantee that the image must be an example of one of the classes.
These classifiers were not trained on samples from outside the target set. This causes a forced choice: given this random dot image, which of the classes has the highest confidence? Iterate until confidence is sufficiently high, and you have a forgery with the same features the classifier is looking for.
For example, the digit training set (0,1,2...9) would need to be augmented with pictures of 'A', 'D', a smiley face, a doodle of a tree, a silhouette of Alfred Hitchcock and some spider webs. The resulting classifier would be more robust. The target classes (0,1,2,...9) would be counterbalanced with a null class (everything else). Looking inside the receptive fields of a robust image classifier is rather satisfying: you will find eigenimages that project back to image structures that are human recognizable, too.
The lesson in training your classifier is to either verify your assumption (all incoming samples must be a member of the chosen classes) or train (expose) your classifier to out-of-class samples.
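A toy sketch of the null-class idea, using a nearest-centroid classifier on made-up 2-D feature vectors instead of a DNN (everything here, including the feature values and the "none" label, is hypothetical). Trained only on digit classes, the classifier is forced to call an out-of-class sample a digit. Once it has been exposed to out-of-class samples grouped under one null label, it can answer "none".

```python
def centroid(points):
    # Component-wise mean of a list of equal-length feature vectors.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(samples):
    # samples: dict mapping label -> list of feature vectors.
    return {label: centroid(pts) for label, pts in samples.items()}

def predict(model, x):
    # Return the label whose centroid is closest to x (squared distance).
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical features for two digit classes.
digits = {
    "0": [(0.9, 0.1), (1.0, 0.0), (1.1, -0.1)],
    "1": [(-1.0, 0.1), (-0.9, 0.0), (-1.1, -0.1)],
}
# Out-of-class examples (letters, doodles, ...) under a single null label.
null_class = {"none": [(0.1, 4.0), (-0.2, 5.0), (0.1, 6.0)]}

closed_model = train(digits)
open_model = train({**digits, **null_class})

query = (0.3, 4.5)  # resembles the out-of-class samples, not a digit
print(predict(closed_model, query))  # forced to pick a digit
print(predict(open_model, query))    # can say "none"
```

A single centroid is a crude model of "everything else", which in practice is not one compact cluster. The sketch only shows why exposure to out-of-class data changes the answer.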
The Figure 1 direct encoding images are recognizable to THIS human. Kudos to the DNN for picking out the small nonrandom areas and identifying them.
If shown an image of a digit that a human would say is clearly identifiable as such, does the DNN assign a p-value of only 99.99% or does it claim FAR higher reliability? like 99.99999999999999999999%?
Is the DNN restricted to considering just digits and not any old object? If the DNN isn't restricted to calling just digits from 0-9, are the p-values still 99.99%? Does it still call a digit over some other object (that's probably not a digit)?
Cool. Image recognition is far further along than I thought. It makes the same type of mistakes as humans although in a different way.
We humans see faces in everything. Smoke, clouds and static for example. This just means that this is inherent in the attempt of recognition.
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
Since the core of this story is fooling a DNN rather than image recognition, I wonder whether the same exercise could be repeated with DNNs tasked to recognize human behavior and build digital profiles of humans based on, for example, browsing habits, keywords in online communication, movement in space, etc. What does a white noise terrorist look like? What would be its indirectly encoded best representation? We tend to be scared of digital profiling because we believe that our digital representation actually looks like us. But does it really?
The DNN examples were apparently trained to discriminate between members of a labeled set. This only works when you have already cleaned up the input stream (a priori) and guarantee that the image must be an example of one of the classes.
These classifiers were not trained on samples from outside the target set.
This is not some network hastily trained by people who are ignorant of a very basic and long-known problem: "Clune used one of the best DNNs, called AlexNet, created by researchers at the University of Toronto, Canada, in 2012 – its performance is so impressive that Google hired them last year." From a paper by the developers of AlexNet: "To reduce overfitting in the globally connected layers we employed a new regularization method that proved to be very effective."
It does not seem plausible that this result can be explained away as an elementary mistake.
A neuron is a complex LIVING thing
There is simply NO SCIENTIFIC EVIDENCE that a non-living thing can actually KNOW anything. In fact, there's not really any scientific proof that lots of neurons are all that is required for intelligence - Living intelligent creatures have and use lots of neurons, BUT that's not precisely the same thing as proof that the one creates the other.
Saying "lots of neurons == intelligence, therefore lots of logic gates == intelligence" is about on par with saying "lots of steel == a ship therefore lots of pudding == a ship"
Neurons are absolutely NOT a drop-in replacement for a logic gate or an OpAmp. Using a neuron harvested from a living creature as part of an electronic circuit is also not proof of anything more than the fact that it can be kludged to work; doing so does not use all of a neuron's capabilities, nor does it use them properly. This is on par with somebody using a laptop computer as a doorstop (it "works", but it's not proof that you know how to use a laptop or that a laptop and a doorstop are the same thing).
Far from showing weakness, this study seems to demonstrate a creatively brilliant algorithm. These are very, very strong results. I am deeply impressed.
Text recognition in white noise can be fixed with virtual saccades.
Aside from adding "human" sensibilities (do we want it to recognize objects only in real, photo-realistic settings, and not in drawings or art?), I would say it's good to go.