Research Highlights How AI Sees and How It Knows What It's Looking At
anguyen8 writes: Deep neural networks (DNNs) trained with Deep Learning have recently produced mind-blowing results in a variety of pattern-recognition tasks, most notably speech recognition, language translation, and recognizing objects in images, where they now perform at near-human levels. But do they see the same way we do? Nope. Researchers recently found that it is easy to produce images that are completely unrecognizable to humans, but that DNNs classify with near-certainty as everyday objects. For example, DNNs look at TV static and declare with 99.99% confidence it is a school bus. An evolutionary algorithm produced the synthetic images by generating pictures and selecting for those that a DNN believed to be an object (i.e. "survival of the school-bus-iest"). The resulting computer-generated images look like modern, abstract art. The pictures also help reveal what DNNs learn to care about when recognizing objects (e.g. a school bus is alternating yellow and black lines, but does not need to have a windshield or wheels), shedding light on the inner workings of these DNN black boxes.
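For anyone wondering what "generating pictures and selecting for those that a DNN believed to be an object" looks like in code, here is a minimal sketch. It is not the researchers' actual algorithm or network: make_toy_classifier is a hypothetical stand-in (a random linear layer plus softmax), and the mutate-and-select loop, the population size, and the mutation rate are all assumptions chosen to keep the example short.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_classifier(n_pixels, n_classes, rng):
    """Hypothetical stand-in for a trained DNN: random linear layer + softmax."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
               for _ in range(n_classes)]

    def classify(image):
        scores = [sum(w * p for w, p in zip(row, image)) for row in weights]
        return softmax(scores)

    return classify


def evolve_fooling_image(classify, target, n_pixels, rng,
                         generations=100, pop_size=20, mutation_rate=0.1):
    """Mutate-and-select loop: keep whichever image the classifier scores highest."""
    best = [rng.random() for _ in range(n_pixels)]  # start from pure noise
    history = [classify(best)[target]]
    for _ in range(generations):
        # Each child copies the current best, re-randomizing a few pixels.
        children = [
            [rng.random() if rng.random() < mutation_rate else p for p in best]
            for _ in range(pop_size)
        ]
        # Elitism: the parent competes too, so confidence never drops.
        best = max(children + [best], key=lambda img: classify(img)[target])
        history.append(classify(best)[target])
    return best, history


rng = random.Random(0)
classify = make_toy_classifier(n_pixels=64, n_classes=10, rng=rng)
image, history = evolve_fooling_image(classify, target=3, n_pixels=64, rng=rng)
print(f"confidence for class 3: {history[0]:.4f} -> {history[-1]:.4f}")
```

The selection step never looks at whether the image resembles anything to a human. It only asks the classifier for its confidence, which is why the winners can end up looking like noise.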
Unfortunately they are wrapped around a tree; just around the corner. Mistook a bee 3 inches from the camera for a school bus.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
makes it seem like the computers are morons. Anything that is black and yellow is a school bus...mmmmm nope.
Reminds me of the reverse OCR tumblr. It generates patterns of squiggles a human could never read but the OCR recognizes as a word.
http://reverseocr.tumblr.com/
idk, these results seem more similar to how humans see than different. When people don't know exactly what they are looking at, the brain just puts in its best guess. People certainly see faces and other familiar objects in TV static. They see Bigfoot in a collection of shadows or a strange angle on a bear. I even feel like I did sort of see a peacock in the one random image labeled peacock. It's sort of like the computer vision version of a Rorschach test.
Unless it's static of an image of a school bus, these things sound utterly useless.
According to TFS, Charlie Brown is a schoolbus.
It's OK, if AI is this stupid, we need not worry about it taking over any time soon.
Lost at C:>. Found at C.
My composter helped me wreck a nice beach.
Here is an article from Vice of all places about this research, from June http://motherboard.vice.com/re...
Research paper here: http://cs.nyu.edu/~zaremba/doc...
Also, a funny video demonstrating the rudimentary nature of Nintendo DS brain training pattern recognition: https://www.youtube.com/watch?...
Then this is also a school bus.
Get free satoshi (Bitcoin) and Dogecoins
And then we also need a Red Forman translation tool to translate the message sent to the A.I.:
"This is static, dumbass!"
I have been assured many, many times by the experts of Slashdot that computers are nowhere near achieving artificial intelligence.
"War makes me sad." - Me
Research Highlights How a Deep Neural Network Trained With Deep Learning Sees and How It Knows What It's Looking At
There, fixed that for you.
Why is using the term "AI" wrong in this headline?
#001: Because industry experts don't agree on what AI is
#010: Because most of the definitions of AI are much broader than what the article is talking about
#011: Because at least one definition of AI says something like "if it exists today, it's not AI" - including "beyond the capability of current computers" or something similar as a defining condition of the term "AI"
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
If the network was trained to always return a "best match" then it's working correctly. To return "no image", it would need to be trained to be able to return that, just like humans are given feedback when there is no image.
I know how they created the images, so I know it's not really an image of a backpack so much as static that has been messed with by someone in Photoshop... however, if you showed me that, backpack would be high on my list of guesses.
That one really does look to me like someone washed out an image of a backpack with static.
"I opened my eyes, and everything went dark again"
So it needs to learn that these exact images are tricks being played on it, so it can safely ignore it. This is exactly what machine learning is. What's the story?
These are computer programs, not artificial intelligences as some have come to think of them. They are simply some charges flipping around in some chips. There is no seeing or recognizing in human terms. We apply all that consciousness crap.
In this case, the neural networks are randomly formed nets that match up a few pixels here and there, then spit out a result. There is no seeing. Increase the complexity a thousand times over and there will still be no seeing, but there might, might, might be less shitty processing with fewer bizarre results.
As John Searle said, brains make minds.
Everything else is just speculating.
Computer learns to pick out salient features to identify images. Then we are shocked that when trained with no supervision the salient features aren’t what we would have chosen.
I see this as a great ah-ha moment. Humans also have visual systems that can be tricked by optical illusions. The patterns presented while seemingly incomprehensible to us make sense to computers for the same reason our optical illusions do to us -- taking short cuts in visual processing that would fire on patterns not often or ever seen in the real world. Which BTW means even as is, this type of visual identification is still useful, since the random images generating false hits aren’t just any random images, but ones that have visual features similar to the targets identified, even if we humans can’t see the similarities or even if they look like white noise.
Now that we know what computers are picking out as salient features, we can modify the algorithms to add constraints on which additional salient features must or must not be in an identified object, so that it corresponds more closely to how humans would classify objects. Baseballs must have curvature, for instance, not just zig-zag red lines on white.
Letter To Iran
I mean an AI that looks at static and says it's a school bus 99.99% of the time seems to be about as broken as could be. The researchers have to be the most optimistic folks in the world if they still think there's a pony in there. I'd be seriously thinking about scrapping the software (or, at least, looking for a bad coding error) and/or looking for an entirely new algorithm after achieving results that bad.
CUR ALLOC 20195.....5804M
I've seen a school bus in porn that neither drives kids to school nor is it owned or operated by one. Next definition.
Yeah, who's to say the AI wasn't just seeing something we can't. Obviously aliens beamed a subliminal picture of a schoolbus into the TV static and the AI said Oh, a schoolbus!
When, Lord?! When the hell do I get to see the goddamn schoolbus?
Someone had to do it.
and never will be
How could you possibly know that?
An electronic switch knows nothing. A massive pile of electronic switches cannot know something.
A neuron knows nothing, and yet a "massive pile" of neurons can know, understand, imagine, lie, cheat, steal, love, hate, and dream.
AI may not be here yet, but it's practically inevitable.
systemd is Roko's Basilisk.
Then it's not a school bus. It might look like a school bus, it might even once have been a school bus, but generally to be a school bus it needs to be used for the purposes of hauling students to and from school, or to be frequently used in that capacity.
Do not look into laser with remaining eye.
A few neurons don't 'know' anything either. Neither do a dozen neurons. Tens of billions of neurons however... Anyway, you'll be the one looking like a complete tard in fifty years when AI is working well and is considered one of mankind's greatest achievements. (Fusion power, however, will still be twenty years away.)
Now all we've learned is that you define school bus in an idiosyncratic way, which already differs from the one I replied to (that definition stipulated that school buses also need to be owned or operated by a school). And if we ask 10 more people, we're going to find 10 more definitions, and I'm sure I can think of counter examples to all of them (for example, your definition would include parent-driven SUVs or any other kind of car frequently used to move children to and from school, which definitely doesn't count). Nevertheless, school buses exist, and there are vehicles that are definitely school buses. Maybe the problem is not with school buses, but with definitions.
I think this distillation by a neural network could also prove useful for making new icons and symbols though. Could prove useful in a reverse application by using them to break down stuff, have a human review it, and modify it back again into something recognizable by us on a more fundamental level.
In the pictures from the last link, I clearly see the gorilla and the backpack.
Those images remind me of what you get with some edge-detection filters commonly used to enhance image features.
Think of the global implications to surrealism!
Official Pi Ambassador -- inquire for details!
Sounds like the no true schoolbus fallacy.
One of you two will look like a tard. My money would be on you.
I've done some image processing work. It seems to me that you can take the output of this neural network and correlate it with some other image processing routines, like feature detection, feature metrology, a conditional-probability-based decision chain, etc.
I work at a start-up that makes 3D laser-radar (LIDAR) vision sensors for robotics and autonomous-vehicle collision avoidance. The other day, I learned that such sensors allow robots to augment their camera vision systems to get a better understanding of their environment. It turns out that it's still an unsolved problem for a computer vision system to unambiguously recognize that it's looking at a bird or a cat; it can only give you probabilities. A LIDAR sensor instantly gives you a depth measurement out to several hundred meters that you can correlate your images to. The computer can combine the color information with the depth information to get a much better idea of what it's looking at. A collision-avoidance system has to be certain what it's looking at, and cameras alone aren't good enough. I find it pretty exciting to be working on something that is useful for AI (artificial intelligence) research. One guy I work with got his Ph.D. using Microsoft's Kinect sensor, which gives robots depth perception for close-up environments.
“In the 60s, Marvin Minsky (a well known AI researcher from MIT, whom Isaac Asimov considered one of the smartest people he ever met) assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.”
http://imgs.xkcd.com/comics/ta...
Machines got a lot of imagination, don't they? Next thing you know you'll be looking at the clouds with your robot buddy and it'll say "99.99% chance of that cloud looking like a puppy. BEEP". Oooorrrrr maybe a school bus, but you get what I mean.
Oh right, I forgot this is Slashdot. MACHINES WILL DOMINATE US HELP. Peasants. Not like this display of reality will stop the rampant paranoia of people who work with computers and machines all day long...
ironic.
The DNN examples were apparently trained to discriminate between members of a labeled set. This only works when you have already cleaned up the input stream (a priori) and can guarantee that the image must be an example of one of the classes.
These classifiers were not trained on samples from outside the target set. This causes a forced choice: given this random dot image, which of the classes has the highest confidence? Iterate until confidence is sufficiently high, and you have a forgery with the same features the classifier is looking for.
For example, the digit training set (0,1,2...9) would need to be augmented with pictures of 'A', 'D', a smiley face, a doodle of a tree, a silhouette of Alfred Hitchcock and some spider webs. The resulting classifier would be more robust. The target classes (0,1,2,...9) would be counterbalanced with a null class (everything else). Looking inside the receptive fields of a robust image classifier is rather satisfying: you will find eigenimages that project back to image structures that are human recognizable, too.
The lesson in training your classifier is to either verify your assumption (all incoming samples must be a member of the chosen classes) or train (expose) your classifier to out-of-class samples.
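To make the forced-choice point concrete, here is a small sketch. The scores are made up for illustration (they do not come from any trained network), and the value given to the "none" class is an assumption about what a classifier exposed to out-of-class samples might output.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the highest-probability label and its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


digits = [str(d) for d in range(10)]

# Made-up scores for an out-of-class input (say, a smiley face).
smiley_scores = [0.1, 0.3, 0.2, 2.5, 0.0, 0.1, 0.4, 0.2, 0.9, 0.3]

# Forced choice: the probabilities sum to 1 over the digits, so some digit wins.
forced_label, forced_conf = classify(smiley_scores, digits)

# With a null class, a network trained on out-of-class samples could score
# "none" highest. The 4.0 here is an assumed value, not a measured one.
null_label, null_conf = classify(smiley_scores + [4.0], digits + ["none"])

print(forced_label, round(forced_conf, 3))
print(null_label, round(null_conf, 3))
```

Without the extra class, the softmax has nowhere to put its probability mass except on a digit. The null class only helps if the training set actually contains out-of-class examples, as described above.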
Cool. Image recognition is far further along than I thought. It makes the same type of mistakes as humans although in a different way.
We humans see faces in everything. Smoke, clouds and static for example. This just means that this is inherent in the attempt of recognition.
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
Since the core of this story is fooling a DNN rather than image recognition, I wonder whether the same exercise could be repeated with DNNs tasked to recognize human behavior and build digital profiles of humans based on, for example, browsing habits, keywords in online communication, movement in space, etc. What does a white noise terrorist look like? What would be its indirectly encoded best representation? We tend to be scared of digital profiling because we believe that our digital representation actually looks like us. But does it really?
The DNN examples were apparently trained to discriminate between members of a labeled set. This only works when you have already cleaned up the input stream (a priori) and can guarantee that the image must be an example of one of the classes.
These classifiers were not trained on samples from outside the target set.
This is not some network hastily trained by people who are ignorant of a very basic and long-known problem: "Clune used one of the best DNNs, called AlexNet, created by researchers at the University of Toronto, Canada, in 2012 – its performance is so impressive that Google hired them last year." From a paper by the developers of AlexNet: "To reduce overfitting in the globally connected layers we employed a new regularization method that proved to be very effective."
It does not seem plausible that this result can be explained away as an elementary mistake.
Except that a person has free will to self-identify, at least to an extent. There can be obvious delusion like Ugandan President Idi Amin, but it's fairly easy to say that a man born and/or raised in Scotland and who self-identifies with the culture of Scotland is probably a Scotsman, and even those men that don't self-identify but whose cultural perspectives derive from an upbringing in Scotland are still Scotsmen whether they want to be or not. Craig Ferguson holds American citizenship, but he's a Scotsman. John Barrowman is known as an American actor to American audiences, and even to most audiences in the UK, but he was born and raised in Scotland and speaks with a Scottish accent equally comfortably with his later-learned American accent.
Ceci n'est pas une pipe.
I guess if your dictionary said "(to) thrown" is a verb, then yes, you ought to throw it out.
Far from showing weakness, this study seems to demonstrate a creatively brilliant algorithm. These are very, very strong results. I am deeply impressed.
Text recognition in white noise can be fixed with virtual saccades.
Aside from adding "human" sensibilities (do we want it to recognize objects only in real, photo-realistic settings, and not drawings / art?), I would say it's good to go.