Microsoft's New AI Mistakenly Identifies Photos, Ignores Hitler (mashable.com)
An anonymous reader writes: Microsoft's newest online AI, CaptionBot, tries to identify what's in an uploaded photo, using two recognition APIs recently released by Microsoft Cognitive Services for app developers: "Computer Vision" and "Emotion". But while Microsoft brags that their AI "can understand thousands of objects, as well as the relationships between them," bloggers are also sharing funny examples of CaptionBot's many mistakes. While it correctly identified Bea Arthur, Ozzy Osbourne and Joan Jett, and a movie poster with Arnold Schwarzenegger, it mistakenly identified Gene Simmons of KISS as "a woman in a red jacket...sitting on a motorcycle," described a wedding dress as "a cat wearing a tie," mistook Michelle Obama for a cellphone, and described one man's Twitter avatar as "a close up of two giraffes near a tree."
But CNNMoney reports that the AI is apparently programmed to ignore all images of Hitler and other Nazi symbolism (as well as Osama bin Laden), reporting that Microsoft's AI "often came back with 'I really can't describe the picture' and a confused emoji. It did, however, identify other Nazi leaders like Joseph Mengele and Joseph Goebbels."
They don't want another nazi-bot
Microsoft's AI keeps embarrassing them. It's like they thought their corporate image problem from being a ham-handed OS monopoly wasn't big enough: they needed to automate gaffes.
Table-ized A.I.
Both Microsoft and Google's varieties are rather fun.
The key to CaptionBot is to feed it lots of images, and always give 1 star when it's spot on and 5 when it's most ridiculously wrong. Over time, it "improves".
I'm reminded that about half of Slashdotters are afraid that AI like this will put them out of a job soon. The other half of Slashdotters can tell the difference between a cell phone and the first lady, so they won't be replaced by Microsoft software.
On the other hand, 15% of Slashdot readers can't tell the difference between Obama and Hitler, which this AI can do.
It is trivially easy to get an instant mod-up on Slashdot by pointing to Microsoft's AI's occasional mistakes and not its successes. But most of the time Microsoft's AI seems to be getting it right. If you have something better, put it up where we can see it.
They don't have any problem identifying photos of Hitler as Hitler. The problem is false positives: if the software mistook a photo of some living person for Hitler, and that was somehow published, that person would not be happy, and might start a lawsuit.
Problem is easily solved by telling the software "if you think it is Hitler, you say you don't recognise it". There was a case a while ago where some photo analysis software mistook a woman for a gorilla. Highly embarrassing for everyone involved.
I would think that software makers would nowadays add precautions to make particularly embarrassing mistakes less likely. (Mistaking a gorilla for a woman is no big deal; the other way round is very bad.)
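A minimal sketch of that "if you think it is Hitler, say you don't recognise it" rule. The label names, the `caption_for` helper and the shape of the predictions are all made up for illustration; nothing here is known to match what Microsoft actually does.

```python
# Hypothetical blocklist; the real service's labels and rules are not public.
BLOCKED_LABELS = {"Adolf Hitler", "Osama bin Laden"}
FALLBACK = "I really can't describe the picture"

def caption_for(predictions):
    """predictions: list of (label, confidence) pairs, in any order."""
    if not predictions:
        return FALLBACK
    # Take the single most confident guess.
    label, _confidence = max(predictions, key=lambda p: p[1])
    # Refuse whenever the top guess is blocked, however confident it is,
    # so a false positive on a living person is never published.
    if label in BLOCKED_LABELS:
        return FALLBACK
    return f"I think it's {label}"
```

The cost is that genuine photos of the blocked people are also refused, which is the behaviour the summary describes.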
We're not there yet but this effort by Microsoft is, IMHO, as smart as a mouse.
Mice are pretty smart; I'd argue that the current AIs are at insect level of "intelligence".
What's obvious from these results is that the AI has no idea what it's looking at. This is typical for a trained neural net: it finds the best matching pattern in an image, and maps that to one of its output categories. It draws no distinction between a random black-and-white blob and a penguin, so long as they match the pattern.
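To make that concrete, here is a toy sketch of the last step of such a classifier. The category list and the raw scores are invented; a real net computes the scores from pixels. The point is only that "pick the best match" always returns some category, even when nothing matches well.

```python
import math

# Hypothetical output categories, for illustration only.
CATEGORIES = ["penguin", "giraffe", "cellphone"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the best-matching category and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs[best]
```

With scores like `[0.01, 0.0, 0.0]` (a blob that barely resembles anything), `classify` still answers "penguin", just with a probability close to one third. Unless something downstream checks that probability, the blob and the real penguin get the same caption.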
A mouse, and true AI, will have spatial understanding. They will (intuitively) know that the images represent objects in space, and will be able to recreate a coarse 3D model of what they see. Then they will break down the scene into basic features, and identify it based on those features. They might say: hey, these blobs remind me of a penguin, but will never say that they *are* a penguin, because the blobs lack the beak and eyes and flippers and feet.
Basically, what we have now are the neural nets we already had 50 years ago, only on much faster hardware, combined with a bot and a web search engine. It's basically ELIZA on steroids, but still a long long way from actual intelligence.