Google Researchers Created An Amazing Scene-Rendering AI (arstechnica.com)
Researchers from Google's DeepMind subsidiary have developed deep neural networks that "have a remarkable capacity to understand a scene, represent it in a compact format, and then 'imagine' what the same scene would look like from a perspective the network hasn't seen before," writes Timothy B. Lee via Ars Technica. From the report: A DeepMind team led by Ali Eslami and Danilo Rezende has developed software based on deep neural networks with these same capabilities -- at least for simplified geometric scenes. Given a handful of "snapshots" of a virtual scene, the software -- known as a generative query network (GQN) -- uses a neural network to build a compact mathematical representation of that scene. It then uses that representation to render images of the room from new perspectives -- perspectives the network hasn't seen before.
Under the hood, the GQN is really two different deep neural networks connected together. On the left, the representation network takes in a collection of images representing a scene (together with data about the camera location for each image) and condenses these images down to a compact mathematical representation (essentially a vector of numbers) of the scene as a whole. Then it's the job of the generation network to reverse this process: starting with the vector representing the scene, accepting a camera location as input, and generating an image representing how the scene would look from that angle. The team used the standard machine learning technique of stochastic gradient descent to iteratively improve the two networks. The software feeds some training images into the network, generates an output image, and then observes how much this image diverged from the expected result. [...] If the output doesn't match the desired image, then the software back-propagates the errors, updating the numerical weights on the thousands of neurons to improve the network's performance.
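To make the two-network pipeline described above concrete, here is a minimal sketch in Python with NumPy. It is not DeepMind's code. Everything in it is an assumption made for illustration: the toy linear "renderer" (`A`, `B`, `render`), the dimensions, the learning rate, and the use of single linear layers (`E` for the representation network, `G` for the generation network) in place of deep networks. Summing the per-view encodings into one scene vector is also an assumption of this sketch. The only things taken from the summary are the overall shape (images plus camera poses in, one scene vector, then a query pose in and an image out) and training by stochastic gradient descent with back-propagated errors.

```python
import numpy as np

rng = np.random.default_rng(0)
S_DIM, V_DIM, IMG_DIM, R_DIM, N_VIEWS = 3, 2, 8, 4, 3  # all sizes are hypothetical

# Hypothetical toy "renderer": an image is a linear function of a hidden
# scene vector and a camera pose. Real scenes are nothing like this.
A = 0.5 * rng.standard_normal((IMG_DIM, S_DIM))
B = 0.5 * rng.standard_normal((IMG_DIM, V_DIM))

def render(scene, camera):
    return A @ scene + B @ camera

def sample_task():
    """One training example: a few observed views plus one held-out query view."""
    scene = rng.standard_normal(S_DIM)
    cams = rng.uniform(-1, 1, (N_VIEWS, V_DIM))
    imgs = np.stack([render(scene, c) for c in cams])
    query = rng.uniform(-1, 1, V_DIM)
    return imgs, cams, query, render(scene, query)

# Representation network (E) and generation network (G), each a single
# linear layer here as a stand-in for a deep network.
E = 0.1 * rng.standard_normal((R_DIM, IMG_DIM + V_DIM))
G = 0.1 * rng.standard_normal((IMG_DIM, R_DIM + V_DIM))

def forward(imgs, cams, query):
    # Encode each (image, camera) pair and sum into one scene vector r.
    # Because E is linear, encoding the summed inputs is the same thing.
    x_sum = np.concatenate([imgs, cams], axis=1).sum(axis=0)
    r = E @ x_sum
    # Generate the image for the query camera from r alone.
    g_in = np.concatenate([r, query])
    return G @ g_in, x_sum, g_in

def sgd_step(task, lr=0.003):
    """One stochastic gradient descent step on a single example."""
    global E, G
    imgs, cams, query, target = task
    pred, x_sum, g_in = forward(imgs, cams, query)
    err = pred - target                    # how far the output diverged
    grad_G = np.outer(err, g_in)           # gradient of 0.5*|err|^2 w.r.t. G
    grad_r = G[:, :R_DIM].T @ err          # error back-propagated into r
    grad_E = np.outer(grad_r, x_sum)       # gradient w.r.t. E
    G -= lr * grad_G
    E -= lr * grad_E
    return 0.5 * float(err @ err)

def mean_loss(tasks):
    return float(np.mean([0.5 * np.sum((forward(i, c, q)[0] - t) ** 2)
                          for i, c, q, t in tasks]))

validation = [sample_task() for _ in range(200)]
loss_before = mean_loss(validation)
for _ in range(5000):
    sgd_step(sample_task())
loss_after = mean_loss(validation)
print("mean loss on unseen scenes, before vs. after training:", loss_before, loss_after)
```

The point of the sketch is the data flow: the generator never sees the observed images, only the compact vector `r` and the query camera, so any improvement on held-out scenes has to come from what the representation network packs into `r`.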
Everything that's called "AI" today is just advanced pattern recognition. I hope that the /. editors quit using the term "AI" so frequently. It's a dumb thing to do for a "news for nerds" web site. You might as well talk about "cyber" if you're going to continue to use "AI" for things that are clearly not "AI."
I don't respond to ACs.
No joke. AI-based monitoring software. Two years ago it was worthless. They just replaced the whole team with it. It's not some knee-jerk thing either. They've been testing it for months and it's more accurate than people. That didn't use to be true. It used to be that if you just ran monitoring scripts you were asking for trouble. You needed somebody to watch the script. Not anymore.
The next step here is getting AI to imagine. To think through problems. 20 years from now IT will be gone. The old-timers reading this probably don't care because they'll be retired or dead. Anyone under 50 should take notice. We need to start thinking about a post-work future now. Sure, eventually tech might catch up and employ people... in 80 years. Just remember you're gonna live through those 80 years of joblessness.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
It can reconstitute it into a different perspective of the scene, from an angle that it's never been shown. So its internal representation includes translating a handful of 2D images into a 3D scene, including shapes, surfaces, colours, layout etc, then reconstituting that into a whole new 2D image from an entirely different viewpoint, complete with basic lighting and shadows. It can take a dozen screenshots of a Doom level, assemble that into an internal map, then produce quite decent renderings from any viewpoint and angle within that map.
Now consider that it figured out how to do this all by itself, learning only from a few example images, with no labelling or built-in knowledge of lighting or physics. The generative network used is entirely generalised with no domain-specific knowledge, so it can be applied equally well to e.g. training to position and control a robot arm based only on visual feedback, with considerably simpler neural nets and faster training than previous techniques.
Amazing! Everything is so blurry, it's so realistic!
a major step in AI? I don't mean "we programmed these patterns and it recognizes them"; I mean "we kept feeding patterns in until the program recognized patterns it never saw before". Pattern recognition is one of the first things babies learn. Our AIs might be at that stage, but that's still frighteningly impressive.
Can it color B&W movies better than the ludicrous methods used until now?
Welllll... ... which, of course, is what it's trying to be.
Saying it figured out how to do this by itself is overstating the case a bit. This wasn't done with fully general-purpose neural nets, but with specialized variants. So it's more similar to a specialized sensory node, like, say, the visual cortex + optic nerve.
But to say that it "learned by itself" is only correct if you consciously acknowledge that it was crafted to learn this kind of thing.
I think we've pushed this "anyone can grow up to be president" thing too far.
You're projecting several years forwards from this demonstration. This demonstration only deals with three geometric shapes in different primary colors. That it can be developed into something that does more extensive visualization I accept, but this isn't there yet. You could even be projecting over a decade forwards.
With deep fakes and this kind of "alternate reality" viewpoint - how much longer will it be until we cannot believe a digital image?