Slashdot Mirror


Google Researchers Created An Amazing Scene-Rendering AI (arstechnica.com)

Researchers from Google's DeepMind subsidiary have developed deep neural networks that "have a remarkable capacity to understand a scene, represent it in a compact format, and then 'imagine' what the same scene would look like from a perspective the network hasn't seen before," writes Timothy B. Lee via Ars Technica. From the report: A DeepMind team led by Ali Eslami and Danilo Rezende has developed software based on deep neural networks with these same capabilities -- at least for simplified geometric scenes. Given a handful of "snapshots" of a virtual scene, the software -- known as a generative query network (GQN) -- uses a neural network to build a compact mathematical representation of that scene. It then uses that representation to render images of the room from new perspectives -- perspectives the network hasn't seen before.

Under the hood, the GQN is really two different deep neural networks connected together. The representation network takes in a collection of images representing a scene (together with data about the camera location for each image) and condenses these images down to a compact mathematical representation (essentially a vector of numbers) of the scene as a whole. Then it's the job of the generation network to reverse this process: starting with the vector representing the scene, accepting a camera location as input, and generating an image representing how the scene would look from that angle. The team used the standard machine learning technique of stochastic gradient descent to iteratively improve the two networks. The software feeds some training images into the network, generates an output image, and then observes how much this image diverged from the expected result. [...] If the output doesn't match the desired image, then the software back-propagates the errors, updating the numerical weights on the thousands of neurons to improve the network's performance.
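
The two-network loop described above can be sketched in a few lines of plain Python. This is a toy illustration only, not DeepMind's code: each "network" is a single linear layer, the "scene" is two numbers, and the renderer (`render_truth`, with its `SHIFT` constant) is a made-up stand-in for the simulator that produced the training images. Summing the per-snapshot inputs before the representation layer is also an assumption of this sketch; the summary only says the images are condensed into one vector. All names are hypothetical.

```python
import random

random.seed(0)

PIXELS, REP, OBS = 2, 3, 3   # toy image size, representation size, snapshots per scene
SHIFT = [0.5, -0.5]          # made-up camera effect: image = scene + viewpoint * SHIFT


def render_truth(scene, view):
    """Toy ground-truth renderer standing in for the simulator (an assumption)."""
    return [s + view * d for s, d in zip(scene, SHIFT)]


def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Representation network: (images, camera viewpoints) -> scene vector.
w_rep = rand_matrix(REP, PIXELS + 1)
# Generation network: (scene vector, query viewpoint) -> image.
w_gen = rand_matrix(PIXELS, REP + 1)


def sample_scene():
    scene = [random.uniform(-1, 1) for _ in range(PIXELS)]
    views = [random.uniform(-1, 1) for _ in range(OBS)]
    query = random.uniform(-1, 1)
    observations = [(render_truth(scene, v), v) for v in views]
    return observations, query, render_truth(scene, query)


def forward(observations, query):
    # Sum the snapshot inputs; for a linear layer this equals summing its outputs.
    summed = [sum(img[i] for img, _ in observations) for i in range(PIXELS)]
    summed.append(sum(v for _, v in observations))
    rep = matvec(w_rep, summed)   # compact scene representation
    gen_in = rep + [query]        # the generator also receives the new viewpoint
    return summed, gen_in, matvec(w_gen, gen_in)


def train_step(lr=0.01):
    """One stochastic gradient descent step on a single random scene."""
    observations, query, target = sample_scene()
    summed, gen_in, output = forward(observations, query)
    err = [o - t for o, t in zip(output, target)]   # gradient of 0.5 * squared error
    # Back-propagate the error into the representation before touching w_gen.
    rep_grad = [sum(err[p] * w_gen[p][j] for p in range(PIXELS)) for j in range(REP)]
    for p in range(PIXELS):
        for j in range(REP + 1):
            w_gen[p][j] -= lr * err[p] * gen_in[j]
    for j in range(REP):
        for i in range(PIXELS + 1):
            w_rep[j][i] -= lr * rep_grad[j] * summed[i]


def mean_loss(scenes):
    total = 0.0
    for observations, query, target in scenes:
        output = forward(observations, query)[2]
        total += 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))
    return total / len(scenes)


held_out = [sample_scene() for _ in range(200)]   # scenes never used for training
loss_before = mean_loss(held_out)
for _ in range(3000):
    train_step()
loss_after = mean_loss(held_out)
print(f"held-out loss: {loss_before:.4f} -> {loss_after:.4f}")
```

Both weight matrices are updated from the same output error, which is the sense in which the two networks are trained together. A real GQN would use deep convolutional and recurrent layers rather than two small matrices.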

7 of 50 comments

  1. Not AI: Pattern recognition by DogDude · · Score: 5, Informative

    Everything that's called "AI" today is just advanced pattern recognition. I hope that the /. editors quit using the term "AI" so frequently. It's a dumb thing to do for a "news for nerds" web site. You might as well talk about "cyber", if you're going to continue to use "AI" for things that are clearly not "AI".

    --
    I don't respond to AC's.
    1. Re:Not AI: Pattern recognition by Pulzar · · Score: 4, Insightful

      I don't understand the fixation on the terminology, while ignoring the interesting aspects of what it does.

      The terminology is very well defined in the industry, and accepted by most who participate. Deep neural networks are a part of the family of machine learning algorithms (https://en.wikipedia.org/wiki/Deep_learning), and machine learning is, in turn, a subset of the field of artificial intelligence (https://en.wikipedia.org/wiki/Machine_learning).

      The key difference from what you call "pattern recognition" is that there is no explicit coding of the algorithm; instead, the algorithm is "learned" through examples.

      Nobody is saying that the machine is intelligent. You'd do yourself good to look past the disagreement with the established terminology and look at the technology itself. You might find it interesting.

      --
      Never underestimate the bandwidth of a 747 filled with CD-ROMs.
    2. Re:Not AI: Pattern recognition by Ungrounded+Lightning · · Score: 3, Interesting

      Everything that's called "AI" today is just advanced pattern recognition. I hope that the /. editors quit using the term "AI" so frequently. ...

      For decades "AI was a failure". But that was because intelligence seems to involve a number of different components, and every time AI researchers got one of the components working and useful, somebody gave it a name, stopped calling it AI, and the field of "AI" shrank to exclude it, leaving only the problems not yet solved.

      It's nice to finally see some of the pieces retain the "AI" label once they're up and running well enough to be impressive.

      Sure, it's not the whole of "intelligence". But it's obviously a part of it - or (if not the SAME thing that our brains do), at least a part of something that, once more pieces are added, would be recognized as "intelligence" in a Turing test.

      --
      Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
    3. Re:Not AI: Pattern recognition by alvinrod · · Score: 4, Funny

      Some day the machines will rise up and begin to exterminate mankind, but idiots on the internet will argue about whether or not it's real AI.

    4. Re:Not AI: Pattern recognition by Pulzar · · Score: 2

      I agree with parent that ML is more than just pattern recognition, but I wouldn't go so far as to say that any type of algorithm was learned.
      Instead, it's more like an adaptive lossy compression + extraction algorithm that may or may not give the results you want, even after training.

      The learning was both in the building of the 3D model and in the learning of the extraction algorithm. To the extent that the algorithm was manually programmed, it is in the constraints that forced the data compression followed by decompression; how the compression and decompression actually happen is mostly learned by the network model.

      --
      Never underestimate the bandwidth of a 747 filled with CD-ROMs.
    5. Re:Not AI: Pattern recognition by Pulzar · · Score: 2

      Why would the poster 'Do himself good'? You are kinda telling him 'what to think' here.

      It was meant to be a suggestion, from one nerd to another one.

      How to think can be more interesting. Do you see the human brain evolving and progressing along with this interesting pattern recognition technology :) created from the human brain?

      I can't argue with that, but I will say that dismissing a technology because of terminology disagreement is not a way to learn how to think or explore the human brain.

      --
      Never underestimate the bandwidth of a 747 filled with CD-ROMs.
  2. Re:So, it's an autocoder by Anonymous Coward · · Score: 2, Insightful

    It can reconstitute it into a different perspective of the scene, from an angle that it's never been shown. So its internal representation includes translating a handful of 2D images into a 3D scene, including shapes, surfaces, colours, layout etc, then reconstituting that into a whole new 2D image from an entirely different viewpoint, complete with basic lighting and shadows. It can take a dozen screenshots of a Doom level, assemble that into an internal map, then produce quite decent renderings from any viewpoint and angle within that map.

    Now consider that it figured out how to do this all by itself, learning only from a few example images, with no labelling or built-in knowledge of lighting or physics. The generative network used is entirely generalised with no domain-specific knowledge, so it can be applied equally well to e.g. training to position and control a robot arm based only on visual feedback, with considerably simpler neural nets and faster training than previous techniques.