Slashdot Mirror


DeepMind Self-training Computer Creates 3D Model From 2D Snapshots (ft.com)

DeepMind, Google's artificial intelligence subsidiary in London, has developed a self-training vision computer that generates "a full 3D model of a scene from just a handful of 2D snapshots," according to its chief executive. From a report: The system, called the Generative Query Network, can then imagine and render the scene from any angle [Editor's note: the link may be paywalled; alternative source], said Demis Hassabis. GQN is a general-purpose system with a vast range of potential applications, from robotic vision to virtual reality simulation. Details were published on Thursday in the journal Science. "Remarkably, [the DeepMind scientists] developed a system that relies only on inputs from its own image sensors -- and that learns autonomously and without human supervision," said Matthias Zwicker, a computer scientist at the University of Maryland who was not involved in the research. This is the latest in a series of high-profile DeepMind projects, which are demonstrating a previously unanticipated ability by AI systems to learn by themselves, once their human programmers have set the basic parameters.

27 comments

  1. There's already a program for that... by the_skywise · · Score: 1, Informative

    I'm not sure why this is a big deal. MS had the tech for this about 5 years ago (send us a buncha 2D pictures and we'll turn it into a VR set). Extrapolating to models isn't that far of a reach.

    1. Re:There's already a program for that... by Anonymous Coward · · Score: 1

      Adobe has been doing it for years as well.

    2. Re: There's already a program for that... by Anonymous Coward · · Score: 0

      Oh, haven't you heard? It's because of this new guy called AI that is here to destroy those human Luddites and take away all their jobs through 100% skynet autonomy.

    3. Re:There's already a program for that... by quantaman · · Score: 2

      I'm not sure why this is a big deal. MS had the tech for this about 5 years ago (send us a buncha 2D pictures and we'll turn it into a VR set). Extrapolating to models isn't that far of a reach.

      It sounds like the key difference here is they're predicting parts of the scene they haven't seen, such as what the other side of an object they haven't seen looks like.

      I don't know if they do that just based on clues like shadows, or the system says "that looks like the front of a sphere, therefore I can assume it's round on the other side as well."

      --
      I stole this Sig
    4. Re:There's already a program for that... by Hognoxious · · Score: 1

      I was surprised, because I'd read that it was difficult. Specifically it was about creating a bump map from a photo of things like those decorative carved panels you sometimes see on buildings.

      It's something humans can do quite easily.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    5. Re:There's already a program for that... by dinfinity · · Score: 2

      This is something completely different than Photosynth. Completely.
      At least just glance at TFA.

    6. Re:There's already a program for that... by ShanghaiBill · · Score: 1

      At least just glance at TFA.

      I can't. It's paywalled.

    7. Re:There's already a program for that... by Anonymous Coward · · Score: 0

      Non-paywalled link to the actual paper: http://science.sciencemag.org/content/360/6394/1204.full

      This is different from Photosynth and other related efforts, because it is not creating an explicit mathematical representation based on SIFT or other feature points, but rather learning the room and forming an internal representation of it. I think that's important because it is far closer to what animals do.

    8. Re: There's already a program for that... by Anonymous Coward · · Score: 0

      Care to provide a reference?

    9. Re:There's already a program for that... by Kjella · · Score: 1

      It sounds like the key difference here is they're predicting parts of the scene they haven't seen, such as what the other side of an object they haven't seen looks like.

      Yeah, the key feature here seems to be extrapolation from a very small number of observations. Say you're in a park and you're snapping a few photos and from that you're trying to reconstruct a 3D model of the park. You don't have nearly enough data to do that properly, but you can make an educated guess. At least that seems to be the focus to me, how well can it fill in the blanks and make some copy-paste assumptions.

      --
      Live today, because you never know what tomorrow brings
    10. Re:There's already a program for that... by Anonymous Coward · · Score: 0

      Guess that VR set from MS will be larger than 256 floats.

    11. Re: There's already a program for that... by fubarrr · · Score: 1

      Man, are you in Shenzhen? Anywhere near the Nanshan area? I'd like to invite you for tea... just kidding, beer would be better.

  2. Not possible by Anonymous Coward · · Score: 0

    I know this because I met this girl on tinder and made a judgement call about how big her boobs were from the pics, but when I got her naked they were WAY HUGER and by a lot. No way a computer could have modeled those monsters from just a photo.

    1. Re:Not possible by DontBeAMoran · · Score: 1

      Pics or it didn't happen.

      It's for.... eh... the archives.

      --
      #DeleteFacebook
  3. not unanticipated by phantomfive · · Score: 4, Insightful

    Not only was that anticipated, and not only have computers been "teaching" themselves for years in AI once the basic parameters are set, that is exactly what neural networks were DESIGNED to do nearly 50 years ago when they were invented

    --
    "First they came for the slanderers and i said nothing."
    1. Re: not unanticipated by Anonymous Coward · · Score: 0

      That is like saying empirical modern medicine was invented a hundred years ago, so a cure for HIV is nothing new.

      Dude, 50 years ago neural nets were shit... they could not outperform statistical modelling methods.

      People have dreamed of chaining multiple layers of neural nets, but no one could get deep neural nets with more than 1 or 2 layers to work until about 10 years ago. Since that breakthrough, deep nets have exploded with innovations and advanced at a crazy pace.

      Those who say stuff like "it's been around for 50 years" and "AI is just rules" have clearly been out of touch with what has happened in the last 10 years.

    2. Re:not unanticipated by mbkennel · · Score: 1

      No, neural networks have not been 'teaching themselves' to achieve this many cognitively impressive capabilities **without significant detailed human-labeled data** for 50 years.

      The unsupervised or weakly supervised achievements are new.

    3. Re:not unanticipated by phantomfive · · Score: 1

      The unsupervised or weakly supervised achievements are new.

      Not really.....

      --
      "First they came for the slanderers and i said nothing."
  4. stupid human by Anonymous Coward · · Score: 0

    that is exactly what neural networks were DESIGNED to do nearly 50 years ago when they were invented

    Neural networks were "invented" millions of years ago, pathetic human

  5. Paywalled by Anonymous Coward · · Score: 0

    Please stop posting links to paywalled web servers.

  6. Re:KDE Applications for Windows 32+64bit + other O by Anonymous Coward · · Score: 0

    You should talk to APK he needs your help.

  7. Anime? by DontBeAMoran · · Score: 1

    How well does it work on 2D anime drawings?

    --
    #DeleteFacebook
    1. Re:Anime? by Anonymous Coward · · Score: 0

      a bit disappointing actually, where do you think waifu pillows came from?

  8. Link is DEFINITELY paywalled by wonkey_monkey · · Score: 1

    The system, called the Generative Query Network, can then imagine and render the scene from any angle [Editor's note: the link may be paywalled; alternative source],

    Why didn't you just use the definitely non-paywalled source?

    --
    systemd is Roko's Basilisk.
  9. This is called photogrammetry by Anonymous Coward · · Score: 0

    The technique of solving several images into a 3-dimensional object is called photogrammetry: https://en.wikipedia.org/wiki/Photogrammetry. There are many software solutions that do this, and there are non-software solutions too, which were used, for example, to create the topographic maps published by the USGS, where elevations were extracted from aerial photographs. Most just output 3D geometry information. The new thing might be the way DeepMind uses the 3D model results.
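
For readers unfamiliar with the photogrammetry idea in the last comment, here is a minimal sketch of the textbook geometry behind it: with two rectified pinhole views, the depth of a point follows from how far it shifts between the images (Z = f * B / d). This is classical stereo triangulation, not GQN and not anything taken from the DeepMind paper. The function name and all of the numbers are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in metres of a point seen in two rectified pinhole views.

    Uses the standard stereo relation Z = f * B / d, where f is the focal
    length in pixels, B the distance between the two camera centres in
    metres, and d the horizontal shift (disparity) of the point in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, views taken 0.5 m apart.
near = depth_from_disparity(1000.0, 0.5, 50.0)  # large shift between views: close point
far = depth_from_disparity(1000.0, 0.5, 5.0)    # small shift between views: distant point
print(near, far)
```

A point that shifts a lot between the two views is close and one that barely shifts is far away, which is why this approach needs the same point to be visible in both images. It cannot say anything about surfaces that no photo covers, which is the part several commenters above single out as the new thing.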