Uber has Cracked Two Classic '80s Video Games by Giving an AI Algorithm a New Type of Memory (technologyreview.com)


Posted by msmash on Tuesday November 27, 2018 @05:42AM from the closer-look dept.

An algorithm that remembers previous explorations in Montezuma's Revenge and Pitfall! could make computers and robots better at learning how to succeed in the real world. From a report: A new kind of machine-learning algorithm just mastered a couple of throwback video games that have proved to be a big headache for AI. Those following along will know that AI algorithms have bested the world's top human players at the ancient, elegant strategy game Go, one of the most difficult games imaginable. But two pixelated classics from the era of 8-bit computer games -- Montezuma's Revenge and Pitfall! -- have stymied AI researchers. There's a reason for this seeming contradiction. Although deceptively simple, both Montezuma's Revenge and Pitfall! have been immune to mastery via reinforcement learning, a technique that's otherwise adept at learning to conquer video games.

DeepMind, a subsidiary of Alphabet focused on artificial intelligence, famously used it to develop algorithms capable of learning how to play several classic video games at an expert level. Reinforcement-learning algorithms mesh well with most games, because they tweak their behavior in response to positive feedback -- the score going up. The success of the approach has generated hope that AI algorithms could teach themselves to do all sorts of useful things that are currently impossible for machines. The problem with both Montezuma's Revenge and Pitfall! is that there are few reliable reward signals. Both titles involve typical scenarios: protagonists explore blockish worlds filled with deadly creatures and traps. But in each case, lots of behaviors that are necessary to advance within the game do not help increase the score until much later. Ordinary reinforcement-learning algorithms usually fail to get out of the first room in Montezuma's Revenge, and in Pitfall! they score exactly zero.

100 comments


Pitfall by Anonymous Coward · 2018-11-27 05:47 · Score: 0

The key in any of these kinds of games is not to make assumptions about anything one hasn't already seen.
short term vs long term gain by LostOne · 2018-11-27 05:50 · Score: 4, Insightful

So researchers have discovered that short term gains can come at the expense of long term success? *gasp* Say it isn't so!
Actually, that's been a known problem for a long time. You end up at a local maximum on the "score" function and now you have no possible way to improve, so reinforcement learning just keeps you there even though you might do substantially better if you actually took a decrease in the "score" and ended up on the path to some other maximum on the function.
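A minimal sketch of that trap, assuming a made-up one-dimensional "score" landscape (the `LANDSCAPE` values and the `hill_climb` helper are purely illustrative, not anything from the article): a greedy climber that only accepts improvements stops on the small peak and never sees the taller one.

```python
def hill_climb(score, start, steps=100):
    """Greedy search: move to a neighbour only if it strictly improves the score."""
    x = start
    for _ in range(steps):
        best = max((x - 1, x + 1), key=score)
        if score(best) <= score(x):
            break  # no neighbour is better: stuck at a (possibly local) maximum
        x = best
    return x


# Hypothetical landscape: a small peak at index 2 and a taller one at index 8.
LANDSCAPE = [0, 1, 3, 2, 1, 0, 2, 5, 9, 4]


def score(x):
    # Positions off either end score below everything on the landscape.
    return LANDSCAPE[x] if 0 <= x < len(LANDSCAPE) else -1


local_peak = hill_climb(score, start=0)   # climbs the nearby small peak and stops
global_peak = hill_climb(score, start=5)  # happens to start in the right valley
print(local_peak, score(local_peak))
print(global_peak, score(global_peak))
```

Starting at 0, the only way to reach the taller peak is to accept a lower score on the way through the valley, which this greedy rule never does.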
(Oh, and "Fr1st ps0t!", especially if it isn't.)

--

If it works in theory, try something else in practice.
1. Re:short term vs long term gain by lgw · 2018-11-27 05:59 · Score: 4, Interesting
  
  This is more important than you make it out to be. The key to these games is that you have to make a map to succeed. That's not the kind of learning you get from "machine learning", as obvious as it might be to a human player.
  One of the many ways that AI is nothing like intelligence is the absence of any representational model of the real world. It's no accident that the neurological seat of human intelligence is an addition to our massive vision processing wetware - understanding the world in terms of objects precedes self awareness in the only example we can study. "AI" doesn't work that way, at least for the most part.
  I find it impressive that someone has managed to connect the idea of making a map with the internals of machine learning (which are completely arbitrary matrices that have no obvious connection to the result).
  
  --
  Socialism: a lie told by totalitarians and believed by fools.
2. Re:short term vs long term gain by liquid_schwartz · 2018-11-27 07:40 · Score: 1, Insightful
  
  So researchers have discovered that short term gains can come at the expense of long term success? *gasp* Say it isn't so!
  To be fair many an MBA hasn't gotten this internalized so it is progress.
3. Re:short term vs long term gain by Areyoukiddingme · 2018-11-27 07:54 · Score: 4, Interesting
  
  One of the many ways that AI is nothing like intelligence is the absence of any representational model of the real world.
  There are many kinds of AI. Neural nets don't construct a representational model of the world from visual input but other AI techniques do. The Soar framework used so successfully for the machine-controlled antagonists in Descent (among many other uses) supports chunking, reinforcement learning, episodic learning, and semantic learning. It is based on the unified theory of cognition. It has both a temporary and permanent representational memory. It's fundamentally rule-based, rather than a neural net.
  There was at one time a neural net version of Soar called Neuro-Soar but it's not part of the mainstream Soar library.
4. Re:short term vs long term gain by jellomizer · 2018-11-27 08:53 · Score: 1
  
  Well, the normal problem is that the longer you plan out, the more variations you need to figure in. Short-term success often leads us to a point where we can face the long-term problem, vs. failing before we get to that point.
  The idea of the perfect AI algorithm has been available for generations: just simulate all possible next solutions and pick the path to best success. The problem is that these steps take a massive amount of computational time, which grows exponentially the further out you go. Good AI design is about programming shortcuts into the logic so the computer can make the decision in an actionable time. More computing power will only go so far. Moore's law means a doubling every few years, while each step further out can multiply the work by hundreds. Sure, my PC is 1000 times faster than it was 20 years ago. But that only takes that AI algorithm from needing a year to figure out the next step down to about 8 hours, which is still only enough to figure out the next step.
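  To put rough numbers on that, assume (purely for illustration) a fixed branching factor b per step, taking b = 100 as one reading of "multiples of hundreds", and a hardware speedup S. Exhaustive lookahead to depth d costs about b^d evaluations, so the speedup buys only log_b S extra steps:

  ```latex
  \[
    N(d) = b^{d}, \qquad \Delta d = \log_b S = \frac{\ln S}{\ln b}
  \]
  \[
    S = 1000,\; b = 100 \;\Rightarrow\; \Delta d = \frac{\ln 1000}{\ln 100} = 1.5,
    \qquad
    \frac{1\ \text{year}}{1000} \approx \frac{8760\ \text{h}}{1000} \approx 8.8\ \text{h}
  \]
  ```

  Under those assumptions, a thousandfold faster machine looks about one and a half steps further ahead in the same time.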
  
  --
  If something is so important that you feel the need to post it on the internet... It probably isn't that important.
5. Re:short term vs long term gain by Kjella · 2018-11-27 09:05 · Score: 1
  
  One of the many ways that AI is nothing like intelligence is the absence of any representational model of the real world.
  I agree that could be a problem if the number of possible interactions is so large that you need a semantic understanding to whittle it down to a reasonable number. But if it can backtrack random luck to identify the game mechanics that triggered it that's a huge step up, like for example to open the treasure chest it must first find the key or to cross the drawbridge it must first lower it - even though that gives no score by itself. It's hard for an AI to pick out the "meaningful" actions from all the rest, it probably needs some kind of trial and error.
  
  --
  Live today, because you never know what tomorrow brings
6. Re:short term vs long term gain by Anonymous Coward · 2018-11-27 10:22 · Score: 0
  
  Sounds like a lot of talk without any walk. What are its scores on Montezuma's revenge or the other games mentioned here?
  It doesn't matter what you want to call the theory, if it isn't producing functional, behaving, models on standard benchmarks, there's really no sense in talking about it in a serious discussion.
  >Neural nets don't construct a representational model of the world from visual input
  No. This is exactly what they do.
7. Re:short term vs long term gain by MMC+Monster · 2018-11-27 11:00 · Score: 1
  
  So researchers have discovered that short term gains can come at the expense of long term success?
  Unfortunately the AI was deleted before it could take control of the company and shut it down for the greater good of humanity.
  
  --
  Help! I'm a slashdot refugee.
8. Re: short term vs long term gain by Anonymous Coward · 2018-11-28 15:52 · Score: 0
  
  Like taking certain steps in chess would yield the king vs an easy pawn.
  We expect AI to go through all these permutations and tell us that the lever for this drawbridge is the best possible move, even if we think crossing the river on foot is faster. That's the point.
9. Re:short term vs long term gain by DeVilla · 2018-11-28 17:39 · Score: 1
  
  understanding the world in terms of objects precedes self awareness in the only example we can study
  So you're saying they've laid the foundation for skynet, right? I like the idea that understanding Montezuma's Revenge can lead to self awareness.
10. Re:short term vs long term gain by DeVilla · 2018-11-28 17:54 · Score: 1
  
  The idea of the perfect AI algorithm has been available for generations. Just simulate all possible next solutions and pick the path to best success.
  When I was in AI classes, this wasn't considered AI. This was just an exhaustive search, AKA brute force. The point in AI (back then) was to try to identify how people were able to intelligently avoid exhaustive searches of problem spaces while still finding a reasonably good solution, so we could better simulate it in software.
  Well that, and some people wanted to build C-3PO.
Meh. by JMZero · 2018-11-27 05:56 · Score: 1

These games aren't hard for a computer to play. You could write a fairly straightforward algorithm that would play them both well.
What's hard is to develop a very general learning algorithm - one that doesn't know about the task - that just happens to pass the test of being able to learn these games.
The approach here seems "cheaty". That's not to say their technique is useless (and maybe it's more generalizable than I'm giving it credit for) - but from the vague overview of the article it seems like they're effectively juicing their performance.

--
Let's not stir that bag of worms...
1. Re: Meh. by Anonymous Coward · 2018-11-27 05:59 · Score: 0
  
  I don't think so. Juicing would be equivalent to choosing solution steps that were completely specific to an instance of a problem. Fundamental problem solving relies on seeing the core elements of a problem, validating those elements against a generic hypothetical solution, and then tailoring the hypothetical solution to the true problem.
2. Re: Meh. by Anonymous Coward · 2018-11-27 07:37 · Score: 0
  
  I say that would be OK, so long as the algorithm went out to gamer websites and parsed the posts revealing how to solve this or that puzzle.
3. Re:Meh. by Areyoukiddingme · 2018-11-27 07:56 · Score: 1
  
  The approach here seems "cheaty". That's not to say their technique is useless (and maybe it's more generalizable than I'm giving it credit for) - but from the vague overview of the article it seems like they're effectively juicing their performance.
  Is teaching children cheating?
4. Re:Meh. by JMZero · 2018-11-27 08:21 · Score: 1
  
  No, but telling them the answers to the test can be. To be fair, from the article it's hard to tell where on that spectrum they are.
  
  --
  Let's not stir that bag of worms...
Didn't know they were copy protected by Anonymous Coward · 2018-11-27 05:58 · Score: 0

I didn't think the 2600 had any copy protection *to* crack, never mind requiring AI to crack anything that might be there. WEIRD.
1. Re:Didn't know they were copy protected by OrangeTide · 2018-11-27 06:36 · Score: 1
  
  They mean "cracked" in a different sense. For example you can crack an egg, or a tech journalist can be addicted to crack.
  
  --
  “Common sense is not so common.” — Voltaire
2. Re:Didn't know they were copy protected by Anonymous Coward · 2018-11-27 06:46 · Score: 0
  
  Breaking classic video games is not cool! :P And all that about tech journalists... is there anyone who ISN'T in the know?
If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 06:09 · Score: 2

Here is the key quote from the article that vaguely describes the algorithm they used:

The team’s new family of reinforcement-learning algorithms, dubbed Go-Explore, remember where they have been before, and will return to a particular area or task later on to see if it might help provide better overall results. The researchers also found that adding a little bit of domain knowledge, by having human players highlight interesting or important areas, sped up the algorithms’ learning and progress by a remarkable amount.
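The article gives no implementation details, so the following is only a rough sketch of the "remember where you have been, return there, then explore" idea on a toy corridor where the only reward sits at the far end. The environment, the `go_explore_sketch` name, the least-visited selection rule, and returning by replaying stored actions are all assumptions made for illustration, not Uber's actual algorithm.

```python
import random

GOAL = 12  # hypothetical corridor length; reward exists only at the far end


def step(pos, action):
    """Toy deterministic environment: move left/right, walls at 0 and GOAL."""
    return min(GOAL, max(0, pos + action))


def replay(actions, start=0):
    """'Return' to a remembered place by replaying the actions that led there."""
    pos = start
    for a in actions:
        pos = step(pos, a)
    return pos


def go_explore_sketch(iterations=2000, explore_steps=5, seed=0):
    rng = random.Random(seed)
    archive = {0: []}  # cell -> shortest known action sequence reaching it
    chosen = {0: 0}    # how often each cell was picked as a starting point
    for _ in range(iterations):
        # 1. Select: prefer remembered cells we have returned to least often.
        cell = min(archive, key=lambda c: (chosen[c], -c))
        chosen[cell] += 1
        # 2. Return: get back to that cell before exploring.
        actions = list(archive[cell])
        pos = replay(actions)
        # 3. Explore: take a few random actions from there.
        for _ in range(explore_steps):
            a = rng.choice((-1, 1))
            pos = step(pos, a)
            actions.append(a)
            # 4. Remember new cells, or shorter routes to known ones.
            if pos not in archive or len(actions) < len(archive[pos]):
                archive[pos] = list(actions)
                chosen.setdefault(pos, 0)
    return archive


archive = go_explore_sketch()
print(sorted(archive))
print("reached goal:", GOAL in archive)
```

The point of the archive is that progress is never lost: even though no score is earned until the end of the corridor, each newly discovered cell becomes a place the search can return to and push onward from.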

--
"First they came for the slanderers and i said nothing."
1. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-27 06:16 · Score: 1
  
  So they've figured out that getting X to happen is rewarding, but hard because X happens if you do something after W happens, which occurs if you do something after V happens, etc.
  
  --
  Support my political activism on Patreon.
2. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 06:23 · Score: 1
  
  I think the essential difficulty they are running into is that a human will look at the screen and say, "Oh, that is a room, there is gravity, and it looks like I can jump." A human doesn't need to play through the scenario 10,000,000 times, humans learn in ways besides just reinforced repetition. I would even suggest that is not our primary form of learning, although it is a powerful one in some situations.
  
  It seems unlikely that a system that only learns through many repetitions will become general intelligence, because there are many things in the universe that cannot be learned that way.
  
  --
  "First they came for the slanderers and i said nothing."
3. Re:If a human chooses the algorithm, is it AI? by Anonymous Coward · 2018-11-27 06:33 · Score: 0
  
  Do you have any evidence that humans don't need repetition to learn? I know that I have played many different games over and over again and I have become a semi-master at game playing. It is rather frustrating to watch my kids play the same games when they have so much trouble in them. E.g. they don't figure out what to do even when the clue is right there where they can see it. But as they play more and more, they have become better at it.
4. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 06:48 · Score: 1
  
  Do you have any evidence that humans don't need repetition to learn?
  You're getting confused. I didn't say "humans never learn by repetition," I said, "humans don't always learn by repetition."
  
  If I say, "My name is phantomFive," a lot of people can remember that the first time. But in any case it won't take thousands of repetitions to remember it.
  
  Also, if I show you how to travel to a particular place, there's a good chance you'll remember it the first time, (or second time) instead of needing to show you 30,000 times.
  
  Also if I teach you the rules of chess, you'll probably be playing right away. You aren't going to want to learn the rules by watching thousands of games until you can intuitively understand. Note that this is also how computers learn the rules of chess, but it is hard-coded, rather than a general algorithm.
  
  --
  "First they came for the slanderers and i said nothing."
5. Re:If a human chooses the algorithm, is it AI? by guruevi · 2018-11-27 06:54 · Score: 1
  
  Humans rely more on strategy and intuition (which is basically an interpretation of what you think other people would do or have done), not pure repetition otherwise games wouldn't be fun at all.
  If you conquer a puzzle game like described, you think more about how you would make the game and go from there. You visualize/fantasize realms you haven't quite yet seen. That's a level ahead of AI learning which right now just runs in a direction until it hits a wall, then retries with slightly different variables.
  
  --
  Custom electronics and digital signage for your business: www.evcircuits.com
6. Re:If a human chooses the algorithm, is it AI? by AvitarX · 2018-11-27 07:12 · Score: 1
  
  Isn't that how Deep Blue beat a human grandmaster the first time?
  Seems a good step.
  
  --
  Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
7. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-27 07:17 · Score: 1
  
  A human doesn't need to play through the scenario 10,000,000 times, humans learn in ways besides just reinforced repetition
  Humans learn about things by experience. You have a lot of information you've learned through repetition--like gravity and pain. I stepped on a bee once because I'd gotten stung earlier and wanted to confirm that stepping on a bee caused pain; I now know that some animals can sting and don't need to mess with them to figure out they'll inflict pain. I learned that hornets and scorpions sting by repeatedly being stung by bees.
  I can work out how things work without doing them because I can simulate them inside my mind. I don't need to physically attempt something to engage in repetition learning because I've already learned parts of the machine through repetition, and can reassemble a new machine internally and examine the outcome.
  
  --
  Support my political activism on Patreon.
8. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 07:20 · Score: 1
  
  Isn't that how Deep Blue beat a human grandmaster the first time?
  
  Seems a good step.
  If your goal is building an algorithm that can beat a human, then it's a great first step. But if your goal is to create general AI, then it's not a step at all.
  
  These ancient games by themselves aren't particularly interesting though, no one cares if a computer can beat a human. The only reason they are of interest is as a potential stepping stone to general AI.
  
  --
  "First they came for the slanderers and i said nothing."
9. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 07:21 · Score: 1
  
  Did you understand what I said?
  
  --
  "First they came for the slanderers and i said nothing."
10. Re:If a human chooses the algorithm, is it AI? by AvitarX · 2018-11-27 07:37 · Score: 1
  
  How effective is general purpose Real Intelligence without being guided?
  The Octopus is maybe an answer?
  
  --
  Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
11. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 07:41 · Score: 1
  
  How effective is general purpose Real Intelligence without being guided?
  We can answer that question. If someone stuck you in the room with the game and no instructions, could you get higher than a score of zero? Could you beat the game? For indeed, you are a general purpose Real Intelligence.
  
  --
  "First they came for the slanderers and i said nothing."
12. Re:If a human chooses the algorithm, is it AI? by AvitarX · 2018-11-27 07:44 · Score: 1
  
  Sure, but I've had over 30 years of human guidance to get there.
  
  --
  Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
13. Re:If a human chooses the algorithm, is it AI? by Waffle+Iron · 2018-11-27 07:58 · Score: 1
  
  "Oh, that is a room, there is gravity, and it looks like I can jump." A human doesn't need to play through the scenario 10,000,000 times, humans learn in ways besides just reinforced repetition. I would even suggest that is not our primary form of learning, although it is a powerful one in some situations. .
  I don't know about that. Humans spend the first few months of their lives randomly jerking their limbs around until they've figured out the basics of gravity. It takes a surprisingly long time for an infant to figure out how to roll over on its own, much less jump.
  All of that past repetition learning about gravity and geometry is factored in when you play this video game, so now it just *looks* like intuition.
14. Re: If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 09:04 · Score: 1
  
  Are you going to claim that humans only learn through repetition?
  
  --
  "First they came for the slanderers and i said nothing."
15. Re: If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 09:08 · Score: 1
  
  Now you're intentionally being obtuse. If it were just a matter of time, we could upload our current AI algorithms into a Boston Dynamics piece of hardware and wait.
  
  --
  "First they came for the slanderers and i said nothing."
16. Re:If a human chooses the algorithm, is it AI? by goose-incarnated · 2018-11-27 09:47 · Score: 1
  
  Sure, but I've had over 30 years of human guidance to get there.
  Due to humans being so slow, that's still about 5 orders of magnitude less than the guidance that the software got.
  
  --
  I'm a minority race. Save your vitriol for white people.
17. Re:If a human chooses the algorithm, is it AI? by Anonymous Coward · 2018-11-27 09:58 · Score: 0
  
  Games are designed by humans for humans, hence they use human APIs (think visual cortex). Output the game state as a streaming string of bits instead of a "graphic", see if it takes 10,000,000 times for a human to master it (I think they'll give up long before then =) ).
18. Re:If a human chooses the algorithm, is it AI? by Anonymous Coward · 2018-11-27 10:18 · Score: 0
  
  He has to re-read your post a few dozen times first. Maybe print it out and step on it.
19. Re: If a human chooses the algorithm, is it AI? by Waffle+Iron · 2018-11-27 10:27 · Score: 1
  
  Are you going to claim that humans figure out basics like gravity without repetition?
20. Re: If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 10:54 · Score: 2
  
  In a single post you learned I'm a moron. No repetition needed.
  
  --
  "First they came for the slanderers and i said nothing."
21. Re: If a human chooses the algorithm, is it AI? by AvitarX · 2018-11-27 12:02 · Score: 1
  
  If this technique can lead to an AI capable of learning many previously unachievable tasks with minor human input, I would think it's progress in creating AI, and not "not a step at all".
  If I played Pitfall! as a child, I wouldn't know what to do without context (money bags good, and whatnot), but an older sibling may say "go over there, there's money bags".
  I actually remember being confused by Pitfall! and just dying a lot.
  
  --
  Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
22. Re: If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 12:46 · Score: 1
  
  Your second paragraph is a clear example of learning without plenty of repetition.
  
  --
  "First they came for the slanderers and i said nothing."
23. Re: If a human chooses the algorithm, is it AI? by AvitarX · 2018-11-27 12:59 · Score: 1
  
  Isn't that effectively what they did?
  Gave hints on where to go, and then let trial and error create a better player?
  
  --
  Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
24. Re:If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-27 13:41 · Score: 1
  
  If you give me directions I don't have to learn how to follow them, because I've previously learned how to do that. I have lots of practice following directions, and all of the subtasks that go into that.
  There's a great paper showing that you can easily create a game that is very difficult for a reinforcement learning system to learn to play, but quite easy for humans. Except if you remove contextual information from that game (the princess you're supposed to rescue is pink, the thing you're supposed to avoid looks like fire, etc.) then humans do just as badly.
25. Re:If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-27 13:43 · Score: 1
  
  It's more than just gravity. If you take a video game and remove the stereotypical visual cues humans do very poorly at them.
26. Re: If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-27 13:46 · Score: 1
  
  That is an experiment that's being done. The idea is to learn how much behaviour can arise spontaneously with a reinforcement learning algorithm, some basic motivations, and real-world sensory input.
  That IS how we do a great deal of our learning, almost exclusively when we're young. When we advance a little further we get some direct supervised learning mixed in.
27. Re:If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-27 13:49 · Score: 3, Interesting
  
  Humans are not slow. Hinton has computed the amount of sensory information that is processed by the human brain, using reasonable approximations for things like the effective sampling rate of the eyes and ears. It's enormous.
  The *consciousness* that we subjectively experience is slow. We're also pretty horrible at tasks we have to consciously think about as we're doing them too. Both of which suggest that "consciousness" might be considerably less important than many give it credit for.
28. Re: If a human chooses the algorithm, is it AI? by Pulzar · 2018-11-27 15:04 · Score: 1
  
  He is saying that when humans are first "turned on", they take a hell of a lot of time to learn basics.
  You're comparing a "trained" human and their ability to learn to a completely fresh network. There are some very interesting papers that show that a network that's trained to recognize certain things can be re-trained to recognize some completely different things incredibly quickly. It's obviously not human-level ability, but it's not as black and white as you try to paint it.
  
  --
  Never underestimate the bandwidth of a 747 filled with CD-ROMs.
29. Re: If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 16:36 · Score: 1
  
  It's provably not human level. And don't talk about papers if you're not going to cite them, it makes you sound like BS.
  
  --
  "First they came for the slanderers and i said nothing."
30. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-27 17:55 · Score: 1
  
  There's a great paper
  Link please.
  
  --
  "First they came for the slanderers and i said nothing."
31. Re:If a human chooses the algorithm, is it AI? by Anonymous Coward · 2018-11-27 18:48 · Score: 0
  
  So the AI discovered backtracking. We are now able to solve a task that would take microseconds on a 486 in milliseconds using a 1+ GHz GPU with thousands of cores.
  Yay for millennial CS.
32. Re:If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-28 01:22 · Score: 1
  
  https://arxiv.org/abs/1802.102...
33. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-28 02:01 · Score: 1
  
  Yes. You said a human doesn't have to run through a scenario repeatedly to learn it; you seem to think each scenario is unique, and not comparable to different scenarios already experienced repeatedly--i.e. that most "new" scenarios are actually old scenarios.
  Do you know why you can't retain fluency in language just by a study of grammar and vocabulary?
  You can't just memorize words. I remember words--even English words--because every concept I want to convey links to thousands of phrases I've heard, said, or otherwise processed in my life. When I speak or write, I'm recalling events, songs, simple quotes, or even people via theory of mind (copies of other people's behavior that simulates them inside your head).
  Everything anyone says is an experience I've had millions of times already, even if the phrase and context are completely novel.
  The same is true of pretty much everything. Hit something totally-foreign and you stop hard: no prior experience. Learn a new programming language and you're just identifying how different parts are like other programming languages.
  
  --
  Support my political activism on Patreon.
34. Re:If a human chooses the algorithm, is it AI? by parkinglot777 · 2018-11-28 05:04 · Score: 1
  
  Yes. You said a human doesn't have to run through a scenario repeatedly to learn it; you seem to think each scenario is unique, and not comparable to different scenarios already experienced repeatedly--i.e. that most "new" scenarios are actually old scenarios.
  No, what he/she said is that, unlike AI, a human does NOT need to learn from repetition. A human can learn by using logic (as an assumption).
35. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-28 06:09 · Score: 1
  
  Humans don't learn using logic; they apply logic to assess. Logic is a tool learned by repetition.
  Logic also does not apply to the unknown: if I explain to you the R-Star target and the movement of equities, you can't tell me what should or shouldn't happen to the economy. It's completely-logical, but you don't have prior knowledge to assess it.
  Even then, when you get down to it, you're applying an array of tools that you built through repetition of experience. You're quite slow at getting a result someone else can achieve in seconds because they have learned the immediate skill through repetition and you are trying to synthesize advanced knowledge through more-fundamental tools. You also won't likely remember much of anything complex without a lot of repetition.
  Again: most "new" scenarios are actually combinations of old scenarios, which you've probably learned to deal with through repetition. You also learned the skill of analyzing by analogical thinking, taking those scenarios apart and assembling them into the current scenario--quickly--to respond as if the current scenario is familiar instead of novel.
  When you say humans can learn by using logic, you're stating something such as that humans can learn to read a Chess board by having played Go for years. The human has learned to visualize and project moves through repetition of strategies to abstract and simplify those projections; they're applying that learning to a new topic (Chess).
  It's not learning; it's application of already-learned knowledge. That already-learned knowledge was learned by repetition.
  
  --
  Support my political activism on Patreon.
36. Re:If a human chooses the algorithm, is it AI? by goose-incarnated · 2018-11-28 06:30 · Score: 1
  
  Humans are not slow. Hinton has computed the amount of sensory information that is processed by the human brain, using reasonable approximations for things like the effective sampling rate of the eyes and ears. It's enormous.
  Humans can "process" maybe 1 high-res photo per second. Computers can do the same several thousand times faster. Humans *ARE* slow and discard almost all sensory input in any given second (flash 1 million stills at a human in a single second and you'll be lucky if they manage to catch even one of those images).
  
  The *consciousness* that we subjectively experience is slow.
  
  In which case my point to the OP still stands - his "30 years of human guidance" can be sent to a NN in a weekend.
  
  --
  I'm a minority race. Save your vitriol for white people.
37. Re:If a human chooses the algorithm, is it AI? by ceoyoyo · 2018-11-28 07:11 · Score: 1
  
  No, it can't. The human neural network processes a large amount of data, not just vision but lots of other sensory input as well, and it does it a lot faster than any existing computer doing anything of even vaguely comparable sophistication. We don't know exactly how 30 years of a human observing and learning about the world translates into analogously training an artificial neural network, but it is definitely more than a weekend's worth. Reasonable estimates put it at more than 30 years worth, not to mention using a truly enormous amount of training data. One of the big problems with reinforcement learning is that nobody wants to wait long enough for it to work in the real world so systems are trained in vastly simplified simulations (like 80s video games). When you vastly simplify, you necessarily lose things.
  I have some experience with this. I very strongly suspect you do not. But you don't even have to believe me. Geoff Hinton has made the same argument, and he has some pretty reasonable qualifications.
38. Re:If a human chooses the algorithm, is it AI? by parkinglot777 · 2018-11-28 07:30 · Score: 1
  
  Still, you missed the point of the poster. And yes, we can learn from logical conclusion without going through a scenario or repetitions.
  Repetition is brute-force learning. We humans can also learn in ways that are not brute force. All you are describing is the obvious, basic kind of learning: first-hand experience. Logic is a part of second-hand experience because we could use knowledge proven by others to make a conclusion; thus it is another type of learning.
39. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-28 08:56 · Score: 1
  
  yes, we can learn from logical conclusion without going through a scenario or repetitions.
  You're not learning; you're applying previous knowledge.
  If I say: 3x^2 + 2x = 164, solve for x, what do you do? You apply your knowledge of algebra to obtain the value of x.
  What if you don't know algebra?
  How did you get to know algebra?
  You got to know algebra by repetition learning. You can't just read a book on algebra and have it memorized; you need to perform repetitive tasks.
  
  Logic is a part of second-hand experience because we could use knowledge proven by others to make a conclusion; thus it is another type of learning.
  That's not learning. Obtaining a result by following a process is doing. You haven't learned until you can follow that process without using the outside instruction sheet.
  Intuiting the process yourself also isn't learning. You're using a tool (prior knowledge--you've learned to analyze problems and apply other knowledge in the analysis so as to structure a solution). Learning occurs when you no longer need to re-derive that process from logical analysis, but instead can simply recall this new information without the reasoning which explains why it works as such.
  You're trying to call things learning when they're not learning. It's like saying a cat has fur, thus is a type of dog.
  Let's demonstrate.
  Houses are built by assembling pieces of material. This requires material and tools.
  You're suggesting that a person doesn't need to actually assemble the material: people already have a house when they have a hammer, nails, and wood. If your house burns down, you still have a place to stay, because you can assemble wood into a house.
  That obviously doesn't work: the house doesn't shield you from rain or retain heat until you've assembled it.
  So how does this apply to learning?
  We all know that things move when struck with sufficient force. We've seen baseball players play baseball, so we're aware of that. We have a lot of experience with the general idea of kicking or throwing things, so it makes sense.
  If I were to throw you a baseball, your first time holding a bat, you might hit it. You might miss.
  By repeatedly swinging the bat at balls, you cause the neurons in your arm, spine, and brain to adjust. Certain impulses, when summed together, suggest that certain actions will create a certain outcome. You assess that outcome, feed back the data, and your brain adjusts how it weights those impulses. Your nerves physically change their structure to improve on the outcome of swinging a bat.
  That's learning.
  So how did you know to swing the bat?
  Well, when you were first born, your brain didn't know anything about how striking an object with a force made it move. As a small child, you saw people do things. You eventually threw things across the room. You kicked and smacked blocks, and they tumbled and fell.
  From that repetition, you learned about momentum, force, gravity, the movement of objects through space when struck or thrown.
  When presented with two rigid objects and an interaction between them, you used that already learned knowledge to drive the ball away. You didn't learn something new by looking at it; you already had that knowledge. You used a tool: prior knowledge.
  As you repeated the action, your brain learned to do it better, more consistently, with greater and more-controllable outcomes.
  That is, by definition, learning. Learning is not assembling what you already know into a new form--that's called "creativity" or "engineering". Learning is the retention of new knowledge.
  Retention only comes with repetition. Even what little you can remember from a single experience is stored by referencing a lot of similar things you've experienced in the past and stringing them together to roughly describe what this new thing is.
  
  --
  Support my political activism on Patreon.
40. Re:If a human chooses the algorithm, is it AI? by parkinglot777 · 2018-11-28 09:12 · Score: 1
  
  You are obviously not listening or even think out of the box as many others do. I am not going to try further to explain it to you anymore. You keep using the same reasoning which is the base part of learning. Anyway I should have left it as is (as others stopped after they said/insulted you). I now know why others don't want to reply to your objection. It is my fault to attempt to show you a way.
41. Re:If a human chooses the algorithm, is it AI? by goose-incarnated · 2018-11-28 09:28 · Score: 1
  
  I have some experience with this. So have I. AI hype restarted just a few years ago so by now almost anyone interested has looked into it, played with it, etc. Some of us have even done postgrad in it.
  
  I very strongly suspect you do not. But you don't even have to believe me. Geoff Hinton has made the same argument, and he has some pretty reasonable qualifications.
  And his numbers fail the most basic of tests - how much information can a human process in a single second. His numbers are counting each sensory input separately while comparing to a NN looking at whole images. If we count each individual pixel in each image the NN receives it is vastly faster than the human.
  Looking at images alone, humans can't "see" more than a few distinct images per second while the NN can see, react and adjust to a few hundred thousand images per second. A 30-year period of human images can be fed into the NN in a fraction of the time it took the human to acquire them.
  So when OP said he received human guidance for 30 years, I fail to see why he thinks that a computer that processes images thousands of times faster than he does also needs 30 years of training.
  
  --
  I'm a minority race. Save your vitriol for white people.
42. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-28 18:39 · Score: 1
  
  If you understood it, then I'm sure you can think of your own examples of not needing many repetitions (like a neural network would) to learn something.
  
  Or maybe you don't understand AI well enough to reason logically about it.
  
  --
  "First they came for the slanderers and i said nothing."
43. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-29 02:00 · Score: 1
  
  You are obviously not listening or even think out of the box as many others do.
  
  I'm listening; you're just wrong.
  
  You keep using the same reasoning which is the base part of learning
  Reasoning is spelled with letters, like "R". Learning starts with an "L".
  You keep telling me I can't fly, but I put one foot in front of the other. Isn't that flying?
  No, it's walking, just like reasoning is reasoning and learning is learning.
  Learning is specifically the encoding of new knowledge so that it is retained and doesn't have to be reasoned out or observed again. It has to be recallable as its own thing or it isn't learned. Reasoning something out isn't knowing it, and reasoning it out repeatedly because you haven't learned the new fact isn't having learned it--although the repetition does lead to learning.
  
  It is my fault to attempt to show you a way.
  It's your fault for calling a dog a cat.
  
  --
  Support my political activism on Patreon.
44. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-29 02:28 · Score: 1
  
  Give me an example and I'll tell you what previously-learned implement you're actually using.
  
  --
  Support my political activism on Patreon.
45. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-11-29 05:02 · Score: 1
  
  idiot. I give you knowledge and you'd rather argue.
  
  --
  "First they came for the slanderers and i said nothing."
46. Re:If a human chooses the algorithm, is it AI? by bluefoxlucid · 2018-11-29 06:19 · Score: 1
  
  Have you considered that maybe you're wrong?
  Let's say I work at a fast food place and sell food. Someone orders a burrito ($2.89), milk shake ($1.93), and nachos ($1.28). I add them up and get $6.10.
  You assert that I have learned that a burrito, milk shake, and nachos cost $6.10.
  You're wrong.
  Three customers later, someone orders a burrito, milk shake, and nachos. I remember this just happened a few minutes ago, but I don't remember how much they cost, together. I must re-compute these things using arithmetic skills I learned long ago through repetition because I haven't learned that burrito + milk shake + nachos = $6.10.
  Now say this happens over and over and over. It's a common order.
  In a few days, I stop computing that. As soon as you say burrito, milk shake, and nachos, I say $6.10. I've seen that combination two hundred times; it's always $6.10. Now I've learned.
  So in conclusion: idiot, you keep describing something that isn't learning, and calling it learning.
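The distinction this comment draws, recomputing a total versus simply recalling it, can be sketched in a few lines. This is only an illustration of the argument, not a model of human memory: the `Cashier` class and the `REPS_TO_LEARN` threshold are hypothetical, and the threshold is borrowed loosely from the comment's "two hundred times". The prices are the ones given in the comment.

```python
from decimal import Decimal

# Menu prices as given in the comment.
PRICES = {
    "burrito": Decimal("2.89"),
    "milk shake": Decimal("1.93"),
    "nachos": Decimal("1.28"),
}

REPS_TO_LEARN = 200  # hypothetical threshold, not a claim about real memory


class Cashier:
    def __init__(self):
        self.seen = {}     # order -> number of times it has been computed
        self.learned = {}  # order -> total that can be recalled directly

    def total(self, items):
        order = tuple(sorted(items))
        if order in self.learned:
            # Learned: no arithmetic needed, just recall.
            return self.learned[order], "recalled"
        # Not learned yet: apply previously learned arithmetic.
        amount = sum((PRICES[item] for item in order), Decimal("0"))
        self.seen[order] = self.seen.get(order, 0) + 1
        if self.seen[order] >= REPS_TO_LEARN:
            self.learned[order] = amount
        return amount, "computed"
```

On the first request the cashier computes $6.10; only after the same order has been computed `REPS_TO_LEARN` times does a later request come back as "recalled".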
  
  --
  Support my political activism on Patreon.
47. Re:If a human chooses the algorithm, is it AI? by phantomfive · 2018-12-02 19:37 · Score: 1
  
  Nice paper.
  
  --
  "First they came for the slanderers and i said nothing."
Uber? by Anonymous Coward · 2018-11-27 06:23 · Score: 1

So what does this (an article about Alphabet's deep learning) have to do with Uber???
1. Re:Uber? by Anonymous Coward · 2018-11-27 08:58 · Score: 0
  
  RTFA:
  
  The new algorithms come from Uber’s AI research team in San Francisco, led by Jeff Clune, who is also an associate professor at the University of Wyoming. The team demonstrated a fundamentally different approach to machine learning within an environment that offers few clues to show an algorithm how it is doing.
  
  The approach leads to some interesting practical applications, Clune and his team write in a blog post released today—for example, in robot learning. That’s because future robots will need to figure out what to do in environments that are complex and offer only a few sparse rewards.
  
  Uber launched its AI lab in December 2016, with the goal of making fundamental breakthroughs that could prove useful to its business. Better reinforcement-learning algorithms could ultimately prove useful for things like autonomous driving and optimizing vehicle routes.
  
  AI researchers have typically tried to get around the issues posed by Montezuma’s Revenge and Pitfall! by instructing reinforcement-learning algorithms to explore randomly at times, while adding rewards for exploration -- what’s known as “intrinsic motivation.”
  
  But the Uber researchers believe this fails to capture an important aspect of human curiosity. “We hypothesize that a major weakness of current intrinsic motivation algorithms is detachment,” they write. “Wherein the algorithms forget about promising areas they have visited, meaning they do not return to them to see if they lead to new states.”
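The "detachment" idea quoted above suggests keeping an explicit record of places already reached and returning to them before exploring further. Below is a minimal sketch of that idea on a toy 1-D corridor. It is not Uber's code: the function name, the corridor, the 50/50 cell-selection rule and every parameter are assumptions made for illustration.

```python
import random


def explore_with_archive(goal=30, iterations=300, steps_per_visit=10, seed=0):
    """Archive-based exploration on a toy corridor with positions 0..goal.

    The archive maps each visited position ("cell") to the shortest action
    sequence found so far that reaches it, so no visited place is forgotten.
    """
    rng = random.Random(seed)
    archive = {0: []}  # cell -> list of moves that reach it from the start
    for _ in range(iterations):
        # Pick a remembered cell: half the time the frontier, otherwise any cell.
        if rng.random() < 0.5:
            cell = max(archive)
        else:
            cell = rng.choice(sorted(archive))
        # "Return" to it. The toy world is deterministic, so replaying the
        # stored moves would land exactly on this cell.
        actions = list(archive[cell])
        pos = cell
        # Explore randomly from there, recording anything new or shorter.
        for _ in range(steps_per_visit):
            move = rng.choice((-1, 1))
            pos = min(goal, max(0, pos + move))
            actions.append(move)
            if pos not in archive or len(actions) < len(archive[pos]):
                archive[pos] = list(actions)
        if goal in archive:
            break
    return archive
```

Replaying `archive[goal]` from position 0, with the same clamping, ends at the goal. A purely random walk from the start has no such memory and has to rediscover the early part of the corridor on every attempt.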
cool! by Anonymous Coward · 2018-11-27 06:24 · Score: 0

Now all they need to do is make an AI that can figure out how to make them profitable!
Non-differentiable functions are hard to optimize by smoothnorman · 2018-11-27 06:25 · Score: 2

Once again, this seems to be a case of "AI" research re-discovering some basic math: if the function has discontinuities or is otherwise non-differentiable ("behaviors that are necessary to advance within the game do not help increase the score until much later.") then its optimization is hard or dependent on fortunate starting conditions.
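A discrete toy makes the point concrete. This is an analogue of the argument rather than the commenter's math, and the `hill_climb`, `sparse` and `dense` names are made up for the sketch: greedy local search gets nowhere when the score is flat almost everywhere, unless it happens to start next to the payoff.

```python
def hill_climb(score, start, lo=0, hi=100, max_steps=1000):
    """Greedy local search: move to a neighbor only if it scores strictly higher."""
    x = start
    for _ in range(max_steps):
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=score)
        if score(best) <= score(x):
            return x  # no neighbor improves on x: stuck here
        x = best
    return x


def sparse(x):
    # Reward appears only at the very end, like a score that stays flat for ages.
    return 1 if x == 100 else 0


def dense(x):
    # A hypothetical shaped score that rises with every step of progress.
    return x


print(hill_climb(sparse, start=0))   # 0: flat everywhere nearby, never moves
print(hill_climb(sparse, start=99))  # 100: a fortunate starting condition
print(hill_climb(dense, start=0))    # 100: every step gives feedback
```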
As an aside, have we even developed an accepted definition for what properly qualifies as "AI"? Recently I was being flogged some software whose selling point was an "AI engine" which turned out to be little more than a previous version with a little bit of Bayesian statistics bolted on.
"Montezuma's Revenge" by Anonymous Coward · 2018-11-27 06:29 · Score: 0

I remember chuckling as an eight year old at the name of this video game when it came out. 30+ years later, I still can't believe they named it what they did.
For those unfamiliar with the American idiom: https://en.wikipedia.org/wiki/Traveler%27s_diarrhea
Pitfall? by Anonymous Coward · 2018-11-27 06:31 · Score: 0

If it is the Activision version being referred to, that game was so repetitive we used to have contests about how far you could get playing with your eyes closed.
Blockish Worlds??? by UnknownSoldier · 2018-11-27 06:44 · Score: 1

> protagonists explore blockish worlds filled with deadly creatures and traps.
No, they ARE block rooms -- each room IS exactly 40x25 tiles (on the Apple ][ it only displays 40x24 tiles.) The tiles just happen to be a) animated, and b) mega-tiles such as ladders which are three tiles wide.
Also, here is a map of the world --- It makes a pyramid shape, go figure!
Impressive that it could fit 99 rooms in less than 32 KB!
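Taking the comment's figures at face value, a quick back-of-the-envelope check shows why that is impressive. The one-byte-per-tile storage is an assumption for comparison only; nothing here says how the game actually stores its rooms.

```python
TILES_WIDE, TILES_HIGH = 40, 25  # room size from the comment
ROOMS = 99                       # room count from the comment
BUDGET_BYTES = 32 * 1024         # "less than 32 KB"

tiles_per_room = TILES_WIDE * TILES_HIGH      # 1000 tiles
naive_bytes = tiles_per_room * ROOMS          # 99000 bytes at one byte per tile (assumed)
bytes_per_room_budget = BUDGET_BYTES / ROOMS  # about 331 bytes per room

print(tiles_per_room, naive_bytes, round(bytes_per_room_budget))
```

So even if the whole 32 KB held nothing but room data, each room would have to fit in roughly a third of its naive size, before leaving any space for code and graphics.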
Machine Learning, not A.I. by Anonymous Coward · 2018-11-27 07:01 · Score: 0

n/t
Re:Non-differentiable functions are hard to optimi by Anonymous Coward · 2018-11-27 07:03 · Score: 0

Once again, this seems to be a case of "AI" research re-discovering some basic math: if the function has discontinuities or is otherwise non-differentiable ("behaviors that are necessary to advance within the game do not help increase the score until much later.") then its optimization is hard or dependent on fortunate starting conditions.
Except they didn't say it was hard. Since you are too smart to even read before commenting, here's a picture, which shows how much better they've done than all other "AI" and a human.
Free energy principle? by jhoger · 2018-11-27 07:05 · Score: 1

Well why not just make building models of the environment and reducing surprise at observation compared to expectation the optimization goal?
Isn't that the point of "free energy principle" thinking?
Re:Non-differentiable functions are hard to optimi by smoothnorman · 2018-11-27 07:21 · Score: 1

How do you know I didn't read it prior to commenting? That's amazing! ...and yet I did read it before I commented. It was me saying it's hard. I didn't say they said it. At least you're not lacking for assumptions in your gratuitous reply.
Hacked by Anonymous Coward · 2018-11-27 07:32 · Score: 0

Technically it was hacked, not cracked.
reinforcment by avandesande · 2018-11-27 07:43 · Score: 1

Why wouldn't you assign points for a strategy not resulting in something negative (i.e., staying-alive points)? Seems like an easy tweak.

--
love is just extroverted narcissism
1. Re: reinforcment by Anonymous Coward · 2018-11-27 07:49 · Score: 0
  
  Because standing still indefinitely may well keep you alive but isn't desirable behaviour.
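A toy corridor shows how a per-step survival bonus can backfire. Everything here (the `run` function, the hazard and treasure positions, the reward sizes) is made up for illustration and is not taken from either game.

```python
HAZARD, TREASURE = 3, 6  # hypothetical tile positions


def run(policy, steps=100, alive_bonus=1, treasure_reward=50):
    """Total reward for one episode; policy maps a position to a move size."""
    pos, total = 0, 0
    for _ in range(steps):
        pos += policy(pos)
        if pos == HAZARD:
            return total  # died: no further reward
        total += alive_bonus
        if pos == TREASURE:
            return total + treasure_reward  # level cleared, episode ends
    return total


def stand_still(pos):
    return 0


def clumsy(pos):
    return 1  # walks straight into the hazard


def careful(pos):
    return 2 if pos == HAZARD - 1 else 1  # jumps over the hazard


print(run(stand_still), run(clumsy), run(careful))  # 100 2 55
print(run(stand_still, alive_bonus=0), run(careful, alive_bonus=0))  # 0 50
```

With the bonus switched on, doing nothing for 100 steps outscores actually finishing the level; with it switched off, only real progress pays.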
2. Re: reinforcment by avandesande · 2018-11-27 08:45 · Score: 1
  
  Sign of the times I guess when people think doing nothing is action ; )
  
  --
  love is just extroverted narcissism
uber already has one death how many more before by Joe_Dragon · 2018-11-27 07:52 · Score: 1

Uber already has one death; how many more before they get a safe auto-drive AI?
1. Re:uber already has one death how many more before by mentil · 2018-11-27 20:05 · Score: 1
  
  I dunno, how many more would it have to kill before YOU would consider it safe?
  Oh, wait...
  
  --
  Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
Pitfall was the best by Anonymous Coward · 2018-11-27 07:54 · Score: 0

Shit I loved that game. What an awesome game it was... swinging from vines and jumping on croc heads. Hell yes!
that line of thinking is very bad in an car! by Joe_Dragon · 2018-11-27 07:58 · Score: 1

After falling off a drawbridge, next time we can slam on the gas or add map data saying there is one.
After driving off a pier, next time we can add "wait for ferry" to the map data.
1. Re:that line of thinking is very bad in an car! by Anonymous Coward · 2018-11-27 11:20 · Score: 0
  
  It doesn't have to be this car. It can be a simulated car, or someone else's car.
Want to read the actual details? Here's the blog. by SuperKendall · 2018-11-27 08:02 · Score: 4, Informative

I couldn't find this link anywhere in the actual article Slashdot linked to or the summary - the blog post laying out what Go-Explore is in more detail:
http://eng.uber.com/go-explore/

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Montezuma's revenge gameplay by Neo-Rio-101 · 2018-11-27 08:02 · Score: 1

I remember managing to finish all 9 levels of the game back in the day in a single sitting. I do recall that you have to map out your path to the bottom of the pyramid, so essentially there's a lot of exploration involved... however, past level 3 the map remains largely unchanged and instead the lower levels go dark, requiring a torch. At this point there's rote memorization of the platforms that needs to happen because you can't see them before finding the torch ... especially one of the rooms that has the torch because it is a nightmare.
I figure that as long as the AI is good at exploring and mapping, it will have no issues remembering when and where to jump, even as it can't see the floor in the later levels.

--
READY.
PRINT ""+-0
1. Re:Montezuma's revenge gameplay by mentil · 2018-11-27 20:07 · Score: 1
  
  Wait that doesn't sound at all like... oh wait, I was thinking of Custer's Revenge. I'd wondered why they picked THAT game.
  
  --
  Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
Problem is reality... by blahplusplus · 2018-11-27 08:09 · Score: 1

... is unstructured. When you put an 'algorithm' in a game, it has no awareness with which to make discoveries about goals and motivations. Take the idea of them mentioning one of their algorithms not being able to get out of the first room...
Now think of what that means: your AI has no sense of when to move on because it doesn't realize there's nothing interesting going on in the space it finds itself in. When goals don't exist or are unstructured you basically have to invent goals, aka come to the realization you're wasting time and resources in a space that isn't interesting. Instead of making an algorithm to go through a game, they should basically automate the navigation and come up with algorithms that can come up with goals on their own when there is no stated goal, aka they should be able to separate areas of interest from areas of disinterest and then use those to come up with goals.
Many of the algorithms, when I read about them, don't sound very interesting, because the problem with the real world is that tasks are unstructured and you usually have to do a lot of "legwork" first before you can even infer a goal or come up with a task.
AKA there really need to be algorithms that make sense of an environment when that environment has no particular 'end state'. Imagine an open world game where there are tasks and activities to do, but there's no 'finish' line. To take an example, say you've got a fishing minigame, you can drive around town, etc. The AI should come up with a way to discover or generate interest and goals for itself when the world is simply unstructured.
Chart parsers by nut · 2018-11-27 08:33 · Score: 1

This reminds me in some ways of the chart parsers I was playing around with in university for a paper on natural language processing. I think these days they are mostly used in the context of code compilation, but I must admit I don't know much about modern natural language processing tools, so I don't know if they're still a thing there.

--
Never trust a man in a blue trench coat, Never drive a car when you're dead
Title Vs summary by enriquevagu · 2018-11-27 09:15 · Score: 1

The title mentions novel AI techniques by Uber involving a new type of memory. Cool!!
The summary mentions neither Uber nor this new memory. What exactly is the news?
Monty's Revenge by Anonymous Coward · 2018-11-27 09:32 · Score: 0

It's heartening to know that one of my favourite games of that era was also one of the hardest for AI to learn. I don't know how many hours I poured into that one.
Re:Non-differentiable functions are hard to optimi by ceoyoyo · 2018-11-27 13:58 · Score: 1

Reinforcement learning basically exists to solve problems that have the properties you describe. Researchers in the field have been aware of them for a long time. Many modern reinforcement learning algorithms basically use artificial neural networks to estimate the trickier bits in a Q-learning framework. Q-learning was introduced in 1989 and the basic theory developed in the early nineties.
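For readers who haven't seen it, the tabular Q-learning update mentioned above fits in a few lines. The update rule itself is the standard one; the `q_update` helper, the state and action names, and the step sizes are illustrative choices for this sketch.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
ACTIONS = ["left", "right"]

# The reward only arrives on the last transition...
q_update(q, "s1", "right", 1.0, "end", ACTIONS)
# ...but a later update of the earlier state picks up some of that value.
q_update(q, "s0", "right", 0.0, "s1", ACTIONS)
```

The second call gives `("s0", "right")` a small positive value even though that step paid nothing, which is how delayed rewards propagate backward. When a reward is almost never reached in the first place, there is nothing to propagate, which is the difficulty the article describes.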
Human intervention much? by mutherhacker · 2018-11-27 21:59 · Score: 1

From TFA: "The researchers also found that adding a little bit of domain knowledge, by having human players highlight interesting or important areas, sped up the algorithms’ learning and progress by a remarkable amount." This defeats the whole purpose of autonomous independent exploration.
Lots of games have been solved by OrangeTide · 2018-11-28 03:46 · Score: 1

Games like tic-tac-toe are solved, and I believe Checkers/Draughts is solved as well. Maybe Chess and Go will be solved in the near future.
Nethack may remain the one unsolved game for the foreseeable future.

--
“Common sense is not so common.” — Voltaire