DeepMind's AI Agents Exceed 'Human-Level' Gameplay In Quake III (theverge.com)

Posted by BeauHD on Thursday July 5, 2018 @11:40AM from the rise-of-the-machines dept.

An anonymous reader quotes a report from The Verge: AI agents continue to rack up wins in the video game world. Last week, OpenAI's bots were playing Dota 2; this week, it's Quake III, with a team of researchers from Google's DeepMind subsidiary successfully training agents that can beat humans at a game of capture the flag. DeepMind's researchers used a method of AI training that's also becoming standard: reinforcement learning, which is basically training by trial and error at a huge scale. Agents are given no instructions on how to play the game, but simply compete against themselves until they work out the strategies needed to win. Usually this means one version of the AI agent playing against an identical clone. DeepMind gave extra depth to this formula by training a whole cohort of 30 agents to introduce a "diversity" of play styles. How many games does it take to train an AI this way? Nearly half a million, each lasting five minutes. DeepMind's agents not only learned the basic rules of capture the flag, but strategies like guarding your own flag, camping at your opponent's base, and following teammates around so you can gang up on the enemy. "[T]he bot-only teams were most successful, with a 74 percent win probability," reports The Verge. "This compared to 43 percent probability for average human players, and 52 percent probability for strong human players. So: clearly the AI agents are the better players."

16 of 137 comments

Wow! by Anonymous Coward · 2018-07-05 11:47 · Score: 5, Insightful

I'm sure aimbotting & instantaneous team communication had nothing to do with their success.
1. Re:Wow! by gweihir · 2018-07-05 14:00 · Score: 2
  
  Indeed. Another meaningless stunt.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
2. Re:Wow! by gl4ss · 2018-07-05 18:24 · Score: 5, Insightful
  
  they probably use a "virtual camera" already.
  still makes them aimbots though.
  quake III is not a good candidate for this, simply due to being a reactions game.
  
  --
  world was created 5 seconds before this post as it is.
3. Re:Wow! by gravewax · 2018-07-05 18:50 · Score: 2
  
  and? a computer can process the image information on a screen a 1000 times faster and react to that information a 1000 times faster. numbers or pictures is irrelevant, computers have an inherent advantage here, the fact it didn't reach 100% victory rate says it still has a ways to go given it is starting from such a huge tactical advantage.
Bad Challenge by HeckRuler · 2018-07-05 11:50 · Score: 4, Insightful

But that's a skill-based game, as opposed to strategy or anything needing intelligence. "Skill" as in reaction time to seeing an opponent and successfully moving the mouse onto their head and clicking. Give me a couple minutes and I can script up a bot that dominates players. That's not hard. And it's not even fun.
To have a real comparison, you'd have to let humans play with cheat-codes. Aim-bots and enemy highlighters. Maybe set it to ultra-slow, or add in bullet-time or something. But at that point, you're no longer playing Quake.
The part where it learned the interface, the objectives, and some strategies on its own is fun and interesting. The sort of thing I'd expect from an undergrad in CompSci. But it's been done and it's not any more impressive than having it learn how to beat MarioBros.
Chess and Go are games that require thought. Quake requires twitch.
1. Re:Bad Challenge by Djoulihen · 2018-07-05 12:54 · Score: 4, Informative
  
  From TFA: "DeepMind’s agents also didn’t have access to raw numerical data about the game — feeds of numbers that represents information like the distance between opponents and health bars. Instead, they learned to play just by looking at the visual input from the screen, the same as a human"
  You've got your very least, but I'm pretty sure you'll find another way to turn this into just "shite" work.
2. Re:Bad Challenge by dcollins117 · 2018-07-05 15:34 · Score: 2
  
  Google search is AI. It does a great job of finding you cat pictures and shit.
  Word to the wise: do not google "cat pictures and shit."
3. Re:Bad Challenge by Solandri · 2018-07-05 19:35 · Score: 2
  
  "Chess and Go are games that require thought. Quake requires twitch."
  Chess and Go are deterministic. The same set of moves always results in the same outcome, meaning there is always a "right" answer to "what's the best move?"
  
  Quake, by virtue of being a twitch game (and multi-player) is non-deterministic. That makes it a much harder problem for AI to solve, because a rule which works the first time may not work in subsequent trials. That is, the effectiveness of a rule is not the binary success/fail like you get in deterministic systems. The effectiveness spans the entire range from 0% to 100% probability of success, and the probability is constantly updated with new trials, and that probability can change if opponents start to use different strategies.
  
  "Give me a couple minutes and I can script up a bot that dominates players. That's not hard. And it's not even fun."
  That's the whole point. They didn't script up a bot. Heck, they didn't even teach it the rules of the game. They programmed an AI, and let it come up with scripts on its own by playing the game and "discovering" for itself what actions resulted in a win vs a loss. You're correct that the AI has a huge advantage over human players (unless they had it play by pointing a camera at the screen, and using a mechanical arm to move a mouse). But that's tangential to what makes this research interesting.
4. Re: Bad Challenge by martyros · 2018-07-05 21:58 · Score: 4, Informative
  
  "But that's a skill-based game, as opposed to strategy or anything needing intelligence. "Skill" as in reaction time to seeing an opponent and successfully moving the mouse onto their head and clicking."
  Strangely enough, they already thought of that:
  
  "First, we noticed that the agents had very fast reaction times and were very accurate taggers, which could explain their performance. However, by artificially reducing this accuracy and reaction time we saw that this was only one factor in their success. ...Even with human-comparable accuracy and reaction time the performance of our agents is higher than that of humans."
  Both the summary and the Verge article seem to have missed the point of this development -- an improvement to the agent design scheme.
  Last year, after smashing both Go and chess with their self-play-from-zero strategy, they tried the same thing with StarCraft. And they lost spectacularly -- even after millions of games, their self-trained DeepMind agents were unable to beat even the most simplistic "scripted" StarCraft AI -- the ones designed for n00b humans to beat up on. They discovered that while the self-play agents were able to eventually figure out activities like "harvest minerals", they were unable to put those together into higher-level activities like building an army and winning a game.
  One of the key refinements they introduce in this paper is to allow the agents to evolve their own internal "rewards", which were sub-steps towards winning. These goals included things like killing an opponent, capturing a flag, recapturing their own flag, avoiding being killed, and so on. The programmers architected in that such rewards were *possible*, but let the learning algorithm define what those rewards actually were and how much the reward was for each one.
  They call this architecture 'FTW'. Then they ran their vanilla "self-play from nothing" bots again, and found that just like in StarCraft, the bots never made much progress; but they found that the new bots, which had self-made internal rewards, were able to consistently beat strong humans, even after having their reaction time and visual accuracy reduced below that of measured humans.
  
  --
  TCP: Why the Internet is full of SYN.
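
The internal-reward idea described in the comment above can be sketched roughly as follows. This is only an illustration under loud assumptions, not DeepMind's implementation: the event names are borrowed from the comment's examples, and `internal_reward`, `mutate`, and `evolve_step` are hypothetical helpers. The point is just that the programmers fix which events *can* carry reward, while the weights themselves are left to be tuned across a population of agents.

```python
import random

# Hypothetical event names, taken from the examples in the comment above.
EVENTS = ["tagged_opponent", "captured_flag", "returned_own_flag", "was_tagged"]


def internal_reward(weights, event_counts):
    """Weighted sum of game events: the agent's self-defined reward signal."""
    return sum(weights[e] * event_counts.get(e, 0) for e in EVENTS)


def mutate(weights, rng, scale=0.2):
    """Return a perturbed copy of a weight table (multiplicative noise)."""
    return {e: w * (1.0 + rng.uniform(-scale, scale)) for e, w in weights.items()}


def evolve_step(population, win_rates, rng):
    """Replace the weakest agent's weights with a mutated copy of the strongest's.

    Assumption: a single copy-and-perturb step stands in for whatever
    population-level tuning the real system uses.
    """
    best = max(range(len(population)), key=lambda i: win_rates[i])
    worst = min(range(len(population)), key=lambda i: win_rates[i])
    new_population = list(population)
    new_population[worst] = mutate(population[best], rng)
    return new_population
```

In this toy version the only external signal is each agent's win rate; the per-event weights are never set by hand after initialisation, which is the part the comment singles out as the key refinement.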
In 2022... by AmazingRuss · 2018-07-05 11:59 · Score: 5, Insightful

... we will be hunted to extinction by packs of weaponized roombas.
Stripped down by thePsychologist · 2018-07-05 13:54 · Score: 4, Informative

While interesting and promising, it's worth noting that the game they were playing was not the "real" Quake 3 arena with all the weapons but a highly stripped down version with one weapon, no power-ups, and brightly-coloured walls to help the AI perceive the level design.

--
"What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
1. Re:Stripped down by gweihir · 2018-07-05 14:02 · Score: 2
  
  So they needed to cheat pretty badly in order to get their meaningless stunt going.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Not BETTER - Just FASTER by kenwd0elq · 2018-07-05 14:00 · Score: 4, Insightful

Video games like Quake, Starcraft II, and DOTA have a limited number of possible moves, and the FASTER player is usually victorious. Bots aren't better players; they're just WAY faster.
Good, I can get back my life now by Nethead · 2018-07-05 14:26 · Score: 2

Once I can afford one of these AIs I can let it do all my gaming and I can go back to having a life.

--
-- I have a private email server in my basement.
Ha ha by JustAnotherOldGuy · 2018-07-05 16:59 · Score: 2

Fifty years from now the few remaining survivors of the Robot Apocalypse will look back on these early years in AI research, and they'll marvel at how we were just too stupid to foresee or even consider that AI would become the dominant "life form" on the planet, replacing us as the apex predator.
"Yes, before the Robots took over the world," said Og, as he threw another stick on the fire, huddling in the ash gray wasteland that used to be New York.
"The scientists said AI was 'totally safe' and 'nothing could go wrong'," Og continued, "but you kids don't remember that because that was back when we had electricity and people talked into little boxes they carried in their pockets."
The children all laughed at Og, he always told the biggest lies because he was so old (almost 30!) and so his stories could not be believed.
"What's a 'sy-en-tiss'?" whispered Janey.
"They were the people that knew stuff and made the world run." Og said.
The children laughed again, "No one makes the world run, silly!" they hooted.

--
Just cruising through this digital world at 33 1/3 rpm...
Re:Give the humans aimbot program by Impy the Impiuos Imp · 2018-07-05 18:57 · Score: 2

I cut my grass because I want to; sure it too could be automated but WHY? Where's the pleasure in that?
If you want, come on over and double your pleasure with my lawn.

--
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.