Can DeepMind's AI Really Beat Human Starcraft II Champions? (arstechnica.com)
Google acquired DeepMind for $500 million in 2014, and its AI programs later beat the world's best player in Go, as well as the top AI chess programs. But when its AlphaStar system beat two top Starcraft II players -- was it cheating?
Long-time Slashdot reader AmiMoJo quotes BoingBoing: It claimed the AI was limited to what human players can physically do, putting its achievement in the realm of strategic analysis rather than finger twitchery. But there's a problem: it was often tracked clicking with superhuman speed and efficiency.
Aleksi Pietikainen writes "It is deeply unsatisfying to have prominent members of this research project make claims of human-like mechanical limitations when the agent is very obviously breaking them and winning its games specifically because it is demonstrating superhuman execution."
"It wasn't an entirely fair fight," argues Ars Technica, noting the limitations DeepMind placed on its AI "seem to imply that AlphaStar could take 50 actions in a single second or 15 actions per second for three seconds." And in addition, "This API may allow the software to glean more information..." After playing back some of AlphaStar's back-to-back 5-0 victories over StarCraft pros, the company staged a final live match between AlphaStar and [top Starcraft II player Grzegorz "MaNa"] Komincz. This match used a new version of AlphaStar with an important new limitation: it was forced to use a camera view that tried to simulate the limitations of the human StarCraft interface. The new interface only allowed AlphaStar to see a small portion of the battlefield at once, and it could only issue orders to units that were in its current field of view....
We don't know exactly why Komincz won this game after losing the previous five. It doesn't seem like the limitation of the camera view directly explains AlphaStar's inability to respond effectively to the drop attack from the Warp Prism. But a reasonable conjecture is that the limitations of the camera view degraded AlphaStar's performance across the board, preventing it from producing units quite as effectively or managing its troops with quite the same deadly precision in the opening minutes.
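The burst arithmetic in the Ars Technica quote is easier to see with a toy model. The sketch below assumes, purely for illustration, that the limit is a cap of 50 actions in any 5-second sliding window; the class name `WindowedActionCap`, the helper `count_allowed`, and both parameters are hypothetical and are not taken from DeepMind's code. Under that assumption the average rate is capped at 600 actions per minute, yet 50 actions in one second, or 15 per second for three seconds, both pass.

```python
from collections import deque


class WindowedActionCap:
    """Allow at most max_actions in any sliding window of window_s seconds.

    Hypothetical illustration only; not DeepMind's actual limiter.
    """

    def __init__(self, max_actions, window_s):
        self.max_actions = max_actions
        self.window_s = window_s
        self.times = deque()  # timestamps of accepted actions, oldest first

    def try_act(self, t):
        # Forget accepted actions that fell out of the window ending at time t.
        while self.times and self.times[0] <= t - self.window_s:
            self.times.popleft()
        if len(self.times) < self.max_actions:
            self.times.append(t)
            return True
        return False


def count_allowed(cap, timestamps):
    """Count how many of the requested actions the cap lets through."""
    return sum(cap.try_act(t) for t in timestamps)


# Assumed limit: 50 actions per 5-second window (an average of 600 per minute).
burst = WindowedActionCap(max_actions=50, window_s=5.0)
one_second_burst = [i / 50 for i in range(50)]    # 50 actions inside one second
print(count_allowed(burst, one_second_burst))

sustained = WindowedActionCap(max_actions=50, window_s=5.0)
three_second_run = [i / 15 for i in range(45)]    # 15 per second for 3 seconds
print(count_allowed(sustained, three_second_run))
```

A cap that is meant to rule out superhuman bursts would need a much shorter window, or a per-second ceiling on top of the windowed average.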
This is not really beating a human fairly. If you could click that fast then sure, but otherwise it's not a fair fight.
Just cruising through this digital world at 33 1/3 rpm...
When AlphaZero was pitted against Stockfish, the best chess AI, they set the match up with an outdated version of Stockfish and bizarre time controls that removed Stockfish's edge in time management (a static time per move was enforced). Stockfish didn't get its opening books (a mini database containing information about the best moves to start with), nor did it get endgame tablebases (another mini database of information about moving at the end of games), and it was limited to a very small amount of RAM (only 1GB when it should've had 64GB or more). DeepMind will CONTINUE to mislead people about what they've accomplished at every opportunity.
It should be clear to anyone that in the final game MaNa won with the drop play. The bot kept sending its whole army back and forth like most noobs do. It seems like he found its weakness, like most pros eventually do with game AIs.
This 'AlphaStar' agent is definitely impressive but has a long ways to go
The issue is clicking accuracy. Watch the pros play: how many of the 15 clicks per second are actually accurate? When they micro, are they selecting exactly the right units? Are they clicking on exactly the right region to move to?
Either reduce the AI's clicking accuracy, or reduce its clicking rate.
... to test AI. RTS games already have a bad UI where the bottleneck is the human being in the chair: trying to control many units with a limited UI via keyboard and mouse is cumbersome at best. It was even back in the Warcraft 2 days when you tried to bloodlust ogres or heal paladins -- healing paladins being damn near impossible. Warcraft 3 'fixed' the issue of impossible casting with large numbers of units by using autocast.
The main problem is that games like Starcraft can be played perfectly, because it's really an action game masquerading as a strategy game: the actions take place in real time. So for a computer like DeepMind, the human appears super slow. Imagine if your opponent appeared retarded in terms of their reflexes. That's basically DeepMind vs any human opponent in an RTS. So a computer's perfect information and perfect reflexes mean making 99% accurate micromanaging decisions for units everywhere at once.
You can't do that as a human player. Deepmind for an RTS is like having an aimbot in quake. Not really impressive since we already know making bots that can win against humans is trivially easy.
Just wanted to note something people seem to be missing. Some pro player was incredibly dominant winning 7 of 9 big tournaments. He peaks at 500 and has an EPM of 330? Have you considered that he's not winning on strategy but on speed and precision alone?
So when an AI does it it is all of a sudden not fair? That's pretty bogus for me. All it tells me is that SC2 is a horrible "strategy" game and that the strategy aspect of it peaks really easily and then after that the primary method of winning is to simply outspeed and outmuscle your opponent with unit abilities or units that counter them and with superior unit control and superior macro. That's it. If they limited AlphaStar to human speed we'd just see 50/50 games. Players that are faster would win more. Players that are slower would lose more and there'd be a huge range in there as SC2 is really unit spammy and really snowbally with 1A armies and maybe a few micro spots here or there. So if you manage to issue the commands out in time that counter your opponent's actions in those moments you win and if you don't you lose. No strategy about it.
Watson won at Jeopardy also because it could press the button faster, which is considered the key skill. However, despite that, it still had to answer the questions, which was impressive.
I would say that this result is also impressive, even if the machine was not really quite as good as the humans.
I've seen a couple of comments already where folks are talking about DeepMind being able to micro and click faster than a human. While that's neat, that's not entirely the goal here with DeepMind. The makers already put an artificial limit of 600/minute on the actions DeepMind can execute. In comparison the humans are executing at around 250-300 actions per minute. While that indeed made DeepMind's micro game strong, the real tipping point was that DeepMind could see anywhere a unit was located. Humans, however, can only see a "screen at a time". When DeepMind's makers went back and implemented a "screen at a time" limitation, DeepMind was easily fooled again. And that's the thing here. Not the "can I beat a human?" but "can a human fool me?". As soon as the amount of information coming IN to DeepMind was reduced, the data coming OUT couldn't compensate and the humans were able to slowly figure out how to trick the AI into an unwinnable situation.
There's a continual fallacy on Slashdot where pure research like DeepMind is confused for "who's jerb can it take and reasons why it can't take that jerb." The media here is presenting in the terms of "Hey look! Something AI can do better than us worthless puny humans!" but DeepMind is mostly research first. The entire point here isn't, "Hey can I pawn this guy?" It's why did limiting the input allow the human to so easily fool the machine? Researchers aren't sure why the AI was so easily fooled when, with a wider field of view, it could not be so easily fooled. That question has much wider-ranging implications than how great the micro game is for DeepMind.
Sounds like a bug that it did poorer in areas that should've been unrelated to camera view. Makes me wonder if it was trained on whole-map data, and it was unsure how to act when only given part of the map.
Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
While watching the match, it seemed to me that it was not winning on being human level intelligent or adjusting on the fly. The AI was simply able to not only click on units really quickly, but it was able to switch to different groups faster than a person could. This allowed the computer to effectively be on the entire board all at the same time.
When you think about it, the AI was showing no intelligence. It had run a statistical analysis, and came up with the notion that if you can click really fast, these units are the best in all situations. It does not adjust to new stuff. In effect, it just runs a script.
"Liberalism is a very noble idea, currently controlled by some very bad people. Be sure you do not get the two confused.
We don't know exactly why Komincz won this game after losing the previous five.
Well you could start with MaNa's explanation:
We noticed that the agent sticks to the basic units a lot. ... It doesn't really transition out of it, like, it does make some upgrades but it constantly makes the basic units.... Our thought process is...the way that we should exploit agent play is...I should just make an immortal army, basically, but do not do anything besides that.... I can probably, most likely, defeat AlphaStar with simply better unit composition rather than unit control.
He and TLO discussed how AlphaStar liked to make a lot of stalkers and rely on excellent control of them, so they used a strategy to counter that.
(The AI did run into some odd behavior as well, with all of its stalkers huddled against the edge of the map trying to get at MaNa's warp prism.)
We don't know exactly why Komincz won this game after losing the previous five
You could know if you'd watch the games. In the first set, DeepMind won with inhumanly superior micro. It was really cool, but computers have been better at micro for a long time. Speed and precision are things computers are good at, that's why we have aimbots.
In the second set, the human readjusted, and thought of strategies that would defend against the superior micro (by building more powerful units), while taking advantage of the computer's weaknesses (poor knowledge of army compositions, weak knowledge of positioning, and seemingly no object permanence: once enemy units are out of view, it has no idea where they are or if they exist).
"First they came for the slanderers and i said nothing."
bizarre time controls that removed stockfish's edge in time management
AlphaZero got the same time control.
stockfish didn't get its opening books
AlphaZero didn't get an opening book either.
nor did it get endgame tablebases
Neither did AlphaZero. Also note that in many of the games, Stockfish was basically lost in the early middlegame.
only 1GB when it should've had 64GB or more
That's the only legitimate concern, but the whole argument is stupid nitpicking nevertheless. This is like a race between a horse and the first model car. The exact conditions and outcome are secondary to the proof of validity of general principles. AlphaZero was just the first iteration of a new development. The fact that it came close to Stockfish at all is enough to show that this approach has merit. Sure, the SF setup was not optimal, but it wasn't completely crippled either.
Where exactly does one draw the line about what constitutes "cheating", versus exploiting a natural advantage?
Is it cheating if the AI gets to use API calls to control its forces, rather than physically pushing keys on a keyboard and moving a mouse, the way a human player does? Arguably so, if keyboard-and-mouse-dexterity are considered part of the skill set for the game. Perhaps a fair contest should require the AI to use robotic arms and video cameras on a gaming PC.
On the other hand, if it's only the strategy portion of the game that is seen as relevant, then the ability to interface electronically to the game's engine (rather than through slow fingers and nerves) is simply an advantage the AI has over clumsy humans (at least until someone perfects a neural brain-stem link, I suppose).
Was the steam-powered rock-drill cheating when it competed against John Henry in a hole-drilling contest? Yes, no, maybe, depends on how you define cheating?
I don't care if it's 90,000 hectares. That lake was not my doing.
Still it sounds like they have chosen the battlefield that suits them best. E.g. maybe Deepmind is just better without the libraries because Stockfish was counting on these, and wasn't optimized to work without them?
That's like saying a fat runner is optimized to use a car. Stockfish isn't really optimized for opening books, it just sucks without them, mainly because the difference between a good and poor move in the opening may not manifest itself in a concrete eval difference far beyond the search horizon. As shown in some of the games, Stockfish doesn't care if its bishop gets trapped behind its own pawns. A bishop is still a bishop. It may get a penalty for limited mobility, but it doesn't get a penalty for being stuck for 40 moves. And the heuristic eval that Stockfish uses is just too simple to recognize these concepts.
Still I would conclude that Google's Deepmind only showed that it does not need an opening library.
I would say a neural network just combines the advantages of a database and of calculating moves ahead. The weighting of different connections in a neural network seems pretty equivalent to me to storing a library of good and bad starting moves.
Therefore it has pretty much a database function, and it is not surprising that it is superior to a software without one.
bizarre time controls that removed stockfish's edge in time management
AlphaZero got the same time control.
stockfish didn't get its opening books
AlphaZero didn't get an opening book either.
nor did it get endgame tablebases
Neither did AlphaZero.
[snipped]
Sure, the SF setup was not optimal, but it wasn't completely crippled either.
If you let me choose the parameters of the game, I'd beat AlphaZero in chess even though I would play under the same parameters[1]. It's totally fair because I'd be playing with the same restrictions as AlphaZero! That's how you measured fairness, right?
[1] Single core, 386 with 4MB of RAM. Sure, it's not optimal for AZ, but it's not completely crippled either!
I'm a minority race. Save your vitriol for white people.
The big difference is that an opening book contains literal moves, whereas a neural net represents generalized patterns, similar to how a human grandmaster's brain has these patterns. If you give AlphaZero a position that's not in any of the games it played, it will still find appropriate patterns and use them to evaluate the position.
it is not surprising that it is superior to a software without one
If you take a weak engine with an opening book, then Stockfish is still going to be superior, because as soon as it plays a non-book move, the weaker engine is on its own. Even if the move was technically a mistake, it's unlikely that a weaker engine is going to be able to exploit it against Stockfish. The engine would actually have to recognize that the Stockfish move was bad, and understand how to exploit it.
For example, if Stockfish makes a bad move that potentially traps its bishop, the opponent needs to understand what moves to play to keep the bishop trapped, and why those are important. Without specific patterns for trapped bishops, that's not going to happen.
I'd beat AlphaZero in chess even though I would play under the same parameters (Single core, 386 with 4MB of RAM)
Let me get this clear. You are arguing that your brain is roughly equivalent to a single core 386 ?
I'd beat AlphaZero in chess even though I would play under the same parameters (Single core, 386 with 4MB of RAM)
Let me get this clear. You are arguing that your brain is roughly equivalent to a single core 386 ?
No, I'm saying that I'd work under the same parameters. I might not even turn on the 386 handed to me, after all. But I'm still working with the same parameters, which as you pointed out, is totally fair.
I'm a minority race. Save your vitriol for white people.
bizarre time controls that removed stockfish's edge in time management
AlphaZero got the same time control.
stockfish didn't get its opening books
AlphaZero didn't get an opening book either.
nor did it get endgame tablebases
Neither did AlphaZero. Also note that in many of the games, Stockfish was basically lost in the early middlegame.
only 1GB when it should've had 64GB or more
That's the only legitimate concern, but the whole argument is stupid nitpicking nevertheless. This is like a race between a horse and the first model car. The exact conditions and outcome are secondary to the proof of validity of general principles. AlphaZero was just the first iteration of a new development. The fact that it came close to Stockfish at all is enough to show that this approach has merit. Sure, the SF setup was not optimal, but it wasn't completely crippled either.
This is like a race between a horse and a car, which the horse wins because the car was forced to try to run with an empty gas tank, and then you go ahead and say "So what? The horse didn't get any petrol either so it was a tooootally fair setup".
which as you pointed out, is totally fair.
No, it's not fair, because a 386 with 4MB is completely incapable of even running the AlphaZero code (or Stockfish code), simply because the program and its data won't even fit.
That's what I call crippled.
The conditions in the match were not optimal, but did not make a huge difference. Maybe you could have gained a few dozen elo by optimizing the system, which is not really a big deal overall, and certainly not "crippled", especially considering that Deepmind could have added similar elo to AlphaZero by adding proper time management and endgame tablebases.
This is like a race between a horse and a car, which the horse wins because the car was forced to try to run with an empty gas tank, and then you go ahead and say "So what? The horse didn't get any petrol either so it was a tooootally fair setup".
In my analogy, Stockfish is the horse. And AlphaZero is an early model steam car. People complain because the horse didn't get the best food, and it wasn't the world's fastest horse, and it was too hot outside, and the horse didn't get proper rest.
What they are missing is that the early car is still at the beginning of the development curve, while the horse is already at its peak.
In a few years, neural network engines will be a few hundred elo higher, and there simply won't be any contest anymore.
If it doesn't make a huge difference, why not let them have it? Additionally, I loved your mental gymnastics to differentiate between a database of moves and a neural net of moves. What we have here, ladies and gentlemen, is a worshipper at the altar of the AI religion (full marketing kool aid drivel branch). Here's a newsflash for you: we're not 10 years off, we're not even 50 years off. All those self driving cars? They found the last 1% was really, really disproportionately difficult. All those jobs replaced by AI? Turns out dumb "AI" burger flipping robots are a novelty that aren't as fast, cheap or reliable as a human. Get back to me when you have a neural net that can't be fooled into identifying a garbage arrangement of pixels as a handgun. In fact, get back to me when we know what thinking and intelligence are, so that we have something to aim at.
"Oh it totally would win eventually" - maybe so, but it proves the point that Deepmind is a marketing stunt who don't perform trustworthy comparisons. Don't be disappointed when your car turns out to be an Edsel, and that real AI eschews dumb neural nets for something different.
Battle Chess ran on my 8086 with 640KB of RAM. Given the scenario outlined by the grandparent, I'd bring it along to play against AlphaZero. Sure, AlphaZero would segfault on startup having exhausted the memory and forfeit, but even when playing as white I'd get one valid move before it died.
DeepMind's claim is that their deep neural network design can beat something programmed by a human programmer. If they'd added a similar database to their code, it would not have supported their claim. If they could have won with Stockfish keeping theirs, then they'd have done so.
I am TheRaven on Soylent News
which as you pointed out, is totally fair.
No, it's not fair, because a 386 with 4MB is completely incapable of even running the AlphaZero code (or Stockfish code), simply because the program and its data won't even fit.
Are you claiming that the contest is only fair if both parties get all the resources they claim they need?
I'm a minority race. Save your vitriol for white people.
Sounds more like the owner of the horse put square tires on the car and claimed victory.
No it isn't nitpicking, stockfish is designed to have those databases whereas AlphaZero is not. You haven't beaten it if you haven't beaten it as intended to run.
I very much doubt his brain or yours could compete with a single core 386. Sure it has some impressive stats on massively parallel micro-ops but the single threaded performance sucks.
The entire point of AlphaZero is to not need the databases. The whole point is for it to beat the classical chessbot with its artificial intelligence.
If you don't set up the chessbot as it is normally run you haven't beaten the chessbot. If you give AlphaZero tables you've also defeated the point of the experiment, the idea isn't that AlphaZero is the better chessbot, the idea is that AlphaZero's intelligence is such that it can even beat a chessbot.
It's like setting the difficulty to "Dumb Noob" on a game and claiming you beat it.
When those parties are machines and on a neutral playing field and they are technical requirements, yes.
If you are trying to establish that your gasoline based vehicle outperforms the diesel model it isn't exactly fair to refuse to allow diesel fuel in the race. It also isn't fair to exclude the types of optimizations that work better for diesel than gasoline.
"In my analogy, Stockfish is the horse. And AlphaZero is an early model steam car. People complain because the horse didn't get the best food, and it wasn't the world's fastest horse, and it was too hot outside, and the horse didn't get proper rest.
What they are missing is that the early car is still at the beginning of the development curve, while the horse is already at its peak."
And that is a useful experiment how? You still use a rested racing horse that has been properly fed and benchmark how much you've shaved off the loss over time. You don't have the jockey ride the horse near to death before and then claim your steam car kicks the shit out of champion horses when it wins.
People seem to forget that these are tech demos, and just like other tech demos, they do not demonstrate real-world performance. To everybody's credit, though, tech companies have begun presenting what is essentially vapor as a finished product with unmatched zeal (for $$$) for a good long while now. In other words: they lie. It's that simple.
Are you claiming that the contest is only fair if both parties get all the resources they claim they need?
When those parties are machines and on a neutral playing field and they are technical requirements, yes.
Well, that was my point: it wasn't really fair - in the AZ vs SF, SF wasn't given all the resources that SF was claimed to require while AZ was given all the resources it apparently needed.
So, of course, if I'm allowed to determine the parameters under which the contestants will compete, it's possible to always choose the winner in advance simply by tuning the parameters to favour one party over the other while being "fair" because both parties get the same parameters.
I'm a minority race. Save your vitriol for white people.
I'm not impressed by any piece of software that beats humans at games, and it certainly doesn't make me any more impressed with the so-called half-assed excuse for 'AI' they keep trotting out. Feels mostly like more of the cheap marketing bullshit they keep pushing on us.
Now what would you call it if the machine has all the games of its opponent to prepare with and the human opponent has none of the machine's? This whole thing was a useless stunt. And did you notice how fast the Go machine was retired afterwards? Very likely because it would have had no chance after the humans had seen it play a few times.
Why do people fall for this kind of crap?
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
As the promise of "AI" slowly crumbles, a lot of people are desperate to hide the severe limitations and problems of this tech. One way they try to scam the public is by rigged "contests", like this one or the meaningless "Go" stunt.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Where exactly does one draw the line about what constitutes "cheating", versus exploiting a natural advantage?
I suggest that by looking at the goal, you can determine the line. Google here wants to create intelligence, but they won with a pure mechanical advantage.
So we can congratulate them for......making a computer that clicks with precision. Good job Google. But from watching the games, it's clear they failed on intelligence.
"First they came for the slanderers and i said nothing."
The machine lost because it did not train that way. Train it that way and it will win. There is little strategy in StarCraft II that cannot be reproduced by machine learning.
That's an apples to oranges comparison - DeepMind isn't supposed to need an explicit opening book, because the knowledge is trained into the neural network. Stockfish uses an entirely different approach optimized around having one, so it would really only be fair to go against it as designed.
It is defined as cheating because it was "supposed" to have the physical limitations of a human, but in actuality it did not have such a limitation, as it was able to click and select and micromanage many times faster than the best human. I don't think anyone would argue that a computer isn't better at speed, but they were not trying to test speed, they were trying to test ability with the same limitations as a human.
Why do we still use the term AI when it actually doesn't even exist yet?
Right. I'd contend you optimize both as best you reasonably can. If that would make SF victory a given, so be it. You just benchmark your progress over time against SF. There is no need to rig contests and generate all sorts of bogus claims about being the undisputed champion of all things.
If anything that is just going to hamper AZ's development and progress. Now there is no incentive to work on whatever deficits might lead to AZ failing against SF and any improvements that would have been gained trying will be lost.
Well of course it was cheating. They gave it TOTAL AWARENESS of the whole map (that it could legally see via fog-of-war rules). They fudged this one by limiting its "screen changes": while it knew everything that was happening across the map, it could only choose one area to issue commands to. They ALSO had some games where it had to use those screens to see what was happening at those locations (just like a human). And it lost. Arguably, it didn't have enough training time with that setup.
It only played a single map. That's... not that big of a deal. They could have multiplied its learning time per map. Easy.
It also only played protoss. They could have multiplied its learning time per race (3). Easy. And most humans do this too. I play protoss as well.