tamer*f · Slashdot Mirror

reward on Quake Bots Rock The Prefrontal Cortex · 2003-06-10 04:23 · Score: 1

One of the things that made our task hard was the unpredictability of the Quake world. When the ANN bot misses a shot completely, it gets no useful information from the miss, which can confuse it. Sometimes it would also shoot the right gun at its enemy and hit the other bot instead. This unreliable reward schedule made it difficult for the model to lock on to the right dimension. If we gave too much reward for a hit, the ANN bot could get stuck on something random and never give it up. If we didn't give enough reward, it would never be convinced that it was doing well. Not knowing when the ANN bot would score its next hit also made things tricky.

We haven't given much thought to applying the model to a different task yet. We simply wanted to see if the model could overcome the complexity of the QIIIa world. Since I recently graduated, I'm no longer working on research projects as much as trying desperately to get a job.

Speaking of positive and negative feedback, one of my earlier forays into NNs in Quake IIIa was a simple mod where I controlled where the bots aimed different weapons with TD learning (a technique where a NN gets trained through rewards based on the action the agent took). Not having the right rewards produced some interesting bots. When I only gave positive reward for hitting an enemy, the bots learned to spin in circles to the left or right, because this behavior guaranteed they would get a shot in on someone at least once per rotation, since there were so many bots on the map. To fix that behavior I had to give a small negative reward for turning away from an opponent and a larger positive reward for killing the enemy. This encouraged the bots to finish the job instead of spinning around whipping rockets at enemies across the room.
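The reward scheme described above can be sketched roughly as follows. This is only an illustration: the numeric reward values, the function names, and the use of a lookup table in place of the neural network are all assumptions, since the post only says "small negative" and "larger positive".

```python
# Hypothetical reward magnitudes; the post gives no actual numbers.
HIT_REWARD = 1.0
KILL_REWARD = 5.0
TURN_AWAY_PENALTY = -0.1


def shaped_reward(hit_enemy, killed_enemy, turned_away):
    """Combine the three feedback signals into one scalar reward."""
    reward = 0.0
    if hit_enemy:
        reward += HIT_REWARD
    if killed_enemy:
        reward += KILL_REWARD
    if turned_away:
        reward += TURN_AWAY_PENALTY
    return reward


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Tabular TD(0) step: move V(state) toward reward + gamma * V(next_state).

    A dict stands in for the neural network used in the mod (an assumption
    made to keep the sketch short). Returns the TD error.
    """
    old_value = values.get(state, 0.0)
    td_error = reward + gamma * values.get(next_state, 0.0) - old_value
    values[state] = old_value + alpha * td_error
    return td_error
```

With hit-only rewards, spinning costs nothing, so any hit per rotation makes it look worthwhile. With the penalty in place, every tick spent facing away subtracts a little, and the larger kill bonus makes staying on one target pay more than spraying the room.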

Another thing that was weird: I would start about 10 of these bots training, all equal with the same untrained network, and after a few hours some of them were completely hopeless and others were brilliant. It seems that their experiences during training were responsible for the difference. Some bots learned themselves into corners, eventually expecting their own failure and essentially giving up. Others would get better and better the longer I trained them. It's difficult to work out which experiences caused some bots to get depressed and others to succeed.

-t

Re: I didn't read the article, but I ran the mod on Quake Bots Rock The Prefrontal Cortex · 2003-06-09 11:29 · Score: 1

Hi dan,

I was wondering what gave you that impression about our mod. Did you run it correctly? Please refer to my post about setting up the experiment. I realize now that I should have put clearer instructions up on my site.

-Tamer

How to run the mod on Quake Bots Rock The Prefrontal Cortex · 2003-06-09 11:15 · Score: 3, Informative

It seems like it is a little unclear to some people how to correctly run our mod (largely due to me leaving out significant details on the download page). It is intended as an experiment involving one learning bot utilizing the PFC model and two special dummy bots. You need to make sure you run the game using the map I made so that the bots have constant interaction. Using the command line listed on the download page, start the game, and then switch yourself over to be an observer. Run the following console command: /idscenario

Now you have some very colorful dummy bots with unoriginal names running around doing nothing. But it gets better....

Use the console command /addannbot just like you would /addbot to put a learning bot in the mix. For best results, use a bot that doesn't suck. Now switch over to the learning bot's first person view to see the status of its neural network and PFC layers changing as a result of its perception. Igor, it's aliiive.

Now, there is no damage or death in the mod because we didn't want this to complicate the experiment. What you should see is a blue icon appear above an enemy that is hit. A deflected shot will bounce off like it hit the invulnerability sphere. When the bot hits, you will notice the little white box at the top of the network status overlay (upper right corner of the screen) go solid. This signals that the bot got a reward.

Directly under that are two yellow boxes. These represent how much the bot wants to choose each weapon (full is highest). Once the bot learns something, you will notice these switching dramatically in response to the characteristics of its enemy. The bottom row of red boxes shows the characteristics of the current enemy (shield color, gun color, position, ID).

Now with all this information the bot tries to figure out what about its opponent is important in deciding how to kill it. The top row of yellow boxes at the left of the screen encodes which dimension the bot is considering: shield color, gun color, location, or ID (name). When the bot picks the right dimension, it can reliably slam its opponent with the right beam color. When it chooses the wrong dimension, it performs miserably until it gives up and explores something else.

Our experiment is set up such that the first correct dimension is shield color. After the bot figures it out, the experiment will autoswitch to the ID dimension. When this happens you will see a message appear at the top of the screen in red. When it's behaving well, the bot will catch on quickly.
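For readers who want a feel for the "give up and explore something else" loop, here is a toy sketch. It is not the PFC model from the mod: the class name, the `patience` parameter, the consecutive-miss rule, and the dict-based enemy description are all assumptions made for illustration.

```python
DIMENSIONS = ["shield_color", "gun_color", "location", "id"]
WEAPONS = [0, 1]  # two weapons, matching the two yellow preference boxes


class DimensionSearcher:
    """Toy learner: attend to one enemy dimension, switch after repeated misses."""

    def __init__(self, patience=5):
        self.patience = patience   # hypothetical: misses tolerated before giving up
        self.dim_index = 0         # start by attending to shield color
        self.misses_in_a_row = 0
        self.best_weapon = {}      # feature value -> weapon to try next

    @property
    def dimension(self):
        return DIMENSIONS[self.dim_index]

    def choose_weapon(self, enemy):
        """Pick the weapon remembered for this enemy's value in the attended dimension."""
        value = enemy[self.dimension]
        return self.best_weapon.get(value, WEAPONS[0])

    def observe(self, enemy, weapon, hit):
        """Update memory from one shot; abandon the dimension after too many misses."""
        value = enemy[self.dimension]
        if hit:
            self.best_weapon[value] = weapon
            self.misses_in_a_row = 0
            return
        # On a miss, try the other weapon for this feature value next time.
        self.best_weapon[value] = WEAPONS[1 - weapon]
        self.misses_in_a_row += 1
        if self.misses_in_a_row >= self.patience:
            # Give up on this dimension and explore the next one from scratch.
            self.dim_index = (self.dim_index + 1) % len(DIMENSIONS)
            self.best_weapon = {}
            self.misses_in_a_row = 0
```

While the rule depends on shield color, the sketch settles after one or two misses. Once the rule moves to another dimension, misses pile up until the learner drops shield color and moves on. Note that a lucky hit resets the miss counter, so a noisy world can keep a learner like this stuck on the wrong dimension for a while.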

Thanks for checking out the mod, and sorry about being late with this info. If you've got questions, a lot is explained in the code walkthrough on the site; otherwise just ask me here. Cheers,

-Tamer

Re:Caught on Quake Bots Rock The Prefrontal Cortex · 2003-06-09 10:36 · Score: 1

I would agree that a bot with only the ability to determine the best fighting weapon is not entirely impressive. Our main goal was to enhance the model we were working with so that it could function in a rich environment like QIII. What's cool is that the bot using this model was able to not only learn a strategy for weapon selection through practice, but then discard its strategy and quickly adapt when we changed the rules on it.

For instance, the experiment starts and our learning bot figures out that its railgun can harm opponents with the red quad damage shader (what we refer to as the red shield). Great, it starts whooping ass and it's happy. (This kind of thing can be achieved with a simple backprop NN amongst other things). Then we switch the rules on it, making its railgun only effective against players with a certain character model or something like that. Our learning bot starts to suck, since it's still paying attention to shield color, now a useless detail. What's cool is that it will scrap this idea and start experimenting until it figures out that the character model is what it should be paying attention to. And it does this fairly quickly.

So essentially, it's this short term changing of strategy that some scientists believe the human PFC is responsible for that makes for a cool bot, not really our little weapon switching scenario. There are many uses for this talent that I think could really make a great automated player in FPS games.

-Tamer

Re:AI programming on Quake Bots Rock The Prefrontal Cortex · 2003-06-09 09:03 · Score: 2, Informative

The model of PFC that we used is a simple backpropagation artificial neural network with some modifications. Recurrent connections to layers with special activation weights enable the working memory and cognitive control functionality of the model. It would be a little much to clearly describe it here, but if you are interested you can check out http://psych.colorado.edu/~oreilly/pubs-abstr.html#01_id_ed, the original paper describing the model. I've made an effort to describe it in simpler terms on the site we are commenting on.

There are plenty of neural net resources and sites on the internet. If you are interested in Game AI, I suggest checking out the book "Game AI Programming Wisdom" edited by Steve Rabin. Even if it's too technical or you're too busy to use the techniques yourself, it's interesting to see what's going on behind the scenes in your favorite games.

Also, the source to our mod is available and includes some very simple and useful NN utilities written by my professor. I believe I listed the names of those files at the bottom of the downloads page. Please read the GNU public license first. Hope this helps.

-T
