A fighting game is a real-time game genre in which the player (a human player or a game AI) controls an on-screen character and engages in close combat with an opponent using various attacks and evasive maneuvers.

Rule-based AIs are very common for designing the opponent of a human player in PvC (Player vs. Computer) games. But because such an AI only performs actions that are predefined in its rule base, it often performs poorly: its patterns can be foreseen and outsmarted by the opponent.

Monte-Carlo Tree Search (MCTS) is an algorithm that overcomes this drawback. In the selection step of MCTS, the Upper Confidence Bound (UCB) policy is applied. UCB overcomes the limitations of strategies based on exploration followed by commitment, including the need to know the horizon and the sub-optimality gaps in advance. The UCB algorithm comes in many different forms, depending on the distributional assumptions made about the noise. It is based on the principle of optimism in the face of uncertainty: choose your actions as if the environment is as nice as is plausibly possible.
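To make the principle concrete, below is a minimal, self-contained Java sketch of UCB1 in the simpler multi-armed bandit setting; the three payout probabilities and the pull budget are invented for illustration and do not come from any particular game.

import java.util.Arrays;
import java.util.Random;

// Minimal UCB1 demo on a 3-armed Bernoulli bandit.
// The payout probabilities are made up; the algorithm never sees them.
public class Ucb1Demo {
    public static void main(String[] args) {
        double[] payoutProb = {0.3, 0.5, 0.7};
        int[] pulls = new int[3];
        double[] rewardSum = new double[3];
        Random rng = new Random(0);

        for (int t = 1; t <= 10_000; t++) {
            // Optimism in the face of uncertainty: play the arm whose
            // average reward plus exploration bonus is highest.
            int best = 0;
            double bestUcb = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < 3; i++) {
                double ucb = (pulls[i] == 0)
                    ? Double.POSITIVE_INFINITY
                    : rewardSum[i] / pulls[i] + Math.sqrt(2.0 * Math.log(t) / pulls[i]);
                if (ucb > bestUcb) { bestUcb = ucb; best = i; }
            }
            pulls[best]++;
            rewardSum[best] += (rng.nextDouble() < payoutProb[best]) ? 1.0 : 0.0;
        }
        // The best arm (index 2) ends up pulled far more often than the others.
        System.out.println("Pulls per arm: " + Arrays.toString(pulls));
    }
}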

In MCTS, four steps (selection, expansion, simulation, and back-propagation, described below) are repeated for a certain number of iterations, or until a fixed amount of time has elapsed. In this way, MCTS decides the next action by stochastic simulations, and it has already been successful in many games.

 

Monte-Carlo simulation is based on random sampling: the more random points we take, the better our approximation becomes.
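A classic illustration is estimating π: sample random points in the unit square and count the fraction that lands inside the quarter circle. The sample count below is arbitrary.

import java.util.Random;

// Monte-Carlo estimate of pi: the fraction of uniform random points in the
// unit square that fall inside the quarter circle converges to pi/4.
public class MonteCarloPi {
    public static void main(String[] args) {
        Random rng = new Random();
        int samples = 1_000_000;
        int inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        System.out.println("pi is approximately " + 4.0 * inside / samples);
    }
}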

 

  • STEP 1 Selection: A next action is chosen based on a given tree policy. This step is repeated until a leaf node is reached. As the tree policy, we use UCB1 (a runnable sketch of all four steps follows this list):

    (1): $UCB1_i = \bar{X}_i + C\sqrt{\dfrac{2\ln N_i^p}{N_i}}$

where $C$ is a balance parameter, $N_i$ is the number of visits at the $i$-th node, $N_i^p$ is the number of visits at the parent node of the $i$-th node, and $\bar{X}_i$ is the average reward

    (2): $\bar{X}_i = \dfrac{1}{N_i}\sum_{j=1}^{N_i} eval_j$

 

where $eval_j$ is the value of the reward at the $j$-th simulation, given by

    (3): $eval_j = \left(afterHP_j^{my} - beforeHP_j^{my}\right) - \left(afterHP_j^{opp} - beforeHP_j^{opp}\right)$

where the first term is the AI's own hit-point (HP) difference before and after the $j$-th simulation, and the second term is the opponent's.

 

  • STEP 2 Expansion: If the search reaches a leaf node whose number of visits is larger than the threshold $N_{max}$ and whose depth in the game tree is smaller than the threshold $D_{max}$, then all of its child nodes are added at once.

 

  • STEP 3 Simulation: A simulation is performed for $T_{sim}$ frames. The AI performs the sequence of actions along the path from the root node to the current leaf node, and after this it performs random actions until the simulation-time budget of $T_{sim}$ frames is reached; concurrently, the opponent performs random actions. Finally, a reward is calculated according to (3).

 

  • STEP 4 Back-Propagation: After the simulation ends, the UCB1 value of every tree node traversed in the path is updated.
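As promised above, here is a minimal, self-contained Java sketch of the four steps for a generic turn-based game. The GameState interface, the rollout budget, and the constants C, N_MAX, and D_MAX are assumptions chosen for the example; they stand in for FightingICE's frame-based simulation rather than reproducing it.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Minimal game-state abstraction assumed for this sketch.
interface GameState {
    List<Integer> legalActions();
    GameState apply(int action);
    boolean isTerminal();
    double reward();  // e.g., own HP difference minus the opponent's, as in Eq. (3)
}

class MctsNode {
    static final double C = 0.5;          // balance parameter of Eq. (1) (assumed value)
    static final int N_MAX = 10;          // visit threshold for expansion
    static final int D_MAX = 3;           // maximum tree depth
    static final int ROLLOUT_STEPS = 20;  // assumed simulation budget
    static final Random RNG = new Random();

    final GameState state;
    final MctsNode parent;
    final int depth;
    final List<MctsNode> children = new ArrayList<>();
    int visits = 0;
    double totalReward = 0.0;

    MctsNode(GameState state, MctsNode parent, int depth) {
        this.state = state;
        this.parent = parent;
        this.depth = depth;
    }

    // UCB1 value from Eq. (1): average reward plus exploration bonus.
    double ucb1() {
        if (visits == 0) return Double.POSITIVE_INFINITY;
        return totalReward / visits
                + C * Math.sqrt(2.0 * Math.log(parent.visits) / visits);
    }

    // One MCTS iteration: the four steps described above.
    void runIteration() {
        // STEP 1: Selection -- descend by maximum UCB1 until a leaf is reached.
        MctsNode node = this;
        while (!node.children.isEmpty()) {
            node = node.children.stream()
                    .max(Comparator.comparingDouble(MctsNode::ucb1)).get();
        }
        // STEP 2: Expansion -- add all child nodes once the thresholds allow it.
        if (node.visits >= N_MAX && node.depth < D_MAX && !node.state.isTerminal()) {
            for (int a : node.state.legalActions()) {
                node.children.add(new MctsNode(node.state.apply(a), node, node.depth + 1));
            }
            node = node.children.get(0);
        }
        // STEP 3: Simulation -- play random actions to estimate a reward.
        GameState s = node.state;
        for (int t = 0; t < ROLLOUT_STEPS && !s.isTerminal(); t++) {
            List<Integer> acts = s.legalActions();
            s = s.apply(acts.get(RNG.nextInt(acts.size())));
        }
        double reward = s.reward();
        // STEP 4: Back-propagation -- update every node on the traversed path.
        for (MctsNode n = node; n != null; n = n.parent) {
            n.visits++;
            n.totalReward += reward;
        }
    }
}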

 

Function Flow for MCTS

 

 

  •  mcts()

Call the function uct() as many times as the time budget allows, then return the most visited node.
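A sketch of this loop, assuming the MctsNode class from the four-step sketch above and a millisecond time budget supplied by the caller (FightingICE runs at 60 FPS, so one frame is roughly 16.7 ms):

// Runs MCTS iterations until the time budget expires, then returns the most
// visited child; visit count is a more stable criterion than raw average score.
// Assumes the root was expanded up front, as described in processing() below.
static MctsNode mcts(MctsNode root, long budgetMillis) {
    long deadline = System.currentTimeMillis() + budgetMillis;
    while (System.currentTimeMillis() < deadline) {
        root.runIteration();  // one selection/expansion/simulation/back-propagation pass
    }
    MctsNode mostVisited = root.children.get(0);
    for (MctsNode child : root.children) {
        if (child.visits > mostVisited.visits) mostVisited = child;
    }
    return mostVisited;
}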

 

 

  •  playout()

Clear MyAction.
Clear OppAction.
Add the selected action into nAction.
If the size of nAction < S, add more actions to nAction from MyAction.
Add 5 actions from OppAction in random order.
Perform the simulation by calling the simulate function.
Return the score by calling getScore(nFrameData).
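The following Java sketch mirrors the steps above; the Simulator interface, the action type, S, and the 60-frame budget are stand-ins chosen for illustration, not the FightingICE API.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Random;

// Illustrative playout: build both players' action sequences, hand them to a
// simulator, and return the resulting score. All names here are assumptions.
class PlayoutSketch {
    static final int S = 5;                // assumed length of the own action sequence
    static final Random RNG = new Random();

    interface Simulator {                  // stand-in for the framework's simulator
        double simulateAndScore(Deque<String> myActs, Deque<String> oppActs, int frames);
    }

    static double playout(String selectedAction, List<String> myAction,
                          List<String> oppAction, Simulator sim) {
        Deque<String> nAction = new ArrayDeque<>();
        nAction.add(selectedAction);                  // the action chosen by selection
        while (nAction.size() < S) {                  // pad with random own actions
            nAction.add(myAction.get(RNG.nextInt(myAction.size())));
        }
        Deque<String> oppSeq = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) {                 // five random opponent actions
            oppSeq.add(oppAction.get(RNG.nextInt(oppAction.size())));
        }
        // Simulate for T_sim frames (60 assumed) and score via the HP difference, Eq. (3).
        return sim.simulateAndScore(nAction, oppSeq, 60);
    }
}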

 

  •  processing()

mctsPrepare()  // set myAction & oppAction

Create an instance of the root node.

rootNode.createNode()  /* as we have just created the root node, there is nothing to select, so we start with EXPAND: the number of children we create equals the number of actions available to the player */

Action bestAction = rootNode.mcts()  /* mcts() simulates for UCT_TIME, calculates the score of each child by calling uct(), then calls getBestVisitAction(), which simply returns the most visited node found during simulation */

commandCenter.commandCall(bestAction.name())  // executes the best action given by the bestAction object via the CommandCenter

  •  uct()

First, select a node by iterating through the children and picking the one with the best UCB1 value (its average score, score / number of games, plus the exploration term of Eq. (1)).

If selectedNode.games == 0:
    score = selectedNode.playout()
Else:
    If selectedNode.children == null:
        If selectedNode.depth < UCT_TREE_DEPTH:
            If UCT_CREATE_NODE_THRESHOLD <= selectedNode.games:
                selectedNode.createNode()
                selectedNode.isCreateNode = true
                score = selectedNode.uct()
            Else:
                score = selectedNode.playout()
        Else:
            score = selectedNode.playout()
    Else:
        If selectedNode.depth < UCT_TREE_DEPTH:
            score = selectedNode.uct()  // EXPANSION
        Else:
            score = selectedNode.playout()

 

  •  setMyAction()

If the character is in the air:
    Iterate through all air actions and add each one to the deque MyAction if the character has sufficient energy to perform it.
Else:
    Add the special skill to the deque MyAction if the character has the energy to perform it.
    Iterate through all ground actions and add each one to the deque MyAction if the character has the energy to perform it.

 

  • setOppAction()

If the character is in the air:
    Iterate through all air actions and add each one to the deque OppAction if the character has sufficient energy to perform it.
Else:
    Add the special skill to the deque OppAction if the character has the energy to perform it.
    Iterate through all ground actions and add each one to the deque OppAction if the character has the energy to perform it.
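Both helpers reduce to the same energy filter. A minimal sketch, assuming a hypothetical Action type that carries a name and an energy cost:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative energy filter shared by setMyAction()/setOppAction().
// The Action record is a stand-in for the framework's own action data.
class ActionFilterSketch {
    record Action(String name, int energyCost) {}

    // Keep only the candidate actions the character has enough energy to perform.
    static Deque<Action> affordableActions(List<Action> candidates, int currentEnergy) {
        Deque<Action> result = new ArrayDeque<>();
        for (Action a : candidates) {
            if (a.energyCost() <= currentEnergy) {
                result.add(a);
            }
        }
        return result;
    }
}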

Most existing fighting game AIs rely on rule bases and react to every situation with predefined actions, making them predictable to human players. We attempt to overcome this weakness by applying MCTS, which can adapt to different circumstances without relying on predefined action patterns or tactics. An AI based on Upper Confidence bounds applied to Trees (UCT) and MCTS is developed first. Next, we propose improving the AI with roulette selection and a rule base. Through testing and evaluation on FightingICE, an international fighting game AI competition platform, we show that the aforementioned MCTS-based AI is effective in a fighting game.
