0% found this document useful (0 votes)
36 views

Ondas - Monte-Carlo

This document proposes a Monte-Carlo Tree Search (MCTS) based AI for fighting games that can dynamically adjust difficulty while maintaining believable behaviors. It introduces a new evaluation term focused on damage to the opponent and a parameter that changes based on game situation to balance adjustments to skill level and natural actions.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
36 views

Ondas - Monte-Carlo

This document proposes a Monte-Carlo Tree Search (MCTS) based AI for fighting games that can dynamically adjust difficulty while maintaining believable behaviors. It introduces a new evaluation term focused on damage to the opponent and a parameter that changes based on game situation to balance adjustments to skill level and natural actions.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Monte-Carlo Tree Search for Implementation of

Dynamic Difficulty Adjustment Fighting Game AIs


Having Believable Behaviors
Makoto Ishihara Suguru Ito Ryota Ishii
Graduate School of Information Graduate School of Information Graduate School of Information
Science and Engineering Science and Engineering Science and Engineering
Ritsumeikan University Ritsumeikan University Ritsumeikan University
Shiga, Japan Shiga, Japan Shiga, Japan
[email protected] [email protected] [email protected]

Tomohiro Harada Ruck Thawonmas


College of Information Science College of Information Science
and Engineering and Engineering
Ritsumeikan University Ritsumeikan University
Shiga, Japan Shiga, Japan
[email protected] [email protected]

Abstract—In this paper, we propose a Monte-Carlo Tree Search player fights against an AI-controlled character. An AI in PvC
(MCTS) fighting game AI capable of dynamic difficulty adjust- usually acts as the opponent for the human player who plays
ment while maintaining believable behaviors. This work targets alone, sometimes as a sparring partner. In this work, we focus
beginner-level and intermediate-level players. In order to improve
players’ skill while at the same time entertaining them, AIs are on PvC and target beginner and intermediate human players
needed that can evenly fight against their opponent beginner and in fighting games.
intermediate players, and such AIs are called dynamic difficulty One of the main features of beginner and intermediate
adjustment (DDA) AIs. In addition, in order not to impair the players is that they do not fully know the game information
players’ playing motivation due to the AI’s unnatural actions such as character operations, available actions and fighting
such as intentionally taking damage with no resistance, DDA
methods considering restraint of its unnatural actions are needed. styles or tactics. They are often defeated by players who fully
In this paper, for an MCTS-based AI previously proposed by the know about the game and by AIs that are too strong compared
authors’ group, we introduce a new evaluation term on action with them. This may cause beginner and intermediate players
believability, to the AIs evaluation function, that focuses on the to lose the motivation to play the game, in the middle of
amount of damage to the opponent. In addition, we introduce a improvement of their skill, and quit it. To prevent this, AIs are
parameter that dynamically changes its value according to the
current game situation in order to balance this new term with needed that can entertain beginner and intermediate players,
the existing one, focusing on adjusting the AI’s skill equal to while such players are still improving their playing skills.
that of the player, in the evaluation function. Our results from Previously, the authors’ group proposed a Monte-Carlo Tree
the conducted experiment using FightingICE, a fighting game Search (MCTS) fighting game AI called “Entertaining AI”
platform used in a game AI competition at CIG since 2014, show (eAI) [1] whose goal is to entertain human players. This AI
that the proposed DDA-AI can dynamically adjust its strength
to its opponent human players, especially intermediate players, can evenly fight against its opponent players by dynamically
while restraining its unnatural actions throughout the game. adjusting its strength according to their playing skill, called
Index Terms—Monte-Carlo tree search, dynamic difficulty dynamic difficulty adjustment (DDA). Namely, eAI will con-
adjustment, fighting game AI, believable, FightingICE duct an action according to the current game situation: when
eAI is losing, it will conduct a strong action, otherwise, eAI
I. I NTRODUCTION will conduct a weak action. From the experimental results, eAI
Fighting games are real-time games in which a character could entertain its opponent human players by evenly fighting
controlled by a human player or a game AI has to defeat their against them. However, we observed that eAI often conducted
opponent using various attacks and evasion. In this work, AI unnatural actions such as repeating no-hit attacks and repeating
is defined as a computer program that controls a character step back even though the distance between the characters is
in a game. There are two types of matches in fighting games: far away. In order not to impair players’ playing motivation
Player VS Player (PvP), where two human players fight against due to AIs’ unnatural actions such as those by eAI mentioned
each other, and Player VS Computer (PvC), where a human above, DDA methods able to restrain its unnatural actions are

978-1-5386-4359-4/18/$31.00 ©2018 IEEE CIG’18


Fig. 3. An overview of MCTS

Fig. 1. Game flow [4] Ikeda and Viennot [2] mentioned the required elements
according to which players can enjoy playing games and how
to design them in terms of AIs in Go. They said using the
aforementioned game flow that AIs are needed that can adjust
their strength according to the opponent players’ skill to evenly
fight against or lose with a little difference in winning ratio.
Fig. 2 shows the game flow applied to fighting game AIs by
us with reference to the aforementioned work by Ikeda and
Viennot. In Fig. 2, players cannot enjoy playing the game if
the opponent AI crushes them (a) or loses with no resistance
at all (b). Additionally, performing clearly unnatural actions
only to balance the game (c) also impairs players’ enjoyment.
AIs should evenly fight against its opponent without unnatural
actions (d), and finally, AIs might lose to its opponent with a
little difference (e), or win if the opponent made some mistakes
Fig. 2. Game flow in fighting games
(f). That is, DDA-AIs capable of restraining its unnatural
actions are needed.
needed [2]. III. E XISTING M ETHODS FOR MCTS-BASED DDA
In this paper, we propose an MCTS fighting game AI
In this section, we describe two DDA-AIs using MCTS.
capable of dynamic difficulty adjustment while maintaining
These AIs are used for comparison with our proposed AI.
believable behaviors. This work targets beginner-level and
intermediate-level players. We use eAI as a based AI and we A. Entertaining AI
introduce a new evaluation term on action believability, to the Entertaining AI (eAI) was an MCTS-based DDA-AI pro-
AI’s evaluation function, that focuses on the amount of damage posed by our group [1]. This DDA method combines MCTS,
to the opponent. In addition, we introduce a parameter that Roulette Selection, and Rule-Based. In this section, we mainly
dynamically changes its value according to the current game explain MCTS, but we point out here that Roulette Selection,
situation in order to balance this new term with the existing where the frequency of each action actually played by the
term in the evaluation function. We verify the performance opponent human player is used in simulation of his/her actions,
of our proposed DDA-AI by a subjective experiment using is deployed in all of the AIs evaluated in this work. For more
FightingICE, a fighting game platform used in a game AI details about the other methods, please see Ishihara et al. [5].
competition at CIG since 2014 [3]. Fig. 3 shows an overview of MCTS applied to fighting
games. This MCTS is based on an open loop approach [6].
II. G AME F LOW
In this figure, the root node has the current game information
Chen [4] mentioned the required elements by which players which consists of both characters’ Hit-Points (HPs), energies,
can enjoy playing games and how to design games to satisfy positions, ongoing actions and the remaining time of the game.
players using game flow (Fig. 1). In Fig. 1, the x-axis Each node except the root node represents an action. In this
represents players’ skill of the game and the y-axis represents MCTS, an action spans from its input to its end, at which the
the game difficulty. This figure indicates that players can enjoy next action becomes executable. An edge simply represents the
playing the game if their skill and the game difficulty fall connection between a parent node and its child node. When
in “FLOW ZONE”. That is, adjusting the game difficulty a parent node’s action has finished, the next action will be
according to players’ skill is needed. This can be said not one of its child nodes. In summary, the game tree using this
only for game design, but also game AIs. MCTS represents the execution order of the AI’s actions.
eAI repeats the four steps in Fig. 3 within a time budget of 4) Backpropagation: evalj obtained from the simulation
Tmax . After the time budget is depleted, eAI selects the most part is backpropagated from the leaf node to the root node.
visited direct child node (action) from the root node as the The U CB1 value of each node along the path is updated as
next action. The rest of this subsection explains each step of well.
MCTS.
B. True Proactive Outcome-Sensitive Action Selection
1) Selection: The child nodes with the highest Upper
Confidence Bounds (U CB1) value [7] are selected from the True Proactive Outcome-Sensitive Action Selection
root node until a leaf node is reached. The formula of U CB1 (TPOSAS) is one of the MCTS-based DDA-AIs with
is: believability proposed by Demediuk et al. [8]. TPOSAS
also uses the same U CB1 formula (1). However, TPOSAS
evaluates nodes using the following formula:
r
2 ln N
U CB1i = X i + C , (1)
Ni +
node.score = − (|hs | − Ih ) , (5)
where Ni is the number of times node (action) i was visited,
N is the sum of Ni for node i and its sibling nodes and C is where hs is the HP difference between the AI and the oppo-
a constant. Xi is the average evaluation of node i represented nent, Ih defines the interval within which all HP differences
+
by the following formula: can be neglected, and (·) indicates the ramp function, i.e.,
a function behaving like the identity function for positive
1 X
Ni numbers and returning 0 for negative numbers.
Xi = evalj , (2) In this formula, the evaluations of all actions having hs less
Ni j=1
than Ih will be 0; otherwise, will be negative. Therefore, all
where evalj is the reward value gained in the jth simulation nodes (actions) with hs less than Ih are more visited. Because
and is defined as: there exist multiple actions that have the highest evaluation
value of zero, unnatural behaviors like repeating the same
action can be avoided.
afterHP my − afterHP opp

evalj = 1 − tanh
j j
, (3) In our experiment, Ih is set to 10% of the maximum player
Scale health as in the work by Demediuk et al. [8].
where afterHPjmy and afterHPjopp stand for HP of the AI C. Problems
and the opponent after the jth simulation, respectively, and As we mentioned in Section I, eAI could entertain its
Scale is a constant. As the HP difference between the AI and opponent human players by evenly fighting against them.
the opponent after the simulation is closer to 0, evalj will However, we could observe that eAI often conducted unnatural
obtain an evaluation value closer to 1. Thereby, strong actions actions such as repeating no-hit attacks and repeating step back
are highly evaluated when the AI is losing; otherwise, weak even though the distance between both characters is far away,
actions. especially in the game situation where the HP difference is
eAI selects the nodes with the highest U CB10i value (using around zero. In that situation, the evaluations of actions which
X i value normalized by using formula (4)) from the root node do not give damage to the opponent and at the same time
until a leaf node is reached. receive no damage such as moving actions will be higher than
other actions. From this, one can readily see that eAI tends to
0 X i − X min
Xi = (4) select such unnatural actions in the above situation.
X max − X min Demediuk et al. conducted the experiments where TPOSAS
In this formula, X max , X min stand for the maximum and fought against human players and other AIs that were submit-
minimum Xi in all nodes at the same tree depth. ted to the Fighting Game AI Competition (FTGAIC)1 to verify
2) Expansion: After a leaf node is reached in the Selection the method’s effectiveness. From these experimental results,
part, if the number of times the leaf node is explored exceeds TPOSAS could dynamically adjust its strength according to
a threshold Nmax and the depth of the tree is lower than a its opponents’ skill. However, although they mentioned about
threshold Dmax , all possible child nodes are created at once its believability, the authors did not quantitatively evaluate this
from the leaf node. factor. Also, they only used the HP difference at the end of the
3) Simulation: A simulation is carried out for Tsim sec- game as the evaluation criterion of DDA, and did not evaluate
onds, sequentially using all actions in the path from the root whether the AI can dynamically adjust its strength throughout
node to the current leaf node for the AI, and actions selected the game.
by Roulette Selection (see [5]) for the opponent. If the number IV. P ROPOSED M ETHOD
of actions of the AI or the opponent used in the simulation is
In this section, we define what is believability in fighting
less than a given number, five in our previous work, randomly
games and explain our new DDA method considering fighting-
selected actions will be used after all actions of the AI or
game believable behaviors.
the opponent have been conducted. The variable evalj is then
calculated using formula (2). 1 http://www.ice.ci.ritsumei.ac.jp/∼ftgaic/index.htm
A. Definition of Believable Behaviors in Fighting Games
As mentioned in Section I, the main purpose of fighting
games is to defeat the opponent using various attacks and
evasion. For that purpose, in this work, believable behaviors
are defined as the aggressive behaviors aimed to defeat the
opponent such as hitting attacks to the opponent properly.
Conversely, unnatural behaviors are defined as those behaviors
contrary to the main purpose mentioned above such as no-
hit attacks (described in Section III-C), although it could be
argued that such non-aggressive actions are also performed to a
certain extent by some human players to taunt their opponents.
B. Evaluation Function with Believability
Fig. 4. Screen shot of FightingICE
The new evaluation function taking into account believabil-
ity is defined as follows:
TABLE I
PARAMETERS USED IN THE EXPERIMENTS
evalj = (1 − α) Bj + αEj , (6)
Notation Meaning Value
where Ej is for difficulty adjustment defined using the same C Balancing parameter 0.42
formula as formula (3). Bj about the AI’s aggressiveness Nmax Threshold of the number of visits 7
(believability) represented by the following formula: Dmax Threshold of the tree depth 3
Tsim The number of simulations 60 frames
beforeHPjopp − afterHPjopp Tmax Execution time of MCTS 16.5 ms
Bj = tanh , (7) Scale Scaling parameter 30
Scale
where beforeHPjopp and afterHPjopp stand for HP of the
opponent before and after the jth simulation, respectively, A. FightingICE
and Scale is a constant. If the AI gives a high amount of FightingICE (Fig. 4) is a real-time 2D fighting game
damage to the opponent, Bj will obtain a high evaluation platform used in a game AI competition (FTGAIC) at CIG
value. Therefore, this term makes the evaluations of aggressive since 2014 [3]. This game has all main elements of fighting
actions aimed at defeating the opponent higher than non- games. In addition, it does not use a ROM emulator and has
aggressive ones. been originally developed from scratch and publicly made
The coefficient α in formula (6) is dynamically determined available for research purpose (see [10-14] for other recent
by formula (8) based on the current game situation: publications using this platform), so there are no legal issues
 beforeHP my −beforeHP opp 
j j
to be concerned. In FightingICE, one game consists of three
tanh Scale +1 60-second rounds and one frame is set to 1/60 seconds. Each
α= , (8)
2 AI has to decide and input an action in one frame. Each
where beforeHPjmy and beforeHPjopp stand for HP of the AI character’s initial HP is set to HPmax , and it will decrease
and the opponent, respectively, before the jth simulation, and when the character is hit. After 60 seconds or either character’s
Scale is a constant. The more the AI is winning against the HP is 0, the game will proceed to the next round, and each
opponent, the closer α reaches 1. Conversely, the more the character’s HP will be reset to HPmax . The character with the
AI is losing against the opponent, α becomes closer to 0. larger remaining HP at the end of the round is the winner.
Therefore, this coefficient makes it easier for the AI to select In our experiments, the value of HPmax is set at 400
actions suitable for difficulty adjustment (Ej ) when the AI is according to the rule of Standard Track of FTGAIC.
winning and select those increasing its aggressiveness (Bj )
B. Parameters
when the AI is losing. Also, when the HP difference is zero
which means the AI is evenly fighting against the opponent, The parameters used in our experiments are shown in
α becomes 0.5. In that situation, the AI selects actions that Table I. These parameters were set empirically through pre-
maintain both difficulty and believability. experiments.
In summary, the mechanism of our proposed method is C. Methods
making the AI select actions by considering not only how to
adjust its difficulty toward the opponent’s skill but also always We conducted subjective experiments to verify whether
how to defeat it. BEAI can adjust its strength according to the opponents’
skill while maintaining its believability. We used 38 subjects
V. E XPERIMENTS (average age: 23.4 ± 2.2) in our experiments. Before starting
In this section, we describe the conducted experiments to our experiments, we conducted an informed consent session
verify the performance of our proposed DDA-AI (Believable about our experiments, and subjects’ consents were obtained
Entertaining AI: BEAI). with their signature in a separate informed consent form. In
addition, we used eAI and TPOSAS for comparison. Our TABLE II
experiments were conducted for two days; the first day is to C ONTENT OF QUESTIONNAIRE
measure each subject’s skill of fighting games (Exp. 1) while
Dimension Index Content
the second day is to have them individually fight against eAI, 1 I felt it content
Positive Affect
TPOSAS and BEAI (Exp. 2). The content of Exp. 1 and Exp. 2 I felt it enjoyable
2 is given below. 3 I felt it challenged
Challenge
4 I felt it stimulated
1) Measurement of fighting games’ skill (Exp. 1): 5 The opponent’s attack skills were believable
Believability
The procedure of Exp. 1 is as follows: 6 The opponent’s dodging skills were believable

1) Explain the experiments and how to operate the charac-


ter in FightingICE. TABLE III
S UBJECT GROUPING IN TERMS OF FIGHTING GAME SKILL
2) Ask each participant to fight against a non-action-AI for
five minutes as practice. Name Average Median # of people
3) Ask each participant to fight against an MCTS-Based Expert 3 ± 38 4 11
high-performance AI (MctsAi) for one game. Intermediate −151 ± 29 -159 12
Beginner −225 ± 25 -221 15
4) Ask each participant to answer a questionnaire.
5) Repeat Steps 3 and 4 two times.
At Step 3, we used a sample AI of FTGAIC proposed by values of three factors in the questionnaire. Video clips show-
Yoshida et al. [9]. The questionnaire used at Step 4 is shown ing typical gameplay by participants against these AIs are
in Table II. This questionnaire was made with reference available1 .
to previous studies [15] and [16]. We asked the subjects
D. Average HP Difference Throughout the Game
to evaluate each question in a 5-Likert scale (1: Strongly
Disagree, 2: Disagree, 3: Neither, 4: Agree, 5: Strongly Agree). AHDTG is introduced by the authors to evaluate how the AI
The evaluation of each factor is the average of the evaluation can dynamically adjust its difficulty according to the opponent
values of all questions (two in our case) belonging to each throughout the game, defined by the following formula:
factor. PFtotal
After finishing Exp. 1, we divided all subjects into three |HPimy − HPiopp |
AHDT G = i=1 , (9)
groups (G1, G2 and G3). We then confirmed that there is no Ftotal
significant difference between three groups in terms of both the where HPimy and HPiopp stand for HP of the AI and the
average HP difference against the MctsAi and the evaluation opponent at the frame i, respectively, and Ftotal stands for
value of the Challenge factor, using a Kruskal-Wallis test. the total number of frames in this round. If the AI evenly
2) Fighting against eAI, TPOSAS and BEAI (Exp. 2): fight against the opponent throughout the round, the value of
The procedure of Exp. 2 is as follows: AHDTG becomes small. This indicates that the smaller the
1) Explain the experiments and how to operate the charac- value of AHDTG is, the more the AI can dynamically adjust
ter in FightingICE. its difficulty according to the opponent’s skill throughout the
2) Ask each participant to fight against a non-action-AI for round.
five minutes as practice. VI. R ESULTS AND D ISCUSSIONS
3) Ask each participant to fight against an AI for one game.
4) Ask each participant to answer a questionnaire. In this section, we show the experimental results and our
5) Repeat Step 3 and 4 for all AIs discussions in terms of AHDTG and each factor of the
questionnaire. Note that the symbols ∗ and ∗∗ used in figures
The questionnaire used in this experiment is the same as and tables in this section represent a significant difference at
the one in Exp. 1, rather than rank-based questionnaires [17] 5% and 1%, respectively.
where participants are asked to compare the play session they
have just finished with the former one. This is because in A. Subject grouping
our case the time window between the two consecutive play From the result of Exp. 1, we divided subjects into three
sessions is up to 3 minutes – one game – by which participants groups –Expert, Intermediate and Beginner– based on the
might not be able to make precise comparison. The fighting
1 http://www.ice.ci.ritsumei.ac.jp/˜ruck/dda-cig2018.htm
order of each AI was determined according to the Latin-square
method as follows:
• G1 eAI→TPOSAS→BEAI TABLE IV
• G2 TPOSAS→BEAI→eAI R ESULTS OF A F RIEDMAN TEST ON AHDTG IN EACH GROUP
• G3 BEAI→eAI→TPOSAS
Name p-value
In Exp. 2, we evaluated each AI’s performance using a Expert .078
metric called Average HP Difference Throughout the Game Intermediate .017∗
(AHDTG), described in Section VI-D, and the evaluation Beginner .006∗∗
TABLE V
R ESULTS OF A F RIEDMAN TEST ON P OSITIVE A FFECT IN EACH GROUP

Name p-value
Expert .658
Intermediate .084
Beginner .723

average HP difference at the end of the game against the


MctsAi, using the k-means method with k = 3. Table III shows
the result of subject grouping. In Table III, the column Average
represents the aforementioned average HP difference and the
standard deviation of subjects belonging to each group, each
playing three games. Fig. 6. Average evaluations of Positive Affect toward gameplay against eAI,
TPOSAS and BEAI, in each group
B. AHDTG
Fig. 5 shows the average AHDTGs against eAI, TPOSAS
and BEAI in each group. In Fig. 5, the x-axis represents the
group names, the y-axis represents the value of AHDTG, and
the error bars represent standard deviations of AHDTG for
the three AIs in each group. We can see that BEAI obtains
less AHDTG than eAI and TPOSAS against Intermediate and
Beginner. According to our analysis of the gameplay, BEAI
tends to behave aggressively especially in the game situation
where the HP difference is around zero, due to the new
evaluation term about its aggressiveness, compared with other
two AIs. Therefore, one can consider that compared to eAI
which tends to behave strangely like intentionally filling up
the HP difference after the value becomes too large (AI losing Fig. 7. Average evaluations of Challenge toward gameplay against eAI,
too much), BEAI could evenly fight against the opponent like TPOSAS and BEAI, in each group
a seesaw game shown in Fig. 2 (d).
However, BEAI obtains more AHDTG than eAI and
TPOSAS against Expert. According to our observation, some its strength against expert players compared with the other two
players in Expert adopted a fighting style like “counter-attack” AIs.
by which they appropriately hit their attacks to the opponent Table IV shows the results of a Friedman test on AHDTG
against the opponent’s conducted actions. We could often see in each group. There are significant differences at 5% and
that these players hit their strong attacks such as the ultimate 1% between the three AIs in Intermediate and Beginner,
attack to BEAI when it stepped forward in order to shorten respectively. From these results, we can conclude that BEAI
the distance; in other words, they exploited BEAI’s aggressive could dynamically adjust its difficulty against intermediate and
behaviors against the AI. For this reason, BEAI couldn’t adjust beginner players throughout the game compared to the existing
DDA methods.

C. Positive Affect
Fig. 6 shows the average evaluations of Positive Affect
toward gameplay against eAI, TPOSAS and BEAI, in each
group. In Fig. 6, the x-axis represents the group names,
the y-axis represents the evaluation value (1: Boring ∼ 5:
Enjoyable) of Positive Affect, and the error bar represents the
standard deviation of it in each group. We can see that BEAI
obtains higher evaluation values than eAI and TPOSAS against
Expert and Beginner. However, it obtains a lower evaluation
value than eAI against Intermediate. From our analysis, we
could observe that BEAI often forced players to fight in
the close range compared to the other two AIs. Subjects
belonging to Intermediate fought against their opponent AIs
Fig. 5. Average AHDTGs against eAI, TPOSAS and BEAI, in each group using various actions and strategies, similar to those players in
TABLE VI
R ESULTS OF A F RIEDMAN TEST ON C HALLENGE IN EACH GROUP

Name p-value
Expert .187
Intermediate .024∗
Beginner .840

TABLE VII
R ESULTS OF A F RIEDMAN TEST ON B ELIEVABILITY IN EACH GROUP

Name p-value
Expert .886
Intermediate .042∗
Beginner .420

Fig. 8. Average evaluations of Believability toward gameplay against eAI,


TPOSAS and BEAI, in each group
Expert. However, since their skill is not that high, compared
to Expert, they couldn’t fight the way they wanted because of
the BEAI’s aggressive behavior compared to eAI, which led represents the evaluation value (1: Unnatural ∼ 5: Believable)
to the decrease in affect evaluation toward gameplay against of Believability, and the error bar represents the standard devi-
BEAI. Although having this issue, BEAI obtains more than ation for each AI in each group. We can see that BEAI obtains
3.75 points in all groups. Thus, we can still say that the higher evaluation than eAI and TPOSAS against Intermediate.
subjects evaluated fighting against BEAI favorably. From our analysis, we could observe that BEAI conducted less
Table V shows the results of a Friedman test on Positive Af- unnatural actions as mentioned in Section III-C, especially the
fect in each group. There is no significant difference between game situations where the HP difference is around zero. Thus,
the three AIs in all groups. From these results, although there our proposed evaluation function could dynamically adjust
is no significant difference, we can conclude that BEAI could the AI’s difficulty while restraining its unnatural actions, and
entertain expert and beginner players more than the existing improve the evaluation value of Believability evaluated by
DDA methods. intermediate players.
Table VII shows the results of a Friedman test on Believ-
D. Challenge ability in each group. There is a significant difference at 5%
Fig. 7 shows the average evaluations of Challenge toward between the three AIs in Intermediate. From these results,
gameplay against eAI, TPOSAS and BEAI, in each group. we can conclude that BEAI could adjust its difficulty while
In Fig. 7, the x-axis represents the group names, the y-axis restraining its unnatural actions against intermediate players.
represents the evaluation value (1: Too weak ∼ 3: Good
difficulty ∼ 5: Too strong; note that 3 is the best) of Challenge, VII. C ONCLUSIONS AND F UTURE W ORK
and the error bar represents standard deviation for each AI in In order to improve players’ skill while at the same time
each group. We can see that BEAI obtains higher evaluation entertaining them, AIs are needed that can evenly fight against
values than eAI and TPOSAS against Expert and Intermediate. their opponent beginner and intermediate players; such AIs are
However, BEAI obtains lower evaluation than the other two called DDA-AIs. In addition, in order not to impair the players’
AIs against Beginner. From our analysis, we could observe playing motivation due to the AI’s unnatural actions, DDA
that subjects belonging to Beginner often used simple strate- methods that can restrain their unnatural actions are needed.
gies such as stepping forward and punching or kicking. As In this paper, we proposed an MCTS fighting game AI capable
BEAI behaves aggressively, there were many situations where of DDA while maintaining its believable behaviors, targeting
subjects and BEAI gave damage to each other in close ranges. beginner-level and intermediate-level players. We used eAI
Thus, they evaluated BEAI to be too strong due to these proposed previously by our group [1] as a based AI (eAI)
situations, compared to the other two AIs. and introduced a new evaluation term on action believability,
Table VI shows the results of a Friedman test on Challenge to the AI’s evaluation function, that focuses on increasing the
in each group. There is a significant difference at 5% between amount of damage to the opponent. In addition, we introduced
the three AIs in Intermediate. From these results, we can a parameter that dynamically changes its value according to
conclude that BEAI could adjust its difficulty against expert the current game situation in order to balance this new term
and intermediate players in a way that they felt the opponent with the existing term in the evaluation function.
AI’s difficulty was suitable for them. From our experimental results, our proposed DDA-AI
showed the best performance in terms of average HP difference
E. Believability throughout the game (AHDTG), Challenge and Believability
Fig. 8 shows the average evaluations of Believability toward against intermediate players, and AHDTG against beginner
gameplay against eAI, TPOSAS and BEAI, in each group. players. As a result, we conclude that our proposed DDA-AI
In Fig. 8, the x-axis represents the group names, the y-axis could dynamically adjust its strength to its opponent human
players’ skill, especially intermediate players, while restrain- [11] X. Neufeld, S. Mostaghim, and D. Perez-Liebana, “HTN fighter: Plan-
ing its unnatural actions throughout the game. The proposed ning in a highly-dynamic game,” in Proc. 2017 Computer Science and
Electronic Engineering (CEEC 2017), Colchester, pp. 189-194, Sep.
evaluation function (6) has a potential to be applied to MCTS- 2017.
based AIs in other games to maintain the aggressiveness [12] S. Yoon and K.-J. Kim, “Deep Q Networks for Visual Fighting Game
while shrinking the performance gap with the opponent human AI,” in Proc. 2017 IEEE Conference on Computational Intelligence and
Games (CIG 2017), 2017.
player, in particular when the AI is winning. [13] M.-J. Kim and K.-J. Kim, “Opponent Modeling based on Action Table
However, although our proposed DDA-AI was evaluated for MCTS-based Fighting Game AI,” in Proc. 2017 IEEE Conference
favorably by intermediate and beginner players, it could not on Computational Intelligence and Games (CIG 2017), 2017.
[14] D.T.T Nguyen, V. Quang and K. Ikeda, “Optimized Non-visual In-
significantly improve the evaluation value of Positive Affect, formation for Deep Neural Network in Fighting Game,” in Proc. 9th
compared to eAI. For future work, we plan to develop a International Conference on Agents and Artificial Intelligence (ICAART
new mechanism for entertaining players while keeping its 2017), pp. 676-680, 2017.
[15] W. A. IJsselsteijn, K. Poels, and Y. De Kort, “The Game Experience
believability. It might also be interesting to combine the Questionnaire: Development of a Self-report Measure to Assess Player
proposed DDA-AI with a mechanism that directly emulates Experiences of Digital Games”, Technical University Eindhoven, FUGA
human players [18]. In addition, we will also develop much Technical Report, 46 pages, 2007.
[16] B. Soni and P. Hingston, “Bots trained to play like a human are more
stronger AIs as based AIs for new DDA-AIs that can adapt to fun,” in Proc. International Joint Conference on Neural Networks, pp.
expert players. 363-369, 2008.
[17] G. N Yannakakis and J. Hallam, “Evolving opponents for interesting
interactive computer games,” in Proc. the 8th International Conference
on the Simulation of Adaptive Behavior (SAB04); From Animals to
Animats 8, pp. 499-508, 2004.
[18] S. Devlin, A. Anspoka, N. Sephton, P.I. Cowling,“Combining Gameplay
ACKNOWLEDGMENT Data with Monte Carlo Tree Search to Emulate Human Play,” in Proc.
Twelfth Artificial Intelligence and Interactive Digital Entertainment
The authors would like to thank the anonymous reviewers Conference (AIIDE 2016), pp. 16-22, 2016.
for their helpful comments. They would also like to thank their
lab members, in particular, the FightingICE team members for
their fruitful discussions. This research was partially supported
by Strategic Research Foundation Grant-aided Project for
Private Universities (S1511026), Japan.

R EFERENCES
[1] M. Ishihara, T. Miyazaki, T. Harada, and R. Thawonmas, “Analysis of
Effects of AIs and Interfaces to Players’ Enjoyment in Fighting Games,”
IPSJ Journal, vol. 57, no. 11, pp. 2415-2425, 2016, (in Japanese).
[2] K. Ikeda and S. Viennot, “Production of Various Strategies and Position
Control for Monte-Carlo Go - Entertaining human players,” in Proc.
Computational Intelligence in Games (CIG), 8 pages, 2013.
[3] F. Lu, K. Yamamoto, L. H. Nomura, S. Mizuno, Y. Lee, and R. Thawon-
mas, “Fighting Game Artificial Intelligence Competition Platform,” in
Proc. IEEE 2nd Global Conference on Consumer Electronics (GCCE),
pp. 320-323, 2013.
[4] J. Chen, “Flow in games (and everything else),” Communications of the
ACM, vol. 50, no. 4, pp. 31-34, 2007.
[5] M. Ishihara, T. Miyazaki, C. Y. Chu, T. Harada, and R. Thawonmas,
“Applying and Improving Monte-Carlo Tree Search in a Fighting Game
AI,” in Proc. 13th International Conference on Advances in Computer
Entertainment Technology (ACE 2016), no. 27, 2016.
[6] D.P. Liebana, J. Dieskau, M. Hunermund, S. Mostaghim, S. Lucas,
“Open Loop Search for General Video Game Playing,” in Proc. the 2015
Annual Conference on Genetic and Evolutionary Computation (GECCO’
15), pp. 337-344, 2015.
[7] L. Kocsis and C. Szepesvári, “Bandit Based Monte-Carlo Planning,” in
Proc. European Conference on Machine Learning (ECML), pp. 282-293,
2006.
[8] S. Demediuk, M. Tamassia, W. L. Raffe, F. Zambetta, X. Li, and F.
Mueller, “Monte Carlo Tree Search Based Algorithms for Dynamic
Difficulty Adjustment,” in Proc. Computational Intelligence and Games
(CIG), pp. 53-59, 2017.
[9] S. Yoshida, M. Ishihara, T. Miyazaki, Y. Nakagawa, T. Harada, and R.
Thawonmas, “Application of Monte-Carlo Tree Search in a Fighting
Game AI,” in Proc. IEEE 5th Global Conference on Consumer Elec-
tronics (GCCE), pp. 623-624, 2016.
[10] R. Ishii, S. Ito, M. Ishihara, T. Harada and R. Thawonmas, “Monte-Carlo
Tree Search Implementation of Fighting Game AIs Having Personas,” in
Proc. 2018 IEEE Conference on Computational Intelligence and Games
(CIG 2018), 2018.

You might also like