A Mahjong Game System Architecture Based on Empirical Knowledge

Zhang Xiaochuan¹, Zhao Hailu¹, Gan Chunyan¹, Chen Junyu¹, Zeng Le², Huang Tongyuan¹
1. Institute of Artificial Intelligence, Chongqing University of Technology, Chongqing 100080
E-mail: [email protected]

2. JJWorld (Chengdu) Network Technology Company, Chengdu 610000, China


E-mail: [email protected]

Abstract: Imperfect-information games have long been a problem that computer game researchers in artificial intelligence want to crack. As a typical imperfect-information game, mahjong has received extensive attention from researchers. This paper takes mahjong as its research carrier and proposes a mahjong game system architecture based on empirical knowledge, which includes an analysis of professional mahjong terms, the design of the mahjong computer game system, and the core game strategies. In the strategy, a combination of an empirical algorithm and an improved search tree algorithm is designed to formulate the player's behavior in different periods of the game. In the search tree algorithm, an evaluation function is constructed and a method for judging whether a hand is in a winning state is proposed. Finally, the experimental results show that the proposed system architecture plays at a high level under the rules of popular mahjong, and it also has positive reference significance for other mahjong rule sets and other imperfect-information games.

Key Words: Imperfect-information Game, Mahjong, Empirical Knowledge, Game Strategy


2021 33rd Chinese Control and Decision Conference (CCDC) | 978-1-6654-4089-9/21/$31.00 ©2021 IEEE | DOI: 10.1109/CCDC52312.2021.9602779

1. INTRODUCTION

Computer games are one of the most challenging fields in artificial intelligence [1]. With the rapid progress of computer technology in recent years, researchers have successively conquered board games such as chess, checkers, and Go [2]-[4]. In March 2016, AlphaGo [5] defeated Lee Sedol, a 9-dan professional Go player, marking a breakthrough on Go, the most difficult problem in the perfect-information area. Since then, researchers have gradually turned their attention to imperfect-information games, which are more challenging. Mahjong is an intellectual game suitable for all ages and deeply loved by people in Asian countries. In August 2020, the China Computer Game Championship officially included mahjong in its list of competition events, marking the official entry of computer mahjong into the public's view.

At present, computer mahjong still faces many problems. First, as a multiplayer game, mahjong has more hidden information than a typical two-player game: each player faces three opponents at the same time, and the opponents' tiles are not visible, so it is impossible to search for the player's best action by constructing a traditional game tree [6]-[7]. Second, mahjong is a non-sequential game: the action "Pon" disrupts the turn order of the players, which greatly increases the randomness and unpredictability of the game. In addition, the scoring rules are more complex than in other games; a game usually requires several rounds, and the accumulated points are obtained from each round's score.

In 2015, the mahjong AI system developed by Naoki Mizukami, a doctoral student in Tokyo, established an opponent model to infer part of the opponents' hidden information and combined it with Monte Carlo simulation to find the best action [8]. In August 2019, the mahjong AI system "Suphx", developed by Microsoft [9], became the strongest mahjong AI system by defeating many high-level players on the professional mahjong platform "Tenhou", using deep reinforcement learning. A team at Nanchang University combined Double DQN with the Expectimax search algorithm to improve the strategy and the evaluation function [10], which improved the winning rate of the system. In 2020, the team built a mahjong game system with ResNet-101, which also achieved good results [11].

Although deep reinforcement learning (DRL) improves the winning rate of mahjong AI systems [12]-[14], it does not address the construction of specific game strategies. The key to its design is the construction of the model and the selection of features. Moreover, it needs a large amount of high-quality training data to obtain an accurate decision model, which requires a lot of time and hardware resources. To sum up, this article develops a mahjong game system based on empirical knowledge to achieve an efficient mahjong AI system under the rules of the 2020 National College Student Computer Championship Mahjong Project, analyzes the construction process of a computer mahjong game, and aims to help computer game research move forward. The main contents are as follows: (1) the design of the interface and information transmission of the mahjong computer game system architecture; (2) a mahjong game strategy based on empirical knowledge and an improved game tree search.

This work is supported by National Nature Science Foundation under Grant 61702063.


Authorized licensed use limited to: TU Delft Library. Downloaded on November 28,2023 at 12:23:26 UTC from IEEE Xplore. Restrictions apply.
2. ANALYSIS OF PROFESSIONAL TERMS

2.1 Discretization of the Basic Rules of Mahjong

+ Basic rules
This kind of mahjong contains 108 tiles in three suits: dot (D), bamboo (B), and character (C). Each suit contains the numbers one to nine, and there are four identical tiles of every number. The four players are seated at south, east, north, and west. Each player starts the game with 13 or 14 tiles. The available actions are Hu, Waiting, Kan, Pon, and Chi, with priority Hu > Kan > Pon > Chi; a player can call Waiting at any time when holding a specific combination of tiles. The execution time for each action is limited to 3 seconds. The game is over when a player completes a winning hand consisting of 14 tiles.

+ Scoring rules
There are five ways to score in popular mahjong: Chi, Pon, Kan, Waiting, and Hu. The score calculation, fan information, and ways of Hu are shown in Table 1, Table 2, and Table 3.

Table 1. Scores calculation

Action          Score
Waiting         1 fan × 3p
Chi             1 fan × 1p
Pon             2 fan × 2p
Open-kan        4 fan × 1p
Closed-kan      3 fan × 3p
Repaired-kan    1 fan × 1p

Table 2. Fan information

Hand type       Fan number
Basic type      6 fan
Pon-pon         8 fan
Uniform-tiles   12 fan
Seven-pairs     12 fan

Table 3. Ways of Hu

Way                               Score
Get winning tile from others      Fan × 1
Get winning tile from tile-wall   Fan × 3

Using the above table definitions, the discretization of mahjong game information can be completed.

2.2 Basic Concepts

This section introduces some mahjong terms and defines some basic concepts, so that readers can understand the rest of this article.
• Chi: The player holds two consecutive tiles, or two tiles separated by one gap, and when the previous player discards the third consecutive tile or the middle tile, the player can call chi;
• Pon: The player holds a pair of identical tiles, and when any other player discards the third identical tile, the player can call pon;
• Open-kan: A kind of kan. If the player holds three identical tiles and any other player discards the fourth, the player can call this action;
• Closed-kan: Also a kind of kan. When the player holds four identical tiles, the player can call this action. The difference from open-kan is that the player draws the fourth tile from the tile-wall;
• Repaired-kan: After the player has already called pon, if the player draws the fourth tile identical to the tiles in that pon, the player can also call kan; this is called repaired-kan;
• Waiting: The player's hand is only one tile short of the Hu state, which is known as the waiting state. Players can choose whether to call waiting or not. After calling waiting, they receive a score reward, but they can no longer change their tiles and must discard whatever tile they draw;
• Hu: The player gains the 14th tile and forms a specific winning combination, and can then call Hu to end the round;
• Shanten-number: If the hand can reach the waiting state after drawing specific useful tiles n consecutive times, the current state is called n-shanten. n is also the least number of draws needed to reach the waiting state [15]. Calculating the shanten-number of a hand better guides the player in drawing and discarding tiles;
• Valid tiles: Tiles related to the player's current hand that can help the hand form a shunzi (ABC), a kezi (AAA), or a pair (DD) and satisfy the basic formula of Hu: $(4 - a) \cdot AAA + a \cdot ABC + DD$, where $a \in [0, 4]$.

2.3 Design of Game State Information Table

For the information in the mahjong game, the following information tables can be designed: three opponent status tables, which record each opponent's discarded tiles; an overall status information table, which indicates the tiles that have appeared on the field; and a hidden information table, which shows all the unknown tiles, including the hands of the three opponents. The format of an information table is shown in Table 4, taking the overall status information table as an example.

Table 4. All tiles information

            1  2  3  4  5  6  7  8  9
Dot         3  2  2  1  0  0  0  2  3
Bamboo      0  1  1  0  0  1  0  0  1
Character   4  0  1  0  1  1  0  3  1

In addition, the player's hand is represented by a two-dimensional array [16], hand[][], with three rows and nine columns. The three rows represent bamboo, dot, and character, respectively, and columns 0-8 represent the tile numbers 1-9 in turn. For example, hand[1][3] = 2 indicates that the player has two four-dot tiles.
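The table layout described in Section 2.3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names `empty_table`, `add_tile`, and `hidden_table` are hypothetical, and the rule used for the hidden information table (four copies minus the copies seen on the field minus the copies in the player's own hand) is an assumption inferred from the description.

```python
SUITS = ("bamboo", "dot", "character")  # row order given in the text
COPIES_PER_TILE = 4                      # every tile exists four times


def empty_table():
    """Return a 3 x 9 table of zeros (rows: suits, columns: numbers 1-9)."""
    return [[0] * 9 for _ in SUITS]


def add_tile(table, suit, number, count=1):
    """Record `count` copies of the tile `number` (1-9) of `suit`."""
    row, col = SUITS.index(suit), number - 1
    if not 0 <= col < 9:
        raise ValueError("tile number must be between 1 and 9")
    if table[row][col] + count > COPIES_PER_TILE:
        raise ValueError("more than four copies of one tile")
    table[row][col] += count


def hidden_table(visible, hand):
    """Assumed rule: unknown copies = 4 - copies on the field - copies in own hand."""
    return [
        [COPIES_PER_TILE - visible[r][c] - hand[r][c] for c in range(9)]
        for r in range(len(SUITS))
    ]


hand = empty_table()
add_tile(hand, "dot", 4, count=2)   # the example from the text: hand[1][3] = 2

visible = empty_table()
add_tile(visible, "dot", 4)         # one four-dot has already been discarded

hidden = hidden_table(visible, hand)
print(hand[1][3], hidden[1][3])     # 2 1
```

Keeping all three tables in the same 3 × 9 shape means the hidden table can be derived by element-wise subtraction, which is convenient when estimating how likely a tile is to be drawn.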

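The basic formula of Hu from Section 2.2, $(4 - a) \cdot AAA + a \cdot ABC + DD$, can be checked on the 3 × 9 hand array with a small backtracking routine. The sketch below is only an illustration of the formula, assuming that tiles of exposed melds have already been removed from the array; it does not cover the seven-pairs hand, and the function names are hypothetical.

```python
def _melds_only(row):
    """True if a 9-entry count row splits entirely into kezi (AAA) and shunzi (ABC)."""
    row = list(row)  # work on a copy so the caller's row is untouched

    def solve(i):
        while i < 9 and row[i] == 0:
            i += 1
        if i == 9:
            return True
        if row[i] >= 3:                      # try a kezi starting at i
            row[i] -= 3
            ok = solve(i)
            row[i] += 3
            if ok:
                return True
        if i <= 6 and row[i + 1] > 0 and row[i + 2] > 0:  # try a shunzi i, i+1, i+2
            for k in range(3):
                row[i + k] -= 1
            ok = solve(i)
            for k in range(3):
                row[i + k] += 1
            if ok:
                return True
        return False

    return solve(0)


def is_basic_hu(hand):
    """True if the concealed tiles form melds plus exactly one pair (DD)."""
    if sum(map(sum, hand)) % 3 != 2:
        return False
    for r in range(3):
        for c in range(9):
            if hand[r][c] >= 2:
                hand[r][c] -= 2              # tentatively take this pair
                ok = all(_melds_only(row) for row in hand)
                hand[r][c] += 2              # restore the hand
                if ok:
                    return True
    return False


# bamboo 123 456 789, dot 111, character 55
winning = [[1] * 9, [3] + [0] * 8, [0, 0, 0, 0, 2, 0, 0, 0, 0]]
# same hand, but character 5 6 instead of the pair
not_winning = [[1] * 9, [3] + [0] * 8, [0, 0, 0, 0, 1, 1, 0, 0, 0]]
print(is_basic_hu(winning), is_basic_hu(not_winning))  # True False
```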

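As a worked example of Table 2 and Table 3, the score of a Hu can be read as the fan number of the hand type multiplied by 1 or 3, depending on where the winning tile came from. The dictionary keys and the function name below are hypothetical labels chosen for this sketch; only the numbers come from the tables.

```python
# Fan numbers from Table 2 (key names are illustrative labels).
FAN_BY_HAND_TYPE = {
    "basic": 6,
    "pon-pon": 8,
    "uniform-tiles": 12,
    "seven-pairs": 12,
}

# Multipliers from Table 3: winning tile from another player or from the tile-wall.
MULTIPLIER_BY_SOURCE = {"others": 1, "tile-wall": 3}


def hu_score(hand_type, source):
    """Score of a Hu: fan number of the hand type times the source multiplier."""
    return FAN_BY_HAND_TYPE[hand_type] * MULTIPLIER_BY_SOURCE[source]


print(hu_score("basic", "tile-wall"))   # 18
print(hu_score("pon-pon", "others"))    # 8
```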
3. DESIGN OF THE SYSTEM

3.1 Functional Design

First, the functional structure of the mahjong computer game system is designed, as shown in Figure 1.

[Figure 1: functional structure diagram of the Mahjong Game System]
Fig 1. The structure of System function

The data representation module in the figure covers the representation of tiles, the reception and transmission of messages, the representation of player positions, and so on during the entire mahjong game. The player-side module refers to the functions of the player side of the system, including login and video review. The referee-side module refers to the functions of the referee side of the system, including logging in, creating a game, and starting a game.

It is worth noting that the core strategy part of the system is freely designed by the player. The player sets the relevant information, and data in the specified format is sent and received through the interface to realize the players' real-time online confrontation mode.

The design of the system's game strategy, including the evaluation function, the Hu method, and the core algorithm, is introduced in detail in Section 3.3.

3.2 Structure Design

[Figure 2: physical structure diagram of the system]
Fig 2. The physical structure of the system

Figure 2 shows the physical structure of the system. The system is divided into a player side and a referee side. The AI server is the player side, that is, the program of the core strategy module; it can access the network through the HTTP API and compete with other AI players online. At the same time, the referee can create a game, start the game, and watch the team match in real time. After the competition ends, players can replay the game video to view game-related information.

The four-layer architecture adopted by the system is shown in Figure 3. The data access layer represents the database used by the system. The business layer defines how the database is accessed; the system uses DAO to access the database to reduce coupling. The application layer is the core business processing layer of the system and is responsible for connecting the layers above and below it. The presentation layer is a collection of components through which users interact with the system; the system receives requests or instructions from the user through this layer and calls the functions of the application layer.

[Figure 3: four-layer architecture diagram]
Fig 3. The four-layer architecture of the system

3.3 The Overall Design of Algorithm

Mahjong is a non-sequential, random, imperfect-information game, which cannot be handled by a conventional search tree algorithm. According to the rules of mahjong, decision-making actions are divided into two categories: discard actions and playing-actions (chi, pon, and kan). This article uses an empirical algorithm and an improved search tree algorithm to handle the two types of actions, and activates different algorithms in phases to make decisions in different periods of the game. In the first 8 rounds, we aim to form a specific hand quickly, and all decisions are made with the empirical method. After 8 rounds, the search algorithm is activated: the improved game tree is constructed from the tile information, and the globally optimal tile to discard is selected. In addition, decisions on playing-actions are still made with the empirical method. The activation process is shown in Figure 4, where count represents the number of rounds.
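The phased activation described in Section 3.3 amounts to a small dispatch rule. The following sketch assumes that the eighth round still belongs to the empirical phase and uses hypothetical names (`Strategy`, `select_strategy`, `is_discard`); it only selects which algorithm handles a decision and does not implement either algorithm.

```python
from enum import Enum


class Strategy(Enum):
    EMPIRICAL = "empirical"
    SEARCH_TREE = "search_tree"


EMPIRICAL_ROUNDS = 8  # threshold stated in the text


def select_strategy(count: int, is_discard: bool) -> Strategy:
    """Pick the algorithm for one decision.

    count      -- number of rounds played so far
    is_discard -- True for a discard decision, False for chi/pon/kan
    """
    if count <= EMPIRICAL_ROUNDS:
        return Strategy.EMPIRICAL      # early game: everything is empirical
    if is_discard:
        return Strategy.SEARCH_TREE    # late game: discards use the search tree
    return Strategy.EMPIRICAL          # playing-actions stay empirical


print(select_strategy(3, True).value)    # empirical
print(select_strategy(9, True).value)    # search_tree
print(select_strategy(12, False).value)  # empirical
```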

[Figure 4: flowchart from Start to End; recoverable node label: Search Tree Algorithm]
Fig 4. The specific process of the algorithm

3.3.1 Empirical Algorithm

Statistics show that in the first 8 rounds, players' hands generally have a high shanten-number, little information is available in the game, and the state changes quickly. It is therefore faster to decide the player's actions with an empirical algorithm. The specific principles of the empirical algorithm are as follows:
(1) When the shanten-number equals 1, discard the tile that leaves the hand waiting on the most types of tiles;
(2) When the shanten-number is greater than 1, the single tile with the lowest degree of integration, that is, the tile with the least contact with the surrounding tiles, is discarded first, following the principle of discarding from the two ends toward the middle;
(3) When the shanten-number is greater than 1 and there is no single tile, the discard is selected from the consecutive tiles. In this case, the tile types are more complicated and can be discussed separately;
(4) Chi: If the shanten-number equals 1, choose to chi in the direction that leaves the most types of tiles for reaching the waiting state. In addition, when the shanten-number is greater than 1, also choose to chi with two consecutive tiles that are each held once. For two tiles separated by a gap and for special tiles such as 1 and 9, choose to chi if the shanten-number after the chi action is lower than before;
(5) Pon: Since the score of this action is high, we choose to pon if the number of pairs is greater than 2 and the shanten-number decreases after this action;
(6) Kan: It always has a higher score and there are fewer opportunities to meet this situation, so we choose this action as soon as we can;
(7) Waiting: When the shanten-number [17] is greater than or equal to 2, the program sends the message for calling waiting to the server.

3.3.2 Search Algorithm

After 8 rounds, every action of the player is very important. It is better to consider the information of all the tiles in the game and search for the globally optimal tile, so as to reach the Hu state as soon as possible. This paper simplifies the mahjong playing process as follows: playing-actions (chi, pon, kan) can be regarded as getting a tile, either discarded by another player or drawn from the tile-wall, to form a shunzi (ABC), kezi (AAA), or pair (DD). Therefore, the game tree only contains getting-nodes and discarding-nodes.

We define the tuple $\langle A, H \rangle$ to represent the player's actions and hand changes during the entire mahjong game, where $A = \{a_1, a_2, a_3, \dots, a_i\}$ is the set of all actions of the player in the game and $H = \{h_1, h_2, h_3, \dots, h_i\}$ is the set of the player's hand states. Each hand $h$ is a set of tiles, $h = \{t_1, t_2, t_3, \dots, t_n\}$. In addition, $E_{h_i}$ denotes the set of valid tiles for the player's current hand, $N_{h_i}$ is the shanten-number of the hand $h_i$, and $D_i$ is the set of discarding-nodes of the game tree under the current hand, that is, all candidate discards. We use $T_i$ to denote the set of getting-nodes, where $T_i = E_{h_i}$; restricting the getting-nodes to valid tiles greatly reduces the number of branches and the search depth of the game tree. $D_i$ meets the following definition:

Definition: Every $t \in D_i$ satisfies Formula (1) and Formula (2):

$$h_{i+1} = h_i - \{t\} \qquad (1)$$

$$N_{h_{i+1}} = \min_{h_{i+1} \in H_{i+1}} \left( N_{h_{i+1}} \right) \qquad (2)$$

That is, a tile belongs to $D_i$ only if discarding it leaves a hand whose shanten-number is the minimum over all hands reachable by one discard.

Assuming that the player's hand is 233458, the constructed game tree is shown in Figure 5.

[Figure 5: game tree for the hand 233458 with Discard, Take, and Win levels]
Fig 5. The process of construction game tree

This article traverses the game tree recursively and combines the evaluation function (explained in Section 3.3.3) and the Hu method (explained in Section 3.3.4) to evaluate the leaf nodes (that is, the end states) and obtain their scores. The best discard is selected according to these scores. Algorithm 1 describes the specific process of traversing the search tree:

Algorithm 1. Traversing the search tree
Input: h_i, p_i
Output: value, bestTile (the caller discards bestTile)

function Search(h_i, p_i)
    if Hu(h_i) == True:
        return (p_i * score(h_i), none)          // leaf node: expected value of this path
    calculate D_i for h_i
    max ← 0
    bestTile ← none
    for t in D_i:
        h_{i+1} ← h_i − t
        p_{i+1} ← p_i
        value ← 0
        calculate E_{h_{i+1}} for h_{i+1}
        for tile in E_{h_{i+1}}:
            h_{i+2} ← h_{i+1} + tile
            p_{i+2} ← p_{i+1} * p(tile)
            (v, _) ← Search(h_{i+2}, p_{i+2})
            value ← value + v
        end for
        if value > max:
            max ← value
            bestTile ← t
        end if
    end for
    return (max, bestTile)
end function

In the game, we use card_num to denote the number of tiles in the player's hand. If card_num % 3 = 2, the player must choose one tile to discard; at this point, the search tree is constructed to select that tile. If card_num % 3 = 1, the player needs to respond to the opponents' actions with chi, pon, kan, etc., and these decisions are made by the empirical algorithm.

3.3.3 Evaluation Method Design

In order to evaluate the progress of the state of a mahjong game, it is necessary to construct a function that estimates the game situation. The evaluation method adopted in this article calculates the expected value of victory using Formula (3) and Formula (4):

$$E(\mathit{win}) = f(\mathit{path}) \times \mathit{score} \qquad (3)$$

$$f(\mathit{path}) = \prod_{i=1}^{n} p(\mathit{tile}_i) \qquad (4)$$

In the formulas, $E(\mathit{win})$ represents the expected winning value of the node; $f(\mathit{path})$ represents the probability of following the path, which equals the product of the probabilities of the nodes along the path; $\mathit{score}$ represents the final score of the player's end hand state (leaf node). If the hand is not in the Hu state, the score is 0; otherwise, the score is calculated according to the fan type. $p(\mathit{tile}_i)$ represents the probability that the player gets this tile along the path. After evaluation, the path to the leaf node with the largest expected value of victory is selected, and the corresponding tile is obtained as the result.

3.3.4 The Design of Hu Method

The algorithms for judging whether a player's hand is in the Hu state are mainly the backtracking, disassembling, and search-table methods. Both the backtracking and the disassembling methods place certain requirements on computing power and have high time complexity. Although the search-table method places certain requirements on memory, it is the most efficient at judging Hu. Therefore, this paper adopts a special tile table based on the idea of splitting. The structure of the table is shown in Figure 6.

    #include <map>

    const std::map<int, int> searchTable = {
        { 0, 0 }, { 3, 3 }, { 4, 3 },
        { 30, 30 }, { 31, 30 }, { 32, 30 }, { 33, 33 }, { 34, 33 }, { 44, 33 },
        { 111, 111 }, { 112, 111 }, { 113, 111 }, { 114, 114 },
        { 122, 111 }, { 123, 111 }, { 124, 111 },
        { 133, 111 }, { 134, 111 },
        { 141, 141 }, { 142, 141 }, { 143, 141 }, { 144, 144 },
        { 222, 222 }, { 223, 222 }, { 224, 222 },
        { 233, 222 }, { 234, 222 }, { 244, 222 },
        { 300, 300 }, { 301, 300 }, { 302, 300 }, { 303, 303 },
        { 311, 300 }, { 312, 300 }, { 313, 300 }, { 314, 300 },
        { 322, 300 }, { 323, 300 }, { 324, 300 },
        { 330, 330 }, { 331, 330 }, { 332, 330 }, { 333, 333 }, { 334, 333 },
        { 341, 330 }, { 342, 330 }, { 343, 330 }, { 344, 333 },
        { 411, 411 }, { 412, 411 }, { 413, 411 }, { 414, 414 },
        { 422, 411 }, { 423, 411 }, { 424, 411 }, { 433, 411 }, { 434, 411 },
        { 441, 441 }, { 442, 441 }, { 443, 441 }, { 444, 444 }
    };

Fig 6. The structure of Hu tile table

Suppose the player's hand is 2334C 123666D. The judgment proceeds as follows:
Step 1: We choose a pair of tiles to remove;
Step 2: Then, we split the remaining tiles by suit into count strings: characters 111, dots 111003;
Step 3: In this step we start to calculate these numbers. Characters: 111 − 111 = 0; dots: 111003 − 111000 − 3 = 0;
Step 4: When the values for characters, dots, and bamboo are all 0, the hand is in the Hu state and the winning message is sent.

4. EXPERIMENT

In order to verify the performance of the mahjong computer game system in this article (EXP+SER), we played it against EXP_Mahjong, whose strategies are all based on the experience method, and SER_Mahjong, whose strategies are all based on the search tree algorithm, for 3000 games each. The battle schedule is shown in Table 5, and the results of the two experiments are shown in Table 6 and Table 7.

Table 5. The battle schedule

Seat            0         1     2     3
Experiment 1    EXP+SER   EXP   EXP   EXP
Experiment 2    EXP+SER   SER   SER   SER

The results of each set of 3000 games are as follows:

Table 6. The result of experiment 1

Version    Total score   Total fan   Hu times   Highest score in a game
EXP+SER    3085          6992        987        25
EXP1       1280          4521        726        18
EXP2       -2349         5689        783        18
EXP3       -2016         3876        504        16

Table 7. The result of experiment 2

Version    Total score   Total fan   Hu times   Highest score in a game
EXP+SER    4332          6218        1003       30
SER1       -3248         3956        704        18
SER2       1476          4790        812        25
SER3       -2560         2660        481        18

According to the game data in Table 6 and Table 7, the version that combines the empirical method with the improved search tree algorithm (EXP+SER) scores more points than its opponents in both experiments. With the empirical method, tiles are played quickly, and decision-making focuses mainly on handling isolated tiles. The search algorithm is used in the later period, where decision-making is more comprehensive. The program also participated in the 2020 National College Student Computer Championship and won the national second prize.

5. CONCLUSION

The authors of this article jointly constructed a popular mahjong game system architecture based on empirical knowledge and introduced the system construction and strategy design in detail. In the system, the player side and the referee side are designed separately to make the game system more convenient to use. In the strategy part, an improved algorithm is designed to avoid falling into a local optimal solution.

In the future, in addition to improving the system architecture and using more efficient algorithms to enhance the system's confrontation capability, the system should minimize the number of times it gives a winning tile to other players. Moreover, we should pay attention to the analysis of opponents' hands and improve the system's defense capabilities.

REFERENCES

[1] Y. J. Wang, H. K. Qiu, Y. Y. Wu, et al. Research and development of computer games [J]. CAAI Transactions on Intelligent Systems, 2016, 11(06): 788-798.
[2] Campbell M, Jr H, et al. Deep Blue [J]. Artificial Intelligence (Amsterdam, Elsevier), 2002.
[3] Jonathan Schaeffer, Neil Burch, Yngvi Bjornson, et al. Checkers Is Solved. 2007, 317(5844): 1518-1522.
[4] David Silver, Julian Schrittwieser, Karen Simonyan, et al. Mastering the game of Go without human knowledge. 2017, 550(7676): 354-359.
[5] David Silver, Aja Huang, Chris J. Maddison, et al. Mastering the game of Go with deep neural networks and tree search. 2016, 529(7587): 484-489.
[6] Anonymous. Chinese artificial intelligence series white papers - machine game 2017 [C].
[7] Chaslot G, Bakkes S, Szita I, et al. Monte-Carlo Tree Search: A New Framework for Game AI [C]// Artificial Intelligence & Interactive Digital Entertainment Conference. DBLP, 2008.
[8] Mizukami N, Tsuruoka Y. Building a computer mahjong player based on Monte Carlo simulation and opponent models. In 2015 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 275-283). IEEE, 2015.
[9] Li J, Koyamada S, Ye Q, et al. Suphx: Mastering Mahjong with Deep Reinforcement Learning [J]. 2020.
[10] Jiewei Lei, et al. A Novel AI-based Method of Playing Incomplete Information Competition via Expectimax Search and Double DQN. Computer Engineering. doi:10.19678/j.issn.1000-3428.0057309.
[11] Mingyan Wang, Tianwei Yan, Mingyuan Luo, et al. A novel deep residual network-based incomplete information competition strategy for four-players Mahjong games. 2019, 78(16): 23443-23467.
[12] Gao S, Okuya F, Kawahara Y, et al. Building a Computer Mahjong Player via Deep Convolutional Neural Networks [J]. 2019.
[13] Kurita M, Hoki K. Method for Constructing Artificial Intelligence Player with Abstraction to Markov Decision Processes in Multiplayer Game of Mahjong [J]. 2019.
[14] Aravind Rajeswaran, Igor Mordatch, Vikash Kumar. A Game Theoretic Framework for Model Based Reinforcement Learning. 2020.
[15] L. K. Chuang. A Study of Mahjong Program Design, Taiwan, China, 2015.
[16] Q. Gao, X. H. Xu, H. Wang, et al. An empirical framework of Texas poker game system [J]. Journal of Intelligent Systems, 2020, 15(03): 468-474.
