A Mahjong Game System Architecture Based On Empirical Knowledge
Zhang Xiaochuan1, Zhao Hailu1, Gan Chunyan1, Chen Junyu1, Zeng Le2, Huang Tongyuan1
1. Institute of Artificial Intelligence, Chongqing University of Technology, Chongqing 100080
E-mail: [email protected]
Abstract: Imperfect-information games have long been a field that researchers in computer game artificial intelligence
want to crack. As a typical imperfect-information game, mahjong has also received extensive attention from researchers.
This paper takes mahjong as the research carrier and proposes a mahjong game system architecture based on
empirical knowledge, which mainly includes an analysis of professional mahjong terms, the design of the mahjong
computer game system, and the core game strategies. In the strategy, a combination of an empirical algorithm and an
improved search tree algorithm is designed to determine the player's game behavior in different periods. In the search
tree algorithm, an evaluation function is constructed and a method to judge whether the hand is in a winning state is
proposed. Finally, the experimental results show that the proposed system architecture plays at a high level under the
rules of popular mahjong, and it also has positive reference significance for other mahjong rule sets and other
imperfect-information games.
Authorized licensed use limited to: TU Delft Library. Downloaded on November 28,2023 at 12:23:26 UTC from IEEE Xplore. Restrictions apply.
2. ANALYSIS OF PROFESSIONAL TERMS

2.1 Discretization of the Basic Rules of Mahjong

+ Basic rules
This kind of mahjong contains 108 tiles in three suits: dot (D), bamboo (B), and character (C). Each suit
contains the numbers one to nine, and there are four identical tiles of every number. Four players are seated in
the south, east, north, and west. Each player starts the game with 13 or 14 tiles. The available actions are Hu,
Waiting, Kan, Pon, and Chi, with priority Hu > Kan > Pon > Chi; a player can call waiting at any time when holding
the required combination of tiles. The execution time for each action is limited to 3 seconds. The game is over
when a player completes a winning hand consisting of 14 tiles.

+ Scoring rules
There are five ways to score in popular mahjong: Chi, Pon, Kan, Waiting, and Hu. The details of score
calculation, fan information and ways of Hu are shown in Table 1, Table 2 and Table 3:

Table 1. Scores calculation

Actions        Score
Waiting        1 fan x 3p
Chi            1 fan x 1p
Pon            2 fan x 2p
Open-kan       4 fan x 1p
Closed-kan     3 fan x 3p
Repaired-kan   1 fan x 1p

This section introduces some mahjong terms and defines some basic concepts, so that readers can
understand the rest of this article.
• Chi: The player holds two consecutive or separated tiles, and when the previous player discards the third
  consecutive tile or the middle tile, the player can call chi;
• Pon: The player holds a pair of identical tiles, and when any other player plays a third identical tile, the
  player can call pon;
• Open-kan: A kind of kan. If the player holds three identical tiles and any other player plays the fourth tile
  of the block, the player can call this action;
• Closed-kan: Also a kind of kan. When the player holds four identical tiles, the player can call this action. The
  difference between open-kan and closed-kan is that in closed-kan the player obtains the fourth tile from the
  tile-wall;
• Repaired-kan: After the player has already called a pon, the player gets the fourth tile identical to the tiles
  in that pon and can also call kan; this is called repaired-kan;
• Waiting: The player's hand is only one tile short of the Hu state, which is known as the waiting state. Players
  can choose whether to call waiting or not. After calling waiting, they receive a score reward, but they can no
  longer change tiles and can only discard whatever tile they get;
• Hu: The player gains the 14th tile and forms a specific combination of tiles, and can then call Hu to end the
  round;
• Shanten-number: If the hand can reach the waiting state after getting specific tiles n consecutive times, the
  current state is called n-shanten; n is also the least number of times needed to reach the waiting state [15].
  Calculating the shanten-number of a player's hand can better guide the player in getting and discarding tiles;
• Valid tiles: Tiles related to the player's current hand that can make the hand form shunzi (ABC), kezi (AAA)
  or a pair (DD) and satisfy the basic formula of Hu: (4 − a) × AAA + a × ABC + DD, where a ∈ [0, 4].

In addition, the player's hand is represented by a two-dimensional array [16], hand[][], with three rows and
nine columns. The three rows represent bamboo, dot, and character, respectively, and columns 0-8 represent the
tile numbers 1-9 in turn. For example, hand[1][3] = 2 indicates that the player holds two four-dot tiles.
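The array representation and the basic Hu formula can be illustrated with a short sketch. The Python code below is a minimal illustration, not the authors' implementation: the tile codes such as "4D" and the helper names make_hand, melds_only and is_hu are assumptions introduced here, and the Hu test uses plain backtracking rather than the table lookup the paper adopts in Section 3.3.4. Only the row order (bamboo, dot, character) and the column convention are taken from the text.

```python
from typing import List

# Row order follows the text: bamboo, dot, character.
SUIT_ROW = {"B": 0, "D": 1, "C": 2}


def make_hand(tiles: List[str]) -> List[List[int]]:
    """Build the 3 x 9 count array from codes such as "4D" (hypothetical notation)."""
    hand = [[0] * 9 for _ in range(3)]
    for code in tiles:
        number, suit = int(code[0]), code[1]
        hand[SUIT_ROW[suit]][number - 1] += 1
    return hand


def melds_only(row: List[int]) -> bool:
    """Backtracking test: can one suit row be split into kezi (AAA) and shunzi (ABC)?"""
    i = next((k for k, count in enumerate(row) if count > 0), None)
    if i is None:
        return True  # nothing left to place
    if row[i] >= 3:  # try a kezi on the lowest remaining number
        row[i] -= 3
        ok = melds_only(row)
        row[i] += 3  # undo the trial
        if ok:
            return True
    if i <= 6 and row[i + 1] > 0 and row[i + 2] > 0:  # try a shunzi i, i+1, i+2
        for k in (i, i + 1, i + 2):
            row[k] -= 1
        ok = melds_only(row)
        for k in (i, i + 1, i + 2):
            row[k] += 1  # undo the trial
        if ok:
            return True
    return False


def is_hu(hand: List[List[int]]) -> bool:
    """Check (4 - a) * AAA + a * ABC + DD for a 14-tile hand by trying every pair."""
    if sum(map(sum, hand)) != 14:
        return False
    for suit in range(3):
        for n in range(9):
            if hand[suit][n] >= 2:
                hand[suit][n] -= 2  # remove the candidate pair DD
                ok = all(melds_only(row) for row in hand)
                hand[suit][n] += 2  # restore the hand
                if ok:
                    return True
    return False


# Two four-dot tiles end up in hand[1][3], as in the example in the text.
print(make_hand(["4D", "4D"])[1][3])

# An illustrative 14-tile hand: pair 3C 3C, shunzi 2C 3C 4C, 1D 2D 3D, 7B 8B 9B, kezi 6D 6D 6D.
winning = ["2C", "3C", "3C", "3C", "4C", "1D", "2D", "3D",
           "6D", "6D", "6D", "7B", "8B", "9B"]
print(is_hu(make_hand(winning)))
```

Backtracking is chosen here only because it mirrors the formula directly; for the small 3 x 9 array its cost is negligible, but it is not the search-table method the paper reports as most efficient.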
3. DESIGN OF THE SYSTEM

3.1 Functional Design

First, the functional structure of the mahjong computer game system is designed, as shown in Figure 1:

[Figure: block diagram titled "The Mahjong Game System"]
Fig 1. The structure of system function

The data representation module in the figure covers the representation of tiles, the reception and transmission of
messages, the representation of player positions, etc., during the entire mahjong game; the player-side module refers
to the functions of the player side of the system, including login, video review, etc.; the referee-side module refers
to the […]

[…] and the specified data format information is returned and received through the interface to realize the player's
[…] strategy module, can access the network through HTTP API, and compete with other AI players online. At the same
time, the referee can create the game, start the game, and watch the team match in real time. After the end of the
competition, players can look back at the game video to view the game-related information.

The four-layer architecture adopted by the system is shown in Figure 3. The data access layer represents the
database used by the system; the business layer defines the access method of the database, and the system uses DAO
to access the database to reduce the degree of coupling; the application layer is the core business processing layer
of the system and is responsible for the connection between the previous layer and the next layer. The presentation
layer is a collection of components interacting with the system; the system receives requests or instructions from
the user through this layer and calls the functions of the application layer.
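The layering described above can be sketched in a few lines. The Python code below only illustrates the separation of concerns and makes several assumptions: the class and function names (GameRecord, GameDAO, GameService, handle_request), the request dictionary format, and the in-memory dictionary standing in for the database are all hypothetical and do not come from the system itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GameRecord:
    game_id: int
    players: List[str] = field(default_factory=list)
    started: bool = False


class GameDAO:
    """Business layer: the only code that touches storage.
    A dict stands in for the data access layer (the real database)."""

    def __init__(self) -> None:
        self._table: Dict[int, GameRecord] = {}

    def save(self, record: GameRecord) -> None:
        self._table[record.game_id] = record

    def find(self, game_id: int) -> Optional[GameRecord]:
        return self._table.get(game_id)


class GameService:
    """Application layer: core processing between presentation and DAO."""

    def __init__(self, dao: GameDAO) -> None:
        self._dao = dao

    def create_game(self, game_id: int) -> GameRecord:
        record = GameRecord(game_id)
        self._dao.save(record)
        return record

    def join(self, game_id: int, player: str) -> bool:
        record = self._dao.find(game_id)
        if record is None or record.started or len(record.players) >= 4:
            return False
        record.players.append(player)
        self._dao.save(record)
        return True

    def start_game(self, game_id: int) -> bool:
        record = self._dao.find(game_id)
        if record is None or len(record.players) != 4:
            return False  # a game needs exactly four seated players
        record.started = True
        self._dao.save(record)
        return True


def handle_request(service: GameService, request: dict) -> dict:
    """Presentation layer: maps a user request onto application-layer calls."""
    action, game_id = request["action"], request["game_id"]
    if action == "create":
        service.create_game(game_id)
        return {"ok": True}
    if action == "join":
        return {"ok": service.join(game_id, request["player"])}
    if action == "start":
        return {"ok": service.start_game(game_id)}
    return {"ok": False}


service = GameService(GameDAO())
handle_request(service, {"action": "create", "game_id": 1})
for name in ("east", "south", "west", "north"):
    handle_request(service, {"action": "join", "game_id": 1, "player": name})
print(handle_request(service, {"action": "start", "game_id": 1}))
```

Because the service only talks to the DAO interface, the dictionary could be swapped for a real database without touching the application or presentation code, which is the decoupling the text attributes to the DAO.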
3.3.1 Empirical Algorithm

Statistics show that in the first 8 rounds, players' hands generally have a high shanten-number, little
information is available in the game, and the state is changeable. It is therefore faster to use an empirical
algorithm to decide the player's action. The specific principles of the empirical algorithm are as follows:
(1) When the shanten-number equals 1, discard the tile that leaves the most types of waiting tiles;
(2) When the shanten-number is greater than 1, the single tile with the lowest degree of integration, that is,
the tile with the least contact with the surrounding tiles, is discarded first, following the principle of
discarding from the two sides toward the middle;
(3) When the shanten-number is greater than 1 and there is no single tile, the discard is selected from the
consecutive tiles. In this case, the tile types are more complicated and can be discussed separately;
(4) Chi: if the shanten-number equals 1, choose to Chi in the direction that yields the most types of tiles to
reach the waiting state; in addition, when the shanten-number is greater than 1, two consecutive tiles with a
quantity of 1 each also choose to Chi; for two interval tiles and the special tiles like 1 and 9; after the chi
action, the shanten-number is less than […]

After 8 rounds, every action of the player is very important. It is better to consider the information of all the
tiles in the game and search for the globally optimal tile, reaching the Hu state as soon as possible. This paper
simplifies the mahjong playing process as follows: playing-actions (chi, pon, kan) can be regarded as getting a tile
played by others or from the tile-wall to form shunzi (ABC), kezi (AAA) or pair […] where T_i = E_{h_i}, which can
greatly reduce the number of branches of the game tree and reduce the search depth of the game tree. D_i meets the
following definition:

Definition: Let […] and set […]. Every t ∈ D_i satisfies Formula (1) and Formula (2):

$$ h_{i+1} = h_i - \{t\} \tag{1} $$
$$ N_{h_{i+1}} = \min_{h_{i+1} \in H_{i+1}} \left( N_{h_{i+1}} \right) \tag{2} $$

Assuming that the player's hand is 233458, the constructed game tree is shown in Figure 5 below:

[Figure: game tree with Discard, Take and Win levels]
Fig 5. The process of construction game tree

Output: bestTile
function Search(h_i, p_i)
    value ← 0
    if Hu(h_i) == True
        value = p_i * score(h_i)
        return value
    else:
        calculate D_i for h_i
        max ← 0
        for t in D_i:
            h_{i+1} ← h_i - t
            value = 0
            p_{i+1} ← p_i
            calculate E_{h_{i+1}} for h_{i+1}
            for tile in E_{h_{i+1}}:
                h_{i+2} ← h_{i+1} + tile
                p_{i+2} ← p_{i+1} * p(tile)
                value += Search(h_{i+2}, p_{i+2})
                if value > max:
                    max ← value
                    bestTile ← t
                end if
            end for
        end for
        return bestTile
end function

In the game, card_num denotes the number of tiles in the player's hand. If card_num % 3 = 2, the player must
choose one tile to discard; at this point, the search tree is constructed to get the final tile. If card_num % 3 = 1,
the player needs to deal with the opponents' actions, such as chi, pon, kan, etc., which are decided by the
empirical algorithm.

3.3.3 Evaluation Method Design

In order to evaluate the progress of the state of mahjong, it is necessary to construct a function to estimate the
game situation. The evaluation method adopted in this article is to calculate the expected value of victory using
the following formulas (see Formula (3) and Formula (4)):

[…] (3)
$$ f(\mathrm{path}) = \prod_{i=1}^{n} p(\mathrm{tile}_i) \tag{4} $$

In the formulas, E(win) represents the expected winning value of the node; f(path) represents the probability of
choosing the path, which equals the product of the probabilities of the nodes along the path; score represents the
final score of the player's end hand state (leaf node): if the hand is not in Hu state, the score is 0; otherwise,
the score is calculated according to the fan type; p(tile_i) represents the probability that the player gets this
tile along the path. After evaluation, the path of the leaf node with the largest expected value of victory is
selected, and the resulting tile is obtained.

3.3.4 The Design of Hu Method

The algorithms to judge whether a player's hand is in Hu state are mainly the backtracking, disassembling, and
search-table methods. Both the backtracking and the disassembling methods place certain demands on computing power
and have high time complexity. Although the search-table method has certain requirements on memory, it has the
highest efficiency in judging Hu. Therefore, this paper adopts a special tile table based on the idea of
splitting. The structure of the table is shown in Figure 6; suppose the player's hand is 2334C123666D:

    const std::map<int, int> searchTable = {
        { 0, 0 }, { 3, 3 }, { 4, 3 },
        { 30, 30 }, { 31, 30 }, { 32, 30 }, { 33, 33 }, { 34, 33 }, { 44, 33 },
        { 111, 111 }, { 112, 111 }, { 113, 111 }, { 114, 114 },
        { 122, 111 }, { 123, 111 }, { 124, 111 },
        { 133, 111 }, { 134, 111 },
        { 141, 141 }, { 142, 141 }, { 143, 141 }, { 144, 144 },
        { 222, 222 }, { 223, 222 }, { 224, 222 },
        { 233, 222 }, { 234, 222 }, { 244, 222 },
        { 300, 300 }, { 301, 300 }, { 302, 300 }, { 303, 303 },
        { 311, 300 }, { 312, 300 }, { 313, 300 }, { 314, 300 },
        { 322, 300 }, { 323, 300 }, { 324, 300 },
        { 330, 330 }, { 331, 330 }, { 332, 330 }, { 333, 333 }, { 334, 333 },
        { 341, 330 }, { 342, 330 }, { 343, 330 }, { 344, 333 },
        { 411, 411 }, { 412, 411 }, { 413, 411 }, { 414, 414 },
        { 422, 411 }, { 423, 411 }, { 424, 411 }, { 433, 411 }, { 434, 411 },
        { 441, 441 }, { 442, 441 }, { 443, 441 }, { 444, 444 }
    };

Fig 6. The structure of Hu tile table

Step 1: Choose a pair of tiles to remove;
Step 2: Split the remaining hand into three parts by suit, here characters 111 and dots 111003;
Step 3: Reduce these numbers using the table: characters 111 − 111 = 0, dots 111003 − 111000 − 3 = 0;
Step 4: If characters, dots and bamboo are all 0, send the winning message.

4. EXPERIMENT

In order to verify the performance of the mahjong computer game system in this article, it was played against
EXP_Mahjong, whose strategies are all based on the empirical method, and SER_Mahjong, whose strategies are all based
on the search tree algorithm, for 3000 games each. The battle schedule is shown in Table 5, and the results of the
two experiments are shown in Table 6 and Table 7:

Table 5. The battle schedule

Seat           0         1     2     3
Experiment 2   EXP+SER   […]

The results of each set of 3000 games are as follows:

Table 6. The result of experiment 1

Version    Total score   Total fan   Hu times   Highest score in a game
EXP+SER    3085          6992        987        25
EXP1       1280          4521        726        18
EXP2       -2349         5689        783        18
EXP3       -2016         3876        504        16

Table 7. The result of experiment 2
Version    Total score   Total fan   Hu times   Highest score in a game
EXP+SER    4332          6218        1003       30
SER2       1476          4790        812        25
SER3       -2560         2660        481        18

According to the game data in the tables, the strategy that combines the empirical method with the improved
search tree algorithm gets more points. With the empirical method, tiles are played fast, and the
decision-making is mainly focused on the processing of isolated tiles. The search algorithm is used in the later
period, where the decision-making is more comprehensive. The program also participated in the 2020 National College
Student Computer Championship and won the second prize.

5. CONCLUSION

The authors of this article jointly constructed a popular mahjong game system architecture based on empirical
knowledge, and introduced the system construction and strategy design in detail. In the system, the player side and
the referee side are designed separately to make the game system more convenient to use. In the strategy part, an
improved algorithm was designed to avoid falling into a locally optimal solution.
In the future, in addition to improving the system architecture and using more efficient algorithms to enhance
the system's confrontation capability, the system should also minimize the number of times it gives a winning tile
to other players. What's more, we should pay attention to the analysis of opponents' hands and improve the system's
defense capabilities.

REFERENCES

[1] Y. J. Wang, H. K. Qiu, Y. Y. Wu, et al. Research and […]
[3] Jonathan Schaeffer, Neil Burch, Yngvi Bjornson, et al. Checkers […]
[…] 550(7676):354-359.
[…] game of Go with deep neural networks and tree search. 2016, 529(7587):484-489.
[6] Anonymous. Chinese artificial intelligence series white papers - […]
[7] Chaslot G, Bakkes S, Szita I, et al. Monte-Carlo Tree Search: A New Framework for Game AI[C]// Artificial
    Intelligence & Interactive Digital Entertainment Conference. DBLP, 2008.
[8] Mizukami N, Tsuruoka Y. Building a computer mahjong player based on Monte Carlo simulation and opponent
    models. In 2015 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 275-283). IEEE, 2015.
[9] Li J, Koyamada S, Ye Q, et al. Suphx: Mastering Mahjong with Deep Reinforcement Learning[J]. 2020.
[10] Jiewei Lei, et al. A Novel AI-based Method of Playing Incomplete Information Competition via Expectimax
    Search and Double […] doi:10.19678/j.issn.1000-3428.0057309.
[11] Mingyan Wang, Tianwei Yan, Mingyuan Luo, et al. A novel deep residual network-based incomplete information
    competition strategy for four-players Mahjong games. 2019, 78(16):23443-23467.
[12] Gao S, Okuya F, Kawahara Y, et al. Building a Computer Mahjong Player via Deep Convolutional Neural
    Networks[J]. 2019.
[13] Kurita M, Hoki K. Method for Constructing Artificial Intelligence Player with Abstraction to Markov Decision
    Processes in Multiplayer Game of Mahjong[J]. 2019.
[14] Aravind Rajeswaran, Igor Mordatch, Vikash Kumar. A Game Theoretic Framework for Model Based Reinforcement
    Learning. 2020.
[15] L. K. Chuang. A study of Mahjong Program Design, Taiwan, China, 2015.
[16] Q. Gao, X. H. Xu, H. Wang, et al. An empirical framework of Texas poker game system[J]. Journal of
    Intelligent Systems, 2020, 15(03): 468-474.