Utilizing Artificial Neural Networks to Detect Compound Events in Spatio-Temporal Soccer Data

Keven Richly
Hasso Plattner Institute
Prof.-Dr.-Helmert-Straße 2-3
Potsdam, Germany
[email protected]

Florian Moritz
Hasso Plattner Institute
Prof.-Dr.-Helmert-Straße 2-3
Potsdam, Germany
[email protected].uni-potsdam.de

Christian Schwarz
Signavio GmbH
Kurfürstenstraße 111
Berlin, Germany
[email protected]

ABSTRACT
In professional soccer, performance analytics about the skill level of a player and the overall tactics of a match support the success of a team. These analytics are based on positional data on the one hand and on events in the game (e.g. pass, shot on target) on the other hand. The positional data of the ball and the players is tracked automatically by cameras or sensors. However, the events are still captured manually by humans, which is time-consuming and error-prone. In this paper, we introduce a novel approach to detect events in soccer matches by utilizing artificial neural networks. As input for the neural network, we used several time-dependent features, which were calculated on the basis of the positional data. The evaluation of the results showed that it is possible to recognize soccer events in spatio-temporal data with a high accuracy. Apart from that, we discovered that the size of the used model and the data granularity have a strong influence on the quality of the predicted results.

KEYWORDS
event detection, neural networks, soccer analytics, spatio-temporal data

ACM Reference format:
Keven Richly, Florian Moritz, and Christian Schwarz. 2017. Utilizing Artificial Neural Networks to Detect Compound Events in Spatio-Temporal Soccer Data. In Proceedings of SIGKDD’17 Workshop on Mining and Learning from Time Series (MiLeTS), Halifax, Nova Scotia Canada, August 2017 (MiLeTS’17), 7 pages.
https://doi.org/

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
MiLeTS’17, August 2017, Halifax, Nova Scotia Canada
© 2017 Copyright held by the owner/author(s).
ACM ISBN . . . $15.00
https://doi.org/

1 INTRODUCTION
In recent years the use of spatio-temporal data has strongly increased in various areas. Especially in the highly competitive sports sector, new insights gained from positional information of players, tracked by different systems and methods during a game, can have a major impact on the training and tactics of a team [6]. For professional soccer clubs, performance analysis is an integral part of the coaching process [4]. In the context of performance analysis in soccer, many analyses are based on manually tracked and chronologically ordered lists of game events on the one hand or the positional information of the players on the other hand [13]. For that reason, the significance and accuracy of an analysis strongly correlate with the quality of the provided data. Detecting events manually is a time-intensive and error-prone task. Based on the data of matches of the German Bundesliga, we discovered that the events are not time-synchronized with the positional information and are sometimes associated with the wrong player. Therefore, in this paper we present the implementation and evaluation of a system that leverages a trained artificial neural network to automatically detect events in the positional data of soccer matches. Based on the data of the ball, we computed a set of significant features to characterize basic events in soccer (e.g. a pass). In order to train and test the neural network, we also created a gold standard on the basis of the positional data and video recordings of the matches. Additionally, we used a grid-search approach to optimize the configuration of the applied supervised learning algorithm. To evaluate the accuracy of our results we used the metrics precision, recall, and $F_1$-score.

The paper is organized as follows. In Section 2 we examine related work. Section 3 explains the properties of the provided data and introduces the created gold standard. Section 4 describes how the features are computed based on the positional data, and Section 5 shows how we used these features to train an artificial neural network. We also provide an evaluation of the quality of our results (see Section 6). Before we conclude the paper in Section 8, we present an overview of future work in Section 7.

2 RELATED WORK
Event detection from time-series data is an important task in many areas. Various publications demonstrate that artificial neural networks are a promising approach to this task. For example, in the biomedical domain, neural networks are used to detect epileptic spikes from EEG signals [8]. In industrial security, data from wireless sensor networks is monitored to detect fires or other hazards [19]. Neural networks are a common machine learning model for
these tasks [7, 8, 17]. In the world of sports, analytics and statistics are an important aspect in various decision processes of coaches, analysts, scouts, and managers. Wickramaratna et al. use neural networks to detect goal events from video data [20]. Lie et al. use neural networks to classify baseball hits based on video data [11]. The use of spatio-temporal data for sports and soccer analytics has received some attention from researchers. Yue et al. use statistical methods to evaluate player and team behavior of a soccer match based on two-dimensional data [21]. Kim et al. discuss several features that can be computed from two-dimensional tracking data [9]. Miller et al. use two-dimensional position data to detect shooting habits of basketball players [14]. Richly et al. compare the three machine learning approaches k-nearest neighbors, support vector machines, and random forests for recognizing kick events in soccer data [16]. There is also research on how positional data can be used to gain strategic insights. Lucey et al. compare team strategies in home and away games using a k-nearest neighbor approach with ball possession data [12]. As tracking data can be large in volume, Bialkowski et al. study player and team analysis on a large data set covering one complete soccer season [2]. Kim et al. take soccer analytics one step further and present a system that predicts short-term future ball positions based on motion fields calculated from video [10].

3 DATA FOUNDATION
As mentioned before, there are various providers of spatio-temporal data for professional soccer games. The quality, granularity, and accuracy of the data vary between providers and also strongly depend on the tracking technology used. The provided data sets typically consist of the positional information of the players and the ball, the manually tracked list of game events, and some metadata about the teams and players. In this paper, we focus on data of games of the German Bundesliga. Defined by the pitch size, the two-dimensional coordinates range from −52.5 to 52.5 for x and from −34 to 34 for y (for a pitch of 105 m × 68 m). Since the pitch size is not exactly defined, these numbers can differ for other stadiums. The center of the pitch has the coordinates (0, 0). Position values can exceed these limits, which indicates that the ball went out of bounds. Figure 1 shows the schematic layout of a soccer pitch and the coordinates of its bounds.

[Figure 1: Soccer pitch with dimensions of bounds. The origin (0|0) lies at the center of the pitch with the axes X and Y; the corners are at (-52.5|34.0), (52.5|34.0), (-52.5|-34.0), and (52.5|-34.0).]

The list of game events includes the timestamp, event type, and involved players. All events are classified into the categories pass, shot on target, neutral contact, clearance, duel, foul, offside, caution, and substitution. Several events, such as fouls, cautions, or substitutions, cannot be detected from the positional data of the ball and players alone. They also depend on other information, e.g. the signals of the referee. Additionally, the events are not synchronized with the positional information; the delay can be up to several seconds. To train and evaluate the supervised machine learning algorithms, we manually created a gold standard based on the video recordings of the games and the acceleration values of the ball. The gold standard includes the following match sections:

∙ Set A25
  Match: Berlin vs. Mainz
  Season: 2014/15
  Time: 00:00 - 03:08
  Temporal resolution: 25 Hz
∙ Set A10
  Match: Berlin vs. Mainz
  Season: 2014/15
  Time: 00:00 - 03:08
  Temporal resolution: 10 Hz
∙ Set B25
  Match: Berlin vs. Mainz
  Season: 2014/15
  Time: 25:00 - 31:42
  Temporal resolution: 25 Hz
∙ Set C10
  Match: Berlin vs. Braunschweig
  Season: 2013/14
  Time: 70:00 - 73:20
  Temporal resolution: 10 Hz

The presented data sets have different temporal resolutions. The original data has a resolution of 25 Hz. To filter noise and smooth the data, we applied a simple smoothing function to the provided data of sets A and C, which yields the 10 Hz sets A10 and C10. We suspect that this step could simplify the classification, especially for small data sets.

From the selected sections, we excluded the times when the ball was out of bounds or the game was paused. Afterwards, we compared the gold standard with the provided event list. For 121 out of 194 (62.4%) gold-standard events we were able to find a matching event, i.e. an event of the same type within a time period of two seconds. These events had an average time delay of 0.77 seconds. As a next step, we examined the assigned player for these events. For 18 of the 121 matched events (14.9%), the player was not assigned correctly.
Table 1: Tagged events for gold standard

                  Set A25/10   Set B25    Set C10    Total
Pass              49           36         50         135
Reception         17           17         12         46
Clearance         0            5          1          6
Shot on Target    2            3          2          7
Total Events      68           61         65         194
Played Time       3:08 min     6:42 min   3:20 min   13:10 min
Excluded Time     0:58 min     1:49 min   1:36 min   4:23 min
Total Time        2:10 min     4:53 min   1:44 min   8:47 min

4 FEATURE COMPUTATION
Multiple features of the tracked objects characterize specific events in soccer matches. These objects move on the soccer pitch and influence each other. Events occur when one or multiple features show a specific characteristic. In this section, we define the implemented features. All features are computed from the positional data described in the previous section. The positional data is received per tracked object as a 2-by-$n$ matrix, where $n$ is the number of data points collected in a specific time period. Each column vector represents the position of the object $o$ at a time $t$.

$$Pos_{o,n} = \begin{pmatrix} x_{o,t_1} & x_{o,t_2} & \cdots & x_{o,t_n} \\ y_{o,t_1} & y_{o,t_2} & \cdots & y_{o,t_n} \end{pmatrix} \tag{1}$$

We can derive the following definitions from the received positional data. The position of object $o$ at time $t$ is defined as $p(o, t)$, where $p_x(o, t)$ is the horizontal position and $p_y(o, t)$ the vertical position of object $o$ at time $t$. Based on the spatio-temporal data, we calculated the time-dependent movement features velocity, acceleration, and change of direction. In this context, we concentrated primarily on features of the ball, because it represents the main interaction point in the game.

To determine the velocity between two consecutive positions $p(o, t_1)$ and $p(o, t_2)$ with $t_2 = t_1 + 1$, we first compute the Euclidean distance between these points (Equation 2). Based on the distance, we can compute the average velocity, i.e. the rate of change of position over time, as defined in Equation 3.

$$dist(o, t_1) = \sqrt{(p_x(o, t_2) - p_x(o, t_1))^2 + (p_y(o, t_2) - p_y(o, t_1))^2} \quad \text{with } t_2 = t_1 + 1 \tag{2}$$

$$v(o, t) = \frac{\Delta dist(o, t)}{\Delta t} \tag{3}$$

Accordingly, Equation 4 determines the acceleration as the rate of change of the velocity over time.

$$a(o, t) = \frac{\Delta v(o, t)}{\Delta t} \tag{4}$$

While objects move on the soccer pitch, they will eventually change their direction $d$.

$$d(o, t_1) = p(o, t_2) - p(o, t_1) \quad \text{with } t_2 = t_1 + 1 \tag{5}$$

A linear movement results in no significant change of the direction feature, whereas rapid movement tends to have a notable change of direction. We computed the change of direction as visualized in Figure 2.

[Figure 2: Direction change of object. Labels in the sketch: positions P1, P2, and P3, direction vectors d1 and d2, direction change dc1, axes x and y.]

Given the three position data points $P_0 = p(o, t_0)$, $P_1 = p(o, t_1)$, and $P_2 = p(o, t_2)$, the first direction vectors are defined as $d_0 = d(o, t_0)$ and $d_1 = d(o, t_1)$. The angle enclosed by $d_0$ and $d_1$ is the change of direction $dc_1$. Possible values for direction changes range from 0 to 180 degrees. To determine the direction change, the arccos function is applied to the quotient of the scalar product of $d_0$ and $d_1$ and the product of their lengths. The direction change $dc$ of object $o$ at time $t_{n+1}$ is defined as follows:

$$dc(o, t_{n+1}) = \arccos\left(\frac{d(o, t_n) \cdot d(o, t_{n+1})}{|d(o, t_n)| \cdot |d(o, t_{n+1})|}\right) \tag{6}$$

5 EVENT DETECTION
In this section, we present our approach to recognizing events based on the features introduced in the previous section. The most central object of a soccer match is the ball. It is the object that shows the most frequent and most rapid movements on the pitch. Therefore, we computed all features based on the spatio-temporal data of the ball and created a vector for every time $t$ containing all corresponding feature values.

Velocity and acceleration describe the current momentum. Acceleration peaks are an indicator of interactions with the ball. The direction change feature covers ball interactions with high intensity (e.g. passes) as well as ball interactions with little intensity (e.g. ball touches during dribbling). Each vector describes an instant of the soccer match, and consecutive vectors can represent a certain event. Depending on the type of the event, features become more or less important and have characteristic values. To determine the specific events, we trained and used an artificial neural network.

Neural networks are biologically inspired models that can model complex non-linear functions [15]. They consist of several connected layers of artificial neurons, as shown in Figure 3. A neuron is a single computational function that maps several $x_i$ of an input vector $\vec{x}$ (and a bias term $b$) to a single output value $a$ called activation. It computes a weighted linear combination $z$ of the inputs by using different weights $w_i$ for each input and transforms it using a non-linear
activation function $h$ as shown in Equation 7. A commonly used activation function is the sigmoid function, which is defined in Equation 8.

$$a = h(z) = h\left(\sum_i w_i x_i + b\right) \tag{7}$$

$$\sigma(z) = \frac{1}{1 + \exp(-z)} \tag{8}$$

[Figure 3: Neural network consisting of 3 layers with a single output neuron.]

As shown in Figure 3, the neurons are arranged in layers, where the outputs or activations of a layer serve as inputs for the following layer. The term hidden layer denotes all layers between the input and output layer. The network maps the inputs $\vec{x}$ to outputs $\vec{y}$ based on the weights $W$ and bias terms $B$ of its neurons (cf. Equation 9). To compute the output for a given input, the activation values of each layer are computed beginning with the input layer. The next layer uses the activations of the previous layer (e.g. the input layer) as input. This process is called forward propagation.

$$\vec{y}(\vec{x}, W, B) = \vec{y} \tag{9}$$

The classification of various event types requires different output functions. For the multiclass classification of $k$ mutually exclusive classes, we used the softmax function [5]. The softmax function for $k$ classes is defined in Equation 10. In contrast to other activation functions, the output of the softmax function is the posterior probability of each class [3].

$$y_k(\vec{x}, W, B) = \frac{\exp(y_k)}{\sum_j \exp(y_j)} \tag{10}$$

To train the artificial neural network in a supervised manner, we used a gradient-descent algorithm [1]. As a first step, we initialized the network with random weights. Afterwards, the training input data with the associated outputs is used to train the neural network. Based on the corresponding error function, the gradient-descent algorithm searches for a minimum of that function by iteratively updating $W$ and $B$ in the direction of the negative gradient. However, this problem is not convex, and the minimum found may be only a local minimum [3].

5.1 System Architecture
The core of the event detection system is an artificial neural network. To train the system, the computed feature data is transferred from the database and preprocessed for each match period. As a first step, the data of each feature is normalized. Based on the data of the computed features, the windows are determined according to the labels of the gold standard (see Section 3). By analyzing the acceleration peaks (see Figure 4) and the corresponding video sequences, we determined the window size as 7 frames for the 10 Hz data sets ($A_{10}$, $C_{10}$) and 18 frames for the 25 Hz data sets ($A_{25}$, $B_{25}$). In the next step, the three-dimensional feature windows (one value per feature and frame) are flattened to form a one-dimensional training instance, which is used to train the network.

[Figure 4: Acceleration of the ball (squared and normalized, 10 Hz data).]

The network consists of three layers. The size of the input layer depends on the window size of the used data set. For the data sets $A_{10}$ and $C_{10}$ the input layer has 21 neurons (7 frames × 3 features) to account for the size of the flattened training instances. In the case of 25 Hz data the input layer has 54 neurons (18 frames × 3 features). The number of neurons in the hidden layer can be set to an arbitrary number. However, the number of hidden neurons can affect the detection performance significantly [15]. For that reason, we attempted to optimize this aspect of the network design (see Section 6). As mentioned in the previous section, the input layer and hidden layer use the sigmoid activation function, and for the output layer we used the softmax function. After the training phase the system can be used to determine the posterior probabilities for each class by analyzing the computed feature data windows.

The system is implemented in Python using the libraries for scientific computing numpy¹, scipy², and pandas³. The neural network implementation is based on the machine learning frameworks scikit-neuralnetwork⁴ and scikit-learn⁵.

¹ http://www.numpy.org/
² http://www.scipy.org/
³ http://pandas.pydata.org/
⁴ https://github.com/aigamedev/scikit-neuralnetwork
⁵ http://scikit-learn.org/

5.2 Model Parameters
There are several parameters that influence the detection accuracy of the underlying neural network model. Therefore, we tried to increase the performance by optimizing the configuration of the parameters for the soccer data.
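
To make the pipeline of Sections 4 and 5.1 more concrete, the following sketch computes the three movement features (Equations 2 to 6), normalizes them, flattens a window of 7 frames into 21 inputs, and runs one forward propagation with a sigmoid hidden layer and a softmax output layer. It is not the authors' implementation: it uses only the Python standard library, and all function names, the min-max normalization, the reading of Equation 3 as distance per frame divided by the frame duration, the use of the sigmoid in the hidden layer only, the three output classes, the random untrained weights, and the synthetic ball track are assumptions made for illustration.

```python
import math
import random


def movement_features(xs, ys, hz):
    """Return one (velocity, acceleration, direction change) row per frame."""
    dt = 1.0 / hz
    # Direction vectors between consecutive positions (Equation 5).
    d = [(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) for i in range(len(xs) - 1)]
    dist = [math.hypot(dx, dy) for dx, dy in d]               # Equation 2
    v = [s / dt for s in dist]                                # Equation 3
    a = [(v[i + 1] - v[i]) / dt for i in range(len(v) - 1)]   # Equation 4
    dc = []
    for i in range(len(d) - 1):                               # Equation 6
        norm = dist[i] * dist[i + 1]
        if norm == 0.0:
            dc.append(0.0)  # assumption: a resting ball has no direction change
            continue
        cos = (d[i][0] * d[i + 1][0] + d[i][1] * d[i + 1][1]) / norm
        dc.append(math.degrees(math.acos(max(-1.0, min(1.0, cos)))))
    # Drop the first velocity so that all three features share the same frames.
    return list(zip(v[1:], a, dc))


def normalize(rows):
    """Min-max scale each feature column to [0, 1] (assumed normalization)."""
    scaled = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled.append([(x - lo) / span for x in col])
    return list(zip(*scaled))


def flatten_window(rows, start, size):
    """Flatten `size` consecutive feature rows into one input vector."""
    return [x for row in rows[start:start + size] for x in row]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))                         # Equation 8


def softmax(zs):
    m = max(zs)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]                          # Equation 10


def affine(inputs, weights, biases):
    """Weighted sum plus bias for every neuron of a layer (inner term of Equation 7)."""
    return [sum(w * x for w, x in zip(ws, inputs)) + b
            for ws, b in zip(weights, biases)]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward propagation: sigmoid hidden layer, softmax output layer."""
    hidden = [sigmoid(z) for z in affine(x, w_hidden, b_hidden)]
    return softmax(affine(hidden, w_out, b_out))


# Hypothetical 10 Hz ball track: a straight run followed by a sharp turn.
xs = [0.5 * i for i in range(8)] + [3.5 - 1.0 * i for i in range(1, 9)]
ys = [0.0] * 8 + [0.8 * i for i in range(1, 9)]
rows = normalize(movement_features(xs, ys, hz=10))
x = flatten_window(rows, start=4, size=7)   # 7 frames * 3 features = 21 inputs

rng = random.Random(0)                      # random, untrained weights
n_hidden, n_classes = 20, 3                 # illustrative sizes
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in x] for _ in range(n_hidden)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_classes)]
probs = forward(x, w_hidden, [0.0] * n_hidden, w_out, [0.0] * n_classes)
print(len(x), round(sum(probs), 6))
```

Because the weights are random, the resulting class probabilities carry no meaning; the sketch only shows how a flattened feature window of the stated size passes through a three-layer network and yields one probability per class.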
In general, there are several parameters in a neural network that have an effect on the learning outcome. One of these is the architecture of the network. The number of neurons in the hidden layer is not specified and can be adjusted to the given data set. Neural networks with a higher number of neurons are able to represent the data characteristics more precisely, but they also risk overfitting the training data [15]. Therefore, we tried to find a number of hidden neurons that produces the highest general accuracy.

Another factor that has to be taken into account is the learning rate. It controls the rate at which the weights are updated on the basis of new information in the learning process. Low values result in a network that adapts very slowly. However, if these values are too high, the learning process may not converge [15]. To avoid overfitting, we implemented a technique called dropout. Here, a random number of activations is set to zero for each training instance. This mechanism helps to prevent co-adaptation of neurons on the training data [18]. A dropout rate that is too high hinders the effective learning of the network.

To find an optimal model, we used a grid-search approach to test multiple parameter configurations. Here, we list the parameters and the specified values we used:

∙ Number of Hidden Units
  We used different values for 10 Hz and 25 Hz to account for their different sizes of input to the network.
  Values (25 Hz): 1 to 50
  Values (10 Hz): 1 to 20
∙ Learning Rate
  Values: 0.1, 0.05, 0.01, 0.005, 0.001
∙ Dropout
  Values: 0, 0.01, 0.05, 0.1, 0.2

These parameters are augmented by the fixed parameter for the window size as described in the previous section. The grid-search implementation uses parallel processing to test different configurations in parallel and speed up the process. The results of the grid search for the presented parameters are evaluated in the following section.

6 EVALUATION
In this section, we present the evaluation results of our approach. Based on the presented data (see Section 3), we optimized the configuration of our neural network and compared the accuracy of the different settings. For the evaluation, we focused on pass events, which occur most frequently in the gold standard. A pass event consists of two consecutive actions: kick and reception.

6.1 Preliminaries
To evaluate the quality of our event detection, we used the gold standard to generate test and training instances. These instances are labeled windows over the feature data at specific time points. The system is trained on the training instances and then presented with the unknown test instances, which it has to label. Based on the label assigned by the system and the true event, we can compute the precision and recall scores to quantify the detection quality.

$$precision = \frac{true\ positives}{true\ positives + false\ positives} \tag{11}$$

$$recall = \frac{true\ positives}{true\ positives + false\ negatives} \tag{12}$$

We also computed the $F_1$-score, which is the harmonic mean of precision and recall, and use it as our main evaluation metric to compare different network settings.

$$F_1 = 2 \cdot \frac{precision \cdot recall}{precision + recall} \tag{13}$$

6.2 Model Optimization
Since we have data sets with different temporal resolutions, we conducted the parameter search for each data set separately. For the grid-search approach we selected the labeled data of the first minutes of the game Berlin vs. Mainz ($A_{25}$, $A_{10}$). Based on the data sets, we generated the training and testing instances. We tested each configuration by using a five-fold cross validation with a 60/40 split for training and testing instances. To compare the accuracy of the different configurations, the $F_1$-score was used. Table 2 shows the configurations that achieved the highest scores for the given data set.

Table 2: Optimal parameter configurations for the different temporal resolutions

Parameter                25 Hz   10 Hz
Number of Hidden Units   50      20
Learning Rate            0.01    0.01
Dropout                  0.05    0.01

6.3 Model Comparison
Using the configurations presented in the previous section, we analyzed the accuracy of the different models in more detail. To compare the performance of the 10 Hz and 25 Hz models, we tested each one using a 100-fold cross validation. Analogous to the configuration computation, we used a 60/40 split between training and testing instances. Afterwards, we calculated an overall precision, recall, and $F_1$-score for each iteration and averaged them over all iterations. The results are shown in Table 3. The evaluation shows that the model trained and tested on the 10 Hz data performs much better than the 25 Hz model, with an averaged $F_1$-score of 0.89 and averaged precision and recall scores of 0.89 and 0.90 respectively. The 25 Hz model only achieves averaged precision, recall, and $F_1$-scores of 0.52, 0.52, and 0.49.

In the next step, we compared the performance of the two models per class. To this end, we calculated the precision, recall, and $F_1$-scores for each class separately over all iterations. The results are shown in Table 4. As expected, we observed that the 10 Hz model has a higher accuracy for both classes compared to the 25 Hz model. For kick events, both models achieved
a similar recall value of 0.91 and 0.92 respectively. However, the 10 Hz model had a better precision score of 0.95 than the 25 Hz model, which had a precision score of 0.73. That results in an $F_1$-score of 0.93 for the 10 Hz model and one of 0.81 for the 25 Hz model. Both models detect most of the true kick events in the data. However, the lower precision of the 25 Hz model implies that this model is more likely to detect a false kick event. The comparison for the reception event shows a more diverging picture. The 10 Hz model achieved precision, recall, and $F_1$-scores of 0.82, 0.89, and 0.85. The 25 Hz model, however, did not perform as well, with only 0.32 for precision and 0.14 for recall and an averaged $F_1$-score of 0.18.

Table 3: Precision, recall and $F_1$-score, averaged over all classes

Data set   Precision   Recall   F1-Score
25 Hz      0.52        0.52     0.49
10 Hz      0.89        0.90     0.89

Table 4: Precision, recall and $F_1$-score per class

Data set   Event       Precision   Recall   F1-Score
25 Hz      Kick        0.73        0.91     0.81
25 Hz      Reception   0.32        0.14     0.18
10 Hz      Kick        0.95        0.92     0.93
10 Hz      Reception   0.82        0.89     0.85

This great difference in performance could be due to the fixed size of the gold standard and the fact that the 25 Hz model is more complex to train due to its larger structure. In this experiment we used the data of match $A_{10}$ and $A_{25}$, for which the gold standard holds 85 labeled kick events but only 34 reception events. One reason for the performance differences could be the unequal distribution of kick and reception events.

To summarize the previous experiment, we can state that a model trained and tested on 10 Hz data achieved a higher accuracy compared to one trained and tested on the 25 Hz data, using the given gold standard. This could be due to the fact that the 10 Hz data has been smoothed and therefore has fewer outliers.

6.4 Model Evaluation
In this section, we evaluate the effects on the performance of the event detection when the training and testing data are derived from different matches (merged matches strategy) or when the model is trained with data from one soccer match and tested with data from another (across matches strategy). For this evaluation, we focused on the 10 Hz model, because it produced the best results in the previous experiments. The motivation for this evaluation is that the characteristics (e.g. playing speed, team tactics) of a game could vary between different matches and teams.

First, we trained and tested the model on the merged data of two different matches ($A_{10}$ and $C_{10}$) to set a baseline. We merged the data sets of the matches Berlin vs. Mainz and Berlin vs. Braunschweig. Afterwards, we extracted training and testing instances based on a 60/40 split. The presented results for the merged matches strategy are averaged over 100 iterations with a random selection of training and testing values.

The second evaluated strategy is the across matches strategy. In this case, the training instances for the model were randomly selected from data set $A_{10}$, and the model was afterwards tested on instances of the data set $C_{10}$. The results of the two strategies are listed in Table 5 together with the single match results of the previous section.

Table 5: Comparison of overall results for different training and testing strategies for two matches.

Strategy         Precision   Recall   F1-Score
Merged Matches   0.81        0.75     0.75
Across Matches   0.65        0.73     0.66
Single Match     0.89        0.90     0.89

In general, we observed that the scores for the merged matches strategy are higher than the results of the across matches strategy. However, both have a lower performance in comparison to the single match strategy, where only one single match was used for training and testing. This suggests that we have to consider differences in playing style between different matches.

When we drill down and evaluate the performance per class (Table 6), the results show that the scores for kick events are generally higher than those for reception events in all evaluated strategies. For the kick events, both strategies produced results comparable to those of the model that was evaluated on a single match. One exception was the recall of the across matches strategy, which was clearly lower with a value of 0.72 compared to 0.93 and 0.92 for the other strategies. This implies that the across matches strategy will not be able to detect as many of the real kick events. In comparison to the scores of the kick class, the scores of the reception class are much lower. While for the merged matches strategy the receptions have a precision of 0.75 and a recall of 0.56, for the across matches strategy they have a precision of 0.39 and a recall of 0.75.

To conclude, the results showed that neural networks present a viable model to detect events in soccer data. Our experiments showed that keeping the complexity of the model low in combination with smoothed data helps to achieve better results. The best results were achieved if the model was trained and tested on data of a single match or mixed matches. For that reason, we recommend using merged data of different matches as training instances to classify completely new matches.

7 FUTURE WORK
Our experiments have shown that neural networks are generally a suitable model to perform event detection on spatio-temporal soccer data. However, since we used two-dimensional data without height information, the features we calculated
MiLeTS’17, August 2017, Halifax, Nova Scotia Canada

cannot capture ball movements on the z-axis. This leads to small inaccuracies in the computed features. For that reason, incorporating the z values could improve the accuracy of the features and consequently improve the results. As our experiments have shown, there is a difference in detection quality between the models working with the smoothed 10 Hz data and the 25 Hz data. We have to analyze further whether this is because the gold standard includes only a manageable number of events or because the data smoothing supports the learning capabilities. Accordingly, the effects of different smoothing functions on the accuracy of the results could be evaluated.

Table 6: Comparison of results by event for different training/testing strategies for two matches

Strategy         Event      Precision  Recall  F1-Score
Merged Matches   Kick       0.86       0.93    0.90
Merged Matches   Reception  0.75       0.56    0.61
Across Matches   Kick       0.92       0.72    0.81
Across Matches   Reception  0.39       0.75    0.51
Single Match     Kick       0.95       0.92    0.93
Single Match     Reception  0.82       0.89    0.85

8 CONCLUSION

In this paper, we presented a system that is able to detect events from spatio-temporal soccer data. Using two-dimensional positional data, we computed velocity, acceleration, and change-of-angle features to capture time-dependent movement information from the data. On these features, we then trained a neural network to detect kick and reception events and optimized its parameters through a grid-search approach. We evaluated and compared the event detection performance on raw 25 Hz data and smoothed 10 Hz data. Our experiments showed that the neural network trained and tested on 10 Hz data achieved an F1-score of 0.89, whereas a network for 25 Hz data achieved only a score of 0.49. Both models achieved high scores for kick events; however, the 10 Hz model performed substantially better on reception events, with an F1-score of 0.85 compared to 0.18 for the 25 Hz model. The evaluation of the precision, recall, and F1-score showed that neural networks are a viable model to detect events in spatio-temporal soccer data. Further experiments showed that training and testing on different matches has a significant effect on the accuracy of the results. This indicates that different matches and teams have different game characteristics, which influence the detection performance. To minimize those effects, the training data should consist of data from different matches.

REFERENCES
[1] Shun-ichi Amari, Andrzej Cichocki, Howard Hua Yang, et al. 1996. A new learning algorithm for blind signal separation. Advances in neural information processing systems (1996), 757–763.
[2] Alina Bialkowski, Patrick Lucey, Peter Carr, Yisong Yue, Sridha Sridharan, and Iain Matthews. 2014. Large-scale analysis of soccer matches using spatiotemporal tracking data. In 2014 IEEE International Conference on Data Mining. IEEE, 725–730.
[3] Christopher M Bishop. 2006. Pattern recognition. Machine Learning 128 (2006), 1–58.
[4] Christopher Carling, A Mark Williams, and Thomas Reilly. 2005. Handbook of soccer match analysis: A systematic approach to improving performance. Psychology Press.
[5] Alexandre de Brébisson and Pascal Vincent. 2015. An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family. arXiv preprint arXiv:1511.05042 (2015).
[6] Peter Dizikes. 2013. Sports analytics: a real game-changer. Massachusetts Institute of Technology, MIT News Mar 4 (2013).
[7] Wenchao Jiang and Zhaozheng Yin. 2015. Human Activity Recognition using Wearable Sensors by Deep Convolutional Neural Networks. In Proceedings of the 23rd ACM international conference on Multimedia. ACM, 1307–1310.
[8] Payal Khanwani, Susmita Sridhar, and Mrs K Vijaylakshmi. 2010. Automated Event Detection of Epileptic Spikes using Neural Networks. International Journal of Computer Applications 2, 4 (2010).
[9] Ho-Chul Kim, Oje Kwon, and Ki-Joune Li. 2011. Spatial and spatiotemporal analysis of soccer. In Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, 385–388.
[10] Kihwan Kim, Matthias Grundmann, Ariel Shamir, Iain Matthews, Jessica Hodgins, and Irfan Essa. 2010. Motion fields to predict play evolution in dynamic sport scenes. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 840–847.
[11] Wen-Nung Lie, Ting-Chih Lin, and Sheng-Hsiung Hsia. 2004. Motion-based event detection and semantic classification for baseball sport videos. In Multimedia and Expo, 2004. ICME’04. 2004 IEEE International Conference on, Vol. 3. IEEE, 1567–1570.
[12] Patrick Lucey, Dean Oliver, Peter Carr, Joe Roth, and Iain Matthews. 2013. Assessing team strategy using spatiotemporal data. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1366–1374.
[13] Rob Mackenzie and Chris Cushion. 2013. Performance analysis in football: A critical review and implications for future research. Journal of sports sciences 31, 6 (2013), 639–676.
[14] Andrew Miller, Luke Bornn, Ryan Adams, and Kirk Goldsberry. 2014. Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball. In ICML. 235–243.
[15] Thomas M Mitchell. 1997. Machine learning. New York (1997).
[16] Keven Richly, Max Bothe, Tobias Rohloff, and Christian Schwarz. 2016. Recognizing Compound Events in Spatio-Temporal Football Data. In International Conference on Internet of Things and Big Data (IoTBD).
[17] Sami Saalasti. 2003. Neural networks for heart rate time series analysis. University of Jyväskylä.
[18] Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
[19] Chinh T Vu, Raheem A Beyah, and Yingshu Li. 2007. Composite event detection in wireless sensor networks. In 2007 IEEE International Performance, Computing, and Communications Conference. IEEE, 264–271.
[20] Kasun Wickramaratna, Min Chen, Shu-Ching Chen, and Mei-Ling Shyu. 2005. Neural network based framework for goal event detection in soccer videos. In Seventh IEEE International Symposium on Multimedia (ISM’05). IEEE, 8–pp.
[21] Zengyuan Yue, Holger Broich, Florian Seifriz, and Joachim Mester. 2008. Mathematical analysis of a soccer game. Part I: Individual and collective behaviors. Studies in applied mathematics 121, 3 (2008), 223–243.
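
As a supplementary illustration of the features named in the conclusion (velocity, acceleration, and change of angle computed from two-dimensional positions), the following is a minimal sketch that assumes simple finite differences between consecutive frames. The function name `movement_features`, the use of metres as the unit, and the finite-difference formulation are illustrative assumptions; the feature definitions actually used in the paper may differ.

```python
import math


def movement_features(positions, hz):
    """Return per-frame (speed, acceleration, angle_change) tuples.

    positions: list of (x, y) coordinates (assumed to be in metres), one per frame.
    hz: sampling rate in frames per second, e.g. 10 or 25.
    Assumption: plain finite differences, not the authors' exact definitions.
    """
    dt = 1.0 / hz
    # Displacement vectors between consecutive frames.
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    speeds = [math.hypot(dx, dy) / dt for dx, dy in steps]
    headings = [math.atan2(dy, dx) for dx, dy in steps]

    features = []
    for i in range(1, len(steps)):
        accel = (speeds[i] - speeds[i - 1]) / dt
        # Wrap the heading difference into [-pi, pi].
        turn = headings[i] - headings[i - 1]
        turn = math.atan2(math.sin(turn), math.cos(turn))
        features.append((speeds[i], accel, turn))
    return features


# Hypothetical ball track at 10 Hz: straight along x, then a 90-degree turn.
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
features = movement_features(track, hz=10)
```

In this hypothetical track the speed stays constant while the last frame shows a heading change of roughly a quarter turn. An abrupt change of direction or speed of this kind is the sort of signal that kick and reception detection presumably relies on. The same function can be applied to 10 Hz and 25 Hz input by changing `hz`.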
