Deep Learning from Spatial Relations for Soccer Pass Prediction

Chapter · April 2019


DOI: 10.1007/978-3-030-17274-9_14



Deep Learning from Spatial Relations
for Soccer Pass Prediction

Ondřej Hubáček, Gustav Šourek, and Filip Železný

Czech Technical University, Prague, Czech Republic


{hubacon2,souregus,zelezny}@fel.cvut.cz

Abstract. We propose a convolutional architecture for learning representations over spatial relations in the game of soccer, with the goal of predicting individual passes between players, as a submission to the prediction challenge organized for the 5th Workshop on Machine Learning and Data Mining for Sports Analytics. The goal of the challenge was to predict the receiver of a pass given the locations of the sender and all other players. From each soccer situation, we extract spatial relations between the players and a few key locations on the field, which are then hierarchically aggregated within a neural architecture designed to extract possibly complex gameplay patterns stemming from these simple relations. The use of convolutions then allows the model to efficiently capture the various regularities that are inherent to the game. In the experiments, the method shows promising performance, reaching a mean reciprocal rank of 0.48.

1 Introduction
Predictive sports analytics is a modern discipline in which various statistical models are employed to assess different aspects of a game. In this paper, we focus on the game of soccer and on predicting individual passes between players during a match, given a static snapshot of each pass situation, i.e. an indication of ball possession and the locations of all the players. This setting was given by the prediction challenge organized for the 5th Workshop on Machine Learning and Data Mining for Sports Analytics, held in conjunction with ECML.

Since each learning example is, in this case, just an independent, static view of the game, we approach the problem from a simple geometrical perspective. We take each situation, determined merely by the absolute locations of the players, enrich these with a few soccer-specific contextual locations, and turn the absolute positions into mutual, relative distances. This enables the model to generalize across different situations, reasoning about the mutual spatial patterns between the players rather than their positions on the field. These spatial patterns are represented with convolutional filters, capturing the inherent symmetries and geometrical regularities arising from the rules of the game. The patterns are then further aggregated with pooling and combined in a fully connected manner to help the model explore their relations. As opposed to some existing works that rely quite heavily on expert knowledge, we make only a few assumptions about the patterns and aim instead at the benefits of end-to-end learning.

1.1 Related Work

An inductive logic programming model [7] trained on qualitative spatial representations [2] was previously used to tackle the task of predicting soccer passes. A similar approach was used for discovering offensive patterns [6]. Spatio-temporal data were further utilized to infer teams’ play styles [1,4] and to examine the likelihood of scoring a goal from a shot [3]. Another approach leveraged a physics-based model of soccer ball motion to predict the receiver of a pass [5].

1.2 Dataset

The dataset consisted of 12 124 soccer passes, of which 10 045 were successful (meaning that the sender and the receiver of the pass were from the same team). We decided to focus on predicting only the successful passes, as was done previously [7].

Unlike in the previous work [7], the dataset contained solely a snapshot of the game in the form of the coordinates of each of the 22 players on the field, making the situations independent of each other. This makes the prediction task much harder, because we have no information about the players’ momentum, orientation in space, etc. Nor were we able to identify the same team or player across multiple situations. The dataset also contained the timestamps of when the pass was sent and received. Due to the predictive nature of the task, we decided to omit the timestamp of the pass receipt, since it is obviously not available when making the actual prediction.

In 367 cases, only 21 players’ coordinates were present, presumably after one player had been sent off. To deal with the missing coordinates, we imputed surrogate large numbers as the coordinates, so that this position became meaningless for the predictions.
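
The handling of missing coordinates can be illustrated with a minimal sketch. It is not the authors' code: the sentinel value `FAR_AWAY`, the use of `None` to mark a missing player, and the helper names are assumptions made only for this example.

```python
import math

# Hypothetical sentinel: any value far outside the pitch dimensions.
FAR_AWAY = 1e6


def impute_missing(players, sentinel=FAR_AWAY):
    """Replace missing (None) player coordinates with a far-away surrogate point."""
    return [(sentinel, sentinel) if p is None else p for p in players]


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Toy snapshot with three players; the third one is missing (e.g. sent off).
snapshot = [(10.0, 5.0), (12.0, 7.0), None]
filled = impute_missing(snapshot)

# The surrogate player is far from everyone, so it is never the closest player.
nearest = min(range(1, len(filled)), key=lambda i: dist(filled[0], filled[i]))
```

With such a surrogate, every distance involving the missing player is very large, so any ordering of players by distance places it last.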

2 Predictive model

The proposed model is a neural architecture consisting of convolutional layers with diverse filters, max-pooling, and fully connected layers with a softmax output. Each of the convolutional filters encodes a certain feature-set transformation designed to extract a particular context from the game snapshots. Intuitively, these may collect information on how occupied the potential receiver is, how pressured the sender of the pass is, or where the receiver is positioned on the field w.r.t. his teammates. The max-pooling layer helps the model become agnostic to the particular positioning and ordering of the players in order to generalize better, based on the intuition that typically only the few closest players are relevant to each pass. The softmax output then naturally encodes the mutually exclusive outcomes of each situation, since only one pass at a time is ever carried out.

Table 1. Enriching spatial snapshots with contextual locations.

Feature                        Description
f1:  dist(p_s, p_r)            Distance between sender p_s and potential receiver p_r.
f2:  dist(p_s, goal)           Distance of p_s to the center of the opponent’s goal.
f3:  dist(p_s, side)           Perpendicular distance of p_s to the closest sideline.
f4:  dist(p_s, center_t1)      Distance of p_s to the center of gravity of his teammates.
f5:  dist(p_s, center_t2)      Distance of p_s to the center of gravity of his opponents.
f6:  dist(p_r, goal)           Distance of p_r to the center of the opponent’s goal.
f7:  dist(p_r, side)           Perpendicular distance of p_r to the closest sideline.
f8:  dist(p_r, center_t1)      Distance of p_r to the center of gravity of his teammates.
f9:  dist(p_r, center_t2)      Distance of p_r to the center of gravity of his opponents.
f10: dist(p_r, p_i)            Distance of p_r to a teammate p_i.
f11: dist(p_s, p_i)            Distance of p_s to a teammate p_i.
f12: dist_opp(p_r, p_i)        Distance of p_r to an opponent p_i.
f13: dist_opp(p_s, p_i)        Distance of p_s to an opponent p_i.

2.1 Knowledge Representation

The raw data come in a simple table format where, for each pass situation during the course of each game, we are given the x-y coordinates of the 22 players on the field together with an indicator of the sender of the pass p_s, i.e. a tuple of

$$(\mathit{timestamp},\ p_{1x}, p_{1y}, \ldots, p_{11x}, p_{11y},\ p_{12x}, p_{12y}, \ldots, p_{22x}, p_{22y},\ p_s)$$

For the purpose of pass prediction, we look at each snapshot from the perspective of potential successful passes between the ball-possessing player p_s and all his teammates (potential receivers) p_r, i.e. for each situation we have 10 pairs of players

$$(p_s, p_r), \text{ such that } p_r \in \begin{cases} \{p_1, \ldots, p_{11}\} \setminus \{p_s\}, & \text{if } s \in \{1, \ldots, 11\} \\ \{p_{12}, \ldots, p_{22}\} \setminus \{p_s\}, & \text{if } s \in \{12, \ldots, 22\} \end{cases}$$

As a preprocessing step, we enrich these pairs with several key static and dynamic locations on the field, relative to which we measure distances as described in Table 1. These enriched pairs, representing the potential passes, then constitute our learning examples.
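
The enumeration of candidate pairs and the static features f1 to f9 of Table 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' preprocessing: the coordinate system (sidelines at y = ±`half_width`), the position of the opponent's goal, the choice to compute each center of gravity over all 11 players of a team, and all function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def centroid(points):
    """Center of gravity of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def candidate_receivers(s):
    """1-based indices of the sender's 10 teammates (the case split above)."""
    team = range(1, 12) if s <= 11 else range(12, 23)
    return [r for r in team if r != s]


def static_features(players, s, r, goal, half_width):
    """Return [f1, ..., f9] of Table 1 for the pair (p_s, p_r)."""
    own = [s] + candidate_receivers(s)
    c1 = centroid([players[i] for i in own])                      # sender's team
    c2 = centroid([players[i] for i in players if i not in own])  # opponents
    ps, pr = players[s], players[r]

    def side(p):
        # Assumed coordinate system: sidelines at y = -half_width and y = +half_width.
        return half_width - abs(p[1])

    return [dist(ps, pr),
            dist(ps, goal), side(ps), dist(ps, c1), dist(ps, c2),
            dist(pr, goal), side(pr), dist(pr, c1), dist(pr, c2)]


# Toy snapshot: team 1 stands on the line y = 0, team 2 on the line y = 10.
players = {i: (float(i), 0.0 if i <= 11 else 10.0) for i in range(1, 23)}
features = static_features(players, s=3, r=5, goal=(52.5, 0.0), half_width=34.0)
```

Calling `static_features` once for each index returned by `candidate_receivers(s)` yields the 10 learning examples of one situation.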

2.2 Neural Architecture

An overview of the neural architecture is displayed in Figure 1. At the input to the model, the spatial relations described in Section 2.1 are aggregated into sets to form feature maps for the convolutional filters. In particular, for each potential pass (p_s, p_r), we arrange the relations into different filters expressing different viewpoints on the pass, such as the cover of the receiving player, or the pressure on the sender and the alternatives available to him, as detailed in Table 2.

Table 2. Conformation of spatial relations into convolutional filters.

Filter        Features          Context
alternative   (f1, f10, f11)    The alternatives available to the sender.
cover         (f1, f12, f13)    The occupation of the receiver by his opponents.
pressure      (f1, f12, f13)    The pressure on the sender from his opponents.

Each of these feature sets, or filters, may be instantiated multiple times w.r.t. the variable p_i iterating over the opponents of the sender (cover, pressure) or the teammates of the sender (alternative). Within the context of each filter, we order the remaining players w.r.t. f10, f12 and f13 for alternative, cover and pressure, respectively. This way we enforce an ordering on these instantiations, resulting in 1D feature maps upon which the filters operate, as depicted in Figure 1. Thus, although the cover and pressure filters operate on the same feature sets, they result in different feature maps. Also, since all these filters principally share the common static context of where within the field the current situation occurs, described by the features f1 ... f9, we exclude these from the individual filters and merge them later in the model to prevent redundancy in the feature maps.
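
A minimal sketch of how such ordered 1D feature maps could be built is given below. It follows the feature triples of Table 2, but the function name `feature_map`, the `order_by` argument and the toy coordinates are assumptions for illustration only.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def feature_map(ps, pr, others, order_by):
    """Build one ordered 1D feature map for the pair (p_s, p_r).

    Every player p_i in `others` gives one instantiation, the triple
    (f1, dist(p_r, p_i), dist(p_s, p_i)). The instantiations are sorted by
    the distance to the receiver (order_by="receiver") or to the sender
    (order_by="sender").
    """
    f1 = dist(ps, pr)
    rows = [(f1, dist(pr, pi), dist(ps, pi)) for pi in others]
    key = 1 if order_by == "receiver" else 2
    return sorted(rows, key=lambda row: row[key])


ps, pr = (0.0, 0.0), (10.0, 0.0)
opponents = [(9.0, 0.0), (1.0, 0.0), (5.0, 5.0)]

cover = feature_map(ps, pr, opponents, order_by="receiver")   # ordered by f12
pressure = feature_map(ps, pr, opponents, order_by="sender")  # ordered by f13
```

In this toy case, `cover` and `pressure` contain the same triples in opposite order, which shows how identical feature sets can lead to different feature maps. The alternative map would be built the same way over the sender's teammates, ordered by f10.
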
The resulting values from these filters are then aggregated via max-pooling. While multiple pools could be connected with a standard overlay to capture the different sub-regions of the distance space, we set a global pool over all instantiations of each single filter, following the intuition that only the closest players are typically relevant, and suppressing the potential noise from the rest. To alleviate this somewhat radical assumption, we also employ wider filters that capture groups of two or three of the remaining players rather than individuals. This way we may also reason about more complex spatial patterns between the relevant players. These filters of size 3 × 2 and 3 × 3 further distinguish the use of cover and pressure. Finally, the pooling helps to mitigate the potentially harmful effect of the, to a certain degree ad hoc, overall ordering.
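
The combination of a sliding filter with a global max-pool can be sketched in a few lines. This is a simplified illustration: the weights are hand-picked rather than learned, no activation function is applied (the text does not specify one), and the function name `conv_global_max` is hypothetical.

```python
def conv_global_max(feature_map, weights, bias=0.0):
    """Slide a 3 x w filter along the ordered instantiations and max-pool globally.

    feature_map: list of 3-tuples, one per instantiation (already ordered).
    weights:     list of w 3-tuples, i.e. a filter of size 3 x w.
    Returns the single largest filter response over all positions.
    """
    w = len(weights)
    responses = []
    for start in range(len(feature_map) - w + 1):
        window = feature_map[start:start + w]
        z = bias + sum(wt * x
                       for w_col, x_col in zip(weights, window)
                       for wt, x in zip(w_col, x_col))
        responses.append(z)
    return max(responses)


# Ordered cover map: triples (f1, f12, f13), the closest opponent comes first.
cover = [(10, 1, 9), (10, 7, 6), (10, 9, 1)]

# A 3 x 1 filter that responds to small f12, i.e. to a nearby opponent.
single = conv_global_max(cover, weights=[(0, -1, 0)])

# A 3 x 2 filter that looks at two neighbouring opponents at once.
couple = conv_global_max(cover, weights=[(0, -1, 0), (0, -1, 0)])
```

Because only the maximum response survives the pooling, the output here is determined by the closest opponent (or the closest two for the wider filter), regardless of how many distant players follow in the ordering.
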
The patterns extracted with the help of the filters and selected by the pools form the input to the fully connected layers (Figure 1). The purpose of these layers is to combine all the different patterns into a final value expressing the potential of each individual pass (p_s, p_r). Intuitively, these layers express the decision-making logic that the sender p_s normally goes through, incorporating the relational contexts (filters) of the receiving player p_r w.r.t. his own, while learning how to weight the importance of the individual patterns in each combination.

Finally, with the softmax output (Figure 1), we enable the model to reason jointly over the whole set of 10 possible passes (p_s, p_r). As opposed to separating each pass situation into 10 independent learning examples and normalizing over these as a postprocessing step, with the joint output the gradient directly steers the model towards exclusive predictions as part of the learning process.
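
The effect of the joint output can be made concrete with a small sketch of a softmax over the candidate scores trained with the cross-entropy loss. The scores below are made up, and the function names are assumptions; the gradient formula (probabilities minus the one-hot target) is the standard result for this combination.

```python
import math


def softmax(scores):
    """Numerically stable softmax over the scores of all candidate passes."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-probability assigned to the actual receiver."""
    return -math.log(probs[target])


def grad_wrt_scores(probs, target):
    """Gradient of the cross-entropy w.r.t. the scores: probs - onehot(target)."""
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


# Made-up scores of the 10 candidate passes; candidate 2 is the actual receiver.
scores = [0.2, 1.5, 0.9, -0.3, 0.0, 0.4, -1.0, 0.1, 0.7, -0.5]
probs = softmax(scores)
loss = cross_entropy(probs, target=2)
grad = grad_wrt_scores(probs, target=2)
```

The gradient is negative only for the actual receiver and positive for every other candidate, so a single update raises the score of the correct pass and lowers all competing ones at once.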

[Figure 1: architecture diagram with the stages Convolution, Pooling, Merge, and Fully connected + softmax. Sizes shown along each branch: alternative 3 × 10 × 10, 3 × 10 × 10, 3 × 1 × 10; cover 3 × 11 × 10, 3 × 11 × 10, 3 × 1 × 10; pressure 3 × 11 × 10, 3 × 11 × 10, 3 × 1 × 10; f1-f9 9 × 1 × 10; merged 18 × 1 × 10; output 1 × 10.]

Fig. 1. Architecture of the neural model. Four feature maps of size #features × #instantiations × #possibilities are at the input. Filters of size 3 × 1, 3 × 2 and 3 × 3 are applied to each feature map. The outputs of the convolution are reduced by max pooling and merged with the f1-f9 features providing their static context. Finally, 2 dense layers with 3 and 1 neurons, respectively, are applied to each possibility. For clarity, only 3 out of 10 possibilities (depth dimension) are displayed.
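
The final stage of Figure 1 can be sketched as two small dense layers shared across the candidate passes. The layer sizes (18 inputs, 3 hidden neurons, 1 output) come from the caption; the tanh activation, the toy weights and the function names are assumptions, since the text does not state them.

```python
import math


def dense(x, weights, biases, activation=None):
    """One fully connected layer: one row of weights per output neuron."""
    out = [b + sum(w * xi for w, xi in zip(row, x))
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def score_candidates(merged, w1, b1, w2, b2):
    """Apply the same 18 -> 3 -> 1 layers to every candidate pass."""
    return [dense(dense(x, w1, b1, math.tanh), w2, b2)[0] for x in merged]


# Toy input: 10 candidate passes, each described by an 18-dimensional merged vector.
merged = [[0.1 * c] + [0.0] * 17 for c in range(10)]
w1 = [[1.0] + [0.0] * 17 for _ in range(3)]  # 3 hidden neurons
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 1.0, 1.0]]                       # 1 output neuron
b2 = [0.0]

scores = score_candidates(merged, w1, b1, w2, b2)
```

The result is one score per candidate pass, i.e. the 1 × 10 vector that is then passed to the softmax.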

3 Experiments

We performed 10-fold cross-validation and evaluated the model w.r.t. the mean reciprocal rank (MRR) and how often the actual receiver of the pass was among the three most likely predictions. We compared our results with [7], where the authors made use of both static and dynamic features derived from the flow of the game, which was unavailable to us. The fair comparison is therefore with the Static model from that work. Nevertheless, our model outperformed both the Static model and the Combined model, which combined static and dynamic features (Table 3).
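
The two evaluation measures can be computed as in the following sketch. The predictions are made up, the function names are hypothetical, and ties are resolved in favour of the actual receiver, which is an assumption of this example.

```python
def rank_of_target(probs, target):
    """1-based rank of the actual receiver among all candidates (ties count in its favour)."""
    return 1 + sum(1 for p in probs if p > probs[target])


def evaluate(predictions, targets, k=3):
    """Return (mean reciprocal rank, top-k accuracy) over a set of situations."""
    ranks = [rank_of_target(p, t) for p, t in zip(predictions, targets)]
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    top_k = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, top_k


# Two toy situations with four candidates each; the actual receiver has index 0.
predictions = [[0.5, 0.3, 0.2, 0.0],
               [0.1, 0.2, 0.3, 0.4]]
targets = [0, 0]

mrr, top3 = evaluate(predictions, targets, k=3)
```

In the first situation the receiver is ranked first, in the second one last (fourth), so the MRR is (1 + 1/4) / 2 = 0.625 and the top-3 accuracy is 0.5.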

3.1 Human-level Performance

We measured human-level performance to assess the inherent difficulty of the task. We were particularly curious about the effect of the missing dynamic context of the game that humans are used to from standard visual recordings, which provide much more information than mere static snapshots. We measured and averaged the predictive performance of three soccer enthusiasts on a sample of 200 randomly selected situations. To put the data into a more familiar perspective, we created a simple interactive visualization¹ that may be utilized for further measurements. The task proved to be difficult even for humans. While the top-1 accuracies of the model and humans were close, the top-3 accuracies and MRR showed that humans clearly rank the alternatives better.

Table 3. Comparison of the model’s (CNN) performance with previous work [7] and human-level performance.

Model      MRR    top-1 (%)  top-2 (%)  top-3 (%)
Static     0.39   25.49      36.33      41.22
Combined   0.42   27.87      41.59      46.70
CNN        0.48   29.87      45.63      55.91
Human      0.56   34.00      56.25      70.25

3.2 Discussion

We analyzed the predictions made by the model to obtain further insights. The main weakness of the model was that it usually considered only a few options as viable, even when the alternatives were very similar. This could be due to the use of softmax in combination with the cross-entropy loss when training the network, instead of some kind of ranking loss. The network was strong at spotting uncovered teammates, sometimes even overvaluing their positions. Generally, the network preferred passes to the sidelines, even when we could guess that the ball had most likely just come from those positions. Human intuition thus seems superior at capturing this underlying “flow” of the game.

A visualization of an example situation (Figure 2) illustrates the difficulty of the task. Without information about the sender’s orientation on the field, there are many viable alternatives. The model marked the pass to the sideline as the most probable, as this is a common pattern: a midfielder developing play from the center of the field with a pass to the sideline. The actual pass was the model’s second guess. From a human perspective, far too many options are assigned near-zero probability. In particular, the passes to players 9 and 2 should have been given higher priority.

The separation of the static context features f1 ... f9 from the convolutional filters, as depicted in Figure 1, might suggest that the model could be split into two. While these complementary feature sets provide context to each other and were thus meant to work together, we also measured their separate performance. In separate experiments, the convolutional features proved to be more valuable (MRR 0.46) than the static context features (MRR 0.42).

¹ https://github.com/Hudler/pass-viz

Fig. 2. Example model prediction. Possible passlines are depicted by yellow lines, with
the actual pass marked by red. The percentages near the passlines show the predicted
probabilities.

4 Conclusion

We detailed our model for soccer pass prediction given static spatial snapshots
of the game. The model was a neural architecture based on a set of convolutional
filters, carefully designed to extract different relational contexts from each game
situation, i.e. mutual positions of players on the field. We argued how such an
architecture may learn possibly complex relational patterns via aggregation of
simple spatial relations. Finally, on a large dataset of captured soccer passes, we
showed that promising results can be achieved with such an approach.

Acknowledgements. The authors acknowledge support by the “Deep Relational Learning” project no. 17-26999S granted by the Czech Science Foundation. Computational resources were provided by CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures”.

References
1. Joel Brooks, Matthew Kerr, and John Guttag. Using machine learning to draw inferences from pass location data in soccer. Statistical Analysis and Data Mining: The ASA Data Science Journal, 9(5):338–349, 2016.
2. Juan Chen, Anthony G Cohn, Dayou Liu, Shengsheng Wang, Jihong Ouyang, and Qiangyuan Yu. A survey of qualitative spatial representations. The Knowledge Engineering Review, 30(1):106–136, 2015.
3. Patrick Lucey, Alina Bialkowski, Mathew Monfort, Peter Carr, and Iain Matthews. Quality vs quantity: Improved shot prediction in soccer using strategic features from spatiotemporal data. In Proceedings of the 8th Annual MIT Sloan Sports Analytics Conference, pages 1–9, 2014.
4. Patrick Lucey, Dean Oliver, Peter Carr, Joe Roth, and Iain Matthews. Assessing team strategy using spatiotemporal data. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1366–1374. ACM, 2013.
5. William Spearman, Austin Basye, Greg Dick, Ryan Hotovy, and Paul Pop. Physics-based modeling of pass probabilities in soccer. In Proceedings of the 11th MIT Sloan Sports Analytics Conference, 2017.
6. Jan Van Haaren, Vladimir Dzyuba, Siebe Hannosset, and Jesse Davis. Automatically discovering offensive patterns in soccer match data. In International Symposium on Intelligent Data Analysis, pages 286–297. Springer, 2015.
7. Vincent Vercruyssen, Luc De Raedt, and Jesse Davis. Qualitative spatial reasoning for soccer pass prediction. In Machine Learning and Data Mining for Sports Analytics ECML/PKDD 2016 workshop, 2016.
