Article
A Query Language for Exploratory Analysis of Video-Based
Tracking Data in Padel Matches
Mohammadreza Javadiha 1,† , Carlos Andujar 1, *,† and Enrique Lacasa 2
Abstract: Recent advances in sensor technologies, in particular video-based human detection, object
tracking and pose estimation, have opened new possibilities for the automatic or semi-automatic
per-frame annotation of sport videos. In the case of racket sports such as tennis and padel, state-of-
the-art deep learning methods allow the robust detection and tracking of the players from a single
video, which can be combined with ball tracking and shot recognition techniques to obtain a precise
description of the play state at every frame. These data, which might include the court-space position
of the players, their speeds, accelerations, shots and ball trajectories, can be exported in tabular
format for further analysis. Unfortunately, the limitations of traditional table-based methods for
analyzing such sport data are twofold. On the one hand, these methods cannot represent complex
spatio-temporal queries in a compact, readable way, usable by sport analysts. On the other hand,
traditional data visualization tools often fail to convey all the information available in the video
(such as the precise body motion before, during and after the execution of a shot) and resulting plots
only show a small portion of the available data. In this paper we address these two limitations by
focusing on the analysis of video-based tracking data of padel matches. In particular, we propose
a domain-specific query language to facilitate coaches and sport analysts to write queries in a very
compact form. Additionally, we enrich the data visualization plots by linking each data item to a
specific segment of the video so that analysts have full access to all the details related to the query. We
demonstrate the flexibility of our system by collecting and converting into readable queries multiple tips and hypotheses on padel strategies extracted from the literature.

Citation: Javadiha, M.; Andujar, C.; Lacasa, E. A Query Language for Exploratory Analysis of Video-Based Tracking Data in Padel Matches. Sensors 2023, 23, 441. https://doi.org/10.3390/s23010441

Keywords: sports science; racket sports; video-based analysis; player tracking; sport analytics; data analysis; data visualization
player has an enormous and endless margin for evolution. This explains why the analysis of padel matches is so attractive, and justifies the need for tools that allow researchers and coaches to access, understand and study the data.
Figure 1. A padel court is substantially smaller than a tennis court. The court is enclosed by plexiglass
walls and a metal mesh.
examples of how raw video-based data on points, shots, and frames can be structured into
tables. The video-based tracking nature of the data is reflected by the fact that sometimes
teams and players are identified by their position on the video (e.g., top left or TL player)
rather than by name.
Table 1. Example of raw tabular data for padel points. The time units for the first three columns are
frames. The Winner team is identified by the top/bottom position on the video.
Start (f) End (f) Duration (f) Duration (s) Winner Points A Points B Top Left Top Right Bottom Left Bottom Right
13,606 13,820 214 7.1 B 0 15 L Sainz G Triay A Sánchez A Salazar
14,093 14,785 692 23.1 T 15 15 L Sainz G Triay A Sánchez A Salazar
15,332 16,204 872 29.1 T 30 15 L Sainz G Triay A Sánchez A Salazar
16,932 17,004 72 2.4 B 30 30 L Sainz G Triay A Sánchez A Salazar
17,378 17,661 283 9.4 B 30 40 L Sainz G Triay A Sánchez A Salazar
Table 2. Example of raw tabular data for padel shots. The shot type uses the classification proposed
in [5]. The Lob column contains a Boolean that indicates whether the shot is a lob.
Table 3. Example of raw tabular data for the frames of a video. Players are referred to with their
location in the video at serve time (e.g., TL means top-left player). Positions are given in image space
(i, j are in pixels) and court-space (x, y are in meters).
Frame TL i TL j TR i TR j BL i BL j BR i BR j TL x TL y TR x TR y BL x BL y BR x BR y
13,614 478 202 785 266 401 554 911 553 1.89 18.53 7.49 13.60 2.50 1.60 7.87 1.66
13,615 477 202 785 266 401 555 912 554 1.87 18.52 7.49 13.60 2.50 1.59 7.88 1.62
13,616 477 203 785 266 400 553 914 555 1.89 18.43 7.49 13.60 2.49 1.63 7.89 1.60
13,617 479 204 785 266 399 550 915 556 1.94 18.35 7.49 13.59 2.47 1.70 7.90 1.57
13,618 480 206 785 266 398 549 918 556 1.96 18.21 7.49 13.59 2.46 1.74 7.93 1.58
Unfortunately, the advances in extracting player/ball tracking data from sports videos are not on par with the development of interactive data analysis and exploration tools that enable non-IT professionals to perform complex queries on such datasets. The focus of this paper, though, is not on obtaining the data, but rather on providing a high-level language that facilitates their analysis.
Traditional data analysis approaches can be applied to these data, but they do not allow non-experts to retrieve or analyze complex spatio-temporal relationships in a compact, readable way. In the context of video-based padel data, we have identified two major limitations in current data analysis approaches, which are discussed below.
1.4. Retrieving Data about Specific In-Game Situations from Tabular Data
Sports analysts, coaches, and professional players often make strategy recommenda-
tions (player positioning, synchronized actions, best technical actions for a given scenario)
that might or might not be sufficiently supported by empirical evidence, or that might
apply only to certain circumstances (e.g., they may apply to professional players but not to
amateurs). Some samples of typical recommendations for padel are:
E1 “Players should try to win the net zone as much as they can; it is easier to score a point
from the net zone than from the backcourt zone”.
E2 “An effective way to win the net zone is to play a lob”.
E3 “When a player is about to serve, his/her partner should be waiting in the net zone,
at about 2 m from the net”.
The availability of tracking data from padel matches opens great opportunities to
provide empirical support to such recommendations, to refute them, to quantify their
impact, or to analyze under which circumstances they apply (men’s matches vs. women’s
matches, professional vs. amateur, adult vs. child players, right-handed vs. left-handed).
Similarly, coaches and sports analysts might be interested in comparing the decision-making
processes of a player with those of elite players.
Following the example sentences above, there are many options to exploit the data.
For E1, we could estimate the conditional probabilities P(winning the point | net zone) and P(winning the point | backcourt zone) by computing the relative frequencies of winning points under the two conditions, from a sufficiently large and representative set of matches. If matches are conveniently labeled, we could also compute and compare these probabilities for different match categories (e.g., indoor vs. outdoor).
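As a rough sketch of this computation with conventional tools, the relative frequencies can be obtained with a pandas groupby (the table and column names below are hypothetical, not the paper's actual schema):

```python
import pandas as pd

# Hypothetical point-level table: one row per point, with the team's zone
# and a Boolean indicating whether that team won the point.
points = pd.DataFrame({
    "zone": ["net", "net", "net", "backcourt", "backcourt", "net"],
    "won":  [True,  True,  False, False,       True,        True],
})

# Relative frequency of winning, conditioned on the zone:
# an estimate of P(winning the point | zone).
p_win = points.groupby("zone")["won"].mean()
print(p_win["net"])        # 3 of 4 net-zone points won -> 0.75
print(p_win["backcourt"])  # 1 of 2 backcourt points won -> 0.5
```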
Regarding E2, we could follow a similar approach and estimate P(winning the net | lob), that is, the probability that a team wins the net after playing a lob. If large datasets on elite players are available, we could also measure the relative frequency of lob shots compared to other types of shots, under the assumption that elite players take the best technical action most of the time.
Concerning E3, we could plot the court-space position of the servers' partners, and analyze, e.g., whether the distance dn to the net and the distance dw to the lateral wall are normally distributed. If so, we could compute a simple Gaussian model for these variables, e.g., dn ∼ N(µn, σn²), where the parameters µn and σn² can be estimated from the data.
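The parameter estimation itself is a two-liner; a minimal NumPy sketch with made-up distance samples:

```python
import numpy as np

# Hypothetical sample of net distances (m) of the server's partner.
dn = np.array([1.8, 2.1, 2.0, 1.9, 2.2, 2.0])

# Maximum-likelihood estimates of the Gaussian parameters mu_n, sigma2_n.
mu_n = dn.mean()
sigma2_n = dn.var()  # ML estimate; pass ddof=1 for the unbiased variant
print(mu_n, sigma2_n)
```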
These types of analyses are certainly possible using tracking data in tabular form.
However, to the best of our knowledge, no specific languages/tools have been reported to
transform the raw tabular data from a collection of matches into the data that are relevant
to the problem at hand. In other words, we are not aware of any high-level domain-specific
language facilitating the filtering and retrieval of such padel data. The same lack of tools
also applies to tennis and other racket sports. As a consequence, such analyses must be
based on conventional tools, for example through manual counting, spreadsheets (filters,
transformations, formulas), or computer programs operating on the tabular data [38].
Referring to the previous examples E1–E3, let us consider what queries could retrieve
data to support, refute or qualify them. The following queries (in plain English form) could
be useful for this task:
Q1 Retrieve all points, distinguishing by the winning team and the zone of the hit-
ter player.
Q2 Retrieve all lob shots with an additional column indicating whether the players could
win the net zone or not.
Q3 Retrieve all frames immediately after a serve, along with the court-space position of
the server’s partner.
Although all these queries can be implemented, for example, in a spreadsheet, de-
pending on the query complexity these tasks might require a considerable effort. Let us
suppose that, starting from a table similar to Table 3, we wish to retrieve the position of
the server’s partner in the 2 s immediately after each serve. We start with a spreadsheet
example, as this is a tool commonly used by sports analysts. First, we should identify,
for each frame, which of the four players is the server’s partner. Since this information is
missing on the Frames table, we could add a column “Server partner” that, given a frame
number, retrieves the game it belongs to, and the server’s partner for that game. A vertical
lookup function (vlookup in most spreadsheets) could help with this task. Then, we should
remove all frames outside the 2-s window after a serve. Again, this would require adding
more columns (with non-trivial lookup functions) to compute the time offset between each frame and the serve. Additional functions would be required to select the server's partner position out of the four players. Finally, we could sort the data by time offset, remove the rows with an offset above the 2-s threshold, select the (new) column with the net distance, and plot/analyze the results.
The spreadsheet example above already exposes several drawbacks of this approach. First, it requires non-trivial transformations of the data: adding new columns, using lookup functions (even computing column offsets for the result is error-prone), and sorting the
data (or setting up filters/dynamic tables). Second, this approach lacks scalability. When
new data come in, many of the steps above have to be repeated for each match. Third, it
lacks flexibility: if our definition of “net zone” changes (e.g., it is moved 50 cm away), this
would require extensive changes in the spreadsheets. Finally, it lacks readability, as the
computation and filter formulas are spread over the cells.
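By contrast, the same retrieval takes only a few lines in pandas. The sketch below assumes a 25 fps video and uses hypothetical minimal tables; the real Points and Frames tables carry many more columns:

```python
import pandas as pd

FPS = 25  # assumed frame rate

# One row per point (serve/start frame) and one row per video frame.
points = pd.DataFrame({"start_frame": [100, 300], "point_id": [1, 2]})
frames = pd.DataFrame({"frame": [100, 130, 160, 300, 360, 420]})

# An "as-of" join attaches to each frame the latest serve preceding it,
# replacing the spreadsheet's vlookup-style machinery.
joined = pd.merge_asof(frames, points, left_on="frame", right_on="start_frame")

# Keep only frames within the 2-s window (2 * FPS frames) after the serve.
window = joined[joined["frame"] - joined["start_frame"] < 2 * FPS]
print(window["frame"].tolist())  # -> [100, 130, 300]
```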
It can be argued that, as a preprocess, we could enrich the tabular data to simplify
these kinds of analyses. As we shall see (Section 5.3), padel concepts are so diverse that this
approach would only benefit the simplest queries. Notice that the query example above (based on E3 and Q3) is relatively simple. Queries involving sequences of events (e.g., drive-lob-volley) make the required transformations considerably harder.
High-level programming languages provide convenient data structures and methods
to analyze tabular data. Python has a relatively smooth learning curve compared to other
programming languages, and it is extensively used for data analysis. Pandas is a well-
known Python package that provides a DataFrame class, which is essentially a convenient
representation of tabular data. Similar data structures and methods are available in other
languages (such as R, Octave, and MATLAB). These languages provide a convenient way
to transform and query tabular data, but the resulting code is often too complex and
unreadable to be usable by coaches and sports professionals. Some queries do admit a
very simple expression. For example, retrieving all serves in Python using pandas can be
as simple as: serves = shots[shots['Shot type']=='Serve'], where shots is the input
DataFrame (Table 2), and serves is the output DataFrame. Unfortunately, other types
of queries are harder to write (Section 8.4). Many queries require combining data from multiple tables, which ultimately requires using either lookup functions or, in the case of DataFrames, different types of joins [39] (inner joins, outer joins, left joins, right joins). Join operators are a concept from database theory and relational algebra that requires data-retrieval skills. Moreover, even for analysts who master join operators, queries involving sequences of events (e.g., “lobs followed by a defensive smash and then a volley in the net zone”) require additional operators that are usually too complex for people with no background in relational algebra.
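To illustrate the point, detecting even a short sequence of events in plain pandas already requires aligning each row with its successors via shift, plus extra care not to cross rally boundaries (a sketch with a hypothetical shot table):

```python
import pandas as pd

# Hypothetical shot table, ordered by time.
shots = pd.DataFrame({
    "shot_type": ["serve", "lob", "smash", "volley", "drive", "lob"],
})

# Align each shot with the next one and the one after that.
t0 = shots["shot_type"]
t1 = t0.shift(-1)
t2 = t0.shift(-2)

# Rows that start a lob -> smash -> volley sequence (a real query would
# also have to check that all three shots belong to the same point).
mask = (t0 == "lob") & (t1 == "smash") & (t2 == "volley")
print(shots.index[mask].tolist())  # -> [1]
```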
1.6. Contributions
The main contribution of this paper is the definition (and a free and open-source
prototype implementation) of a domain-specific query language to define queries on
video-based data from padel matches. Domain-specific languages (DSLs) are tailored
to a specific application domain and thus provide important advantages over general-
purpose languages (GPLs) in such domain [40]. In particular, we propose a domain-specific
language embedded in a GPL (more precisely, a Python API). Our language offers greater expressive power and ease of use, enabling analysts to write queries in a simple, compact, flexible, and readable way.
Furthermore, and although not the main focus of the paper, we propose a collection
of interactive visualization tools to visually explore the output of such queries. A major
novelty is that data items are seamlessly connected to video segments so that a precise
analysis of specific technical actions is integrated into the exploratory analysis process.
For evaluating the power and expressiveness of the query language, we have collected
multiple statements about padel strategies (tips, comments, pieces of advice, hypotheses...)
from different published sources (books, papers). We discuss how to design queries to
support, refute or analyze these hypotheses, and show how these queries can be written
using our query language. The Supplementary Material shows a demonstration of our
query system running on a Jupyter notebook.
Regarding spatial data, computer vision and other sensing techniques allow the automatic tracking of the players' positions during a match. Therefore, the video-recorded match
also contains the collection of player states, one for each frame, describing the position of the
player within the court. These positions allow the computation of new parameters, such as
speed, acceleration, distances to different court elements, and distance to the partner or
opponent players.
We can further analyze each of these concepts. For the sake of brevity, we only focus
on shots, since they represent arguably the most relevant technical actions in padel. Figure 4
shows the FD of a shot in padel. A shot starts with some player (hitter) hitting the ball. The
shot belongs to a specific match, set, game, and point, as shown in Figure 3. Shots have a
start frame (when the player hits the ball) and an end frame (corresponding to the next shot,
or end of the rally). Finally, shots have a collection of properties (such as the shot code)
that, due to their complexity, will be discussed later in Section 7.
The first block of Table 4 refers to concepts related to video-based padel matches (Figure 2).
All these concepts are translated into Python classes or class properties. The second and third
blocks refer to queries about matches (Figure 5). Essentially, a query definition corresponds
to the definition of a Python function, and a query execution translates into a function call.
We use Python decorators to simplify queries as much as possible. A decorator is a simple
mechanism of the Python language for defining higher-order functions, that is, functions that
take another function and extend its behavior. This mechanism is convenient because it moves
a large part of the boilerplate code from the query definition to the internal implementation of
the API.
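As a rough illustration of the mechanism (not the paper's actual implementation), a decorator such as @shot_query could wrap a user-written predicate into a function that filters a match's shots:

```python
import functools

def shot_query(predicate):
    """Toy version of a query decorator: turns a Boolean predicate over
    shots into a function that filters a collection of shots. The real
    API additionally handles attribute lists, scopes, and DataFrame
    construction."""
    @functools.wraps(predicate)
    def run(match_shots):
        return [s for s in match_shots if predicate(s)]
    return run

@shot_query
def serves(shot):
    return shot["type"] == "serve"

shots = [{"type": "serve"}, {"type": "lob"}, {"type": "serve"}]
print(len(serves(shots)))  # -> 2
```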
4. Components of a Query
A query in our language requires four major components (see Figure 6), which are
described below.
Figure 6. Simple query using the proposed API, along with its main components.
Query type The output of all our queries is a table with the retrieved data (more precisely, a QueryResult object that holds a Pandas DataFrame). The query type refers to the different
types of queries according to the expected output (that is, the type of the rows in the output
DataFrame). Table 5 shows the query types supported by our language. From now on, we
will use the generic word “item” to collectively refer to the entities (points, shots, frames...)
that will form the rows of the output.
Query definition The query definition is a Boolean predicate that establishes which items
should be retrieved (e.g., all shots that match a specific shot type). In our language, this
takes the form of a decorated Python function that takes as input an object of the intended
class (e.g., a Shot object if defining a shot query) and returns a true/false value. These
predicates act as filters that discard items for which the predicate evaluates to false, and
collect items for which the predicates evaluate to true. The output table will contain as
many rows as items satisfy the predicate.
Attributes This refers to the collection of attributes we wish for every item in the output
table (that is, the output table will have one column for each attribute). For example, for
every smash, we might be interested only in the name of the player, or also in its court-
space position, or just the distance to the net. One of the key ingredients of our language is
that attributes are arbitrary Python expressions, with the only condition that they should
be able to evaluate correctly from the item. For example, a shot has attributes such as
hitter (the player that executed the shot), frame (the frame where the shot occurs), etc.
Attributes can be simple expressions such as shot.hitter or more complex ones such as
shot.next.hitter.distance_to_net < 3.
Scope Once we have specified the elements above, we might want to execute the query on
different collections of matches. The scope is the collection of matches that will be searched
for items fulfilling the predicate.
The separation of the different query components allows analysts to maximize reusabil-
ity. For example, the query definition in Figure 6 filters all the shots to select only volleys.
Later on, we can reuse this definition with different attribute collections, depending on
what data about each volley we wish to analyze. Some attributes that make sense for this
query definition include hitter.last_name, hitter.position.x, hitter.position.y,
and frame.frame_number, just to give a few examples.
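One plausible way to resolve such dotted attribute strings is to chain getattr calls (a sketch only; the actual API also accepts arbitrary Python expressions, which require eval-style handling):

```python
from functools import reduce

def resolve(item, dotted_path):
    """Follow a dotted attribute path such as 'hitter.position.x'."""
    return reduce(getattr, dotted_path.split("."), item)

# Tiny stand-in classes, for illustration only.
class Position:
    x, y = 3.2, 1.5

class Player:
    position = Position()

class Shot:
    hitter = Player()

print(resolve(Shot(), "hitter.position.x"))  # -> 3.2
```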
The relationships among the other classes in Figure 7 work the same way. These
operations can be chained arbitrarily to get access to the data we are interested in. This
is especially useful in query definitions (since the Boolean function gets as a parameter a
single object, for example, a Shot) and it is also useful for attributes:
shot.hitter # player that executed the shot
shot.prev.hitter # hitter of the previous shot
shot.next.next.hitter.distance_to_net # for two shots ahead, distance to net of the hitter
shot.point.winner # team that won the point the shot belongs to
shot.point.game.winner # team that won the game the shot belongs to
Although not shown in Figure 7 for simplicity, methods that allow traversing the
hierarchy upwards can skip intermediate classes. For example, the expression
frame.shot.point.game.set.match.gender
can be written simply as frame.match.gender. Although implementation details are
discussed in the Appendix A, we wish to note that the methods above are implemented as
Python properties (using the @property decorator). Therefore instead of writing
shot.next().next().frame()
we can omit the parentheses and write
shot.next.next.frame
which is a bit more compact. Since query definitions require read-only access to all these
objects, we consider that using properties instead of methods is safe.
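A minimal sketch of this pattern (a stand-in class, not the paper's code):

```python
class Shot:
    """Stand-in Shot class illustrating @property-based navigation."""
    def __init__(self, frame_number, nxt=None):
        self._frame = frame_number
        self._next = nxt

    @property
    def next(self):
        # Read-only access, so shot.next.next.frame chains without ().
        return self._next

    @property
    def frame(self):
        return self._frame

s3 = Shot(300)
s2 = Shot(200, s3)
s1 = Shot(100, s2)
print(s1.next.next.frame)  # -> 300
```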
Figure 7. Main temporal units (Python classes) in our API, and methods/properties connecting them.
In the example above, “vd” and “vr” refer to drive volley and backhand volley, respectively [5].
The new concept can be added to a match by simply invoking the function on a match or
list of matches so that queries can use the new concept. The Python function decorator
deals with the necessary code to traverse the match items (in the above case, shots) to check
whether the new tag has to be inserted in the tag set.
6. A Complete Example
Before describing the API in more detail, here we briefly discuss a complete example,
including also a first analysis of the query results. Lines beginning with # are just comments.
# Define the query
@shot_query
def attack_drive_volley(shot):
    return shot.like("vd") and (shot.hitter.distance_to_net < 5)

# Define the attributes
attribs = ["tags", "frame.frame_number", "hitter.position.x",
           "hitter.distance_to_net", "hitter.last_name"]

# Execute query on a match
match = load("Estrella Damm Open'20 - Women's Final")
result = attack_drive_volley(match, attribs)

# Analyze the results
result.analyze()
result.plot_positions()
The example is analyzing the position of the players when playing a drive volley less
than 5 m away from the net. The query execution returns a QueryResult object, which
provides some essential visualization methods. Figure 8 shows the output of the analyze
and plot_positions methods.
Figure 8. Output from the execution of a query on drive volleys less than 5 m away from the net. We
only show the first rows of the output DataFrame.
Figure 9. Main classes and methods in our API. Colors indicate coherent methods across classes.
Referring to Figure 9, about one-half of the methods refer to the hierarchical and se-
quence relationships already discussed in Section 5.1. As already mentioned, these methods
allow analysts to navigate through the different elements, as in shot.point.prev.winner, which, given a shot, retrieves the team that won the preceding point.
Besides these hierarchical and sequence relationships, all these classes have a tags
attribute (not shown in Figure 9) that contains a set of strings encoding specific concepts
about the class (Section 5.3). The presence of a tag can be checked with the like method,
as in the expression shot.like("serve"). All these classes have an associated time interval of the video, represented either as start_frame and end_frame properties, or just a frame_number. Sets, Games, Points, and Shots also have a number indicating their position within
their parent class. For example, for the first set of a match, set.number==1. Winning and
losing teams are available for all temporal units for which this makes sense (Match, Set,
Game, and Point). Points include a valid attribute to distinguish, e.g., net shots.
7.3. Player
Figure 10 summarizes the main methods of the Player class. Since most methods are
self-explanatory, here we only explain position, speed, and acceleration methods. These
three methods return a 2D point (position) or a 2D vector (speed, acceleration) with (x, y)
coordinates/components. Figure 10 shows the global coordinate system for the global
position of the players within the court. We also provide relative distances to major court
elements (net and walls). Notice that, for these relative distances, the reference element is
taken with respect to the player. For example, in distance_to_right_wall, the right wall
is defined with respect to the player; the left wall for the players of one team is the right
wall for the opponents and vice versa.
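A padel court is 10 m wide, so this player-relative convention could be sketched as follows (the mirroring rule below is our assumption about the paper's convention, with the bottom team near y = 0 and the net at y = 10):

```python
COURT_WIDTH = 10.0  # padel court width (m)
NET_Y = 10.0        # net position along the court's long axis (m)

def distance_to_right_wall(x, y):
    """Distance to the wall on the player's own right (hypothetical helper).

    A bottom-half player faces the net toward +y, so their right wall is
    at x = COURT_WIDTH; a top-half player faces -y, so theirs is at x = 0.
    """
    if y < NET_Y:
        return COURT_WIDTH - x
    return x

print(distance_to_right_wall(3.0, 2.0))   # bottom player -> 7.0
print(distance_to_right_wall(3.0, 18.0))  # top player -> 3.0
```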
Table 6. Shot types with native support in our API. These shot types are based on the classification
proposed in [5] (except the serve). Each row corresponds to a shot type. We provide the shot code, its
Spanish expansion [5], and some equivalent strings to represent them in queries.
Figure 10. Left: Main methods of the Player class. The methods in green are available for any Player
object, whereas those in blue are available only for Player instances bound to a particular Frame.
Right: Reference system for players’ positions and distances.
8. Evaluation
We evaluated the expressiveness of our query language by selecting many different
statements about padel from the literature, and translating them into (informal) plain
English queries and then into query definitions.
8.3. Queries
Each of the statements above can be translated into multiple queries. We show below
some plausible options, both in natural language and using our DSL.
Now, we will retrieve all volleys with a simple query. Notice that we can now use
like("volley") within the query definition:
@shot_query
def volleys(shot):
    return shot.like("volley")
We will estimate the volley direction by computing the vector from the player’s
position to the receiver player’s position, so we need to include as attributes the position
of shot.hitter and shot.next.hitter players. We will also add additional attributes for
plotting the data (such as player’s last name, and shot direction encoded as an angle):
attribs = ["frame.frame_number", "hitter.position.x", "hitter.position.y", "hitter.last_name",
           "next.hitter.position.x", "next.hitter.position.y", "angle", "abs_angle"]
q = volleys(match, attribs)
q.plot_directions(color='angle')
q.plot_distribution(density='angle', extent=[-30,30], groupby='player')
Figure 11 shows the resulting plots. For each segment, the larger dots represent the
volley origin, and the smaller dots the volley destination (estimated from the position of
the opponent player that returned the ball).
Figure 11. Direction of the volleys for the women’s final, with shots colored by angle (left). We show
as well the estimated density distribution of the volley angle variable, for the four players (right).
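The angle attribute used above could plausibly be derived from the two positions as follows (a sketch; the zero-angle convention is our assumption):

```python
import math

def shot_angle(hitter_xy, receiver_xy):
    """Direction of a shot, estimated as the vector from the hitter to
    the receiver. 0 deg points along the court's long axis (+y);
    positive angles lean toward +x (assumed convention)."""
    dx = receiver_xy[0] - hitter_xy[0]
    dy = receiver_xy[1] - hitter_xy[1]
    return math.degrees(math.atan2(dx, dy))

print(shot_angle((5.0, 2.0), (5.0, 18.0)))  # straight down the court -> 0.0
print(shot_angle((5.0, 2.0), (9.0, 6.0)))   # cross shot -> 45.0
```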
Notice that the query above can be extended easily to look for specific types of volleys:
for example, volleys played at a certain maximum distance from the net, after a specific
type of shot, or from a specific player:
# Volleys shot from less than 4 m from the net
@shot_query
def volley_from_attack_zone(shot):
    return shot.like("volley") and shot.hitter.distance_to_net < 4

q = fast_down_the_line_volley(match, attribs)
q.plot_directions(color='angle')
q.plot_directions(color='winning')
Figure 12. Direction of fast, down-the-line volleys for the test match, with shots colored by angle (left)
or depending on whether the player won the point, after this shot or later on (right).
We can plot the distribution of the duration variable (mean = 1.25 s for the test match),
as well as the position of the shots, colored by duration:
q.plot_distribution("duration")
q.plot_positions(color='duration')
Figure 13 shows the resulting plots. Most of the long shots, as expected, correspond to
lobs and passing shots.
Figure 13. Distribution of the shot duration (s) for a test match (left), and position of the players for
each shot (right), colored by shot duration. One half of the shots had a duration below 1.1 s.
where the attribs list contains the necessary attributes for the analysis. We can plot, for example, serve directions, coloring them either by angle or by player (Figure 14).
q.plot_directions(color="angle")
q.plot_directions(color="player")
Figure 14. Serve directions, colored by angle (left) or by player (right). Please recall that, for all plots
in this paper, we considered a subset of the points (this is not needed when using the interactive and
zoomable plots).
8.3.5. S5: “Serving Player’s Partner Should Be Waiting in the Net Zone”
The query definition is very similar to the previous example, but now we will retrieve
(as an attribute) the position of the partner:
Q5 “Retrieve all serves; for each serve, get the position of the partner of the serving player”.
Using our API, we would use the query definition of the previous example, but we add
query attributes to acquire data about the player’s partner (see Figure 15 for the results).
q = serves(match, attribs + ["hitter.partner.last_name", "hitter.partner.position.x",
"hitter.partner.position.y"])
q.plot_positions(color='player')
Figure 15. Position of the partner during serves (and a zoom into the clusters on the bottom side).
8.3.6. S6: “After Serving, the Player Should Move Quickly to the Net”
All the queries so far required data (e.g., the type of shot, the position of the players)
at a very specific moment (the time a shot is executed). Now we wish to analyze the
movement/paths of the players for some time (e.g., one second immediately after a serve).
This means we will have to use a frame_query which can provide data about arbitrary
segments of the video.
Q6 “Retrieve all frames immediately after a serve (e.g., for 1 s); for each frame, get the
position of the serving player”.
Using our API, we could compare the current frame number with that of the point’s
start frame, to filter the frames immediately after a serve (i.e., immediately after the start of
the point). See Figure 15 for the results.
@frame_query
def all_frames_after_serve(frame):
    return (frame.time - frame.point.start_time) < 1

q = all_frames_after_serve(match, fattribs)
q.plot_player_positions()
We plot all four players, although of course we could filter only serving players’
paths (Figure 16).
Figure 16. Motion of the four players after a serve, for time windows of 0.5, 1.0, 1.5 and 2.0 s. Please note that the court-space players' positions in our test datasets were approximate, with a larger error for players on the top side of the court.
8.3.7. S7: “Players Should Try to Win the Net Zone as Much as They Can”
We will analyze S7 through two queries.
Q7a “For each point, compute the total time both players were in the net zone, for the team
that wins the point, and also for the team that loses the point”.
Q7b “For each winning shot, retrieve the position (and distance to the net) of the player
that hit that ball”.
Using our API, Q7a can be translated as
@frame_query
def time_on_net_for_winning_team(frame):
    winner = frame.shot.point.winner
    return (winner and frame.distance_to_net(winner.forehand_player) < 4
            and frame.distance_to_net(winner.backhand_player) < 4)
where we compute for how long both players of a team are in the net zone (here, 4 m from
the net).
Similarly, we can define a query for the losing team, and combine both queries into a
single plot:
q1.addColumn("Team", "Point Winner")
q2.addColumn("Team", "Point Loser")
q = concat([q1,q2], "Time on the net")
q.plot_bar_chart(x='point', y='duration', color='Team')
Figure 17 shows the plot we got for our test match. This shows that, for that match,
the time on the net for the point winning team was higher (total time: 110.8 s) than for the
point losing team (total time: 68.7 s).
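Since the frame query returns one row per matching frame, converting the row count into seconds only needs the frame rate (we assume 25 fps here; at that rate, 2770 matching frames would correspond to the 110.8 s reported above):

```python
FPS = 25  # assumed frame rate of the video

def frames_to_seconds(n_frames, fps=FPS):
    """Convert a count of retrieved frames into time (seconds)."""
    return n_frames / fps

print(frames_to_seconds(2770))  # -> 110.8
```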
Similarly, Q7b can be translated as follows:
@shot_query
def winning_shots(shot):
    if not shot.hitter.from_point_winning_team:
        return False  # not from the winning team
    return not shot.next or not shot.next.next  # just winning shots

q = winning_shots(match, attribs)
Figure 17. For each point, the bar chart shows the time (s) spent in the net zone (less than 4 m from
the net) by the players of the point winning team and the point losing team. Point labels include the
game-set-point id of the point, and the team that won that point.
We can plot the position of the players at the moment they played the winning shot:
q.plot_positions()
q.plot_histogram("distance")
The resulting plot is shown in Figure 18. Notice that some winning shots were executed
from the defense zone; by checking these in the video we observed that in most cases, the
opponent made an unforced error when returning these shots (that is, the opponents could
hit the ball, but not accurately enough to keep playing the point).
Figure 18. Position of the players (and distance to the net) at the moment they played the winning
shot, for our test match.
Using our API, we could just retrieve all frames and plot the players’ positions, colored
by name:
@frame_query
def all_frames(frame):
    return True

q = all_frames(match, fattribs)
q.plot_player_positions()
Alternatively, we can compute the traversed distance on a per-frame basis, and then
group it by point. The example below compares the traversed distance of two players from
the same team, where q2 is built analogously to q1 for Sánchez (Figure 19):
q1 = all_frames(match, [("salazar.distance_from_prev_frame", "distance"), ...])
q1.sum("point")
q1.addColumn("Player", "Salazar")
q2.addColumn("Player", "Sánchez")
q = concat([q1,q2], "Distance traversed")
q.plot_bar_chart(x='point', y='distance', color='Player')
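The per-frame distance computation can be sketched without the API: accumulate the Euclidean distance between consecutive court-space positions of a player and group the sums by point. The (point_id, x, y) sample layout below is an assumption for illustration.

```python
from math import hypot

def distance_per_point(samples):
    # samples: list of (point_id, x, y) court-space positions of one player,
    # in frame order; distance is only accumulated within the same point
    totals = {}
    prev = None
    for point_id, x, y in samples:
        if prev is not None and prev[0] == point_id:
            totals[point_id] = totals.get(point_id, 0.0) + hypot(x - prev[1], y - prev[2])
        prev = (point_id, x, y)
    return totals

samples = [(1, 0.0, 0.0), (1, 3.0, 4.0), (2, 0.0, 0.0), (2, 0.0, 2.0)]
print(distance_per_point(samples))  # {1: 5.0, 2: 2.0}
```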
@shot_query
def lobs(shot):
    return shot.like("lob") and shot.next.like("defensive")
We can use query attributes to get additional information about, for example, where
the opponents had to return the lob, next.hitter.distance_to_backwall, or the distance
to the net of the players two shots after the lob, next.next.hitter.distance_to_net,
next.next.hitter.partner.distance_to_net.
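The chained navigation used in these expressions (shot.next, shot.like(tag)) can be modeled with linked objects. The sketch below is illustrative only; the class layout and tag representation are assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class Shot:
    tags: Set[str]                 # user-defined and built-in tags of this shot
    next: Optional["Shot"] = None  # the next shot in the rally, if any

    def like(self, tag: str) -> bool:
        # a shot is "like" a tag if the tag is attached to it
        return tag in self.tags

# a lob followed by a defensive return, followed by a volley
volley = Shot({"volley"})
rally = Shot({"lob"}, next=Shot({"defensive"}, next=volley))
print(rally.like("lob") and rally.next.like("defensive"))  # True
```

Chaining .next twice from the lob reaches the volley, mirroring expressions such as next.next.hitter.distance_to_net above.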
where shot is the serve, shot.next is the return, shot.next.next is the volley, and
shot.next.next.next is the volley’s return.
Notice that we had to loop over the DataFrame (df) rows because the filter involves
multiple rows (we are looking for a specific shot sequence). Notice also the use of index
offsets, which decreases readability. Finally, we have assumed that the input DataFrame
has already been enriched with additional columns (e.g., distance of the player from the
net). Otherwise, extra code would be required to compute these columns. This code might
not be straightforward if, for example, the filter involves data from other DataFrames (e.g.,
to relate who played the volley with the team that won the point).
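Such a row loop over the table might look like the following sketch, which uses plain dicts in place of DataFrame rows to stay dependency-free; the column names are illustrative assumptions that mirror the Q10 conditions (serve, return far from the side wall, volley, return close to the side wall, same returner).

```python
def first_volley_rows(rows):
    # rows: list of per-shot dicts in rally order; returns the indices of the
    # serves that start a matching serve/return/volley/return sequence
    hits = []
    for i in range(len(rows) - 3):
        serve, ret, volley, ret2 = rows[i], rows[i + 1], rows[i + 2], rows[i + 3]
        if (serve["type"] == "serve"
                and ret["hitter_dist_side_wall"] > 2.5
                and volley["type"] == "volley"
                and ret2["hitter_dist_side_wall"] < 2.5
                and ret["hitter"] == ret2["hitter"]):
            hits.append(i)
    return hits

rows = [
    {"type": "serve",  "hitter": "A", "hitter_dist_side_wall": 1.0},
    {"type": "return", "hitter": "B", "hitter_dist_side_wall": 3.0},
    {"type": "volley", "hitter": "A", "hitter_dist_side_wall": 1.0},
    {"type": "return", "hitter": "B", "hitter_dist_side_wall": 2.0},
]
print(first_volley_rows(rows))  # [0]
```

The explicit index offsets (i + 1, i + 2, i + 3) are exactly what hurts readability in the table-based formulation.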
Using our approach, the Q10 query can be written in a more compact and read-
able way:
@shot_query
def first_volley(shot):
    return (shot.like("serve")
            and shot.next.hitter.distance_to_side_wall > 2.5
            and shot.next.next.like("volley")
            and shot.next.next.next.hitter.distance_to_side_wall < 2.5
            and shot.next.hitter == shot.next.next.next.hitter)
9. Discussion
Expressiveness was our main priority when designing a domain-specific query lan-
guage for padel matches. Besides the statements discussed in Section 8, we have been able
to write queries to support, refute or analyze all statements about padel strategies, under the
sole assumption that the queries involve concepts captured by the tabular data. This is not
a limitation of the language, but of the sensing technology used to generate the input datasets.
For example, our dataset has no information about shot effects (backspin, topspin, slice).
We have observed that the ability to navigate across temporal concepts (from shots to
frames, from frames to shots, from a shot to the next one, etc.) greatly improves the readability
and compactness of the queries. As shown in many examples, the separation between the query
definitions (“retrieve all volleys”) and query attributes (“get the player’s position”) facili-
tates code reusability. User-defined tags and properties also provide a simple mechanism
to include new concepts (“deep lob”) that further simplify the task of writing queries in a
language close to that of coaches.
The output of a query is a QueryResult object that internally keeps a pandas DataFrame.
As shown in many examples, this class provides methods for essential plots. Supported
plots include scatter plots for players' positions on an overhead view of the court, as well as
bar charts and histograms. More complex plots can be obtained by directly accessing the query
output and using any visualization tool (we used Vega-Altair as well as HoloViz's hvPlot).
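The addColumn/concat pattern used throughout the examples can be mimicked with a minimal wrapper. The sketch below stores rows as plain dicts rather than a pandas DataFrame and is illustrative only, not the paper's QueryResult implementation.

```python
class QueryResult:
    """Minimal query-result wrapper: a named collection of row dicts."""

    def __init__(self, rows, name=""):
        self.rows = [dict(r) for r in rows]  # copy rows so inputs are untouched
        self.name = name

    def addColumn(self, column, value):
        # attach a constant-valued column to every row (e.g., a series label)
        for r in self.rows:
            r[column] = value
        return self

def concat(results, name):
    # merge several results into a single named result
    merged = [r for q in results for r in q.rows]
    return QueryResult(merged, name)

q1 = QueryResult([{"point": 1, "duration": 4.2}]).addColumn("Team", "Point Winner")
q2 = QueryResult([{"point": 1, "duration": 1.3}]).addColumn("Team", "Point Loser")
q = concat([q1, q2], "Time on the net")
print(len(q.rows))  # 2
```

With the rows labeled by the added "Team" column, a single grouped bar chart can compare both series, as in the Q7a example.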
Our approach, though, has some limitations. The flexibility of using a programming
language, with arbitrary predicates in the query definitions and arbitrary expressions in the
query attributes, comes at the price of raising the entry barrier for sports analysts,
since some Python skills are needed to write new queries. Although we believe
that minor edits to the query definitions are doable with little Python knowledge, the main
difficulty is the interpretation of potential syntax errors. Despite this, we believe that
queries written with the proposed API are more readable than those of alternative methods.
Although the focus of this paper is the query language and not how the input data
have been obtained, the availability of large datasets including accurate data about many
matches would certainly influence the impact of this work. In our test dataset, the most
relevant issue was the accuracy of the positional data, which was questionable for players
in the top part of the video, and also when players were jumping, since the perspective correction
we apply to move from image-space to court-space coordinates assumes that the feet are on the floor.
Advances in video tracking and pose estimation techniques, or the use of multi-camera
approaches, would improve the quality of the data and thus the reliability of analysis tools.
10. Applications
The proposed API has multiple practical applications, as it simplifies writing queries
and facilitates the exploratory analysis of padel matches.
At a professional level, the tool speeds up the analysis of many variables and their
relationship. For example, the tool can be useful in the following tasks:
• Determine the game profile in professional padel, considering variables such as the
number of games, number of points, average duration of games and points, time interval
between shots, number of winning points, and number of unforced errors [5,44].
• Analyze the frequency and success rate of the different technical actions (types of shots,
their direction and speed) according to the in-game situation (preceding technical
actions, position, and speed of the partner and the opponents).
• Analyze the distance covered by the players, their positions, displacements, and
coordinated movements, and relate them with the other variables [45].
• Analyze how all the variables above vary between women’s matches and men’s matches.
• Analyze how other external factors (e.g., outdoor match) might affect the variables above.
• Retrieve specific parts of the video (e.g., certain shot sequences) to quickly analyze
visually other aspects not captured by the input tabular data.
At an amateur level, coaches and padel clubs might offer trainees the opportunity to record
videos of their training sessions for further analysis. For example,
• Compare the technical actions adopted by the trainee in particular scenarios against
those adopted by professional players.
• Show trainees specific segments of professional padel videos to provide visual evi-
dence and representative examples of strategic recommendations.
• If multiple videos are available, compare the different variables defining the game
profile of a trainee with those of other amateur or professional players.
• Help to monitor the progress and performance improvement of the trainees.
Supplementary Materials: The following supporting information can be downloaded at: https://
www.mdpi.com/article/10.3390/s23010441/s1, Video S1: Interactive Demo.
Author Contributions: Conceptualization, M.J. and C.A.; software, M.J. and C.A.; validation and
resources, all authors; test statements, E.L. and C.A.; writing—original draft preparation, C.A.;
writing—review and editing, all authors; visualization, C.A.; supervision, C.A. All authors have read
and agreed to the published version of the manuscript.
Funding: This research was funded by the Spanish Ministry of Science and Innovation and FEDER
funds, grant number PID2021-122136OB-C21, MCIN/AEI/10.13039/501100011033/FEDER, UE.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The source code of a free implementation of the API will be released
upon acceptance. The test video is publicly available at https://youtu.be/7s55wB9dR78 (accessed on 1
December 2022).
Acknowledgments: We would like to thank Angel Ric for his helpful feedback and support on the
visualization of the query outputs. This project has received funding from the Spanish Ministry of
Science and Innovation and FEDER funds (PID2021-122136OB-C21).
References
1. Priego, J.I.; Melis, J.O.; Belloch, S.L.; Soriano, P.P.; García, J.C.G.; Almenara, M.S. Padel: A Quantitative study of the shots and
movements in the high-performance. J. Hum. Sport Exerc. 2013, 8, 925–931. [CrossRef]
2. Escudero-Tena, A.; Sánchez-Alcaraz, B.J.; García-Rubio, J.; Ibáñez, S.J. Analysis of Game Performance Indicators during 2015–2019
World Padel Tour Seasons and Their Influence on Match Outcome. Int. J. Environ. Res. Public Health 2021, 18, 4904. [CrossRef]
[PubMed]
3. Almonacid Cruz, B.; Martínez Pérez, J. Esto es Pádel; Editorial Aula Magna; McGraw-Hill: Sevilla, Spain, 2021. (In Spanish)
4. Demeco, A.; de Sire, A.; Marotta, N.; Spanò, R.; Lippi, L.; Palumbo, A.; Iona, T.; Gramigna, V.; Palermi, S.; Leigheb, M.; et al.
Match analysis, physical training, risk of injury and rehabilitation in padel: Overview of the literature. Int. J. Environ. Res. Public
Health 2022, 19, 4153. [CrossRef] [PubMed]
5. Almonacid Cruz, B. Perfil de Juego en pádel de Alto Nivel. Ph.D. Thesis, Universidad de Jaén, Jaén, Spain, 2011.
6. Santiago, C.B.; Sousa, A.; Estriga, M.L.; Reis, L.P.; Lames, M. Survey on team tracking techniques applied to sports. In
Proceedings of the 2010 International Conference on Autonomous and Intelligent Systems, AIS 2010, Povoa de Varzim, Portugal,
21–23 June 2010; pp. 1–6.
7. Shih, H.C. A survey of content-aware video analysis for sports. IEEE Trans. Circuits Syst. Video Technol. 2017, 28, 1212–1231.
[CrossRef]
8. Mukai, R.; Araki, T.; Asano, T. Quantitative Evaluation of Tennis Plays by Computer Vision. IEEJ Trans. Electron. Inf. Syst. 2013,
133, 91–96. [CrossRef]
9. Lara, J.P.R.; Vieira, C.L.R.; Misuta, M.S.; Moura, F.A.; de Barros, R.M.L. Validation of a video-based system for automatic tracking
of tennis players. Int. J. Perform. Anal. Sport 2018, 18, 137–150. [CrossRef]
10. Pingali, G.; Opalach, A.; Jean, Y. Ball tracking and virtual replays for innovative tennis broadcasts. In Proceedings of the 15th
International Conference on Pattern Recognition. ICPR-2000, Barcelona, Spain, 3–7 September 2000; Volume 4, pp. 152–156.
11. Mao, J. Tracking a Tennis Ball Using Image Processing Techniques. Ph.D. Thesis, University of Saskatchewan, Saskatoon, SK,
Canada, 2006.
12. Qazi, T.; Mukherjee, P.; Srivastava, S.; Lall, B.; Chauhan, N.R. Automated ball tracking in tennis videos. In Proceedings of the 2015
Third International Conference on Image Information Processing (ICIIP), Waknaghat, India, 21–24 December 2015; pp. 236–240.
13. Kamble, P.R.; Keskar, A.G.; Bhurchandi, K.M. Ball tracking in sports: A survey. Artif. Intell. Rev. 2019, 52, 1655–1705. [CrossRef]
14. Zivkovic, Z.; van der Heijden, F.; Petkovic, M.; Jonker, W. Image segmentation and feature extraction for recognizing strokes in
tennis game videos. In Proceedings of the ASCI, Heijen, The Netherlands, 30 May–1 June 2001.
15. Dahyot, R.; Kokaram, A.; Rea, N.; Denman, H. Joint audio visual retrieval for tennis broadcasts. In Proceedings of the 2003
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’03), Hong Kong, China, 6–10 April 2003;
Volume 3, p. III-561.
16. Yan, F.; Christmas, W.; Kittler, J. A tennis ball tracking algorithm for automatic annotation of tennis match. In Proceedings of the
British Machine Vision Conference, Oxford, UK, 5–8 September 2005; Volume 2, pp. 619–628.
17. Ramón-Llin, J.; Guzmán, J.; Martínez-Gallego, R.; Muñoz, D.; Sánchez-Pay, A.; Sánchez-Alcaraz, B.J. Stroke Analysis in Padel
According to Match Outcome and Game Side on Court. Int. J. Environ. Res. Public Health 2020, 17, 7838. [CrossRef]
18. Mas, J.R.L.; Belloch, S.L.; Guzmán, J.; Vuckovic, G.; Muñoz, D.; Martínez, B.J.S.A. Análisis de la distancia recorrida en pádel en
función de los diferentes roles estratégicos y el nivel de juego de los jugadores (Analysis of distance covered in padel based on
level of play and number of points per match). Acción Mot. 2020, 25, 59–67.
19. Vučković, G.; Perš, J.; James, N.; Hughes, M. Measurement error associated with the SAGIT/Squash computer tracking software.
Eur. J. Sport Sci. 2010, 10, 129–140. [CrossRef]
20. Ramón-Llin, J.; Guzmán, J.F.; Llana, S.; Martínez-Gallego, R.; James, N.; Vučković, G. The Effect of the Return of Serve on the
Server Pair’s Movement Parameters and Rally Outcome in Padel Using Cluster Analysis. Front. Psychol. 2019, 10, 1194. [CrossRef]
[PubMed]
21. Javadiha, M.; Andujar, C.; Lacasa, E.; Ric, A.; Susin, A. Estimating Player Positions from Padel High-Angle Videos: Accuracy
Comparison of Recent Computer Vision Methods. Sensors 2021, 21, 3368. [CrossRef] [PubMed]
22. Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J.; et al. MMDetection: Open MMLab Detection
Toolbox and Benchmark. arXiv 2019, arXiv:1906.07155.
23. Xiao, B.; Wu, H.; Wei, Y. Simple baselines for human pose estimation and tracking. In Proceedings of the European Conference
on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 466–481.
24. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE
Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [CrossRef] [PubMed]
25. Cai, Z.; Vasconcelos, N. Cascade R-CNN: High Quality Object Detection and Instance Segmentation. IEEE Trans. Pattern Anal.
Mach. Intell. 2019, 43, 1483–1498. [CrossRef]
26. Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.Y.; Girshick, R. Detectron2. 2019. Available online: https://github.com/facebookresearch/
detectron2 (accessed on 1 November 2021).
27. Chen, K.; Pang, J.; Wang, J.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Shi, J.; Ouyang, W.; et al. Hybrid task cascade for instance
segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA,
16–20 June 2019.
28. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 4510–4520.
29. Newell, A.; Yang, K.; Deng, J. Stacked hourglass networks for human pose estimation. In Proceedings of the European Conference
on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 483–499.
30. Huang, J.; Zhu, Z.; Guo, F.; Huang, G. The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose
Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 14–19
June 2020.
31. Zhang, F.; Zhu, X.; Dai, H.; Ye, M.; Zhu, C. Distribution-aware coordinate representation for human pose estimation. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 7093–7102.
32. Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep high-resolution representation learning for human pose estimation. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5693–5703.
33. Cheng, B.; Xiao, B.; Wang, J.; Shi, H.; Huang, T.S.; Zhang, L. HigherHRNet: Scale-Aware Representation Learning for Bottom-Up
Human Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual,
14–19 June 2020; pp. 5386–5395.
34. Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the 2017
IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649. [CrossRef]
35. Zhang, D.; Guo, G.; Huang, D.; Han, J. PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in
Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA,
18–22 June 2018.
36. Bergmann, P.; Meinhardt, T.; Leal-Taixe, L. Tracking without bells and whistles. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 941–951.
37. Šajina, R.; Ivašić-Kos, M. 3D Pose Estimation and Tracking in Handball Actions Using a Monocular Camera. J. Imaging 2022, 8,
308. [CrossRef]
38. Soto-Fernández, A.; Camerino, O.; Iglesias, X.; Anguera, M.T.; Castañer, M. LINCE PLUS software for systematic observational
studies in sports and health. Behav. Res. Methods 2022, 54, 1263–1271. [CrossRef]
39. Mishra, P.; Eich, M.H. Join processing in relational databases. ACM Comput. Surv. 1992, 24, 63–113. [CrossRef]
40. Fister, I.; Fister, I.; Mernik, M.; Brest, J. Design and implementation of domain-specific language easytime. Comput. Lang. Syst.
Struct. 2011, 37, 151–167. [CrossRef]
41. Van Deursen, A.; Klint, P. Domain-specific language design requires feature descriptions. J. Comput. Inf. Technol. 2002, 10, 1–17.
[CrossRef]
42. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500.
43. Remohi-Ruiz, J.J. Pádel: Lo Esencial. Nivel Iniciación y Medio; NPQ Editores: Valencia, Spain, 2019. (In Spanish)
44. Mellado-Arbelo, Ó.; Vidal, E.B.; Usón, M.V. Análisis de las acciones de juego en pádel masculino profesional (Analysis of game
actions in professional male padel). Cult. Cienc. Deporte 2019, 14, 191–201.
45. Ramón-Llin, J.; Guzmán, J.F.; Belloch, S.L.; Vuckovic, G.; James, N. Comparison of distance covered in paddle in the serve team
according to performance level. J. Hum. Sport Exerc. 2013, 8, S738–S742. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.