Major Project College Report
Tennis Game Analysis using Deep Learning and Machine Learning
CHAPTER 1
Introduction
1.1 Context
Tennis, a globally cherished sport, has witnessed a remarkable evolution not only in terms of player skills but
also in the technology employed to analyze and present the game to enthusiasts. The advent of cutting-edge tools
such as Hawk-Eye and IBM Slamtracker has transformed the tennis viewing experience by providing a
comprehensive and detailed statistical analysis of each match. This project report delves into the significance of
these advanced technologies and their impact on enriching the understanding and enjoyment of tennis for fans
around the world.
Hawk-Eye, a sophisticated system featuring up to ten high-speed cameras, has redefined precision in tracking
the ball's real-world position. With the ability to capture and reconstruct any bounce with unparalleled accuracy,
Hawk-Eye contributes significantly to enhancing the visual experience of a tennis match. This system enables
the automatic generation of statistics related to serves through the middle or in the corners, the depth of ball
placement, and player preferences for left or right, depending on their positioning on the court.
IBM Slamtracker, on the other hand, is an application that elevates the real-time engagement of fans by
presenting a wealth of statistics – ranging from 15 to 25 parameters for each point. This highly sophisticated
system, comprising 8 to 10 high-speed cameras capable of capturing up to 1,000 frames per second, is powered
by an extremely advanced computer. While Slamtracker offers unparalleled accuracy in its analysis, the
equipment-intensive nature, substantial cost, and the need for specialized expertise in installation limit its
availability to the high-profile venues of major tournaments.
As we explore the impact of these technologies on tennis, this project report aims to shed light on how the
integration of Hawk-Eye and IBM Slamtracker has revolutionized the way fans perceive and engage with the
sport. By providing a comprehensive overview of the capabilities of these systems, we will delve into the intricate
details of statistical analysis, unveiling patterns, preferences, and nuances that enhance the overall tennis-
watching experience.
In the subsequent sections of this report, we will delve into the technical aspects, benefits, and challenges
associated with implementing these advanced technologies, with the ultimate goal of showcasing their
significance in shaping the future of tennis analytics and spectator engagement.
The overarching purpose of this project is to discern the profound implications of Hawk-Eye and IBM
Slamtracker on the landscape of tennis analytics and fan engagement. By meticulously examining the
capabilities, nuances, and challenges inherent in these state-of-the-art technologies, we seek to illuminate their
role in elevating the comprehension and enjoyment of tennis matches for a global audience. Furthermore, the
project aims to contribute to the discourse surrounding the future trajectory of tennis analytics, providing valuable
insights for stakeholders, enthusiasts, and technologists alike.
The problem addressed by this project is the absence of a comprehensive and user-friendly tennis analytics
platform. This project seeks to develop a solution that leverages deep learning and machine learning techniques
to analyze tennis game data, providing insights into player performance, strategy, and game dynamics. By
creating a user-friendly interface, the project aims to make advanced tennis analytics accessible to players,
coaches, and enthusiasts, enhancing the overall understanding and improvement of tennis gameplay.
1.3 OBJECTIVES
The main objectives of the project are as follows:
The principal objective of this project is to conceive and actualize a revolutionary video analysis tool
designed to integrate ball tracking, court monitoring, bounce detection, and player tracking seamlessly.
This groundbreaking endeavor is underpinned by the ambitious goal of harnessing the capabilities of a
single camera operating at an optimal frame rate of 25–30 frames per second. The overarching aim is to
democratize comprehensive tennis analytics, making it more accessible and cost-effective, thereby
transcending the current limitations imposed by sophisticated, resource-intensive systems. This
innovative approach seeks to usher in a new era of technological inclusivity, fostering broader
engagement and understanding within the tennis community and beyond.
This project endeavors to pioneer a cutting-edge video analysis tool that amalgamates ball tracking, court
monitoring, bounce detection, and player tracking into a unified and efficient system. The scope encompasses
the development of a singular camera-based solution, operating at a frame rate of 25–30 frames per second, to
achieve comprehensive tennis analytics. The key components of the project's scope are delineated below:
Ball Tracking:
Implementing algorithms and computer vision techniques to precisely trace the trajectory of the tennis ball
throughout the course of a match.
Ensuring real-time accuracy in capturing the ball's movement, including velocity, spin, and directional changes.
Court Monitoring:
Designing features that enable continuous monitoring of the entire tennis court, providing contextual information
to enhance the analysis.
Utilizing computer vision to identify key court areas and zones, facilitating a nuanced understanding of player
positioning and shot placement.
Bounce Detection:
Developing sophisticated algorithms for reliable detection and analysis of ball bounces on the court surface.
Enhancing the tool's capability to differentiate between various types of bounces, such as topspin or slice, to
provide a more nuanced perspective on the game dynamics.
Player Tracking:
Employing advanced tracking algorithms to follow players' movements across the court accurately.
Extracting insightful data on players' speed, distance covered, and preferred areas of play to contribute to strategic
analyses.
Addressing the challenge of achieving comprehensive analytics with a single camera, thereby reducing the
equipment footprint and associated costs.
Optimizing the camera's capabilities to capture the requisite data points without compromising on the accuracy
of analysis.
Ensuring that the camera operates at an optimal frame rate of 25–30 frames per second to capture fast-paced
tennis action with precision.
Implementing strategies to minimize latency and enhance the real-time nature of the analytics tool.
Focusing on creating a solution that is accessible to a broader spectrum of tennis facilities, tournaments, and
enthusiasts.
Mitigating the financial barriers associated with adopting sophisticated tennis analytics tools, thereby
democratizing access to valuable insights.
In conclusion, this project aspires to redefine the landscape of tennis analytics by creating an innovative,
accessible, and cost-effective video analysis tool. The integration of ball tracking, court monitoring, bounce
detection, and player tracking through a single camera represents a pioneering approach that holds the potential
to revolutionize the way tennis is analyzed and appreciated.
CHAPTER 2
LITERATURE SURVEY
The landscape of tennis game analysis has witnessed a surge in innovation, with several notable projects
contributing to the advancement of the field. These projects, each unique in its approach, leverage cutting-edge
technologies such as computer vision, deep learning, and machine learning to unravel the intricacies of tennis
gameplay.
The "Open Tennis" project, hosted on GitHub by StanlyHardy, stands as a promising initiative with a
comprehensive roadmap. This project outlines a clear trajectory for the integration of advanced analytics into
tennis analysis. The roadmap encompasses various aspects, including data collection, feature engineering, and
the implementation of machine learning models. The commitment to transparency and collaboration within the
open-source community is evident, positioning this project as a valuable resource for future developments.
In tandem, JeffSackmann's GitHub repository serves as a goldmine for tennis-related datasets. This extensive
collection provides a diverse range of data, enabling researchers and developers to explore and analyze tennis
statistics comprehensively. The richness of the dataset contributes significantly to the depth and accuracy of
tennis analytics, laying a foundation for sophisticated modeling and insights.
Diving into the realm of data science and tennis, Towards Data Science has been a prolific source of information.
The tagged articles on tennis provide insightful discussions on various analytical techniques applied to the sport.
Covering topics from player performance prediction to strategic analysis, these articles offer a nuanced
perspective, enriching the discourse on tennis analytics.
Moving towards more specific projects, Prateek Puri's ATP Serving Strategy project, hosted on GitHub, delves
into the strategic aspect of tennis, particularly focusing on serving patterns. By leveraging machine learning, this
project aims to unravel nuanced serving strategies employed by players. The technical depth of this project lies
in its algorithmic approach to discerning patterns and providing a quantitative understanding of serving
dynamics.
A notable academic contribution comes from the work titled "Computer Vision and Machine Learning for In-Play
Tennis Stroke Classification" by Vinyes et al. This research explores the application of computer vision
and machine learning for classifying tennis strokes during gameplay. The technical intricacies of the model
design and classification algorithms are detailed, shedding light on the potential of these technologies in refining
stroke analysis.
The intersection of sensor data, machine learning, and multi-objective optimization is explored in a research
paper titled "Analyzing Tennis Game through Sensor Data with Machine Learning and Multi-objective
Optimization." The authors present a holistic approach to analyzing tennis gameplay, emphasizing the fusion of
sensor data and machine learning for comprehensive insights. The technical depth lies in the optimization
techniques applied to glean meaningful patterns from the sensor data.
On a similar note, the project documented in "Projects- Report LV8 Project" by gml16 provides a detailed
exploration of data analysis in tennis. This project presents an in-depth analysis of tennis matches using machine
learning. The technical nuances of feature engineering, model selection, and evaluation metrics are discussed,
offering a valuable reference for researchers and enthusiasts alike.
Taken together, these projects and research efforts contribute to the growing field of tennis analytics. While
each project has its own strengths, the common thread is the use of advanced technologies to decode the
intricacies of tennis gameplay. The diversity of approaches, from open-source community collaboration to
academic research and individual projects, is moving tennis analytics toward a dynamic and promising
domain.
The implementation of this pioneering tennis analytics tool involves the judicious integration of cutting-edge
technologies to achieve a harmonious synergy between precision and accessibility. The core technological
components are succinctly outlined below:
Leveraging advanced computer vision algorithms to interpret visual data from the single camera, enabling robust
ball tracking, player movement analysis, and court monitoring.
Employing feature extraction and pattern recognition techniques to discern critical elements such as ball trajectory,
player positions, and court zones.
Integrating machine learning models to enhance the tool's predictive capabilities and refine its ability to discern
nuanced patterns in player behavior.
Training the models on extensive datasets to continuously improve accuracy in ball tracking, bounce detection,
and player movement prediction.
Real-time Processing:
Employing high-performance real-time processing techniques to ensure instantaneous analysis of the captured
video frames.
Optimizing algorithms to mitigate latency, thereby delivering timely and accurate insights to enhance the live
viewing experience.
Implementing optical flow analysis to discern the dynamic movement of the tennis ball and players on the court.
Utilizing this technology to extract valuable information on ball speed, spin, and player agility, contributing to a
comprehensive understanding of match dynamics.
Single-camera Calibration:
Employing sophisticated calibration techniques to maximize the efficacy of the single camera in capturing precise
spatial and temporal information.
Ensuring that the camera operates seamlessly across the entire tennis court, providing consistent and reliable data
for analysis.
Designing an intuitive and user-friendly graphical interface to present the analytics data in a visually engaging
manner.
Integrating customizable features to empower users to tailor the display based on their specific analytical
requirements.
Implementing robust data security protocols to safeguard the integrity and confidentiality of the captured and
processed information.
Adhering to industry standards and best practices to ensure the ethical and secure utilization of tennis analytics
data.
In essence, the technological framework of this project converges at the intersection of computer vision, machine
learning, and real-time processing, fostering a dynamic and innovative tool that not only elevates the precision of
tennis analytics but also prioritizes accessibility and cost-effectiveness in its implementation.
The literature underscores the transformative journey of tennis analytics, from traditional manual scoring to the
contemporary era of technologically driven insights.
Historical analyses reveal the shift towards data-driven approaches, emphasizing the role of technology in
unraveling the nuances of player performance and match dynamics.
Notable research contributions highlight the utilization of multi-camera systems, such as Hawk-Eye, and their
impact on precision ball tracking and player movement analysis.
Comparative assessments shed light on the limitations of current technologies, particularly in terms of cost and
accessibility, paving the way for innovative solutions.
The literature underscores the instrumental role of computer vision and machine learning in enhancing the
accuracy and predictive capabilities of sports analytics tools.
Insights from studies in related fields, such as soccer and basketball analytics, offer valuable methodologies for
adapting these technologies to the intricacies of tennis.
Research findings highlight the challenges inherent in developing robust tennis analytics tools with a single
camera, including issues of spatial calibration and frame rate optimization.
Scholarly discussions present opportunities to address these challenges through advancements in computer vision
algorithms and real-time processing techniques.
The literature review emphasizes the growing importance of democratizing sports technology, making advanced
analytics tools accessible to a broader spectrum of users.
Case studies on cost-effective solutions in sports analytics provide valuable insights into creating tools that
balance sophistication with affordability.
Ethical considerations surrounding the collection, storage, and utilization of sports data emerge as a recurrent
theme in the literature.
Scholarly discourse on ethical frameworks and guidelines informs the project's commitment to ensuring data
security and responsible use of analytics insights.
In essence, the literature overview serves as a guiding compass, navigating the project through the rich tapestry
of advancements and challenges in tennis analytics. By assimilating insights from diverse sources, the project
aims to contribute to the evolving discourse in sports technology, offering a novel synthesis that addresses
contemporary limitations and pushes the boundaries of what is achievable in tennis analytics.
Harnessing the tenets of computer vision, the model envisages the extraction of meaningful information from
visual data captured by a single camera.
Emphasizing feature extraction, object recognition, and spatial analysis, the computer vision paradigm forms the
cornerstone for accurate ball tracking, player movement, and court monitoring.
The theoretical model integrates machine learning as a catalyst, endowing the system with the capacity to adapt
and refine its analytical prowess over time.
Supervised learning algorithms are employed for training on extensive datasets, enabling the model to discern
intricate patterns in ball behavior, bounce dynamics, and player strategies.
Single-Camera Optimization:
Addressing the challenge of utilizing a single camera, the model incorporates advanced calibration techniques to
optimize spatial accuracy across the tennis court.
Employing monocular vision principles, the model leverages temporal and spatial cues to reconstruct a dynamic
three-dimensional representation of the game.
The theoretical framework prioritizes real-time processing as an imperative, ensuring the instantaneous analysis
and dissemination of insights during live tennis matches.
Implementing parallel processing and efficient algorithms, the model mitigates latency, facilitating timely
delivery of analytics to enhance the viewing experience.
Drawing inspiration from optical flow analysis, the model captures the dynamic motion of the tennis ball and
players, providing granular data on parameters such as velocity, spin, and strategic player positioning.
This integration contributes to a nuanced understanding of match dynamics and augments the comprehensiveness
of the generated tennis analytics.
Envisioning a user-centric approach, the model incorporates principles of human-computer interaction in the
design of an intuitive GUI.
The GUI serves as the interface for users to interact with and interpret the rich analytics data, offering
customization features to cater to diverse user preferences.
Infused with ethical considerations, the model embeds a robust data security architecture to safeguard the
integrity and confidentiality of the captured and processed information.
Adhering to industry standards and guidelines, the architecture ensures responsible and secure utilization of
tennis analytics data.
In summary, the theoretical framework orchestrates a symphony of computer vision, machine learning, and real-
time processing principles, tailored to the unique challenges and opportunities presented by tennis analytics. This
fusion of theoretical underpinnings forms the bedrock for the development of a revolutionary video analysis tool,
poised to redefine the landscape of tennis analytics.
2.4 Functioning
The operational architecture of the proposed tennis analytics tool is intricately designed to seamlessly
integrate advanced technologies, ensuring precise and accessible insights into the dynamics of the sport.
The key functionalities are succinctly outlined below:
Data Acquisition:
Initiation of the process involves the capture of high-resolution video footage using a single camera
operating at an optimal frame rate of 25–30 frames per second.
The captured video serves as the primary data source for subsequent analysis, encompassing all facets of
the tennis match, from ball trajectories to player movements.
Implementation of sophisticated computer vision algorithms enables the extraction of crucial information,
such as ball trajectory, player positions, and court zones, from the video frames.
These algorithms are finely tuned to operate seamlessly with a single camera, mitigating the need for
complex multi-camera setups.
Integration of machine learning models, trained on extensive datasets, enhances the tool's predictive
capabilities.
The models contribute to the accuracy of ball tracking, bounce detection, and the nuanced analysis of player
Real-time Processing:
High-performance real-time processing techniques ensure the instantaneous analysis of captured video
frames, maintaining pace with the dynamic nature of tennis matches.
The optimization of algorithms minimizes latency, delivering timely insights to users, thus enhancing the
real-time viewing experience.
Development of an intuitive GUI provides users with a user-friendly platform for accessing and interpreting
analytics data.
The interface is designed to be customizable, allowing users to tailor the display based on specific analytical
requirements and preferences.
Implementation of advanced algorithms for bounce detection on the court surface contributes to a nuanced
understanding of ball behavior.
Concurrently, court monitoring features employ computer vision to identify key areas and zones, offering
contextual insights into player positioning and shot placements.
Security Protocols:
Robust data security measures are embedded within the tool to safeguard the integrity and confidentiality
of the captured and processed information.
Adherence to industry standards ensures ethical and secure utilization of tennis analytics data, addressing
concerns related to data privacy and integrity.
In essence, the functioning of the proposed tennis analytics tool revolves around the seamless orchestration
of cutting-edge technologies, each contributing to the overarching goal of providing comprehensive and
accessible insights into the intricate dynamics of tennis matches. The amalgamation of data acquisition,
advanced algorithms, real-time processing, and user-friendly interfaces marks a paradigm shift in tennis
analytics, promising an innovative and inclusive approach to understanding the sport.
Image Preprocessing:
The initial phase involves preprocessing the input image by extracting white pixels, recognizing that tennis court
lines are inherently white. Pixels surpassing a predefined intensity threshold in a monochrome representation of
the court are isolated.
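The thresholding step can be illustrated with a minimal sketch. This is not the project's actual code: the function name `extract_white_pixels` and the threshold value of 200 are assumptions chosen only for illustration, and the input is assumed to be an 8-bit grayscale frame.

```python
import numpy as np

def extract_white_pixels(gray, threshold=200):
    """Return a binary mask (0 or 255) of pixels brighter than `threshold`.

    `threshold` is a hypothetical value; in practice it would be tuned per video.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Tiny synthetic "frame": one bright vertical line on a dark background.
frame = np.array([[10, 250, 12],
                  [11, 255, 13],
                  [9, 201, 200]], dtype=np.uint8)
mask = extract_white_pixels(frame)
```

Only the middle column survives, because the value 200 in the last row does not exceed the threshold.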
Employing the Hough Transform, the system detects lines in the preprocessed image, classifying them as either
horizontal or vertical. This delineates the essential court structure, capturing the spatial arrangement of the lines.
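As a rough illustration of the classification step, the sketch below sorts line segments into horizontal and vertical groups by comparing their horizontal and vertical extents. It assumes each segment is given as endpoint coordinates (x1, y1, x2, y2), as a probabilistic Hough transform would typically return; the function name and the `ratio` parameter are hypothetical.

```python
def classify_segment(segment, ratio=2.0):
    """Label a segment 'horizontal' if it is much wider than it is tall, else 'vertical'.

    `ratio` is an assumed tuning parameter: larger values make the
    'horizontal' class stricter.
    """
    x1, y1, x2, y2 = segment
    dx, dy = abs(x2 - x1), abs(y2 - y1)
    return "horizontal" if dx > ratio * dy else "vertical"

# Hypothetical segments: a baseline and a sideline slanted by perspective.
segments = [(0, 100, 300, 102), (100, 0, 140, 300)]
labels = [classify_segment(s) for s in segments]
```

A two-class rule like this treats slanted sidelines as vertical, which suits broadcast camera angles but would need revisiting for unusual viewpoints.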
Reference court configurations are utilized to compare the detected lines. Through the determination of a
homography matrix, calculated based on known points from both the reference and detected lines, the system
projects the reference court onto the frame.
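The homography computation can be sketched with the standard direct linear transform (DLT). This is one common way to estimate the matrix and is not necessarily the method used in the project. The reference corners use the standard doubles court size in metres (10.97 by 23.77), while the frame coordinates are made-up pixel positions.

```python
import numpy as np

def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 matrix H with dst ~ H @ src from at least 4 point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def project(h, point):
    """Map a reference-court point into frame coordinates."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return float(x / w), float(y / w)

reference_corners = [(0.0, 0.0), (10.97, 0.0), (10.97, 23.77), (0.0, 23.77)]
frame_corners = [(420.0, 150.0), (860.0, 150.0), (1100.0, 650.0), (180.0, 650.0)]  # hypothetical
H = estimate_homography(reference_corners, frame_corners)
centre_in_frame = project(H, (10.97 / 2, 23.77 / 2))
```

With exactly four correspondences the fit is exact; with more points the same code gives a least-squares estimate.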
Warp perspective techniques are applied using the homography matrix to align the detected court lines with the
reference configuration. The subsequent hit detection involves counting the overlaps between reference court
lines and the binary image, revealing the best-matching lines.
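The overlap counting described above can be sketched as follows. The sketch assumes that each candidate court has already been projected and rasterised into an image of the same size as the binary mask; the function names are hypothetical.

```python
import numpy as np

def overlap_score(binary_mask, projected_court):
    """Count pixels that are set in both the white-pixel mask and the projected court."""
    return int(np.count_nonzero((binary_mask > 0) & (projected_court > 0)))

def best_candidate(binary_mask, candidates):
    """Return the index of the best-matching candidate and all scores."""
    scores = [overlap_score(binary_mask, c) for c in candidates]
    return int(np.argmax(scores)), scores

# Toy example: the mask contains one horizontal line.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[2, :] = 255
candidate_a = np.zeros((5, 5), dtype=np.uint8)
candidate_a[2, :] = 1   # lies on the detected line
candidate_b = np.zeros((5, 5), dtype=np.uint8)
candidate_b[:, 2] = 1   # crosses it at a single pixel
best_index, scores = best_candidate(mask, [candidate_a, candidate_b])
```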
Despite its utility, the classical computer vision approach exhibits drawbacks such as sluggish processing speeds,
sensitivity to hyperparameters, and challenges with generalization across varying court conditions.
Dataset Creation:
A comprehensive dataset is compiled through a semi-automated process, extracting frames from video highlights
of different tennis tournaments. The dataset undergoes refinement through classical computer vision algorithms
to filter out suboptimal images. This dataset, consisting of 8841 images, spans various court types.
The proposed deep learning network, inspired by the TrackNet architecture, is tailored for tennis court detection.
The model's input tensor comprises a single image, and the output tensor includes 15 channels—14 points from
the dataset and an additional point denoting the center of the tennis court.
The dataset is divided into training and test sets, with images resized to expedite training. The Adam optimizer is
used to optimize the network weights, and the model is trained using defined parameters.
Postprocessing Techniques:
Postprocessing involves refining key points using classical computer vision techniques, rectifying biases in
predicted locations. Additionally, a homography matrix is utilized to reconstruct shifted key points, aiding in
precise alignment with reference points.
Evaluation Metrics:
Performance metrics, including precision, accuracy, and median distance between predicted and ground truth
points, are calculated based on Euclidean distances. Postprocessing techniques' impact on final metrics is duly
considered.
For tennis ball detection, the TrackNet deep learning network is employed. This heatmap-based model is trained
to recognize the ball image and learn flying patterns from consecutive frames.
A specialized dataset containing labeled frames from broadcast videos is utilized. The ground truth is represented
as a heatmap of a 2D Gaussian distribution centered on the tennis ball. Binary cross-entropy loss is applied for
training.
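For reference, the binary cross-entropy between a predicted heatmap and its target can be written in a few lines. This is a plain NumPy sketch for illustration only; an actual training loop would use the loss provided by the deep learning framework. The clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import numpy as np

def binary_cross_entropy(predicted, target, eps=1e-7):
    """Mean binary cross-entropy over all heatmap pixels."""
    predicted = np.clip(np.asarray(predicted, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    loss = -(target * np.log(predicted) + (1.0 - target) * np.log(1.0 - predicted))
    return float(loss.mean())

# Toy 3x3 target with the ball at the centre pixel.
target = np.array([[0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0]])
uncertain = np.full_like(target, 0.5)             # no information
confident = np.where(target > 0.5, 0.99, 0.01)    # close to the target
loss_uncertain = binary_cross_entropy(uncertain, target)
loss_confident = binary_cross_entropy(confident, target)
```

The uninformative prediction gives a loss of ln 2, and the loss falls as the prediction approaches the target.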
Evaluation Metrics:
Positioning error (PE) is defined to evaluate the accuracy of ball detection, considering the Euclidean distance
between predictions and ground truth. Specific metrics, including true positives, false positives, and false
negatives, contribute to the assessment.
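A per-frame classification based on the positioning error could look like the sketch below. The exact counting rules are not spelled out in this section, so the rules in the code are an assumption: a detection within the threshold is a true positive, a detection that is too far away or appears when no ball is present is a false positive, and a missed ball is a false negative. The 5-pixel default follows the threshold given in the design chapter.

```python
import math
from collections import Counter

def classify_detection(predicted, truth, threshold=5.0):
    """Classify one frame; `predicted` and `truth` are (x, y) tuples or None."""
    if truth is None:
        return "TN" if predicted is None else "FP"
    if predicted is None:
        return "FN"
    positioning_error = math.dist(predicted, truth)
    return "TP" if positioning_error <= threshold else "FP"

# Hypothetical (prediction, ground truth) pairs for five frames.
frames = [((100, 100), (102, 101)),  # close enough
          ((50, 50), (70, 50)),      # 20 px away
          (None, (10, 10)),          # ball missed
          ((5, 5), None),            # ball detected where there is none
          (None, None)]              # correctly empty
counts = Counter(classify_detection(p, t) for p, t in frames)
precision = counts["TP"] / (counts["TP"] + counts["FP"])
recall = counts["TP"] / (counts["TP"] + counts["FN"])
```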
Bounce Detection:
Using the trajectory obtained from tennis ball detection, sudden changes in ball direction are identified to
locate candidate bounce frames. A CatBoostRegressor model is then used to predict bounce points from features
such as coordinate differences and distance relations.
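The two ideas in this step, spotting direction changes and building coordinate-difference features, can be sketched as follows. The sketch rests on simplifying assumptions: a candidate is any frame where the ball's vertical image coordinate stops increasing and starts decreasing, and the feature names and window size are hypothetical. The resulting feature dictionaries are the kind of input a regressor such as CatBoostRegressor could be trained on; the model itself is not shown.

```python
import math

def direction_change_frames(track):
    """Indices where the ball moves down the image and then back up."""
    frames = []
    for i in range(1, len(track) - 1):
        dy_before = track[i][1] - track[i - 1][1]
        dy_after = track[i + 1][1] - track[i][1]
        if dy_before > 0 and dy_after < 0:
            frames.append(i)
    return frames

def bounce_features(track, i, window=2):
    """Coordinate differences and distances between frame i and its neighbours."""
    x, y = track[i]
    features = {}
    for k in range(1, window + 1):
        for name, j in (("prev", i - k), ("next", i + k)):
            xj, yj = track[j]
            features[f"dx_{name}{k}"] = xj - x
            features[f"dy_{name}{k}"] = yj - y
            features[f"dist_{name}{k}"] = math.hypot(xj - x, yj - y)
    return features

# Hypothetical ball positions (x, y) in consecutive frames.
track = [(100, 200), (110, 220), (120, 240), (130, 255),
         (140, 245), (150, 238), (160, 233)]
candidates = direction_change_frames(track)
features = bounce_features(track, candidates[0])
```

A sign change alone also fires on racket hits, which is one reason a learned model on richer features is preferable to the heuristic by itself.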
Evaluation:
Precision and recall metrics are employed to evaluate the effectiveness of bounce detection. A predicted bounce
is counted as correct if it falls within a specified range of frames around the true bounce frame.
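The matching rule can be made concrete with a short sketch. The tolerance of two frames is an assumption for illustration, since the report does not state the range here, and each true bounce is allowed to match at most one prediction.

```python
def bounce_precision_recall(predicted, actual, tolerance=2):
    """Match predicted bounce frames to true ones within `tolerance` frames."""
    unmatched = sorted(actual)
    true_positives = 0
    for p in sorted(predicted):
        hit = next((a for a in unmatched if abs(a - p) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)   # each true bounce can be matched only once
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(unmatched)
    detected = true_positives + false_positives
    expected = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / expected if expected else 0.0
    return precision, recall

# Hypothetical frame indices.
precision, recall = bounce_precision_recall([10, 48, 90, 130], [11, 50, 91, 170])
```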
Combined Workflow:
The methodology culminates in a holistic pipeline, integrating court detection, tennis ball tracking, and bounce
identification. The pretrained neural networks collectively contribute to providing a comprehensive
representation of player movements, ball trajectories, and court dynamics.
Possible Improvements:
Speed Optimization:
Strategies to enhance algorithm speed, such as replacing the neural network backbone and applying quantization
and pruning, are suggested for future improvements.
Model Enhancement:
The possibility of combining court and ball detection into a unified neural network, trained on a single
consolidated, annotated dataset, is proposed to streamline the model architecture.
Quality Enhancement:
Augmentation techniques during court detection training and the refinement of ball detection quality through
additional frames without the ball are identified as potential areas for improvement.
In summary, the methodology encompasses a hybrid approach, blending classical computer vision techniques
with deep learning paradigms to provide a robust and versatile tennis analytics tool. The incorporation of
innovative strategies and the recognition of potential areas for improvement underscore the dynamic nature of
the methodology, positioning it at the forefront of tennis analytics innovation.
CHAPTER 3
REQUIREMENT ANALYSIS
Data Visualization (Matplotlib and Seaborn): Useful for visualizing images, graphs, and performance
metrics during different stages of the project.
Version Control (Git): Facilitates collaborative development, version control, and tracking changes in
the codebase.
IDE (Integrated Development Environment) (VSCode): Provides an interactive environment for
developing and testing code, especially for data exploration and model development.
CHAPTER 4
DESIGN
4.1 DESIGN GOALS
Court Detection
Detecting tennis court lines is a critical step in establishing a meaningful coordinate system for player and ball
positioning in our system. The conventional computer vision approach involves several sequential steps. Initially,
the image is processed to extract white pixels, as tennis court lines are typically white. Subsequently, lines, both
horizontal and vertical, are detected using the Hough Transform. The resulting lines are then compared with a
reference court configuration, and a homography matrix is determined based on known points. This matrix
projects the reference court onto the frame, providing valuable spatial information for further analysis.
However, this classical approach encounters several limitations. Its execution is notably slow, taking as long as
15 seconds per image. The method's quality is compromised due to the multitude of hyperparameters
involved, leading to instances where lines are not accurately detected. Additionally, the approach proves to be
unstable in the presence of varying angles, colors, shadows, and court types.
To overcome the limitations of the classical approach, a deep learning strategy is proposed. The fundamental
concept involves employing a neural network to identify 14 key points on a tennis court. These key points, along
with an additional point representing the center of the court, enable the reconstruction of the entire court
configuration within the image.
Dataset Collection:
A dataset for training and evaluation was semi-automatically generated using video highlights from diverse
tennis tournaments. Frames from the videos were extracted, processed through the classical computer vision
algorithm, and manually filtered to enhance image quality. The resulting dataset comprises 8841 images covering
various court types.
Model Architecture:
The deep learning network is structured similarly to the TrackNet architecture, with modifications to
accommodate a single input image and an output tensor featuring 15 channels. The resolution of both the input
and output images is set to 640x360.
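To show how a 15-channel output can be turned into court points, the sketch below takes the brightest pixel of each channel as the predicted location. Decoding by arg-max with a minimum score is an assumption made for illustration; the report does not state how the heatmaps are decoded, and the `min_score` value is hypothetical.

```python
import numpy as np

def heatmaps_to_points(heatmaps, min_score=0.5):
    """Convert an array of shape (channels, height, width) into (x, y) points.

    A channel whose peak is below `min_score` yields None.
    """
    points = []
    for channel in heatmaps:
        row, col = np.unravel_index(np.argmax(channel), channel.shape)
        if channel[row, col] >= min_score:
            points.append((int(col), int(row)))
        else:
            points.append(None)
    return points

# Synthetic output: 15 channels at 640x360, with peaks in two of them.
heatmaps = np.zeros((15, 360, 640), dtype=np.float32)
heatmaps[0, 100, 200] = 0.9     # first key point
heatmaps[14, 180, 320] = 0.8    # extra channel for the court centre
points = heatmaps_to_points(heatmaps)
```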
Training:
The dataset is split into training (75%) and test (25%) sets, with frames resized to 640x360 to expedite training.
The Adam optimizer is employed, and training parameters are optimized for efficient convergence.
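A reproducible 75/25 split can be sketched in a few lines. The fixed random seed and the function name are assumptions; the report does not say how images were assigned to each set.

```python
import random

def split_dataset(items, train_fraction=0.75, seed=0):
    """Shuffle items with a fixed seed and split them into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Image indices stand in for the 8841 dataset images.
train_ids, test_ids = split_dataset(range(8841))
```

With 8841 images this gives 6630 training images and 2211 test images.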
Postprocessing Techniques:
Two postprocessing techniques are implemented to refine key points. The first involves refining points using
classical computer vision, where a rectangular area around predicted key points is examined for white line
intersections. The second technique utilizes a homography matrix to reconstruct shifted key points.
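The first refinement technique can be approximated with a simplified sketch. Instead of detecting and intersecting lines, it picks the row and the column with the most white pixels inside a window around the prediction, which works only when the two lines are roughly axis-aligned within the window. The window size and function name are hypothetical, and this is a stand-in for the idea rather than the project's method.

```python
import numpy as np

def refine_keypoint(mask, point, half_size=20):
    """Snap a predicted (x, y) point to the strongest row/column crossing nearby."""
    x, y = point
    height, width = mask.shape
    x0, x1 = max(0, x - half_size), min(width, x + half_size + 1)
    y0, y1 = max(0, y - half_size), min(height, y + half_size + 1)
    crop = mask[y0:y1, x0:x1] > 0
    if not crop.any():
        return point                      # nothing to snap to
    row = int(np.argmax(crop.sum(axis=1)))
    col = int(np.argmax(crop.sum(axis=0)))
    return (x0 + col, y0 + row)

# Synthetic mask with a horizontal line at y=50 and a vertical line at x=60.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[50, :] = 255
mask[:, 60] = 255
refined = refine_keypoint(mask, (55, 46))
```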
Evaluation:
Performance metrics are defined based on the assumption that a key point is accurately detected if the Euclidean
distance between the model prediction and ground truth is less than 7 pixels. Precision, accuracy, and median
distance between predicted and ground truth points are calculated, showcasing the overall efficacy of the deep
learning approach. Postprocessing techniques are evaluated for their impact on final metrics, highlighting the
refinement achieved through these methods.
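The 7-pixel criterion and the median distance can be sketched as below. This is a simplified illustration: it assumes every key point has both a prediction and a ground truth position, and it reports the fraction of points within the threshold rather than the exact precision and accuracy definitions, which are not spelled out in the text. The function names are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def keypoint_errors(pred: Sequence[Point], truth: Sequence[Point]) -> List[float]:
    """Euclidean distance between each predicted and ground truth key point."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def summarize(
    pred: Sequence[Point], truth: Sequence[Point], threshold: float = 7.0
) -> Dict[str, float]:
    """Fraction of key points closer than the threshold, plus median distance."""
    errors = keypoint_errors(pred, truth)
    if not errors:
        raise ValueError("no key points to evaluate")
    hits = sum(1 for e in errors if e < threshold)
    return {
        "detected_fraction": hits / len(errors),
        "median_distance": statistics.median(errors),
    }
```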
Ball Detection:
The ball detection component of our project is a crucial element, and we employ the TrackNet deep learning
network for this purpose.
Detecting a tennis ball in broadcast videos poses challenges due to small, blurry, and occasionally invisible ball
images. TrackNet, a heatmap-based deep learning network, is specifically trained to not only identify the ball in
individual frames but also to learn its flying patterns across consecutive frames. Operating on images with a size
of 640 × 360, TrackNet generates a detection heatmap from either a single frame or a sequence of frames,
achieving high precision even in public domain videos.
Our dataset comprises 10 broadcast video clips, encompassing various stages from ball serving to scoring,
totaling 19,835 labeled frames. The video resolution is 1280×720, with a frame rate of 30 fps. Each frame in the
label file includes attributes such as "Frame Name," "Visibility Class" (VC), "X," "Y," and "Trajectory Pattern."
VC indicates the ball's visibility: 0 denotes not in frame, 1 easily identified, and 2 in frame but not easily
identified. "Trajectory Pattern" categorizes ball movement into flying, hit, and bouncing.
During training, the ground truth is represented by a heatmap of a magnified 2D Gaussian distribution centered
on the tennis ball. The Gaussian distribution's variance corresponds to the assumed diameter of tennis ball
images, approximately 10 pixels.
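A ground truth heatmap of this kind can be sketched as follows. The variance of 10 comes from the description above; scaling the peak to 255 is an assumption about what "magnified" means here, and `gaussian_heatmap` is a hypothetical helper.

```python
import math
from typing import List


def gaussian_heatmap(
    width: int,
    height: int,
    cx: float,
    cy: float,
    variance: float = 10.0,
    peak: float = 255.0,  # assumed magnification so that the centre equals 255
) -> List[List[float]]:
    """Build a heatmap[y][x] holding a scaled 2D Gaussian centred on the ball."""
    return [
        [
            peak * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * variance))
            for x in range(width)
        ]
        for y in range(height)
    ]
```

For a full frame, `width` and `height` would be 640 and 360, with `(cx, cy)` taken from the "X" and "Y" label attributes.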
For evaluation, the tennis ball's image diameter in the video varies from 2 to 12 pixels, with a mean diameter of
approximately 5 pixels. Given that prediction errors within a unit size of the ball are acceptable for trajectory
identification, we define the positioning error (PE) threshold as 5 pixels to determine accurate ball detection. PE
is computed as the Euclidean distance between model predictions and ground truth. Evaluation metrics, as
proposed by the authors, include:
This comprehensive approach to ball detection and tracking demonstrates the meticulous design and evaluation
methods employed in our project, highlighting its potential impact on refining tennis analytics.
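The positioning-error criterion can be sketched as below. The counting rules and the choice of precision, recall, and F1 are assumptions made for illustration: a prediction within 5 pixels of a visible ball counts as a true positive, a prediction farther away or made when no ball is present counts as a false positive, and a missed visible ball counts as a false negative. `None` stands for "no ball" in either the prediction or the label.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]


def count_outcomes(
    preds: Sequence[Optional[Point]],
    truths: Sequence[Optional[Point]],
    threshold: float = 5.0,
) -> Dict[str, int]:
    """Classify each frame as tp, fp, tn or fn using the PE threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for pred, truth in zip(preds, truths):
        if pred is None and truth is None:
            counts["tn"] += 1
        elif pred is None:
            counts["fn"] += 1
        elif truth is None:
            counts["fp"] += 1
        elif math.dist(pred, truth) <= threshold:  # PE within the threshold
            counts["tp"] += 1
        else:
            counts["fp"] += 1
    return counts


def precision_recall_f1(counts: Dict[str, int]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from the outcome counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```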
Bounce Detection:
Bounce detection constitutes a pivotal aspect of our solution, enabling us to pinpoint the frame when the ball
makes contact with the ground. Leveraging the dataset used for tennis ball detection, which includes annotated
frames indicating ball-court contact, we visualize the ball trajectory's x and y coordinates throughout the game.
In the visual representation of the ball trajectory, green dots signify frames corresponding to the ball's contact
with the court, while brown dots highlight frames where players struck the ball. Given the challenge of predicting
the bounce frame during gameplay, we adopted a machine learning approach to address this issue. Specifically,
we employed the CatBoostRegressor to determine the bounce point, considering features such as the difference
between x and y coordinates of adjacent points, distance relations between previous and subsequent ball points,
and other relevant parameters. Binary classification was employed, designating class 1 for bounce and class 0
for other instances.
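The kind of per-frame features described above can be sketched as follows. The exact feature set is not listed in the report, so the feature names and the distance ratio are illustrative assumptions; the resulting rows would be the inputs on which a model such as the CatBoostRegressor is fitted.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def bounce_features(track: Sequence[Point], i: int) -> Dict[str, float]:
    """Features for frame i built from its previous and next ball positions."""
    if i < 1 or i > len(track) - 2:
        raise IndexError("frame needs both a previous and a next ball position")
    (xp, yp), (xc, yc), (xn, yn) = track[i - 1], track[i], track[i + 1]
    dx_prev, dy_prev = xc - xp, yc - yp
    dx_next, dy_next = xn - xc, yn - yc
    dist_prev = math.hypot(dx_prev, dy_prev)
    dist_next = math.hypot(dx_next, dy_next)
    return {
        "dx_prev": dx_prev,
        "dy_prev": dy_prev,
        "dx_next": dx_next,
        "dy_next": dy_next,
        "dist_prev": dist_prev,
        "dist_next": dist_next,
        # Relation between the step after and the step before the frame.
        "dist_ratio": dist_next / dist_prev if dist_prev else 0.0,
    }
```

In the example track `[(0, 0), (3, 4), (6, 0)]`, the sign of the y difference flips at the middle frame, which is the kind of change a bounce produces in image coordinates.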
Following the dataset processing, we retained 18,570 samples, with 518 of them labeled as bounce points. To
mitigate overfitting, we implemented a 10-fold cross-validation approach over the 10 labeled games, so that each
fold uses 9 games for training and 1 for testing. A grid search over essential parameters such as iterations,
learning rate, and depth further optimized the training process.
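The fold structure and the parameter grid can be sketched as below. The game identifiers and grid values are placeholders rather than the values used in the project, and both helper functions are hypothetical; model fitting itself is omitted.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_game_out(
    game_ids: Sequence[int],
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training games, held-out game) pairs, one pair per game."""
    for held_out in game_ids:
        yield [g for g in game_ids if g != held_out], held_out


def parameter_grid(grid: Dict[str, Sequence]) -> Iterator[Dict[str, object]]:
    """Yield every combination of the listed parameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Placeholder search space; the values actually searched are not given.
example_grid = {
    "iterations": [100, 200],
    "learning_rate": [0.05, 0.1],
    "depth": [4, 6],
}
```

Each parameter combination would be scored on every fold, and the combination with the best average score kept.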
The evaluation of our method focuses on precision and recall. Because exact bounce frame detection is not critical
for our demo, we allow a margin of error of 1 frame. When we additionally assigned a weight of 0.5 to the points
neighboring a ground-truth bounce, we observed a trade-off: a drop in precision due to false labels but a
significant increase in recall.
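The one-frame margin can be illustrated with the sketch below, which counts a predicted bounce frame as correct when it lies within one frame of a ground truth bounce that has not already been matched. The greedy matching rule is an assumption made for illustration, and `match_bounces` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def match_bounces(
    predicted: Sequence[int], actual: Sequence[int], tolerance: int = 1
) -> Tuple[float, float]:
    """Return (precision, recall) with a frame tolerance on bounce positions."""
    unmatched = sorted(actual)
    tp = 0
    for frame in sorted(predicted):
        # Take the earliest unmatched ground truth bounce within the tolerance.
        hit = next((a for a in unmatched if abs(a - frame) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return precision, recall
```

Because each ground truth bounce can be matched only once, two predictions on adjacent frames around the same bounce count as one correct detection and one false positive.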
In summary, our strategy for bounce detection not only addresses the intricacies of predicting ball contact frames
during the game but also utilizes machine learning techniques, specifically the CatBoostRegressor, to
enhance precision and recall metrics. The weighted evaluation approach further allows for a nuanced assessment
of the model's performance.
CHAPTER 5
CONCLUSION
In summary, our presented solution introduces a tool designed for the automated analysis of tennis matches,
leveraging the capabilities of deep learning and computer vision. This innovative tool facilitates the detection
and tracking of various elements, including the court, ball trajectory, and players' movements, while also offering
automatic identification of outs during gameplay.
The neural network proposed for court detection surpasses previous solutions based on classical computer
vision techniques in terms of speed, robustness, and effectiveness. The integration of postprocessing techniques
further enhances the neural network's performance, showcasing a notable advancement in the field of tennis
match analysis.