0% found this document useful (0 votes)
47 views

Interim Layout

epic document

Uploaded by

moxiro1709
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
Rating Prediction of Football Players using Machine Learning

Chaitanya Dindore
BTech CSE
Manipal Institute of Technology
Bangalore, Karnataka, India

ABSTRACT

This research investigates the prediction of basketball player ratings through the application of diverse machine learning algorithms: linear regression, snap random forest regressor, snap boosting machine regressor, random forest regressor, and extra trees regressor. The study explores how well these methods capture the relation between player attributes and rating. Results reveal the distinct strengths of each algorithm, with ensemble methods providing superior predictive accuracy, linear regression variants providing valuable interpretability, and regularization techniques increasing robustness. This study provides insight into which algorithm to choose for precise player rating predictions.

KEYWORDS
Ratings, Machine Learning, Algorithms, Regression algorithms, Basketball players.

1. INTRODUCTION

Knowing the rating of a basketball player through a data-driven assessment of their performance has become increasingly important in the basketball industry. Television broadcasters, game sites, and fans of the sport have benefited from the use of basketball statistics to find out how players perform. These statistics give valuable insight into how a match might unfold and help coaches and trainers analyze and evaluate their athletes. They also help talent scouts from basketball teams find promising players who could work better with their team.

Given the number of people who try out for major league teams and the number of people who play basketball professionally, the task of filtering out valuable players is difficult, if not downright impossible. In this context, data-driven player analysis emerges as an invaluable aid.

These factors motivate building a machine learning model in which the performance of a basketball player is predicted using various machine learning algorithms.

Section 2 of this paper summarizes previous research in this area, Section 3 summarizes the methodologies used, and Section 4 contains the results along with the conclusion.

2. Literature Survey

In [1], the authors elucidate the use of big data and machine learning to predict player rating.
Paper [2] shows the authors using random forest regression to predict All-Star players in the National Basketball Association.

Paper [3] discusses the combination of sports performance analysis and machine learning in team sports.

In paper [4], the authors discuss using machine learning algorithms to predict matches instead of relying on skill-based forecasts.

The authors of paper [5] discuss using hybrid fuzzy SVM models to predict the outcomes of basketball matches.

Paper [6] discusses using machine learning to construct an ideal basketball team.

Paper [7] discusses using artificial neural networks to predict the results of an NBA game.

Paper [8] uses data from historical matches to predict future matches.

Paper [9] discusses using multiple player factors to predict matches and player performance.

Paper [10] discusses injury rate prediction in sports such as basketball using machine learning models.

3. Methodology

Dataset-
To arrive at an accurate prediction, we used a dataset of 464 players with 8 parameters. X denotes the data frame of selected columns and is defined as:

X=df[['rating','team','position','height','weight','salary','country','draft year','draft round','draft peak','college']]

Algorithms-
This section describes the algorithms used and explains why they were chosen.

1. Random forest regressor:
A random forest regressor uses multiple decision trees to improve accuracy and reduce overfitting. We used it in our research because it:
• Handles complex non-linear relationships between variables well.
• Helps in identifying important player statistics and team performance metrics.

2. Boosting machine regressor:
A boosting machine regressor was used in this research because it:
• Enhances prediction accuracy by building decision trees sequentially; boosting regressors progressively refine predictions, leading to more accurate forecasts.
• Helps capture the complex dynamics of a basketball game.

3. Extra trees regressor:
This is an ensemble method similar to a random forest regressor. It is used here because of its ability to reduce overfitting to the training data, leading to more generalized predictions.

Performance metrics-
To make the prediction model efficient and accurate, multiple model evaluation metrics were used. The metrics used in the prediction model are described below.
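The results in this paper report every metric twice, as a holdout score and as a cross-validation score. The following is a minimal sketch of how two such scores could be produced, not the authors' pipeline. The 20% holdout fraction, the five folds, the synthetic ratings, and the mean-rating baseline standing in for a real regressor are all assumptions made for illustration.

```python
import random
from statistics import mean


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted)) ** 0.5


def holdout_split(n_rows, holdout_fraction=0.2, seed=0):
    """Shuffle row indices once and reserve a fraction as the holdout set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = max(1, int(n_rows * holdout_fraction))
    return indices[n_holdout:], indices[:n_holdout]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def fit_mean_baseline(train_ratings):
    """Stand-in model (assumption): always predicts the mean training rating."""
    centre = mean(train_ratings)
    return lambda n_rows: [centre] * n_rows


# Synthetic ratings purely for illustration; this is not the paper's data.
rng = random.Random(42)
ratings = [rng.uniform(60, 99) for _ in range(464)]

train_idx, holdout_idx = holdout_split(len(ratings))

# Cross-validation score: average RMSE over the folds of the training rows.
fold_scores = []
for fold_train, fold_val in k_fold_indices(train_idx, k=5):
    model = fit_mean_baseline([ratings[i] for i in fold_train])
    fold_scores.append(rmse([ratings[i] for i in fold_val], model(len(fold_val))))
cv_score = mean(fold_scores)

# Holdout score: fit on all training rows, then score once on the unseen rows.
model = fit_mean_baseline([ratings[i] for i in train_idx])
holdout_score = rmse([ratings[i] for i in holdout_idx], model(len(holdout_idx)))

print(f"holdout RMSE={holdout_score:.3f}, cross-validation RMSE={cv_score:.3f}")
```

The holdout rows are never used for fitting, so the holdout score is a single estimate on unseen data, while the cross-validation score averages several estimates taken within the training rows.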
1. Root mean squared error-
• Root mean squared error (RMSE) is a metric widely used in machine learning to assess the accuracy of a predictive model. It measures the difference between the predicted value and the actual value.
• Mathematically, RMSE is expressed as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.

2. R squared-
• R squared is another widely used metric for machine learning models. It measures the proportion of the variance in the dependent variable that is predictable from the independent variables.
• R squared is expressed as:
$$R^2 = 1 - \frac{SS_{\mathrm{residual}}}{SS_{\mathrm{total}}}$$
• Where:
• SS_residual is the sum of squared residuals (differences between observed and predicted values).
• SS_total is the total sum of squares, which measures the total variance in the dependent variable.

3. Explained variance-
• Explained variance is very similar to R squared. It is used for assessing the performance of a regression model, and it is also used with techniques such as principal component analysis and other dimensionality reduction techniques. It quantifies the amount of variability in the dependent variable that is explained by the independent variables or principal components.
• Mathematically, explained variance is expressed as:
$$\text{Explained variance} = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}$$
• Where:
• y denotes the actual values of the dependent variable.
• ŷ denotes the predicted values of the dependent variable.
• Var(y) is the total variance of the dependent variable.

4. Mean squared error-
• Mean squared error (MSE) is a commonly used metric for evaluating regression model performance. It measures the average of the squares of the errors, or deviations, between predicted and actual values in a dataset.
• MSE is expressed as:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.

5. Mean squared log error-
• Mean squared log error (MSLE) is particularly useful when dealing with a skewed target variable or when the relative errors between predicted and actual values are of more interest than the absolute errors.
• MSLE is expressed as:
$$\mathrm{MSLE} = \frac{1}{n}\sum_{i=1}^{n} \left(\log(1 + y_i) - \log(1 + \hat{y}_i)\right)^2$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.

6. Mean absolute error-
• Mean absolute error (MAE) is another well-known metric for evaluating the performance of regression models. It measures the average absolute difference between the predicted and actual values in a dataset.
• It is expressed as:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.

7. Median absolute error-
• Median absolute error (MedAE) is a robust metric used particularly in situations where the dataset might contain outliers or is skewed. It measures the median of the absolute differences between predicted and actual values in a dataset.
• MedAE is expressed as:
$$\mathrm{MedAE} = \mathrm{median}\left(\left| y_1 - \hat{y}_1 \right|, \ldots, \left| y_n - \hat{y}_n \right|\right)$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.

8. Root mean squared log error-
• Similar to MSLE, root mean squared log error (RMSLE) is commonly used when dealing with skewed target variables and when the relative errors between predicted and actual values are of interest. RMSLE is a variation of RMSE that incorporates the natural logarithm of the predicted and actual values.
• RMSLE is expressed as:
$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\log(1 + y_i) - \log(1 + \hat{y}_i)\right)^2}$$
• Where:
• n is the number of observations in the dataset.
• y_i is the actual value of the dependent variable for observation i.
• ŷ_i is the predicted value of the dependent variable for observation i.
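The eight metrics defined in this section can be written down directly from their formulas. The sketch below uses only the Python standard library and is meant as an illustration of the definitions, not as the code used to produce the paper's results. The function names and the four example ratings are hypothetical, and the variance is taken as the population variance, which is an assumption about the convention.

```python
import math
from statistics import mean, median, pvariance


def mse(y, y_hat):
    """Mean of the squared differences between actual and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(y, y_hat))


def rmse(y, y_hat):
    """Square root of the mean squared error."""
    return math.sqrt(mse(y, y_hat))


def r_squared(y, y_hat):
    """1 - SS_residual / SS_total."""
    y_bar = mean(y)
    ss_residual = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_total = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_residual / ss_total


def explained_variance(y, y_hat):
    """1 - Var(y - y_hat) / Var(y), using the population variance."""
    residuals = [a - p for a, p in zip(y, y_hat)]
    return 1 - pvariance(residuals) / pvariance(y)


def msle(y, y_hat):
    """Mean squared difference of log(1 + value); inputs must exceed -1."""
    return mean((math.log1p(a) - math.log1p(p)) ** 2 for a, p in zip(y, y_hat))


def rmsle(y, y_hat):
    """Square root of the mean squared log error."""
    return math.sqrt(msle(y, y_hat))


def mae(y, y_hat):
    """Mean of the absolute differences."""
    return mean(abs(a - p) for a, p in zip(y, y_hat))


def medae(y, y_hat):
    """Median of the absolute differences."""
    return median(abs(a - p) for a, p in zip(y, y_hat))


# Hypothetical actual and predicted ratings, for illustration only.
y = [80.0, 85.0, 90.0, 75.0]
y_hat = [82.0, 84.0, 88.0, 78.0]

for metric in (rmse, r_squared, explained_variance, mse, msle, mae, medae, rmsle):
    print(f"{metric.__name__}: {metric(y, y_hat):.4f}")
```

Explained variance and R squared differ only when the residuals have a non-zero mean: explained variance ignores a constant bias in the predictions, while R squared penalizes it.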
Results and discussion:

Snap random forest regressor:

| Measures | Holdout score | Cross validation score |
|---|---|---|
| Root mean square error | 3.792 | 5.213 |
| R squared | 0.729 | 0.414 |
| Explained variance | 0.783 | 0.419 |
| Mean squared error | 14.379 | 27.358 |
| Mean squared log error | 0.002 | 0.004 |
| Mean absolute error | 3.230 | 4.185 |
| Root mean squared error | 3.105 | 3.429 |
| Root mean squared log error | 0.049 | 0.065 |

Extra trees regressor:

| Measures | Holdout score | Cross validation score |
|---|---|---|
| Root mean square error | 4.196 | 5.021 |
| R squared | 0.668 | 0.454 |
| Explained variance | 0.727 | 0.458 |
| Mean squared error | 17.609 | 23.306 |
| Mean squared log error | 0.003 | 0.004 |
| Mean absolute error | 3.141 | 3.873 |
| Root mean squared error | 2.547 | 3.154 |
| Root mean squared log error | 0.052 | 0.062 |

Random Forest regressor:

| Measures | Holdout score | Cross validation score |
|---|---|---|
| Root mean square error | 4.156 | 5.080 |
| R squared | 0.674 | 0.441 |
| Explained variance | 0.721 | 0.446 |
| Mean squared error | 17.266 | 25.892 |
| Mean squared log error | 0.003 | 0.004 |
| Mean absolute error | 3.273 | 3.959 |
| Root mean squared error | 2.681 | 3.213 |
| Root mean squared log error | 0.052 | 0.063 |

Snap boosting machine regressor:

| Measures | Holdout score | Cross validation score |
|---|---|---|
| Root mean square error | 3.938 | 5.069 |
| R squared | 0.708 | 0.444 |
| Explained variance | 0.742 | 0.447 |
| Mean squared error | 15.510 | 25.762 |
| Mean squared log error | 0.002 | 0.004 |
| Mean absolute error | 2.896 | 3.882 |
| Root mean squared error | 1.995 | 3.139 |
| Root mean squared log error | 0.050 | 0.062 |

Frontend

First, the application shows a splash screen.

Then, for security reasons, the user is redirected to enter their phone number to enable two-factor authentication.

If this is the first time the application has been opened, the user is redirected to a login page. The main screen then lets a person enter the stats of the player.

Future Work:
This paper shows the effectiveness of various machine learning algorithms at predicting the rating of basketball players. It shows that each algorithm has its own pros and cons in understanding the intricate relationships in the given data. It also shows that the ensemble models predict better than the regression ones, while the regression ones are better at finding relationships within the data.

Future endeavours to improve this process could include the use of deep learning neural networks and more statistics within the training data.

References:

[1] Gu, Wei, et al. "A game-predicting expert system using big data and machine learning." Expert Systems with Applications 130 (2019): 293-305.

[2] Soliman, G., Misbah, A. and Eldawlatly, S., 2017, September. Predicting all star player in the national basketball association using random forest. In 2017 Intelligent Systems Conference (IntelliSys) (pp. 706-713). IEEE.

[3] Bunker, Rory, and Teo Susnjak. "The application of machine learning techniques for predicting match results in team sport: A review." Journal of Artificial Intelligence Research 73 (2022): 1285-1322.

[4] Sukumaran, C., et al. "Application of Artificial Intelligence and Machine Learning to Predict Basketball Match Outcomes: A Systematic Review." Computer Integrated Manufacturing Systems 28 (2022): 998-1009.

[5] Jain, Sushma, and Harmandeep Kaur. "Machine learning approaches to predict basketball game outcome." 2017 3rd International Conference on Advances in Computing, Communication & Automation (ICACCA)(Fall). IEEE, 2017.

[6] Ke, Yuhao, Ranran Bian, and Rohitash Chandra. "A unified machine learning framework for basketball team roster construction: NBA and WNBA." Applied Soft Computing 153 (2024): 111298.

[7] Thabtah, Fadi, Li Zhang, and Neda Abdelhamid. "NBA game result prediction using feature analysis and machine learning." Annals of Data Science 6.1 (2019): 103-116.

[8] Bunker, Rory P., and Fadi Thabtah. "A machine learning framework for sport result prediction." Applied computing and informatics 15.1 (2019): 27-33.

[9] Lee, Dae-Jin, and Garritt L. Page. "Big Data in Sports: Predictive Models for Basketball Player's Performance." (2021).

[10] Claudino, João Gustavo, et al. "Current approaches to the use of artificial intelligence for injury risk assessment and performance prediction in team sports: a systematic review." Sports medicine-open 5 (2019): 1-12.
