Football Player Transfer Value Prediction Using Advanced Statistics and FIFA 22 Data
Abstract: In football, transfers are one of the most exciting things for fans. A transfer involves a player moving from one club to another. The buying club pays a large amount of money to the selling club to acquire the services of the player. Because this amount is so large, it is crucial for the sustainability of a club not to overpay for players. In this study, we use FIFA 22 game data and information from football data websites like FBref and Transfermarkt to accurately predict what a player is worth, using regression machine learning models. Other than player skills, we also considered important factors like age and contract remaining, which have a significant impact on transfer value. Results show that Gradient Boosting and eXtreme Gradient Boosting were the best algorithms. This work is beneficial for professional teams as well as football websites.

Index Terms: Football Analytics, Transfer Value, Machine Learning, Regression

I. INTRODUCTION

Football is one of the most popular sports in the world, with millions of followers. Most countries have their own football associations and leagues. Each league usually has around 10-24 teams, all of which strive for continuous improvement. The most direct way to improve is through player recruitment. By recruiting new and exciting players, a club can improve its roster and have a better chance of winning more titles. This in turn improves its relationship with fans.

A player can either be a free agent or be contracted to another club (the parent club). A free agent can be signed directly without any negotiations with the club he used to play for. However, for a contracted player, the buying club has to pay a compensation fee to acquire his/her services. This compensation is usually money (termed the transfer value) and is extremely expensive. The figure often runs to millions of pounds, so it is of immense importance that the right amount is paid for the right player. Otherwise, the long-term plan of the buying club suffers. Therefore, smart recruitment is crucial.

Football has developed to the point where advanced statistics are readily available to fans, and these are often utilised by teams and analysts. These statistics can paint an extremely detailed picture of every aspect of a player, not just how well he plays but also the way in which he plays. This study aims to predict the transfer value of professional football players based on their ratings in the video game FIFA 22 by EA Sports, the above-mentioned statistics, and a few other factors like years remaining in the contract. FIFA 22 gives insight into the skills of a player, and this is a very good measuring stick for understanding how good a player is in real life. The transfer values of players are maintained by a website called Transfermarkt.com. We used the data from Transfermarkt for training our model. Statistics are obtained from a website called FBref, which provides a vast amount of data pertaining to player performance. Through our system, we are trying to provide a statistical method for estimating transfer value.

II. LITERATURE REVIEW

Behravan and Razavi (2020) looked to address the drawbacks of Transfermarkt market values by predicting market values with the help of the FIFA 20 dataset. From the dataset, players were divided into 4 clusters (Goalkeepers, Defenders, Midfielders, Attackers) using an automatic clustering method. Particle swarm optimization (PSO) was then used for automatic feature selection, and support vector regression (SVR) was used for regression within each cluster. The value of a player in the FIFA dataset was considered the true value.

Yigit et al. (2020) attempted to predict the transfer value of football players from major leagues. 5316 players from 11 major leagues across Europe and South America were considered. Data from the Football Manager simulation game was collected and merged with the transfer value from Transfermarkt. Logarithmic transformation was used for better distribution. The KPMG valuation model was employed for training. Goalkeepers were excluded from the dataset as they had different features. A variety of regression models like
Authorized licensed use limited to: PES University Bengaluru. Downloaded on November 30,2024 at 14:41:05 UTC from IEEE Xplore. Restrictions apply.
Similarly, a higher FIFA 22 Potential rating also correlates directly with transfer value.
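To make the notion of a direct correlation concrete, the following minimal sketch computes the Pearson correlation coefficient between Potential ratings and transfer values. The five data points are invented for illustration and are not taken from the study's dataset; the function name `pearson_r` is likewise hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented example data (assumption): Potential rating vs. value in million pounds.
potential = [70, 75, 80, 85, 90]
value_m = [1.5, 4.0, 12.0, 30.0, 65.0]

r = pearson_r(potential, value_m)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 would indicate that higher Potential ratings go together with higher transfer values; it says nothing about causation.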
IV. METHODOLOGY
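As summarized in the Conclusion, the collected FIFA 22 and FBref records were merged and then split by player position before models were built for each group. A rough sketch of such a merge-and-split step is given here. The join key, field names, position codes, and sample rows are all hypothetical, since the paper does not list them; goalkeepers are kept without FBref statistics because the Results section states that only FIFA data was considered for them.

```python
from collections import defaultdict

# Hypothetical mapping from position codes to the four groups (assumption).
POSITION_GROUP = {
    "GK": "Goalkeepers",
    "CB": "Defenders", "LB": "Defenders", "RB": "Defenders",
    "LWB": "Defenders", "RWB": "Defenders",
    "CM": "Midfielders", "CDM": "Midfielders",
    "ST": "Attackers", "CAM": "Attackers", "LW": "Attackers", "RW": "Attackers",
}

def merge_and_split(fifa_rows, stat_rows, key="name"):
    """Join FIFA rows with statistics rows on `key`, then group by position."""
    stats_by_key = {row[key]: row for row in stat_rows}
    groups = defaultdict(list)
    for fifa in fifa_rows:
        group = POSITION_GROUP.get(fifa["position"])
        if group is None:
            continue  # unknown position code: skip
        stats = stats_by_key.get(fifa[key], {})
        if not stats and group != "Goalkeepers":
            continue  # outfield players need matching statistics
        groups[group].append({**stats, **fifa})
    return dict(groups)

# Invented sample rows (assumption), only to show the shape of the data.
fifa_rows = [
    {"name": "Player A", "position": "ST", "overall": 84, "potential": 88},
    {"name": "Player B", "position": "CB", "overall": 79, "potential": 85},
    {"name": "Player C", "position": "GK", "overall": 81, "potential": 83},
    {"name": "Player D", "position": "CM", "overall": 77, "potential": 80},
]
stat_rows = [
    {"name": "Player A", "age": 24, "contract_remaining": 3, "Gls": 17},
    {"name": "Player B", "age": 22, "contract_remaining": 2, "Int": 41},
]

by_position = merge_and_split(fifa_rows, stat_rows)
for group, players in by_position.items():
    print(group, [p["name"] for p in players])
```

Joining on the player name alone is fragile in practice (spelling variants, duplicate names), so a real pipeline would likely need a more robust key.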
V. RESULTS

A. Goalkeepers

Goalkeepers usually have a lower transfer value than players in other positions. We found that the features with the most impact on the transfer value of goalkeepers were Overall, Potential, Age, Contract Remaining, GK Reflexes, GK Diving, GK Handling, GK Kicking, and GK Positioning. In football teams, there is usually only one goalkeeper who plays the majority of the games. As a result, the advanced statistics available for goalkeepers are fewer in number and often skewed. Therefore, we considered only FIFA data for goalkeepers, and the players we considered included those from the FIFA 22 dataset across the world. The best performing algorithm was the Gradient Boosting Regressor with an R2 value of 0.82.

The table below illustrates the performance of the models used. The MAE value for this as well as other positions is in million pounds.

Goalkeepers
Model               R2      MAE     MSE
XGB                 0.805   0.736   4.59
Gradient Boosting   0.813   0.73    4.41
LGBM                0.658   0.97    8.08
Random Forest       0.783   0.78    5.70

B. Defenders

For this category, we considered center backs, full backs, and wing backs. The FIFA data we took included Overall, Potential, Composure, Slide Tackle, Stand Tackle, Aggression, Interceptions, and Defensive Awareness. These were found to have a good impact on transfer value. The statistical data we took comprised Age, Contract Remaining, Blocks, Int, Clr, Err, Recov, Fls, and CrdY. These factors contributed to how good a player is defensively. In addition, to highlight the offensive contribution of defenders, we included Gls, Ast, SCA, and CrsPA. The R2 value was found to be 0.805 for the best performing model, XGB.

The table below indicates the performance of the models.

Defenders
Model               R2      MAE     MSE
XGB                 0.805   3.0     23.75
Gradient Boosting   0.788   3.13    25.76
LGBM                0.755   3.28    29.00
Random Forest       0.78    3.24    30.90

The performance of the best model on test data is shown below.

Midfielders
Model               R2      MAE     MSE
XGB                 0.81    3.17    26.73
Gradient Boosting   0.811   3.30    29.62
LGBM                0.783   3.65    37.19
Random Forest       0.751   3.74    34.98

The graphic below shows the best performing model on test data.
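The R2, MAE, and MSE columns reported in this section are standard regression metrics. A minimal pure-Python sketch of how they can be computed from true and predicted transfer values is shown here; the function name and the sample values (in million pounds) are illustrative assumptions, not data from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return (R2, MAE, MSE) for paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot  # undefined if all true values are identical
    return r2, mae, mse

# Invented values (assumption): true vs. predicted transfer value, million pounds.
y_true = [2.0, 10.0, 25.0, 40.0]
y_pred = [3.0, 8.0, 27.0, 36.0]

r2, mae, mse = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f} MAE={mae:.2f} MSE={mse:.2f}")
```

Because MSE squares each error, a few badly mispriced players raise it far more than they raise MAE, which is one reason to read the two columns together. scikit-learn's `r2_score`, `mean_absolute_error`, and `mean_squared_error` compute the same quantities.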
D. Attackers

For attackers, we considered strikers, attacking midfielders, and wingers. The FIFA data we took included parameters like Overall, Potential, Finishing, Crossing, and Dribbling. The statistical data comprised Age, Contract Remaining, Gls, Ast, xG, xA, SoT, Sh/90, SCA, GCA, Press, Att Pen, Min, CrdY, DSucc, and CPA. The best results were obtained using the Gradient Boosting model, with an R2 value of 0.86. The table below shows the performance of the various models used.

Attackers
Model               R2      MAE     MSE
XGB                 0.856   3.57    37.03
Gradient Boosting   0.860   3.56    36.51
LGBM                0.795   4.07    57.84
Random Forest       0.783   3.95    58.87

The performance of the model on test data is given below. The statistics available today are the most clear-cut for attackers, and this is reflected in the model performance.

VI. CONCLUSION

In this paper, we predict the transfer value of footballers using their FIFA 22 data and statistical data. Contrary to previous works, we use extremely detailed statistical measures and include features like a player's contract and age, which also significantly affect the transfer value. FIFA data was publicly available online, and the statistical data was extracted from FBref. We considered the true transfer value of a player to be his transfermarkt.co.uk value. After the data were collected, they were first merged and then split based on player position. Several regression models were built for each position, among which the ones that performed best were the Gradient Boosting algorithm and the eXtreme Gradient Boosting algorithm. This approach returned satisfactory results, with the best model for each position reaching an R2 value between about 0.80 and 0.86. Our work benefits the football community in a couple of ways. Firstly, it allows football websites to provide information about a player's transfer value (based on statistics and FIFA data) on his profile page. Secondly, professional teams can use this method to estimate the transfer value of players, which helps them identify bargains in the market and set a limit on how much to bid for a particular player.

VII. FUTURE SCOPE

The project has massive potential in the future. The amount and variety of stats are ever increasing, and this should have a big positive impact on the project. As of now, finely detailed statistics are available only for the biggest and most popular football competitions. This restricts the number of players we have access to and the degree of specialization we can give to each position on the football pitch. Once this changes, we will have a huge number of players to study, and thus the accuracy of the model and the information it can extract should increase. The video game market is a hyper-competitive one, with each game franchise trying to be the dominant player in the market. As a result, a future where video games have much more detailed metrics than EA Sports' FIFA 22 is easy to envision. This means that projects like ours will be able to use the data for innovative and elegant purposes like transfer value prediction and other analyses.

REFERENCES

[1] Iman Behravan and Seyed Mohammad Razavi, "A novel machine learning method for estimating football players' value in the transfer market" (2020)
[2] Ahmet Talha Yiğit, Barış Samak, and Tolga Kaya, "Football Player Value Assessment Using Machine Learning Techniques" (2020)
[3] Prabhnoor Singh and Puneet Singh Lamba, "Influence of crowdsourcing, popularity and previous year statistics in market value estimation of football players" (2019)
[4] T. Kirschstein and Steffen Liebscher, "Assessing the market values of soccer players – a robust analysis of data from German 1. and 2. Bundesliga" (2019)
[5] Oliver Müller, Alexander Simons, and Markus Weinmann, "Beyond crowd judgments: Data-driven estimation of market value in association football" (2017)
[6] Rade Stanojevic and Laszlo Gyarmati, "Towards data-driven football player assessment" (2016)
[7] Yuan He, "Predicting Market Value of Soccer Players Using Linear Modeling Techniques" (2015)
APPENDIX
The table below shows all the statistical data we considered. Data was obtained from FBref (https://fbref.com/en/).