Player Availability Rating

This document proposes a Player Availability Rating (PAR) tool to help NHL general managers quantify player performance. The PAR uses regression models to predict a player's points per game (PPG) based on characteristics like height, weight, and ice time. It calculates the difference between predicted and actual PPG adjusted based on the team's recent and season-long winning percentages, to indicate a player's trade potential. The PAR metric aims to help GMs evaluate players and make roster decisions throughout a season.

Uploaded by

neal pearson

Player Availability Rating (PAR) - A Tool for Quantifying Skater Performance for NHL General Managers

Shuja Khalid
Department of Computer Science, University of Toronto
arXiv:1811.02885v1 [cs.CY] 15 Oct 2018

November 8, 2018

Abstract

This project aims to assess the performance of various regression models in predicting the performance of hockey players. The measure of performance is chosen to be points scored (the sum of goals scored and assists made) by individual players per game (PPG). This paper thus considers the offensive performance of the players and uses PPG as a metric to assign a value to the player. This predicted value can be used to rank players and is similar to what TSN.ca[1] and NHL.com[5] use to rank players before each hockey season. A combination of physical characteristics, shooting percentage, and usage during games is used as training features. A novel metric, Player Availability Rating (PAR), is proposed, which uses the offensive predictions to predict the availability of a player during a season. These metrics can be valuable in the NHL as they would allow general managers to track a player’s performance and availability during the course of a season.

1 Introduction

The core competency required of a manager is to be able to assess a player’s performance. This is a challenging endeavor as the performance of a player depends on a number of attributes and understandably varies from season to season. The approach adopted by this paper is to create a model that uses only player characteristics such as height, weight, expected time on ice and expected shooting percentage to create a projection for a player. Various regression-based models were created in order to estimate the offensive production of a player in the form of points per game (PPG). Neural nets were found to produce the estimates that were closest to the established baseline. This gives managers and coaches a way to quantify a player’s contribution to the team based on a model that was trained using data accumulated from 2006-2017. This model can also serve as a guide to how players entering the league, for whom previous National Hockey League data is not available, will perform based on their physical attributes and their usage during an 82-game NHL season. The paper takes these projections a step further and proposes a metric that uses the projected player contributions (PPG), their actual performance during the current season (PPG), and the team’s performance (points percentage) to rank players that might be available for trade. This metric can thus serve as a useful tool for any general manager, real or fantasy, to make roster decisions.

2 Process Diagram

[Figure 1: Visual representation of flow of data for producing the PAR rating]
3 Formal Description

The performance of players in the National Hockey League varies from season to season due to a variety of reasons such as overall team performance, player usage, physiological attributes, coaching styles etc. Each of the 31 teams in the league employs an army of scouts responsible for analyzing players over the course of a season, which can be a daunting task. This paper proposes a tool that can be used by general managers to evaluate the performance of players during the course of a season based on their expected and predicted offensive performances, the long-term and short-term performances of their teams, and their usage per game. The algorithm used to determine the performance of the players is presented in Algorithm 1.

Algorithm 1: PAR Algorithm

    Result: PAR (Player Availability Rating)
    PPG_predicted, PPG_actual, PPCG_season, PPCG_recent, X_test, Y_test, X_train, Y_train, w_o = 2;
    method = {'linearRegression', 'k-NN', 'NeuralNets', 'decisionTree', 'randomForest'};
    model_minerror = choose the method that produces the lowest average error;
    while items in X_test do
        Use model_minerror (the model with the lowest calculated error) and recalculate PPG_predicted;
        Extract latest PPG values from NHL.com and assign to PPG_actual;
        Extract team PPCG values (season) from NHL.com and assign to PPCG_season;
        Extract team PPCG values (recent) from NHL.com and assign to PPCG_recent;
        PAR = (PPG_predicted - PPG_actual) / PPCG_season + w_o * (PPG_predicted - PPG_actual) / PPCG_recent;
    end

Each of the models was implemented using existing libraries (scikit-learn) and was optimized by running a grid search over the parameters.

This insight is invaluable as the players on the list above are either severely under-performing, or their teams are not performing well, or a combination of both. These players may be more likely to be surrendered by opposing general managers during trade negotiations and might be considered under-the-radar acquisitions with the potential for very high reward. The formula used to determine this rating is presented below:

$$PAR = \frac{PPG_{predicted} - PPG_{actual}}{PPCG_{season}} + w_o \cdot \frac{PPG_{predicted} - PPG_{actual}}{PPCG_{recent}}$$

where

    PPG_predicted : Predicted points/game for skater
    PPG_actual    : Actual points/game for skater
    PPCG_season   : Percentage of points earned by team during current season
    PPCG_recent   : Percentage of points earned by team during last 10 games
    w_o           : A parameter to control the influence of recent team performance on player availability

The difference between the predicted and actual PPG values for each player is indicative of the player’s performance. A negative value corresponds to the player out-performing expectations whereas a positive value means that the player is under-performing. The short-term and long-term winning percentages of the player’s team can be indicative of the pressure that the opposing general manager might be facing to make a move, either to trade for a player or to trade an under-performing player. Based on other factors such as health and confidence levels, which are challenging to quantify, the player might simply need a change of scenery, and a trade to a different team might be a winning proposition for all parties involved. Regularly using the proposed algorithm can help managers stay on top of such situations.

The PAR estimate captures the essence of existing ratings that are dependent on probabilistic considerations. However, the formulation presented above is not derived from any of these sources. It also does not use a probabilistic method to make predictions.

The PAR estimate uses a combination of neural nets and empirical formulation to quantify the availability of an individual during an NHL season. Such a formulation was not found in the literature. However, a number of sources have attempted to quantify player and team performance based on a probabilistic approach [3]. The most popular of these is TrueSkill [3], a Bayesian rating system that has been applied to other sports such as Basketball [2] and Football [2].
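As a worked illustration of the PAR formula defined in this section, the following minimal Python sketch evaluates it for two made-up skaters. The function name, the dictionary layout, and the sample numbers are hypothetical and are not taken from the paper. The sketch also assumes that the team points percentages are expressed as fractions between 0 and 1, which the paper does not state explicitly.

```python
def player_availability_rating(ppg_predicted, ppg_actual,
                               ppcg_season, ppcg_recent, w_o=2.0):
    """Evaluate the PAR formula from Section 3.

    Assumption: ppcg_season and ppcg_recent are fractions in (0, 1],
    not percentages in (0, 100].
    """
    if ppcg_season <= 0 or ppcg_recent <= 0:
        raise ValueError("team points percentages must be positive")
    # Positive gap: the skater is scoring below the model's projection.
    gap = ppg_predicted - ppg_actual
    return gap / ppcg_season + w_o * gap / ppcg_recent


# Hypothetical skaters (illustrative numbers only, not real NHL data).
skaters = [
    {"name": "Skater B", "pred": 0.60, "actual": 0.75, "season": 0.60, "recent": 0.50},
    {"name": "Skater A", "pred": 0.80, "actual": 0.50, "season": 0.50, "recent": 0.40},
]

for s in skaters:
    s["par"] = player_availability_rating(
        s["pred"], s["actual"], s["season"], s["recent"]
    )

# Highest PAR first: under-performing skaters on struggling teams rise to the top.
ranked = sorted(skaters, key=lambda s: s["par"], reverse=True)
for s in ranked:
    print(f'{s["name"]}: PAR = {s["par"]:.2f}')
# Skater A: PAR = 2.10
# Skater B: PAR = -0.85
```

Skater A is projected at 0.80 PPG but is scoring 0.50 on a team that has struggled recently, so the rating is large and positive. Skater B is out-performing the projection, so the rating is negative.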
4 Related Work

The following sources were consulted during literature review:

i.) Forecasting Success in the National Hockey League using In-Game Statistics and Textual Data [7]: This paper utilizes traditional and advanced statistics for individual players on a team to predict how teams will perform over the course of a season. The core concept of using statistics to determine the cumulative performance of players is similar to the idea presented in this paper. However, PAR makes a point not to use advanced statistics and in their stead makes use of physical player characteristics and their expected usage over the course of an NHL season.

ii.) Estimating the Value of Major League Baseball Players [4]: This paper attempts to quantify the value of players to determine how much they should be paid. The author proposes a formulation that considers a number of features/factors that might determine player value. PAR attempts to consider similar features but only looks at offensive contribution.

iii.) Predicting the Major League Baseball Season [6]: This paper uses neural networks to solve a binary classification problem in the form of wins and losses for baseball teams over the course of a Major League Baseball season. The authors use neural networks along with a large amount of data to make these predictions.

iv.) TrueSkill - A Bayesian skill rating system [3]: This paper uses a probabilistic approach to skill assessment to produce a rating based on the outcome of previously played games. It uses chess rankings to illustrate the approach.

v.) Knowing what we don’t know in NCAA Football ratings: Understanding and using structured uncertainty [2]: This paper uses the TrueSkill method and applies it to evaluate team performance for NCAA football games. The focus is on team performance as opposed to player performance.

The papers reviewed above focus on a probabilistic evaluation of performance. After an exhaustive search of the literature, no papers were found that use non-probabilistic machine learning algorithms to produce real-valued outputs to evaluate player contributions (offensive or defensive). An attempt had been made to do so in baseball but it was limited to the outcome of games based on recent performance [3].

Comparison or Demonstration

In order to demonstrate the effectiveness of neural nets in predicting player performance based on historic data, five models were created using five different regression-based methods. The performance of the neural network was compared against these methods. Table 1 summarizes the error values observed (predicted PPG - actual PPG) for these methods. Table 1 also looks at predictions made by TSN.ca and NHL.com before the start of the 2017-2018 hockey season and compares them to the baseline (current PPG values) for the top 100 players in the league as determined by their PPG.

    Method/Source        Mean     Median
    Neural Nets          0.211    0.188
    Decision Tree        0.222    0.21
    Random Forest        0.215    0.193
    k-NN Regression      0.234    0.21
    Linear Regression    0.245    0.22
    TSN.ca               0.202    0.173
    NHL.com              0.197    0.167

Table 1: Calculated error values comparing predicted and actual PPG values for top 100 scorers in the NHL

The training, validation and test datasets were created by writing scripts that extracted the required data from NHL.com and using an 80-10-10 split. Similarly, additional scripts were created in order to extract baseline data from NHL.com[5] and TSN.ca[1]. Each feature was modified using z-score normalization before training the model.

Based on the results in Table 1, neural nets provided the lowest error values of any method, with linear regression having the worst performance of any method, as expected. Figure 2 illustrates the performance of each of the methods in predicting the top 100 players with the highest PPG values.

[Figure 2: Comparison of various regression methods against baselines for the top 100 players with the highest PPG values in the NHL. Panels: (a) Linear Regression, (b) k-Nearest Neighbors, (c) Decision Trees, (d) Random Forest, (e) Neural Networks]

The top 10 players based on the predicted PPG determined by the neural nets are presented in Table 2. Their current PPG values are also presented as a reference. The difference between their predicted and actual values can be attributed to the current season being only 30% complete. Over the course of the season, the actual PPG values are expected to decrease.

    Player Name          Actual    Predicted
    Nikita Kucherov      1.48      1.18
    Brad Marchand        1.17      1.02
    Claude Giroux        1.07      1
    Connor McDavid       1.17      0.99
    Anze Kopitar         1.17      0.99
    Johnny Gaudreau      1.28      0.98
    John Tavares         1.14      0.97
    Evgeny Kuznetsov     1.06      0.96
    Brayden Point        0.85      0.93
    Mark Scheifele       1.21      0.91

Table 2: Comparison of actual and predicted PPG values for the current top 10 offensive contributors in the NHL

The final step of the PAR algorithm is to apply the PAR formula using the predicted PPG projections, baseline data and team-based statistics. The top 10 players with the highest PAR values are presented in Table 3.

    Player Name          PAR
    Ryan Dzingel         2.29
    Mark Stone           2.07
    Cam Fowler           2.03
    Brandon Montour      1.66
    Tyler Myers          1.64
    Brendan Perlini      1.21
    Nick Foligno         1.14
    Tomas Tatar          1.12
    Dion Phaneuf         1.02
    Gabriel Landeskog    0.98

Table 3: Top 10 players with the highest PAR estimates

Limitations

The presented model provides estimates that are far from the norm for players that are highly skilled. One example of this behavior involves players who are very large (taller than 6’ 3” and heavier than 220 pounds). Over the course of the last 10 years, the majority of such players have traditionally been enforcers (players that are not relied upon to score points). Only a handful of players that meet these criteria are prolific scorers. The model thus assigns low point per game estimates to these types of players. An example of this is Patrik Laine, who was assigned a value of 0.63. This was extremely surprising because his actual performance in his first year in the league far exceeded the estimate (by a factor of 1.5). Similarly, players that are smaller in size (shorter than 5’ 9” and lighter than 170 pounds) are more likely to be assigned much higher point per game values because traditionally such players have scored at very high rates.

One of the most important ways to improve this model is to include more features. These features should be a combination of player characteristics and situational usage. Another possible extension of this project would be to assign monetary values to these players in an attempt to present what their valuations should be at any point during the season. The incorporation of the methodology in the paper by Fields [4] would draw upon various features related to team performance, situational usage and situational performance to present estimates of valuations for players. Such a tool would be invaluable for general managers in the National Hockey League as they attempt to assemble their rosters while working under tight constraints such as the unavailability of funds (for managers of small market teams), and a salary cap enforced by the National Hockey League which limits the amount of money that each team can spend on players.

The prediction criteria defined in the algorithm make it unique as it can also be applied to other sports such as Basketball, Baseball and Football. This is another area that might be worth exploring in the future.
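For readers who want to reproduce the comparison workflow described in the Comparison or Demonstration section, the following minimal Python sketch shows an 80-10-10 split, z-score normalization of a single feature, and the mean and median error summary. It uses only the standard library. All function names are hypothetical, and two details are assumptions the paper does not specify: the split is a random shuffle, and the error is the absolute difference between predicted and actual PPG.

```python
import random
import statistics


def split_80_10_10(rows, seed=0):
    """Shuffle rows and split them into train/validation/test (80/10/10)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def zscore_params(column):
    """Return the mean and population standard deviation of one feature."""
    return statistics.fmean(column), statistics.pstdev(column)


def zscore(column, mean, std):
    """Apply z-score normalization; a constant feature maps to all zeros."""
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def error_summary(predicted, actual):
    """Mean and median of |predicted - actual| (absolute error is an assumption)."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return statistics.fmean(errors), statistics.median(errors)


# Split 100 placeholder row indices.
train, val, test = split_80_10_10(range(100))

# Normalize a made-up height feature (inches).
heights = [70, 72, 74, 76]
mean, std = zscore_params(heights)
heights_z = zscore(heights, mean, std)

# Error summary for the first three rows of Table 2 only (illustrative subset).
mean_err, median_err = error_summary([1.18, 1.02, 1.00], [1.48, 1.17, 1.07])
print(f"mean error = {mean_err:.3f}, median error = {median_err:.3f}")
# mean error = 0.173, median error = 0.150
```

In practice the normalization parameters would typically be computed on the training split and reused for the validation and test splits, so that no information from held-out players leaks into training. The paper does not say which convention was followed.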

Conclusions
The goal of the paper was to assess the viability of using neural nets to predict player performance. The results indicate that neural nets and other regression-based methods can adequately complete this task. The results were compared to actual PPG values as well as to other sources such as TSN.ca[1] and NHL.com[5], which are considered to be top resources for player projections. The project was also extended to estimate the Player Availability Rating (PAR), a novel metric that aims to quantify the availability of a player based on the current performance of the player and the performance of their team. The results indicate that neural nets outperform the other regression-based methods and produce projections comparable to those made by TSN.ca[1] and NHL.com[5].

The author has also launched a website that has adopted the algorithm presented in this paper: gmaiplaybook.com

References
[1] S. Cullen. Statistically speaking: Projected top 300 scorers, 2017.
[2] T. M. D. Tarlow, T. Graepel. Knowing what we don’t know in NCAA football ratings: Understanding and using structured uncertainty. 2014.
[3] G. D. Fatta, G. M. Haworth, and K. W. Regan. Skill rating by Bayesian inference. 2009 IEEE Symposium on Computational Intelligence and Data Mining, 2009.
[4] B. Fields. Estimating the value of Major League Baseball players. 2001.
[5] P. Jensen. Fantasy: Top 250 rankings for 2017-18, 2017.
[6] C. W. R. Jia and D. Zeng. Predicting the Major League Baseball season. 2013.
[7] J. Weissbock and D. Inkpen. Combining textual pre-game reports and statistical data for predicting success in the National Hockey League. Advances in Artificial Intelligence, Lecture Notes in Computer Science, pages 251–262, 2014.
