Predicting The Outcome of NBA Playoffs Using The Naïve Bayes Algorithms
1. Introduction
Since the winter of 1891, when Dr. James Naismith nailed a peach basket to a gym
wall, basketball has evolved into a true American game (NMHOF, 2008). Nearly 270,000 people
attend basketball arenas around the country each game day to watch the best of the best sweat,
hustle, and entertain (Ibisworld, 2008). Along with watching the games, millions of fans are
involved in the ever-growing arena of fantasy basketball leagues and other forms of gambling.
Involvement in these leagues and in gambling creates the desire to know which team will win
before a game is even played. Using algebra to predict the result is not only entertaining for
fans; it may also benefit the predicted champion, because people pay more attention to a strong
team, and that team may in turn receive more income and fame.
Under NBA rules, two teams play at most 7 games against each other in each round of the
playoffs. The first team to win 4 games advances to the next round, and the team that loses 4
games first is eliminated. Of these 7 games, the team with the higher winning rate and higher
rank in the regular season plays 4 at home and 3 away. It is therefore widely held that the
team with the 4 home games is more likely to win the series and advance. In this paper, we
examine the use of the naïve Bayes model as a tool for predicting the success of basketball
teams in the National Basketball Association (NBA). We also investigate whether home advantage
affects playoff outcomes and whether taking it into account improves prediction accuracy.
Finally, we predict the champion of the 2010-2011 NBA season.
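The first-to-4-wins format described above can be illustrated with a short calculation. The sketch below is not part of the method used in this paper: it assumes, purely for illustration, that games are independent and that one team wins each game with the same fixed probability p. The function name series_win_prob is hypothetical.

```python
from math import comb


def series_win_prob(p: float, wins_needed: int = 4) -> float:
    """Probability of winning a first-to-`wins_needed` series.

    Assumption (illustrative only): games are independent and the team
    wins every single game with the same probability p.
    """
    q = 1.0 - p
    total = 0.0
    # The team clinches after suffering k losses, k = 0 .. wins_needed - 1.
    # The final game is a win; the earlier (wins_needed - 1 + k) games
    # contain exactly k losses in any order.
    for k in range(wins_needed):
        total += comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
    return total


if __name__ == "__main__":
    for p in (0.5, 0.55, 0.6):
        print(f"per-game p = {p:.2f} -> series win probability = {series_win_prob(p):.3f}")
```

Under these assumptions a modest per-game edge is amplified over a seven-game series, which is one reason a series winner may be easier to predict than a single game.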
2.1 Data
We obtained our data set from three sources: NBA ESPN, NBA WIKIPEDIA and HOOPS
STATS. We collected data covering the five seasons 2005-2006, 2006-2007, 2007-2008,
2008-2009 and 2009-2010, plus the 2010-2011 season. Figure 1 displays a typical downloaded box
score. Information on the team totals in each game, together with the home/road situation, were
the only features used to conduct the Naïve Bayes analysis. For each season we convert the
information into the form shown below, which is the form our model requires:
Table 1 Winning rates of the top 8 Eastern teams in the 2009-2010 regular season
Eastern team     Total winning rate   Home winning rate   Road winning rate
1. Cleveland     0.744                0.853               0.634
2. Orlando       0.720                0.829               0.609
3. Atlanta       0.646                0.829               0.463
4. Boston        0.610                0.585               0.609
5. Miami         0.573                0.585               0.56
6. Milwaukee     0.561                0.682               0.439
7. Charlotte     0.537                0.756               0.317
8. Chicago       0.5                  0.585               0.414
Table 2 Head-to-head records of the top 8 Eastern teams in the 2009-2010 regular season
The data used in this research for NBA games played in the other seasons are presented in the
database appendix.
We use the Naïve Bayes model to calculate the probability that team i will beat its
opponent, team j, when they meet in the playoffs, based on their regular season records.
In this way, we predict which team will advance in each round, up to the final champion.
For reference, the equation is as follows:
P (player i winning given i and j are playing) =
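As an illustration only, one common way to combine two regular-season winning rates into a head-to-head probability, under a naïve independence assumption, is sketched below. This is an assumed form and not necessarily the equation used in this paper; the function name head_to_head_prob and the symbols p_i and p_j are hypothetical.

```python
def head_to_head_prob(p_i: float, p_j: float) -> float:
    """Estimate P(team i beats team j) from overall winning rates.

    Assumption (illustrative, not taken from the paper): the events
    "i plays like a winner" and "j plays like a loser" are treated as
    independent, and the result is normalised over the two possible
    outcomes of the matchup.
    """
    i_wins = p_i * (1.0 - p_j)   # i wins and j loses
    j_wins = p_j * (1.0 - p_i)   # j wins and i loses
    denom = i_wins + j_wins
    if denom == 0.0:
        # Both rates are 0 or both are 1: the data give no preference.
        return 0.5
    return i_wins / denom


if __name__ == "__main__":
    # Two teams with equal winning rates are rated as a coin flip.
    print(head_to_head_prob(0.695, 0.695))
    # A stronger team is favoured over a weaker one.
    print(head_to_head_prob(0.744, 0.561))
```

With this form, two teams with identical winning rates receive a probability of exactly 0.5, so no winner can be chosen between them from total winning rates alone.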
3.1 Without considering the home-road effect in the results of NBA playoffs
Ignoring the home-road effect on the results of NBA playoffs, we calculate the probability
that each team will beat its opponent, given the schedule of games. We then predict which team
will win and advance to the next round. Finally, we compare the prediction for each matchup in
each round with the real result to measure the accuracy. The results for the 2005-2006 Eastern
playoffs are given as an example in Table 3 and Figure 2; the others are presented in the
database appendix.
In this initial assessment, the accuracy of predicting the outcome of the 2005-2006 Eastern
playoffs is 0.571, and the accuracy over all teams in the 2005-2006 playoffs is 0.667 (shown
in Table 4 below). This result is not very good. A possible reason is that the smaller the
difference between the total winning rates of two teams, the more likely the prediction is to
be wrong. Other models and algorithms may therefore be needed to increase the overall accuracy
of the prediction. However, this model is very simple compared to others, such as the
Bradley-Terry-Luce model and the contest success function. Results of the accuracy analysis
for the 2005-2010 NBA playoffs are shown in Table 4.
Table 4 The accuracy of Predicting the outcome of NBA playoffs without considering the home-road effect
The analysis of the accuracy of predicting the outcome of the 2005-2010 NBA playoffs
shows that predictions made with the Naïve Bayes model are effective. In most seasons the
accuracy is 0.733, although some seasons are better or worse. The average accuracy is 0.747.
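Accuracy here can be read as the fraction of playoff matchups whose winner was predicted correctly. The sketch below is a minimal illustration of that arithmetic. It assumes 7 series per conference and 15 series per postseason, and the counts of correct predictions (4, 10 and 11) are inferred from the reported values rather than stated in the paper. The function name accuracy is hypothetical.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly predicted series, rounded to 3 decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return round(correct / total, 3)


if __name__ == "__main__":
    # Assumed counts, chosen because they reproduce the reported figures.
    print(accuracy(4, 7))    # one conference: 4 of 7 series
    print(accuracy(10, 15))  # whole postseason: 10 of 15 series
    print(accuracy(11, 15))  # whole postseason: 11 of 15 series
```

Because a postseason contains only 15 series, a single additional correct or incorrect prediction changes the accuracy by about 0.067, so differences of that size between seasons or methods should be interpreted with caution.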
3.2 Accounting for the home-road effect in the results of NBA playoffs
We account for the home-road effect on the winning rate and measure the accuracy again, to
determine whether doing so decreases the error rate of the prediction. Results of the accuracy
analysis are shown below in Table 5.
Table 5 The accuracy of predicting the outcome of NBA playoffs considering the home-road effect
The average accuracy of predicting the outcome of the 2005-2010 NBA playoffs when accounting
for the home-road effect is 0.733, which is lower than the 0.747 obtained without it. Further
analysis is given in section 3.3.
We compare the accuracy of the two methods to analyze whether the home-road effect
influences the result of the prediction. The results are given in Figure 3.
[Bar chart: accuracy (vertical axis, 0 to 1) by year (horizontal axis, 2005-2006 to 2009-2010) for two series, "considered" and "unconsidered"]
Fig 3 Comparison of the accuracy of the two methods for predicting the outcome of NBA playoffs
From the bar diagram above, we conclude that the home-road effect does not have enough
influence to improve the accuracy of prediction when using the Naïve Bayes model, and sometimes
it has a negative effect on the accuracy, as in the 2008-2009 and 2006-2007 data. In other
words, it is not necessary to include the home-away effect as a feature in the Naïve Bayes
model to predict the results of the playoffs. Thus, we predict the outcome of the latest data
set, the NBA 2010-2011 playoffs, without considering the home-road effect in the following test.
3.4 Prediction
Using the latest data set, the NBA 2010-2011 regular season record, we apply the Naïve Bayes
model to step through the playoffs and predict which team will win the NBA championship,
without considering the home-road effect. Statistics are presented in Table 6 and the
prediction for the 2010-2011 playoffs is given in Figure 4. Other raw data are presented in
the database appendix.
Western team     Total winning rate   Home winning rate   Road winning rate
San Antonio      0.744                0.878               0.610
LA Lakers        0.695                0.732               0.659
Dallas           0.695                0.707               0.683
Oklahoma City    0.671                0.732               0.610
Denver           0.610                0.805               0.415
Portland         0.585                0.732               0.439
New Orleans      0.561                0.683               0.439
Memphis          0.561                0.732               0.390
Fig 4 Prediction for each round of the 2010-2011 NBA playoffs
So far, every round of the 2010-2011 playoffs except the finals has been played. Compared with
our prediction for each round, we predicted 12 matchups correctly and 2 incorrectly. In fact,
we were not able to predict the winner of the matchup between Dallas and the LA Lakers, because
the total winning rates we calculated for them are equal. According to the prediction, we
infer that the champion of the 2010-2011 NBA playoffs will be the Chicago Bulls; time will tell
whether this is correct. Even if it turns out to be wrong, the predictions overall have a high
accuracy.
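The tie between Dallas and the LA Lakers can be reproduced with a few lines of code. The sketch below assumes a simple decision rule, namely that the team with the higher total regular-season winning rate is predicted to win the series. This rule is an assumption consistent with the description above, not a confirmed implementation of the model. The winning rates are copied from the Western table; the names WEST_TOTAL_WIN_RATE and predict_series_winner are hypothetical.

```python
from typing import Optional

# Total regular-season winning rates for 2010-2011 (from the Western table).
WEST_TOTAL_WIN_RATE = {
    "San Antonio": 0.744,
    "LA Lakers": 0.695,
    "Dallas": 0.695,
    "Oklahoma City": 0.671,
    "Denver": 0.610,
    "Portland": 0.585,
    "New Orleans": 0.561,
    "Memphis": 0.561,
}


def predict_series_winner(team_a: str, team_b: str,
                          rates: dict = WEST_TOTAL_WIN_RATE) -> Optional[str]:
    """Pick the team with the higher total winning rate (assumed rule).

    Returns None when the two rates are equal, because the rule then
    has no basis for preferring either team.
    """
    rate_a, rate_b = rates[team_a], rates[team_b]
    if rate_a == rate_b:
        return None
    return team_a if rate_a > rate_b else team_b


if __name__ == "__main__":
    print(predict_series_winner("Oklahoma City", "Denver"))
    print(predict_series_winner("Dallas", "LA Lakers"))  # equal rates: None
    # 12 correct predictions out of 14 matchups, as reported above.
    print(round(12 / 14, 3))
```

A tie-breaking feature, such as the head-to-head regular season record, would be needed to make a prediction in the undecided case.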
4. Conclusion
When we use the Naïve Bayes model to predict NBA results, we only need the winning rate as a
feature. Under this condition, the algorithm is effective and we are able to predict which team
is more likely to advance to the next round of the playoffs. However, it cannot predict the
result of a single game. In the future, to increase the accuracy of prediction, we plan to use
Neural Networks as the model, and to construct and discover new features that more accurately
capture the effects of real game situations on the winning/losing margin of games.
References
[1] Ralf Herbrich, Tom Minka, Thore Graepel. A Bayesian Skill Rating System. Advances in
Neural Information Processing Systems 20, MIT Press, January 2007
[2] Bernard Loeffelholz, Earl Bednar, Kenneth W. Bauer. Predicting NBA Games Using Neural
Networks. Journal of Quantitative Analysis in Sports, Vol. 5 (2009), Iss. 1, Art. 7
[3] Kenneth J. Koehler, Harold Ridpath. An Application of a Biased Version of the
Bradley-Terry-Luce Model to Professional Basketball Results. Journal of Mathematical
Psychology 25, 187-205 (1982)
[4] Jia Hao. An Empirical Study of Contest Success Functions: Evidence from the NBA.
Department of Economics, University of California, Irvine, CA 92697-5100, USA. October 17,
2007