Winning Games in NCAA Men's Lacrosse: A Logistic Regression Analysis
Abstract
In sports, statistics are routinely gathered to measure outcomes of players and teams. Using
these statistics with quantitative methods to predict sporting outcomes is increasingly popular
and respected across varying levels of many sports. In collegiate lacrosse, over 20 team
statistics are gathered for every game. Which of these statistics are important in determining
the actual outcome of each game? The purpose of this paper is two-fold. The first aim is to identify
the statistics (if any) that are significant determinants of winning or losing a collegiate lacrosse
game in the NCAA. The second is to discuss coaching strategies that reflect the results of the
analysis. The method used to identify significant predictors was logistic regression. This
method was chosen due to the nature of the response variable, which is dichotomous (0=loss,
1=win). The data for this analysis include reported statistics from all NCAA Division I Men's
Lacrosse 2013/2014 games (n=512 games). The sample data set was split for model building
(60%) and model validation (40%) to check robustness. Independent variable types include
nominal and interval. Variable selection was done using the stepwise regression procedure.
The quantitative analysis includes a discussion of the significant variables, their individual effects on the
outcome of the game, and model performance against the validation data set. This analysis can
benefit coaches, fans, and rule-makers of the sport, as it offers a unique
perspective on the rising and evolving game of lacrosse.
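
The workflow described above (a 0/1 win indicator, a 60/40 split into model-building and validation sets, and a logistic regression fit) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the NCAA game statistics, the two predictors (`shot_diff`, `faceoff_diff`) are hypothetical names, and the model is fit by plain gradient descent with no stepwise variable selection, so it is not the procedure used in this paper.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_games(n, rng):
    """Generate made-up games; these are NOT real NCAA statistics."""
    games = []
    for _ in range(n):
        shot_diff = rng.gauss(0, 1)     # hypothetical standardized predictor
        faceoff_diff = rng.gauss(0, 1)  # hypothetical standardized predictor
        p_win = sigmoid(1.5 * shot_diff + 0.5 * faceoff_diff)
        win = 1 if rng.random() < p_win else 0  # response: 0 = loss, 1 = win
        games.append(([shot_diff, faceoff_diff], win))
    return games


def split(games, build_fraction, rng):
    """Shuffle, then split into model-building and validation sets."""
    shuffled = games[:]
    rng.shuffle(shuffled)
    cut = int(round(build_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def predict(w, x):
    """Estimated win probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(data, lr=0.1, epochs=500):
    """Fit coefficients by batch gradient descent on the log-loss."""
    w = [0.0] * (len(data[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(data) for wj, g in zip(w, grad)]
    return w


def accuracy(w, data):
    """Share of games classified correctly at a 0.5 probability cutoff."""
    hits = sum((predict(w, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
games = make_synthetic_games(512, rng)  # same n as the abstract, fake data
build, valid = split(games, 0.60, rng)  # 60% building, 40% validation
w = fit_logistic(build)
acc = accuracy(w, valid)
odds_ratios = [math.exp(wj) for wj in w[1:]]  # effect of +1 unit on the odds

print("coefficients:", w)
print("odds ratios:", odds_ratios)
print("validation accuracy:", acc)
```

Exponentiating a fitted coefficient gives the multiplicative change in the odds of winning for a one-unit increase in that predictor, which is one common way to report the individual effect of each variable. Accuracy on the held-out 40% then indicates how well the fitted model carries over to games it has not seen.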