Going Deep: Models for Continuous-Time Within-Play Valuation of Game Outcomes in American Football with Tracking Data
Abstract
Continuous-time assessments of game outcomes in sports have become increasingly common, complex, and important in the last decade. But in American football, only discrete-time estimates of the value of plays and game situations were possible until recently, since the most advanced public football datasets were recorded at the play-by-play level. While measures such as expected points and win probability are useful for evaluating football plays and game situations, there has been no research into how these values change throughout the course of a play. In this work, we make two main contributions: First, we provide a general framework for continuous-time within-play valuation in the National Football League (NFL) using the NFL's Next Gen Stats player and ball tracking data. Our framework incorporates several modular sub-models, so that other recent work involving player tracking data in football can be easily incorporated into our framework. Second, we construct a ball-carrier model, which estimates how many yards the ball-carrier will gain from their current position, conditional on the locations and trajectories of the ball-carrier, their teammates, and their opponents. We test several modeling approaches for the ball-carrier model, and ultimately find that a long short-term memory (LSTM) recurrent neural network outperforms alternative approaches. At each moment of each play, we use the LSTM to continuously update the ball-carrier model's estimate of the yards gained from the current position, which in turn determines the estimated end-of-play yard line. Then, we use the estimated end-of-play yard line as input to a between-play model for game situation value, such as the expected points or win probability added models from Yurko et al. (2019). This yields an estimate of within-play value in these terms. Our research has several key benefits: The framework is adaptable, so that any measure of play value (or any model for expected points or win probability) can be used.
The framework is modular, so that (for example) existing models for pass attempt outcomes or quarterback decision-making can be applied within this framework. Finally, the fully-implemented framework will allow for continuous-time assessment of all 22 players on the field, which was never before possible at such a granular level.
Keywords: football, recurrent neural networks, expected points, win probability, player tracking data.
1 Introduction
Quantitative analyses of sports have become increasingly complex in the last decade, mostly due to the advent of player and object tracking data across most major sports. Tracking data captures the position and trajectory of the athletes and objects of interest (e.g. balls, pucks, etc.) on the playing surface for a given sport. Statistical analysis of tracking data in sports has been an increasingly popular area of research in recent years; we encourage interested readers to consult the review paper on this topic by Gudmundsson and Horton (2016) for a detailed summary of the work in this area.
In this work, we focus on a particular but important area of player tracking data analysis: continuous-time valuation of game outcomes, in our case for American football. Figure 1 provides a visual representation of this idea, showing how the expected points (A) and win probability (B) change continuously in reaction to on-field events throughout the course of a 47-yard touchdown run by Cordarrelle Patterson.
Below, we provide a brief overview of discrete-time valuation of game outcomes in football, continuous-time valuation of game outcomes in all sports, and continuous-time valuation of game outcomes in football specifically.
1.1 Previous Work: Discrete-Time (Play-by-Play) Evaluation of Football Game Outcomes
Commonly, there are two classes of models for discrete-time evaluation of game outcomes in football: models for expected points, and models for win probability. Models for expected points seek to answer the question: How many points is the current game situation worth, in expectation, conditional on the features of that game situation (e.g. down, distance, yard line, score differential, time remaining, etc.)? Models for win probability ask a fundamentally different question: How likely is it that the possession team will win the game, conditional on the features of that game situation? Yurko et al. (2019) provide an overview of these play-valuation frameworks, including a review of prior approaches for building these models, new approaches for building these models, and examples of how these models and their derived metrics can be used to evaluate individual players and teams. These models are typically estimated at the play-by-play level (between plays), since this is the finest level of granularity at which datasets are available. However, there has been no work to date studying how valuation of football game outcomes evolves within plays.
1.2 Previous Work: Continuous-Time Models for Game Outcomes in Sports
Although tracking data is not technically collected in continuous time (most systems track the locations and trajectories of athletes and objects of interest at rates of 10 to 25 Hz), it is fundamentally different from play-by-play or event-level datasets. In particular, the unit of interest in play-by-play or event-level data is a single (discrete) play or event, while the units of interest in tracking data are the continuously changing locations and trajectories of players and objects on the playing surface.
Using tracking data, several approaches exist for continuous-time modeling of game outcomes in sports. In basketball, Cervone et al. (2014) and Cervone et al. (2016) provide models for expected possession value (EPV), which is a continuous-time estimate of expected points scored by the team in possession during a single basketball possession, conditional on the locations and trajectories of players (and the ball). The authors use a two-level Markov chain approach to do this. First, they model the competing hazards of (discrete) possession-changing events (e.g. passes, shot attempts, turnovers). Second, they model (continuous) player movement on the court. These two models, each of which conditions on the locations and trajectories of the players and the ball, are combined hierarchically to estimate EPV at each moment.
In soccer, Link et al. (2016) quantify the performance of attacking teams in terms of their probability of scoring, providing continuously updating estimates of the probability of a goal being scored at each moment throughout the course of a possession. Fernández et al. (2019) use deep learning to estimate EPV in soccer. They take a multi-level approach similar to Cervone et al. (2016), where discrete-time estimates of "expected goals" (describing the likelihood of a shot resulting in a goal, if taken), "passing value" (describing the value, in terms of expected goals, of a pass), and "drive value" (describing the value, in terms of expected goals, of a drive to the net) are combined with continuous-time estimates of action likelihood (shot, pass, or drive) to provide an overall, continuous-time measure of EPV. Each of the sub-models in this approach conditions on the locations and trajectories of the players and the ball.
Observant readers will note several similarities between our approach and the approaches of Cervone et al. (2016) and Fernández et al. (2019): combining discrete-time and continuous-time models, continuously estimating the value of game outcomes within plays, and using the resulting metrics to quantify the value added by individual athletes.
1.3 Previous Work: Continuous-Time Models for Football
In December 2018, the National Football League (NFL) temporarily made public a subset of player and ball-tracking data from Weeks 1–6 of the 2017 season for its inaugural "Big Data Bowl" competition. Although the data has since been taken down, several authors have contributed interesting work to the literature using this data.
Burke (2019) uses a deep learning approach, DeepQB, to model outcomes of the passing game. In different variants of this model, the author uses DeepQB to model each receiver's target probability, the pass outcome probability (complete, incomplete, interception), and the expected yards gained. Each of these variants of DeepQB can be incorporated into the general framework for within-play valuation of game outcomes that we provide in this paper.
Deshpande and Evans (2019) provide innovative statistical models for the hypothetical completion probability of a pass. The authors use counterfactual analysis of within-play features to impute upstream and downstream features like the time at which the ball will arrive at the targeted receiver. This model can also be incorporated into the general framework for within-play valuation of game outcomes that we provide in this paper.
Several other authors have undertaken interesting research topics using the NFL-provided tracking data. For example, Chu et al. (2019) use mixture modeling to automatically identify, cluster, and characterize route types of receivers. Similarly, Sterken (2019) uses a convolutional neural network to classify the route types of receivers. Dutta et al. (2019) use clustering models to provide unsupervised, probabilistic annotations for the coverage type of defensive backs. Haar (2019) provides an exploratory analysis of NFL passing plays. These works all involve improving upon the existing league-provided tracking data by providing additional information that can be estimated from the underlying player locations and trajectories. However, they do not attempt to model game outcomes, so they are of limited relevance to this paper.
1.4 Our Contributions
Our paper makes two main contributions. First, we provide a general framework for continuous-time within-play valuation of game outcomes in the NFL, using the league-provided tracking data. Our framework, described in Section 3, incorporates several modular sub-models, so that the recent work involving player tracking data in football described above can be easily incorporated into our framework.
Second, we construct a novel ball-carrier model, which estimates the yards gained from a ball-carrier's current position (and thus the end-of-play yard line), conditional on the locations and trajectories of the ball-carrier, their teammates, and their opponents. We find that long short-term memory (LSTM) recurrent neural networks outperform alternative approaches for this modeling task.
We update the LSTM's predictions at each frame of the tracking data, continuously revising our estimate of the yards gained from the ball-carrier's current position, and we use the corresponding estimated end-of-play yard line as input to the discrete-time (between-play) models for game situation value (expected points and win probability) from Yurko et al. (2019). As a result, we obtain a continuous-time estimate of within-play value in terms of expected points and/or win probability on rushing plays. We provide examples of these within-play valuations of game outcomes in Section 5, and we demonstrate how changes in within-play valuations of game outcomes can be used for player evaluation.
Our research has several key benefits: First, the framework is adaptable, so that any measure of play value (or any model for expected points or win probability) can be used. Second, the framework is modular, so that (for example) any model for pass attempt outcomes or quarterback decision-making can be substituted into this framework in place of the approach we use here. For example, one could use the models from Burke (2019) or Deshpande and Evans (2019) in the appropriate places of the framework described in Section 3. Finally, the fully-implemented framework will allow for continuous-time assessment of off-ball player movement, quarterback decision-making, ball-carrier value added, receiver value added, blocking value added, defensive player value added, and many other evaluative tools that were never before possible at such a granular level.
2 Player and Ball Tracking Data
In December 2018, the NFL became the first North American professional sports league to release a portion of their tracking data to the public when it temporarily made available a subset of this data from Weeks 1–6 of the 2017 season for the inaugural "Big Data Bowl" competition. (The NFL ran a separate competition involving analyzing tracking data for punts; however, the data made available for that competition only covered punt plays, and thus is not relevant for this paper.)
The NFL's tracking data is collected as follows: Two radio frequency identification (RFID) chips are placed in each player's shoulder pads (and in the ball). The RFID chips emit a signal to sensors in each stadium, which triangulate the location of the chip on the field. The data is collected at a rate of 10 Hz, so that the on-field location, speed, and angle of each player (and the ball) are recorded 10 times per second. Event annotations (e.g. ball snapped, first contact, pass thrown, etc.) are recorded by the NFL for each play. In total, the dataset contains 1,075,720 unique frames (not counting frames separately for each player and ball) across 14,167 plays, each of which records the locations and trajectories (speed, angle) of all 22 players (and the ball) on the field.
Table 1 shows an example of this data for a 47-yard TD run by WR Cordarrelle Patterson, which occurred in a Week 6 game between the Los Angeles Chargers and Oakland Raiders in 2017. Four frames from this play are shown in Figure 2, displaying the coordinates of the offense (blue), defense (orange), and the ball-carrier (black) at particular events in the play.
frame.id  x  y  s  dir  event  displayName 
24  60.64  29.70  7.55  175.34  handoff  Cordarrelle Patterson 
25  60.77  28.94  7.61  177.10  NA  Cordarrelle Patterson 
⋮  ⋮  ⋮  ⋮  ⋮  ⋮  ⋮ 
44  55.20  14.62  8.92  226.45  first_contact  Cordarrelle Patterson 
⋮  ⋮  ⋮  ⋮  ⋮  ⋮  ⋮ 
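To make the 10 Hz sampling concrete, the following sketch (in Python; our own processing pipeline is not shown here) illustrates how the per-frame distance traveled and an implied speed follow from successive positions. The `Frame` class and helper functions are our own illustration, built from the two Table 1 rows above.

```python
from dataclasses import dataclass
import math

# A minimal sketch of one row of the tracking data; field names mirror
# Table 1 (frame.id, x, y, s, dir). This helper is our own illustration,
# not part of the NFL's data pipeline.
@dataclass
class Frame:
    frame_id: int
    x: float          # yards along the field
    y: float          # yards across the field
    s: float          # speed in yards/second, as recorded by the NFL
    direction: float  # angle in degrees

FPS = 10  # the data is sampled at 10 Hz

def distance_traveled(prev: Frame, curr: Frame) -> float:
    """Euclidean distance (yards) covered between consecutive frames."""
    return math.hypot(curr.x - prev.x, curr.y - prev.y)

def implied_speed(prev: Frame, curr: Frame) -> float:
    """Speed implied by successive positions, in yards/second."""
    return distance_traveled(prev, curr) * FPS

# Two frames from the Patterson run in Table 1 (handoff and the next frame):
f24 = Frame(24, 60.64, 29.70, 7.55, 175.34)
f25 = Frame(25, 60.77, 28.94, 7.61, 177.10)
print(round(implied_speed(f24, f25), 2))  # close to the recorded s of 7.61
```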
This data can easily be joined to existing play-by-play data from the NFL's API (e.g. via the nflscrapR package), which contains additional information about each play (Horowitz et al., 2017). For the models in Section 4, we identified all ball-carrier sequences for running plays, which include designed runs and QB scrambles. While the tracking data records the location of the ball in addition to the players, it does not identify which player is the ball-carrier in a particular frame. We first identified the ball-carriers for every type of play (pass attempts, runs, returns, etc.) based on the information available from the NFL's API via nflscrapR, which denotes who was directly involved in each play. Given the roles a player can have (passer, runner, receiver, interceptor, or returner), we used the provided event annotations to determine when a player became the ball-carrier. Since we focus our attention on running plays in this manuscript, we mark the beginning of the ball-carrier sequence as the moment when the runner received the ball via a handoff, lateral, or direct snap (all snaps are included for QB runs). The end of the ball-carrier sequence was marked when the player was tackled, ran out of bounds, fumbled, or scored a touchdown. We excluded all plays missing the necessary information from the NFL API, as well as plays where the snap was missing in the tracking data, and any ball-carrier sequences where either the starting or ending event was missing. After further preprocessing for the covariates described in Section 4, our final modeling dataset consisted of 153,184 frames from 4,447 unique ball-carrier sequences on running plays. Figure 3(A) displays the distribution of ball-carrier sequence lengths, revealing that the majority of ball-carrier sequences are between two and five seconds long, while Figure 3(B) displays the observed change in field position from the ball-carrier's current location, the quantity we model as discussed in Section 3.4.
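The event-based boundary logic described above can be sketched as follows. The event strings and the helper function are illustrative stand-ins, not the NFL's actual annotation labels.

```python
# A sketch of locating a ball-carrier sequence on a running play from event
# annotations: the sequence starts at a handoff, lateral, or (for QB runs)
# the snap, and ends at a tackle, out-of-bounds, fumble, or touchdown.
END_EVENTS = {"tackle", "out_of_bounds", "fumble", "touchdown"}

def ballcarrier_sequence(events, qb_run=False):
    """Return (start_frame, end_frame) for the ball-carrier sequence, or
    None if either boundary event is missing (such plays were excluded)."""
    start_events = {"handoff", "lateral"}
    if qb_run:
        start_events.add("ball_snap")  # all snaps are included for QB runs
    start = end = None
    for frame_id, event in events:
        if start is None and event in start_events:
            start = frame_id
        elif start is not None and event in END_EVENTS:
            end = frame_id
            break
    return (start, end) if start is not None and end is not None else None

# Example: handoff at frame 24, tackle at frame 60.
play = [(10, "ball_snap"), (24, "handoff"), (44, "first_contact"), (60, "tackle")]
print(ballcarrier_sequence(play))  # (24, 60)
```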
3 A Framework for Continuous-Time Play Value in Football
Our approach for providing continuous-time within-play valuations involves several key pieces, which we combine via the framework presented in Section 3.3. We first discuss models for evaluating game situations at a discrete level between each play (Section 3.2). Next, we describe several sub-models for computing various within-play quantities that comprise the rest of our within-play valuation framework: a ball-carrier model (Section 3.4), a quarterback decision model (Section 3.5), a target probability model (Section 3.6), an incompletion probability model (Section 3.7), and a catch probability model (Section 3.8).
3.1 Notation
Here, we summarize the notation used in the rest of this section, for easy reference.

- Let $t$ be some time between the start (i.e. the snap) and end of a play.
- Let $Y$ be a random variable representing the yards gained from the ball-carrier's current position on the field, and let $Y^{(\text{line})}$ be the corresponding end-of-play yard line.
- Let $Z_t$ be a data structure representing the locations and trajectories of all players and the ball from the start of the play until time $t$.
- Let $\mathcal{F}_t^{(Z)}$ be some filtration of the locations and trajectories of all players and the ball from the start of the play until time $t$, borrowing notation from Cervone et al. (2016).
- Let $\mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$ be the expected yards gained from the ball-carrier's current position, and $\mathbb{E}[Y^{(\text{line})} \mid \mathcal{F}_t^{(Z)}]$ be the corresponding expected end-of-play yard line.
- Let $T_{ij}$ be a binary random variable describing whether receiver $j$ was targeted ($T_{ij} = 1$) or not ($T_{ij} = 0$) on play $i$.
- Let $I_i$ be a binary random variable describing whether a pass on play $i$ is incomplete ($I_i = 1$) or caught by an offensive or defensive player ($I_i = 0$).
- Let $C_{ik}$ be a binary random variable describing whether player $k$ caught the ball ($C_{ik} = 1$) or not ($C_{ik} = 0$), where $k$ represents one of the 16 players who can catch a pass (five eligible offensive receivers and 11 defenders).
- Let $P(D_i \mid \mathcal{F}_t^{(Z)})$ be a probability mass function over the set of decisions a QB can make: $\mathcal{D} = \{\text{throw away},\ \text{run/sack},\ \text{pass}\}$.
- Let $P(T_{ij} \mid \mathcal{F}_t^{(Z)})$ be a probability mass function describing the likelihood that receiver $j$ is targeted on play $i$.
- Let $P(I_i \mid T_{ij} = 1, \mathcal{F}_t^{(Z)})$ be a probability mass function describing the outcome (incomplete or caught) of a pass on play $i$ targeted to receiver $j$.
- Let $P(C_{ik} \mid I_i = 0, \mathcal{F}_t^{(Z)})$ be a probability mass function describing whether player $k$ caught the ball ($C_{ik} = 1$) or not ($C_{ik} = 0$).
3.2 Estimating Between-Play Value
In Section 1.1, we described prior approaches for estimating between-play value in football. Here, we posit that no additional information from the tracking data described in Section 2 will influence the between-play valuations of a football game, regardless of which model for between-play valuation is used. That is, the value of a game situation between when the previous play ends and the next play begins is a function of only the factors that are observable between plays (e.g. down, yards to go, yard line, score, time remaining, timeouts remaining, etc.); these values are conditionally independent of any information that can be gathered from within-play tracking data.
Intuitively, this makes sense: If the home team has possession of the ball on 3rd down with 2 yards to go at the opponent's 26-yard line, we should assign the same value to that situation regardless of how the team got there (e.g. via a lucky catch in the middle of the field vs. an open catch near the sideline).
Because of this, it is not necessary to develop new models for between-play value using tracking data. One benefit of this is that any model for between-play value can be substituted into this piece of our framework, without affecting any other piece of the framework. For the remainder of this paper, we use the expected points and win probability models from Yurko et al. (2019) for this purpose, since they are reproducible, publicly available, well-calibrated, and interpretable in terms of game outcomes.
3.3 Framework for Continuous-Time Play Value
Given an appropriate model for between-play value, our goal is now to model the features that are used as input to the between-play model. From Yurko et al. (2019), these features include the down, yard line, yards to go, score differential, and other minor factors. (Other factors may include the time remaining, which can be estimated using common-sense methods; the timeouts remaining for each team, which do not change within plays; and indicators that are direct functions of the yard line or the time remaining.) We notice that the down, yard line, yards to go, and score differential for the next play are each functions of the yard line at which the current play ended. As such, in order to update the estimates of between-play value, we only need to estimate the yard line at which the current play ends, and then update the other between-play variables accordingly.
Our framework for providing continuously-updating within-play valuations is organized as follows:
Rushing Plays: Model $\mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$, the expected yards gained from the ball-carrier's current position.
  - Obtain the associated expected end-of-play yard line, $\mathbb{E}[Y^{(\text{line})} \mid \mathcal{F}_t^{(Z)}]$, through linearity of expectations ($\mathbb{E}[Y^{(\text{line})} \mid \mathcal{F}_t^{(Z)}] = \mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$ + [player's current yard line]).
  - Use this quantity as input into the chosen play value model from Section 3.2, along with common-sense updates to the other covariates used in the play value model (e.g. increment the down or reset it to 1, adjust the time remaining, update the score, etc.) at the end of the play. (For simplicity, we do not model rare events like fumbles or laterals within the ball-carrier model; this is discussed in depth in Section 6.)

Passing Plays: Model the QB's decision probabilities, $P(D_i \mid \mathcal{F}_t^{(Z)})$:
  - Throw Ball Away:
    - The play ends at the play's original yard line.
    - Update the covariates for the play value model accordingly (e.g. increment the down, adjust the time remaining).
  - Run / Sack:
    - Use the ball-carrier model.
    - Follow the same procedure used for rushing plays.
  - Pass: Model $P(T_{ij} = 1 \mid \mathcal{F}_t^{(Z)})$, each receiver's target probability on play $i$.
    - Normalize these probabilities at each time $t$. (We suggest the use of Softmax normalization here, to handle rare cases where the estimated target probabilities are all 0.)
    - For each receiver $j$: Model $P(I_i = 1 \mid T_{ij} = 1, \mathcal{F}_t^{(Z)})$, the incompletion probability of a pass on play $i$.
      - Incomplete: The play ends; update the covariates for the play value model accordingly (e.g. increment the down, adjust the time remaining, maintain the same yard line).
      - Caught: Model $P(C_{ik} = 1 \mid I_i = 0, \mathcal{F}_t^{(Z)})$, for $k$ as each of the 5 offensive receivers and 11 defenders.
        - Normalize these probabilities so that they form a valid probability distribution over the possible pass-catchers.
        - For each potential pass-catcher: Assume they are the ball-carrier, and input the current situation into the ball-carrier model, following the same procedure used for rushing plays.
The above framework is illustrated in Figure 4. The predictions from every model are updated at each time $t$ throughout the play, and (given the play type) can be combined to obtain an overall expected end-of-play yard line. For rushing plays, the expected end-of-play yard line is directly estimated. For passing plays, each possible node on the decision tree in the framework above has two pieces of information:

- The node's probability of being reached, which is computed using the estimated probabilities at each step/split in the tree.
- The node's expected end-of-play yard line, since each node eventually ends with the ball-carrier model's estimate of the yards gained (or ends without a ball-carrier, in the case of an incompletion or throw-away).

These two pieces of information are easily combined across all nodes into a single estimate of the expected end-of-play yard line.
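As a toy illustration of this combination, suppose a play has four terminal nodes; the overall estimate is simply the probability-weighted average of the per-node expected yard lines. The probabilities and yard lines below are made up for illustration; in the framework each would come from its own sub-model.

```python
# Each tuple is (probability of reaching this terminal node,
#                expected end-of-play yard line at that node).
nodes = [
    (0.05, 65.0),  # throw away: play ends at the original yard line
    (0.15, 62.0),  # run/sack: ball-carrier model estimate
    (0.48, 74.0),  # pass caught: ball-carrier model estimate
    (0.32, 65.0),  # incompletion: original yard line
]

# The node probabilities must form a valid distribution.
assert abs(sum(p for p, _ in nodes) - 1.0) < 1e-9

# Probability-weighted average across all terminal nodes.
expected_yard_line = sum(p * y for p, y in nodes)
print(round(expected_yard_line, 2))  # 68.87
```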
After we estimate the end-of-play yard line, we can easily determine the additional covariates in the play value model from Section 3.2. For example, the updated down number and yards to go depend only on the previous yards to go and the yards gained on the play. Similarly, the possession team is easily determined, since the pass-catcher is either on the offensive or defensive team, and turnovers on downs occur only if the yards gained on the play are less than the previous yards to go.
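A minimal sketch of these common-sense updates, under a deliberately simplified ruleset (no penalties, scores, or changes of possession other than turnover on downs), might look like:

```python
# Derive the next play's down and distance from the previous down,
# yards to go, and the (estimated) yards gained. This is an illustrative
# simplification of the covariate updates, not the full NFL rulebook.
def update_situation(down, ydstogo, yards_gained):
    if yards_gained >= ydstogo:  # gained at least the line to gain: first down
        return {"down": 1, "ydstogo": 10, "turnover": False}
    if down == 4:                # failed 4th down conversion: turnover on downs
        return {"down": 1, "ydstogo": 10, "turnover": True}
    return {"down": down + 1, "ydstogo": ydstogo - yards_gained, "turnover": False}

print(update_situation(3, 2, 5))  # conversion: 1st and 10
print(update_situation(3, 2, 1))  # short of the sticks: 4th and 1
print(update_situation(4, 2, 1))  # failed 4th down: turnover on downs
```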
3.4 Ball-Carrier Model
First, we model $\mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$, the expected yards gained by the ball-carrier from their current position on the field, conditional on the team in possession and the locations and trajectories of all 22 players on the field (including the ball-carrier).
This ball-carrier model is the most important model in our continuous-time play value framework, because (1) it is the only model used for all rushing plays, and (2) all non-incomplete passing plays require the estimation of the yards gained by the ball-carrier (QB, receiver after catching the ball, defender after intercepting the ball, etc.) from the current position on the field.
Of key importance, only a single model is needed, and this model can be used for any situation in which a player is carrying the ball (with no intent to pass). In other words, our framework requires a single model for all of the following ball-carrier situations:
- a running back on rushing plays
- a quarterback on scrambles or designed quarterback rushes
- a wide receiver on end-arounds, reverses, etc.
- a pass-catcher after that player catches the ball (comprising both offensive players who catch a pass and defensive players who intercept a pass)
We experiment with several implementations of this model for rushing plays, described in Section 4. Once we estimate $\mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$, we can easily obtain an estimate of the end-of-play yard line, $\mathbb{E}[Y^{(\text{line})} \mid \mathcal{F}_t^{(Z)}]$, by adding the ball-carrier's current yard line to $\mathbb{E}[Y \mid \mathcal{F}_t^{(Z)}]$, due to linearity of expectations.
3.5 Quarterback Decision Model
For passing plays, we must model the decision that a quarterback will make. Specifically, on a given passing play, the quarterback has three possible decisions, described by the set $\mathcal{D} = \{d_1, d_2, d_3\}$, where:
- $d_1$: Throw the ball away
- $d_2$: Run (or be sacked)
- $d_3$: Pass to a receiver

Let $P(D_i \mid \mathcal{F}_t^{(Z)})$ be a probability mass function for the decision $D_i$ made by the quarterback on play $i$, a passing play, conditional on the locations and trajectories of all players and the ball over the course of the play up until time $t$. $D_i$ follows a multinomial distribution over the set $\mathcal{D}$.
We leave the implementation of this model as a task for future work. Possible methods for implementing this model include recurrent neural networks with a multinomial response, multinomial logistic regression, or decision tree frameworks like random forests (Breiman, 2001) or XGBoost (Chen and Guestrin, 2016).
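As one hedged illustration of the multinomial approach suggested above, a softmax over linear scores of hand-picked features could look like the following. The feature names and weights are invented placeholders, not fitted values; in practice they would be learned from tracking-data features.

```python
import math

# A minimal sketch of a multinomial (softmax) decision model over the
# three QB decisions. Weights here are arbitrary illustrations.
DECISIONS = ["throw_away", "run_sack", "pass"]

def softmax(scores):
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def qb_decision_probs(features, weights):
    """features: dict of named covariates; weights: one dict per decision."""
    scores = [sum(w.get(k, 0.0) * v for k, v in features.items()) for w in weights]
    return dict(zip(DECISIONS, softmax(scores)))

# Hypothetical within-play features at time t:
features = {"time_since_snap": 2.5, "nearest_rusher_dist": 3.0}
weights = [
    {"time_since_snap": 0.8, "nearest_rusher_dist": -0.5},  # throw away
    {"time_since_snap": 0.4, "nearest_rusher_dist": -0.9},  # run/sack
    {"time_since_snap": -0.2, "nearest_rusher_dist": 0.6},  # pass
]
probs = qb_decision_probs(features, weights)
print(probs)  # a valid probability distribution over the three decisions
```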
3.6 Pass Target Probability Model
For passing plays where the QB's decision is to pass (rather than run, be sacked, or throw the ball away), we must model each receiver's target probability, $P(T_{ij} = 1 \mid \mathcal{F}_t^{(Z)})$. Since $T_{ij}$ is a binary response variable, there are many suitable methods for implementing this model.
Importantly, when training this model, each play in the tracking dataset should be replicated five times (once for each possible targeted receiver on the offensive team), and each replicated play's explanatory and response variables should be updated to be with respect to the receiver in question. That is, if receiver $j$ is targeted on play $i$, then $T_{ij} = 1$, and $T_{ij'} = 0$ for every other receiver $j'$. Similarly, the explanatory variables will be computed with respect to receiver $j$.
Once the target probability is calculated for each of the five receivers, these five quantities must be Softmax-normalized so that they form a valid probability distribution over the space of possible targeted receivers.
We leave the implementation of this model as a task for future work.
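The Softmax normalization step described above is straightforward to sketch. The five raw probabilities below are illustrative values, not output from any fitted model.

```python
import math

# Softmax-normalize five per-receiver target probabilities (from any binary
# classifier) into a valid distribution over the eligible receivers.
def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

raw_target_probs = [0.10, 0.55, 0.25, 0.05, 0.02]  # one per eligible receiver
normalized = softmax(raw_target_probs)
print([round(p, 3) for p in normalized])

# Softmax also handles the degenerate case where every raw probability is 0,
# falling back to a uniform distribution over the five receivers:
print(softmax([0.0] * 5))
```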
3.7 Incompletion Probability Model
For each possible targeted receiver $j$, we next model $P(I_i = 1 \mid T_{ij} = 1, \mathcal{F}_t^{(Z)})$, the probability that a pass to that receiver will be incomplete.
It may seem counterintuitive to model incompletion probability rather than completion probability, but we do this for a specific purpose: so that the catch probabilities for each offensive receiver and defensive player (from the subsequent pass-catching model) can be computed with the same model, and then Softmax-normalized to the quantity $1 - P(I_i = 1 \mid T_{ij} = 1, \mathcal{F}_t^{(Z)})$.
A pass can only be caught or not caught (incomplete), so our random variable $I_i$ can take only two values: 1 if the pass is incomplete, and 0 if the pass is caught (by an offensive receiver or defensive player). Since $I_i$ is a binary response variable, there are many suitable methods for implementing the incompletion model (e.g. logistic regression, tree-based methods, or a recurrent neural network with a binomial response).
We do not implement this model in this paper, since this area has been extensively studied. For example, Deshpande and Evans (2019) implement a similar model, but for catch probability.
3.8 Catch Probability Model
Finally, we model $P(C_{ik} = 1 \mid I_i = 0, \mathcal{F}_t^{(Z)})$, the probability that player $k$ catches the ball, given that the pass targeted to receiver $j$ was not incomplete.
Similar to the target probability model, when training the catch probability model, each play in the tracking dataset should be replicated 16 times (once for each eligible receiver on the offensive team, and once for each of the 11 defensive players), and each replicated play's explanatory and response variables should be updated to be with respect to the player in question. That is, if player $k$ catches the ball on play $i$, then $C_{ik} = 1$, and $C_{ik'} = 0$ for every other player $k'$. Similarly, the explanatory variables will be computed with respect to player $k$.
Since $C_{ik}$ is a binary response variable, there are many suitable methods for implementing the catch probability model (e.g. logistic regression, tree-based methods, or a recurrent neural network with a binomial response).
Once the catch probability is calculated for each of the 16 possible pass-catchers, these 16 quantities must be Softmax-normalized so that they form a valid probability distribution over the space of possible pass-catchers.
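The 16-fold play replication described above can be sketched as follows; the player identifiers are invented for illustration.

```python
# Replicate a play once per potential pass-catcher (5 eligible receivers
# plus 11 defenders), with a binary response indicating whether that
# player caught the ball. Identifiers are illustrative placeholders.
def replicate_play(play_id, pass_catchers, catcher_id):
    rows = []
    for player_id in pass_catchers:
        rows.append({
            "play_id": play_id,
            "player_id": player_id,
            "caught": 1 if player_id == catcher_id else 0,
        })
    return rows

catchers = [f"off_{i}" for i in range(1, 6)] + [f"def_{i}" for i in range(1, 12)]
rows = replicate_play("play_001", catchers, catcher_id="off_2")
print(len(rows))  # 16 training rows, exactly one with caught = 1
```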
We leave the implementation of this model as a task for future work.
4 The Ball-Carrier Model
An advantage of our framework is the modularity of its models. Modularity implies that we can develop each model independently, then plug the best model for each task into the framework. For example, once we develop a ball-carrier model, we can use this model to compute continuous-time play value for each moment in a game when a player is running the football.
Our ball-carrier model estimates the yards gained from the player's current yard line (and thus the final yard line a ball-carrier will reach on a play), conditional on the locations and trajectories of all 22 players on the field. Section 4.1 introduces the features we use for our ball-carrier model, Section 4.2 describes the different ball-carrier models we tried, and Section 4.3 describes how we evaluate our ball-carrier models.
4.1 Features for the Ball-Carrier Model
The tracking data provides a wealth of information about a football play, including who is on the field, where they are on the field, which direction they are facing, how fast they are running, and more. A first step in developing our ball-carrier model is deciding what information will be helpful in modeling the yards gained from the ball-carrier's current position.
The first set of features is based on the location of each player relative to the ball-carrier. For each player, we record their $x$ coordinate, $y$ coordinate, speed, direction, distance traveled since the previous frame, and Euclidean distance to the ball-carrier. We split the players into three groups: ball-carrier, offense, and defense. For the offensive and defensive groups, we order the players based on their Euclidean distance to the ball-carrier. For example, the feature defense2_x gives the $x$ coordinate of the second-closest defender, the feature bc_s gives the speed of the ball-carrier, and so on.
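A sketch of this distance-ordered feature construction, with invented coordinates, might look like the following; only a subset of the per-player quantities is shown.

```python
import math

# Order players by Euclidean distance to the ball-carrier and emit
# features with names like defense2_x. Coordinates are illustrative.
def relative_features(bc, offense, defense):
    feats = {"bc_x": bc["x"], "bc_y": bc["y"], "bc_s": bc["s"]}
    for side, players in (("offense", offense), ("defense", defense)):
        ranked = sorted(
            players,
            key=lambda p: math.hypot(p["x"] - bc["x"], p["y"] - bc["y"]),
        )
        for rank, p in enumerate(ranked, start=1):
            dist = math.hypot(p["x"] - bc["x"], p["y"] - bc["y"])
            feats[f"{side}{rank}_x"] = p["x"]
            feats[f"{side}{rank}_y"] = p["y"]
            feats[f"{side}{rank}_dist_to_ball"] = dist
    return feats

bc = {"x": 60.0, "y": 29.0, "s": 7.5}
defense = [{"x": 55.0, "y": 20.0, "s": 6.0}, {"x": 61.0, "y": 28.0, "s": 5.0}]
feats = relative_features(bc, [], defense)
print(round(feats["defense1_dist_to_ball"], 3))  # the closer defender
```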
The second set of features uses the Voronoi tessellation of player locations (Voronoi, 1908). The Voronoi tessellation partitions the playing surface into regions, where each region corresponds to the area of the playing surface closest to an individual player. These regions help expose some of the more complex geometric relationships between the players. (Several authors use Voronoi tessellations to analyze tracking data in sports; for an overview, see Gudmundsson and Horton (2016).)
We extract three simple features from the Voronoi tessellation: the area of the Voronoi region associated with the ball-carrier, and the $x$ coordinates of the closest and farthest points from the target endzone on the boundary of the ball-carrier's Voronoi region. Figure 5 exhibits these features for the handoff and first-contact frames of the Cordarrelle Patterson TD run example from Figure 2.
We created the Voronoi tessellations with the deldir package in R (Turner, 2019). For each frame of each play, we calculate both the complete set of vertices that define the tessellation, and the area of each player’s region. The set of vertices lets us calculate the features described above. This set of vertices also allows for future exploration of Voronoi features for the ballcarrier model (and, potentially, for other models in the continuoustime play value framework). A complete list of the features used in our ballcarrier model is given in Table 2.
Variables  Description

bc_x, offenseX_x, defenseX_x  Horizontal x-coordinate on field for the ballcarrier, Xth closest teammate, and Xth closest defender. For example, defense1_x represents the x-coordinate of the closest defender.
bc_y, offenseX_y, defenseX_y  Vertical y-coordinate on field for the ballcarrier, Xth closest teammate, and Xth closest defender.
bc_s, offenseX_s, defenseX_s  Speed in yards/second for the ballcarrier, Xth closest teammate, and Xth closest defender.
bc_dir, offenseX_dir, defenseX_dir  Direction in degrees that the ballcarrier, Xth closest teammate, and Xth closest defender is facing.
bc_dis, offenseX_dis, defenseX_dis  Distance traveled since the previous frame by the ballcarrier, Xth closest teammate, and Xth closest defender.
offenseX_dist_to_ball, defenseX_dist_to_ball  Euclidean distance from the ballcarrier for the Xth closest teammate and Xth closest defender.
voronoi_bc_close  x-coordinate of the point on the boundary of the ballcarrier’s Voronoi region that is closest to the target endzone.
voronoi_bc_far  x-coordinate of the point on the boundary of the ballcarrier’s Voronoi region that is farthest from the target endzone.
voronoi_bc_area  Area of the Voronoi region associated with the ballcarrier.
Each feature in Table 2 is centered and scaled. We also explored lagged variables, but did not find that these variables improved the performance of our models. This list of features is only a starting point, and future feature engineering, such as the space ownership approach from Fernandez and Bornn (2018), may significantly improve the ballcarrier model. Similarly, we currently do not have a good approach for directly accounting for the positioning of blockers, which may be especially useful for ballcarrier segments in the open field (though this is done indirectly via the Voronoi features). Improving upon the feature space used as input for the ballcarrier model may improve the model’s prediction accuracy, and is a task left to future work.
4.2 Models
The ballcarrier model has several important aspects:

High dimensions. Since there are 22 players on the field, and each player has an x-coordinate, y-coordinate, angle, speed, etc., we can use many features to estimate the final yard line of the ballcarrier.

Nonlinearity. We don’t expect the best prediction for the final yard line to have a simple linear structure. For example, we would expect a player facing the ballcarrier to have a better chance of making the tackle than a player who is not facing the ballcarrier.

Interactions. Our features should depend on each other. For example, a defender is more likely to tackle the ballcarrier if no one is blocking him.

Time. Since we’re estimating the final yard line at each time frame, the predictions should be smooth from frame to frame, and we should be able to use this temporal structure in our models.
Thus, we select models that capture these aspects of the data, and we use appropriate regularization to avoid overfitting. Before moving to more complicated models, we establish a baseline model. The baseline model only uses an intercept, which means it doesn’t use any of the features described in Section 4.1. We use the baseline model to set an initial performance benchmark.
The next model we use is the LASSO regression model (Tibshirani, 1996). The LASSO works well in high dimensions, is easy to interpret, and has a fast implementation. We used the glmnet implementation in R, choosing the one-standard-error regularization penalty from model training via cross-validation (Friedman et al., 2010).
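For intuition, the objective glmnet optimizes can be sketched with a minimal cyclic coordinate-descent implementation in NumPy. This toy version assumes roughly standardized columns and omits glmnet's penalty path and one-standard-error selection; it is illustration only, not the fit used in the paper.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal LASSO via cyclic coordinate descent, minimizing
    (1/2n)||y - Xb||^2 + lam * ||b||_1 (the criterion glmnet uses)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b  # running residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove coordinate j's contribution
            rho = X[:, j] @ r / n          # partial correlation with residual
            # Soft-threshold: small partial correlations are shrunk to exactly zero.
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b
```

The soft-thresholding step is what makes the LASSO perform variable selection: features whose partial correlation with the residual stays below the penalty are dropped entirely.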
We also explored additive gradient boosting trees using the popular XGBoost implementation (Chen and Guestrin, 2016). Like the LASSO, XGBoost works in high dimensions, and also accounts for nonlinear interactions in the data via tree-based partitioning. Of course, the LASSO can also account for nonlinear interactions, but that would require the explicit construction of additional features. We implemented XGBoost via the xgboost R package, and found the default settings (100 trees, maximum depth of 3 splits) to yield the best results in cross-validation training among the regularization parameters that were considered.
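A stripped-down sketch of the additive-tree idea, using depth-1 trees (stumps) and squared error in NumPy, is below. XGBoost itself adds second-order gradients, regularization, and fast approximate split finding, so this is an illustration of the boosting structure only, not of the fitted model.

```python
import numpy as np

def boost_stumps(X, y, n_trees=60, lr=0.1):
    """Toy gradient boosting: repeatedly fit the best single-split regression
    tree (stump) to the current residuals and add a shrunken step."""
    pred = np.full(len(y), y.mean())  # start from the mean, like an intercept
    for _ in range(n_trees):
        resid = y - pred
        best = None  # (sse, feature, threshold, left_mean, right_mean)
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j])[:-1]:  # candidate split points
                left = X[:, j] <= t
                lm, rm = resid[left].mean(), resid[~left].mean()
                sse = ((resid[left] - lm) ** 2).sum() + ((resid[~left] - rm) ** 2).sum()
                if best is None or sse < best[0]:
                    best = (sse, j, t, lm, rm)
        _, j, t, lm, rm = best
        pred = pred + lr * np.where(X[:, j] <= t, lm, rm)  # shrunken update
    return pred
```

Each round fits only the part of the signal the previous trees missed, which is how the ensemble builds up nonlinearities and interactions without explicit feature construction.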
Another flexible model that works well in high dimensions, and can capture nonlinear interactions, is a feedforward neural network (Haykin, 1998). Chapter 6 of Goodfellow et al. (2016) provides a clear and detailed overview of this type of model. We used a feedforward neural network with three layers, where each layer has five hidden units. We used a ReLU activation function for each layer (Glorot et al. (2011) describe the ReLU activation function, and show that it outperforms other activation functions for deep networks), and regularized each layer with an L1 penalty. We trained the network with the Adam algorithm (Kingma and Ba, 2014), and implemented the network with the keras R package (Allaire and Chollet, 2019).
So far, none of our models have explicitly accounted for the temporal structure of the data. To remedy this, we can adapt our feedforward neural network into a recurrent neural network. Specifically, we use a long shortterm memory (LSTM) network (Hochreiter and Schmidhuber, 1997). Our LSTM has three layers, with five units in each layer, and we use a recurrent dropout rate of 20% for each layer. Finally, because not all ballcarrier sequences are the same length, we zeropad each sequence to the size of the longest ballcarrier sequence. Table 3 summarizes the five different models we use, in terms of the aspects we considered at the beginning of this section.
Model  Highdimensions  Nonlinear  Interactions  Time 
Baseline  
LASSO  ✓  
XGBoost  ✓  ✓  ✓  
Feedforward Neural Network  ✓  ✓  ✓  
Long shortterm memory (LSTM)  ✓  ✓  ✓  ✓ 
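The zero-padding step described in Section 4.2 can be sketched as follows. The helper name and the mask output are hypothetical (the paper's implementation uses keras in R), but the idea matches padding every variable-length ballcarrier sequence to the length of the longest one.

```python
import numpy as np

def pad_sequences(seqs, n_features):
    """Zero-pad variable-length sequences (lists of per-frame feature vectors)
    to a common length, returning a (n_plays, max_len, n_features) tensor
    plus a boolean mask marking the real (non-padded) frames."""
    max_len = max(len(s) for s in seqs)
    batch = np.zeros((len(seqs), max_len, n_features))
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        batch[i, :len(s)] = s   # real frames at the front, zeros after
        mask[i, :len(s)] = True
    return batch, mask
```

The mask (or an equivalent masking layer) lets the recurrent network ignore the padded frames when computing the loss.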
4.3 Model Validation
Since our ultimate goal is to generate continuoustime valuations for every playertracking frame in the data, we need to ensure that our selected model is performing well across the sample of provided games. As a computationally feasible alternative to the ideal leaveoneframeout cross-validation, we use leaveoneweekout (LOWO) cross-validation (e.g. train on all frames from games in weeks one through five, then generate predictions on all frames from games in holdout week six) to select the ballcarrier model. We evaluate the LOWO predictions with three criteria: (1) overall root mean-squared error (RMSE), (2) weighted average RMSE across the number of frames from the end of the ballcarrier sequence, and (3) the mean expected points added.
The first criterion, overall holdout RMSE, is connected to our goal of generating baseline continuoustime withinplay values across all individual frames. The second criterion places more emphasis on frames closer to the end of the ballcarrier sequences, due to the variation in the length of runs seen in Figure 3(A). A model is unlikely to accurately forecast the outcome of a ballcarrier sequence at the first frame when the entire ballcarrier sequence is long; the model should be more accurate on frames closer to the end of the sequence. The final criterion connects the generated results from the ballcarrier model to the end goal of generating wellcalibrated expected points values, as described in Yurko et al. (2019). If the ballcarrier model is ultimately generating expected points added values that are not centered at 0, this would indicate a bias in the established baseline used for evaluating movements within a play.
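The LOWO scheme can be sketched as a short helper that works with any of the candidate models; the function and argument names here are hypothetical, and the per-week RMSEs it returns correspond to the first evaluation criterion (the second criterion would reweight the squared errors by frames from the end of the sequence before averaging).

```python
import numpy as np

def lowo_rmse(frames, fit, predict):
    """Leave-one-week-out CV: for each week, train on all other weeks' frames
    and score RMSE on the holdout week. `frames` maps week -> (X, y);
    `fit` and `predict` are any model's training and prediction callables."""
    scores = {}
    for week in frames:
        X_tr = np.vstack([frames[w][0] for w in frames if w != week])
        y_tr = np.concatenate([frames[w][1] for w in frames if w != week])
        model = fit(X_tr, y_tr)
        X_te, y_te = frames[week]
        scores[week] = float(np.sqrt(np.mean((predict(model, X_te) - y_te) ** 2)))
    return scores
```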
5 Results
This section walks through various results and analysis of our ballcarrier model.
5.1 Model Comparison and Selection
Table 4 displays the overall LOWO CV RMSE for each of the candidate models. We see that the LSTM performs best with the lowest RMSE. Unsurprisingly, all covariateinformed models perform better than the interceptonly baseline approach. Additionally, we see that the LASSO results in higher RMSE as compared to the flexible nonlinear models.
Model  RMSE 

Baseline  7.72 
LASSO  6.43 
XGBoost  5.98 
Feedforward Neural Network  6.18 
LSTM  5.65 
The results for our second criterion are displayed in Table 5, revealing that the LSTM again performs best when upweighting the predictions for frames closer to the end of the ballcarrier sequences. For reference, Figure 6 displays the RMSE across the number of frames away from the end of the ballcarrier sequence that are used for generating the weighted values in Table 5. We see the poor performance of the baseline across all moments in ballcarrier sequences, and also that the LSTM appears to display the best performance across the majority of frames in sequences. The increase in RMSE as we get farther out from the end of the play is to be expected, due to selection bias: plays that are 100 frames from the end of the ballcarrier sequence (i.e. 10 seconds from the play ending) are almost always long runs.
Model  Weighted average RMSE 

Baseline  6.10 
LASSO  4.68 
XGBoost  4.41 
Feedforward Neural Network  4.60 
LSTM  4.11 
To measure the calibration of the candidate models, we perform the calculation of continuoustime play value for rushing plays, as described in Section 3.2, using the LOWO CV model predictions. The predicted yard line a ballcarrier is expected to reach then determines the subsequent down (incremented by one if a first down is not achieved, and reset to 1 if a first down is achieved or if there is a turnover on downs), the possession team (changes only if a turnover on downs takes place), the resulting yards to go for a first down or goal-to-go situation, and the score differential (changes only if a touchdown was scored on the run). For now, we use the observed time of the ballcarrier sequence to adjust the amount of time remaining in the game. This adjusted contextual information is used to generate the expected points for each frame in the ballcarrier sequence using the multinomial logistic regression model from Yurko et al. (2019). These calculations were done using the calculate_expected_points function available in nflscrapR (Horowitz et al., 2017). The input features for the win probability model from Yurko et al. (2019) are similar to those of the expected points model, and thus require no additional explanation here.
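The down, distance, and possession bookkeeping just described can be sketched as a small helper. This is a simplified, hypothetical version of that logic (it ignores penalties and clock effects, and leaves the score-differential update to the caller once a touchdown is flagged); `yardline_100` denotes yards from the target end zone.

```python
def next_situation(down, ydstogo, yardline_100, pred_gain):
    """Map a predicted end-of-play yard line to the hypothetical next game
    situation, following the rules described in the text."""
    new_yl = yardline_100 - pred_gain
    if new_yl <= 0:
        # Predicted to reach the end zone: flag the touchdown for the caller.
        return {"touchdown": True, "possession_flips": False}
    if pred_gain >= ydstogo:
        # First down achieved: reset to 1st and 10 (or goal-to-go).
        return {"down": 1, "ydstogo": min(10, new_yl), "yardline_100": new_yl,
                "touchdown": False, "possession_flips": False}
    if down == 4:
        # Turnover on downs: possession and field orientation flip.
        flipped = 100 - new_yl
        return {"down": 1, "ydstogo": min(10, flipped), "yardline_100": flipped,
                "touchdown": False, "possession_flips": True}
    # Otherwise: next down, remaining distance.
    return {"down": down + 1, "ydstogo": ydstogo - pred_gain, "yardline_100": new_yl,
            "touchdown": False, "possession_flips": False}
```

The resulting situation is what would be passed (together with the adjusted time remaining) into a between-play expected points model such as nflscrapR's calculate_expected_points.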
Figure 7 displays a comparison of the holdout expected points added (EPA) values for the different candidate models, displaying the mean plus or minus two standard errors. Here, we see a clear bias in the baseline model, as well as noticeable meanshifts from zero for both the LASSO and feedforward neural network models, but can clearly see that the LSTM has the closest mean to zero.
Since the LSTM meets all three criteria, achieving accurate predictions according to RMSE while also providing wellcalibrated expected points added values, we proceed to train an LSTM model on the full six weeks of data. We use the same settings described in Section 4.2 on all of the available ballcarrier sequences to generate the results for the example play and player evaluations with the full LSTM model below.
5.2 Analysis of Feature Importance
For context regarding the covariates considered, we additionally trained the XGBoost and LASSO models on the entire dataset. Figure 8 displays the top ten variables in terms of importance from the XGBoost model. It shows that the two most important variables are the distance to the closest defender (defense1_dist_to_ball) and the ballcarrier’s current speed (bc_s). This is consistent with the top variables selected by the LASSO model trained on the entire dataset, as indicated by Figure 9. The directions of the LASSO coefficients are consistent with intuition, e.g. the faster the ballcarrier is moving, the farther they are expected to carry the football.
5.3 ContinuousTime Play Value: Examples
Using the LSTM model from Section 5.1 trained on all available data, we again calculate the continuoustime play values by feeding the LSTM predictions into both the expected points and win probability models from Yurko et al. (2019), making the appropriate corrections as described in Section 5.1. This framework for computing expected points additionally allows us to generate the continuoustime win probability by using the adjusted time remaining and the framelevel expected points as inputs for the generalized additive win probability model in Yurko et al. (2019). This calculation was done using the calculate_win_probability function available in nflscrapR (Horowitz et al., 2017).
We return to the Cordarrelle Patterson TD run from Figure 2 to demonstrate. On offense, the Raiders trailed the Chargers 1410 in the fourth quarter, with eight minutes left and two yards to go for a first down at the Chargers’ 47 yard line. Figure 1 displays the change in expected points and win probability estimates over the course of the run, starting at the initial betweenplay value and updating throughout the play until Patterson reaches the endzone for a touchdown, which gave the Raiders the lead and pushed their win probability beyond the 50% mark.
Figure 10 displays an updated version of Figure 2 with the expected yard line (in red) that the ballcarrier (black) is predicted to reach, given all information regarding his teammates (blue) and opponents (orange), using the LSTM model at (A) handoff, (B) first contact, and (C) the first frame when the expectation was a touchdown. At handoff the expectation is roughly an eleven-yard gain, and it increases steadily through first contact, as captured by Figure 1, until the expectation becomes a TD run.
For context in understanding the change in the expected points and win probability within the touchdown run, Figure 11 displays (A) the change in the distance to the closest defender, as well as (B) Patterson’s speed and (C) Patterson’s Voronoi area in each frame of the run. We see that the moment Patterson was no longer expected to score a touchdown occurred when the closest defender closed to within the same distance as at the point of first contact. But Patterson then regained separation from the opponent, leading once again to an expectation of scoring a touchdown.
5.4 Player Evaluation with ContinuousTime Play Value
As noted in Section 1.3, we can use the resulting continuous expected points values from the LSTM model to gain insight into the contributions of individual athletes over the course of a play. Figure 12 demonstrates this by displaying the joint distribution of the EPA per frame and the framelevel success rate, a novel update of Brian Burke’s success rate (Burke, 2009), now calculated as the proportion of player frames leading to positive expected points added. For simplicity, this figure only displays players with a minimum of 1000 frames of carrying the football. We see running back Leonard Fournette stand out for his high EPA per frame, while Seattle Seahawks’ QB Russell Wilson appears to provide the most value with his legs among qualified QBs during these first six weeks of the 2017 NFL season. In this small sample of data, these playerlevel metrics are heavily influenced by long runs and touchdown runs. For example, Leonard Fournette had six touchdown runs in the first six weeks of the 2017 season, including long runs of 90 and 75 yards.
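The framelevel metrics in Figure 12 can be sketched as follows; the helper and its inputs are hypothetical, with success rate computed as the share of a player's frames with positive EPA and the 1000-frame threshold applied as in the figure.

```python
import numpy as np

def frame_level_metrics(frame_epa_by_player, min_frames=1000):
    """Compute EPA per frame and framelevel success rate from per-frame EPA
    arrays keyed by player, keeping only players above a frame threshold."""
    out = {}
    for player, epa in frame_epa_by_player.items():
        epa = np.asarray(epa, dtype=float)
        if len(epa) < min_frames:
            continue  # drop small-sample players, as in Figure 12
        out[player] = {
            "epa_per_frame": float(epa.mean()),
            "success_rate": float((epa > 0).mean()),  # share of positive-EPA frames
            "n_frames": int(len(epa)),
        }
    return out
```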
We are also able to calculate the total win probability added (WPA) for each player from their various movements over the course of the runs using our ballcarrier model. Tables 6 and 7 display the top and bottom five players according to the total WPA accumulated from their ballcarrier movements.
Name  Total WPA 

Leonard Fournette  0.23 
Kareem Hunt  0.23 
Dak Prescott  0.22 
Cordarrelle Patterson  0.21 
Orleans Darkwa  0.19 
Name  Total WPA 

Le’Veon Bell  -0.40 
Ty Montgomery  -0.41 
Chris Carson  -0.46 
Melvin Gordon  -0.51 
Jay Ajayi  -0.60 
With limited data, it is difficult to evaluate these framelevel metrics and make claims about their discriminatory ability. Each of these continuoustime estimates is a function of all twentytwo players on the field, while the above metrics merely attribute the observed change in value of the framelevel data to the ballcarrier. Regressionbased approaches such as the implementation in Yurko et al. (2019) could provide a starting point for dividing the credit among players within the play. Additionally, our model accounts for the player’s speed as an input, which is an inherent attribute of the ballcarrier. Future work could consider imputing average speed levels for all ballcarriers at particular moments over the course of the run, or fitting the ballcarrier model without speed as a feature. However, the limited availability of data currently makes this a challenge, one that could be addressed when more data are made available.
6 Discussion & Future Directions
In this work, we provide a framework for continuoustime withinplay valuations of game outcomes in football using player and balltracking data from the National Football League. We implement the core piece of this framework, a model for the expected yards gained from a ballcarrier’s current yard line, conditional on the locations and trajectories of all 22 players on the field, and we test several different modeling approaches for doing so. As input for this ballcarrier model, we create a rich set of features that describe the location of the ballcarrier relative to other players on the field, e.g. with features generated from Voronoi tessellations of all 22 players on the field. For this ballcarrier model, we find that all tested models substantially outperform a baseline interceptonly model, but that a long shortterm memory (LSTM) recurrent neural network outperforms alternative approaches according to the three evaluation measures we set forth in this paper.
We provide the results of the ballcarrier model and, thus, an implementation of continuoustime valuation of game outcomes in football for all rushing plays, using the NFLprovided tracking data from Weeks 16 of the 2017 season. Using these withinplay estimates of expected points and win probability, we briefly discuss metrics for evaluating individual rushers, such as each player’s expected points added per frame and framelevel rushing success rate.
There are many potential directions for future work, beginning with several aspects of a football game that we do not currently handle. First, we assume the play type is known at the start of the play, which could be problematic. For example, runpass options have become increasingly popular in recent seasons, with teams like the 2018 Baltimore Ravens using this as a core feature of their offensive gameplan in the second half of the season (Pennington, 2018). Currently, our models condition on the play type at the top level of the framework in Figure 4.
Second, we currently do not handle special teams. A brief sketch of how this important piece of a football game may fit into our framework is as follows: For kickoff and punt returns, we can use the ballcarrier model, provided enough training data (this was not possible with only six weeks of data for this paper). For field goals, since blocked kicks are rare, continuoustime play value is likely of limited additional value above what is possible with discretetime (betweenplay) play value models. Similarly, blocked punts are rare, so attempting to model these may prove more challenging than it is worth.
Third, we currently do not handle fumbles by the ballcarrier. To do so, we would have to incorporate a survival component into our model, accounting for the hazard of a fumble at each moment throughout a ballcarrier sequence, conditional on the features of that sequence that may be indicative of changes in fumble rates. However, fumbles are rare events, and even rarer in a sixweek sample of games (there were only 77 rushing fumbles in 153,184 rushing frames across 4,447 ballcarrier sequences in our dataset), rendering the estimation of this component of the ballcarrier model impractical. This task is left to future work, if/when multiple seasons of tracking data are available.
Fourth, we currently use an ad hoc approach for estimating the time remaining at the end of plays. An elegant approach would be to model the joint distribution of the yards gained from the ballcarrier’s current position and the time remaining at the end of the play. However, doing so would (at least) double the size of the parameter space. Additionally, time remaining is typically of little value in a betweenplay model for play value, and only comes into play in somewhat rare situations at the end of the 1st or 2nd half. With a limited set of six weeks of tracking data, the ad hoc approach we use here will suffice.
Fifth, there is more work to be done in the area of feature engineering. As discussed, using a Voronoilike approach that accounts for the velocity of players on the field, similar to what Fernandez and Bornn (2018) do for modeling space creation and occupation in soccer, may yield some improvements in model predictions. Additionally, accounting for blockers (e.g. by joining the adjacent Voronoi polygons of teammates to identify a path through which the ballcarrier can travel) may also lead to improved prediction accuracy.
Sixth, in the context of player evaluation, researchers should be careful about how they use our models when evaluating players. As demonstrated in Figure 8, the ballcarrier speed is one of the most important features in modeling yards gained from the current position on the field. However, if we condition on the speed of a player in the model, any gains a ballcarrier makes as a result of being faster than other ballcarriers (or losses from being slower) will not be attributed to that ballcarrier. As such, researchers using our models for player evaluation should consider using the average speed of all players when evaluating individuals, so that deviations above and below average are attributed to that player.
Along these lines, future researchers may use our continuoustime, withinplay valuation of game outcomes to evaluate microactions of all players on the field, similar to what has been done in basketball (Sicilia et al., 2019) and soccer (Fernandez and Bornn, 2018; Decroos et al., 2019). Similar ideas have been implemented for players at offensive skill positions at the discretetime level in football (Yurko et al., 2019), but never for all 22 players on the field, and never in a continuoustime framework.
Next, Pospisil and Lee (2018) propose methods for conditional density estimation with random forests and neural networks, which may prove valuable in our ballcarrier model. In particular, estimating the entire distribution of possible outcomes at each frame would provide a more complete picture of the possible outcomes at each portion of the play, and would allow for more interesting methods of player evaluation. For example, instead of using metrics like framelevel expected points added (which compare players to average), similar metrics could be generated that measure performance relative to a baseline (e.g. replacement level) that can be objectively defined from conditional density estimates.
Finally and most importantly, we currently only provide an implementation of the ballcarrier model, and we do not implement the other modular submodels in our framework for continuoustime play value (e.g. the QB decision model, target probability model, catch probability model, etc.). Implementation of these models is somewhat straightforward, given an appropriate feature space: Since the responses in these models are either binary (target probability, incompletion probability, catch probability) or multinomial (QB decision), simple adjustments can be made to the LSTM we use for the ballcarrier model to enable a similar approach for these pieces of the framework. Additionally, some authors have already implemented excellent versions of these models. For example, Deshpande and Evans (2019) implement a catch probability model, and Burke (2019) implements both a QB decision model and a target probability model. We look forward to incorporating these models in our framework for continuoustime valuation of game outcomes in football.
References
 Allaire and Chollet (2019) Allaire, J. and F. Chollet (2019): keras: R Interface to ’Keras’, URL https://CRAN.Rproject.org/package=keras, R package version 2.2.4.1.
 Breiman (2001) Breiman, L. (2001): “Random forests,” Machine Learning, 45, 5–32, URL https://doi.org/10.1023/A:1010933404324.
 Burke (2009) Burke, B. (2009): “How coaches think: Run success rate,” URL https://www.advancedfootballanalytics.com/index.php/home/research/general/114howcoachesthinkrunsuccessrate.
 Burke (2019) Burke, B. (2019): “Deepqb: Deep learning with player tracking to quantify quarterback decisionmaking & performance,” MIT Sloan Sports Analytics Conference, URL http://www.sloansportsconference.com/wpcontent/uploads/2019/02/DeepQB.pdf.
 Cervone et al. (2014) Cervone, D., A. D’Amour, L. Bornn, and K. Goldsberry (2014): “Pointwise: Predicting points and valuing decisions in real time with nba optical tracking data.” MIT Sloan Sports Analytics Conference, 28, 3, URL http://www.sloansportsconference.com/wpcontent/uploads/2018/09/cervone_ssac_2014.pdf.
 Cervone et al. (2016) Cervone, D., A. D’Amour, L. Bornn, and K. Goldsberry (2016): “A multiresolution stochastic process model for predicting basketball possession outcomes,” Journal of the American Statistical Association, 111, 585–599, URL https://arxiv.org/abs/1408.0777.
 Chen and Guestrin (2016) Chen, T. and C. Guestrin (2016): “Xgboost: A scalable tree boosting system,” in Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, New York, NY, USA: ACM, 785–794, URL http://doi.acm.org/10.1145/2939672.2939785.
 Chu et al. (2019) Chu, D., L. Wu, M. Reyers, and J. Thomson (2019): “Routes to success,” NFL Big Data Bowl, URL https://danichusfu.github.io/files/Big_Data_Bowl.pdf.
 Decroos et al. (2019) Decroos, T., L. Bransen, J. V. Haaren, and J. Davis (2019): “Actions speak louder than goals: Valuing player actions in soccer,” MIT Sloan Sports Analytics Conference.
 Deshpande and Evans (2019) Deshpande, S. K. and K. Evans (2019): “Expected hypothetical completion probability,” NFL Big Data Bowl, URL https://operations.nfl.com/media/3668/bigdatabowldeshpande_evans.pdf.
 Dutta et al. (2019) Dutta, R., R. Yurko, and S. L. Ventura (2019): “Unsupervised methods for identifying pass coverage among defensive backs with nfl player tracking data.”
 Fernandez and Bornn (2018) Fernandez, J. and L. Bornn (2018): “Wide open spaces: A statistical technique for measuring space creation in professional soccer,” MIT Sloan Sports Analytics Conference.
 Fernández et al. (2019) Fernández, J., L. Bornn, and D. Cervone (2019): “Decomposing the immeasurable sport: A deep learning expected possession value framework for soccer,” MIT Sloan Sports Analytics Conference, URL http://www.sloansportsconference.com/wpcontent/uploads/2019/02/DecomposingtheImmeasurableSport.pdf.
 Friedman et al. (2010) Friedman, J., T. Hastie, and R. Tibshirani (2010): “Regularization paths for generalized linear models via coordinate descent,” Journal of Statistical Software, 33, 1–22, URL http://www.jstatsoft.org/v33/i01/.
 Glorot et al. (2011) Glorot, X., A. Bordes, and Y. Bengio (2011): “Deep sparse rectifier neural networks,” in G. Gordon, D. Dunson, and M. Dudík, eds., Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, volume 15, Fort Lauderdale, FL, USA: PMLR, 315–323, URL http://proceedings.mlr.press/v15/glorot11a.html.
 Goodfellow et al. (2016) Goodfellow, I., Y. Bengio, and A. Courville (2016): Deep Learning, MIT Press, http://www.deeplearningbook.org.
 Gudmundsson and Horton (2016) Gudmundsson, J. and M. Horton (2016): “Spatiotemporal analysis of team sports  A survey,” CoRR, abs/1602.06994, URL http://arxiv.org/abs/1602.06994.
 Haar (2019) Haar, A. V. (2019): “Exploratory data analysis of passing plays using nfl tracking data,” NFL Big Data Bowl, URL https://operations.nfl.com/media/3672/bigdatabowlvonderhaar.pdf.
 Haykin (1998) Haykin, S. (1998): Neural Networks: A Comprehensive Foundation, Upper Saddle River, NJ, USA: Prentice Hall PTR, 2nd edition.
 Hochreiter and Schmidhuber (1997) Hochreiter, S. and J. Schmidhuber (1997): “Long shortterm memory,” Neural computation, 9, 1735–1780.
 Horowitz et al. (2017) Horowitz, M., R. Yurko, and S. L. Ventura (2017): nflscrapR: Compiling the NFL playbyplay API for easy use in R, URL https://github.com/maksimhorowitz/nflscrapR, R package version 1.4.0.
 Kingma and Ba (2014) Kingma, D. P. and J. Ba (2014): “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980.
 Link et al. (2016) Link, D., S. Lang, and P. Seidenschwarz (2016): “Real time quantification of dangerousity in football using spatiotemporal tracking data,” PLoS ONE, 11, URL https://doi.org/10.1371/journal.pone.0168768.
 Pennington (2018) Pennington, B. (2018): “The Ravens’ down-to-earth approach is unnerving the N.F.L.,” The New York Times, URL https://www.nytimes.com/2018/12/14/sports/baltimoreravenslamarjackson.html.
 Pospisil and Lee (2018) Pospisil, T. and A. Lee (2018): “Rfcde: Random forests for conditional density estimation,” URL https://arxiv.org/abs/1804.05753.
 Sicilia et al. (2019) Sicilia, A., K. Pelechrinis, and K. Goldsberry (2019): “Deephoops: Evaluating microactions in basketball using deep feature representations of spatiotemporal data.”
 Sterken (2019) Sterken, N. (2019): “Routenet: a convolutional neural network for classifying routes,” NFL Big Data Bowl, URL https://operations.nfl.com/media/3671/bigdatabowlsterken.pdf.
 Tibshirani (1996) Tibshirani, R. (1996): “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), 58, 267–288, URL http://www.jstor.org/stable/2346178.
 Turner (2019) Turner, R. (2019): deldir: Delaunay Triangulation and Dirichlet (Voronoi) Tessellation, URL https://CRAN.Rproject.org/package=deldir, R package version 0.1-16.
 Voronoi (1908) Voronoi, G. (1908): “Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Premier mémoire. Sur quelques propriétés des formes quadratiques positives parfaites,” Journal für die reine und angewandte Mathematik, 133, 97–178, URL http://eudml.org/doc/149276.
 Yurko et al. (2019) Yurko, R., M. Horowitz, and S. Ventura (2019): “nflwar: A reproducible method for offensive player evaluation in football,” Journal of Quantitative Analysis in Sports, Forthcoming, URL https://arxiv.org/abs/1802.00998.