Flexible discrete space models of animal movement
Ephraim M. Hanks^{1,2 *} and David A. Hughes^{2,3}
^{1} Department of Statistics, The Pennsylvania State University
^{2} Center for Infectious Disease Dynamics, The Pennsylvania State University
^{3} Department of Entomology, The Pennsylvania State University
^{*} hanks@psu.edu
Summary

Movement drives the spread of infectious disease, gene flow, and other critical ecological processes. To study these processes we need models for movement that capture complex behavior that changes over time and space in response to biotic and abiotic factors. Penalized likelihood approaches, such as penalized semiparametric spline expansions and LASSO regression, allow inference on complex models without overfitting.

Continuoustime Markov chains (CTMCs) have been recently introduced as a flexible discretespace model for animal movement. Modeling with CTMCs involves discretizing an animal’s path to the resolution of a raster grid. The resulting stochastic process model can easily incorporate environmental and other covariates, represented as raster layers, that affect directional bias and overall movement rate.

We introduce a weighted likelihood approach that allows for modeling movement using CTMCs, with path uncertainty due to missing data modeled by imputing continuoustime paths between telemetry locations. The framework we introduce allows for inference on CTMC movement models using existing software for fitting Poisson regression models, including penalized versions of Poisson regression.

The result is a flexible, powerful, and accessible framework for modeling a wide range of animal movement behavior.
Keywords: Animal movement, resource selection, random effects, semiparametric, model selection, muliple imputation.
Acknowledgements: Support for this work was provided by NSF EEID 1414296. The authors thank Lauren Quevillon, Jeremy Sterling, Devin Johnson, and others in the Hughes Lab and the NOAA National Marine Mammal Lab for providing movement data.
1 Introduction
Animal movement is a fundamental process underlying the spread of information, genetic material (McRae, 2006), infectious disease (Keeling and Rohani, 2008), and invasive species. Movement is a complex combination of behaviors, and often exhibits changing behavior over time and in response to biotic and abiotic drivers. Movement is typically observed as telemetry data: discrete points in space and time marking an animal’s location, often with observational error. In contrast, environmental factors that could influence movement behavior often are available in gridded form, with one value for each environmental factor at each grid cell (i.e. forest cover or elevational gradient).
A promising approach to pairing pointreferenced telemetry data with gridded covariates is to model movement only at the discretized resolution at which gridded environmental factors are observed (Hooten et al., 2010; Hanks et al., 2015; Avgar et al., 2016). Hooten et al. (2010) modeled movement as a dynamic occupancy process where an animal moved through a succession of bordering grid cells. Avgar et al. (2016) use step selection functions to model both resource selection (preference for some combinations of landscape characteristics) and directional persistence (correlated random walk behavior). Hanks et al. (2015) propose a stochastic process model for movement based on a continuoustime Markov chain (CTMC).
Both the approaches of Hanks et al. (2015) and Avgar et al. (2016) are appealing in that movement parameters describing directional persistance and resource selection can be fit within a generalized linear model (GLM) framework. This leads to extremely efficient computing, and allows complex movement behavior to be modeled with relative ease. The CTMC approach of Hanks et al. (2015) is especially appealing because of the ease with which a wide variety of movement behavior can easily be modeled, including directional bias in movement, changes in overall movement rate, and directional persistance.
As movement models become more and more complex, model comparison and selection are critical to avoid overfitting the data. Penalized likelihood methods (e.g., Hooten and Hobbs, 2015) provide a framework in which highly complex, sometimes semiparametric, relationships between predictors (i.e. environmental drivers) and response (i.e. observed movement) can be estimated. Penalized likelihood approaches include information criteria based model selection, ridge regression, LASSO regression, penalized spline fitting, generalized additive models (GAMs), and more. One major hurdle for using penalized likelihood approaches in movement analyses is the prevalence of missing data, which can occur when observation equipment fails, or (more generally) when there are gaps in observations where the animal’s movement path between observations is not clear. Within a Bayesian context, missing movement data can be accounted for by integrating over the uncertainty in the movement path between observations (Johnson et al., 2008; Hooten et al., 2010; Hanks et al., 2011, 2015). However, this typically requires computationally intensive Markov chain Monte Carlo (MCMC) methods. The most common likelihoodbased approach to account for missing data is multiple imputation (MI: Rubin, 1987, 1996), in which multiple stochastic realizations of the missing data are imputed from a carefully specified imputation distribution, the desired statistical model is fit to each imputed set of data in turn, and results from these multiple model fits are averaged posthoc. While there have been some attempts to apply penalized likelihood approaches to multiple imputation (Chen and Wang, 2013), these approaches only allow for point estimates, without a full characterization of uncertainty.
Our goal in this manuscript is to (1) present discrete space CTMC models as a flexible framework for modeling animal movement, (2) introduce a novel stacked weighted likelihood (SWL) approach for inference in the presence of missing data, and (3) show that combining these leads to a powerful and accessible approach to modeling complex movement behavior using existing GLM software. In Section 2 we introduce the CTMC as a stochastic process, show how Hanks et al. (2015) used CTMCs to model movement, and describe how a wide variety of common animal movement behaviors can be modeled within a CTMC framework. In Section 3 we review existing approaches for statistical inference on movement parameters in a CTMC, and introduce our SWL approach for inference in the presence of missing observations. In Section 4, we provide two data examples that illustrate how this provides a flexible framework for understanding and simulating complex movement behavior. In the first, we model changing behavior over time in a northern fur seal (Callorhinus ursinus). In the second, we model spatiallyvarying within nest movement behavior that varies among classes of the common black carpenter ant (Camponatus pennsylvanicus). We conclude in Section 5 with a discussion.
2 CTMC Models for Movement on a Landscape Grid
A continuous time Markov chain (CTMC) is a stochastic process defined in continuous time and on a discrete space. We follow Hanks et al. (2015) in our description of this stochastic process. A complete treatment can be found in multiple textbooks (e.g., Ross, 2009; Kulkarni, 2011). In the context of animal movement, let be the location of an animal at time , where can only take on one of distinct values: . We will assume for simplicity that the locations are the grid cells on a raster representation of the animal’s spatial domain, though any graphical structure is possible.
An animal’s continuous time movement path can be defined by the sequence of locations that the animal passes through, which is called the embedded chain, and the transition times at which the animal moves between locations . The transition times can also be equivalently represented by the residence times , where is the time spent in , the th location visited by the animal.
A CTMC statistical model for such a path is written in terms of the transition rates which are parameters that control movement behavior between spatial locations (grid cells). If it is impossible to directly move from location to location , then . Higher rates between locations correspond to increased rate of movement between those locations. If the th location in the embedded chain is (), and the animal’s movement follows a CTMC with transition rates , then the time the animal will remain in location is exponentiallydistributed with rate equal to sum of all with first index  all rates “leaving” location :
(1) 
So the mean residence time in node is .
When an animal leaves node and transitions to some neighboring spatial location, the probability that the transition is to node is
(2) 
Consider the case where we have two sets of covariates, one set which we assume is related to an animal’s absolute speed and another set which we assume is related to directional bias. We can use these two sets of covariates to model the transition rates of a CTMC, as follows. Let () be raster covariates, where is the value of the th covariate at the th grid cell, and denotes a column vector of these values at each of the grid cells. We assume that these covariates affect absolute speed (motility), and will call these covariates “motility covariates”. Additionally, assume that we have raster covariates () that we assume affect the directional bias of animal movement, through the gradient of a potential surface (Brillinger et al., 2001; Hanks et al., 2011; Preisler et al., 2013). The gradient of at () is a vector pointing in the direction of steepest increase in at that point in space. To model directional bias, first define a vector for each transition rate . Under this formulation, points in the direction that an animal moves if it transitions from grid cell to grid cell in the CTMC model. The dot product
(3) 
is positive if (the direction of movement from node to ) and (the direction of steepest ascent of the covariate at node ) point in the same direction, zero if and are at right angles to each other, and negative if they point in opposite directions. The dot product provides a directional covariate that captures the correspondence between a potential movement and the gradient of a covariate, and can be used to model movement bias in the direction of increasing (or decreasing) levels of a covariate.
The CTMC is an appealing framework for jointly modeling variation in absolute movement rate and directional bias, as both motility and directional covariates can be integrated into a loglinear model for the CTMC transition rates
(4) 
Consider the simplified case in which and the transition rates are . The overall rate of movement out of cell is , and so is defined by the motility covariates () and their coefficients . If grid cells and are both neighbors of , then , so there is no directional bias in movement. From this we see that the motility covariates () only affect absolute rate of movement.
When the are not all zero, the transition probabilities (2), which define directional bias in movement, are independent of the motility covariates (), as
and the transition probabilities (2) are
(5) 
The result is that a CTMC model on a raster grid with rates defined by (4) provides a flexible general framework for jointly modeling covariate effects on absolute movement rate and directional bias.
2.1 Modeling Movement Using Covariate Rasters
A wide variety of movement behavior can be modeled by careful specification of covariate raster layers. Examples include modeling migration along environmental gradients (Hooten et al., 2010), centralplace foraging behavior (Hanks et al., 2011, 2015), and interaction with conspecifics (Hanks et al., 2015; Quevillon et al., 2015). We now provide examples of modeling common movement behaviors using raster layers to define CTMC movement rates (4).
Movement along environmental gradients. Animals may use environmental gradients for navigation. For example, a mule deer might move predominantly in the direction of increasing elevation during a spring migration (e.g., Hooten et al., 2010), or a seal might follow gradients in sea surface temperature to navigate toward land (e.g., Hanks et al., 2011). This could be modeled by using a raster layer of the relevant environmental variable and then using the gradient (3) in the loglinear model for CTMC transition rates (4).
Activity centers. The gradient of Euclidean distance to the animal’s “central place” (e.g., den or rookery) could be used as a gradientbased covariate (3). A negative regression coefficient in (4) would then lead to movement biased to return to the central place.
Environmentally driven movement rate. Many animals tend to move more quickly through unfavorable terrain (e.g., when crossing a road). In other situations, animals may move more cautiously and slowly through unfavorable terrain (e.g., when entering an open field). An indicator raster layer for each relevant cover type could be used as a motility covariate in the model for CTMC transtion rates (4), with positive values of the corresponding regression coefficient indicating an increase in overall movement rate through the th land cover type.
Conspecific Interaction. An animal’s movement patterns can be greatly affected by interaction with conspecifics. When inference is desired on the movement path of one focal animal, and there are observed paths of other nearby animals, a gradientbased covariate (3) could be created from the timevarying distance from each conspecific to the focal animal, thus modeling attraction to (with ) or repulsion from (with ) the conspecific.
Directional persistence. To model correlated random walk behavior within the Markovian CTMC framework, Hanks et al. (2015) include an autocovariate in the loglinear model for , where the autocovariate is created from the gradient covariate as in (3) where the gradient vector is a vector of the animal’s current direction of movement from the most recent grid cell in the embedded chain to the current grid cell. Positive values of the regression coefficient related to this autocovariate indicate that the animal is likely to maintain its direction of movement over time.
Memory. Including a raster covariate of distance to past locations as a gradientbased covariate (3) models memory through attraction to spatial locations visited in the past.
Individual heterogeneity. Including a random intercept in the loglinear model (4) would model individual heterogeneity in overall movement rates. Similarly, an individualspecific random slope model for a regression coefficient in (4) would model individual variation in response to that raster covariate.
Changing behavior over time. Hanks et al. (2015) use varyingcoefficient models (Hastie and Tibshirani, 1993) to model periodic (i.e., seasonal or diurnal) variation in movement response to covariates. In a varyingcoefficient model, a static in time term in (4) is replaced by , where is a timevarying coefficient often modeled smoothly using a penalized spline basis expansion.
3 Statistical Inference for CTMC Model Parameters
The likelihood of the observed CTMC path (), defined by the embedded chain , and the residence times , is given by the product of the density of the embedded chain (2) and the residence times (3)
(6) 
with transition rates a function of raster covariates and regression parameters as in (4).
Hanks et al. (2015) introduced a set of auxilliary variables to facilitate maximizing (6). For each grid cell in the embedded chain, create defined as indicator variables for the embedded chain, so that each is zero except for , the latent variable corresponding to , the next grid cell in the embedded chain
(7) 
Hanks et al. (2015) showed that maximizing the likelihood (6) of the CTMC path is identical to maximizing the likelihood of a Poisson regression with canonical log link, where as the response, are the covariates in the linear predictor, and is an offset
(8) 
with as in (4). The likelihood of (8) is given by
which, with some algebraic manipulation, is equivalent to (6). This means that the CTMC parameters in (4) can be estimated using standard Poisson GLM software by first creating the from the embedded chain and then fitting the model (8) using existing software.
3.1 Inference from Discrete Observation in Time
Instead of observing a complete CTMC path , it is typical to observe an animal’s location at discrete points in time, leading to data in the form of . Once discretized to raster cells , the likelihood of the observed data is
where . Obtaining these transition probabilities for a fixed time requires matrix exponentiation (Ross, 2009) of an generator matrix defined by the set of all transition rates between nighboring grid cells. For large raster grids this is computationally prohibitive.
Instead, multiple authors have considered inference based on stochastic imputation of continuoustime continuousspace paths linking the observed locations (e.g., Johnson et al., 2008; Hooten et al., 2010; Hanks et al., 2011). In this approach, continuoustime paths can be imputed from the posterior distribution of a movement model fit to the telemetry data. For example, the continuoustime correlated random walk (CTCRW) model of Johnson et al., (2008) or the functional movement model of Buderman et al. (2016) could be fit to the original telemetry data, and multiple paths could be simulated from the fitted model. Inference on CTMC movement parameters, conditioned on the telemetry data, could then be obtained by integrating over the imputed paths (Johnson et al., 2008; Hooten et al., 2010; Hanks et al., 2011, 2015). This integration could be carried out in a Bayesian context through sampling a new continuoustime movement path at each iteration of a MCMC sampler, or approximated by multiple imputation when making inference based on maximum likelihood (as done in Hanks et al., 2015). The Bayesian approach is computationally intensive, and existing penalized likelihood approaches using multiple imputation (Chen and Wang, 2013) only allow for point estimates without quantification of uncertainty (e.g., through confidence intervals).
3.2 A Stacked Weighted Likelihood Approach to Inference With Missing Data
To facilitate fitting semiparametric and other complex models to movement data, we present a novel stacked weighted likelihood (SWL) approach to inference in the presence of missing data. The SWL approach is to first impute continuous time movement paths from an imputation distribution, such as the posterior predictive distribution of the functional movement model of Buderman et al. (2016). Under multiple imputation, a CTMC model would be fit to each path individually, and inference on could be made by combining the results of each individual fit using the rules developed by Rubin (1987). See Hanks et al. (2015) and Nakagawa and Freckleton (2010) for examples.
Instead, we propose inference based on maximizing the following likelihood, which is the geometric mean of the likelihoods of each imputed CTMC path
(9) 
Each term in the product is the likelihood (8) of one of the imputed paths after the creation of the auxiliary (7). This formulation is a weighted likelihood approach, similar to data cloning (Lele et al., 2007, 2011), where we assign a weight of to each of the imputed paths, reflecting that the imputed paths should each be given equal weight, and taken all together should be given the “weight” of the 1 observed set of telemetry data. As the individual terms in the SWL likelihood (9) are given by the likelihood of independent Poisson regression observations (8), the SWL approach amounts to “stacking” the imputed data sets , treating each as independent but downweighted observations. We show in Appendix A that the inference using the SWL (9) is nearly identical to inference obtained using multiple imputation. The SWL is appealing because of the ease with which it can be implemented, and the straightforward extension to estimation using penalized likelihood approaches, as the SWL likelihood takes on a closed form (9).
3.3 A General Framework For Flexible Discrete Space Movement Modeling
Pairing the SWL approach to inference with the CTMC model for animal movement provides a flexible and powerful framework that is easy to implement using existing GLM software. We suggest the following general steps.

Collect telemetry data in the form of .

Collect and create covariate rasters. Each covariate defines either (or both) a motility covariate or a directional (gradient based) covariate in (4).

Impute continuous paths for each set of animal telemetry data using a suitable continuoustime movement model.

Discretize each continuous path in space to obtain CTMC paths , and the corresponding covariates from (4).

Convert each CTMC path to data appropriate for a Poisson GLM likelihood by creating the auxiliary variabless (7) for each path.

Stack together the response , offset , and covariates from (4).

Make inference on CTMC transition rate parameters under the SWL likelihood (9), or a penalized version, using Poisson GLM software with each observation given a “weight” of , thus averaging over the uncertainty in the unobserved parts of the movement path. Alternately, Bayesian approaches or multiple imputation can be used for inference.
4 Examples
We now illustrate using two example systems the ease with which complex movement behaviors can be modeled in this framework.
4.1 Example 1: Northern Fur Seal Movement
We first consider a northern fur seal (henceforth, seal) movement path. Northern fur seals are pelagic foragers found primarily in the North Pacific Ocean. During the Summer months, the northern fur seal population gathers at the Pribilof Islands off the coast of Alaska, USA to breed. Figure 1a shows a 21day movement path taken by a male seal who leaves the Pribilof Islands (shown with a red triangle), forages in the open Pacific ocean, and then returns to the Pribilof Islands. Observations were obtained using the Argos system (Argos website https://www.argossystem.org), with telemetry fixes attempted every 4 hours. The background image shows sea surface temperature (SST) in degrees Celsius. The animal’s movement path shows timevarying directional bias, as the seal first swims away from its rookery on the Pribilof Islands (Figure 1b), and then returns. Further details on data collection are given in Hanks et al. (2011).
Hanks et al. (2011) modeled changing behavior over time using a stateswitching model which allowed for an unknown number of movement states through a reversible jump MCMC algorithm. The velocitybased model of Hanks et al. (2011) is only able to model directional bias through gradient based covariates ( in (4)), and the reversible jump algorithm took multiple hours to converge. Through the use of a CTMC model, we will model additional complexity in seal movement by modeling overall movement rate in different environmental conditions (different water temperatures), and show that inference using the SWL approach takes only seconds using standard GLM software.
We first fit the functional movement model of Buderman et al. (2016) to the seal telemetry data. Four imputed paths from the resulting posterior predictive distribution are shown in Figure 1c. We imputed a total of continuoustime paths, and used the ‘ctmcmove’ package (Hanks, 2016) in the R statistical computing environment to discretize each imputed path to a CTMC path (Figure 1d shows a detailed discritization of one path). Code to replicate this analysis is given in Appendix A.
We modeled seal movement as a CTMC with sea surface temperature (SST) as a motility covariate (4) affecting overall rate of movement. In addition, we modeled timevarying desire to swim away from, and then return to, the Pribilof Islands, through a directional covariate raster (Figure 1b) with values equal to the great circle distance to the Pribilof Islands. We used a semiparametric varyingcoefficient model to capture movement bias first in the direction of increasing distance from the seal’s rookery, and then in the direction of decreasing distance. This amounts to a timevarying potential function approach (Preisler et al., 2013). We also modeled directional persistance through the inclusion of an autocovariate , where is an autocovariate pointing in the direction of current movement, as described in Section 2.1. Thus, the model we assume for the timevarying CTMC transition rates (4) between neighboring grid cells and is
(10) 
where is the gradient of the distance from grid cell to the Pribilof Islands, and SST is the sea surface temperature of the th grid cell. The time varyingcoefficient was modeled using a semiparametric spline basis expansion using Bsplines. Smoothness in this additive model was imposed by penalizing the integral of the square of the second derivative (e.g., Fahrmeir et al., 2013). The penalized loglikelihood used for estimation was then given by
(11) 
with the tuning parameter chosen via crossvalidation. The appeal of the SWL approach is that (11) can be fit using any software which fits varying coefficient models for Poisson regression. We used the ‘mgcv’ package in R (Wood, 2016) which implements a variety of penalized semiparametric models. Estimation took only a few seconds, representing a significant increase in computational efficiency relative to the simpler directionalonly model of Hanks et al. (2011).
The results of this analysis showed that the seal in general moves more slowly in higher temperature water (, value=.014), the seal exhibits directional persistance (, value ), and that the seal is first drawn away from its rookery in the Pribilof Islands, and then back toward the rookery (Figure 1e). Figure 1f shows 3 simulated 21day seal movement paths from the fitted model. These paths show similar behavior to the observed NSF movement in Figure 1a, with simulated animals making loops out and back to the rookery, and showing slower movement in regions of higher SST. This behavior could not have been captured without modeling absolute movement rate based on covariates, directional bias based on covariates, and changing behavior over time. Pairing the CTMC model for movement with penalized approaches for inference leads to a flexible and powerful framework for modeling complex movement behavior.
4.2 Example 2: Semiparametric Modeling of Ant Movement
We now illustrate how CTMC movement models can be applied to populationlevel movement through the analysis of a colony of the common black North American carpenter ant Camponatus pennsylvanicus. Ants, like those in the genus Camponatus, provide an important laboratory species for studying collective behavior, community space use, and social contact structure, as entire colonies can be observed at once, something rarely possible with humans or other vertebrates. Our goal is to capture ant movement behavior sufficiently that we can simulate realistic ant movement. This would allow for future simulation of epidemics through the ant colony. For this analysis, we consider a single ant colony from the study described in Quevillon et al. (2015), in which a Queen and 16 workers were placed in a 4chambered wooden nest measuring 7cm by 11cm. Videos of ant movement were recorded using a GoPro Hero2 camera with a modified IR filter (RageCams.com) illuminated under infrared light. As ants cannot detect infrared light, we expect ant behavior in this constructed environment to be similar to ant behavior in natural dark nest environments (tunnels in dead and living wood for Camponatus).
Quevillon et al. (2015) found that, similar to human societies, different classes of ants have different roles in the colony, and exhibit different space use and social contact behavior. Our goal of this analysis is to fit a flexible movement model that captures differences in behavior between classes of ant sufficiently that we can simulate realistic ant movement. We first divided worker ants into two broad classes: “foragers”, which are all worker ants seen to have left the nest in the observation window, and “nest ants”, which are all worker ants who have not been observed leaving the nest. Figure 2a2c show the observed patterns of space use for each of these classes of ant, discretized to a 1cm square grid. The Queen (Figure 2c) confines herself to a region of the topright chamber. The nest ants (Figure 2b) spend much of their time hear the Queen, while the foragers (Figure 1a) spend most of their time in other chambers of the nest. This segregation in space can convey a measure of social immunity to disease (Quevillon et al., 2015) as the foragers, which are at high risk of encountering pathogens while out of the nest, are buffered from contact with the Queen.
We take a purely semiparametric approach to modeling ant movement. For each class of ant, we specify the CTMC movement rates (4) at each 1cm grid cell in the nest as a loglinear function of a motility surface , which models spatiallyvarying average movement rate, and the gradient of a potential surface , which models spatiallyvarying directional bias
(12) 
Both the motility surface and potential surface are modeled using a Bspline expansion ( and ). This is similar to the approach proposed by Russell et al. (2016). We estimated and by using LASSO regression in which the SWC likelihood (9) is penalized by the sum of the absolute values of and . This allows for joint penalization (and model selection) of spatiallyvarying movement rate () and directional bias (defined by the potential function ).
We allowed each class of ants to have their own potential and motility surfaces, and show the resulting estimated surfaces in Figure 2d2i. Ants tend to move in the direction of decreasing potential (), so the estimated potential surface for the Queen (Figure 2f) indicates that the Queen will reside mostly within a potential “valley” in the topright chamber of the nest. The foragers’ potential surface (Figure 2d) shows a potential “hill” in the same location, indicating that foragers will in general avoid entering the immediate vicinity of the Queen. The estimated motility surfaces for the foragers (Figure 2g) and nest ants (Figure 2h) show increased motility (movement rate) in the doorways between chambers, indicating that ants are unlikely to rest in these locations. This semiparametric model provides a significant increase in our understanding of ant movement and space use in the nest relative to the parametric analysis in Quevillon et al. (2015).
We simulated ant movement from the CTMC model by simulating CTMC paths on the 1cm grid for a Queen and a set of foragers and nest ants matching the numbers of Camponatus observed in each of the 8 days of observation. The simulated space use (Figures 2j2l) shows that the CTMC model captures space use within the nest of the three classes of ant modeled here.
5 Discussion
Pairing a CTMC model for movement with our proposed SWL approach for inference results in a flexible, powerful framework for modeling a wide variety of movement behavior. In particular, the discretespace nature of a CTMC allows for specifying raster layers of environmental (or other) factors as covariates driving variation in movement rate and directional bias. We have provided two examples in which changing movement behavior in time and space is modeled using semiparametric models with parameters estimated using penalized likelihood approaches. The SWL (9) allows easy model selection based on AIC, or other penalized approaches (Hooten and Hobbs, 2015), and straightforward computing using existing software for fitting GLMs.
Discretizing space results in an approximation to a true continuousspace movement path. A growing body of literature provides examples of increasingly complex continuousspace movement models (Hooten and Johnson, 2016; Russell et al., 2015; Scharf et al., 2015). When the discretization required for a CTMC model is undesirable, the ease with which a CTMC model can be fit makes it an appealing tool for exploratory analysis prior to fitting a more complex continuousspace model using MCMC (or something similar).
We have provided code to replicate the analyses in this paper in Appendix A and Appendix B. Additionally, the ‘ctmcmove’ Rpackage (Hanks, 2016) provides code to facilitate the general framework for modeling movement described in Section 3.3 of this manuscript, as illustrated by the code in the Appendices.
The straightforward inference possible within a CTMC framework for modeling movement promises to make complicated movement modeling more accessible and facilitate more realistic modeling. Our two examples illustrate that fitting flexible CTMC movement models using penalized likelihood approaches leads to models for movement that capture many realistic features of animal movement. In both examples the CTMC framework led to novel insights into animal behavior not initially understood in earlier publications. The ability to easily fit complex models to telemetry data will make it possible to build more realistic descriptions of movement behavior and ultimately increase our knowledge of movement ecology and the important processes, like gene flow and the spread of infectious disease, that are driven by movement.
References
 Avgar et al. (2016) Avgar, T., J. R. Potts, M. A. Lewis, and M. S. Boyce. 2016. Integrated step selection analysis: bridging the gap between resource selection and animal movement. Methods in Ecology and Evolution pages n/a–n/a.
 Brillinger et al. (2001) Brillinger, D., H. Preisler, A. Ager, and J. Kie, 2001. The use of potential functions in modeling animal movement. Pages 369–386 in A. K. Salah, editor. Data analysis from statistical foundations. Nova Publishers.
 Buderman et al. (2016) Buderman, F. E., M. B. Hooten, J. S. Ivan, and T. M. Shenk. 2016. A functional model for characterizing longdistance movement behaviour. Methods in Ecology and Evolution 7:264–273.
 Chen and Wang (2013) Chen, Q., and S. Wang. 2013. Variable selection for multiplyimputed data with application to dioxin exposure study. Statistics in Medicine 32:3646–3659.
 Fahrmeir et al. (2013) Fahrmeir, L., T. Kneib, S. Lang, and B. Marx. 2013. Regression. Springer Berlin Heidelberg, Berlin, Heidelberg.
 Hanks (2016) Hanks, E., 2016. ctmcmove: Modeling Animal Movement with ContinuousTime DiscreteSpace Markov Chains.
 Hanks et al. (2011) Hanks, E., M. Hooten, D. Johnson, and J. Sterling. 2011. VelocityBased Movement Modeling for Individual and Population Level Inference. PLoS ONE 6:e22795–e22795.
 Hanks et al. (2015) Hanks, E. M., M. B. Hooten, and M. W. Alldredge. 2015. Continuoustime Discretespace Models for Animal Movement. The Annals of Applied Statistics 9:145–165.
 Hastie and Tibshirani (1993) Hastie, T., and R. Tibshirani. 1993. VaryingCoefficient Models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 55:757–796.
 Hooten and Hobbs (2015) Hooten, M. B., and N. T. Hobbs. 2015. A Guide to Bayesian Model Selection for Ecologists. Ecological Monographs 85:3–28.
 Hooten and Johnson (2016) Hooten, M. B., and D. S. Johnson. 2016. Basis Function Models for Nonstationary ContinuousTime Trajectories. arXiv:1601.05408 [stat] .
 Hooten et al. (2010) Hooten, M. B., D. S. Johnson, E. M. Hanks, and J. H. Lowry. 2010. AgentBased Inference for Animal Movement and Selection. Journal of Agricultural, Biological, and Environmental Statistics 15:523–538.
 Johnson et al. (2008) Johnson, D. S., D. L. Thomas, J. M. Ver Hoef, and A. Christ. 2008. A general framework for the analysis of animal resource selection from telemetry data. Biometrics 64:968–976.
 Keeling and Rohani (2008) Keeling, M. J., and P. Rohani. 2008. Modeling Infectious Diseases in Humans and Animals. Princeton University Press.
 Kulkarni (2011) Kulkarni, V. G. 2011. Introduction to Modeling and Analysis of Stochastic Systems. Springer Texts in Statistics, Springer New York, New York, NY.
 Lele et al. (2007) Lele, S. R., B. Dennis, and F. Lutscher. 2007. Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecology letters 10:551–63.
 Lele et al. (2011) Lele, S. R., K. Nadeem, and B. Schmuland. 2011. Estimability and Likelihood Inference for Generalized Linear Mixed Models Using Data Cloning. Journal of the American Statistical Association 105:1617–1625.
 McRae (2006) McRae, B. 2006. Isolation by resistance. Evolution 60:1551–1561.
 Nakagawa and Freckleton (2010) Nakagawa, S., and R. P. Freckleton. 2010. Model averaging, missing data and multiple imputation: a case study for behavioural ecology. Behavioral Ecology and Sociobiology 65:103–116.
 Preisler et al. (2013) Preisler, H. K., A. A. Ager, and M. J. Wisdom. 2013. Analyzing animal movement patterns using potential functions. Ecosphere 4(3):32–32.
 Quevillon et al. (2015) Quevillon, L. E., E. M. Hanks, S. Bansal, and D. P. Hughes. 2015. Social, spatial, and temporal organization in a complex insect society. Scientific Reports 5.
 Ross (2009) Ross, S. M. 2009. Introduction to Probability Models, Tenth Edition. 10th edition edition. Academic Press, Amsterdam ; Boston.
 Rubin (1987) Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. John Wiley and Sons.
 Rubin (1996) Rubin, D. B. 1996. Multiple Imputation After 18+ Years. Journal of the American Statistical Association 91:473–489.
 Russell et al. (2015) Russell, J. C., E. M. Hanks, and M. Haran. 2015. Dynamic Models of Animal Movement with Spatial Point Process Interactions. Journal of Agricultural, Biological, and Environmental Statistics 21:22–40.
 Russell et al. (2016) Russell, J. C., E. M. Hanks, M. Haran, and D. P. Hughes. 2016. A SpatiallyVarying Stochastic Differential Equation Model for Animal Movement. arXiv:1603.07630 [stat] .
 Scharf et al. (2015) Scharf, H. R., M. B. Hooten, B. K. Fosdick, D. S. Johnson, J. M. London, and J. W. Durban. 2015. Dynamic social networks based on movement. arXiv:1512.07607 [stat] .
 Wood (2016) Wood, S., 2016. mgcv: Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness Estimation.